Note how we're using custom `__non_webpack_import__`-calls in the code-base, that we replace during the post-processing stage of the build, to be able to write `import`-calls that Webpack will leave alone during parsing.
This work-around is necessary since we let Babel discards all comments, given that we generally don't need/want them in the builds, hence why we cannot utilize `/* webpackIgnore: true */`-comments in the source-code.
After the changes in PR 17563 it thus seems to me that we should be able to just move this re-writing into the Babel plugin instead.
Looking at the *built* files you'll notice some lines containing nothing more than a semicolon. This is the result of (mostly top-level) `if`-statements, which include `PDFJSDev`-checks, that evalute to `false` during Babel parsing.
This has always annoyed me a bit, and looking at Babel plugin it seems that we can fix this simply by *removing* the relevant nodes.
This part of the (modern) preprocessor is now dead code, since we no longer use `require` statements anywhere in the main code-base.
Note that as part of the changes leading up to PDF.js version `4` we removed all[1] the remaining `require` statements, and we also have an ESLint rule to ensure that no new ones are accidentally added.
---
[1] With two small exceptions, in benchmarking-code and in the Webpack-example.
All of our static evaluation & dead-code elimination transforms need to
happen in post-order, transforming inner nodes first. This is so that
in complex nested cases all transforms see the simplified version of
their inner nodes.
For example:
async getNimbusExperimentData() {
if (!PDFJSDev.test("GECKOVIEW")) { return null; }
// other code
}
-> [evaluation of PDFJSDev.*]
async getNimbusExperimentData() {
if (!false) { return null; }
// other code
}
-> [!false -> true]
async getNimbusExperimentData() {
if (true) { return null; }
// other code
}
-> [if (true) -> replace with the if branch]
async getNimbusExperimentData() {
return null;
// other code
}
-> [early return -> remove dead code]
async getNimbusExperimentData() {
return null;
// other code
}
This was done correctly in all cases except for our `UnaryExpression`
transform, which was happening in pre-order.
This commit converts the pdfjsdev-loader transform into a Babel plugin,
to skip a AST->string->AST round-trip.
Before this commit, the webpack build process was:
1. Babel parses the code
2. Babel transforms the AST
3. Babel generates the code
4. Acorn parses the code
5. pdfjsdev-loader transforms the AST
6. @javascript-obfuscator/escodegen generates the code
7. Webpack parses the file
8. Webpack concatenates the files
After this commit, it is reduced to:
1. Babel parses the code
2. Babel transforms the AST
3. babel-plugin-pdfjs-preprocessor transforms the AST
4. Babel generates the code
5. Webpack parses the file
6. Webpack concatenates the files
This change improves the build time by ~25% (tested on MacBook Air M2):
- `gulp lib`: 3.4s to 2.6s
- `gulp dist`: 36s to 29s
- `gulp generic`: 5.5s to 4.0s
- `gulp mozcentral`: 4.7s to 3.2s
The new Babel plugin doesn't support the `saveComments` option of
pdfjsdev-loader, and it just always discards comments. Even though
pdfjsdev-loader supported multiple values for that option, it was
effectively ignored due to `acorn` dropping comments by default.
This is in preparation for the next commit, which will convert
preprocessor2.mjs to a Babel plugin. The purpose of this commit
is to help git track the rename regardless of the large amount
of changes.
Given the limitations of the old pre-processor that's used for CSS/HTML files, this unfortunately isn't as "easy" to implement as it is for JavaScript code.
Since this is the first case where we've wanted to do conditional CSS imports, rather than trying to completely re-write the pre-processor, this patch settles for handling it explicitly in the `expandCssImports` function.
There are obviously cases where using `concat` makes perfect sense, since that method doesn't change any of the existing Arrays; see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/concat
However, in a few cases throughout the code-base that's not an issue and using `concat` only leads to unnecessary intermediate allocations. With modern JavaScript we can thus replace those with a combination of `push` and spread-syntax, which wasn't originally possible when the code was written.
When pre-processor blocks are being removed, since they don't apply to the current build target, we may currently end up with consecutive blank lines.
While this is obviously not a big issue, it's nonetheless undesirable and we can adjust the `writeLine` function to prevent that.
*This is a follow-up to PR 14886, which "broke" this.*
In addition to fixing the issue, using an Array and `join`-ing it at the end may also be a tiny bit more efficient than using a growing string.
An old shortcoming of the `preprocessCSS`-function is its complete lack of support for our "normal" defines, which makes it very difficult to have build-specific CSS rules. Recently we've started using specially crafted comments to remove CSS rules from the MOZCENTRAL build, but (ab)using the `preprocessCSS`-function in this way really doesn't feel great.
However, it turns out to be surprisingly simple to instead use the "regular" `preprocess`-function for the CSS files as well. The only special-handling that's still necessary is the helper-function for dealing with CSS-imports, but apart from that everything seems to just work.
One reason, as far as I can tell, for having a separate `preprocessCSS`-function was likely that we originally used *lots* of vendor-prefixed CSS rules in our CSS files. With improvements over the years, especially thanks to Autoprefixer and PostCSS, we've been able to remove *almost* all non-standard CSS rules and the need for special-casing the CSS parsing has mostly vanished.
*Please note:* As part of testing this patch I've diffed the output of `gulp generic`, `gulp mozcentral`, and `gulp chromium` against the `master`-branch to check that there was no obvious breakage.
Rather than *manually* specifying a "mode", we can simply use the regular `defines` directly instead. To improve consistency, in the `external/builder/builder.js` file, a couple of parameters are also re-named.
With ESLint 8 we should now finally be able to start using modern `class` features, see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Classes/Public_class_fields and https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Classes/Private_class_fields
However, while both ESLint and Acorn now support this, it unfortunately turns out that Escodegen (which we use during building) still lack the necessary support. Looking at https://github.com/estools/escodegen there's not been any updates since last year, and there's also open PRs adding support for these new `class` features.
To avoid blocking usage of these `class` features in the PDF.js code-base, in particular *private* fields/methods, this patch thus proposes that we (hopefully temporarily) switch to an `escodegen` fork that has the necessary support; please see https://www.npmjs.com/package/@javascript-obfuscator/escodegen
While I have no reason to doubt the security of the `escodegen` fork, this patch nonetheless pins the version number. Furthermore, I've also diffed the output of the two `.js`-files in this forked package against the original files without finding anything that looks immediately "dangerous".
As part of testing this, I've diffed the output of `gulp mozcentral` with/without this patch and the *only* difference is the incremented `version`/`build` numbers.
- Add support for logical assignment operators, i.e. `&&=`, `||=`, and `??=`, with a Babel-plugin. Given that these required incrementing the ECMAScript version in the ESLint and Acorn configurations, and that platform/browser support is still fairly limited, always transpiling them seems appropriate for now.
- Cache the `hasJSActions` promise in the API, similar to the existing `getAnnotations` caching. With this implemented, the lookup should now be cheap enough that it can be called unconditionally in the viewer.
- Slightly improve cleanup of resources when destroying the `WorkerTransport`.
- Remove the `annotationStorage`-property from the `PDFPageView` constructor, since it's not necessary and also brings it more inline with the `BaseViewer`.
- Update the `BaseViewer.createAnnotationLayerBuilder` method to actaually agree with the `IPDFAnnotationLayerFactory` interface.[1]
- Slightly tweak a couple of JSDoc comments.
---
[1] We probably ought to re-factor both the `IPDFTextLayerFactory` and `IPDFAnnotationLayerFactory` interfaces to take parameter objects instead, since especially the `IPDFAnnotationLayerFactory` one is becoming quite unwieldy. Given that that would likely be a breaking change for any custom viewer-components implementation, this probably requires careful deprecation.
Note that a number of these cases are covered by existing unit-tests, and a few others only matter for the development/build scripts.
Furthermore, I've also tried to the best of my ability to test each case *manually* to hopefully further reduce the likelihood of this patch introducing any bugs.
Please find additional details about the ESLint rule at https://eslint.org/docs/rules/no-useless-escape
I've run `gulp mozcentral`, `gulp generic`, and `gulp generic-es5` with `master` respectively this patch and then diffed the build output. With the (obvious) exception of increased version/build numbers, there were no actual changes from the updated Acorn version.
With these changes SystemJS is now only used, during development, on the worker-thread and in the unit/font-tests, since Firefox is currently missing support for worker modules; please see https://bugzilla.mozilla.org/show_bug.cgi?id=1247687
Hence all the JavaScript files in the `web/` and `src/display/` folders are now loaded *natively* by the browser (during development) using standard `import` statements/calls, thanks to a nice `import-maps` polyfill.
*Please note:* As soon as https://bugzilla.mozilla.org/show_bug.cgi?id=1247687 is fixed in Firefox, we should be able to remove all traces of SystemJS and thus finally be able to use every possible modern JavaScript feature.
Please note that these changes were done automatically, using `gulp lint --fix`.
Given that the major version number was increased, there's a fair number of (primarily whitespace) changes; please see https://prettier.io/blog/2020/03/21/2.0.0.html
In order to reduce the size of these changes somewhat, this patch maintains the old "arrowParens" style for now (once mozilla-central updates Prettier we can simply choose the same formatting, assuming it will differ here).
This rule is *not* currently enabled in mozilla-central, but it appears commented out[1] in the ESLint definition file; see https://searchfox.org/mozilla-central/rev/c80fa7258c935223fe319c5345b58eae85d4c6ae/tools/lint/eslint/eslint-plugin-mozilla/lib/configs/recommended.js#238-239
Unfortunately this rule is, for fairly obvious reasons, impossible to `--fix` automatically (even partially) and each case thus required careful manual analysis.
Hence this ESLint rule is, by some margin, probably the most difficult one that we've enabled thus far. However, using this rule does seem like a good idea in general since allowing variable shadowing could lead to subtle (and difficult to find) bugs or at the very least confusing code.
Please find additional details about the ESLint rule at https://eslint.org/docs/rules/no-shadow
---
[1] Most likely, a very large number of lint errors have prevented this rule from being enabled thus far.
This patch makes the follow changes:
- Remove no longer necessary inline `// eslint-disable-...` comments.
- Fix `// eslint-disable-...` comments that Prettier moved down, thus causing new linting errors.
- Concatenate strings which now fit on just one line.
- Fix comments that are now too long.
- Finally, and most importantly, adjust comments that Prettier moved down, since the new positions often is confusing or outright wrong.
Note that Prettier, purposely, has only limited [configuration options](https://prettier.io/docs/en/options.html). The configuration file is based on [the one in `mozilla central`](https://searchfox.org/mozilla-central/source/.prettierrc) with just a few additions (to avoid future breakage if the defaults ever changes).
Prettier is being used for a couple of reasons:
- To be consistent with `mozilla-central`, where Prettier is already in use across the tree.
- To ensure a *consistent* coding style everywhere, which is automatically enforced during linting (since Prettier is used as an ESLint plugin). This thus ends "all" formatting disussions once and for all, removing the need for review comments on most stylistic matters.
Many ESLint options are now redundant, and I've tried my best to remove all the now unnecessary options (but I may have missed some).
Note also that since Prettier considers the `printWidth` option as a guide, rather than a hard rule, this patch resorts to a small hack in the ESLint config to ensure that *comments* won't become too long.
*Please note:* This patch is generated automatically, by appending the `--fix` argument to the ESLint call used in the `gulp lint` task. It will thus require some additional clean-up, which will be done in a *separate* commit.
(On a more personal note, I'll readily admit that some of the changes Prettier makes are *extremely* ugly. However, in the name of consistency we'll probably have to live with that.)
In many cases in the code you don't actually care about the index itself, but rather just want to know if something exists in a String/Array or if a String starts in a particular way. With modern JavaScript functionality, it's thus possible to remove a number of existing `indexOf` cases.
For proof-of-concept, this patch converts a couple of `Promise` returning methods to use `async` instead.
Please note that the `generic` build, based on this patch, has been successfully testing in IE11 (i.e. the viewer loads and nothing is obviously broken).
Being able to use modern JavaScript features like `async`/`await` is a huge plus, but there's one (obvious) side-effect: The size of the built files will increase slightly (unless `SKIP_BABEL == true`). That's unavoidable, but seems like a small price to pay in the grand scheme of things.
Finally, note that the `chromium` build target was changed to no longer skip Babel translation, since the Chrome extension still supports version `49` of the browser (where native `async` support isn't available).
http://eslint.org/docs/rules/comma-danglehttp://eslint.org/docs/rules/object-curly-spacing
*Please note:* This patch was created automatically, using the ESLint `--fix` command line option. In a couple of places this caused lines to become too long, and I've fixed those manually; please refer to the interdiff below for the only hand-edits in this patch.
```diff
diff --git a/gulpfile.js b/gulpfile.js
index d18b9c58..7d47fd8d 100644
--- a/gulpfile.js
+++ b/gulpfile.js
@@ -1247,7 +1247,8 @@ gulp.task('gh-pages-git', ['gh-pages-prepare', 'wintersmith'], function () {
var reason = process.env['PDFJS_UPDATE_REASON'];
safeSpawnSync('git', ['init'], { cwd: GH_PAGES_DIR, });
- safeSpawnSync('git', ['remote', 'add', 'origin', REPO], { cwd: GH_PAGES_DIR, });
+ safeSpawnSync('git', ['remote', 'add', 'origin', REPO],
+ { cwd: GH_PAGES_DIR, });
safeSpawnSync('git', ['add', '-A'], { cwd: GH_PAGES_DIR, });
safeSpawnSync('git', [
'commit', '-am', 'gh-pages site created via gulpfile.js script',
```