Note how we're using custom `__non_webpack_import__`-calls in the code-base, that we replace during the post-processing stage of the build, to be able to write `import`-calls that Webpack will leave alone during parsing.
This work-around is necessary since we let Babel discards all comments, given that we generally don't need/want them in the builds, hence why we cannot utilize `/* webpackIgnore: true */`-comments in the source-code.
After the changes in PR 17563 it thus seems to me that we should be able to just move this re-writing into the Babel plugin instead.
Looking at the *built* files you'll notice some lines containing nothing more than a semicolon. This is the result of (mostly top-level) `if`-statements, which include `PDFJSDev`-checks, that evalute to `false` during Babel parsing.
This has always annoyed me a bit, and looking at Babel plugin it seems that we can fix this simply by *removing* the relevant nodes.
This part of the (modern) preprocessor is now dead code, since we no longer use `require` statements anywhere in the main code-base.
Note that as part of the changes leading up to PDF.js version `4` we removed all[1] the remaining `require` statements, and we also have an ESLint rule to ensure that no new ones are accidentally added.
---
[1] With two small exceptions, in benchmarking-code and in the Webpack-example.
All of our static evaluation & dead-code elimination transforms need to
happen in post-order, transforming inner nodes first. This is so that
in complex nested cases all transforms see the simplified version of
their inner nodes.
For example:
async getNimbusExperimentData() {
if (!PDFJSDev.test("GECKOVIEW")) { return null; }
// other code
}
-> [evaluation of PDFJSDev.*]
async getNimbusExperimentData() {
if (!false) { return null; }
// other code
}
-> [!false -> true]
async getNimbusExperimentData() {
if (true) { return null; }
// other code
}
-> [if (true) -> replace with the if branch]
async getNimbusExperimentData() {
return null;
// other code
}
-> [early return -> remove dead code]
async getNimbusExperimentData() {
return null;
// other code
}
This was done correctly in all cases except for our `UnaryExpression`
transform, which was happening in pre-order.
This commit converts the pdfjsdev-loader transform into a Babel plugin,
to skip a AST->string->AST round-trip.
Before this commit, the webpack build process was:
1. Babel parses the code
2. Babel transforms the AST
3. Babel generates the code
4. Acorn parses the code
5. pdfjsdev-loader transforms the AST
6. @javascript-obfuscator/escodegen generates the code
7. Webpack parses the file
8. Webpack concatenates the files
After this commit, it is reduced to:
1. Babel parses the code
2. Babel transforms the AST
3. babel-plugin-pdfjs-preprocessor transforms the AST
4. Babel generates the code
5. Webpack parses the file
6. Webpack concatenates the files
This change improves the build time by ~25% (tested on MacBook Air M2):
- `gulp lib`: 3.4s to 2.6s
- `gulp dist`: 36s to 29s
- `gulp generic`: 5.5s to 4.0s
- `gulp mozcentral`: 4.7s to 3.2s
The new Babel plugin doesn't support the `saveComments` option of
pdfjsdev-loader, and it just always discards comments. Even though
pdfjsdev-loader supported multiple values for that option, it was
effectively ignored due to `acorn` dropping comments by default.