Given that *all* fonts are, ever since PR 7347, now cached in the "normal" `fontCache` there's actually no reason for the special `font.translated` construction. (Given how Objects in JavaScript are references, rather than raw values, the old code shouldn't have caused any significant memory overhead.)
Instead we can simply store the `cacheKey`, which is a simple string, on only the Font `Dict`s where it's needed and thus look-up all fonts using the `fontCache` instead.
If this method is ever passed invalid/unexpected data, or if during the course of parsing (since it's used recursively) such data is found, it will fail in a non-graceful way.
Hence this patch, which ensures that we don't attempt to access non-existent properties and also that errors such as the one fixed in PR 12479 wouldn't have occured.
*This patch is based on a couple of smaller things that I noticed when working on PR 12479.*
- Don't store the /Fields on the `formInfo` getter, since that feels like overloading it with unintended (and too complex) data, and utilize a `hasFields` boolean instead.
This functionality was originally added in PR 12271, to help determine what kind of form data a PDF document contains, and I think that we should ensure that the return value of `formInfo` only consists of "simple" data.
With these changes the `fieldObjects` getter instead has to look-up the /Fields manually, however that shouldn't be a problem since the access is guarded by a `formInfo.hasFields` check which ensures that the data both exists and is valid. Furthermore, most documents doesn't even have any /AcroForm data anyway.
- Determine the `hasFields` property *first*, to ensure that it's always correct even if there's errors when checking e.g. the /XFA or /SigFlags entires, since the `fieldObjects` getter depends on it.
- Simplify a loop in `fieldObjects`, since the object being accessed is a `Map` and those have built-in iteration support.
- Use a higher logging level for errors in the `formInfo` getter, and include the actual error message, since that'd have helped with fixing PR 12479 a lot quicker.
- Update the JSDoc comment in `src/display/api.js` to list the return values correctly, and also slightly extend/improve the description.
The last unit-test didn't work correctly, since an error was thrown in `PDFDocument._hasOnlyDocumentSignatures` because the mocked `XRef`-instance wasn't actually being set correctly.
Also, updates the `XRefMock` to use `async` methods where appropriate.
This allows us to make a slight simplification in `PartialEvaluator.loadFont`, which thus removes an old TODO-comment from the method.
Furthermore, in `PartialEvaluator.translateFont`, the CMap-handling is now limited to only *composite* fonts to avoid having to wait for a "dummy"-Promise for most fonts.
- Actually register/unregister the `WorkerTask`s, used when saving each page, correctly.
To prevent issues when terminating the Worker, we purposely wait for all running `WorkerTask`s to complete first. Hence we need to actually handle `WorkerTask`s the same way in "SaveDocument" as in the rest of this file, see e.g. "GetOperatorList" and "GetTextContent".
- Access `PDFDocument` properties in a generally safe/consistent way.
While the current code works fine, given how the PDF document is being loaded, it still seems like a very good idea to be *consistent* in how we access these kind of properties (since in general you need to avoid `MissingDataException` everywhere in this file).
- Change a variable name, since there's essentially no precedent in the code-base for *local* variable names to start with an underscore.
Support for the `scope` parameter, in `MessageHandler.on`, was removed in PR 11110 however this particular case was unused/unnecessary for years prior to that change. (From a quick look through the history, I'm not even sure if it was actually needed in the first place.)
Given that we're no longer using SystemJS to load the `web/` files, see PR 11919, there's nothing that prevents us from using standard `ìmport` statements in this file.
Obviously it's still necessary to load part of the code conditionally on the build type, however this still allows us to clean-up and simplify at least some of this file.
The only noticeable changes are that the built files are now *slightly* smaller, and that Webpack now supports optional chaining and nullish coalescing without the need for Babel plugins.
In practice it's not uncommon for PDF documents to re-use the same TilingPatterns more than once, and parsing them is essentially equal to parsing of a (small) page since a `getOperatorList` call is required.
By caching the internal TilingPattern representation we can thus avoid having to re-parse the same data over and over, and there's also *less* asynchronous parsing required for repeated TilingPatterns.
Initially I had intended to include (standard) benchmark results with this patch, however it's not entirely clear that this is actually necessary here given the preliminary results.
When testing this manually in the development viewer, using `pdfBug=Stats`, the following (approximate) reduction in rendering times were observed when comparing `master` against this patch:
- http://pubs.usgs.gov/sim/3067/pdf/sim3067sheet-2.pdf (from issue 2765): `6800 ms` -> `4100 ms`.
- https://github.com/mozilla/pdf.js/files/1046131/stepped.pdf (from issue 8473): `54000 ms` -> `13000 ms`
- https://github.com/mozilla/pdf.js/files/1046130/proof.pdf (from issue 8473): `5900 ms` -> `2500 ms`
As always, whenever you're dealing with documents which are "slow", there's usually a certain level of subjectivity involved with regards to what's deemed acceptable performance.
Hence it's not clear to me that we want to regard any of the referenced issues as fixed, however the improvements are significant enough to warrant caching of TilingPatterns in my opinion.
I've run `gulp mozcentral`, `gulp generic`, and `gulp generic-es5` with `master` respectively this patch and then diffed the build output. With the (obvious) exception of increased version/build numbers, there were no actual changes from the updated Acorn version.
The `debugger`-statement would only, potentially, make sense during development and we thus want to prevent it from being accidentally included when landing code.
The `alert`, `confirm`, and `prompt` functions should generally be avoided, with the few intended cases manually allowed.
Please find additional details about the ESLint rules at:
- https://eslint.org/docs/rules/no-debugger
- https://eslint.org/docs/rules/no-alert
The changelogs of those dependencies showed no breaking changes for us.
Most of the time the major version bump was done to remove compatibility
with very outdated Node.js versions.
Only for `autoprefixer` and `gulp-postcss` a change was required, which
is including `postcss` in our `package.json` explicitly since it's now
a peer dependency of those packages.
Now only `acorn`,`systemjs`, `terser` and `yargs` are not the latest
versions because they require more work.
In the rest of the viewer code-base, we purposely don't treat `RenderingCancelledException`s as actual errors (since they aren't) and consequently we never log them.
Hence it makes sense, as far as I'm concerned, to simply treat `RenderingCancelledException`s the same way when printing in Firefox.
While I don't print a whole lot, I cannot remember seeing these "errors" logged when printing until *very* recently[1]. Given that the browser print functionality and UI, in Firefox, is under active development it's certainly possible that there's some recent changes to the related timings which make `RenderingCancelledException`s more likely now.
---
[1] Interestingly, only some PDF documents seem to be affected as well; I'm able to reproduce this pretty consistently by opening https://www.uni-muenster.de/imperia/md/content/ziv/pdf/printpay_flyer.pdf in Firefox and then repeating the following sequence:
Clicking on the PDF.js print button, and then cancelling printing.
This dependency hasn't been updated in two years and the only place that
uses it is the `externaltest` target in the Gulpfile. We can simply
replace `fancy-log` usage there with `console.log` like we do in all
other places in the Gulpfile because we're not interested in the
timestamps here. Gulp already prints timestamps and these tests finish
within a second anyway.
Note that it remains in `package-lock.json` because other Gulp-related
packages have it as a dependency, but at least we're no longer depending
on it directly anymore now.