*Please note:* A a similar change was attempted in PR 5005, but it was subsequently backed out in PR 5069.
Unfortunately I don't think anyone ever tried to debug *exactly* why it didn't work, since it ought to have worked, and having re-tested this now I'm not able to reproduce the problem any more. However, given just how inefficient the current code is, with thousands of strictly unnecessary function calls for each `find` invocation, I'd really like to try fixing this again.
This reduces the total number of function calls, when reading the XRef table respectively when fetching uncompressed XRef entries.
Note in particular the `XRef.readXRefTable` method, where there're *two* back-to-back `isCmd` checks rather than just one.
A lot of the `new Parser()` call-sites look quite unwieldy/ugly as-is, with a bunch of somewhat randomly ordered arguments, which we can avoid by changing the constructor to accept an object instead. As an added bonus, this provides better documentation without having to add inline argument comments in the code.
See https://github.com/mozilla/eslint-plugin-no-unsanitized
Since we've generally never allowed e.g. `innerHTML`, which is enforced during review, there's only one linting failure with this patch. (Which is white-listed, according to the existing comment and the fact that it's test-only code.)
Since all other `IPDFStream` implementations live in their own files, it seems reasonable for these to do so as well.
Furthermore, converts all of the relevant code to ES6 classes and updates the interface definitions to mark a couple of methods `async`.
Given that `cleanupAfterRender` is already set for large images, when handling 'obj' messages, this patch *should* thus be safe in general (since otherwise there ought be existing bugs related to cleanup and printing).
The border `width` will instead fallback to the default value of `1`, rather than ignoring it altoghether, to also ensure that e.g. `LinkAnnotation`s become clickable as intended.
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1552113
Usually when the worker is terminated it will also be completely destroyed/removed, which means that any global caches (such as the ones in `src/core/primitive.js`) should be automatically cleared in the process.
However, for certain ways of loading the `pdf.worker.js` file, e.g. passing in a re-usable worker to `getDocument`, using the `workerPort` functionality, or even disabling workers completely (even though this is never a good idea), the worker file may be kept in memory and these caches will not be cleared as expected.
Calling `someArray = []` will create a new Array, which seems completely unnecessary when it's sufficient to just call `someArray.length = 0` to achieve the same effect.
Even though I cannot imagine these particular cases having any noticeable performance impact, similar changes were made in `core/` code years ago since it's apparently more efficient memory wise.
The purpose of these caches is to reduce peak memory usage, by only ever having *a single* instance of a particular object.
However, as-is these caches are never cleared and they will thus remain until the worker is destroyed. This could very well have a negative effect on total memory usage, particularly for large/long documents, hence it seems to make sense to clear out these caches together with various other ones.
This is similar to the existing caching used to reduced the number of `Cmd` and `Name` objects.
With the `tracemonkey.pdf` file, this patch changes the number of `Ref` objects as follows (in the default viewer):
| | Loading the first page | Loading *all* the pages |
|----------|------------------------|-------------------------|
| `master` | 332 | 3265 |
| `patch` | 163 | 996 |
The specification states that `CreationDate` is only available for
markup annotations instead of for all annotation types.
Moreover, popup annotations are not markup annotations according to the
specification, so the creation date inheritance from the parent
annotation is also removed there (note that only the modification date
is used in e.g., the viewer).
This includes the information in the core and display layers. The
date parsing logic from the document properties is rewritten according
to the specification and now includes unit tests.
Moreover, missing unit tests for the color of a popup annotation have
been added.
Finally the styling of the popup is changed slightly to make the text a
bit smaller (it's currently quite large in comparison to other viewers)
and to make the drop shadow a bit more subtle. The former is done to be
able to easily include the modification date in the popup similar to how
other viewers do this.
Currently `handleColorN` will fallback to add a completely unparsed/unvalidated operator when no valid pattern was found. This is unfortunate, since it could very easily lead to a couple of different errors:
- `DataCloneError`s when attempting to send the data to the main-thread, e.g. when `args` is `Dict`/`Stream`.
- Errors in `getShadingPatternFromIR` on the main-thread, unless `args` just happens to have the expected format.
- Errors when actually attempting to render the pattern on the main-thread, since the `args` will most likely not have the expected format.
Hence it probably makes sense to error in `PartialEvaluator.handleColorN`, and having invalid patterns fail gracefully via the existing `ignoreErrors` code-paths instead.
It appears that this has been broken ever since PR 9089, which also introduced this code, since the `QueueOptimizer`/`NullOptimizer` choice was made based on the still undefined `this.intent` property.
Furthermore, fixing this also uncovered the fact that the `NullOptimizer.reset` method was missing.
First of all, while this simple approach appears to work OK in practice I'm not sure if it's the best way of addressing the problem (assuming that you even want to).
Second of all, while the solution implemented here only requires tracking/checking one new boolean in order for this to work, I'm nonetheless not entirely happy about this since it will add additional overhead (albeit *very* small) to the parsing of path operators in PDF documents just for a handful of *corrupt* ones.
This way we can avoid manually building a "document id" in multiple places in `evaluator.js`, and it also let's us avoid passing in an otherwise unnecessary `PDFManager` instance when creating a `PartialEvaluator`.
While PR 10714 did address the `disableRange=true` case, it also managed to "break" the `disableStream=true` case instead since the indeterminate loadingBar is now displayed when it shouldn't; sorry about that!
The solution is simple enough though, don't attempt to fallback to `_fullRequestReader.onProgress` when handling "incomplete" loading information.
Please see the specification, https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#M11.9.12864.1Heading.71.Viewer.Preferences
Furthermore, note that this patch *only* adds API support and unit-tests but does not attempt to integrate e.g. the `ViewerPreferences -> Direction` property into the viewer (which would be necessary to address issue 10736).
The reason for this is that it's not entirely clear to me exactly if/how that could be implemented; e.g. would it be as simple as setting the `dir` attribute on the `viewerContainer` DOM element, or will it be more complicated?
There's also the question of how the `ViewerPreferences -> Direction` value interacts with the `PageMode`, and this will generally require a fair bit of manual testing. Since the direction of the *entire* viewer depends on the browser locale, there's also a somewhat open question regarding what default value to use for different locales.
Finally, if the viewer supports `ViewerPreferences -> Direction` then I'm assuming that it will be necessary to allow users to override the default value, which will require (most likely) new `SecondaryToolbar` buttons and icons for those etc.
Hence this patch only lays the necessary foundation for eventually addressing issue 10736, but defers the actual implementation until later. (Time permitting, I'll try to look into the viewer part later.)
*Please note:* This patch purposely ignores `src/display/network.js`, since its support for progressive reading depends on the non-standard `moz-chunked-arraybuffer` responseType which is currently in the process of being removed.
The file `test/pdfs/annotation-caret-ink.pdf` is already available in
the repository as a reference test for this since I supplied it for
another patch that implemented ink annotations.
This mirrors the canvas implementation where we ignore these operators.
This avoids console spam regarding unimplemented operators we're not
interested in.
For the Tracemonkey paper, we're now down to one warning about tiling
patterns which is in fact a valid one.
In particular, this should reduce intermediate string creation by using
template strings and reduce variable lookup times by removing unneeded
variables and caching `this.current` in more places.
With PR 10675 having fixed the completely broken `disableRange=true` setting in the Firefox version of PDF.js, I couldn't help but noticing that loading progress is never reported properly in that case.
Currently loading progress is only reported for the `rangeProgress` chrome-event, which obviously isn't dispatched with `disableRange=true` set. However, the `progressiveRead` chrome-event includes loading progress as well, but this information isn't being used in any way.
Furthermore, the `PDFDataRangeTransport.onDataProgress` method wasn't able to handle "complete" loading information, and neither was `PDFDataTransportStream._onProgress` since that method would only ever attempt to report it through a RangeReader (which won't exist when `disableRange=true` is set).