This patch was tested using the PDF file from issue 2618, i.e. https://bug570667.bugzilla-attachments.gnome.org/attachment.cgi?id=226471, with the following manifest file:
```
[
{ "id": "issue2618",
"file": "../web/pdfs/issue2618.pdf",
"md5": "",
"rounds": 50,
"type": "eq"
}
]
```
which gave the following results when comparing this patch against the `master` branch:
```
-- Grouped By browser, stat --
browser | stat | Count | Baseline(ms) | Current(ms) | +/- | % | Result(P<.05)
------- | ------------ | ----- | ------------ | ----------- | --- | ---- | -------------
firefox | Overall | 50 | 3417 | 3426 | 9 | 0.27 |
firefox | Page Request | 50 | 1 | 1 | 0 | 5.41 |
firefox | Rendering | 50 | 3416 | 3426 | 9 | 0.27 |
```
Based on these results, there's no significant performance regression from using standard classes and this patch should thus be OK.
Previously, we set the base transformation and pattern matrix
directly to the main rendering ctx of the page, however doing this
caused the current transform to be lost. This would cause issues
with things like shear missing so the pattern was misaligned or when
stroke was used the scale of the line width or dash would be wrong.
Instead we should leave the current transform and use setTransfrom
on the pattern so it is applied correctly. For axial and radial shadings I had
to create a temporary canvas to draw the shading so I could in turn
use setTransform.
Fixes: #13325, #6769, #7847, #11018, #11597, #11473
The following already in the corpus are improved:
issue8078-page1
issue1877-page1
This patch moves the `PDFJSDev`-checks *inline*, similar to the rest of the code-base, such that the code in question is actually being removed from the *built* files in e.g. the official releases.
After PR 13117 it's now (finally) possible for *different* build targets to specify individual options/preferences, and we can utilize that to only expose the `renderer`-preference in builds where `SVGGraphics` is actually defined.
Note that for e.g. `MOZCENTRAL`-builds, trying to enable SVG-rendering will throw immediately and the preference thus doesn't make sense to include there.
Also, update the dummy `SVGGraphics` to use a class, tweak the `PDFJSDev`-check in `src/display/svg.js` to agree fully with the option/preference, and remove an unnecessary `eslint-disable`.
Reasons for the removal include:
- This functionality was always somewhat experimental and has never been enabled by default, partly because of worries about rendering bugs caused by e.g. bad/outdated graphics drivers.
- After the initial implementation, in PR 4286 (back in 2014), no additional functionality has been added to the WebGL implementation.
- The vast majority of all documents do not benefit from WebGL rendering, since only a couple of *specific* features are supported (e.g. some Soft Masks and Patterns).
- There is, and has always been, *zero* test-coverage for the WebGL implementation.
- Overall performance, in the PDF.js library, has improved since the experimental WebGL implementation was added.
Rather than shipping unused *and* untested code, it seems reasonable to simply remove the WebGL implementation for now; thanks to version control it's always possible to bring back the code should the need ever arise.
This functionality was originally implemented in PR 7029; however it's not, nor has it ever been, used as far as I can tell.[1]
Note in particular that the default viewer does not expose either a preference or even an option with which `disableCanvasToImageConversion` can be toggled, and source-code modification is thus required.
Furthermore, note also that we have multiple other instances of `canvas`-data accesses in both the `src/display/canvas.js` and `src/display/text_layer.js` files. If any of those are blocked, by e.g. browser settings, there will be outright rendering bugs and non-working thumbnails thus seem like a very small issue in the grand scheme of things; hence why I'm suggesting that we remove the unused `disableCanvasToImageConversion` functionality.
---
[1] For the Tor use-case mentioned in issue 7026, I *believe* that the solution was to white-list `canvas`-data accesses for its built-in PDF Viewer.
Compared to other data-structures, such as e.g. `Dict`s, we're purposely *not* caching Streams on the `XRef`-instance.[1]
The, somewhat unfortunate, effect of Streams not being cached is that repeatedly getting the *same* Stream-data requires re-parsing/re-initializing of a bunch of data; see `XRef.fetch` and related methods.
For the font-parsing in particular we're currently fetching the `toUnicode`-data, which is very often a Stream, in `PartialEvaluator.preEvaluateFont` and then *again* in `PartialEvaluator.extractDataStructures` soon afterwards.
By instead letting `PartialEvaluator.preEvaluateFont` export the "raw" `toUnicode`-data, we can avoid *some* unnecessary re-parsing/re-initializing when handling fonts.
*Please note:* In this particular case, given that `PartialEvaluator.preEvaluateFont` only accesses the "raw" `toUnicode` data, exporting a Stream should be safe.
---
[1] The reasons for this include:
- Streams, especially `DecodeStream`-instances, can become *very* large once read. Hence caching them really isn't a good idea simply because of the (potential) memory impact of doing so.
- Attempting to read from the *same* Stream-instance more than once won't work, unless it's `reset` in between, since using any method such as e.g. `getBytes` always starts at the current data position.
- Given that parsing, even in the worker-thread, is now fairly asynchronous it's generally impossible to assert that any one Stream-instance isn't being accessed "concurrently" by e.g. different `getOperatorList` calls. Hence `reset`-ing a cached Stream-instance isn't going to work in the general case.
Rather than re-fetching/re-parsing these properties immediately in `PartialEvaluator.translateFont`, we can simply export them instead. (Obviously the effect will be really tiny, but there is less parsing overall this way.)
*Please note:* While I don't have a document that this patches fixes, the current code is however not entirely correct as far as I can tell.
Looking at how the `Widths` array is parsed in `PartialEvaluator.extractWidths`, it's clear that the implementation in `PartialEvaluator.preEvaluateFont` is a bit too simplistic. In particular, by only wrapping the data into a TypedArray, there's no attempt to handle *indirect* objects which could potentially lead to colliding `hash`es being computed.
To hopefully help prevent any future bugs, make sure that composite/non-composite fonts cannot accidentally get matching `hash`es. Given the differences between those font types, that's very unlikely to be useful or even correct in general.
Without this some *composite* fonts may incorrectly end up with matching `hash`es, thus breaking rendering since we'll not actually try to load/parse some of the fonts.
*Please note:* Given that the document, in the referenced issue, doesn't embed *any* of its fonts there's no guarantee that it renders correctly in all configurations even with this patch.
With modern JavaScript modules, where you explicitly list the properties that should be exported, it's no longer necessary to wrap *all* of the code within one file into a top-level closure.[1]
This patch reduces the size, of even the *built* `pdf.worker.js` file, since there's now a lot less unnecessary whitespace.
---
[1] For files which contain *different* functionality, some closures may however still make sense in order to separate the code.
It might be possible to remove some of those cases later, e.g. once private class fields becomes generally available/usable in browsers.
With modern JavaScript modules, where you explicitly list the properties that should be exported, it's no longer necessary to wrap all of the code in a closure.[1]
This patch also tries to clean-up/improve a couple of the existing JSDoc-comments.
---
[1] This reduces the size, even of the *built* `pdf.js` file, since there's now a lot less unnecessary whitespace.