Fixes#11477
The PDF draws many space characters but the embedded fonts don't have a glyph named `space`, so `.notdef` should be drawn instead. PDF.js assumed that Type1 fonts define `.notdef` as the first glyph (index 0). However, now the fonts have the glyph `A` at index 0 and `.notdef` is the last one, so `A` appears where spaces are expected.
Because the rest of the font machinery in `core/fonts.js` assumes `.notdef` is at index zero, it's easiest to modify `core/type1_parser.js` so that it "repairs" fonts and makes sure `.notdef` is at index 0.
The PDF document in question is *corrupt*, since it contains multiple instances of incorrect operators.
We obviously don't want to slow down parsing of *all* documents (since most are valid), just to accommodate a particular bad PDF generator, hence the reason for the inline check before calling the `ensureStateFont` method.
It's no longer necessary to special-case this getter in the `GenericFontLoader` case, since the GENERIC build hasn't been using `mozPrintCallback` for years now (furthermore Firefox 63 is really old as well).
Add a `ignoreDestinationZoom` option/preference to allow users to preserve the current zoom level when navigating to internal destinations (issue 5064, 11606)
Since the goal has always been, essentially since the `EventBus` abstraction was added, to remove all dispatching of DOM events[1] from the viewer components this patch tries to address one thing that came up when updating the examples:
The DOM events are always dispatched last, and it's thus guaranteed that all internal event listeners have been invoked first.
However, there's no such guarantees with the general `EventBus` functionality and the order in which event listeners are invoked is *not* specified. With the promotion of the `EventBus` in the examples, over DOM events, it seems like a good idea to at least *try* to keep this ordering invariant[2] intact.
Obviously this won't prevent anyone from manually calling the new *internal* viewer component methods on the `EventBus`, but hopefully that won't be too common since any existing third-party code would obviously use the `on`/`off` methods and that all of the examples shows the *correct* usage (which should be similarily documented on the "Third party viewer usage" Wiki-page).
---
[1] Looking at the various Firefox-tests, I'm not sure that it'll be possible to (easily) re-write all of them to not rely on DOM events (since getting access to `PDFViewerApplication` might be generally difficult/messy depending on scopes).
In any case, even if technically feasible, it would most likely add *a lot* of complication that may not be desireable in the various Firefox-tests. All-in-all, I'd be fine with keeping the DOM events only for the `MOZCENTRAL` target and gated on `Cu.isInAutomation` (or similar) rather than a preference.
[2] I wouldn't expect any *real* bugs in a custom implementation, simply based on event ordering, but it nonetheless seem like a good idea if any "external" events are still handled last.
To avoid outright breaking third-party usages of the "viewer components" the `getGlobalEventBus` functionality is left intact, but a deprecation message is printed if the function is invoked.
The various examples are updated to *explicitly* initialize an `EventBus` instance, and provide that when initializing the relevant viewer components.
This patch extends the existing heuristics, which are really the best that we can do in general for these kinds of non-embedded *and* non-standard fonts.
Furthermore, this patch also tries to improve the copy-and-paste behaviour for non-embedded Wingdings fonts by also using the `ZapfDingbatsEncoding` in this case.
*Note:* I'm not sure that adding additional tests for Wingdings fonts matters that much, given how limited our "support" for them really is.
Given that all of these primitives implement caching, to avoid unnecessarily duplicating those objects *a lot* during parsing, it would thus be good to actually enforce usage of `Cmd.get()`/`Name.get()`/`Ref.get()` in the code-base.
Luckily it turns out that there's an ESLint rule, which is fairly easy to use, that can be used to disallow arbitrary JavaScript syntax.
Please find additional details about the ESLint rule at https://eslint.org/docs/rules/no-restricted-syntax
*This whole patch feels somewhat arbitrary, and I'd be slightly worried about possibly breaking something else.*
To limit the impact of these changes, we only re-parse JPEG images using a reduced `scanLines` value if and only if: An unexpected EOI (End of Image) marker was encountered during decoding of Scan data *and* the "actual" `scanLines` value is at least one order of magnitude smaller than expected.
In some cases PDF documents can contain JPEG images that the native browser decoder cannot handle, e.g. images with DNL (Define Number of Lines) markers or images where the SOF (Start of Frame) marker contains a wildly incorrect `scanLines` parameter.
Currently, for "simple" JPEG images, we're relying on native image decoding to *fail* before falling back to the implementation in `src/core/jpg.js`. In some cases, note e.g. issue 10880, the native image decoder doesn't outright fail and thus some images may not render.
In an attempt to improve the current situation, this patch adds additional validation of the JPEG image SOF data to force the use of `src/core/jpg.js` directly in cases where the native JPEG decoder cannot be trusted to do the right thing.
The only way to implement this is unfortunately to parse the *beginning* of the JPEG image data, looking for a SOF marker. To limit the impact of this extra parsing, the result is cached on the `JpegStream` instance and this code is only run for images which passed all of the pre-existing "can the JPEG image be natively rendered and/or decoded" checks.
---
*Slightly off-topic:* Working on this *really* makes me start questioning if native rendering/decoding of JPEG images is actually a good idea.
There's certain kinds of JPEG images not supported natively, and all of the validation which is now necessary isn't "free". At this point, in the `NativeImageDecoder`, we're having to check for certain properties in the image dictionary, parse the `ColorSpace`, and finally read the actual image data to find the SOF marker.
Furthermore, we cannot just send the image to the main-thread and be done in the "JpegStream" case, but we also need to wait for rendering to complete (or fail) before continuing with other parsing.
In the "JpegDecode" case we're even having to parse part of the image on the main-thread, which seems completely at odds with the principle of doing all heavy parsing in the Worker, and there's also a couple of potentially large (temporary) allocations/copies of TypedArray data involved as well.
Given that this is completely unused, and that a "normal" function call may be a *tiny* bit more efficient, there's no good reason as far as I can tell to keep it.
Rather than adding two buttons, it seems easier to simply link to the relevant section of the README instead (since it also means fewer things to keep up-to-date).
This complements the existing `PDFViewerApplication.initialized` boolean property, and may be helpful for custom implementations of the default viewer. This will thus provide users of the default viewer an alternative to setting the preference to dispatch events to the DOM (and listen for the "localized" event), since they can instead use:
```javascript
document.addEventListener("webviewerloaded", function() {
PDFViewerApplication.initializedPromise.then(function() {
// The viewer has now been initialized.
})
});
```
Note that in order to avoid manually tracking the initialization state *twice*, this implementation purposely uses the `PromiseCapability` functionality to handle both `PDFViewerApplication.initialized` and `PDFViewerApplication.initializedPromise` internally.
*I've had this patch locally for awhile, but have apparently missed to upstream it.*
This simplifies enabling of new ESLint rules, since most of them support automatic fixing of errors, without having to edit `gulpfile.js` or manually invoke ESLint directly.
With the way that the `default_preferences.json` file is now generated at build time, the `gulp lint` task is now noticeably slower than before. This slowdown has been, and still is, somewhat annoying during the deployment of new ESLint rules.
Hence this patch, which moves the `chromium/preferences_schema.json` validation from `gulp lint` and into a new `gulp lint-chromium` task instead. *Obviously* this new task is run as part of the `gulp npm-test` task, and thus through `npm test` on Node.js/Travis, such that it's still being tested as before.
Please note that these changes do *not* affect the *public* interface of the `Metadata` class, but only touches internal structures.[1]
These changes were prompted by looking at the `getAll` method, which simply returns the "private" metadata object to the consumer. This seems wrong conceptually, since it allows way too easy/accidental changes to the internal parsed metadata.
As part of fixing this, the internal metadata was changed to use a `Map` rather than a plain Object.
---
[1] Basically, we shouldn't need to worry about someone depending on internal implementation details.