12690 Commits

Author SHA1 Message Date
Tim van der Meij
6dfa10fe97
Merge pull request #11676 from Snuffleupagus/BaseViewer-reset-PromiseCapability
Slightly improve the `BaseViewer.{firstPagePromise, onePageRendered, pagesPromise}` functionality
2020-03-09 22:56:46 +01:00
Jonas Jenwald
3adbba55b2 Limit the number of warning messages printed by any one Lexer.getHexString invocation
*This patch fixes something that's annoyed me every now and then over the years, when debugging/fixing corrupt PDF documents.*

For corrupt PDF documents where `Lexer.getHexString` encounters invalid characters, there's very rarely just a handful of them. In practice it's not uncommon for there to be many hundreds, or even many thousands, of invalid hex characters.
Not only is the resulting console warning spam utterly useless in these cases, there's often enough of it that performance may even suffer; hence this patch, which limits the number of messages that any one `Lexer.getHexString` invocation may print.
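For illustration only, the idea of capping warnings per invocation could be sketched roughly as follows. This is not the code from the patch: the function name `readHexDigits`, the constant `MAX_WARNINGS`, and its value are assumptions made for this example.

```javascript
// Hypothetical cap, chosen for illustration only.
const MAX_WARNINGS = 5;

// Simplified stand-in for a hex string reader: collects the numeric value of
// every valid hex digit up to the closing ">" and skips invalid characters.
function readHexDigits(input, warn = console.warn) {
  const digits = [];
  let invalidCount = 0;

  for (const ch of input) {
    if (ch === ">") {
      break; // End of the hex string.
    }
    if (ch === " " || ch === "\n" || ch === "\r" || ch === "\t") {
      continue; // Whitespace is allowed and ignored.
    }
    if (!/^[0-9A-Fa-f]$/.test(ch)) {
      invalidCount++;
      // Only the first few invalid characters produce a warning; the rest
      // are still skipped, just silently.
      if (invalidCount <= MAX_WARNINGS) {
        warn(`Ignoring invalid hex character: "${ch}"`);
      }
      continue;
    }
    digits.push(parseInt(ch, 16));
  }
  return { digits, invalidCount };
}
```

The counter is local to the call, so every invocation gets its own warning budget, matching the "any one invocation" wording above.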
2020-03-09 13:34:53 +01:00
Jonas Jenwald
65e6ea2cb2 Prevent lookup errors in PartialEvaluator.hasBlendModes from breaking all parsing/rendering of a page (issue 11678)
The PDF document in question is *corrupt*, since it contains an XObject with a truncated dictionary and where the stream contents start without a "stream" operator.
2020-03-09 12:00:12 +01:00
Jonas Jenwald
3eb4c1940d Initialize the textLayerFactory once in BaseViewer.setDocument, rather than repeating it for every page
For reasons that I don't even pretend to understand, the `textLayerFactory` property is determined for *every single* page in the PDF document.
Given that the `TextLayerMode` should be consistent for *all* pages in a document, we obviously could/should define `textLayerFactory` just once instead.
2020-03-08 09:23:45 +01:00
Jonas Jenwald
1fac29d184 Slightly improve the BaseViewer.{firstPagePromise, onePageRendered, pagesPromise} functionality
There's a couple of issues with this functionality:
 - The respective `PromiseCapability` instances are not being reset, in `BaseViewer._resetView`, when the document is closed which is inconsistent with all other state.
 - While the default viewer depends on these promises, and they thus ought to be considered part of e.g. the `PDFViewer` API-surface, they're not really defined in a particularly user-visible way (being that they're attached to the `BaseViewer` instance *inline* in `BaseViewer.setDocument`).
 - There's some internal `BaseViewer` state, e.g. `BaseViewer._pageViewsReady`, which is tracked manually and could instead be tracked indirectly via the relevant `PromiseCapability`, thus reducing the need to track state *twice* since that's always best to avoid.

*Please note:* In the existing implementation, these promises are not defined *until* the `BaseViewer.setDocument` method has been called.
While it would've been simple to lift that restriction in this patch, I'm purposely choosing *not* to do so since this ensures that any Promise handlers added inside of `BaseViewer.setDocument` are always invoked *before* any external ones (and keeping that behaviour seems generally reasonable).
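A rough sketch of the pattern described above, under stated assumptions: `createPromiseCapability` here is a local stand-in helper, `ViewerSketch` is a hypothetical class, and the private field names are invented for the example. It is not the actual `BaseViewer` code.

```javascript
// Local stand-in for a promise capability: a promise plus its resolvers.
function createPromiseCapability() {
  const capability = {};
  capability.promise = new Promise((resolve, reject) => {
    capability.resolve = resolve;
    capability.reject = reject;
  });
  return capability;
}

class ViewerSketch {
  constructor() {
    this.pdfDocument = null;
    this._resetView();
  }

  // All per-document state, including the capabilities, is reset in one place.
  _resetView() {
    this._firstPageCapability = createPromiseCapability();
    this._pagesCapability = createPromiseCapability();
  }

  setDocument(pdfDocument) {
    if (this.pdfDocument) {
      this._resetView(); // Closing the old document resets the promises too.
    }
    this.pdfDocument = pdfDocument;
  }

  // Assumption: the getters expose nothing until a document has been set,
  // mirroring the restriction mentioned in the note above.
  get firstPagePromise() {
    return this.pdfDocument ? this._firstPageCapability.promise : null;
  }

  get pagesPromise() {
    return this.pdfDocument ? this._pagesCapability.promise : null;
  }
}
```

Defining the promises as getters makes them visible on the class itself, rather than being attached inline when a document is set.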
2020-03-08 09:23:44 +01:00
Tim van der Meij
7b07b88e71
Merge pull request #11675 from ji-1/master
Fix typo in comment
2020-03-07 23:07:57 +01:00
Jiwon Jeon
df22dfb531 Fix typo 2020-03-07 12:37:22 +09:00
Tim van der Meij
1a97c142b3
Merge pull request #11523 from Snuffleupagus/issue-10880
Add a heuristic, in `src/core/jpg.js`, to handle JPEG images with a wildly incorrect SOF (Start of Frame) `scanLines` parameter (issue 10880)
2020-03-06 23:03:09 +01:00
Tim van der Meij
001b0b270b
Merge pull request #11667 from Snuffleupagus/move-dispatchDOMEvent
Add a deprecation warning for the `eventBusDispatchToDOM` option/preference (PR 11631 follow-up)
2020-03-06 22:55:17 +01:00
Tim van der Meij
1bb25f5cb8
Merge pull request #11673 from Snuffleupagus/FontLoader-bind-more-await
Update the `FontLoader.bind` method to avoid explicitly returning `undefined`
2020-03-06 22:51:39 +01:00
Tim van der Meij
5d566b9dbe
Merge pull request #11672 from Snuffleupagus/Dict-set-value-assert
Slightly simplify the lookup of data in `Dict.{get, getAsync, has}`
2020-03-06 22:47:14 +01:00
Tim van der Meij
977049cd0c
Merge pull request #11671 from Snuffleupagus/update-packages
Update packages and translations
2020-03-06 22:42:16 +01:00
Jonas Jenwald
7d4be08dad Update the FontLoader.bind method to avoid explicitly returning undefined
The only reason for the `return undefined;` lines was to appease the ESLint `consistent-return` rule, but that's not actually necessary if you make use of the fact that the method is `async` and that we can thus await the Promise rather than returning it.
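As a hedged illustration of the general technique (the function names, the `attached` flag, and the `load` callback are all hypothetical, not the real `FontLoader` code), compare the two shapes:

```javascript
// Before: the final `return load(font);` returns a value, so the ESLint
// `consistent-return` rule requires the early exit to return one as well.
async function bindBefore(font, load) {
  if (font.attached) {
    return undefined;
  }
  font.attached = true;
  return load(font);
}

// After: awaiting the Promise means no return statement carries a value,
// so a bare `return;` is consistent with the rest of the function.
async function bindAfter(font, load) {
  if (font.attached) {
    return;
  }
  font.attached = true;
  await load(font);
}
```

Both variants still return a Promise to the caller, since the function is `async` either way.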
2020-03-06 17:45:24 +01:00
Jonas Jenwald
160cfc4084 Slightly simplify the lookup of data in Dict.{get, getAsync, has}
Note that `Dict.set` will only be called with values returned through `Parser.getObj`, and thus indirectly via `Lexer.getObj`. Since neither of those methods will ever return `undefined`, we can simply assert that that's the case when inserting data into the `Dict` and thus get rid of `in` checks when doing the data lookups.
In this case, since `Dict.set` is fairly hot, the patch utilizes an *inline check* and when necessary a direct call to `unreachable` to not affect performance of `gulp server/test` too much (rather than always just calling `assert`).

For very large and complex PDF files this will help performance *slightly*, since `Dict.{get, getAsync, has}` is called *a lot* during parsing in the worker.
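A simplified sketch of the approach described above, assuming a plain object as backing store; `DictSketch` and the local `unreachable` helper are stand-ins written for this example, not the actual `Dict` implementation:

```javascript
// Local stand-in for an `unreachable` helper: always throws.
function unreachable(reason) {
  throw new Error(reason);
}

class DictSketch {
  constructor() {
    this._map = Object.create(null);
  }

  set(key, value) {
    // Inline check: the helper is only called in the failure case, which
    // keeps the hot path down to a single comparison.
    if (value === undefined) {
      unreachable('DictSketch.set: The "value" cannot be undefined.');
    }
    this._map[key] = value;
  }

  // Since `undefined` can never be stored, lookups no longer need `in` checks.
  get(key) {
    return this._map[key];
  }

  has(key) {
    return this._map[key] !== undefined;
  }
}
```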

This patch was tested using the PDF file from issue 2618, i.e. http://bugzilla-attachments.gnome.org/attachment.cgi?id=226471, with the following manifest file:
```
[
    {  "id": "issue2618",
       "file": "../web/pdfs/issue2618.pdf",
       "md5": "",
       "rounds": 250,
       "type": "eq"
    }
]
```

which gave the following results when comparing this patch against the `master` branch:
```
-- Grouped By browser, stat --
browser | stat         | Count | Baseline(ms) | Current(ms) | +/- |    %  | Result(P<.05)
------- | ------------ | ----- | ------------ | ----------- | --- | ----- | -------------
Firefox | Overall      |   250 |         2838 |        2820 | -18 | -0.65 |        faster
Firefox | Page Request |   250 |            1 |           2 |   0 | 11.92 |        slower
Firefox | Rendering    |   250 |         2837 |        2818 | -19 | -0.65 |        faster
```
2020-03-06 14:12:14 +01:00
Jonas Jenwald
af8c371fa9 Update l10n files 2020-03-06 13:08:15 +01:00
Jonas Jenwald
824e5c8156 Update npm packages 2020-03-06 13:06:21 +01:00
Jonas Jenwald
01fb309a2a [api-minor] Add more general OpenAction support (PR 10334 follow-up, issue 11642)
This patch deprecates the existing `getOpenActionDestination` API method, in favor of a better and more general `getOpenAction` method instead. (For now JavaScript actions, related to printing, are still handled as before.)

By clearly separating "regular" Print actions from the JavaScript handling, it's thus possible to get rid of the somewhat annoying and strictly incorrect warning when the viewer loads.
2020-03-06 13:03:00 +01:00
Jonas Jenwald
0fb44f5dd6 Move the dispatchDOMEvent functionality out from the EventBus and add a deprecation warning for the eventBusDispatchToDOM option/preference (PR 11631 follow-up)
It occurred to me that similar to the `getGlobalEventBus` function, it's probably a good idea to *also* notify users of the fact that `eventBusDispatchToDOM` is now deprecated.

Rather than depending on the re-dispatching of internal events to the DOM, the default viewer can instead be used in e.g. the following way:
```javascript
document.addEventListener("webviewerloaded", function() {
  PDFViewerApplication.initializedPromise.then(function() {
    // The viewer has now been initialized, and its properties can be accessed.

    PDFViewerApplication.eventBus.on("pagerendered", function(event) {
      console.log("Has rendered page number: " + event.pageNumber);
    });
  });
});
```
2020-03-05 13:27:00 +01:00
Jonas Jenwald
3ed1bc917d Update the waitOnEventOrTimeout helper function to handle internal events consistently with the rest of the viewer components (PR 11631 follow-up)
I overlooked this in PR 11631; sorry about that!

Also, ensure that `EventBus` instances *always* track "external" events using a boolean regardless of the actual option value.
2020-03-05 12:04:19 +01:00
Tim van der Meij
25693c6b6d
Merge pull request #11664 from Snuffleupagus/bug-1619595
Prevent the zoom dropdown icon from being overridden when the element is `:active` (bug 1619595)
2020-03-04 23:12:33 +01:00
Jonas Jenwald
ecbe0076fc Prevent the zoom dropdown icon from being overridden when the element is :active (bug 1619595)
This changes the dropdown icon from being set using the `background` CSS property, to being set with `::after` which is *similar* to all the other toolbar button icons (which use `::before`).
Also tweaks the dropdown `background-color` on `:hover` slightly, since the other changes made it too light.
2020-03-04 16:16:41 +01:00
Tim van der Meij
c95b9b1e17
Merge pull request #11653 from Snuffleupagus/ensureStateFont
Ensure that there's always a setFont (Tf) operator before text rendering operators (issue 11651)
2020-03-03 23:33:13 +01:00
Tim van der Meij
b56e058b4b
Merge pull request #11660 from janpe2/type1notdef
Fix Type1 font parsing when .notdef is not at index zero
2020-03-03 23:26:24 +01:00
Jani Pehkonen
71e7686950 Fix Type1 font parsing when .notdef is not at index zero
Fixes #11477
The PDF draws many space characters but the embedded fonts don't have a glyph named `space`, so `.notdef` should be drawn instead. PDF.js assumed that Type1 fonts define `.notdef` as the first glyph (index 0). However, now the fonts have the glyph `A` at index 0 and `.notdef` is the last one, so `A` appears where spaces are expected.

Because the rest of the font machinery in `core/fonts.js` assumes `.notdef` is at index zero, it's easiest to modify `core/type1_parser.js` so that it "repairs" fonts and makes sure `.notdef` is at index 0.
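The "repair" step can be pictured with a small sketch. It assumes the parsed glyphs are held in an array of objects with a `glyphName` property; that shape, and the function name `ensureNotdefFirst`, are assumptions for illustration and may differ from what `core/type1_parser.js` actually does.

```javascript
// Returns a copy of the charstrings list in which `.notdef`, if present,
// has been moved to index 0. The relative order of other glyphs is kept.
function ensureNotdefFirst(charstrings) {
  const index = charstrings.findIndex((cs) => cs.glyphName === ".notdef");
  const result = charstrings.slice();
  if (index > 0) {
    const [notdef] = result.splice(index, 1);
    result.unshift(notdef);
  }
  return result;
}
```

For a font whose glyph order is `A`, `B`, `.notdef`, the result is `.notdef`, `A`, `B`, so code that draws glyph 0 for a missing `space` no longer draws `A`.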
2020-03-03 21:55:51 +02:00
Jonas Jenwald
65e514e063 Ensure that there's always a setFont (Tf) operator before text rendering operators (issue 11651)
The PDF document in question is *corrupt*, since it contains multiple instances of incorrect operators.
We obviously don't want to slow down parsing of *all* documents (since most are valid), just to accommodate a particular bad PDF generator, hence the reason for the inline check before calling the `ensureStateFont` method.
2020-03-03 10:05:18 +01:00
Tim van der Meij
52749d1f0d
Merge pull request #11631 from Snuffleupagus/getGlobalEventBus-deprecate
[api-minor] Deprecate `getGlobalEventBus` and update the "viewer components" examples accordingly
2020-03-02 23:30:07 +01:00
Tim van der Meij
8ea8fa5958
Merge pull request #11654 from Snuffleupagus/BaseFontLoader-isFontLoadingAPISupported
Simplify the `BaseFontLoader.isFontLoadingAPISupported` getter
2020-03-02 23:19:53 +01:00
Jonas Jenwald
1ad65cf405 Simplify the BaseFontLoader.isFontLoadingAPISupported getter
It's no longer necessary to special-case this getter in the `GenericFontLoader` case, since the GENERIC build hasn't been using `mozPrintCallback` for years now (furthermore Firefox 63 is really old as well).
2020-03-02 23:14:48 +01:00
Takashi Tamura
d6b67cd28a Fix the horizontal scaling of texts with SVG backend. #10988 2020-03-02 14:54:41 +09:00
Tim van der Meij
d60c1f68b7
Merge pull request #11556 from tamuratak/vertical_h_scaling
Fix the vertical writing mode with horizontal scaling. #11555.
2020-03-01 18:33:14 +01:00
Takashi Tamura
d8c9f119b0 Fix the vertical writing mode with horizontal scaling. #11555.
It is not valid to multiply textHScale when the writing mode is vertical.

See 9.4.4 Text Space Details, https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G8.1694762
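A minimal sketch of the displacement formulas from that section of the specification, ignoring the `TJ` position adjustment for brevity. The function and parameter names are made up for this example, and this is not the code that was changed in the patch.

```javascript
// Glyph displacement in text space, following PDF 32000-1:2008, 9.4.4:
//   tx = (w0 * Tfs + Tc + Tw) * Th   (horizontal writing mode)
//   ty =  w1 * Tfs + Tc + Tw         (vertical writing mode)
// Note that the horizontal scaling Th only appears in the tx formula.
function glyphDisplacement({
  w0 = 0, // Horizontal glyph displacement, in text space units.
  w1 = 0, // Vertical glyph displacement, in text space units.
  fontSize, // Tfs
  charSpacing = 0, // Tc
  wordSpacing = 0, // Tw
  textHScale = 1, // Th
  vertical = false,
}) {
  if (vertical) {
    return { tx: 0, ty: w1 * fontSize + charSpacing + wordSpacing };
  }
  return {
    tx: (w0 * fontSize + charSpacing + wordSpacing) * textHScale,
    ty: 0,
  };
}
```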
2020-02-29 07:48:29 +09:00
Brendan Dahl
e6d8899827
Merge pull request #11590 from brendandahl/riot
Update links from IRC to Riot.
2020-02-28 09:09:29 -08:00
Brendan Dahl
594a8dfac4 Update links from IRC to Matrix.
Mozilla's IRC is going away and we're migrating to Matrix/Riot.
2020-02-27 16:26:17 -08:00
Tim van der Meij
e1586016c5
Merge pull request #11577 from Snuffleupagus/Pages-tree-refs
Prevent circular references in the /Pages tree
2020-02-27 23:36:11 +01:00
Tim van der Meij
175a6fc64c
Merge pull request #11608 from Snuffleupagus/ignoreDestinationZoom
Add a `ignoreDestinationZoom` option/preference to allow users to preserve the current zoom level when navigating to internal destinations (issue 5064, 11606)
2020-02-27 23:23:13 +01:00
Jonas Jenwald
4a1b056c82 Re-factor the EventBus to allow servicing of "external" event listeners *after* the viewer components have updated
Since the goal has always been, essentially since the `EventBus` abstraction was added, to remove all dispatching of DOM events[1] from the viewer components this patch tries to address one thing that came up when updating the examples:
The DOM events are always dispatched last, and it's thus guaranteed that all internal event listeners have been invoked first.
However, there are no such guarantees with the general `EventBus` functionality, and the order in which event listeners are invoked is *not* specified. With the promotion of the `EventBus` in the examples, over DOM events, it seems like a good idea to at least *try* to keep this ordering invariant[2] intact.

Obviously this won't prevent anyone from manually calling the new *internal* viewer component methods on the `EventBus`, but hopefully that won't be too common since any existing third-party code would obviously use the `on`/`off` methods and since all of the examples show the *correct* usage (which should be similarly documented on the "Third party viewer usage" Wiki-page).
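To make the ordering idea concrete, here is a small sketch. `EventBusSketch` and its `_on` method are hypothetical names for this example; the real `EventBus` may differ in naming and details.

```javascript
class EventBusSketch {
  constructor() {
    this._listeners = Object.create(null);
  }

  // Public API: anything registered here is flagged as "external".
  on(eventName, listener) {
    this._on(eventName, listener, { external: true });
  }

  // Hypothetical internal method, meant for the viewer components only.
  _on(eventName, listener, options = null) {
    if (!this._listeners[eventName]) {
      this._listeners[eventName] = [];
    }
    this._listeners[eventName].push({
      listener,
      external: !!(options && options.external),
    });
  }

  dispatch(eventName, data) {
    const registered = this._listeners[eventName];
    if (!registered) {
      return;
    }
    const external = [];
    // Iterate over a copy, so that listeners can be added during dispatch.
    for (const entry of registered.slice()) {
      if (entry.external) {
        external.push(entry.listener); // Defer until internal ones are done.
      } else {
        entry.listener(data);
      }
    }
    for (const listener of external) {
      listener(data);
    }
  }
}
```

With this shape, a listener added through `on` runs after every internal listener for the same event, regardless of registration order.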

---
[1] Looking at the various Firefox-tests, I'm not sure that it'll be possible to (easily) re-write all of them to not rely on DOM events (since getting access to `PDFViewerApplication` might be generally difficult/messy depending on scopes).
In any case, even if technically feasible, it would most likely add *a lot* of complication that may not be desirable in the various Firefox-tests. All-in-all, I'd be fine with keeping the DOM events only for the `MOZCENTRAL` target and gated on `Cu.isInAutomation` (or similar) rather than a preference.

[2] I wouldn't expect any *real* bugs in a custom implementation, simply based on event ordering, but it nonetheless seems like a good idea if any "external" events are still handled last.
2020-02-27 19:38:13 +01:00
Jonas Jenwald
9a437a158f [api-minor] Deprecate getGlobalEventBus and update the "viewer components" examples accordingly
To avoid outright breaking third-party usages of the "viewer components" the `getGlobalEventBus` functionality is left intact, but a deprecation message is printed if the function is invoked.

The various examples are updated to *explicitly* initialize an `EventBus` instance, and provide that when initializing the relevant viewer components.
2020-02-27 14:44:48 +01:00
Jonas Jenwald
03f5dd2cf2 Add a ignoreDestinationZoom option/preference to allow users to preserve the current zoom level when navigating to internal destinations (issue 5064, 11606) 2020-02-27 08:42:50 +01:00
Tim van der Meij
965ebe63fd
Merge pull request #11540 from tamuratak/charspacing
Fix text spacing with vertical fonts. #7687 and #11526.
2020-02-26 22:26:27 +01:00
Tim van der Meij
bde78cda33
Merge pull request #11630 from Snuffleupagus/README-gitpod
Attempt to clarify, and improve the wording of, the Gitpod section of the README
2020-02-26 22:09:28 +01:00
Jonas Jenwald
ac6bb2e103 Attempt to clarify, and improve the wording of, the Gitpod section of the README 2020-02-26 13:50:22 +01:00
Tim van der Meij
30e0f028b5
Merge pull request #11625 from Snuffleupagus/issue-11451
Use the same non-embedded Wingdings fallback for fonts named "Wingdings-Regular" too (PR 5463 follow-up, issue 11451)
2020-02-25 23:18:19 +01:00
Jonas Jenwald
c55d30a715 Use the same non-embedded Wingdings fallback for fonts named "Wingdings-Regular" too (PR 5463 follow-up, issue 11451)
This patch extends the existing heuristics, which are really the best that we can do in general for these kinds of non-embedded *and* non-standard fonts.

Furthermore, this patch also tries to improve the copy-and-paste behaviour for non-embedded Wingdings fonts by also using the `ZapfDingbatsEncoding` in this case.

*Note:* I'm not sure that adding additional tests for Wingdings fonts matters that much, given how limited our "support" for them really is.
2020-02-24 17:40:06 +01:00
Tim van der Meij
dd893d59d9
Merge pull request #11623 from Snuffleupagus/eslint-disallow-new-primitives
Use the ESLint `no-restricted-syntax` rule to prevent direct usage of `new Cmd()`/`new Name()`/`new Ref()`
2020-02-22 21:32:04 +01:00
Jonas Jenwald
bf09d79eea Use the ESLint no-restricted-syntax rule to prevent direct usage of new Cmd()/new Name()/new Ref()
Given that all of these primitives implement caching, to avoid unnecessarily duplicating those objects *a lot* during parsing, it would thus be good to actually enforce usage of `Cmd.get()`/`Name.get()`/`Ref.get()` in the code-base.
Luckily it turns out that there's an ESLint rule, which is fairly easy to use, that can be used to disallow arbitrary JavaScript syntax.

Please find additional details about the ESLint rule at https://eslint.org/docs/rules/no-restricted-syntax
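For readers unfamiliar with the rule, a configuration along these lines would flag the three constructors. This is a sketch only: the variable name and the message wording are made up for the example, and the selectors actually used in the repository may differ.

```javascript
// Sketch of an ESLint configuration fragment; `no-restricted-syntax` takes
// AST selectors, so `new Cmd()` can be matched as a `NewExpression` whose
// callee is the identifier `Cmd`.
const eslintConfigSketch = {
  rules: {
    "no-restricted-syntax": [
      "error",
      {
        selector: "NewExpression[callee.name='Cmd']",
        message: "Use `Cmd.get()` rather than `new Cmd()`.",
      },
      {
        selector: "NewExpression[callee.name='Name']",
        message: "Use `Name.get()` rather than `new Name()`.",
      },
      {
        selector: "NewExpression[callee.name='Ref']",
        message: "Use `Ref.get()` rather than `new Ref()`.",
      },
    ],
  },
};
```

The files that implement the primitives themselves would presumably need the rule disabled locally, since they have to call the constructors.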
2020-02-22 21:15:00 +01:00
Jonas Jenwald
c3c3b8cd81 Add a heuristic, in src/core/jpg.js, to handle JPEG images with a wildly incorrect SOF (Start of Frame) scanLines parameter (issue 10880)
*This whole patch feels somewhat arbitrary, and I'd be slightly worried about possibly breaking something else.*

To limit the impact of these changes, we only re-parse JPEG images using a reduced `scanLines` value if and only if: An unexpected EOI (End of Image) marker was encountered during decoding of Scan data *and* the "actual" `scanLines` value is at least one order of magnitude smaller than expected.
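The condition can be summarised as a small predicate. The function name, the parameter names, and the factor of 10 used for "one order of magnitude" are assumptions for this sketch, not values taken from `src/core/jpg.js`.

```javascript
// Returns true only when both conditions from the description hold:
//  1. an unexpected EOI marker was hit while decoding Scan data, and
//  2. the number of scan lines actually decoded is at least ten times
//     smaller than the value declared in the SOF marker.
function shouldReparseWithFewerScanLines({
  unexpectedEOI,
  declaredScanLines,
  decodedScanLines,
}) {
  return (
    Boolean(unexpectedEOI) &&
    decodedScanLines > 0 &&
    declaredScanLines >= 10 * decodedScanLines
  );
}
```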
2020-02-22 14:16:07 +01:00
Jonas Jenwald
5494f7d5bc Add basic validation of the scanLines parameter in JPEG images, before delegating decoding to the browser
In some cases PDF documents can contain JPEG images that the native browser decoder cannot handle, e.g. images with DNL (Define Number of Lines) markers or images where the SOF (Start of Frame) marker contains a wildly incorrect `scanLines` parameter.
Currently, for "simple" JPEG images, we're relying on native image decoding to *fail* before falling back to the implementation in `src/core/jpg.js`. In some cases, note e.g. issue 10880, the native image decoder doesn't outright fail and thus some images may not render.

In an attempt to improve the current situation, this patch adds additional validation of the JPEG image SOF data to force the use of `src/core/jpg.js` directly in cases where the native JPEG decoder cannot be trusted to do the right thing.
The only way to implement this is unfortunately to parse the *beginning* of the JPEG image data, looking for a SOF marker. To limit the impact of this extra parsing, the result is cached on the `JpegStream` instance and this code is only run for images which passed all of the pre-existing "can the JPEG image be natively rendered and/or decoded" checks.
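As a rough illustration of what "parsing the beginning of the JPEG data" involves, the sketch below walks the marker segments until it finds a SOF0, SOF1 or SOF2 marker and reads the 16-bit `scanLines` field. The function name is hypothetical, standalone markers without a length field are not handled, and the caching on the `JpegStream` instance is left out.

```javascript
// Returns the `scanLines` value from the first SOF0..SOF2 segment in `data`
// (a Uint8Array), or null if none is found before the Scan data starts.
function findScanLines(data) {
  // A JPEG image starts with the SOI marker (0xFFD8).
  if (data.length < 4 || data[0] !== 0xff || data[1] !== 0xd8) {
    return null;
  }
  let pos = 2;
  while (pos + 4 <= data.length) {
    if (data[pos] !== 0xff) {
      return null; // Not positioned at a marker; give up.
    }
    const marker = data[pos + 1];
    if (marker === 0xff) {
      pos++; // Skip fill bytes preceding the actual marker.
      continue;
    }
    if (marker === 0xda || marker === 0xd9) {
      return null; // SOS or EOI reached without finding a SOF marker.
    }
    // Segment length, big-endian, including the two length bytes.
    const length = (data[pos + 2] << 8) | data[pos + 3];
    if (marker >= 0xc0 && marker <= 0xc2) {
      // SOF layout: length (2), precision (1), scanLines (2), width (2), ...
      if (pos + 7 > data.length) {
        return null;
      }
      return (data[pos + 5] << 8) | data[pos + 6];
    }
    if (length < 2) {
      return null; // Corrupt segment length.
    }
    pos += 2 + length;
  }
  return null;
}
```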

---

*Slightly off-topic:* Working on this *really* makes me start questioning if native rendering/decoding of JPEG images is actually a good idea.
There's certain kinds of JPEG images not supported natively, and all of the validation which is now necessary isn't "free". At this point, in the `NativeImageDecoder`, we're having to check for certain properties in the image dictionary, parse the `ColorSpace`, and finally read the actual image data to find the SOF marker.
Furthermore, we cannot just send the image to the main-thread and be done in the "JpegStream" case, but we also need to wait for rendering to complete (or fail) before continuing with other parsing.
In the "JpegDecode" case we're even having to parse part of the image on the main-thread, which seems completely at odds with the principle of doing all heavy parsing in the Worker, and there's also a couple of potentially large (temporary) allocations/copies of TypedArray data involved as well.
2020-02-22 14:16:07 +01:00
Tim van der Meij
3472b671e7
Merge pull request #11621 from Snuffleupagus/update-packages
Update packages and translations
2020-02-21 22:37:24 +01:00
Tim van der Meij
7f1e15e088
Merge pull request #11620 from Snuffleupagus/RefSetCache-forEach-rm-thisArg
Remove the unused `thisArg` from `RefSetCache.forEach`
2020-02-21 22:35:22 +01:00
Jonas Jenwald
ed01158127 Update l10n files 2020-02-21 17:40:08 +01:00