Commit Graph

15281 Commits

Author SHA1 Message Date
Tim van der Meij
f287c5f817
Merge pull request #14411 from Snuffleupagus/getAllPageDicts-async
Convert `Catalog.getAllPageDicts` to an `async` method
2022-01-01 14:43:20 +01:00
Jonas Jenwald
b0e774d9c5 Convert Catalog.getAllPageDicts to an async method
The patch in PR 14335 *essentially* re-introduced the old code from before PR 3848, however looking at this code a bit closer it should be possible to simplify it by making the method asynchronous.

While this method is currently only used as a *fallback* in corrupt documents, the way that `MissingDataException`s are handled is less than ideal. Note that if a `MissingDataException` is thrown, we're forced to re-parse the *entire* /Pages tree[1].
With this method now being asynchronous, we're able to handle fetching of References in a *much* easier/nicer way than before without having to throw `MissingDataException`s and re-parse anything.
These changes also let us simplify the call-site slightly, by calling the method *directly* instead of using the `PDFManager`-instance (since again it will no longer throw `MissingDataException`s).

Furthermore, this patch contains the following other changes:
 - Reduce unnecessary duplication in the various `catch` handlers throughout the method, by simply moving the `XRefEntryException` handling into the `addPageError` helper function instead.
 - Move the "circular references"-check to occur slightly earlier, since there's obviously no point in asynchronously fetching data just to then throw an Error *immediately* afterwards.

---
[1] Imagine e.g. a thousand page document, where there's a `MissingDataException` thrown when fetching/parsing page 900.
2021-12-31 22:03:10 +01:00
Tim van der Meij
3d7bb6c38d
Merge pull request #14409 from Snuffleupagus/getPageIndex-better-caching
Improve caching for the `Catalog.getPageIndex` method (PR 13319 follow-up)
2021-12-31 19:19:14 +01:00
Jonas Jenwald
1491459dea Improve caching for the Catalog.getPageIndex method (PR 13319 follow-up)
This method is now being used a lot more, compared to when it's added, since it's now used together with scripting as part of the `PDFDocument.fieldObjects` parsing (called during viewer initialization).
For /Page Dictionaries that we've already parsed, the `pageIndex` corresponding to a particular Reference is already known and we're thus able to skip *all* parsing in the `Catalog.getPageIndex` method for those cases.
2021-12-29 20:29:14 +01:00
Jonas Jenwald
a20393e6e4 Update PDFDocument._getLinearizationPage to do the /Type-check correctly (PR 14400 follow-up)
I forgot about this in PR 14400, since we should obviously be consistent *and* given that the existing check is actually wrong; sorry about this!
2021-12-29 13:26:58 +01:00
Jonas Jenwald
b99927e1ee Improve the API unit-tests for scripting-related functionality
I happened to notice that we didn't have *any* unit-tests for either `getFieldObjects` or `getCalculationOrderIds`, on the `PDFDocumentProxy` class, which seems unfortunate since it's API functionality that we depend on in e.g. the viewer.
2021-12-29 12:57:32 +01:00
Tim van der Meij
e42d54e1b5
Merge pull request #14400 from Snuffleupagus/getPageDict-async
[api-minor] Convert `Catalog.getPageDict` to an asynchronous method
2021-12-28 19:40:34 +01:00
Tim van der Meij
01b25b2612
Merge pull request #14391 from KouWakai/annot-border-correct
Handle non-integer Annotation border widths correctly (issue 14203)
2021-12-28 19:28:32 +01:00
Tim van der Meij
07c32f0f4f
Merge pull request #14401 from Snuffleupagus/update-packages
Update packages and translations
2021-12-28 19:17:31 +01:00
Jonas Jenwald
ea55e8bf41 Update l10n files 2021-12-26 11:19:19 +01:00
Jonas Jenwald
69f14b1ee9 Update npm packages 2021-12-26 11:09:29 +01:00
Jonas Jenwald
b513c64d9d [api-minor] Convert Catalog.getPageDict to an asynchronous method
Besides converting `Catalog.getPageDict` to an `async` method, thus simplifying the code, this patch also allows us to pro-actively fix a existing issue.
Note how we're looking up References in such a way that `MissingDataException`s won't cause trouble, however it's *technically possible* that the entries (i.e. /Count, /Kids, and /Type) in a /Pages Dictionary could actually be indirect objects as well. In the existing code this could lead to *some*, or even all, pages failing to load/render as intended.
In practice that doesn't *appear* to happen in real-world PDF documents, but given all the weird things that PDF software do I'd prefer to fix this pro-actively (rather than waiting for a bug report).
With `Catalog.getPageDict` being `async` this is now really simple to address, however I didn't want to introduce a bunch more *unconditional* asynchronicity in this method if it could be avoided (since that could slow things down). Hence we'll *synchronously* lookup the *raw* data in a /Pages Dictionary, and only fallback to asynchronous data lookup when a Reference was encountered.

In addition to the above, this patch also makes the following notable changes:
 - Let `Catalog.getPageDict` *consistently* reject with the actual error, regardless of what data we're fetching. Previously we'd "swallow" the actual errors except when looking up Dictionary entries, which is inconsistent and thus seem unfortunate. As can be seen from the updated unit-tests this change is API-observable, hence why the patch is tagged `[api-minor]`.

 - Improve the consistency of the Dictionary /Type-checks in both the `Catalog.getPageDict` and `Catalog.getAllPageDicts` methods.
   In `Catalog.getPageDict` there's a fallback code-path where we're *incorrectly* checking the /Page Dictionary for a /Contents-entry, which is wrong since a /Page Dictionary doesn't need to have a /Contents-entry in order to be valid.
   For consistency the `Catalog.getAllPageDicts` method is also updated to handle errors in the /Type-lookup correctly.

 - Reduce the `PagesCountLimit.PAUSE_EAGER_PAGE_INIT` viewer constant, to further improve loading/rendering performance of the *second* page during initialization of very long documents; PR 14359 follow-up.
2021-12-25 15:22:48 +01:00
KouWakai
98158b67a3 Handle non-integer Annotation border widths correctly (issue 14203)
The existing code appears to be wrong, since according to the PDF specification the border width of an Annotation only has to be a number and not specifically an integer. Please see:
 - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=392
 - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2096210
 - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G6.1965562
2021-12-24 22:10:19 +09:00
Jonas Jenwald
41dab8e7b6
Merge pull request #14388 from Snuffleupagus/bug-1746213
Unblock the "load" event in inactive windows/tabs (bug 1746213, PR 11646 follow-up)
2021-12-21 10:04:10 +01:00
Tim van der Meij
c4d344b52a
Merge pull request #14389 from timvandermeij/bump
Bump versions in `pdfjs.config`
2021-12-19 17:02:51 +01:00
Tim van der Meij
08f35f9f7c
Bump versions in pdfjs.config 2021-12-19 16:58:17 +01:00
Jonas Jenwald
dc4a6e94f3 Unblock the "load" event when the windows/tabs becomes inactive (bug 1746213)
*This addresses the following case missing from the previous patch:*
The viewer is loaded in an *active* window/tab, and enough time is allowed to pass in order to allow rendering to start. However, if the user then switches to another tab (or another program) *before* rendering has finished, the "load" event also needs to be unblocked.
2021-12-19 10:40:31 +01:00
Jonas Jenwald
472bbf4592 Unblock the "load" event in inactive windows/tabs (bug 1746213, PR 11646 follow-up)
Given that `requestAnimationFrame` is being used, see the `src/diplay/api.js` file, an inactive window/tab means that rendering will not run and we'll thus not fetch all pages. The latter is a requirement for the "load" event to be unblocked, in the MOZCENTRAL-version of, the default viewer.

This patch is a *partial* solution, since it only addresses the following situations:
 - A *background*  tab (containing the viewer) is reloaded, e.g. via the tab-bar context menu.
 - The viewer is loaded in a active tab, but the user switches away from it (or switches to another program window) *before* rendering has started.
2021-12-19 10:39:48 +01:00
Tim van der Meij
a2ae56f394
Merge pull request #14387 from timvandermeij/test-utils
Modernize the test utilities
2021-12-18 16:40:56 +01:00
Tim van der Meij
71326c6a1c
Enable the no-var linting rule in test/testutils.js
This is done automatically with the `gulp lint --fix` command with the
only exception of the `parts` variable.
2021-12-18 15:58:47 +01:00
Tim van der Meij
a24982a733
Drop custom confirmation logic in favor of using the built-in Node.js readline module
Most likely this code predates our use of Node.js, but in Node.js asking
for user confirmation is a solved problem, so we can remove the custom
logic we have for this, which overall makes things much simpler.
2021-12-18 15:52:04 +01:00
Tim van der Meij
869b396011
Merge pull request #14373 from Snuffleupagus/update-TypeScript
[api-minor] Fix broken/missing JSDocs and `typedef`s, to allow updating TypeScript to the latest version (issue 14342)
2021-12-18 13:35:54 +01:00
Tim van der Meij
afa43d3af0
Merge pull request #14386 from Snuffleupagus/issue-14385
Ignore *negative* /FitH parameters in the viewer (issue 14385)
2021-12-18 13:24:42 +01:00
Jonas Jenwald
6b75e46d11 Ignore *negative* /FitH parameters in the viewer (issue 14385)
This provides a work-around for badly generated PDF documents that contain *negative* /FitH parameters (in the referenced issue the value `-32768` is used).
2021-12-18 11:35:21 +01:00
Jonas Jenwald
e19020c028 Move the Default{...}LayerFactory into a new web/default_factory.js file
This patch, first of all, removes circular dependencies in the TypeScript definitions. Secondly, it also moves `RenderingStates` into `web/ui_utils.js` to break another type-dependency and directly use the `XfaLayerBuilder` during XFA-printing.
Finally, note that this patch *slightly* reduces the size of the default viewer (e.g. in the `MOZCENTRAL` build) by not having to bundle code which is completely unused.
2021-12-15 23:17:08 +01:00
Jonas Jenwald
e0dba504d2 Fix broken/missing JSDocs and typedefs, to allow updating TypeScript to the latest version (issue 14342)
This patch circumvents the issues seen when trying to update TypeScript to version `4.5`, by "simply" fixing the broken/missing JSDocs and `typedef`s such that `gulp typestest` now passes.
As always, given that I don't really know anything about TypeScript, I cannot tell if this is a "correct" and/or proper way of doing things; we'll need TypeScript users to help out with testing!

*Please note:* I'm sorry about the size of this patch, but given how intertwined all of this unfortunately is it just didn't seem easy to split this into smaller parts.
However, one good thing about this TypeScript update is that it helped uncover a number of pre-existing bugs in our JSDocs comments.
2021-12-15 23:14:25 +01:00
Tim van der Meij
d3e1d7090a
Merge pull request #14370 from Snuffleupagus/getPageDict-sync-Pages
Slightly reduce asynchronicity in the `Catalog.getPageDict` method (PR 14338 follow-up)
2021-12-15 19:40:39 +01:00
Tim van der Meij
274989ab56
Merge pull request #14372 from Snuffleupagus/BaseViewer-Lang
Move the /Lang handling into the `BaseViewer` (PR 14114 follow-up)
2021-12-15 19:37:50 +01:00
Tim van der Meij
21aea0b1a2
Merge pull request #14380 from Snuffleupagus/event-utils
Move the `EventBus`, and related functionality, into its own file
2021-12-15 19:34:43 +01:00
Jonas Jenwald
0a19ef6864 Move the EventBus, and related functionality, into its own file
The size of the `web/ui_utils.js` file has increased over time, as more code has been added to (or moved into) that file. To reduce its size slightly, this patch moves the event-related functionality into a separate file.
2021-12-15 17:18:57 +01:00
Jonas Jenwald
760f765e56 Move the /Lang handling into the BaseViewer (PR 14114 follow-up)
In PR 14114 this was only added to the default viewer, which means that in the viewer components the user would need to *manually* implement /Lang handling. This was (obviously) a bad choice, since the viewer components already support e.g. structTrees by default; sorry about overlooking this!

To avoid having to make *two* `getMetadata` API-calls[1] very early during initialization, in the default viewer, the API will now cache its result. This will also come in handy elsewhere in the default viewer, e.g. by reducing parsing when opening the "document properties" dialog.

---
[1] This not only includes a round-trip to the worker-thread, but also having to re-parse the /Metadata-entry when it exists.
2021-12-14 13:19:05 +01:00
Jonas Jenwald
a425c9cfa5
Merge pull request #14368 from timvandermeij/puppeteer
Consistently use string arguments for page.waitForFunction calls and upgrade to Puppeteer 13.0.0
2021-12-14 10:36:06 +01:00
Jonas Jenwald
fa51fd9428 Slightly reduce asynchronicity in the Catalog.getPageDict method (PR 14338 follow-up)
After the changes in PR 14338, specifically in the `XRef.parse`-method, the /Pages-entry will now always have been fetched/validated when the `Catalog`-instance is created.
Hence we can directly access the /Pages-entry in `Catalog.getPageDict` and thus avoid *one* asynchronous data-lookup per page in the document. (In practice this is unlikely to show up in e.g. benchmarks, but it really cannot hurt.)

Finally, make sure that the `getPageDict`/`getAllPageDicts`-methods track the /Pages-tree reference correctly to prevent circular references in corrupt documents.
2021-12-13 21:18:06 +01:00
Tim van der Meij
da2b3dd3be
Upgrade to Puppeteer 13.0.0 2021-12-12 19:52:11 +01:00
Tim van der Meij
1bc6b846b6
Consistently use string arguments for page.waitForFunction calls
We use string arguments in all other places, so these two places are a
bit inconsistent in that sense. Moreover, it's just one argument now,
which makes it a bit easier to read and see what it does because we
don't have to pass the always-empty options argument anymore. Finally,
doing it like this ensures it works in all Puppeteer versions given
https://github.com/puppeteer/puppeteer/issues/7836.
2021-12-12 19:45:34 +01:00
Tim van der Meij
e638a84afe
Merge pull request #14367 from timvandermeij/integration-tests
Disable failing print actions integration test in Firefox
2021-12-12 16:20:34 +01:00
Tim van der Meij
2643e6a823
Disable failing print actions integration test in Firefox
Once the upstream bug is fixed it can be enabled again because it's
causing way too much noise now. This is tracked in issue #14293. Note
that I deliberately added a new block so we can easily remove it later
on and because the other block is about another bug.
2021-12-12 16:10:50 +01:00
Tim van der Meij
d47b6735b4
Merge pull request #14364 from Snuffleupagus/BaseViewer-conditional-getPermissions
Only call `PDFDocumentProxy.getPermissions`, in the viewer, when `pdfjs.enablePermissions` is set (PR 14362 follow-up)
2021-12-12 14:00:04 +01:00
Jonas Jenwald
63af15eb8f Only call PDFDocumentProxy.getPermissions, in the viewer, when pdfjs.enablePermissions is set (PR 14362 follow-up)
By making this API-call *unconditionally*, we introduce a (slight) delay in the initialization of *all* documents.
That seems quite unfortunate, since `pdfjs.enablePermissions` is off by default, and it thus seem better only do the API-call when actually needed; sorry about this!
2021-12-11 20:46:19 +01:00
Tim van der Meij
6d8d37e93d
Merge pull request #14362 from Snuffleupagus/issue-14356
Support disabling of form editing when `pdfjs.enablePermissions` is set (issue 14356)
2021-12-11 20:02:23 +01:00
Tim van der Meij
fefb9ed5b4
Merge pull request #14360 from timvandermeij/updates
Update packages and translations
2021-12-11 19:51:38 +01:00
Tim van der Meij
c5847141b4
Update translations to the most recent versions 2021-12-11 19:44:52 +01:00
Tim van der Meij
2757000bb2
Fix some dependency vulnerabilities reported by npm audit
This is done automatically using the `npm audit fix` command.
2021-12-11 19:44:52 +01:00
Tim van der Meij
d3d8141372
Update packages to the most recent versions 2021-12-11 19:44:48 +01:00
Jonas Jenwald
b1d3e7f121 Support disabling of form editing when pdfjs.enablePermissions is set (issue 14356)
For encrypted PDF documents without the required permissions set, this patch adds support for disabling of form editing. However, please note that it also requires that the `pdfjs.enablePermissions` preference is set to `true`[1] (since PDF document permissions could be seen as user hostile).

Based on https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G6.1942134, this condition hopefully makes sense.

---
[1] Either manually with `about:config`, or using e.g. a [Group Policy](https://github.com/mozilla/policy-templates).
2021-12-11 18:26:13 +01:00
Jonas Jenwald
b03281de18 Move the permissions handling into the BaseViewer (PR 11789 follow-up)
Besides making the permissions-functionality directly available in the viewer-components, these changes are also necessary for the next patch.
2021-12-11 17:13:41 +01:00
Jonas Jenwald
d856ed9395
Merge pull request #14361 from timvandermeij/nodejs
Upgrade Node.js to version 16 in the CI workflow
2021-12-11 15:58:00 +01:00
Tim van der Meij
4269148d3d
Upgrade Node.js to version 16 in the CI workflow
Version 14 that we used before is now in maintenance mode, so we should
upgrade to the most recent LTS version.

Moreover, use the most recent `setup-node` workflow version and syntax;
see https://github.com/actions/setup-node#usage.
2021-12-11 15:50:23 +01:00
Tim van der Meij
3a8318aa1c
Merge pull request #14359 from Snuffleupagus/PAUSE_EAGER_PAGE_INIT
Avoid overloading the worker-thread during eager page initialization in the viewer (PR 11263 follow-up)
2021-12-11 13:28:35 +01:00
Tim van der Meij
a6dd39b645
Merge pull request #14358 from Snuffleupagus/checkLastPage-improvements
Improve `PDFDocument.checkLastPage`/`Catalog.getAllPageDicts` for documents with corrupt XRef tables (PR 14311, 14335 follow-up)
2021-12-11 13:07:54 +01:00