pdf.js

Author	SHA1	Message	Date
Tim van der Meij	e42d54e1b5	Merge pull request #14400 from Snuffleupagus/getPageDict-async [api-minor] Convert `Catalog.getPageDict` to an asynchronous method	2021-12-28 19:40:34 +01:00
Tim van der Meij	01b25b2612	Merge pull request #14391 from KouWakai/annot-border-correct Handle non-integer Annotation border widths correctly (issue 14203)	2021-12-28 19:28:32 +01:00
Tim van der Meij	07c32f0f4f	Merge pull request #14401 from Snuffleupagus/update-packages Update packages and translations	2021-12-28 19:17:31 +01:00
Jonas Jenwald	ea55e8bf41	Update l10n files	2021-12-26 11:19:19 +01:00
Jonas Jenwald	69f14b1ee9	Update npm packages	2021-12-26 11:09:29 +01:00
Jonas Jenwald	b513c64d9d	[api-minor] Convert `Catalog.getPageDict` to an asynchronous method Besides converting `Catalog.getPageDict` to an `async` method, thus simplifying the code, this patch also allows us to pro-actively fix an existing issue. Note how we're looking up References in such a way that `MissingDataException`s won't cause trouble, however it's technically possible that the entries (i.e. /Count, /Kids, and /Type) in a /Pages Dictionary could actually be indirect objects as well. In the existing code this could lead to some, or even all, pages failing to load/render as intended. In practice that doesn't appear to happen in real-world PDF documents, but given all the weird things that PDF software does I'd prefer to fix this pro-actively (rather than waiting for a bug report). With `Catalog.getPageDict` being `async` this is now really simple to address, however I didn't want to introduce a bunch more unconditional asynchronicity in this method if it could be avoided (since that could slow things down). Hence we'll synchronously look up the raw data in a /Pages Dictionary, and only fall back to asynchronous data lookup when a Reference was encountered. In addition to the above, this patch also makes the following notable changes: - Let `Catalog.getPageDict` consistently reject with the actual error, regardless of what data we're fetching. Previously we'd "swallow" the actual errors except when looking up Dictionary entries, which is inconsistent and thus seems unfortunate. As can be seen from the updated unit-tests this change is API-observable, hence why the patch is tagged `[api-minor]`. - Improve the consistency of the Dictionary /Type-checks in both the `Catalog.getPageDict` and `Catalog.getAllPageDicts` methods. In `Catalog.getPageDict` there's a fallback code-path where we're incorrectly checking the /Page Dictionary for a /Contents-entry, which is wrong since a /Page Dictionary doesn't need to have a /Contents-entry in order to be valid. For consistency the `Catalog.getAllPageDicts` method is also updated to handle errors in the /Type-lookup correctly. - Reduce the `PagesCountLimit.PAUSE_EAGER_PAGE_INIT` viewer constant, to further improve loading/rendering performance of the second page during initialization of very long documents; PR 14359 follow-up.	2021-12-25 15:22:48 +01:00
KouWakai	98158b67a3	Handle non-integer Annotation border widths correctly (issue 14203) The existing code appears to be wrong, since according to the PDF specification the border width of an Annotation only has to be a number and not specifically an integer. Please see: - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=392 - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2096210 - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G6.1965562	2021-12-24 22:10:19 +09:00
Jonas Jenwald	41dab8e7b6	Merge pull request #14388 from Snuffleupagus/bug-1746213 Unblock the "load" event in inactive windows/tabs (bug 1746213, PR 11646 follow-up)	2021-12-21 10:04:10 +01:00
Tim van der Meij	c4d344b52a	Merge pull request #14389 from timvandermeij/bump Bump versions in `pdfjs.config`	2021-12-19 17:02:51 +01:00
Tim van der Meij	08f35f9f7c	Bump versions in `pdfjs.config`	2021-12-19 16:58:17 +01:00
Jonas Jenwald	dc4a6e94f3	Unblock the "load" event when the windows/tabs becomes inactive (bug 1746213) This addresses the following case missing from the previous patch: The viewer is loaded in an active window/tab, and enough time is allowed to pass in order to allow rendering to start. However, if the user then switches to another tab (or another program) before rendering has finished, the "load" event also needs to be unblocked.	2021-12-19 10:40:31 +01:00
Jonas Jenwald	472bbf4592	Unblock the "load" event in inactive windows/tabs (bug 1746213, PR 11646 follow-up) Given that `requestAnimationFrame` is being used, see the `src/display/api.js` file, an inactive window/tab means that rendering will not run and we'll thus not fetch all pages. The latter is a requirement for the "load" event to be unblocked in the MOZCENTRAL-version of the default viewer. This patch is a partial solution, since it only addresses the following situations: - A background tab (containing the viewer) is reloaded, e.g. via the tab-bar context menu. - The viewer is loaded in an active tab, but the user switches away from it (or switches to another program window) before rendering has started.	2021-12-19 10:39:48 +01:00
Tim van der Meij	a2ae56f394	Merge pull request #14387 from timvandermeij/test-utils Modernize the test utilities v2.12.313	2021-12-18 16:40:56 +01:00
Tim van der Meij	71326c6a1c	Enable the `no-var` linting rule in `test/testutils.js` This is done automatically with the `gulp lint --fix` command with the only exception of the `parts` variable.	2021-12-18 15:58:47 +01:00
Tim van der Meij	a24982a733	Drop custom confirmation logic in favor of using the built-in Node.js `readline` module Most likely this code predates our use of Node.js, but in Node.js asking for user confirmation is a solved problem, so we can remove the custom logic we have for this, which overall makes things much simpler.	2021-12-18 15:52:04 +01:00
Tim van der Meij	869b396011	Merge pull request #14373 from Snuffleupagus/update-TypeScript [api-minor] Fix broken/missing JSDocs and `typedef`s, to allow updating TypeScript to the latest version (issue 14342)	2021-12-18 13:35:54 +01:00
Tim van der Meij	afa43d3af0	Merge pull request #14386 from Snuffleupagus/issue-14385 Ignore negative /FitH parameters in the viewer (issue 14385)	2021-12-18 13:24:42 +01:00
Jonas Jenwald	6b75e46d11	Ignore negative /FitH parameters in the viewer (issue 14385) This provides a work-around for badly generated PDF documents that contain negative /FitH parameters (in the referenced issue the value `-32768` is used).	2021-12-18 11:35:21 +01:00
Jonas Jenwald	e19020c028	Move the `Default{...}LayerFactory` into a new `web/default_factory.js` file This patch, first of all, removes circular dependencies in the TypeScript definitions. Secondly, it also moves `RenderingStates` into `web/ui_utils.js` to break another type-dependency and directly use the `XfaLayerBuilder` during XFA-printing. Finally, note that this patch slightly reduces the size of the default viewer (e.g. in the `MOZCENTRAL` build) by not having to bundle code which is completely unused.	2021-12-15 23:17:08 +01:00
Jonas Jenwald	e0dba504d2	Fix broken/missing JSDocs and `typedef`s, to allow updating TypeScript to the latest version (issue 14342) This patch circumvents the issues seen when trying to update TypeScript to version `4.5`, by "simply" fixing the broken/missing JSDocs and `typedef`s such that `gulp typestest` now passes. As always, given that I don't really know anything about TypeScript, I cannot tell if this is a "correct" and/or proper way of doing things; we'll need TypeScript users to help out with testing! Please note: I'm sorry about the size of this patch, but given how intertwined all of this unfortunately is it just didn't seem easy to split this into smaller parts. However, one good thing about this TypeScript update is that it helped uncover a number of pre-existing bugs in our JSDocs comments.	2021-12-15 23:14:25 +01:00
Tim van der Meij	d3e1d7090a	Merge pull request #14370 from Snuffleupagus/getPageDict-sync-Pages Slightly reduce asynchronicity in the `Catalog.getPageDict` method (PR 14338 follow-up)	2021-12-15 19:40:39 +01:00
Tim van der Meij	274989ab56	Merge pull request #14372 from Snuffleupagus/BaseViewer-Lang Move the /Lang handling into the `BaseViewer` (PR 14114 follow-up)	2021-12-15 19:37:50 +01:00
Tim van der Meij	21aea0b1a2	Merge pull request #14380 from Snuffleupagus/event-utils Move the `EventBus`, and related functionality, into its own file	2021-12-15 19:34:43 +01:00
Jonas Jenwald	0a19ef6864	Move the `EventBus`, and related functionality, into its own file The size of the `web/ui_utils.js` file has increased over time, as more code has been added to (or moved into) that file. To reduce its size slightly, this patch moves the event-related functionality into a separate file.	2021-12-15 17:18:57 +01:00
Jonas Jenwald	760f765e56	Move the /Lang handling into the `BaseViewer` (PR 14114 follow-up) In PR 14114 this was only added to the default viewer, which means that in the viewer components the user would need to manually implement /Lang handling. This was (obviously) a bad choice, since the viewer components already support e.g. structTrees by default; sorry about overlooking this! To avoid having to make two `getMetadata` API-calls[1] very early during initialization, in the default viewer, the API will now cache its result. This will also come in handy elsewhere in the default viewer, e.g. by reducing parsing when opening the "document properties" dialog. --- [1] This not only includes a round-trip to the worker-thread, but also having to re-parse the /Metadata-entry when it exists.	2021-12-14 13:19:05 +01:00
Jonas Jenwald	a425c9cfa5	Merge pull request #14368 from timvandermeij/puppeteer Consistently use string arguments for page.waitForFunction calls and upgrade to Puppeteer 13.0.0	2021-12-14 10:36:06 +01:00
Jonas Jenwald	fa51fd9428	Slightly reduce asynchronicity in the `Catalog.getPageDict` method (PR 14338 follow-up) After the changes in PR 14338, specifically in the `XRef.parse`-method, the /Pages-entry will now always have been fetched/validated when the `Catalog`-instance is created. Hence we can directly access the /Pages-entry in `Catalog.getPageDict` and thus avoid one asynchronous data-lookup per page in the document. (In practice this is unlikely to show up in e.g. benchmarks, but it really cannot hurt.) Finally, make sure that the `getPageDict`/`getAllPageDicts`-methods track the /Pages-tree reference correctly to prevent circular references in corrupt documents.	2021-12-13 21:18:06 +01:00
Tim van der Meij	da2b3dd3be	Upgrade to Puppeteer 13.0.0	2021-12-12 19:52:11 +01:00
Tim van der Meij	1bc6b846b6	Consistently use string arguments for `page.waitForFunction` calls We use string arguments in all other places, so these two places are a bit inconsistent in that sense. Moreover, it's just one argument now, which makes it a bit easier to read and see what it does because we don't have to pass the always-empty options argument anymore. Finally, doing it like this ensures it works in all Puppeteer versions given https://github.com/puppeteer/puppeteer/issues/7836.	2021-12-12 19:45:34 +01:00
Tim van der Meij	e638a84afe	Merge pull request #14367 from timvandermeij/integration-tests Disable failing print actions integration test in Firefox	2021-12-12 16:20:34 +01:00
Tim van der Meij	2643e6a823	Disable failing print actions integration test in Firefox Once the upstream bug is fixed it can be enabled again because it's causing way too much noise now. This is tracked in issue #14293. Note that I deliberately added a new block so we can easily remove it later on and because the other block is about another bug.	2021-12-12 16:10:50 +01:00
Tim van der Meij	d47b6735b4	Merge pull request #14364 from Snuffleupagus/BaseViewer-conditional-getPermissions Only call `PDFDocumentProxy.getPermissions`, in the viewer, when `pdfjs.enablePermissions` is set (PR 14362 follow-up)	2021-12-12 14:00:04 +01:00
Jonas Jenwald	63af15eb8f	Only call `PDFDocumentProxy.getPermissions`, in the viewer, when `pdfjs.enablePermissions` is set (PR 14362 follow-up) By making this API-call unconditionally, we introduce a (slight) delay in the initialization of all documents. That seems quite unfortunate, since `pdfjs.enablePermissions` is off by default, and it thus seems better to only do the API-call when actually needed; sorry about this!	2021-12-11 20:46:19 +01:00
Tim van der Meij	6d8d37e93d	Merge pull request #14362 from Snuffleupagus/issue-14356 Support disabling of form editing when `pdfjs.enablePermissions` is set (issue 14356)	2021-12-11 20:02:23 +01:00
Tim van der Meij	fefb9ed5b4	Merge pull request #14360 from timvandermeij/updates Update packages and translations	2021-12-11 19:51:38 +01:00
Tim van der Meij	c5847141b4	Update translations to the most recent versions	2021-12-11 19:44:52 +01:00
Tim van der Meij	2757000bb2	Fix some dependency vulnerabilities reported by `npm audit` This is done automatically using the `npm audit fix` command.	2021-12-11 19:44:52 +01:00
Tim van der Meij	d3d8141372	Update packages to the most recent versions	2021-12-11 19:44:48 +01:00
Jonas Jenwald	b1d3e7f121	Support disabling of form editing when `pdfjs.enablePermissions` is set (issue 14356) For encrypted PDF documents without the required permissions set, this patch adds support for disabling of form editing. However, please note that it also requires that the `pdfjs.enablePermissions` preference is set to `true`[1] (since PDF document permissions could be seen as user hostile). Based on https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G6.1942134, this condition hopefully makes sense. --- [1] Either manually with `about:config`, or using e.g. a [Group Policy](https://github.com/mozilla/policy-templates).	2021-12-11 18:26:13 +01:00
Jonas Jenwald	b03281de18	Move the permissions handling into the `BaseViewer` (PR 11789 follow-up) Besides making the permissions-functionality directly available in the viewer-components, these changes are also necessary for the next patch.	2021-12-11 17:13:41 +01:00
Jonas Jenwald	d856ed9395	Merge pull request #14361 from timvandermeij/nodejs Upgrade Node.js to version 16 in the CI workflow	2021-12-11 15:58:00 +01:00
Tim van der Meij	4269148d3d	Upgrade Node.js to version 16 in the CI workflow Version 14 that we used before is now in maintenance mode, so we should upgrade to the most recent LTS version. Moreover, use the most recent `setup-node` workflow version and syntax; see https://github.com/actions/setup-node#usage.	2021-12-11 15:50:23 +01:00
Tim van der Meij	3a8318aa1c	Merge pull request #14359 from Snuffleupagus/PAUSE_EAGER_PAGE_INIT Avoid overloading the worker-thread during eager page initialization in the viewer (PR 11263 follow-up)	2021-12-11 13:28:35 +01:00
Tim van der Meij	a6dd39b645	Merge pull request #14358 from Snuffleupagus/checkLastPage-improvements Improve `PDFDocument.checkLastPage`/`Catalog.getAllPageDicts` for documents with corrupt XRef tables (PR 14311, 14335 follow-up)	2021-12-11 13:07:54 +01:00
Tim van der Meij	70809a80ce	Merge pull request #14355 from Snuffleupagus/api-page-caches-Map Change `WorkerTransport.{pageCache, pagePromises}` from an Array to a Map	2021-12-11 13:00:11 +01:00
Tim van der Meij	2b8a5dce70	Merge pull request #14354 from Snuffleupagus/improve-pageKidsCountCache-further Further improve caching in `Catalog.getPageDict`, for `disableAutoFetch` mode (PR 8207 follow-up)	2021-12-11 12:54:39 +01:00
Jonas Jenwald	90472e5130	Avoid overloading the worker-thread during eager page initialization in the viewer (PR 11263 follow-up) This patch is essentially another continuation of PR 11263, which tried to improve loading/initialization performance of very large/long documents. For most documents, unless they're very long, we'll eagerly initialize all of the pages in the viewer. For shorter documents having all pages loaded/initialized early provides overall better performance/UX in the viewer, however there are cases where it can instead hurt performance. For documents with a couple of thousand pages[1], the parsing and pre-rendering of the second page of the document can be delayed (quite a bit). The reason for this is that we trigger `PDFDocumentProxy.getPage` for all pages early during the viewer initialization, which causes the worker-thread to be swamped with handling (potentially) thousands of `getPage`-calls and leaving very little time for other parsing (such as e.g. of operatorLists). To address this situation, this patch thus proposes temporarily "pausing" the eager `PDFDocumentProxy.getPage`-calls once a threshold has been reached, to give the worker-thread a chance to handle other requests.[2] Obviously this may slightly delay the "pagesloaded" event in longer documents, but considering that it's already the result of asynchronous parsing that'll hopefully not be seen as a blocker for these changes.[3] --- [1] A particularly problematic example is https://github.com/mozilla/pdf.js/files/876321/kjv.pdf (16 MB large), which is a document with 2236 pages and a /Pages-tree that's only one level deep. [2] Please note that I initially considered simply chaining the `PDFDocumentProxy.getPage`-calls, however that would have slowed things down for all documents, which didn't seem appropriate. [3] This patch will hopefully also make it possible to re-visit PR 11312, since it seems that changing `Catalog.getPageDict` to an `async` method wasn't the problem in itself. Rather it appears that it leads to slightly different timings, thus exacerbating the already existing issues with the worker-thread being overloaded by `getPage`-calls. Having recently worked with that method, there's a couple of (very old) issues that I'd also like to address and having `Catalog.getPageDict` be `async` would simplify things a great deal.	2021-12-10 20:44:06 +01:00
Jonas Jenwald	70ac6b1694	Update `Catalog.getAllPageDicts` to always propagate the actual Errors (PR 14335 follow-up) Rather than "swallowing" the actual Errors, when data fetching fails, ensure that they're always being propagated as intended to the call-site instead. Note that we purposely handle `XRefEntryException` specially, to make it possible to fall back to indexing all XRef objects.	2021-12-10 15:22:36 +01:00
Jonas Jenwald	47f9eef584	Improve `PDFDocument.checkLastPage` for documents with corrupt XRef tables (PR 14311, 14335 follow-up) Rather than trying, and failing, to fetch the entire /Pages-tree for documents with corrupt XRef tables, let's fall back to indexing all objects before trying to invoke the `Catalog.getAllPageDicts` method.	2021-12-10 11:45:09 +01:00
Jonas Jenwald	f39536a30b	Change `WorkerTransport.pagePromises` from an Array to a Map Given that not all pages necessarily are being accessed, or that the pages may be accessed out of order, using a `Map` seems like a more appropriate data-structure here. Finally, also changes the `pagePromises` to a private property since it's not supposed to be accessed from the "outside".	2021-12-09 15:30:10 +01:00
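
Commit f39536a30b argues that a `Map` suits a page-promise cache better than an Array, because pages may be requested out of order or not at all. The sketch below illustrates only that idea. `PagePromiseCache`, `fetchPage` and `cachedPageCount` are hypothetical names made up for this example and are not the actual `WorkerTransport` code.

```javascript
// Hypothetical stand-in for a transport object that caches one promise per page.
class PagePromiseCache {
  // Private fields: the cache is not meant to be accessed from the "outside".
  #pagePromises = new Map();
  #fetchPage;

  constructor(fetchPage) {
    this.#fetchPage = fetchPage;
  }

  getPage(pageNumber) {
    const pageIndex = pageNumber - 1;
    const cached = this.#pagePromises.get(pageIndex);
    if (cached) {
      return cached; // Repeated requests share the same promise.
    }
    const promise = Promise.resolve(this.#fetchPage(pageIndex));
    this.#pagePromises.set(pageIndex, promise);
    return promise;
  }

  // Number of pages actually requested so far.
  get cachedPageCount() {
    return this.#pagePromises.size;
  }
}

// Usage: two pages requested out of order create exactly two Map entries.
const cache = new PagePromiseCache(pageIndex => ({ pageIndex }));
cache.getPage(2000);
cache.getPage(3);
console.log(cache.cachedPageCount);
```

With an Array, assigning index 1999 first would produce a sparse array whose `length` no longer reflects how many pages were requested, whereas the `Map` only tracks the keys that were set.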

15175 Commits
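
Commit b513c64d9d describes reading the raw entries of a /Pages Dictionary synchronously and falling back to an asynchronous lookup only when a Reference is encountered. A minimal sketch of that pattern follows, under stated assumptions: `Ref`, `FakeXRef` and `getEntry` are simplified, hypothetical stand-ins written for this example, and a plain `Map` plays the role of the dictionary. None of it is the actual pdf.js implementation.

```javascript
// Hypothetical minimal stand-ins; pdf.js has its own Ref, Dict and XRef classes.
class Ref {
  constructor(num) {
    this.num = num;
  }
}

class FakeXRef {
  constructor(objects) {
    this.objects = objects; // Map from object number to resolved value.
    this.asyncFetches = 0; // Counts how often the asynchronous path was taken.
  }

  async fetchAsync(ref) {
    this.asyncFetches++;
    return this.objects.get(ref.num);
  }
}

// Read the raw entry synchronously; only await when it is an indirect object.
async function getEntry(rawEntries, key, xref) {
  let value = rawEntries.get(key);
  if (value instanceof Ref) {
    value = await xref.fetchAsync(value);
  }
  return value;
}

// Usage: /Type is stored directly, while /Count is an indirect object.
const xref = new FakeXRef(new Map([[7, 12]]));
const pagesDict = new Map([
  ["Type", "Pages"],
  ["Count", new Ref(7)],
]);
getEntry(pagesDict, "Type", xref).then(type => console.log(type));
getEntry(pagesDict, "Count", xref).then(count => console.log(count));
```

The design choice shown here is that direct values never trigger a fetch, so the extra asynchronous work is limited to the (rare) dictionaries whose entries are References.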