pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	0ebac67a9f	Remove the `{BaseViewer, PDFThumbnailViewer}._pagesRequests` caches In the `BaseViewer` this cache is mostly relevant in the `disableAutoFetch = true` mode, since the pages are being initialized lazily in that case. In the `PDFThumbnailViewer` this cache is mostly used for thumbnails that are actually being rendered, as opposed to those created directly from the "regular" pages. Please note that I'm not suggesting that we remove these caches because they're only used in some situations, but rather because they're for all intents and purposes actually redundant. In the API itself, we're already caching both the page-promises and the actual pages themselves on the `WorkerTransport`-instance. Hence these viewer-caches aren't really necessary in practice, and add what to me mostly seems like an unnecessary level of indirection.[1] Given that the viewer now relies on caching in the API itself, this patch also adds a new unit-test to ensure that page-caching works (and keeps working) as expected. --- [1] In the `WorkerTransport.getPage`-method the parameter is being validated on every call, but that's hardly enough code to warrant keeping the "duplicate" caches in the viewer in my opinion.	2021-11-21 11:40:45 +01:00
Tim van der Meij	aabd4e5092	Merge pull request #14294 from Snuffleupagus/getStats-refactor [api-minor] Replace `PDFDocumentProxy.getStats` with a synchronous `PDFDocumentProxy.stats` getter	2021-11-20 15:42:46 +01:00
Jonas Jenwald	6da0944fc7	[api-minor] Replace `PDFDocumentProxy.getStats` with a synchronous `PDFDocumentProxy.stats` getter Please note: These changes will primarily benefit longer documents, somewhat at the expense of e.g. one-page documents. The existing `PDFDocumentProxy.getStats` function, which in the default viewer is called for each rendered page, requires a round-trip to the worker-thread in order to obtain the current document stats. In the default viewer, we currently make one such API-call for every rendered page. This patch proposes replacing that method with a synchronous `PDFDocumentProxy.stats` getter instead, combined with re-factoring the worker-thread code by adding a `DocStats`-class to track Stream/Font-types and only send them to the main-thread the first time that a type is encountered. Note that in practice most PDF documents only use a fairly limited number of Stream/Font-types, which means that in longer documents most of the `PDFDocumentProxy.getStats`-calls will return the same data.[1] This re-factoring will obviously benefit longer documents the most[2], and could actually be seen as a regression for one-page documents, since in practice there'll usually be a couple of "DocStats" messages sent during the parsing of the first page. However, if the user zooms/rotates the document (which causes re-rendering), note that even a one-page document would start to benefit from these changes. Another benefit of having the data available/cached in the API is that unless the document stats change during parsing, repeated `PDFDocumentProxy.stats`-calls will return the same identical object. This is something that we can easily take advantage of in the default viewer, by now only reporting "documentStats" telemetry[3] when the data has actually changed rather than once per rendered page (again beneficial in longer documents). --- [1] Furthermore, the maximum number of `StreamType`/`FontType` are `10` and `12` respectively, which means that regardless of the complexity and page count in a PDF document there'll never be more than twenty-two "DocStats" messages sent; see `41ac3f0c07/src/shared/util.js (L206-L232)` [2] One example is the `pdf.pdf` document in the test-suite, where rendering all of its 1310 pages only results in a total of seven "DocStats" messages being sent from the worker-thread. [3] Reporting telemetry, in Firefox, includes using `JSON.stringify` on the data and then sending an event to the `PdfStreamConverter.jsm`-code. In that code the event is handled and `JSON.parse` is used to retrieve the data, and in the "documentStats"-case we'll then iterate through the data to avoid double-reporting telemetry; see https://searchfox.org/mozilla-central/rev/8f4c180b87e52f3345ef8a3432d6e54bd1eb18dc/toolkit/components/pdfjs/content/PdfStreamConverter.jsm#515-549	2021-11-20 12:20:55 +01:00
Tim van der Meij	41ac3f0c07	Merge pull request #14291 from Snuffleupagus/force-postMessageTransfers [api-minor] Only use Workers when `postMessage` transfers are supported (PR 11123 follow-up)	2021-11-19 20:02:51 +01:00
Tim van der Meij	b1e9e214bf	Merge pull request #14229 from brendandahl/term-log Add an easy way to log to the terminal during browser tests.	2021-11-19 19:48:59 +01:00
Brendan Dahl	c6cb39ef30	Merge pull request #14262 from Snuffleupagus/issue-14261 Include the /Lang-property, when it exists, in the StructTree-data (issue 14261)	2021-11-19 07:51:21 -08:00
Jonas Jenwald	6f22327e61	[api-minor] Only use Workers when `postMessage` transfers are supported (PR 11123 follow-up) Given that all modern browsers now support `postMessage` transfers, and have for years, it no longer seems necessary for the PDF.js library to support using Workers unless the `postMessage` transfers functionality is available. This patch is a follow-up to PR 11123, which made it impossible to manually disable `postMessage` transfers for performance reasons (since it increases memory usage), which hasn't caused any bug reports as far as I know.[1] Hence we'll now only support proper Worker implementations, with fully working `postMessage` transfers, and fallback to using "fake" Workers otherwise. --- [1] At the time of that PR we still "supported" IE, which is why this code was left intact.	2021-11-19 16:47:58 +01:00
Brendan Dahl	052db56a2e	Add an easy way to log to the terminal during browser tests. On the main thread call `driver.log` and the message will output in the terminal with the pdf id and the message. I've been using this a lot when trying to find certain PDFs or logging stats.	2021-11-18 15:38:56 -08:00
Brendan Dahl	9f4a2cf5ce	Merge pull request #14276 from Snuffleupagus/issue-14242-2 Only show the `loadingIcon`-spinner on visible pages (issue 14242)	2021-11-18 13:43:58 -08:00
Tim van der Meij	3dccaccbb4	Merge pull request #14278 from Snuffleupagus/rm-removeChild Replace the remaining `Node.removeChild()` instances with `Element.remove()`	2021-11-17 20:17:55 +01:00
Tim van der Meij	f90eebd282	Merge pull request #14280 from Snuffleupagus/scrollMode-PAGE-spread-loop Slightly optimize `spreadMode` toggling with `ScrollMode.PAGE` set (PR 14112 follow-up)	2021-11-17 19:46:30 +01:00
Jonas Jenwald	4e2c2fafc9	Enable the `unicorn/prefer-dom-node-remove` ESLint plugin rule Please see https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-dom-node-remove.md	2021-11-16 17:52:50 +01:00
Jonas Jenwald	4ef1a129fa	Replace the remaining `Node.removeChild()` instances with `Element.remove()` Using `Element.remove()` is a slightly more compact way of removing an element, since you no longer need to explicitly find/use its parent element. Furthermore, the patch also replaces a couple of loops that're used to delete all elements under a node with simply overwriting the contents directly (a pattern already used throughout the viewer). See also: - https://developer.mozilla.org/en-US/docs/Web/API/Node/removeChild - https://developer.mozilla.org/en-US/docs/Web/API/Element/remove	2021-11-16 17:52:50 +01:00
Brendan Dahl	3209c013c4	Merge pull request #14247 from calixteman/button [api-minor] Render pushbuttons on their own canvas (bug 1737260)	2021-11-16 08:10:40 -08:00
Jonas Jenwald	1214c056e9	Slightly optimize `spreadMode` toggling with `ScrollMode.PAGE` set (PR 14112 follow-up) It shouldn't be necessary to iterate through all pages when using a non-default `spreadMode`, since we already know which page(s) should become visible. This code is a left-over from the initial (local) implementation that resulted in PR 14112, however I forgot to clean-up some things such as e.g. this loop. Also fixes an outdated comment, see PR 14204 which removed the mentioned data-structure.	2021-11-16 15:37:58 +01:00
Jonas Jenwald	7d4c37e988	Use the new iterator in the `PDFPageViewBuffer` unit-tests The previous patch introduced an iterator in the `PDFPageViewBuffer`-class, hence the test-only `_buffer`-getter is no longer necessary.	2021-11-15 14:06:17 +01:00
Jonas Jenwald	e909fcdba8	Only show the `loadingIcon`-spinner on visible pages (issue 14242) This patch preserves the old behaviour of appending a `loadingIcon`-div to all pages that are not yet loaded/rendered. However, the actual `loadingIcon`-spinner (i.e. the `loading-icon.gif` image) will only be displayed on visible pages to improve performance. To avoid having to iterate through all pages in the document, which doesn't seem like a good idea for a PDF document with thousands of pages, we use a combination of the currently visible and cached pages to toggle the `loadingIcon`-spinner.	2021-11-15 14:06:14 +01:00
Tim van der Meij	e4f97a2a91	Merge pull request #14273 from Snuffleupagus/update-packages Update packages and translations	2021-11-14 15:09:31 +01:00
Jonas Jenwald	971ac8e993	Include the /Lang-property, when it exists, in the StructTree-data (issue 14261) Please note: This is a tentative patch, since I don't have the necessary a11y-software to actually test it.	2021-11-14 12:37:41 +01:00
Jonas Jenwald	a54bed4963	Enable the ESLint `no-loss-of-precision` rule Please refer to https://eslint.org/docs/rules/no-loss-of-precision	2021-11-14 10:48:50 +01:00
Jonas Jenwald	c47f5e81fe	Update l10n files	2021-11-14 10:48:50 +01:00
Jonas Jenwald	04bdc26d3a	Update the `eslint-plugin-unicorn` package to the latest version	2021-11-14 10:27:29 +01:00
Jonas Jenwald	1dd74efb0f	Update the `eslint-plugin-no-unsanitized` package to the latest version	2021-11-14 10:24:41 +01:00
Jonas Jenwald	bd1e140e2a	Update the `dommatrix` package to the latest version	2021-11-14 10:20:54 +01:00
Jonas Jenwald	9f6d37263c	Update npm packages	2021-11-14 10:17:30 +01:00
Jonas Jenwald	712621b508	Merge pull request #14255 from Snuffleupagus/GrabToPan-class Convert `GrabToPan` to a standard `class`	2021-11-13 23:24:36 +01:00
Jonas Jenwald	08d56c67ae	Convert `GrabToPan` to a standard `class` This code is the last piece[1] of the viewer that's not using standard `class`es, and by converting this code we get rid of some now unneeded boilerplate code (slightly reducing the size of the built `web/viewer.js` file). Note that while this code was originally imported from a separate repository, it was last sync-ed with upstream five years ago which is why this re-factoring should be OK as far as I'm concerned (and we've done some other clean-up since then as well). --- [1] Technically the `web/debugger.js` file is left as well, however that code is first of all not bundled in the built `web/viewer.js` file and secondly it's not even loaded by default either.	2021-11-13 23:07:36 +01:00
Jonas Jenwald	ed6af0f844	[web/grab_to_pan.js] Inline the `isLeftMouseReleased` helper function Given the support information listed in the function itself, the [MDN compatibility data](https://developer.mozilla.org/en-US/docs/Web/API/MouseEvent/buttons#browser_compatibility), and the [currently supported browsers](`4bb9de4b00/gulpfile.js (L79-L87)`) in the PDF.js project we should be able to simplify the code by inlining the function instead.	2021-11-13 23:00:15 +01:00
Jonas Jenwald	7a428345db	Merge pull request #14271 from calixteman/params Parse query string in using URLSearchParams	2021-11-13 22:59:34 +01:00
Calixte Denizet	fe95e100e4	Parse query string in using URLSearchParams - I just noticed in reading the code that we parse that stuff when something exists in the web api; - see https://developer.mozilla.org/en-US/docs/Web/API/URLSearchParams/URLSearchParams.	2021-11-13 21:10:54 +01:00
Tim van der Meij	de7cfed9e3	Merge pull request #14260 from Snuffleupagus/telemetry-pageInfo-once Report "pageInfo" telemetry once, rather than for each rendered page	2021-11-13 20:22:58 +01:00
Tim van der Meij	138ebb09c0	Merge pull request #14253 from Snuffleupagus/ScrollMode-PAGE-chromium [Chromium addon] Add the Page scrolling mode to the options (PR 14112 follow-up)	2021-11-13 20:12:04 +01:00
calixteman	85c6dd59ce	Merge pull request #14268 from calixteman/outline Remove non-displayable chars from outline title (#14267)	2021-11-13 08:12:56 -08:00
Calixte Denizet	7041c62ccf	Remove non-displayable chars from outline title (#14267) - it aims to fix #14267; - there is nothing about chars in range [0-1F] in the specs but Acrobat doesn't display them in any way.	2021-11-13 16:56:08 +01:00
Jonas Jenwald	db41c49321	Merge pull request #14270 from Snuffleupagus/issue-14269 When parsing corrupt documents without any trailer-dictionary, fallback to the "top"-dictionary (issue 14269)	2021-11-13 16:01:28 +01:00
Jonas Jenwald	afcc99a86d	When parsing corrupt documents without any trailer-dictionary, fallback to the "top"-dictionary (issue 14269) There's obviously no guarantee that this will work in general, if the document is sufficiently corrupt, but it should hopefully be better than just throwing `InvalidPDFException` as currently happens. Please note that, as is often the case with corrupt documents, it's somewhat difficult to know if we're rendering the document "correctly" with this patch[1]. In this case even Adobe Reader cannot open the document, which is always a good sign that it's really corrupt, however we're at least able to render something with this patch. --- [1] Whatever "correct" even means when dealing with corrupt PDF documents, where often times different PDF viewers won't agree completely.	2021-11-13 13:21:38 +01:00
Jonas Jenwald	28fb3975eb	Merge pull request #14266 from calixteman/bug931481 Don't consider space as real space when there is an extra spacing (bug 931481)	2021-11-12 21:42:32 +01:00
Calixte Denizet	a88ff34eb7	Don't consider space as real space when there is an extra spacing (bug 931481) - it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=931481; - real space chars are pushed in the chunk but when there is an extra spacing, the next char position must be compared with the previous one; - for example, an extra spacing can cancel a space so visually there is no space.	2021-11-12 18:53:48 +01:00
calixteman	7d6d3fc124	Merge pull request #14265 from calixteman/14150 XFA - Avoid an exception when looking for a font in a parent node	2021-11-12 09:53:19 -08:00
Calixte Denizet	5b7e1f5232	XFA - Avoid an exception when looking for a font in a parent node - it aims to fix issue https://github.com/mozilla/pdf.js/issues/14150; - a parent can be null in case the root has been reached, so just add a check.	2021-11-12 16:27:08 +01:00
Calixte Denizet	33ea817b20	[api-minor] Render pushbuttons on their own canvas (bug 1737260) - First step to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1737260; - several interactive pdfs use the possibility to hide/show buttons to show different icons; - render pushbuttons on their own canvas and then insert it into the annotation_layer; - update test/driver.js in order to convert canvases for pushbuttons into images.	2021-11-12 15:37:33 +01:00
Brendan Dahl	3a31b7ef0c	Merge pull request #14258 from Snuffleupagus/issue-14256-2 Always prefer abbreviated keys, over full ones, when doing any dictionary lookups (issue 14256)	2021-11-11 14:23:43 -08:00
Jonas Jenwald	8eed0b9145	Report "pageInfo" telemetry once, rather than for each rendered page Reporting telemetry, in Firefox, includes using `JSON.stringify` on the data and then sending an event to the `PdfStreamConverter.jsm`-code. In that code the event is handled and `JSON.parse` is used to retrieve the data, and in the "pageInfo"-case we'll then proceed to ignore everything except the first such event; see https://searchfox.org/mozilla-central/rev/24fac1ad31fb9c6e9c4c767c6a7ff45d226078f3/toolkit/components/pdfjs/content/PdfStreamConverter.jsm#509-514 All-in-all, sending the "pageInfo" telemetry for each rendered page is thus unnecessary and this patch makes the viewer send it only once instead.	2021-11-11 12:36:06 +01:00
Jonas Jenwald	ea1c348c67	Always prefer abbreviated keys, over full ones, when doing any dictionary lookups (issue 14256) Note that issue 14256 was specifically about inline images, please refer to: - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G7.1852045 - https://www.pdfa.org/safedocs-unearths-pdf-inline-image-issue/ - https://pdf-issues.pdfa.org/32000-2-2020/clause08.html#H8.9.7 However, during review of the initial PR in https://github.com/mozilla/pdf.js/pull/14257#issuecomment-964469710, it was suggested that we instead do this unconditionally for all dictionary lookups. In addition to re-ordering the existing call-sites in the `src/core`-code, and adding non-PRODUCTION/TESTING asserts to catch future errors, for consistency a number of existing `if`/`switch`-blocks were re-factored to also check the abbreviated keys first.	2021-11-10 11:56:18 +01:00
Brendan Dahl	4ee906adf4	Merge pull request #14209 from Snuffleupagus/issue-14205 [Google Chrome] Ensure that `markedContent` spans are placed in the top-left corner (issue 14205)	2021-11-09 07:59:14 -08:00
calixteman	4bb9de4b00	Merge pull request #14239 from calixteman/1739502 XFA - Fix a breakBefore issue when target is a contentArea and startNew is 1 (bug 1739502)	2021-11-08 03:14:42 -08:00
Jonas Jenwald	27e461a897	[Chromium addon] Add the Page scrolling mode to the options (PR 14112 follow-up)	2021-11-08 10:18:25 +01:00
Jonas Jenwald	8064318015	Merge pull request #14250 from calixteman/14249 XFA - Encode tag names in UTF-8 when saving (fix #14249)	2021-11-07 22:28:53 +01:00
Calixte Denizet	13ae6d493a	XFA - Encode tag names in UTF-8 when saving (fix #14249)	2021-11-07 21:41:37 +01:00
Tim van der Meij	891f21fba6	Merge pull request #14245 from Snuffleupagus/PDFPageViewBuffer-class Convert `PDFPageViewBuffer` to a standard class, and use a `Set` internally	2021-11-07 14:37:33 +01:00
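
Commit fe95e100e4 above replaces hand-rolled query-string parsing with the standard `URLSearchParams` Web API. As a rough illustration of that idea only, here is a minimal sketch; the helper name `parseQueryString`, the `Map` return type, and the key lower-casing are assumptions made for this example and are not taken from the patch itself.

```javascript
// Hypothetical helper (not the code from the commit): parse a query or hash
// string such as "page=2&zoom=150" into a Map of keys to values.
function parseQueryString(query) {
  const params = new Map();
  // URLSearchParams splits on "&" and "=", and percent-decodes each part.
  for (const [key, value] of new URLSearchParams(query)) {
    // Lower-casing the key is an assumption of this sketch, so that lookups
    // do not depend on how the parameter was capitalized in the URL.
    params.set(key.toLowerCase(), value);
  }
  return params;
}

const params = parseQueryString("Page=2&zoom=page-width&nameddest=Chapter%201");
console.log(params.get("page")); // "2"
console.log(params.get("nameddest")); // "Chapter 1"
```

Delegating to `URLSearchParams` means the splitting and decoding rules come from the platform rather than from custom code, which is the motivation the commit message gives.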

15049 Commits