pdf.js

Author	SHA1	Message	Date
Brendan Dahl	e1aed05c44	Merge pull request #11049 from brendandahl/telem-add-time Add page rendered timestamp to telemetry.	2019-08-06 11:53:08 -07:00
Brendan Dahl	47077f8de9	Add page rendered timestamp to telemetry.	2019-08-06 09:46:33 -07:00
Jonas Jenwald	5ac9c7c384	Support corrupt PDF files with invalid/non-existent Group /CS entries (issue 11045) The PDF file in question tries to reference a non-existent ColorSpace, which should be quite rare in practice.	2019-08-06 14:33:05 +02:00
Tim van der Meij	a666f1ef00	Merge pull request #11048 from Snuffleupagus/linkService-compact-pagesRefCache Use more compact keys in `PDFLinkService._pagesRefCache`	2019-08-05 22:44:00 +02:00
Jonas Jenwald	9acaaf5126	Use more compact keys in `PDFLinkService._pagesRefCache` By using the same internal formatting here as in the `Ref.toString` method, in `src/core/primitives.js`, all cache-keys will become at least two bytes shorter (and most three bytes shorter). Obviously this won't have a huge effect on memory since there's only one cache entry per page, but it nonetheless seems wasteful to use longer keys than strictly required.	2019-08-05 18:06:32 +02:00
Tim van der Meij	be70ee236d	Merge pull request #11013 from timvandermeij/annotations-quadpoints [api-minor] Implement quadpoints for annotations in the core layer	2019-08-04 16:06:10 +02:00
Tim van der Meij	6c05a7653d	Merge pull request #11038 from Snuffleupagus/getStats-objects [api-minor] Fix completely broken `getStats` method by returning stats in Objects, rather than in Arrays (PR 11029 follow-up)	2019-08-02 22:08:37 +02:00
Jonas Jenwald	0276385e6e	[api-minor] Fix completely broken `getStats` method by returning stats in Objects, rather than in Arrays (PR 11029 follow-up) With the changes to the `StreamType`/`FontType` "enums" in PR 11029, one unfortunate result is that `getStats` now always returns empty Arrays. Something that everyone, myself included, apparently missed is that you obviously cannot index an Array with Strings :-) I wrongly assumed that the unit-tests would catch any bugs, but they apparently suffered from the same issue as the code in `src/core/`. Another possible option could perhaps be to use `Set`s, rather than objects, but that will require larger changes since `LoopbackPort` (in `src/display/api.js`) doesn't support them.	2019-08-02 14:09:24 +02:00
Tim van der Meij	9c8fe3142a	Merge pull request #11034 from Snuffleupagus/cancel-with-AbortException Ensure that `ReadableStream`s are cancelled with actual Errors	2019-08-02 00:18:44 +02:00
Tim van der Meij	e0b38bed3c	Merge pull request #11029 from brendandahl/pdfjs-telemetry-update [api-minor] Update telemetry to use 'categorical' histograms.	2019-08-02 00:11:02 +02:00
Tim van der Meij	2754b09888	Merge pull request #11033 from Snuffleupagus/viewer-close-updateLoadingIndicatorState Ensure that the loading indicator, in the pageNumber input, is hidden when the viewer is closed	2019-08-01 23:36:55 +02:00
Brendan Dahl	31d71808e7	[api-minor] Update telemetry to use 'categorical' histograms. Firefox telemetry supports using string labels now. Convert our integers that we used for categories to just use strings. The upstream work will happen in: https://bugzilla.mozilla.org/show_bug.cgi?id=1566882	2019-08-01 09:51:02 -07:00
Jonas Jenwald	a3150166ec	Ensure that `ReadableStream`s are cancelled with actual Errors There's a number of spots in the current code, and tests, where `cancel` methods are not called with appropriate arguments (leading to Promises not being rejected with Errors as intended). In some cases the cancel `reason` is implicitly set to `undefined`, and in others the cancel `reason` is just a plain String. To address this inconsistency, the patch changes things such that cancelling is done with `AbortException`s everywhere instead.	2019-08-01 16:40:46 +02:00
Jonas Jenwald	cb1394c13d	Ensure that the loading indicator, in the pageNumber input, is hidden when the viewer is closed Currently the indicator may remain visible even after the document has been closed, which seems weird given that no page is either visible or rendering :-)	2019-08-01 16:30:33 +02:00
Tim van der Meij	d909b86b28	Merge pull request #11020 from Snuffleupagus/issue-11016 Add a work-around, in `glyphlist.js`, for bad PDF generators which use a non-standard `/f_f` string in the `Encoding` dictionary when referring to the ff ligature (issue 11016)	2019-07-31 23:33:34 +02:00
Tim van der Meij	c5d837d2fe	Merge pull request #11019 from wangsongyan/master Decode URL encoded filenames from content disposition headers	2019-07-31 23:24:08 +02:00
wangsongyan	c61205d980	Decode the filename when a URL-encoded filename is matched in contentDispositionFilename	2019-07-31 09:33:56 +08:00
Jonas Jenwald	9ad50521b1	Add a work-around, in `glyphlist.js`, for bad PDF generators which use a non-standard `/f_f` string in the `Encoding` dictionary when referring to the ff ligature (issue 11016) This patch will not incur any (measurable) overhead, since the glyphlist is already quite long and one more entry won't really matter, which is important given that this sort of PDF corruption ought to be very rare. Furthermore, this patch purposely does not add a bunch of similarly modified ligature names on pure speculation. Any similar additions, for other ligatures, should only be made if there's real-world examples of PDF files where that's actually necessary.	2019-07-30 17:06:58 +02:00
Tim van der Meij	323b2eabcf	Merge pull request #11012 from Snuffleupagus/EvaluatorPreprocessor-read-fewer-function-calls Reduce the number of function calls in `EvaluatorPreprocessor.read`	2019-07-29 22:32:02 +02:00
Jonas Jenwald	38ccb43436	Reduce the number of function calls in `EvaluatorPreprocessor.read` For very large and complex PDF files this will help performance slightly, since `EvaluatorPreprocessor.read` is called a lot during parsing in the worker. This patch was tested using the PDF file from issue 2618, i.e. http://bugzilla-attachments.gnome.org/attachment.cgi?id=226471, using the following manifest file: ``` [ { "id": "issue2618", "file": "../web/pdfs/issue2618.pdf", "md5": "", "rounds": 200, "type": "eq" } ] ``` This gave the following results when comparing this patch against the `master` branch: ``` -- Grouped By browser, stat -- browser \| stat \| Count \| Baseline(ms) \| Current(ms) \| +/- \| % \| Result(P<.05) ------- \| ------------ \| ----- \| ------------ \| ----------- \| --- \| ----- \| ------------- Firefox \| Overall \| 200 \| 3402 \| 3358 \| -43 \| -1.28 \| faster Firefox \| Page Request \| 200 \| 1 \| 1 \| 0 \| 26.71 \| Firefox \| Rendering \| 200 \| 3401 \| 3357 \| -44 \| -1.28 \| faster ```	2019-07-29 08:43:36 +02:00
Tim van der Meij	9114004d5b	[api-minor] Implement quadpoints for annotations in the core layer	2019-07-28 20:36:21 +02:00
Tim van der Meij	9b72089886	Merge pull request #11003 from Snuffleupagus/webViewerWheel-supportedKeys Ensure that setting the `zoomDisabledTimeout` isn't skipped, regardless of the supported zoom keys, when handling mouse wheel events (PR 7097 follow-up)	2019-07-23 22:28:36 +02:00
Tim van der Meij	e066db47dc	Merge pull request #10996 from Snuffleupagus/app-conditional-findbar Avoid creating a `PDFFindBar` instance, in the Firefox built-in viewer, when not actually necessary	2019-07-23 22:22:31 +02:00
Jonas Jenwald	1eed5b7235	Ensure that setting the `zoomDisabledTimeout` isn't skipped, regardless of the supported zoom keys, when handling mouse wheel events (PR 7097 follow-up) Possible follow-up: It probably wouldn't hurt to try and shorten the `supportedMouseWheelZoomModifierKeys` name a bit, but I'm not attempting that here since it'd also require updating `PdfStreamConverter.jsm` in mozilla-central in order to be consistent.	2019-07-23 17:42:12 +02:00
Jonas Jenwald	46b61ff12e	Avoid creating a `PDFFindBar` instance, in the Firefox built-in viewer, when not actually necessary This is similar to how `PDFPresentationMode` isn't used when the Fullscreen API isn't supported.	2019-07-23 07:51:14 +02:00
Tim van der Meij	1f287ec486	Merge pull request #11001 from Snuffleupagus/Parser-shift-rm-isCmd Inline the `isCmd` check in the `Parser.shift` method	2019-07-22 22:32:53 +02:00
Jonas Jenwald	ff90aa4323	Inline the `isCmd` check in the `Parser.shift` method For very large and complex PDF files this will help performance slightly, since `Parser.shift` is called a lot during parsing. This patch was tested using the PDF file from issue 2618, i.e. http://bugzilla-attachments.gnome.org/attachment.cgi?id=226471 (with well over four million `Parser.shift` calls for just the one page), using the following manifest file: ``` [ { "id": "issue2618", "file": "../web/pdfs/issue2618.pdf", "md5": "", "rounds": 100, "type": "eq" } ] ``` This gave the following results when comparing this patch against the `master` branch: ``` -- Grouped By browser, stat -- browser \| stat \| Count \| Baseline(ms) \| Current(ms) \| +/- \| % \| Result(P<.05) ------- \| ------------ \| ----- \| ------------ \| ----------- \| --- \| ----- \| ------------- Firefox \| Overall \| 100 \| 3386 \| 3322 \| -65 \| -1.92 \| faster Firefox \| Page Request \| 100 \| 1 \| 1 \| 0 \| -8.08 \| Firefox \| Rendering \| 100 \| 3385 \| 3321 \| -65 \| -1.92 \| faster ```	2019-07-22 12:07:36 +02:00
Tim van der Meij	71d9f5f860	Merge pull request #10995 from Snuffleupagus/app-rm-one-setFileSize Remove an unnecessary `PDFDocumentProperties.setFileSize` call, relevant for the Firefox built-in viewer, and use the "normal" code-path in `PDFViewerApplication.open` instead	2019-07-21 12:42:26 +02:00
Jonas Jenwald	53a854bb0a	Remove an unnecessary `PDFDocumentProperties.setFileSize` call, relevant for the Firefox built-in viewer, and use the "normal" code-path in `PDFViewerApplication.open` instead Since calling `getDocument` with a `PDFDataRangeTransport` argument will always unconditionally override a manually provided `length` argument, see `a1a667809f/src/display/api.js (L390-L394)`, this patch should thus be safe.	2019-07-21 11:38:17 +02:00
Tim van der Meij	a1a667809f	Merge pull request #10993 from Snuffleupagus/AppOption-docBaseUrl Add the `docBaseUrl` API parameter to `AppOptions` in the viewer	2019-07-20 13:57:52 +02:00
Jonas Jenwald	ba2c042a75	Add the `docBaseUrl` API parameter to `AppOptions` in the viewer This unfortunately required a bit of special handling, to correctly deal with the various extension builds.	2019-07-20 13:39:34 +02:00
Tim van der Meij	0cc0789af3	Merge pull request #10986 from Snuffleupagus/inline-ensureByte-ensureRange Attempt to significantly reduce the number of `ChunkedStream.{ensureByte, ensureRange}` calls by inlining the `this.progressiveDataLength` checks at the call-sites	2019-07-19 22:51:21 +02:00
Tim van der Meij	acef5bfd16	Merge pull request #10979 from Snuffleupagus/firefox-zoomreset [Firefox] Re-factor the 'zoomreset' message handling in the viewer (PR 10652 follow-up)	2019-07-19 22:42:07 +02:00
Tim van der Meij	b964df53da	Merge pull request #10990 from Snuffleupagus/onBeforeDraw-onAfterDraw Refactor the `onBeforeDraw`/`onAfterDraw` functionality used in `BaseViewer` and `PDFPageView`	2019-07-19 22:35:41 +02:00
Tim van der Meij	98c4c646cb	Merge pull request #10987 from mozilla/dependabot/npm_and_yarn/js-yaml-3.13.1 Bump js-yaml from 3.12.0 to 3.13.1	2019-07-19 22:25:13 +02:00
Jonas Jenwald	366eebeb0f	Refactor the `onBeforeDraw`/`onAfterDraw` functionality used in `BaseViewer` and `PDFPageView` This functionality is very old, and pre-dates e.g. the introduction of the `EventBus` by a number of years. Rather than attaching two callback functions to every single `PDFPageView` instance, it's thus now possible to utilize the `EventBus` such that you only need a grand total of two listeners to achieve the same result. For the `onAfterDraw` callback the replacement is particularly simple, given that a 'pagerendered' event is already being dispatched in the appropriate spot. An added benefit here is the ability to remove the event listener, since we only ever care about one (arbitrary) page being rendered for the `BaseViewer.onePageRendered` promise. For the `onBeforeDraw` callback, a new 'pagerender' event was thus added to replace the callback.	2019-07-19 12:57:14 +02:00
dependabot[bot]	808f7db586	Bump js-yaml from 3.12.0 to 3.13.1 Bumps [js-yaml](https://github.com/nodeca/js-yaml) from 3.12.0 to 3.13.1. - [Release notes](https://github.com/nodeca/js-yaml/releases) - [Changelog](https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md) - [Commits](https://github.com/nodeca/js-yaml/compare/3.12.0...3.13.1) Signed-off-by: dependabot[bot] <support@github.com>	2019-07-19 00:04:03 +00:00
Jonas Jenwald	b5254f2745	Attempt to significantly reduce the number of `ChunkedStream.{ensureByte, ensureRange}` calls by inlining the `this.progressiveDataLength` checks at the call-sites The number of `ChunkedStream.ensureByte` calls, in particular, is often absolutely huge (on the order of millions of calls) when loading and rendering even moderately complicated PDF files, which isn't entirely surprising considering that the `getByte`/`getBytes`/`peekByte`/`peekBytes` methods are used for essentially all data reading/parsing. The idea implemented in this patch is to inline an inverted `progressiveDataLength` check at all of the `ensureByte`/`ensureRange` call-sites, which in practice will often result in several orders of magnitude fewer function calls. Obviously this patch will only help if the browser supports streaming, which all reasonably modern browsers now do (including the Firefox built-in PDF viewer), and assuming that the user didn't set the `disableStream` option (e.g. for using `disableAutoFetch`). However, I think we should be able to improve performance for the default out-of-the-box use case, without worrying about e.g. older browsers (where this patch will thus incur one additional check before calling `ensureByte`/`ensureRange`). This patch was inspired by the first commit in PR 5005, which was subsequently backed out in PR 5145 for causing regressions. Since the general idea of avoiding unnecessary function calls was really nice, I figured that re-attempting this in one way or another wouldn't be a bad idea. Given that streaming is now supported, which it wasn't back then, using `progressiveDataLength` seemed like an easier approach in general since it also allowed supporting both `ensureByte` and `ensureRange`.
This sort of patch obviously needs data to back it up, hence I've benchmarked the changes using the following manifest file (with the default `tracemonkey` file): ``` [ { "id": "tracemonkey-eq", "file": "pdfs/tracemonkey.pdf", "md5": "9a192d8b1a7dc652a19835f6f08098bd", "rounds": 250, "type": "eq" } ] ``` I get the following complete results when comparing this patch against the `master` branch: ``` -- Grouped By browser, stat -- browser \| stat \| Count \| Baseline(ms) \| Current(ms) \| +/- \| % \| Result(P<.05) ------- \| ------------ \| ----- \| ------------ \| ----------- \| --- \| ----- \| ------------- Firefox \| Overall \| 3500 \| 140 \| 134 \| -6 \| -4.46 \| faster Firefox \| Page Request \| 3500 \| 2 \| 2 \| 0 \| -0.10 \| Firefox \| Rendering \| 3500 \| 138 \| 131 \| -6 \| -4.54 \| faster ``` Here it's pretty clear that the patch does have a positive net effect, even for a PDF file of fairly moderate size and complexity. However, in this case it's probably interesting to also look at the results per page: ``` -- Grouped By page, stat -- page \| stat \| Count \| Baseline(ms) \| Current(ms) \| +/- \| % \| Result(P<.05) ---- \| ------------ \| ----- \| ------------ \| ----------- \| --- \| ------ \| ------------- 0 \| Overall \| 250 \| 74 \| 75 \| 1 \| 0.69 \| 0 \| Page Request \| 250 \| 1 \| 1 \| 0 \| 33.20 \| 0 \| Rendering \| 250 \| 73 \| 74 \| 0 \| 0.25 \| 1 \| Overall \| 250 \| 123 \| 121 \| -2 \| -1.87 \| faster 1 \| Page Request \| 250 \| 3 \| 2 \| 0 \| -11.73 \| 1 \| Rendering \| 250 \| 121 \| 119 \| -2 \| -1.67 \| 2 \| Overall \| 250 \| 64 \| 63 \| -1 \| -1.91 \| 2 \| Page Request \| 250 \| 1 \| 1 \| 0 \| 8.81 \| 2 \| Rendering \| 250 \| 63 \| 62 \| -1 \| -2.13 \| faster 3 \| Overall \| 250 \| 97 \| 97 \| 0 \| -0.06 \| 3 \| Page Request \| 250 \| 1 \| 1 \| 0 \| 25.37 \| 3 \| Rendering \| 250 \| 96 \| 95 \| 0 \| -0.34 \| 4 \| Overall \| 250 \| 97 \| 97 \| 0 \| -0.38 \| 4 \| Page Request \| 250 \| 1 \| 1 \| 0 \| -5.97 \| 4 \| Rendering \| 250 \| 96 \| 
96 \| 0 \| -0.27 \| 5 \| Overall \| 250 \| 99 \| 97 \| -3 \| -2.92 \| 5 \| Page Request \| 250 \| 2 \| 1 \| 0 \| -17.20 \| 5 \| Rendering \| 250 \| 98 \| 95 \| -3 \| -2.68 \| 6 \| Overall \| 250 \| 99 \| 99 \| 0 \| -0.14 \| 6 \| Page Request \| 250 \| 2 \| 2 \| 0 \| -16.49 \| 6 \| Rendering \| 250 \| 97 \| 98 \| 0 \| 0.16 \| 7 \| Overall \| 250 \| 96 \| 95 \| -1 \| -0.55 \| 7 \| Page Request \| 250 \| 1 \| 2 \| 1 \| 66.67 \| slower 7 \| Rendering \| 250 \| 95 \| 94 \| -1 \| -1.19 \| 8 \| Overall \| 250 \| 92 \| 92 \| -1 \| -0.69 \| 8 \| Page Request \| 250 \| 1 \| 1 \| 0 \| -17.60 \| 8 \| Rendering \| 250 \| 91 \| 91 \| 0 \| -0.52 \| 9 \| Overall \| 250 \| 112 \| 112 \| 0 \| 0.29 \| 9 \| Page Request \| 250 \| 2 \| 1 \| 0 \| -7.92 \| 9 \| Rendering \| 250 \| 110 \| 111 \| 0 \| 0.37 \| 10 \| Overall \| 250 \| 589 \| 522 \| -67 \| -11.38 \| faster 10 \| Page Request \| 250 \| 14 \| 13 \| 0 \| -1.26 \| 10 \| Rendering \| 250 \| 575 \| 508 \| -67 \| -11.62 \| faster 11 \| Overall \| 250 \| 66 \| 66 \| -1 \| -0.86 \| 11 \| Page Request \| 250 \| 1 \| 1 \| 0 \| -16.48 \| 11 \| Rendering \| 250 \| 65 \| 65 \| 0 \| -0.62 \| 12 \| Overall \| 250 \| 303 \| 291 \| -12 \| -4.07 \| faster 12 \| Page Request \| 250 \| 2 \| 2 \| 0 \| 12.93 \| 12 \| Rendering \| 250 \| 301 \| 289 \| -13 \| -4.19 \| faster 13 \| Overall \| 250 \| 48 \| 47 \| 0 \| -0.45 \| 13 \| Page Request \| 250 \| 1 \| 1 \| 0 \| 1.59 \| 13 \| Rendering \| 250 \| 47 \| 46 \| 0 \| -0.52 \| ``` Here it's clear that this patch significantly improves the rendering performance of the slowest pages, while not causing any big regressions elsewhere. As expected, this patch thus helps larger and/or more complex pages the most (which is also where even small improvements will be most beneficial). There's obviously the question if this is slightly regressing simpler pages, but given just how short the times are in most cases it's not inconceivable that the page results above are simply caused by e.g. limited `Date.now()` and/or limited numerical precision.	2019-07-18 17:30:22 +02:00
Jonas Jenwald	d1af8bd196	Slightly more simplified dispatching of the 'findbarclose' events in `firefoxcom.js`	2019-07-18 14:28:49 +02:00
Jonas Jenwald	8e5aa484fb	[Firefox] Re-factor the 'zoomreset' message handling in the viewer (PR 10652 follow-up) Given that this special-case only matters for the Firefox PDF viewer, it's probably better to just move it into `firefoxcom.js` instead to reduce unnecessary confusion.	2019-07-18 14:27:43 +02:00
Tim van der Meij	6e96a158f4	Merge pull request #10820 from vlastimilmaca/annot-irt-rt-states Annotations - Added parsing of IRT, RT, State and StateModel	2019-07-17 23:34:31 +02:00
vlastimilmaca	fe49f0f766	Annotations - Implement parsing of IRT, RT, State and StateModel	2019-07-16 23:33:07 +02:00
Tim van der Meij	bf60fe88d0	Merge pull request #10974 from Snuffleupagus/refactor-get-fingerprint Simplify the `PDFDocument.fingerprint` method slightly	2019-07-15 22:29:37 +02:00
Jonas Jenwald	bea15b6ce5	Simplify the `PDFDocument.fingerprint` method slightly The way that this method handles documents without an `ID` entry in the Trailer dictionary feels overly complicated to me. Hence this patch adds `getByteRange` methods to the various Stream implementations[1], and utilizes them rather than manually calling `ensureRange` when computing a fallback `fingerprint`. --- [1] Note that `PDFDocument` is only ever initialized with either a `Stream` or a `ChunkedStream`, hence why the `DecodeStream.getByteRange` method isn't implemented.	2019-07-15 13:26:08 +02:00
Jonas Jenwald	c7de6dbe41	Update the `fingerprint` API unit-tests to explicitly check for the expected result The current tests won't catch inadvertent changes to the logic used to obtain/compute the document `fingerprint`.	2019-07-15 11:19:17 +02:00
Tim van der Meij	13ebfec903	Merge pull request #10969 from Snuffleupagus/api-test-stopAtErrors Add an API unit-test for the `stopAtErrors` option (PRs 8240 and 8922 follow-up)	2019-07-14 14:47:57 +02:00
Tim van der Meij	766d076dcb	Merge pull request #10970 from Snuffleupagus/MessageHandler-simplify-finalize Simplify, and inline, the `finalize` function in the `MessageHandler` class	2019-07-14 14:45:15 +02:00
Jonas Jenwald	b548bafef7	Simplify, and inline, the `finalize` function in the `MessageHandler` class The `finalize` helper function has only a single call-site, and furthermore it's just a one-liner too. Moreover, it's only ever called with a `Promise` as its argument, meaning that it's unnecessarily convoluted as well (i.e. the `Promise.resolve()` part shouldn't be necessary). Hence this code can be both simplified and inlined at its only call-site instead.	2019-07-13 17:54:32 +02:00
Jonas Jenwald	c7fb7116d6	Add an API unit-test for the `stopAtErrors` option (PRs 8240 and 8922 follow-up) Also fixes an inconsistency in the 'PageError' handler, for `getOperatorList`, in the API.	2019-07-13 16:06:05 +02:00
Tim van der Meij	b01cc55cfd	Merge pull request #10968 from Snuffleupagus/MessageHandler-rm-useless-wrapReason Remove useless `wrapReason` calls in the `MessageHandler` class	2019-07-13 14:30:02 +02:00
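
The `getStats` fix listed above (commit 0276385e6e) turns on a general JavaScript behaviour: assigning to an Array with a String key creates an ordinary property, not an element, so the Array still looks empty. The sketch below only illustrates that behaviour. The `StreamType` object and its labels are hypothetical stand-ins, not the actual pdf.js tables, and the real `getStats` code may differ.

```javascript
// Hypothetical string-valued "enum"; the labels are assumptions for illustration.
const StreamType = { FLATE: "FLATE", DCT: "DCT" };

// Pattern that breaks: a String key on an Array becomes a plain property,
// so `length` stays 0 and the entry is not serialized as an element.
const asArray = [];
asArray[StreamType.FLATE] = true;

// Pattern that works: a null-prototype Object used as a dictionary.
const asObject = Object.create(null);
asObject[StreamType.FLATE] = true;

console.log(asArray.length);           // 0
console.log(JSON.stringify(asArray));  // []
console.log(JSON.stringify(asObject)); // {"FLATE":true}
```

A null-prototype Object is used here so that lookups never hit inherited keys; a plain `{}` would behave the same for this example.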

11933 Commits