12107 Commits

Author SHA1 Message Date
Jonas Jenwald
ff90aa4323 Inline the isCmd check in the Parser.shift method
For very large and complex PDF files this will help performance slightly, since `Parser.shift` is called *a lot* during parsing.
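The idea of inlining the check can be sketched as follows. This is a minimal illustration, not the actual pdf.js source: `MiniParser`, `nextToken`, `shiftWithHelper` and `shiftInlined` are hypothetical names, and the token array stands in for the real lexer.

```javascript
// Simplified stand-ins for illustration; not the actual pdf.js classes.
class Cmd {
  constructor(cmd) {
    this.cmd = cmd;
  }
}

// Generic helper: convenient, but costs one extra function call per use.
function isCmd(value, cmd) {
  return value instanceof Cmd && (cmd === undefined || value.cmd === cmd);
}

class MiniParser {
  constructor(tokens) {
    this.tokens = tokens; // hypothetical token source standing in for the lexer
    this.index = 0;
    this.buf1 = this.nextToken();
    this.buf2 = this.nextToken();
  }

  nextToken() {
    return this.index < this.tokens.length ? this.tokens[this.index++] : null;
  }

  // Before: every shift goes through the helper.
  shiftWithHelper() {
    const upcoming = this.buf2;
    this.buf1 = upcoming;
    // The "ID" command is special-cased: nothing is buffered after it.
    this.buf2 = isCmd(upcoming, "ID") ? null : this.nextToken();
  }

  // After: the same test written out at the call site.
  shiftInlined() {
    const upcoming = this.buf2;
    this.buf1 = upcoming;
    this.buf2 =
      upcoming instanceof Cmd && upcoming.cmd === "ID" ? null : this.nextToken();
  }
}
```

Both variants leave `buf1` and `buf2` in the same state; the inlined one simply avoids a helper call on each of the millions of `shift` invocations.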

This patch was tested using the PDF file from issue 2618, i.e. http://bugzilla-attachments.gnome.org/attachment.cgi?id=226471 (with well over *four million* `Parser.shift` calls for just the one page), using the following manifest file:
```
[
    {  "id": "issue2618",
       "file": "../web/pdfs/issue2618.pdf",
       "md5": "",
       "rounds": 100,
       "type": "eq"
    }
]
```

This gave the following results when comparing this patch against the `master` branch:
```
-- Grouped By browser, stat --
browser | stat         | Count | Baseline(ms) | Current(ms) | +/- |    %  | Result(P<.05)
------- | ------------ | ----- | ------------ | ----------- | --- | ----- | -------------
Firefox | Overall      |   100 |         3386 |        3322 | -65 | -1.92 |        faster
Firefox | Page Request |   100 |            1 |           1 |   0 | -8.08 |
Firefox | Rendering    |   100 |         3385 |        3321 | -65 | -1.92 |        faster
```
2019-07-22 12:07:36 +02:00
Tim van der Meij
71d9f5f860
Merge pull request #10995 from Snuffleupagus/app-rm-one-setFileSize
Remove an unnecessary `PDFDocumentProperties.setFileSize` call, relevant for the Firefox built-in viewer, and use the "normal" code-path in `PDFViewerApplication.open` instead
2019-07-21 12:42:26 +02:00
Jonas Jenwald
53a854bb0a Remove an unnecessary PDFDocumentProperties.setFileSize call, relevant for the Firefox built-in viewer, and use the "normal" code-path in PDFViewerApplication.open instead
Since calling `getDocument` with a `PDFDataRangeTransport` argument will always unconditionally override a manually provided `length` argument, see a1a667809f/src/display/api.js (L390-L394), this patch should thus be safe.
2019-07-21 11:38:17 +02:00
Tim van der Meij
a1a667809f
Merge pull request #10993 from Snuffleupagus/AppOption-docBaseUrl
Add the `docBaseUrl` API parameter to `AppOptions` in the viewer
2019-07-20 13:57:52 +02:00
Jonas Jenwald
ba2c042a75 Add the docBaseUrl API parameter to AppOptions in the viewer
This unfortunately required a bit of special handling, to correctly deal with the various extension builds.
2019-07-20 13:39:34 +02:00
Tim van der Meij
0cc0789af3
Merge pull request #10986 from Snuffleupagus/inline-ensureByte-ensureRange
Attempt to significantly reduce the number of `ChunkedStream.{ensureByte, ensureRange}` calls by inlining the `this.progressiveDataLength` checks at the call-sites
2019-07-19 22:51:21 +02:00
Tim van der Meij
acef5bfd16
Merge pull request #10979 from Snuffleupagus/firefox-zoomreset
[Firefox] Re-factor the 'zoomreset' message handling in the viewer (PR 10652 follow-up)
2019-07-19 22:42:07 +02:00
Tim van der Meij
b964df53da
Merge pull request #10990 from Snuffleupagus/onBeforeDraw-onAfterDraw
Refactor the `onBeforeDraw`/`onAfterDraw` functionality used in `BaseViewer` and `PDFPageView`
2019-07-19 22:35:41 +02:00
Tim van der Meij
98c4c646cb
Merge pull request #10987 from mozilla/dependabot/npm_and_yarn/js-yaml-3.13.1
Bump js-yaml from 3.12.0 to 3.13.1
2019-07-19 22:25:13 +02:00
Jonas Jenwald
366eebeb0f Refactor the onBeforeDraw/onAfterDraw functionality used in BaseViewer and PDFPageView
This functionality is very old, and pre-dates e.g. the introduction of the `EventBus` by a number of years. Rather than attaching two callback functions to every single `PDFPageView` instance, it's thus now possible to utilize the `EventBus` such that you only need a grand total of two listeners to achieve the same result.

For the `onAfterDraw` callback the replacement is particularly simple, given that a 'pagerendered' event is already being dispatched in the appropriate spot. An added benefit here is the ability to remove the event listener, since we only ever care about *one* (arbitrary) page being rendered for the `BaseViewer.onePageRendered` promise.

For the `onBeforeDraw` callback, a new 'pagerender' event was thus added to replace the callback.
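A minimal sketch of the listener-based approach, assuming an event bus with `on`/`off`/`dispatch` methods and events that carry a `pageNumber` property. `MiniEventBus`, `firstRenderedPage` and `drawLog` are hypothetical names used only for this illustration.

```javascript
// Minimal stand-in for an event bus; the viewer's real class has more features.
class MiniEventBus {
  constructor() {
    this.listeners = new Map();
  }

  on(name, listener) {
    if (!this.listeners.has(name)) {
      this.listeners.set(name, []);
    }
    this.listeners.get(name).push(listener);
  }

  off(name, listener) {
    const list = this.listeners.get(name) || [];
    const i = list.indexOf(listener);
    if (i >= 0) {
      list.splice(i, 1);
    }
  }

  dispatch(name, data) {
    // Copy first, so listeners may remove themselves while being called.
    for (const listener of (this.listeners.get(name) || []).slice()) {
      listener(data);
    }
  }
}

const eventBus = new MiniEventBus();

// One listener replaces the per-page `onAfterDraw` callbacks. It removes
// itself, since only the first rendered page matters here.
let firstRenderedPage = null;
const onPageRendered = (evt) => {
  eventBus.off("pagerendered", onPageRendered);
  firstRenderedPage = evt.pageNumber;
};
eventBus.on("pagerendered", onPageRendered);

// One listener replaces the per-page `onBeforeDraw` callbacks.
const drawLog = [];
eventBus.on("pagerender", (evt) => drawLog.push(evt.pageNumber));

// Simulate two pages being drawn.
for (const pageNumber of [3, 4]) {
  eventBus.dispatch("pagerender", { pageNumber });
  eventBus.dispatch("pagerendered", { pageNumber });
}
```

With this arrangement the number of listeners stays at two regardless of how many pages the document has.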
2019-07-19 12:57:14 +02:00
dependabot[bot]
808f7db586
Bump js-yaml from 3.12.0 to 3.13.1
Bumps [js-yaml](https://github.com/nodeca/js-yaml) from 3.12.0 to 3.13.1.
- [Release notes](https://github.com/nodeca/js-yaml/releases)
- [Changelog](https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md)
- [Commits](https://github.com/nodeca/js-yaml/compare/3.12.0...3.13.1)

Signed-off-by: dependabot[bot] <support@github.com>
2019-07-19 00:04:03 +00:00
Jonas Jenwald
b5254f2745 Attempt to significantly reduce the number of ChunkedStream.{ensureByte, ensureRange} calls by inlining the this.progressiveDataLength checks at the call-sites
The number of `ChunkedStream.ensureByte` calls in particular is often absolutely *huge* (on the order of millions of calls) when loading and rendering even moderately complicated PDF files, which isn't entirely surprising considering that the `getByte`/`getBytes`/`peekByte`/`peekBytes` methods are used for essentially all data reading/parsing.

The idea implemented in this patch is to inline an inverted `progressiveDataLength` check at all of the `ensureByte`/`ensureRange` call-sites, which in practice will often result in *several* orders of magnitude fewer function calls.
Obviously this patch will only help if the browser supports streaming, which all reasonably modern browsers now do (including the Firefox built-in PDF viewer), and assuming that the user didn't set the `disableStream` option (e.g. for using `disableAutoFetch`). However, I think we should be able to improve performance for the default out-of-the-box use case, without worrying about e.g. older browsers (where this patch will thus incur *one* additional check before calling `ensureByte`/`ensureRange`).
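The inverted check can be sketched like this. It is a simplified model under stated assumptions, not the real `ChunkedStream`: `MiniChunkedStream`, `getByteWithCall`, `getByteInlined` and the `ensureByteCalls` counter are hypothetical, and missing data is signalled with a plain `Error`.

```javascript
// Simplified model of a stream whose first `progressiveDataLength` bytes
// have already arrived through streaming.
class MiniChunkedStream {
  constructor(bytes, progressiveDataLength) {
    this.bytes = bytes;
    this.pos = 0;
    this.progressiveDataLength = progressiveDataLength;
    this.ensureByteCalls = 0; // instrumentation for this sketch only
  }

  ensureByte(pos) {
    this.ensureByteCalls++;
    if (pos < this.progressiveDataLength) {
      return; // already streamed in
    }
    // Stand-in for the real missing-data handling.
    throw new Error("missing data at position " + pos);
  }

  // Before: one `ensureByte` call for every byte read.
  getByteWithCall() {
    const pos = this.pos;
    if (pos >= this.bytes.length) {
      return -1;
    }
    this.ensureByte(pos);
    return this.bytes[this.pos++];
  }

  // After: call `ensureByte` only when the byte may not have arrived yet.
  getByteInlined() {
    const pos = this.pos;
    if (pos >= this.bytes.length) {
      return -1;
    }
    if (pos >= this.progressiveDataLength) {
      this.ensureByte(pos);
    }
    return this.bytes[this.pos++];
  }
}
```

When the whole file has been streamed, `getByteInlined` never calls `ensureByte` at all; when streaming is unavailable, it pays for one extra comparison per read.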

This patch was inspired by the *first* commit in PR 5005, which was subsequently backed out in PR 5145 for causing regressions. Since the general idea of avoiding unnecessary function calls was really nice, I figured that re-attempting this in one way or another wouldn't be a bad idea.
Given that streaming is now supported, which it wasn't back then, using `progressiveDataLength` seemed like an easier approach in general since it also allowed supporting both `ensureByte` and `ensureRange`.

This sort of patch obviously needs data to back it up, hence I've benchmarked the changes using the following manifest file (with the default `tracemonkey` file):
```
[
    {  "id": "tracemonkey-eq",
       "file": "pdfs/tracemonkey.pdf",
       "md5": "9a192d8b1a7dc652a19835f6f08098bd",
       "rounds": 250,
       "type": "eq"
    }
]
```

I get the following complete results when comparing this patch against the `master` branch:
```
-- Grouped By browser, stat --
browser | stat         | Count | Baseline(ms) | Current(ms) | +/- |    %  | Result(P<.05)
------- | ------------ | ----- | ------------ | ----------- | --- | ----- | -------------
Firefox | Overall      |  3500 |          140 |         134 |  -6 | -4.46 |        faster
Firefox | Page Request |  3500 |            2 |           2 |   0 | -0.10 |
Firefox | Rendering    |  3500 |          138 |         131 |  -6 | -4.54 |        faster
```

Here it's pretty clear that the patch does have a positive net effect, even for a PDF file of fairly moderate size and complexity. However, in this case it's probably interesting to also look at the results per page:
```
-- Grouped By page, stat --
page | stat         | Count | Baseline(ms) | Current(ms) | +/- |     %  | Result(P<.05)
---- | ------------ | ----- | ------------ | ----------- | --- | ------ | -------------
0    | Overall      |   250 |           74 |          75 |   1 |   0.69 |
0    | Page Request |   250 |            1 |           1 |   0 |  33.20 |
0    | Rendering    |   250 |           73 |          74 |   0 |   0.25 |
1    | Overall      |   250 |          123 |         121 |  -2 |  -1.87 |        faster
1    | Page Request |   250 |            3 |           2 |   0 | -11.73 |
1    | Rendering    |   250 |          121 |         119 |  -2 |  -1.67 |
2    | Overall      |   250 |           64 |          63 |  -1 |  -1.91 |
2    | Page Request |   250 |            1 |           1 |   0 |   8.81 |
2    | Rendering    |   250 |           63 |          62 |  -1 |  -2.13 |        faster
3    | Overall      |   250 |           97 |          97 |   0 |  -0.06 |
3    | Page Request |   250 |            1 |           1 |   0 |  25.37 |
3    | Rendering    |   250 |           96 |          95 |   0 |  -0.34 |
4    | Overall      |   250 |           97 |          97 |   0 |  -0.38 |
4    | Page Request |   250 |            1 |           1 |   0 |  -5.97 |
4    | Rendering    |   250 |           96 |          96 |   0 |  -0.27 |
5    | Overall      |   250 |           99 |          97 |  -3 |  -2.92 |
5    | Page Request |   250 |            2 |           1 |   0 | -17.20 |
5    | Rendering    |   250 |           98 |          95 |  -3 |  -2.68 |
6    | Overall      |   250 |           99 |          99 |   0 |  -0.14 |
6    | Page Request |   250 |            2 |           2 |   0 | -16.49 |
6    | Rendering    |   250 |           97 |          98 |   0 |   0.16 |
7    | Overall      |   250 |           96 |          95 |  -1 |  -0.55 |
7    | Page Request |   250 |            1 |           2 |   1 |  66.67 |        slower
7    | Rendering    |   250 |           95 |          94 |  -1 |  -1.19 |
8    | Overall      |   250 |           92 |          92 |  -1 |  -0.69 |
8    | Page Request |   250 |            1 |           1 |   0 | -17.60 |
8    | Rendering    |   250 |           91 |          91 |   0 |  -0.52 |
9    | Overall      |   250 |          112 |         112 |   0 |   0.29 |
9    | Page Request |   250 |            2 |           1 |   0 |  -7.92 |
9    | Rendering    |   250 |          110 |         111 |   0 |   0.37 |
10   | Overall      |   250 |          589 |         522 | -67 | -11.38 |        faster
10   | Page Request |   250 |           14 |          13 |   0 |  -1.26 |
10   | Rendering    |   250 |          575 |         508 | -67 | -11.62 |        faster
11   | Overall      |   250 |           66 |          66 |  -1 |  -0.86 |
11   | Page Request |   250 |            1 |           1 |   0 | -16.48 |
11   | Rendering    |   250 |           65 |          65 |   0 |  -0.62 |
12   | Overall      |   250 |          303 |         291 | -12 |  -4.07 |        faster
12   | Page Request |   250 |            2 |           2 |   0 |  12.93 |
12   | Rendering    |   250 |          301 |         289 | -13 |  -4.19 |        faster
13   | Overall      |   250 |           48 |          47 |   0 |  -0.45 |
13   | Page Request |   250 |            1 |           1 |   0 |   1.59 |
13   | Rendering    |   250 |           47 |          46 |   0 |  -0.52 |
```

Here it's clear that this patch *significantly* improves the rendering performance of the slowest pages, while not causing any big regressions elsewhere. As expected, this patch thus helps larger and/or more complex pages the most (which is also where even small improvements will be most beneficial).
There's obviously the question of whether this is *slightly* regressing simpler pages, but given just how short the times are in most cases it's not inconceivable that the page results above are simply caused by e.g. limited `Date.now()` and/or limited numerical precision.
2019-07-18 17:30:22 +02:00
Jonas Jenwald
d1af8bd196 Slightly more simplified dispatching of the 'findbarclose' events in firefoxcom.js
2019-07-18 14:28:49 +02:00
Jonas Jenwald
8e5aa484fb [Firefox] Re-factor the 'zoomreset' message handling in the viewer (PR 10652 follow-up)
Given that this special-case only matters for the Firefox PDF viewer, it's probably better to just move it into `firefoxcom.js` instead to reduce unnecessary confusion.
2019-07-18 14:27:43 +02:00
Tim van der Meij
6e96a158f4
Merge pull request #10820 from vlastimilmaca/annot-irt-rt-states
Annotations - Added parsing of IRT, RT, State and StateModel
2019-07-17 23:34:31 +02:00
vlastimilmaca
fe49f0f766 Annotations - Implement parsing of IRT, RT, State and StateModel
2019-07-16 23:33:07 +02:00
Tim van der Meij
bf60fe88d0
Merge pull request #10974 from Snuffleupagus/refactor-get-fingerprint
Simplify the `PDFDocument.fingerprint` method slightly
2019-07-15 22:29:37 +02:00
Jonas Jenwald
bea15b6ce5 Simplify the PDFDocument.fingerprint method slightly
The way that this method handles documents without an `ID` entry in the Trailer dictionary feels overly complicated to me. Hence this patch adds `getByteRange` methods to the various Stream implementations[1], and utilizes them rather than manually calling `ensureRange` when computing a fallback `fingerprint`.

---
[1] Note that `PDFDocument` is only ever initialized with either a `Stream` or a `ChunkedStream`, hence why the `DecodeStream.getByteRange` method isn't implemented.
2019-07-15 13:26:08 +02:00
Jonas Jenwald
c7de6dbe41 Update the fingerprint API unit-tests to explicitly check for the expected result
The current tests won't catch inadvertent changes to the logic used to obtain/compute the document `fingerprint`.
2019-07-15 11:19:17 +02:00
Tim van der Meij
13ebfec903
Merge pull request #10969 from Snuffleupagus/api-test-stopAtErrors
Add an API unit-test for the `stopAtErrors` option (PRs 8240 and 8922 follow-up)
2019-07-14 14:47:57 +02:00
Tim van der Meij
766d076dcb
Merge pull request #10970 from Snuffleupagus/MessageHandler-simplify-finalize
Simplify, and inline, the `finalize` function in the `MessageHandler` class
2019-07-14 14:45:15 +02:00
Jonas Jenwald
b548bafef7 Simplify, and inline, the finalize function in the MessageHandler class
The `finalize` helper function has only a *single* call-site, and it's just a one-liner too. Furthermore, it's only ever called with a `Promise` as its argument, meaning that it's unnecessarily convoluted as well (i.e. the `Promise.resolve()` part shouldn't be necessary).
Hence this code can be both simplified *and* inlined at its only call-site instead.
2019-07-13 17:54:32 +02:00
Jonas Jenwald
c7fb7116d6 Add an API unit-test for the stopAtErrors option (PRs 8240 and 8922 follow-up)
Also fixes an inconsistency in the 'PageError' handler, for `getOperatorList`, in the API.
2019-07-13 16:06:05 +02:00
Tim van der Meij
b01cc55cfd
Merge pull request #10968 from Snuffleupagus/MessageHandler-rm-useless-wrapReason
Remove useless `wrapReason` calls in the `MessageHandler` class
2019-07-13 14:30:02 +02:00
Jonas Jenwald
17116917f7 Remove useless wrapReason calls in the MessageHandler class
Currently `wrapReason` is manually called at *every* `resolveOrReject` call-site, despite it being completely unnecessary unless there's an actual error being handled. This is obviously inefficient, and it's easy enough to avoid by having `resolveOrReject` handle this only when actually needed.
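The difference between wrapping at every call-site and wrapping only when needed can be sketched as follows. The function bodies are assumptions made for illustration: `resolveOrRejectEager`, `resolveOrRejectLazy`, `createRecorder` and the `wrapCalls` counter are hypothetical, and this `wrapReason` is only a stand-in for the real helper.

```javascript
let wrapCalls = 0; // instrumentation for this sketch only

// Hypothetical stand-in: turns a serialized reason into an Error object.
function wrapReason(reason) {
  wrapCalls++;
  return reason instanceof Error ? reason : new Error(String(reason));
}

// Before: the caller wraps the reason up front, even on success.
function resolveOrRejectEager(capability, success, wrappedReason) {
  if (success) {
    capability.resolve();
  } else {
    capability.reject(wrappedReason);
  }
}

// After: wrapping happens only on the failure path.
function resolveOrRejectLazy(capability, data) {
  if (data.success) {
    capability.resolve();
  } else {
    capability.reject(wrapReason(data.reason));
  }
}

// Minimal capability object that just records the outcome.
function createRecorder() {
  const recorder = { outcome: null };
  recorder.resolve = () => {
    recorder.outcome = "resolved";
  };
  recorder.reject = (reason) => {
    recorder.outcome = reason;
  };
  return recorder;
}
```

A call such as `resolveOrRejectEager(capability, true, wrapReason(data.reason))` still pays for the wrapping even though the result is discarded, whereas `resolveOrRejectLazy(capability, data)` does not.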
2019-07-13 13:08:29 +02:00
Tim van der Meij
421bf62849
Merge pull request #10966 from brendandahl/pagerendered-time
Add timestamp to the page rendered event.
2019-07-12 23:24:10 +02:00
Brendan Dahl
60b8b7a8d2 Add timestamp to the page rendered event.
This is needed to track rendering time in Firefox's talos performance
framework. See https://bugzilla.mozilla.org/show_bug.cgi?id=1565680
2019-07-12 14:08:23 -07:00
Tim van der Meij
e3496041b5
Merge pull request #10950 from monchouchou/master
Fixed testing webserver to handle paths correctly on Windows
2019-07-12 23:05:37 +02:00
Tim van der Meij
ed3954fc7a
Merge pull request #10851 from brendandahl/shading-bbox
Apply bounding box before using shading patterns.
2019-07-12 22:52:07 +02:00
Tim van der Meij
87f36e3520
Merge pull request #10850 from brendandahl/scale-line-width
Scale stroking line width when using a tiling pattern.
2019-07-12 22:50:32 +02:00
Tim van der Meij
28326165ff
Merge pull request #10958 from Snuffleupagus/api-rm-receivingOperatorList
Remove the `intentState.receivingOperatorList` boolean since it's redundant
2019-07-11 23:55:00 +02:00
Tim van der Meij
7e85f3fa77
Merge pull request #10964 from mozilla/dependabot/npm_and_yarn/lodash-4.17.14
Bump lodash from 4.17.10 to 4.17.14
2019-07-11 23:51:33 +02:00
Tim van der Meij
f8fd38744f
Merge pull request #10962 from Snuffleupagus/TextLayer-uncaught-promise-msg
Prevent "Uncaught promise" messages in the console when cancelling `TextLayer` tasks (PR 10601 follow-up)
2019-07-11 23:14:28 +02:00
Tim van der Meij
6e594a89da
Merge pull request #10959 from Snuffleupagus/rm-PrintService-body-attribute
Remove the `data-pdfjsprinting` attribute on the `<body>` when destroying `FirefoxPrintService`/`PDFPrintService` instances (issue 10948)
2019-07-11 23:12:40 +02:00
Tim van der Meij
478f05650c
Merge pull request #10963 from Snuffleupagus/app-zoomIn-zoomOut-presentationMode
Ensure that `PDFViewerApplication.{zoomIn, zoomOut}` won't run when PresentationMode is active (PR 10652 follow-up)
2019-07-11 23:09:23 +02:00
dependabot[bot]
99de61038a
Bump lodash from 4.17.10 to 4.17.14
Bumps [lodash](https://github.com/lodash/lodash) from 4.17.10 to 4.17.14.
- [Release notes](https://github.com/lodash/lodash/releases)
- [Commits](https://github.com/lodash/lodash/compare/4.17.10...4.17.14)

Signed-off-by: dependabot[bot] <support@github.com>
2019-07-11 13:44:41 +00:00
Jonas Jenwald
19f6facc1e Ensure that PDFViewerApplication.{zoomIn, zoomOut} won't run when PresentationMode is active (PR 10652 follow-up)
Similar to the `zoomReset` method we need to ensure that this code won't run for zoom events originating within the browser UI itself, since checks in e.g. the `keydown` event handler won't help in that case.
2019-07-11 15:41:44 +02:00
Jonas Jenwald
9a4d14bf36 Prevent "Uncaught promise" messages in the console when cancelling TextLayer tasks (PR 10601 follow-up)
Since `finally` won't stop error propagation, this causes unnecessary messages to be printed in the console whenever a `TextLayer` task is cancelled.
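The behaviour of `finally` described here can be demonstrated with a small, self-contained example. `startCancelledTask` and `events` are hypothetical names; this sketch shows standard `Promise` semantics rather than the actual fix applied in the viewer.

```javascript
// Hypothetical task whose promise rejects when it is cancelled.
function startCancelledTask() {
  return Promise.reject(new Error("task cancelled"));
}

const events = [];

// `finally` runs the cleanup but passes the rejection along unchanged, so
// without the trailing handler the runtime would report it as uncaught.
const settled = startCancelledTask()
  .finally(() => events.push("cleanup"))
  .catch((reason) => {
    events.push("handled: " + reason.message);
  });
```

Once `settled` resolves, `events` contains the cleanup entry followed by the handled rejection, and nothing is left for the console to report.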
2019-07-11 11:48:33 +02:00
Brendan Dahl
8444aeec83
Merge pull request #10960 from timvandermeij/bump
Bump versions in `pdfjs.config`
2019-07-10 14:21:07 -07:00
Tim van der Meij
734074c547
Bump versions in pdfjs.config
2019-07-10 22:25:24 +02:00
Jonas Jenwald
cd48f05597 Remove the data-pdfjsprinting attribute on the <body> when destroying FirefoxPrintService/PDFPrintService instances (issue 10948)
Also, cleans up variable definitions slightly in the `FirefoxPrintService.layout` method.
2019-07-10 16:49:31 +02:00
Jonas Jenwald
ef48a9a713 Update the PageError handler, in the API, to always mark the operatorList as done and finalize any pending renderTasks
Note that, in the old code, there was a code-path which could prevent this from happening thus affecting future cleanup.
Furthermore, ensure that we'll always attempt to clean up when handling the 'PageError' message, similar to the code in e.g. the `PDFPageProxy._renderPageChunk` method.
2019-07-10 14:23:59 +02:00
Jonas Jenwald
c6fcdf474b Remove the intentState.receivingOperatorList boolean since it's redundant
The `receivingOperatorList` property is currently tracked *twice* in the rendering code, both directly and inversely through the `intentState.operatorList.lastChunk` boolean. This type of double bookkeeping is never a good idea, since it's just too easy for the properties to accidentally fall out of sync.

In this case there's even a `cleanup`-related bug caused by this, which means that `PDFPageProxy._tryCleanup` will never be able to discard any data if there's an error on the worker-thread (as handled through the 'PageError' message).

Hence the simplest solution seems, at least to me, to update `PDFPageProxy._tryCleanup` to replace the `intentState.receivingOperatorList` check with a `!intentState.operatorList.lastChunk` check and completely remove the former property.
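A minimal sketch of why the duplicated flag is fragile. The two functions below are simplified assumptions rather than the real `PDFPageProxy._tryCleanup`, and the `afterError` object is an illustrative state in which the separately tracked boolean was never reset while the operator list is already marked as complete.

```javascript
// Old check (sketch): relies on a separately maintained boolean.
function canCleanupOld(intentState) {
  return (
    intentState.renderTasks.length === 0 && !intentState.receivingOperatorList
  );
}

// New check (sketch): derives the answer from the operator list itself.
function canCleanupNew(intentState) {
  return (
    intentState.renderTasks.length === 0 && intentState.operatorList.lastChunk
  );
}

// Illustrative (assumed) state after an error: the two flags disagree.
const afterError = {
  renderTasks: [],
  receivingOperatorList: true, // stale: never reset on the error path
  operatorList: { fnArray: [], argsArray: [], lastChunk: true },
};
```

Here `canCleanupOld(afterError)` refuses the cleanup because of the stale flag, while `canCleanupNew(afterError)` allows it. With a single source of truth there is nothing left to fall out of sync.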
2019-07-10 14:23:10 +02:00
Brendan Dahl
6fab0a0dac Apply bounding box before using shading patterns.
Fixes #8092
2019-07-08 14:05:48 -07:00
Brendan Dahl
446efab707 Scale stroking line width when using a tiling pattern.
2019-07-08 13:47:54 -07:00
Tim van der Meij
d7afb74a6e
Merge pull request #10949 from Snuffleupagus/delay-findController-init
Delay initialization of searching, in the viewer, until the first page has rendered
2019-07-08 22:39:22 +02:00
alephneo
f861d5c0d4 Fixed test/webserver to handle paths correctly on Windows
2019-07-07 02:42:50 +05:30
Jonas Jenwald
d3c0f2861b Delay initialization of searching, in the viewer, until the first page has rendered
When searching occurs for the first time in a document, the `textContent` of all pages will be fetched from the API. If there's a pending search operation when the document loads that will thus lead to a lot of `getTextContent` calls very early on, which may unnecessarily delay rendering of the first page. Generally, in the viewer, a number of non-essential API calls[1] will be deferred until the first page has been rendered, and there's no good reason as far as I can tell to handle searching differently.

---
[1] Such as e.g. `getOutline` and `getAttachments`.
2019-07-06 17:33:28 +02:00
Tim van der Meij
d66d273869
Merge pull request #10947 from Snuffleupagus/document-find-peekBytes
Make the `find` helper function, in `src/core/document.js`, more efficient by using `peekBytes` rather than reading the stream one byte at a time
2019-07-06 13:52:22 +02:00
Jonas Jenwald
bdc31f8b50 Make the find helper function, in src/core/document.js, more efficient by using peekBytes rather than reading the stream one byte at a time
*Please note:* A similar change was attempted in PR 5005, but it was subsequently backed out in PR 5069.

Unfortunately I don't think anyone ever tried to debug *exactly* why it didn't work, since it ought to have worked, and having re-tested this now I'm not able to reproduce the problem any more. However, given just how inefficient the current code is, with thousands of strictly unnecessary function calls for each `find` invocation, I'd really like to try fixing this again.
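The general approach, fetching the scan window once and searching it in memory, can be sketched as follows. This is a forward-only illustration under stated assumptions, not the function from `src/core/document.js`: `MiniStream`, the default `limit` of 1024 and the exact return convention are choices made for this sketch.

```javascript
// Minimal stand-in for a stream that supports peeking without advancing.
class MiniStream {
  constructor(bytes) {
    this.bytes = bytes;
    this.pos = 0;
  }

  peekBytes(length) {
    return this.bytes.subarray(this.pos, this.pos + length);
  }
}

// Searches forward for `signature` within the next `limit` bytes. On
// success the stream position is moved to the start of the match.
function find(stream, signature, limit = 1024) {
  const signatureLength = signature.length;
  const scanBytes = stream.peekBytes(limit); // a single stream access
  const lastStart = scanBytes.length - signatureLength;
  if (lastStart < 0) {
    return false;
  }
  for (let start = 0; start <= lastStart; start++) {
    let j = 0;
    while (
      j < signatureLength &&
      scanBytes[start + j] === signature.charCodeAt(j)
    ) {
      j++;
    }
    if (j === signatureLength) {
      stream.pos += start;
      return true;
    }
  }
  return false;
}

const stream = new MiniStream(
  Uint8Array.from("junk%PDF-1.7", (ch) => ch.charCodeAt(0))
);
const found = find(stream, "%PDF-");
```

All comparisons run against the in-memory `scanBytes` array, so the number of stream method calls per `find` invocation drops from one per byte to one in total.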
2019-07-06 11:44:17 +02:00