Commit Graph

415 Commits

Author SHA1 Message Date
Tim van der Meij
37d5b80ba8
Merge pull request #11118 from Snuffleupagus/FetchBuiltInCMap-sendWithStream
Transfer, rather than copy, CMap data to the worker-thread
2019-09-06 22:56:14 +02:00
Jonas Jenwald
7dea3f9389 [api-minor] Remove the postMessageTransfers parameter, and thus the ability to manually disable transferring of data, from the API
By transfering, rather than copying, `ArrayBuffer`s between the main- and worker-threads, you can avoid unnecessary allocations by only having *one* copy of the same data.
Hence manually setting `postMessageTransfers: false`, when calling `getDocument`, is a performance footgun[1] which will do nothing but waste memory.

Given that every reasonably modern browser supports `postMessage` transfers[2], I really don't see why it should be possible to force-disable this functionality.
Looking at the browser support, for `postMessage` transfers[2], it's highly unlikely that PDF.js is even usable in browsers without it. However, the feature testing of `postMessage` transfers is kept for the time being just to err on the safe side.

---
[1] This is somewhat similar to the, now removed, `disableWorker` parameter which also provided API users a much too simple way of reducing performance.

[2] See e.g. https://developer.mozilla.org/en-US/docs/Web/API/MessagePort/postMessage#Browser_compatibility and https://developer.mozilla.org/en-US/docs/Web/API/Transferable#Browser_compatibility
2019-09-05 13:09:54 +02:00
Jonas Jenwald
f0534b9b51 Adjust the values sent, with the 'test' message, by the WorkerMessageHandler.setup method
Note how the sent values have inconsistent types, with a boolean in one case and an object in the other (normal) case.
Furthermore, explicitly sending a `supportTypedArray: true` property seems superfluous at least to me.
2019-09-05 11:27:27 +02:00
Jonas Jenwald
f11a4ba750 Transfer, rather than copy, CMap data to the worker-thread
It recently occurred to me that the CMap data should be an excellent candidate for transfering.
This will help reduce peak memory usage for PDF documents using CMaps, since transfering of data avoids duplicating it on both the main- and worker-threads.

Unfortunately it's not possible to actually transfer data when *returning* data through `sendWithPromise`, and another solution had to be used.
Initially I looked at using one message for requesting the data, and another message for returning the actual CMap data. While that should have worked, it would have meant adding a lot more complexity particularly on the worker-thread.
Hence the simplest solution, at least in my opinion, is to utilize `sendWithStream` since that makes it *really* easy to transfer the CMap data. (This required PR 11115 to land first, since otherwise CMap fetch errors won't propagate correctly to the worker-thread.)

Please note that the patch *purposely* only changes the API to Worker communication, and not the API *itself* since changing the interface of `CMapReaderFactory` would be a breaking change.
Furthermore, given the relatively small size of the `.bcmap` files (the largest one is smaller than the default range-request size) streaming doesn't really seem necessary either.
2019-09-04 11:46:04 +02:00
Jonas Jenwald
229f6f34d1 Remove the API/Worker version warning message in TESTING mode
The warning messages turn out to be more annoying than helpful when looking at the `console` during tests, so let's just remove them.
2019-09-01 16:47:26 +02:00
Jonas Jenwald
055f03938b Remove support for the scope parameter in the MessageHandler.on method
At this point in time it's easy to convert the `MessageHandler.on` call-sites to use arrow functions, and thus let the JavaScript engine handle scopes for us, rather than having to manually keep references to the relevant scopes in `MessageHandler`.[1]
An additional benefit of this is that a couple of `Function.prototype.call()` instances can now be converted into "normal" function calls, which should be a tiny bit more efficient.

All in all, I don't see any compelling reason why it'd be necessary to keep supporting custom `scope`s in the `MessageHandler` implementation.

---
[1] In the event that a custom scope is ever needed, simply using `bind` on the handler function when calling `MessageHandler.on` ought to work as well.
2019-09-01 09:24:15 +02:00
Jonas Jenwald
ae0d9e8c2a Replace some instances of implicit function.bind(this) usage, in src/display/api.js, with arrow functions instead 2019-08-30 11:35:05 +02:00
Yury Delendik
66e0dd1b06 Use streams for OperatorList chunking (issue 10023)
*Please note:* The majority of this patch was written by Yury, and it's simply been rebased and slightly extended to prevent issues when dealing with `RenderingCancelledException`.

By leveraging streams this (finally) provides a simple way in which parsing can be aborted on the worker-thread, which will ultimately help save resources.
With this patch worker-thread parsing will *only* be aborted when the document is destroyed, and not when rendering is cancelled. There's a couple of reasons for this:

 - The API currently expects the *entire* OperatorList to be extracted, or an Error to occur, once it's been started. Hence additional re-factoring/re-writing of the API code will be necessary to properly support cancelling and re-starting of OperatorList parsing in cases where the `lastChunk` hasn't yet been seen.
 - Even with the above addressed, immediately cancelling when encountering a `RenderingCancelledException` will lead to worse performance in e.g. the default viewer. When zooming and/or rotation of the document occurs it's very likely that `cancel` will be (almost) immediately followed by a new `render` call. In that case you'd obviously *not* want to abort parsing on the worker-thread, since then you'd risk throwing away a partially parsed Page and thus be forced to re-parse it again which will regress perceived performance.
 - This patch is already *somewhat* risky, given that it touches fundamentally important/critical code, and trying to keep it somewhat small should hopefully reduce the risk of regressions (and simplify reviewing as well).

Time permitting, once this has landed and been in Nightly for awhile, I'll try to work on the remaining points outlined above.

Co-Authored-By: Yury Delendik <ydelendik@mozilla.com>
Co-Authored-By: Jonas Jenwald <jonas.jenwald@gmail.com>
2019-08-24 15:56:40 +02:00
Jonas Jenwald
0276385e6e [api-minor] Fix completely broken getStats method by returning stats in Objects, rather than in Arrays (PR 11029 follow-up)
With the changes to the `StreamType`/`FontType` "enums" in PR 11029, one unfortunate result is that `getStats` now *always* returns empty Arrays. Something that everyone, myself included, apparently missed is that you obviously cannot index an Array with Strings :-)

I wrongly assumed that the unit-tests would catch any bugs, but they apparently suffered from the same issue as the code in `src/core/`.

Another possible option could perhaps be to use `Set`s, rather than objects, but that will require larger changes since `LoopbackPort` (in `src/display/api.js`) doesn't support them.
2019-08-02 14:09:24 +02:00
Jonas Jenwald
a3150166ec Ensure that ReadableStreams are cancelled with actual Errors
There's a number of spots in the current code, and tests, where `cancel` methods are not called with appropriate arguments (leading to Promises not being rejected with Errors as intended).
In some cases the cancel `reason` is implicitly set to `undefined`, and in others the cancel `reason` is just a plain String. To address this inconsistency, the patch changes things such that cancelling is done with `AbortException`s everywhere instead.
2019-08-01 16:40:46 +02:00
Jonas Jenwald
c7fb7116d6 Add an API unit-test for the stopAtErrors option (PRs 8240 and 8922 follow-up)
Also fixes an inconsistency in the 'PageError' handler, for `getOperatorList`, in the API.
2019-07-13 16:06:05 +02:00
Jonas Jenwald
ef48a9a713 Update the PageError handler, in the API, to always mark the operatorList as done and finalize any pending renderTasks
Note that, in the old code, there was a code-path which could prevent this from happening thus affecting future cleanup.
Furthermore, ensure that we'll always attempt to cleanup when handling the 'PageError' message, similar to the code in e.g. the `PDFPageProxy._renderPageChunk` method.
2019-07-10 14:23:59 +02:00
Jonas Jenwald
c6fcdf474b Remove the intentState.receivingOperatorList boolean since it's redundant
The `receivingOperatorList` property is currently tracked *twice* in the rendering code, both directly and inversely through the `intentState.operatorList.lastChunk` boolean. This type of double bookkeeping is never a good idea, since it's just too easy for the properties to accidentally fall out of sync.

In this case there's even a `cleanup`-related bug caused by this, which means that `PDFPageProxy._tryCleanup` will never be able to discard any data if there's an error on the worker-thread (as handled through the 'PageError' message).

Hence the simplest solution seems, at least to me, to update `PDFPageProxy._tryCleanup` to replace the `intentState.receivingOperatorList` check with a `!intentState.operatorList.lastChunk` check and completely remove the former property.
2019-07-10 14:23:10 +02:00
Tim van der Meij
06b253d609
Merge pull request #10890 from Snuffleupagus/outline-items-hidden
Add support for outline items, in the default viewer, which default to collapsed when the outline is built
2019-06-09 11:35:49 +02:00
Jonas Jenwald
26bc630e19 Add support for outline items, in the default viewer, which default to collapsed when the outline is built
The PDF specification supports this feature, which is commonly used in large/long documents (such as the spec itself), and it seems reasonably straightforward to implement; see https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G11.2095911
2019-06-07 12:26:23 +02:00
Jonas Jenwald
625af8d2ad [api-minor] Attempt to reduce memory usage during printing, by always running cleanup once rendering has finished
Given that `cleanupAfterRender` is already set for large images, when handling 'obj' messages, this patch *should* thus be safe in general (since otherwise there ought be existing bugs related to cleanup and printing).
2019-06-03 00:29:17 +02:00
Jonas Jenwald
8857a81c8d Re-use, rather than re-creating, some Arrays when resetting them in src/display/api.js
Calling `someArray = []` will create a new Array, which seems completely unnecessary when it's sufficient to just call `someArray.length = 0` to achieve the same effect.

Even though I cannot imagine these particular cases having any noticeable performance impact, similar changes were made in `core/` code years ago since it's apparently more efficient memory wise.
2019-05-30 16:33:05 +02:00
Jonas Jenwald
173fbef05b Enable the consistent-return ESLint rule
This rule is already enabled in mozilla-central, and helps ensure more consistent functions/methods, see https://searchfox.org/mozilla-central/rev/b9da45f63cb567244933c77b2c7e827a057d3f9b/tools/lint/eslint/eslint-plugin-mozilla/lib/configs/recommended.js#119-120

Please see https://eslint.org/docs/rules/consistent-return for additional information.
2019-05-11 14:27:21 +02:00
Tim van der Meij
762c58e0fc
Merge pull request #10738 from Snuffleupagus/ViewerPreferences-api
[api-minor] Add support for ViewerPreferences in the API (issue 10736)
2019-04-20 18:39:32 +02:00
Jonas Jenwald
311bac3ebb [api-minor] Add support for ViewerPreferences in the API (issue 10736)
Please see the specification, https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#M11.9.12864.1Heading.71.Viewer.Preferences

Furthermore, note that this patch *only* adds API support and unit-tests but does not attempt to integrate e.g. the `ViewerPreferences -> Direction` property into the viewer (which would be necessary to address issue 10736).
The reason for this is that it's not entirely clear to me exactly if/how that could be implemented; e.g. would it be as simple as setting the `dir` attribute on the `viewerContainer` DOM element, or will it be more complicated?
There's also the question of how the `ViewerPreferences -> Direction` value interacts with the `PageMode`, and this will generally require a fair bit of manual testing. Since the direction of the *entire* viewer depends on the browser locale, there's also a somewhat open question regarding what default value to use for different locales.
Finally, if the viewer supports `ViewerPreferences -> Direction` then I'm assuming that it will be necessary to allow users to override the default value, which will require (most likely) new `SecondaryToolbar` buttons and icons for those etc.

Hence this patch only lays the necessary foundation for eventually addressing issue 10736, but defers the actual implementation until later. (Time permitting, I'll try to look into the viewer part later.)
2019-04-14 14:20:52 +02:00
Jonas Jenwald
be604bd195 Support (rare) Type3 fonts which contains image resources (issue 10717)
The Type3 font type is not commonly used in PDF documents, as can be seen from telemetry data such as: https://telemetry.mozilla.org/new-pipeline/dist.html#!cumulative=0&end_date=2019-04-09&include_spill=0&keys=__none__!__none__!__none__&max_channel_version=nightly%252F68&measure=PDF_VIEWER_FONT_TYPES&min_channel_version=nightly%252F57&processType=*&product=Firefox&sanitize=1&sort_by_value=0&sort_keys=submissions&start_date=2019-03-18&table=0&trim=1&use_submission_date=0 (see also https://github.com/mozilla/pdf.js/wiki/Enumeration-Assignments-for-the-Telemetry-Histograms#pdf_viewer_font_types).

Type3 fonts containing image resources are *very* rare in practice, usually they only contain path rendering operators, but as the issue shows they unfortunately do exist.
Currently these Type3-related image resources are not handled in any special way, and given that fonts are document rather than page specific rendering breaks since the image resources are thus not available to the *entire* document.
Fortunately fixing this isn't too difficult, but it does require adding a couple of Type3-specific code-paths to the `PartialEvaluator`. In order to keep the implementation simple, particularily on the main-thread, these Type3 image resources are completely decoded on the worker-thread to avoid adding too many special cases. This should not cause any issues, only marginally less efficient code, but given how rare this kind of Type3 font is adding premature optimizations didn't seem at all warranted at this point.
2019-04-13 18:27:50 +02:00
Tim van der Meij
17de90b88a
Merge pull request #10694 from Snuffleupagus/main-thread-progressiveDataLength
Avoid dispatching range requests to fetch PDF data that's already loaded with streaming (PR 10675 follow-up)
2019-04-13 17:15:01 +02:00
Jonas Jenwald
a7273c8efe Avoid dispatching range requests to fetch PDF data that's already loaded with streaming (PR 10675 follow-up)
*Please note:* This patch purposely ignores `src/display/network.js`, since its support for progressive reading depends on the non-standard `moz-chunked-arraybuffer` responseType which is currently in the process of being removed.
2019-04-13 00:26:13 +02:00
Jonas Jenwald
f0a28b3c0d [Firefox] Ensure that loading progress is reported, and the loadingBar updated, when disableRange=true is set
With PR 10675 having fixed the completely broken `disableRange=true` setting in the Firefox version of PDF.js, I couldn't help but noticing that loading progress is never reported properly in that case.
Currently loading progress is only reported for the `rangeProgress` chrome-event, which obviously isn't dispatched with `disableRange=true` set. However, the `progressiveRead` chrome-event includes loading progress as well, but this information isn't being used in any way.
Furthermore, the `PDFDataRangeTransport.onDataProgress` method wasn't able to handle "complete" loading information, and neither was `PDFDataTransportStream._onProgress` since that method would only ever attempt to report it through a RangeReader (which won't exist when `disableRange=true` is set).
2019-04-06 12:53:33 +02:00
Jonas Jenwald
7a999d1d67 [api-minor] Add basic support for PageLayout in the API and the viewer
Please see the specification, https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.2393749, and refer to the inline comments for additional details.
2019-04-05 11:32:01 +02:00
Tim van der Meij
57abddc9ca
Merge pull request #10713 from Snuffleupagus/rm-JSDoc-annotation
Remove `src/core/annotation.js` from the `gulp jsdoc` build target
2019-04-04 23:15:02 +02:00
Jonas Jenwald
f666395c24 Remove src/core/annotation.js from the gulp jsdoc build target
Note how at https://mozilla.github.io/pdf.js/api/ it's being described as API docs, however `src/core/annotation.js` is not part of the public API.
Furthermore, given that the code residing in the `src/core/` folder is run in a worker-thread, it's not even accessible on the main-thread (since `postMessage` is being used to transfer the data).
Hence the different API methods simply returns a "proxy" to the underlying data, but not actually the same objects and data structures as in the worker-thread itself; thus it doesn't make a whole lot of sense to expose this in API docs as far as I'm concerned.

Finally, the patch fixes a small JSDoc related typo in `src/display/api.js` when referring to the `TextStyle` typedef.
2019-04-04 18:03:08 +02:00
Jonas Jenwald
bb384dd5ed [Firefox regression] Fix disableRange=true bug in PDFDataTransportStream
Currently if trying to set `disableRange=true` in the built-in PDF Viewer in Firefox, either through `about:config` or via the URL hash, the PDF document will never load. It appears that this has been broken for a couple of years, without anyone noticing.

Obviously it's not a good idea to set `disableRange=true`, however it seems that this bug affects the PDF Viewer in Firefox even with default settings:
 - In the case where `initialData` already contains the *entire* file, we're forced to dispatch a range request to re-fetch already available data just so that file loading may complete.
 - (In the case where the data arrives, via streaming, before being specifically requested through `requestDataRange`, we're also forced to re-fetch data unnecessarily.) *This part was removed, to reduce the scope/risk of the patch somewhat.*

In the cases outlined above, we're having to re-fetch already available data thus potentially delaying loading/rendering of PDF files in Firefox (and wasting resources in the process).
2019-03-26 16:34:13 +01:00
Jonas Jenwald
983b25f863 Ensure that blob: URLs will be revoked when pages are cleaned-up/destroyed
Natively supported JPEG images are sent as-is, using a `blob:` or possibly a `data` URL, to the main-thread for loading/decoding.
However there's currently no attempt at releasing these resources, which are held alive by `blob:` URLs, which seems unfortunately given that images can be arbitrarily large.

As mentioned in https://developer.mozilla.org/en-US/docs/Web/API/URL/createObjectURL the lifetime of these URLs are tied to the document, hence they are not being removed when a page is cleaned-up/destroyed (e.g. when being removed from the `PDFPageViewBuffer` in the viewer).
This is easy to test with the help of `about:memory` (in Firefox), which clearly shows the number of `blob:` URLs becomming arbitrarily large *without* this patch. With this patch however the `blob:` URLs are immediately release upon clean-up as expected, and the memory consumption should thus be considerably reduced for long documents with (simple) JPEG images.
2019-03-15 10:40:58 +01:00
Jonas Jenwald
24fc4f83ca Small clean-up of the PDFDocumentProxy.destroy method and related code
Note how `PDFDocumentProxy.destroy` is a nothing more than an alias for `PDFDocumentLoadingTask.destroy`. While removing the latter method would be a breaking API change, there's still room for at least some clean-up here.

The main changes in this patch are:
 - Stop providing a `PDFDocumentLoadingTask` instance *separately* when creating a `PDFDocumentProxy`, since the loadingTask is already available through the `WorkerTransport` instance.
 - Stop tracking the `PDFDocumentProxy` instance on the `WorkerTransport`, since that property is completely unused.
 - Simplify the 'Multiple `getDocument` instances' unit-tests by only destroying *once*, rather than twice, for each document.
2019-03-12 13:25:29 +01:00
Jonas Jenwald
7caf769a66 Move the deprecated helper function to the src/display/display_utils.js file
Given that the function is (purposely) independent of the verbosity level and that its message is worded to only apply on the main-thread, there's no reason to duplicate this across the built `pdf.js`/`pdf.worker.js` files.
2019-03-02 20:23:56 +01:00
Jonas Jenwald
4170c414fa Reduce usage of Date.now() in src/core/worker.js
Currently for every single parsed/rendered page there's no less than *four* `Date.now()` calls being made on the worker-side. This seems totally unnecessary, since the result of these calls are, by default, not used for anything *unless* the verbosity level is set to `INFO`.
2019-03-02 20:23:52 +01:00
Jonas Jenwald
4687cc85ac Zero the width/height of the temporary canvas used during JpegDecode (issue 10594) 2019-02-28 12:23:34 +01:00
Jonas Jenwald
a1f7517996 Rename the src/display/dom_utils.js file to src/display/display_utils.js
This file (currently) contains not only DOM-specific helper functions/classes, but is used generally for various helper code relevant for main-thread functionality.
2019-02-23 16:30:16 +01:00
Jonas Jenwald
b6d090cc14 Fallback to the built-in font renderer when font loading fails
After PR 9340 all glyphs are now re-mapped to a Private Use Area (PUA) which means that if a font fails to load, for whatever reason[1], all glyphs in the font will now render as Unicode glyph outlines.
This obviously doesn't look good, to say the least, and might be seen as a "regression" since previously many glyphs were left in their original positions which provided a slightly better fallback[2].

Hence this patch, which implements a *general* fallback to the PDF.js built-in font renderer for fonts that fail to load (i.e. are rejected by the sanitizer). One caveat here is that this only works for the Font Loading API, since it's easy to handle errors in that case[3].

The solution implemented in this patch does *not* in any way delay the loading of valid fonts, which was the problem with my previous attempt at a solution, and will only require a bit of extra work/waiting for those fonts that actually fail to load.

*Please note:* This patch doesn't fix any of the underlying PDF.js font conversion bugs that's responsible for creating corrupt font files, however it does *improve* rendering in a number of cases; refer to this possibly incomplete list:

[Bug 1524888](https://bugzilla.mozilla.org/show_bug.cgi?id=1524888)
Issue 10175
Issue 10232

---
[1] Usually because the PDF.js font conversion code wasn't able to parse the font file correctly.

[2] Glyphs fell back to some default font, which while not accurate was more useful than the current state.

[3] Furthermore I'm not sure how to implement this generally, assuming that's even possible, and don't really have time/interest to look into it either.
2019-02-11 10:27:08 +01:00
Jonas Jenwald
13230a1123 Remove the ability to pass in more than one font to BaseFontLoader.bind
- The only existing call-site, of this method, is never passing more than *one* font at a time anyway.
 - As far as I can remember, this functionality has never actually been used (caveat: I didn't check the git history).
 - This allows simplification of the method, especially by making use of the fact that it's now asynchronous.
 - It should be just as easy to call `BaseFontLoader.bind` from within a loop, rather than having the loop in the method itself.
2019-02-10 21:09:57 +01:00
Jonas Jenwald
af3fcca88d Convert BaseFontLoader.bind to be async, and only utilize BaseFontLoader._queueLoadingCallback when actually necessary
Currently all fonts are using the `_queueLoadingCallback` method to determine when they have been loaded[1]. However in most cases this is just adding unnecessary overhead, especially with `BaseFontLoader.bind` now being asynchronous, given how fonts are loaded:
 - For fonts loaded using the Font Loading API, it's already possible to easily tell when a font has been loaded simply by checking the `loaded` promise on the FontFace object itself.
 - For browsers, e.g. Firefox, which support synchronous font loading it's already assumed that fonts are immediately available.

Hence the `_queueLoadingCallback` method is moved into the `GenericFontLoader`, such that it's only utilized for fonts which are loaded using CSS.

---
[1] In the "fonts loaded using CSS" case, this is already a hack anyway as outlined in the comments.
2019-02-10 21:09:57 +01:00
Jonas Jenwald
614e502227 [api-minor] Remove the document.currentScript polyfill
This polyfill is currently used in only *one* file, i.e. `src/display/api.js`, and only when trying to build a *fallback* `workerSrc` path.

Given that the global `workerSrc` should *always* be set[1] when using the PDF.js library[2], and that the fallback `workerSrc` should only be regarded as a best-effort solution anyway, there isn't a particularily strong reason to keep the compatibility code in my opinion.

---
[1] Other supported options include setting the global `workerPort`, or passing in a `PDFWorker` instance as part of the `getDocument` call.

[2] Which is clearly mentioned in the JSDocs in `src/display/worker_options.js`.
2019-02-03 14:09:24 +01:00
Jonas Jenwald
5081063b9e Attempt to clean-up/restore pending rendering operations when errors occurs while a RenderTask runs (PR 10202 follow-up)
This piggybacks of the existing `cancel` functionality, to ensure that any pending operations are closed *and* that any temporary canvases are actually being removed.

Also simplifies `finishPaintTask` in `PDFPageView.draw` slightly, by converting it to an async function.
2019-01-26 16:02:51 +01:00
Tim van der Meij
103f4616ac
Merge pull request #10334 from Snuffleupagus/OpenAction-dest
[api-minor] Add support for OpenAction destinations (issue 10332)
2018-12-23 20:49:50 +01:00
Jonas Jenwald
f0719ed565 [api-minor] Change the getViewport method, on PDFPageProxy, to take a parameter object rather than a bunch of (randomly) ordered parameters
If, as PR 10368 suggests, more parameters should be added to `getViewport` I think that it would be a mistake to not change the signature *first* to avoid needlessly unwieldy call-sites.

To not break any existing code and third-party use-cases, this is obviously implemented with a deprecation warning *and* with a working fallback[1] for the old method signature.

---
[1] This is limited to `GENERIC` builds, which should be sufficient.
2018-12-21 11:55:20 +01:00
Jonas Jenwald
b05f053287 [api-minor] Add support for OpenAction destinations (issue 10332)
Note that the OpenAction dictionary may contain other information besides just a destination array, e.g. instructions for auto-printing[1].
Given first of all that an arbitrary `Dict` cannot be sent from the Worker (since cloning would fail), and second of all that the data obviously needs to be validated, this patch purposely only adds support for fetching a destination from the OpenAction entry[2].

---
[1] This information is, currently in PDF.js, being included through the `getJavaScript` API method.

[2] This significantly reduces the complexity of the implementation, which seems fine for now. If there's ever need for other kinds of OpenAction to be fetched, additional API methods could/should be implemented as necessary (could e.g. follow the `getOpenActionWhatever` naming scheme).
2018-12-19 11:45:16 +01:00
Jonas Jenwald
ac6b94c9dd Replace the remaining occurences, in src/display/api.js, of var with let/const 2018-11-18 19:08:27 +01:00
Jonas Jenwald
061f7bd2f3 Convert PDFWorker, in src/display/api.js, to an ES6 class
Also changes all occurrences of `var` to `let`/`const` in this code.
2018-11-18 19:08:27 +01:00
Jonas Jenwald
02e77a39ec Convert InternalRenderTask, in src/display/api.js, to an ES6 class
This changes all occurrences of `var` to `let`/`const` in this code, and updates the signature of the constructor to use object destructuring for better readability (and self documentation).
Also, `useRequestAnimationFrame` is changed to a parameter and the `typeof window` check is now done *once* rather than at every `_scheduleNext` call.
2018-11-18 19:08:27 +01:00
Jonas Jenwald
5a0d64a6de Convert PDFPageProxy, in src/display/api.js, to an ES6 class
This changes all occurrences of `var` to `let`/`const` in this code, and updates the signatures of a couple of methods to use object destructuring.
Finally, when creating `InternalRenderTask` instances *only* the necessary parameter are now provided, since passing through the `RenderParameters` as-is seems completely unnecessary.
2018-11-18 19:08:25 +01:00
Jonas Jenwald
2c003a82d5 Convert RenderTask, in src/display/api.js, to an ES6 class
Also deprecates the `then` method, in favour of the `promise` getter.
2018-11-18 19:08:00 +01:00
Jonas Jenwald
ef8e5fd77c Convert PDFDocumentLoadingTask, in src/display/api.js, to an ES6 class
Also deprecates the `then` method, in favour of the `promise` getter.
2018-11-18 19:07:57 +01:00
Jonas Jenwald
60da2d882b [api-minor] Refactor/simplify the PDFObject class
First of all, note how there's currently *two* methods for checking if a certain object exists, which seems completely unwarranted.
Furthermore, the rarely used `getData` method was removed and its only callsite changed to use a combination of `PDFObjects.{has, get}` instead.
Finally, the methods were rearranged slightly, to bring the most important ones (for an API user) to the top of the class.
2018-11-08 10:13:39 +01:00
Jonas Jenwald
d32321d84f Convert PDFObjects, in src/display/api.js, to an ES6 class
Also changes all occurrences of `var` to `const`, and marks internal properties/methods as "private".
2018-11-08 10:11:40 +01:00
Tim van der Meij
ec76aa531e
Merge pull request #10202 from Snuffleupagus/issue-10200
Attempt to clean-up/restore pending rendering operations on `RenderTask.cancel` (issue 10200)
2018-11-02 23:11:47 +01:00
Jonas Jenwald
f23dba1c10 Change canvasInRendering to a WeakSet instead of a WeakMap
Note how nowhere in the code `canvasInRendering.get()` is ever called, and that this structure is really only used to store references to `<canvas>` DOM elements.
The reason for this being a `WeakMap` is probably because at the time we weren't using `core-js` polyfills yet, and since there already existed a manually implemented `WeakMap` polyfill it was probably simpler to use that.
2018-10-31 18:15:23 +01:00
Jonas Jenwald
f77b463339 Attempt to clean-up/restore pending rendering operations on RenderTask.cancel (issue 10200)
Please note that, given the lack of a runnable example, I'm not totally sure if this first of all is enough to *completely* address the issue as filed and second of all if we actually want this new behaviour.
2018-10-31 16:22:17 +01:00
Jonas Jenwald
5bb7f4b615 Convert PDFDataRangeTransport to an ES6 class 2018-10-20 17:15:27 +02:00
Jonas Jenwald
327f2eb588 Ensure that onProgress is always called when the entire PDF file has been loaded, regardless of how it was fetched (issue 10160)
*Please note:* I'm totally fine with this patch being rejected, and the issue closed as WONTFIX; however these changes should address the issue if that's desired.

From a conceptual point of view, reporting loading progress doesn't really make a lot of sense for PDF files opened by passing raw binary data directly to `getDocument` (since obviously *all* data was loaded).
This is compared to PDF files loaded via e.g. `XMLHttpRequest` or the Fetch API, where the entire PDF file isn't available from the start and knowing the loading progress makes total sense.

However I can certainly see why the current API could be considered inconsistent, which isn't great, since a registered `onProgress` callback will never be called for certain `getDocument` calls.
The simplest solution to this inconsistency thus seem to be to ensure that `onProgress` is always called when handling the `DataLoaded` message, since that will *always* be dispatched[1] from the worker-thread.

---
[1] Note that this isn't guaranteed to happen, since setting `disableAutoFetch = true` often prevents the *entire* file from ever loading. However, this isn't relevant for the issue at hand, and is a well-known consequence of using `disableAutoFetch = true`; note how the default viewer even has a specialized code-path for hiding the loadingBar.
2018-10-16 13:51:12 +02:00
Tim van der Meij
9e9426c354
Merge pull request #10143 from Snuffleupagus/getMainThreadWorkerMessageHandler-catch-errors
Ensure that `getMainThreadWorkerMessageHandler` won't accidentally break `getDocument` (PR 10139 follow-up)
2018-10-11 00:05:01 +02:00
Jonas Jenwald
0e2c6047e4 Ensure that getMainThreadWorkerMessageHandler won't accidentally break getDocument (PR 10139 follow-up)
*This should have been part of PR 10139.*

In the event that a user has attempted to manually load the worker file on the main-thread, but somehow failed to do that correctly, there's a possibility that `getMainThreadWorkerMessageHandler` could throw. Considering how/where that helper function is being called, an error could still prevent `PDFDocumentLoadingTask` from completing (regardless if it's being resolved/rejected).
2018-10-09 15:44:31 +02:00
Jonas Jenwald
21c8dd4842 Combine the pdfjsFilePath and fallback workerSrc handling in src/display/api.js
With the way that the `getWorkerSrc()` helper function is implemented now, there's no longer a particularly strong reason for keeping the global `pdfjsFilePath` variable around.
With this patch the fallback `workerSrc` will thus, assuming is wasn't already set, be set to the "pdfjsFilePath" which simplifies the `getWorkerSrc()` function and reduces the amount of global state.

Finally, the global `workerSrc` variable was renamed to prevent shadowing.
2018-10-09 13:47:48 +02:00
Jonas Jenwald
755c6edc5e Ensure that the PDFDocumentLoadingTask is rejected when "setting up fake worker" failed (issue 10135)
This should, hopefully, cover all the possible ways[1] in which "fake workers" are loaded. Given the different code-paths, adding unit-tests might not be that simple.
Note that in order to make this work, the various `fakeWorkerFilesLoader` functions were converted to return `Promises`.

---
[1] Unfortunately there's lots of them, for various build targets and configurations.
2018-10-06 13:18:51 +02:00
Tim van der Meij
959ed3705b
Implement a permissions API 2018-09-02 21:23:09 +02:00
Tim van der Meij
4874e9ace0
Convert the WorkerTransport class, in src/display/api.js, to ES6 syntax 2018-09-02 21:06:57 +02:00
Tim van der Meij
9c37599fd3
Convert the PDFDocumentProxy class, in src/display/api.js, to ES6 syntax
Moreover, indicate that a member are private and improve the comments to
be more consistent.
2018-09-02 21:06:57 +02:00
cheryly279
29c0ea159d Adding chunkname to async loaded code
Better name
2018-08-27 17:17:32 -04:00
Jonas Jenwald
1179584fd6 Reject getDestination, in the API, for non-string inputs
Note how e.g. the `getPage` method does basic validation of the input.
2018-08-11 16:06:35 +02:00
Jonas Jenwald
a9ce4e8417 Stop exposing the URL polyfill in the global scope
This moves/exposes the `URL` polyfill similarily to the existing `ReadableStream` polyfill, rather than exposing it globally, to avoid interfering with any "outside" code.
Both the `URL` and `ReadableStream` polyfills are now exposed on the `pdfjsLib` object, such that they are accessible to the viewer components.
Furthermore, the `no-restricted-globals` ESLint rule is also enabled to prevent accidental usage of the native `URL`/`ReadableStream` implementations directly in the `src/` and `web/` folders; see also https://eslint.org/docs/rules/no-restricted-globals

Addresses the remaining TODO in https://github.com/mozilla/pdf.js/projects/6
2018-07-04 09:16:28 +02:00
Jonas Jenwald
bf0aca86d7 Fix re-rendering, using the same canvas, when rendering was previously cancelled (PR 8519 follow-up)
Currently if `RenderTask.cancel` is called *immediately* after rendering was started, then by the time that `InternalRenderTask.initializeGraphics` is called rendering will already have been cancelled.
However, we're still inserting the canvas into the `canvasInRendering` map, thus breaking any future attempts at re-rendering using the same canvas. Considering that `InternalRenderTask.cancel` always removes the canvas from the map, I cannot imagine that we'd ever want to re-add it *after* rendering was cancelled (it was likely just a simple oversight in PR 8519).

Fixes 9456.
2018-06-28 22:56:37 +02:00
Jonas Jenwald
74e9999044 Add unit-tests for PDFPageProxy.stats (PR 9245 follow-up)
This wasn't included in PR 9245, since all the API options were still global at that time.

Writing the unit-tests also uncovered an issue with `getOperatorList` not starting the "Page Request" timer.
2018-06-25 14:20:49 +02:00
Jonas Jenwald
275834ae66 Clean-up, and add JSDocs to, the PDFDocumentProxy.loadingParams method (PR 9830 follow-up) 2018-06-23 13:33:22 +02:00
eugenesqr
331ac8ae74 removed safari compatibility check 2018-06-21 12:57:56 +03:00
Brendan Dahl
a278c5a8dc
Merge pull request #9795 from timvandermeij/object-assign
Replace `Util.extendObj` by `Object.assign`
2018-06-20 10:50:40 -07:00
Tim van der Meij
620da6f4df
Merge pull request #9802 from Snuffleupagus/ColorSpace-PDFImage-Uint8ClampedArray
Update `ColorSpace` and `PDFImage` to use `Uint8ClampedArray`s and remove manual clamping/rounding
2018-06-16 17:55:10 +02:00
Jonas Jenwald
0958006713 Send UnsupportedFeature notification when errors are ignored in FontFaceObject.getPathGenerator 2018-06-13 11:02:10 +02:00
Jonas Jenwald
bf0db0fb72 Pass the ignoreErrors API option to the FontFaceObject constructor, and utilize it in getPathGenerator to ignore missing glyphs
Obviously it's still not possible to render non-embedded fonts as paths, but in this way the rest of the page will at least be allowed to continue rendering.

*Please note:* Including the 14 standard fonts in PDF.js probably wouldn't be *that* difficult to implement. (I'm not a lawyer, but the fonts from PDFium could probably be used given their BSD license.)
However, the main blocker ought to be the total size of the necessary font data, since I cannot imagine people being OK with shipping ~5 MB of (additional) font data with Firefox. (Based on the reactions when the CMap files were added, and those are only ~1 MB in size.)
2018-06-13 11:02:06 +02:00
Jonas Jenwald
778981ec89 Catch, and propagate, errors in the requestAnimationFrame branch of InternalRenderTask._scheduleNext
To support these changes, `InternalRenderTask._next` now returns a Promise.
2018-06-13 11:01:58 +02:00
Jonas Jenwald
731f2e6dfc Remove manual clamping/rounding from ColorSpace and PDFImage, by having their methods use Uint8ClampedArrays
The built-in image decoders are already using `Uint8ClampedArray` when returning data, and this patch simply extends that to the rest of the image/colorspace code.

As far as I can tell, the only reason for using manual clamping/rounding in the first place was because TypedArrays used to be polyfilled (using regular arrays). And trying to polyfill the native clamping/rounding would probably have been had too much overhead, but given that TypedArray support is required in PDF.js version `2.0` that's no longer a concern.

*Please note:* Because of different rounding behaviour, basically `Math.round` in `Uint8ClampedArray` respectively `Math.floor` in the old code, there will be very slight movement in quite a few existing test-cases. However, the changes should be imperceivable to the naked eye, given that the absolute difference is *at most* `1` for each RGB component when comparing `master` and this patch (see also the updated expectation values in the unit-tests).
2018-06-12 11:01:32 +02:00
Brendan Dahl
3ac638fad3
Merge pull request #9689 from RafaPolit/master
Fixed critical unhandled promise that prevented error catching using API
2018-06-11 15:40:30 -06:00
Tim van der Meij
af8e88d00b
Replace Util.extendObj by Object.assign 2018-06-10 20:11:03 +02:00
Tim van der Meij
903bad1906
Remove Util.appendToArray and Util.prependToArray
The former may be replaced by regular JavaScript array concatenation and
the latter is unused. This avoids unnecessary function calls/imports.
2018-06-10 15:24:09 +02:00
Jonas Jenwald
07d610615c Move, and modernize, Util.loadScript from src/shared/util.js to src/display/dom_utils.js
Not only is the `Util.loadScript` helper function unused on the Worker side, even trying to use it there would throw an Error (since `document` isn't defined/available in Workers).
Hence this helper function is moved, and its code modernized slightly by having it return a Promise rather than needing a callback function.

Finally, to reduced code duplication, the "new" loadScript function is exported and used in the viewer.
2018-06-07 13:52:40 +02:00
Jonas Jenwald
871bf5c68b Remove the, now obsolete, handling of the CMapReaderFactory parameter in getDocument
This special handling was added in PR 8567, but was made redundant in PR 8721 which stopped sending everything but the kitchen sink to the Worker side.
2018-06-06 08:52:43 +02:00
Jonas Jenwald
c8e2163bbc Remove incorrect/unnecessary validation of the verbosity parameter in the PDFWorker constructor (PR 9480 follow-up) 2018-06-06 08:52:43 +02:00
Jonas Jenwald
b263b702e8 Rename PDFPageProxy.pageInfo to PDFPageProxy._pageInfo to indicate that the property should be considered "private"
Since `PDFPageProxy` already provide getters for all the data returned by `GetPage` (in the Worker), there isn't any compelling reason for accessing the `pageInfo` directly on `PDFPageProxy`.

The patch also changes the `GetPage` handler, in `src/core/worker.js`, to use modern JavaScript features.
2018-06-06 08:52:42 +02:00
Jonas Jenwald
4f4b50e01e Rename PDFDocumentProxy.pdfInfo to PDFDocumentProxy._pdfInfo to indicate that the property should be considered "private"
Since `PDFDocumentProxy` already provide getters for all the data returned by `GetDoc` (in the Worker), there isn't any compelling reason for accessing the `pdfInfo` directly on `PDFDocumentProxy`.
2018-06-06 08:52:42 +02:00
Jonas Jenwald
e89afa5899 Stop sending the PDFManagerReady message from the Worker, since it's unused in the API
After PR 8617 the `PDFManagerReady` message handler function, in `src/display/api.js`, is now a no-op. Hence it seems completely unnecessary to keep sending this message from `src/core/worker.js`.
2018-06-06 08:52:42 +02:00
Jonas Jenwald
eef53347fe Ensure that the correct data is sent, with the test message, from the worker if typed arrays aren't properly supported
With native typed array support now being mandatory in PDF.js, since version 2.0, this probably isn't a huge problem even though the current code seems wrong (it was changed in PR 6571).

Note how in the `!(data instanceof Uint8Array)` case we're currently attempting to send `handler.send('test', 'main', false);` to the main-thread, which doesn't really make any sense since the signature of the method reads `send(actionName, data, transfers) {`.
Hence the data that's *actually* being sent here is `'main'`, with `false` as the transferList, which just seems weird. On the main-thread, this means that we're in this case checking `data && data.supportTypedArray`, where `data` contains the string `'main'` rather than being falsy. Since a string doesn't have a `supportTypedArray` property, that check still fails as expected but it doesn't seem great nonetheless.
2018-06-06 08:52:42 +02:00
Jonas Jenwald
dc6e1b4176 Use Uint8ClampedArray for the image data returned by JpegDecode, in src/display/api.js
Since all the built-in PDF.js image decoders now return their data as `Uint8ClampedArray`, for consistency `JpegDecode` on the main-thread should be doing the same thing; follow-up to PR 8778.
2018-06-06 08:52:41 +02:00
Jonas Jenwald
47a9d38280 Add more validation in PDFWorker.fromPort
The signature of the `PDFWorker.fromPort` method, in addition to the `PDFWorker` constructor, was changed in PR 9480.
Hence it's probably a good idea to add a bit more validation to `PDFWorker.fromPort`, to ensure that it won't fail silently for an API consumer that updates to version 2.0 of the PDF.js library.
2018-06-06 08:52:41 +02:00
Jonas Jenwald
3c5c8d2a0b Remove the typed array check when calling LoopbackPort in PDFWorker._setupFakeWorker
With version 2.0, native support for typed arrays is now a requirement for using the PDF.js library; see PR 9094 where the old polyfills were removed.
Hence the `isTypedArraysPresent` check, when setting up fake workers, no longer serves any purpose here and can thus be removed.
2018-06-06 08:52:33 +02:00
Jonas Jenwald
89caaf4071 Use LoopbackPort in the "message_handler" unit-tests
There's no good reason, as far as I can tell, to duplicate the functionality of the `LoopbackPort` in the unit-tests. The only difference between the implementations is that `LoopbackPort` mimics the (native) structured cloning, however that shouldn't matter here since the tests are only sending "simple" data (strings respectively arrays with numbers).

Furthermore the patch also changes `LoopbackPort` to default to using "structured cloning" and deferred invocation of the listeners, since native typed array support is now a requirement for using the PDF.js library.
2018-06-04 12:53:08 +02:00
Jonas Jenwald
44d8afd46b Move MessageHandler into a separate src/shared/message_handler.js file
The `MessageHandler` itself, and its assorted helper functions, are currently the single largest[1] piece of code in the `src/shared/util.js` file. By moving this code into its own file, `src/shared/util.js` thus becomes smaller and more manageable.
2018-06-04 12:53:08 +02:00
Jonas Jenwald
69f2a77543 Update the signature of the PageViewport constructor, and improve the JSDoc comments for the class
This changes the constructor to take a parameter object, rather than a string of parameters.
2018-06-04 12:53:07 +02:00
Jonas Jenwald
08c8f8733d Move PageViewport from src/shared/util.js to src/display/dom_utils.js
Since the `PageViewport` is not used in the worker, duplicating this code on both the main and worker sides seems completely unnecessary.
2018-06-04 12:53:07 +02:00
Jonas Jenwald
ef081a0531 Ensure that the WorkerTransport._passwordCapability is always rejected, even when errors are thrown in PDFDocumentLoadingTask.onPassword callback
Please note that while the current code works, both in the viewer and the unit-tests, it can leave the `WorkerTransport._passwordCapability` Promise in a pending state.
In the `PasswordRequest` handler, in src/display/api.js, we're returning the Promise from a `capability` object (rather than just a "plain" Promise). While an error thrown anywhere within this handler was fortunately enough to propagate it to the Worker side, it won't cause the Promise (in `WorkerTransport._passwordCapability`) to actually be rejected.
Finally note that while we're now catching errors in the `PasswordRequest` handler, those errors are still propagated to the Worker side via the (now) rejected Promise and the existing `return this._passwordCapability.promise;` line.

This prevents warnings about uncaught Promises, with messages such as "Error: Worker was destroyed during onPassword callback", when running the unit-tests both in browsers *and* in Node.js/Travis.
2018-06-03 00:28:40 +02:00
Jonas Jenwald
0ecc22cb04 Attempt to provide better default values for the disableFontFace/nativeImageDecoderSupport API options in Node.js
This should provide a better out-of-the-box experience when using PDF.js in a Node.js environment, since it's missing native support for both `@font-face` and `Image`.
Please note that this change *only* affects the default values, hence it's still possible for an API consumer to override those values when calling `getDocument`.

Also, prevents "ReferenceError: document is not defined" errors, when running the unit-tests in Node.js/Travis.
2018-06-03 00:28:37 +02:00
RafaPolit
d63b17dbe3 Fixed critical unhandled promise that prevented error catching using API 2018-04-24 13:10:00 -05:00
Jonas Jenwald
8b09f7c34e Clean-up getMainThreadWorkerMessageHandler for non-PRODUCTION mode
*This is a final piece of clean-up of code that I recently wrote, after which I'm done :-)*

When the `getMainThreadWorkerMessageHandler` function was added, in PR 9385, it did so by basically introducing a `web/app.js` dependency in `src/display/api.js` through the `window.pdfjsNonProductionPdfWorker` property[1]. Even though this is limited to non-`PRODUCTION` mode, i.e. `gulp server`, it still seems unfortunate to have that sort of viewer dependency in the API code itself.

With the new, much nicer and shorter, names introduced in PR 9565 we can remove this non-`PRODUCTION` hack and just use `window.pdfjsWorker` in both the viewer and the API regardless of the build mode.

---

[1] It didn't seem correct to piggy-back on the `window.pdfjsDistBuildPdfWorker` property in non-`PRODUCTION` mode.
2018-03-29 11:03:47 +02:00
Tim van der Meij
5c1a16ba6e
Merge pull request #9586 from Snuffleupagus/pageSize-api-rotate
Ensure that `PDFPageProxy.pageSizeInches` handles non-default /Rotate entries correctly
2018-03-25 18:03:32 +02:00
Jonas Jenwald
d547936827 Ensure that PDFPageProxy.pageSizeInches handles non-default /Rotate entries correctly
Without this patch, the pageSize will be incorrectly reported for some PDF files.

---

Move pageSizeInches to ui_utils
2018-03-25 16:48:29 +02:00
Brendan Dahl
63c7aee112
Merge pull request #9565 from brendandahl/new-name
Rename the globals to shorter names.
2018-03-20 13:49:04 -07:00
Jonas Jenwald
e0ae157582 [api-minor] Fix various issues related to the pageSize information
The `getPageSizeInches` method was implemented on `PDFDocumentProxy`, which seems conceptually wrong since the size property isn't global to the document but rather specific to each page. Hence the method is moved into `PDFPageProxy`, as `get pageSizeInches` instead to address this.

Despite the fact that new API functionality was implemented, no unit-tests were added. To prevent issues later on, we should *always* ensure that new functionality has at least some test-coverage; something that this patch also takes care of.

The new `PDFDocumentProperties._parsePageSize` method seemed unnecessary convoluted. Furthermore, in the "no data provided"-case it even returned incorrect data (an array, rather than the expected object).
Finally, the fallback strings didn't actually agree with the `en-US` locale. This inconsistency doesn't look too great, and it's thus addressed here as well.
2018-03-18 09:10:19 +01:00