pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	df0e1edab5	Re-factor sending of various Exceptions from the worker to the API As can be seen in the API, there's a number of document loading Exception handlers which are both really simple and highly similar. Hence these are changed such that all the relevant Exceptions are sent via one message instead. Furthermore, the patch also avoids unnecessarily re-creating `UnknownErrorException`s at the worker side and removes an unnecessary `bind` call.	2019-10-19 12:54:54 +02:00
Tim van der Meij	ec6a99d781	Bundle all API documentation in a module This commit allows JSDoc to generate all API documentation in the `pdfjsLib` module (namespace) so the documentation becomes easier to navigate.	2019-10-13 21:23:00 +02:00
Tim van der Meij	9f4d45ddf4	Don't include private methods in the `PDFPageProxy` API documentation	2019-10-13 21:23:00 +02:00
Tim van der Meij	36c01c2c2a	Deduplicate the documentation for `PDFDocumentLoadingTask` and `PDFWorker` Both classes live inside a closure with the same name, which confuses JSDoc. Move the documentation to the inner class to deduplicate them.	2019-10-13 21:23:00 +02:00
Tim van der Meij	ca3a58f93a	Consistently use `@returns` for returned data types in JSDoc comments Sometimes we also used `@return`, but `@returns` is what the JSDoc documentation recommends. Even though `@return` works as an alias, it's good to use the recommended syntax and to be consistent within the project.	2019-10-13 13:58:17 +02:00
Tim van der Meij	8b4ae6f3eb	Consistently use `@type` for getter data types in JSDoc comments Sometimes we also used `@return` or `@returns`, but `@type` is what the JSDoc documentation recommends. This also improves the documentation because before this commit the types were not shown and now they are.	2019-10-13 13:58:17 +02:00
Tim van der Meij	f4daafc077	Consistently use square brackets for optional parameters in JSDoc comments Square brackets are recommended to indicate optional parameters. Using them helps for automatically generating correct documentation.	2019-10-13 13:58:17 +02:00
Jonas Jenwald	ea729ec55c	[api-minor] Replace all `deprecated` calls with throwing of actual `Error`s All of these methods have been marked as `deprecated` in three releases now, and I'd thus like to (slowly) move towards complete removal. However rather than just removing the methods right away, which would cause somewhat cryptic failures, this patch tries to implement a hopefully reasonable middle ground by throwing `Error`s with (essentially) the same information as the previous warnings. While the previous `deprecated` messages could perhaps be seen as optional, with these changes API consumers will now be forced to actually migrate their code.	2019-10-09 09:21:15 +02:00
Takashi Tamura	d5ee083050	* use square brackets for optional properties in the JSDoc comments of src/display/api.js	2019-10-08 20:34:17 +09:00
Tim van der Meij	8c4f4b5eec	Merge pull request #11182 from Snuffleupagus/disableWorker-disable-Dict-postMessage Forbid sending of `Dict`s and `Stream`s, with `postMessage`, when workers are disabled	2019-09-29 15:09:42 +02:00
Jonas Jenwald	5d93fda4f2	Convert the various `...Exception`s to proper classes, to reduce code duplication By utilizing a base "class", things become significantly simpler. Unfortunately the new `BaseException` cannot be a proper ES6 class and just extend `Error`, since the SystemJS dependency doesn't seem to play well with that. Note also that we (generally) need to keep the `name` property on the actual `...Exception` object, rather than on its prototype, since the property will otherwise be dropped during the structured cloning used with `postMessage`.	2019-09-29 10:16:20 +02:00
Jonas Jenwald	3f8fee371b	Forbid sending of `Dict`s and `Stream`s, with `postMessage`, when workers are disabled By default, i.e. with workers enabled, it's purposely not possible to send `Dict`s and `Stream`s from the worker-thread. This is achieved by defining a `function` on every `Dict` instance, since that ensures that [the structured clone algorithm](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm) will throw an Error on `postMessage`. However, with workers disabled we fall back to the `LoopbackPort` implementation which just ignores any `function`s, thus incorrectly allowing sending of data which should be unclonable.	2019-09-26 16:16:13 +02:00
Tim van der Meij	1f5ebfbf0c	Replace our `URL` polyfill with the one from `core-js` `core-js` polyfills have proven to be of good quality and using them prevents us from having to maintain them ourselves.	2019-09-19 14:09:51 +02:00
Jonas Jenwald	281ed33e43	Abort, with a small delay, `getOperatorList` on the worker-thread when rendering is cancelled (PR 11069 follow-up) With this patch we're finally able to abort worker-thread parsing of the `OperatorList`, rather than only aborting the main-thread rendering itself, when the `RenderTask.cancel` method is being called. This will help improve perceived performance in the default viewer, especially when reading longer and more complex documents, since pages that've been scrolled out-of-view (and thus evicted from the cache) will no longer compete for parsing resources on the worker-thread. Please note: With the implementation in this patch we're not aborting worker-thread parsing immediately on `RenderTask.cancel`, since that would lead to worse performance in many cases. For example: When zoom/rotation occurs in the viewer, while parsing/rendering is still ongoing, a `cancel` call will usually be (almost) immediately followed by a new `PDFPageProxy.render` call. In that case you obviously don't want to abort parsing on the worker-thread, since that would risk throwing away a partially parsed `OperatorList` and thus force unnecessary re-parsing which will regress perceived performance (especially for more complex documents). When choosing a reasonable delay, before cancelling `getOperatorList` on the worker-thread when `RenderTask.cancel` is called, two different positions need to be considered: 1. The delay needs to be short enough, since a timeout in the multiple seconds range would essentially make this entire functionality meaningless (by always allowing most/all pages enough time to finish parsing). 2. The delay cannot be too short, since that would actually reduce performance in the zoom/rotation case outlined above. Furthermore, the time between `RenderTask.cancel` and `PDFPageProxy.render` calls will obviously be affected by both general computer performance and current CPU load. It's certainly possible that the timeout may require some further tweaks; however, the value settled on in this patch was easily one order of magnitude larger than the delta between cancel/render in my tests.	2019-09-14 11:30:32 +02:00
Jonas Jenwald	00efff532c	Ensure that `addLinkAttributes` is always called with a valid `url` parameter There's no good reason for calling this helper function without a `url` parameter, and this way we can prevent that from happening. Note how the `PDFOutlineViewer` call-site was already doing the right thing here, and only the `LinkAnnotationElement` call-site needed a small adjustment to make it work.	2019-09-11 13:24:04 +02:00
Jonas Jenwald	12e1c91f73	Don't `enqueue` unused properties when sending 'GetOperatorList' data from the worker-thread (PR 11069 follow-up) With the changes made in PR 11069, it's no longer necessary to include the `pageIndex`/`intent` parameters when sending 'GetOperatorList' data. In the previous implementation these properties were used to associate the `OperatorList` with the correct `RenderTask`, however now that `ReadableStream`s are used that's handled automatically and it's thus dead code at this point.	2019-09-09 17:41:26 +02:00
Tim van der Meij	37d5b80ba8	Merge pull request #11118 from Snuffleupagus/FetchBuiltInCMap-sendWithStream Transfer, rather than copy, CMap data to the worker-thread	2019-09-06 22:56:14 +02:00
Jonas Jenwald	7dea3f9389	[api-minor] Remove the `postMessageTransfers` parameter, and thus the ability to manually disable transferring of data, from the API By transferring, rather than copying, `ArrayBuffer`s between the main- and worker-threads, you can avoid unnecessary allocations by only having one copy of the same data. Hence manually setting `postMessageTransfers: false`, when calling `getDocument`, is a performance footgun[1] which will do nothing but waste memory. Given that every reasonably modern browser supports `postMessage` transfers[2], I really don't see why it should be possible to force-disable this functionality. Looking at the browser support, for `postMessage` transfers[2], it's highly unlikely that PDF.js is even usable in browsers without it. However, the feature testing of `postMessage` transfers is kept for the time being just to err on the safe side. --- [1] This is somewhat similar to the, now removed, `disableWorker` parameter which also provided API users a much too simple way of reducing performance. [2] See e.g. https://developer.mozilla.org/en-US/docs/Web/API/MessagePort/postMessage#Browser_compatibility and https://developer.mozilla.org/en-US/docs/Web/API/Transferable#Browser_compatibility	2019-09-05 13:09:54 +02:00
Jonas Jenwald	f0534b9b51	Adjust the values sent, with the 'test' message, by the `WorkerMessageHandler.setup` method Note how the sent values have inconsistent types, with a boolean in one case and an object in the other (normal) case. Furthermore, explicitly sending a `supportTypedArray: true` property seems superfluous at least to me.	2019-09-05 11:27:27 +02:00
Jonas Jenwald	f11a4ba750	Transfer, rather than copy, CMap data to the worker-thread It recently occurred to me that the CMap data should be an excellent candidate for transferring. This will help reduce peak memory usage for PDF documents using CMaps, since transferring of data avoids duplicating it on both the main- and worker-threads. Unfortunately it's not possible to actually transfer data when returning data through `sendWithPromise`, and another solution had to be used. Initially I looked at using one message for requesting the data, and another message for returning the actual CMap data. While that should have worked, it would have meant adding a lot more complexity particularly on the worker-thread. Hence the simplest solution, at least in my opinion, is to utilize `sendWithStream` since that makes it really easy to transfer the CMap data. (This required PR 11115 to land first, since otherwise CMap fetch errors won't propagate correctly to the worker-thread.) Please note that the patch purposely only changes the API to Worker communication, and not the API itself since changing the interface of `CMapReaderFactory` would be a breaking change. Furthermore, given the relatively small size of the `.bcmap` files (the largest one is smaller than the default range-request size) streaming doesn't really seem necessary either.	2019-09-04 11:46:04 +02:00
Tim van der Meij	e59b11860d	Merge pull request #11108 from timvandermeij/es6-annotations Use more ES6 syntax in the annotation code	2019-09-02 23:13:24 +02:00
Jonas Jenwald	229f6f34d1	Remove the API/Worker version warning message in `TESTING` mode The warning messages turn out to be more annoying than helpful when looking at the `console` during tests, so let's just remove them.	2019-09-01 16:47:26 +02:00
Jonas Jenwald	055f03938b	Remove support for the `scope` parameter in the `MessageHandler.on` method At this point in time it's easy to convert the `MessageHandler.on` call-sites to use arrow functions, and thus let the JavaScript engine handle scopes for us, rather than having to manually keep references to the relevant scopes in `MessageHandler`.[1] An additional benefit of this is that a couple of `Function.prototype.call()` instances can now be converted into "normal" function calls, which should be a tiny bit more efficient. All in all, I don't see any compelling reason why it'd be necessary to keep supporting custom `scope`s in the `MessageHandler` implementation. --- [1] In the event that a custom scope is ever needed, simply using `bind` on the handler function when calling `MessageHandler.on` ought to work as well.	2019-09-01 09:24:15 +02:00
Tim van der Meij	49018482dc	Use more ES6 syntax in `src/display/annotation_layer.js` `let` is converted to `const` where possible, `var` usage is disabled and template strings are used where possible.	2019-08-31 16:40:39 +02:00
Jonas Jenwald	ae0d9e8c2a	Replace some instances of implicit `function.bind(this)` usage, in `src/display/api.js`, with arrow functions instead	2019-08-30 11:35:05 +02:00
Jonas Jenwald	667e548e5f	[TextLayer] Remove `setAttribute` usage in `appendText` (issue 8066) One of the motivations for using `setAttribute` in the first place was to support more efficient DOM updates in the `expandTextDivs` method, since performance of the `enhanceTextSelection` mode can be somewhat bad when there are a lot of `textDivs` on the page. With recent `TextLayer` changes/optimizations it's no longer necessary to store a complete `style`-string for every `textDiv`, and we can thus re-visit the `setAttribute` usage. Note that with the current code, in `appendText`, there's only one string per `textDiv` which avoids a bunch of temporary strings. While the changes in this patch mean that there are now three strings per `textDiv` instead, the total length of these strings is now quite a bit shorter (42 characters to be exact).	2019-08-28 16:52:09 +02:00
Jonas Jenwald	106b239c5d	[TextLayer] Avoid unnecessary font updates in `_layoutText` (PR 11097 follow-up) This should obviously have been done in PR 11097, but for some reason I completely overlooked it; sorry about that. There's no good reason to update the font unless you're actually going to measure the width of the textContent. This can reduce unnecessary font switching a fair bit, even for documents which are somewhat simple/short (in e.g. the `tracemonkey.pdf` file this cuts the amount of font switches almost in half).	2019-08-28 16:08:06 +02:00
Jonas Jenwald	a1398048e5	[TextLayer] Simplify building of the expanded transform in `expandTextDivs` Rather than essentially re-computing the `originalTransform` every time, we can simply use it directly instead.	2019-08-25 13:09:04 +02:00
Jonas Jenwald	b68f7bb404	[TextLayer] Only measure the width of the text, in `_layoutText`, for multi-char text divs For performance reasons single-char text divs aren't being scaled, as outlined in a comment in `appendText`. Hence it doesn't seem necessary, or even a good idea, to unconditionally measure the width of the text in `_layoutText`.	2019-08-25 12:32:49 +02:00
Yury Delendik	66e0dd1b06	Use streams for OperatorList chunking (issue 10023) Please note: The majority of this patch was written by Yury, and it's simply been rebased and slightly extended to prevent issues when dealing with `RenderingCancelledException`. By leveraging streams this (finally) provides a simple way in which parsing can be aborted on the worker-thread, which will ultimately help save resources. With this patch worker-thread parsing will only be aborted when the document is destroyed, and not when rendering is cancelled. There's a couple of reasons for this: - The API currently expects the entire OperatorList to be extracted, or an Error to occur, once it's been started. Hence additional re-factoring/re-writing of the API code will be necessary to properly support cancelling and re-starting of OperatorList parsing in cases where the `lastChunk` hasn't yet been seen. - Even with the above addressed, immediately cancelling when encountering a `RenderingCancelledException` will lead to worse performance in e.g. the default viewer. When zooming and/or rotation of the document occurs it's very likely that `cancel` will be (almost) immediately followed by a new `render` call. In that case you'd obviously not want to abort parsing on the worker-thread, since then you'd risk throwing away a partially parsed Page and thus be forced to re-parse it again which will regress perceived performance. - This patch is already somewhat risky, given that it touches fundamentally important/critical code, and trying to keep it somewhat small should hopefully reduce the risk of regressions (and simplify reviewing as well). Time permitting, once this has landed and been in Nightly for awhile, I'll try to work on the remaining points outlined above. Co-Authored-By: Yury Delendik <ydelendik@mozilla.com> Co-Authored-By: Jonas Jenwald <jonas.jenwald@gmail.com>	2019-08-24 15:56:40 +02:00
Jonas Jenwald	29a2516e4c	[TextLayer] Use an Array to build the total `padding`, rather than concatenating Strings, in `expandTextDivs` Furthermore, it's possible to re-use the same Array for all `textDiv`s on the page and the resulting padding string also becomes a lot more compact. Please note that the `paddingLeft` branch was moved, since the padding values need to be ordered as `top, right, bottom, left`. Finally, with this re-factoring it's no longer necessary to cache the original `style` string for every `textDiv` when `enhanceTextSelection` is enabled.	2019-08-24 01:13:59 +02:00
Tim van der Meij	edbebb8bf7	Merge pull request #11090 from Snuffleupagus/textLayer-expandTextDivs-transform [TextLayer] Use an Array to build the total `transform`, rather than concatenating Strings, in `expandTextDivs`	2019-08-23 23:12:42 +02:00
Jonas Jenwald	932fcacff8	[TextLayer] Only handle positive padding values in `expandTextDivs` Given that browsers will reject padding values smaller than zero (which may be caused by limited numerical precision during calculations in the `expand` code), it makes no sense to include those when expanding the `textDiv`s.	2019-08-23 13:16:20 +02:00
Jonas Jenwald	37e8a8189b	[TextLayer] Use an Array to build the total `transform`, rather than concatenating Strings, in `expandTextDivs` Furthermore, it's possible to re-use the same Array for all `textDiv`s on the page.	2019-08-23 12:17:12 +02:00
Tim van der Meij	490deb1b65	Merge pull request #11086 from Snuffleupagus/textLayer-originalTransform [TextLayer] Only cache the `originalTransform` when `enhanceTextSelection` is enabled	2019-08-22 23:09:07 +02:00
Brendan Dahl	31f319301d	Merge pull request #11087 from brendandahl/disable-links Add a way to disable external links.	2019-08-22 11:13:11 -07:00
Jonas Jenwald	a519ceffee	[TextLayer] Use template strings when updating the font property in the `_layoutText` method	2019-08-22 14:47:44 +02:00
Jonas Jenwald	6afe3221b7	[TextLayer] Only cache the `originalTransform` when `enhanceTextSelection` is enabled Given that this is completely unused in "regular" text-selection mode, there's no reason to unconditionally store one string for every `textDiv`.	2019-08-22 14:47:18 +02:00
Brendan Dahl	98e989116c	Add a way to disable external links.	2019-08-21 11:20:41 -07:00
Jonas Jenwald	431a264126	[TextLayer] Reduce the amount of intermediary strings in `expandTextDivs` By using template strings, we can avoid some unnecessary string allocations (which is also helped by shortening a variable name).	2019-08-19 12:09:18 +02:00
Jonas Jenwald	45dfad8640	[TextLayer] Only cache the current `textDiv` style when `enhanceTextSelection` is enabled This will help save a little bit of memory, by not storing one unused string for each `textDiv` in regular text-selection mode.	2019-08-19 11:02:56 +02:00
Jonas Jenwald	0276385e6e	[api-minor] Fix completely broken `getStats` method by returning stats in Objects, rather than in Arrays (PR 11029 follow-up) With the changes to the `StreamType`/`FontType` "enums" in PR 11029, one unfortunate result is that `getStats` now always returns empty Arrays. Something that everyone, myself included, apparently missed is that you obviously cannot index an Array with Strings :-) I wrongly assumed that the unit-tests would catch any bugs, but they apparently suffered from the same issue as the code in `src/core/`. Another possible option could perhaps be to use `Set`s, rather than objects, but that will require larger changes since `LoopbackPort` (in `src/display/api.js`) doesn't support them.	2019-08-02 14:09:24 +02:00
Jonas Jenwald	a3150166ec	Ensure that `ReadableStream`s are cancelled with actual Errors There's a number of spots in the current code, and tests, where `cancel` methods are not called with appropriate arguments (leading to Promises not being rejected with Errors as intended). In some cases the cancel `reason` is implicitly set to `undefined`, and in others the cancel `reason` is just a plain String. To address this inconsistency, the patch changes things such that cancelling is done with `AbortException`s everywhere instead.	2019-08-01 16:40:46 +02:00
wangsongyan	c61205d980	Decode the filename when `contentDispositionFilename` matches a URL-encoded filename	2019-07-31 09:33:56 +08:00
Jonas Jenwald	c7fb7116d6	Add an API unit-test for the `stopAtErrors` option (PRs 8240 and 8922 follow-up) Also fixes an inconsistency in the 'PageError' handler, for `getOperatorList`, in the API.	2019-07-13 16:06:05 +02:00
Tim van der Meij	ed3954fc7a	Merge pull request #10851 from brendandahl/shading-bbox Apply bounding box before using shading patterns.	2019-07-12 22:52:07 +02:00
Tim van der Meij	87f36e3520	Merge pull request #10850 from brendandahl/scale-line-width Scale stroking line width when using a tiling pattern.	2019-07-12 22:50:32 +02:00
Tim van der Meij	28326165ff	Merge pull request #10958 from Snuffleupagus/api-rm-receivingOperatorList Remove the `intentState.receivingOperatorList` boolean since it's redundant	2019-07-11 23:55:00 +02:00
Jonas Jenwald	9a4d14bf36	Prevent "Uncaught promise" messages in the console when cancelling `TextLayer` tasks (PR 10601 follow-up) Since `finally` won't stop error propagation, this causes unnecessary messages to be printed in the console whenever a `TextLayer` task is cancelled.	2019-07-11 11:48:33 +02:00
Jonas Jenwald	ef48a9a713	Update the `PageError` handler, in the API, to always mark the `operatorList` as done and finalize any pending renderTasks Note that, in the old code, there was a code-path which could prevent this from happening thus affecting future cleanup. Furthermore, ensure that we'll always attempt to cleanup when handling the 'PageError' message, similar to the code in e.g. the `PDFPageProxy._renderPageChunk` method.	2019-07-10 14:23:59 +02:00
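
The `getStats` entry above (0276385e6e) turns on a general JavaScript behaviour: assigning to an Array with a String key adds a plain property, not an element. A minimal sketch of that behaviour follows; it is not the pdf.js code, and the `StreamType` values and the `arrayStats`/`objectStats` names are hypothetical stand-ins chosen for illustration.

```javascript
// Hypothetical stand-in for a string-valued "enum"; not the actual pdf.js definition.
const StreamType = { FLATE: "Flate", LZW: "LZW" };

// Array indexed with a String: the value is stored as a named property,
// so the Array still has no elements.
const arrayStats = [];
arrayStats[StreamType.FLATE] = true;

// Object keyed by the same String: the value is an ordinary enumerable key.
const objectStats = Object.create(null);
objectStats[StreamType.FLATE] = true;

// Anything that only looks at elements (length, index loops, JSON) sees
// the Array as empty, while the Object keeps its entry.
const arraySummary = {
  length: arrayStats.length,
  json: JSON.stringify(arrayStats),
};
const objectSummary = {
  keys: Object.keys(objectStats),
  json: JSON.stringify(objectStats),
};

console.log(arraySummary);
console.log(objectSummary);
```

Any consumer that walks the Array by index, as a simple cloning routine might, would therefore report no stats at all, which is consistent with the empty Arrays described in the commit message.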

1188 Commits