13999 Commits

Author SHA1 Message Date
Tim van der Meij
0ae5a6ef05
Merge pull request #13209 from Snuffleupagus/rm-PDFPageView-enableScripting
Remove the `enableScripting` option from the `PDFPageView` constructor
2021-04-09 20:59:29 +02:00
Tim van der Meij
acb5c5093b
Merge pull request #13210 from brendandahl/cache-source-map
Cache babel source map.
2021-04-09 20:56:07 +02:00
Tim van der Meij
bf7ae6f82a
Merge pull request #13204 from eltociear/patch-2
Fix typo in canvas.js
2021-04-09 20:54:54 +02:00
Brendan Dahl
a31d142253 Cache babel source map.
When source is already cached and you reload,
the source map is lost which makes debugging
async functions difficult.
2021-04-09 10:34:54 -07:00
Brendan Dahl
fc9501a637 Add support for basic structure tree for accessibility.
When a PDF is "marked" we now generate a separate DOM that represents
the structure tree from the PDF.  This DOM is inserted into the <canvas>
element and allows screen readers to walk the tree and have more
information about headings, images, links, etc. To link the structure
tree DOM (which is empty) to the text layer, aria-owns is used. This
required modifying the text layer creation so that marked items are
now tracked.
2021-04-09 09:56:28 -07:00
Jonas Jenwald
ec9e29807a Remove the enableScripting option from the PDFPageView constructor
Scripting, as implemented, requires access to a complete document/viewer in order to work. Hence it doesn't really make sense to keep the `enableScripting`-option on `PDFPageView`-instances.[1]

---
[1] Note that there's the `PDFSinglePageViewer`, which can be used in cases where you want access to all features/functionality of the viewer but only display *one* page at a time.
2021-04-09 14:20:47 +02:00
Jonas Jenwald
737a8e846d Add deprecated handling of the now removed AnnotationStorage API-parameters
These changes are done separately, to make it easier to remove them in the future.
2021-04-09 13:25:03 +02:00
Jonas Jenwald
72ef183085 [api-minor] Remove the manual passing of an AnnotationStorage-instance when calling various API-methods
Note how we purposely don't expose the `AnnotationStorage`-class directly in the official API (see `src/pdf.js`), since trying to use *multiple* ones simultaneously doesn't really make sense (e.g. in the viewer).
Instead we lazily initialize, and cache, just *one* instance via `PDFDocumentProxy.annotationStorage` which should thus be available internally in the API itself without having to be manually passed to various methods.

To support these changes, the `AnnotationStorage`-instance initialization is moved into the `WorkerTransport`-class to allow both `PDFDocumentProxy` and `PDFPageProxy` to access it.
This patch implements the following simplifications:
 - Remove the `annotationStorage`-parameter from `PDFDocumentProxy.saveDocument`, since it's already available internally.
   Furthermore, while it's currently possible to call that method without an `AnnotationStorage`-instance, that really does *not* make any sense at all. In this case you're effectively reducing `PDFDocumentProxy.saveDocument` to a "regular" `PDFDocumentProxy.getData` call, but with *a lot* more overhead, which was obviously not the intention of the `PDFDocumentProxy.saveDocument`-method.

 - Try to discourage third-party users from calling `PDFDocumentProxy.saveDocument` unconditionally, as a replacement for `PDFDocumentProxy.getData` (note the previous point).

 - Replace the `annotationStorage`-parameter, in `PDFPageProxy.render`, with a boolean `includeAnnotationStorage`-parameter which simply indicates if the (internally available) `AnnotationStorage`-instance should be used during rendering (e.g. for printing).

 - By removing the need to *manually* provide `annotationStorage`-parameters to various API-methods, using the API should become simpler (e.g. for third-parties) since you no longer need to worry about manually fetching and passing around this data.
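
The lazy-initialize-and-cache arrangement described above can be sketched roughly as follows. This is a minimal illustration and not the actual PDF.js code: the `Sketch*` class names, their fields, and the return values are hypothetical stand-ins. Only the general shape (one storage instance, owned by the transport and shared by the document and page proxies, plus a boolean `includeAnnotationStorage` flag on `render`) is taken from the commit message.

```javascript
// Hypothetical stand-in for the real AnnotationStorage class.
class SketchAnnotationStorage {
  constructor() {
    this._values = new Map();
  }

  setValue(key, value) {
    this._values.set(key, value);
  }

  get size() {
    return this._values.size;
  }
}

// Hypothetical stand-in for the transport shared by both proxies.
class SketchTransport {
  constructor() {
    this._annotationStorage = null;
  }

  // Lazily create the storage on first access, then always return the same one.
  get annotationStorage() {
    if (!this._annotationStorage) {
      this._annotationStorage = new SketchAnnotationStorage();
    }
    return this._annotationStorage;
  }
}

class SketchDocumentProxy {
  constructor(transport) {
    this._transport = transport;
  }

  get annotationStorage() {
    return this._transport.annotationStorage;
  }

  // No storage parameter: the instance is already reachable internally.
  saveDocument() {
    return { savedEntries: this._transport.annotationStorage.size };
  }
}

class SketchPageProxy {
  constructor(transport) {
    this._transport = transport;
  }

  // A boolean flag replaces passing the storage instance itself.
  render({ includeAnnotationStorage = false } = {}) {
    const storage = includeAnnotationStorage
      ? this._transport.annotationStorage
      : null;
    return { usedStorage: storage !== null };
  }
}

const transport = new SketchTransport();
const doc = new SketchDocumentProxy(transport);
const page = new SketchPageProxy(transport);
doc.annotationStorage.setValue("field1", "hello");
```

Because both proxies go through the same transport, a caller never has to fetch the storage and hand it back to the API.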
2021-04-09 13:24:25 +02:00
Ikko Ashimine
c4c4333d54
Fix typo in canvas.js
Reseting -> Resetting
2021-04-08 23:45:24 +09:00
Jonas Jenwald
d8e0794650 Improve the image quality of thumbnails rendered by PDFThumbnailView.draw (issue 8233)
The reason for the fairly large discrepancy, in the thumbnail quality, between the `draw`/`setImage`-methods is that in the former case we *directly* render the thumbnails at the final size that they'll appear at in the sidebar. In the latter case, we instead downsize the (generally) much larger "regular" pages.

To address this, I'm thus proposing that we let `PDFThumbnailView.draw` render thumbnails at *twice* their intended size and then downsize them to the final size.
Obviously this will increase *peak* memory usage during thumbnail rendering in `PDFThumbnailView.draw`, since doubling the width/height of a `canvas` will lead to its pixel-count increasing by a factor of `4`. Furthermore, since you need four components per pixel (given that it's RGBA-data), this will thus lead to the *temporary* thumbnail `canvas`-sizes increasing by a factor of `16` during rendering. Hence why rendering thumbnails at their "original" scale, i.e. using something like `PDFPageProxy.getViewport({ scale: 1 });`, would be an absolutely terrible idea!

To reduce the size and scope of these changes, I've tried to re-factor and re-use as much of the existing downsizing-implementation already present in `PDFThumbnailView` as possible.

While this will generally *not* make thumbnails rendered by `PDFThumbnailView.draw` look *identical* to those based on the rendered pages (via `PDFThumbnailView.setImage`), it's a considerable improvement as far as I'm concerned and enough to call the issue fixed.

*Please note:* This patch will not lead to *any* additional overhead, in either memory usage or parsing, for thumbnails which are based on the rendered pages.
2021-04-08 13:58:24 +02:00
Jonas Jenwald
32a00b9b2b Stop looping over childNodes, in PDFThumbnailView.reset, when removing the thumbnail
A loop is less efficient than just overwriting the content, which is what we've generally been using (for years) in other parts of the code-base (see e.g. `BaseViewer` and `PDFThumbnailViewer`).
2021-04-08 12:22:07 +02:00
Jonas Jenwald
8ea83f7030 Convert some properties, on PDFThumbnailView-instances, to local variables
These properties are always updated/used together, and there are no other methods which depend on just one of them, hence they're changed into local variables instead.
Looking through the history of this code, it seems they were converted *from* local variables *to* properties all the way back in PR 2914; however, as far as I can tell from that diff, it doesn't seem to have been necessary even back then!?
2021-04-08 12:21:52 +02:00
Tim van der Meij
6429ccc002
Merge pull request #13194 from Snuffleupagus/ttcf-fuzzy-match
Fuzzy-match the fontName, for TrueType Collection fonts, where the "name"-table is wrong (issue 13193)
2021-04-07 20:50:19 +02:00
Tim van der Meij
5945f7c4a1
Merge pull request #13186 from Snuffleupagus/rm-deprecated-code
Remove some `deprecated` code
2021-04-07 20:38:59 +02:00
Tim van der Meij
336ebd6fa1
Merge pull request #13184 from timvandermeij/sets
Convert objects to sets in places where we only track keys
2021-04-07 20:34:49 +02:00
Jonas Jenwald
f986ccdf0e Fuzzy-match the fontName, for TrueType Collection fonts, where the "name"-table is wrong (issue 13193)
The fontName, as defined in the PDF document, cannot be found in *any* of the "name"-tables in the TrueType Collection font. To work-around that, this patch adds a *fallback* code-path to allow using an approximately matching fontName rather than outright failing.
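
An exact-match-first lookup with an approximate fallback could look roughly like the sketch below. The helper name, the `{ names: [...] }` data shape, and the notion of "approximate" used here (equal after stripping whitespace and ignoring case) are all assumptions made for illustration; the patch's actual matching heuristics are not described in this message.

```javascript
// Hypothetical helper: pick a font from a collection by name.
// `fonts` is an array of { names: string[] } objects (an assumed shape).
function findFontInCollection(fonts, fontName) {
  const normalize = s => s.replace(/\s/g, "").toLowerCase();
  const wanted = normalize(fontName);
  let fallback = null;

  for (const font of fonts) {
    for (const name of font.names) {
      if (name === fontName) {
        return { font, exact: true }; // An exact match wins immediately.
      }
      // Assumed notion of "approximate": equal after normalization.
      if (!fallback && normalize(name) === wanted) {
        fallback = font;
      }
    }
  }
  if (fallback) {
    return { font: fallback, exact: false };
  }
  throw new Error(`Font "${fontName}" not found in the collection.`);
}

// Made-up sample data, purely for demonstration.
const collection = [
  { names: ["Foo Sans"] },
  { names: ["Bar Serif", "Bar Serif Bold"] },
];
const exactResult = findFontInCollection(collection, "Foo Sans");
const fuzzyResult = findFontInCollection(collection, "BarSerif");
```

The fallback is only consulted after every name has been checked for an exact match, so well-formed fonts are unaffected.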
2021-04-07 15:25:32 +02:00
Jonas Jenwald
4e81e0e14f Remove the deprecated AnnotationStorage.getOrCreateValue-method (PR 12759 follow-up)
While this method has only been deprecated in one release now, the `AnnotationStorage`-functionality is new enough that third-party implementations hopefully don't rely heavily on it just yet. (And removing this quickly should help reduce the likelihood that someone starts using it.)
2021-04-06 13:22:06 +02:00
Jonas Jenwald
92ec10bfca Remove the deprecated PDFLinkService.navigateTo-method (PR 12440 follow-up)
This method has been deprecated in two releases now, hence we shouldn't need to keep this code around.
2021-04-06 13:08:50 +02:00
Jonas Jenwald
b2758c3023 Remove the deprecated properties from the "presentationmodechanged" event (PR 12788 follow-up)
These properties have been deprecated in two releases now, hence we shouldn't need to keep this code around.
2021-04-06 13:04:23 +02:00
Tim van der Meij
ff393d6e96
Convert the pendingFindMatches member, in web/pdf_find_controller.js, from an object to a set
We only want to track page numbers instead of actual data, so using a
set conveys that intention more clearly and is slightly more efficient.
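
As a generic illustration of the object-to-set conversion (the variable names below are made up and are not the ones in `web/pdf_find_controller.js`):

```javascript
// Before: a plain object used only for its keys; the `true` values carry no information.
const pendingAsObject = Object.create(null);
pendingAsObject[3] = true;
const wasPendingInObject = 3 in pendingAsObject; // The key is coerced to the string "3".

// After: a Set states directly that only membership matters.
const pendingAsSet = new Set();
pendingAsSet.add(3);
const wasPendingInSet = pendingAsSet.has(3); // The number 3 is stored as-is.
pendingAsSet.delete(3);
```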
2021-04-05 19:33:53 +02:00
Tim van der Meij
fc0cd4a443
Convert the startXRefParsedCache variable, in src/core/obj.js, from an object to a set
We only want to track XRef starting points instead of actual data, so
using a set conveys that intention more clearly and is slightly more
efficient.
2021-04-05 19:32:58 +02:00
Tim van der Meij
6ddc297170
Merge pull request #13183 from timvandermeij/bump
Bump versions in `pdfjs.config`
2021-04-05 16:01:29 +02:00
Tim van der Meij
37c88bfaed
Bump versions in pdfjs.config
2021-04-05 15:56:18 +02:00
Tim van der Meij
228adbf673
Merge pull request #13172 from Snuffleupagus/cleanup-keepFonts
[api-minor] Add an option, in `PDFDocumentProxy.cleanup`, to allow fonts to remain attached to the DOM
2021-04-05 14:21:34 +02:00
Tim van der Meij
11f2eab4b8
Merge pull request #13179 from Snuffleupagus/more-Set
Utilize `Set` a bit more in the code-base
2021-04-05 14:06:53 +02:00
Tim van der Meij
3698bc1287
Merge pull request #13175 from Snuffleupagus/app-close-save
[GENERIC viewer] Avoid data loss in forms, by triggering saving when the document is closed (issue 12257)
2021-04-05 14:03:18 +02:00
Tim van der Meij
3f90565272
Merge pull request #13177 from Snuffleupagus/update-packages
Update packages and translations
2021-04-05 13:31:30 +02:00
Jonas Jenwald
16fd838f52 Convert the renderTasks, used in PDFPageProxy.render/PDFPageProxy.getOperatorList, to a Set
When removing tasks we're currently forced to *indirectly* iterate through the array, which can be avoided by using a Set instead.
Furthermore, we can also (slightly) modernize the code responsible for initializing the `renderTasks`.
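
The difference in removal cost can be seen in a small standalone comparison. The helper and task names are hypothetical and do not mirror the actual `PDFPageProxy` code:

```javascript
// Array-based bookkeeping: removal needs a search (indexOf) plus a splice.
function removeTaskFromArray(tasks, task) {
  const index = tasks.indexOf(task);
  if (index >= 0) {
    tasks.splice(index, 1);
  }
}

// Set-based bookkeeping: removal is a single call, with no manual search.
function removeTaskFromSet(tasks, task) {
  tasks.delete(task);
}

const taskA = { id: "A" };
const taskB = { id: "B" };

const arrayTasks = [taskA, taskB];
removeTaskFromArray(arrayTasks, taskA);

const setTasks = new Set([taskA, taskB]);
removeTaskFromSet(setTasks, taskA);
```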
2021-04-05 10:51:28 +02:00
Jonas Jenwald
68d3a333ac Change the seenStyles object, in PartialEvaluator.getTextContent, to a Set
Given that what we actually want is only to keep track of the loadedFont-names, rather than storing any actual data, using an object isn't really necessary here. Furthermore, in the current code, we're also using `in` when checking if the data exists, which is generally less efficient than just checking for the value directly.
2021-04-05 10:34:02 +02:00
Jonas Jenwald
3d9f6ec0f9 Update l10n files
2021-04-04 11:04:05 +02:00
Jonas Jenwald
f4727f0fec Update npm packages
2021-04-04 10:53:29 +02:00
Jonas Jenwald
3f59d4201a [GENERIC viewer] Avoid data loss in forms, by triggering saving when the document is closed (issue 12257)
As discussed in the issue, this is a small/simple patch that should help to prevent *outright* data loss in forms when a new document is opened in the GENERIC viewer.

While the implementation is perhaps a bit "simplistic", it does seem to work and should be fine given that this is an edge-case only relevant for the GENERIC viewer.
2021-04-03 18:16:53 +02:00
Jonas Jenwald
5b28a0bf97 Re-factor the download/save-methods, on PDFViewerApplication, to make full use of async/await
In the next patch we'll need to be able to actually wait for saving to complete, hence it's necessary to slightly re-factor the `save`-method.

As part of these changes, we can reduce some duplication in the `save`-method and slightly improve the overall code. For consistency, the `download`-method is updated similarly to improve the code (this functionality is *very* old, even pre-dating the introduction of Promises in the code-base).
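
To illustrate the kind of re-factoring meant here, the following sketch contrasts a Promise-chain with an equivalent `async`/`await` version. Everything in it is hypothetical (`fakeSaveDocument`, the log messages, the fallback behaviour); it only shows why the latter form makes it straightforward for a caller to wait until saving has completed.

```javascript
// Hypothetical stand-in for a save operation that may fail.
function fakeSaveDocument(shouldFail) {
  return shouldFail
    ? Promise.reject(new Error("save failed"))
    : Promise.resolve("saved-data");
}

// Promise-chain style: success and failure handling live in separate callbacks.
function saveWithThen(shouldFail, log) {
  return fakeSaveDocument(shouldFail)
    .then(data => {
      log.push(`downloaded ${data}`);
    })
    .catch(() => {
      log.push("fallback download");
    });
}

// async/await style: same behaviour, written as sequential code.
async function saveWithAwait(shouldFail, log) {
  try {
    const data = await fakeSaveDocument(shouldFail);
    log.push(`downloaded ${data}`);
  } catch {
    log.push("fallback download");
  }
}
```

A caller can then simply write `await saveWithAwait(false, log);` before continuing, for example before closing a document.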
2021-04-03 18:11:01 +02:00
Tim van der Meij
5cf116a958
Merge pull request #13169 from Snuffleupagus/DefaultAppearanceEvaluator-fontName
[api-minor] Change the format of the `fontName`-property, in `defaultAppearanceData`, on Annotation-instances (PR 12831 follow-up)
2021-04-02 20:43:22 +02:00
Jonas Jenwald
232fbd28e1 Re-factor the PDFDocumentProxy.cleanup unit-tests to use async/await
2021-04-02 12:32:35 +02:00
Jonas Jenwald
a2bc6481a0 [api-minor] Add an option, in PDFDocumentProxy.cleanup, to allow fonts to remain attached to the DOM
As mentioned in the JSDoc comment, this should not be used unless you know what you're doing, since it will lead to increased memory usage. However, in some situations (e.g. SVG-rendering), we still want to be able to run general clean-up on both the main/worker-thread while keeping loaded fonts attached to the DOM.[1]

As part of these changes, `WorkerTransport.startCleanup` is converted to an async method and we'll also skip clean-up when destruction has started (since it's redundant).

---
[1] The SVG-rendering mode is obviously not officially supported, since it's both rather incomplete and inherently slower. However with recent changes, whereby we cache repeated images on the document rather than the page level, memory usage can be *a lot* worse than before if we never attempt to release e.g. cached image-data when the viewer is in SVG-rendering mode.
2021-04-02 12:32:31 +02:00
Jonas Jenwald
48ff20493f Mark some internal PDFDocumentProxy-properties as "private"
These two properties were *never* intended to be anything but "private", hence it really cannot hurt to actually indicate that they're *not* part of any official API.
2021-04-02 12:26:32 +02:00
Jonas Jenwald
0eb1433c78 [api-minor] Change the format of the fontName-property, in defaultAppearanceData, on Annotation-instances (PR 12831 follow-up)
Currently the `fontName`-property contains an actual /Name-instance, which is a problem given that its fallback value is an empty string; see ca7f546828/src/core/default_appearance.js (L35)
The reason that this is a problem can be seen in ca7f546828/src/core/primitives.js (L30-L34), since an empty string short-circuits the cache. Essentially, in PDF documents, a /Name-instance cannot be empty and the way that the `DefaultAppearanceEvaluator` does things is unfortunately not entirely correct.

Hence the `fontName`-property is changed to instead contain a string, rather than a /Name-instance, which simplifies the code overall.

*Please note:* I'm tagging this patch with "[api-minor]", since PR 12831 is included in the current pre-release (although we're not using the `fontName`-property in the display-layer).
2021-04-01 16:47:30 +02:00
Tim van der Meij
ca7f546828
Merge pull request #12908 from calixteman/11918
Slightly rescale lineWidth to workaround chrome rendering issue
2021-03-31 21:56:31 +02:00
Calixte Denizet
a0cfb0841f Slightly rescale lineWidth to workaround chrome rendering issue
2021-03-31 21:49:00 +02:00
Tim van der Meij
5a64157a2f
Merge pull request #13168 from janpe2/ttf-uni-glyphs
Use post table when Encoding has only Differences
2021-03-31 21:35:13 +02:00
Tim van der Meij
1a4af17d07
Merge pull request #13165 from Snuffleupagus/Annotation-rm-defaultAppearance-export
[api-minor] Stop exposing the *raw* `defaultAppearance`-string on Annotation-instances
2021-03-31 21:30:50 +02:00
Tim van der Meij
5be0fbe8f1
Merge pull request #13166 from Snuffleupagus/getDocument-URL
[api-minor] Support proper `URL`-objects, in addition to URL-strings, in `getDocument`
2021-03-31 21:20:08 +02:00
Tim van der Meij
70915d34ed
Merge pull request #13157 from Snuffleupagus/webViewerOpenFileViaURL-cleanup
Remove the `file://`-URL special-case from `webViewerOpenFileViaURL`
2021-03-31 20:24:08 +02:00
Tim van der Meij
2fb4d02ea5
Merge pull request #13158 from Snuffleupagus/rm-URL-polyfill
Remove the `URL` polyfill
2021-03-31 20:22:02 +02:00
Tim van der Meij
d47b30ef54
Merge pull request #13156 from Snuffleupagus/outline-destRef-null
Prevent errors, in `PDFOutlineViewer._getPageNumberToDestHash`, for invalid `destRef` values (PR 12777 follow-up)
2021-03-31 20:20:35 +02:00
Tim van der Meij
b6babb0d95
Merge pull request #13161 from mozilla/dependabot/npm_and_yarn/y18n-3.2.2
Bump y18n from 3.2.1 to 3.2.2
2021-03-31 20:19:01 +02:00
Jani Pehkonen
0117ee5071 Use post table when Encoding has only Differences
Fixes #13107
In the issue, some TrueType glyph names have the format `uniXXXX`.
Font's `Encoding` dictionary has the entry `Differences` but no
`BaseEncoding`. `uniXXXX` names are converted to glyph indices
using font's `post` table but currently that is done only when
`BaseEncoding` exists. We must enable the conversion also when only
`Differences` exists.
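
A simplified sketch of the name-to-glyph-index lookup follows. The helper name and the data shapes (a `Differences` map from character code to glyph name, and the `post` table reduced to a plain array of glyph names indexed by glyph id) are assumptions for illustration; the real font code is considerably more involved.

```javascript
// Hypothetical helper: resolve glyph names from `Differences` to glyph indices,
// using the glyph-name list read from the font's `post` table.
function mapDifferencesToGlyphIds(differences, postGlyphNames) {
  // Build a name -> glyph id lookup once; the first occurrence of a name wins.
  const glyphIdByName = new Map();
  postGlyphNames.forEach((name, glyphId) => {
    if (!glyphIdByName.has(name)) {
      glyphIdByName.set(name, glyphId);
    }
  });

  const charCodeToGlyphId = new Map();
  for (const [charCode, glyphName] of Object.entries(differences)) {
    const glyphId = glyphIdByName.get(glyphName);
    if (glyphId !== undefined) {
      charCodeToGlyphId.set(Number(charCode), glyphId);
    }
    // Names missing from the `post` table are simply left unmapped here.
  }
  return charCodeToGlyphId;
}

// Made-up sample data, purely for demonstration.
const postGlyphNames = [".notdef", "uni0041", "uni0416"];
const differences = { 65: "uni0041", 66: "uni0416", 67: "uni9999" };
const mapping = mapDifferencesToGlyphIds(differences, postGlyphNames);
```

Note that the lookup does not depend on a `BaseEncoding` being present, which is the situation the fix addresses.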
2021-03-31 17:58:44 +03:00
Jonas Jenwald
db1e1612df [api-minor] Support proper URL-objects, in addition to URL-strings, in getDocument
Currently only URL-strings are officially supported by `getDocument`, however at this point in time I cannot really see any compelling reason to not support `URL`-objects as well.

Most likely the reason that we don't already support `URL`-objects, in `getDocument`, is that historically `URL` wasn't fully implemented across browsers and our old polyfill wasn't perfect; see https://developer.mozilla.org/en-US/docs/Web/API/URL/URL#browser_compatibility

*Please note:* Because of how the `url` parameter is currently handled, there are actually *some* cases where passing a `URL`-object to `getDocument` already works. That, in my opinion, provides additional motivation for supporting `URL`-objects officially, since it makes the API more consistent.

The following is an attempt to summarize the *current* situation, based on the actual code rather than the JSDocs:
 - `getDocument("url string")` works and is documented.[1]
 - `getDocument({ url: "url string", })` works and is documented.[1]
 - `getDocument(new URL(...))` throws immediately, since no supported parameters are found.
 - `getDocument({ url: new URL(...), })` actually works even though it's not documented.[1] Originally, when data was fetched on the worker-thread, this would likely have thrown since `URL` isn't clonable.[2]
 - `getDocument({ url: { abc: 123, }, })`, or some similarly meaningless input, will be "accepted" by `getDocument` and then throw a `MissingPDFException` when attempting to fetch the bogus data.

With the changes in this patch, not only are `URL`-objects now officially supported and documented when calling `getDocument`, but we'll also do a much better job at actually validating any URL-data passed to `getDocument` (and instead fail early).
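
The "accept strings and `URL`-objects, fail early on anything else" idea can be sketched as below. This is a hypothetical helper and not the code added by the patch: the function name, the error message, and the `https://example.com/` base URL are assumptions. Only the standard `URL` constructor is relied upon.

```javascript
// Hypothetical helper: normalize a `url` argument that may be a string or a
// URL-object, and reject anything else before any fetching is attempted.
function normalizeUrlInput(val, baseUrl = "https://example.com/") {
  if (val instanceof URL) {
    return val.href;
  }
  if (typeof val === "string") {
    // The URL constructor throws a TypeError for input it cannot parse.
    return new URL(val, baseUrl).href;
  }
  throw new Error("Invalid PDF url data: expected a string or a URL-object.");
}

const fromString = normalizeUrlInput("file.pdf");
const fromUrlObject = normalizeUrlInput(new URL("https://host.test/a.pdf"));
```

With this shape, meaningless input such as `{ abc: 123 }` is rejected immediately rather than surfacing later as a failed fetch.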

---
[1] In *browsers*, we create a valid URL thus indirectly validating the input. In Node.js environments, on the other hand, no validation is done since obtaining a baseUrl is more difficult (and PDF.js is primarily written for browsers anyway).

[2] https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm#supported_types
2021-03-31 16:21:41 +02:00
Jonas Jenwald
27add0f1f3 Re-factor the source parsing, in getDocument, to use switch rather than if...else
Given the number of parameters that we now need to parse here, this code is no longer as readable as one would like. Hence this re-factoring, which will improve overall readability and also help with the next patch.
2021-03-31 16:21:37 +02:00