This is first of all consistent with existing API-methods, where we return `null` when the data in question doesn't exist. Secondly, it should also be (slightly) more efficient since there's less dummy-data that we need to transfer between threads.
Finally, this prevents us from adding an empty/unnecessary span to *every* single page even in documents without any structure tree data.
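A minimal sketch of this pattern, assuming a hypothetical `getStructTree` helper and a made-up page-data shape (neither is the actual implementation): the producer returns `null` when there's no structure tree, and the consumer then skips creating any element.

```javascript
// Hypothetical producer: returns `null`, rather than an empty dummy tree,
// when the page has no structure tree data.
function getStructTree(pageData) {
  const children = pageData.structTreeChildren;
  if (!children || children.length === 0) {
    return null;
  }
  return { role: "Root", children };
}

// Hypothetical consumer: only describes an element when there is something
// to show, so pages without structure data get no extra span.
function buildStructTreeElement(tree) {
  if (tree === null) {
    return null;
  }
  return { tag: "span", childCount: tree.children.length };
}

const tagged = getStructTree({ structTreeChildren: [{ role: "H1" }] });
const untagged = getStructTree({ structTreeChildren: [] });

console.log(buildStructTreeElement(tagged));
console.log(buildStructTreeElement(untagged));
```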
- but don't validate them for now;
- Firefox will display a bar to warn that the signature validation is not supported (see https://bugzilla.mozilla.org/show_bug.cgi?id=854315)
- almost all (all?) PDF readers display signatures;
- validation is done in Edge, but for now it's behind a pref.
It's obviously better and more correct to handle the "pagesloaded" case within `PDFOutlineViewer` *itself*, rather than essentially splitting the logic in two parts and forcing `PDFSidebar` to deal with what should've been handled internally in `PDFOutlineViewer`.
This is what I *should* have done in PR 12777, but for some reason didn't figure out how to implement it well enough back then; sorry about the churn here!
*This patch fixes some technical debt in the viewer.*
Given that most API methods are (purposely) asynchronous, there's always a risk that the viewer could have been `close`d before the requested data arrives.
Lately we've started to check this case before using the data, to prevent errors and/or inconsistent state; however, the outline/attachments/layers fetching and rendering is old enough that it pre-dates those checks.
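The kind of check described above could look roughly like the following sketch. The `OutlineLoader` class and the `createFakeDocument` helper are hypothetical stand-ins, not the viewer's actual code; the point is only that the document reference is compared again once the asynchronous data has arrived.

```javascript
// Hypothetical viewer component that fetches data asynchronously.
class OutlineLoader {
  constructor() {
    this.pdfDocument = null;
    this.outline = null;
  }

  open(pdfDocument) {
    this.pdfDocument = pdfDocument;
  }

  close() {
    this.pdfDocument = null;
    this.outline = null;
  }

  async load() {
    const pdfDocument = this.pdfDocument;
    if (!pdfDocument) {
      return false;
    }
    const outline = await pdfDocument.getOutline();

    if (pdfDocument !== this.pdfDocument) {
      // The viewer was closed (or another document was opened) while the
      // data was being fetched, so it must not be used.
      return false;
    }
    this.outline = outline;
    return true;
  }
}

// Stand-in for a document object whose data arrives asynchronously.
function createFakeDocument(outline) {
  return { getOutline: () => Promise.resolve(outline) };
}
```

Calling `close` between `load` and the arrival of the data then leaves the component untouched, rather than rendering an outline that belongs to a document which is no longer open.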
When a PDF is "marked" we now generate a separate DOM that represents
the structure tree from the PDF. This DOM is inserted into the <canvas>
element and allows screen readers to walk the tree and have more
information about headings, images, links, etc. To link the structure
tree DOM (which is empty) to the text layer, aria-owns is used. This
required modifying the text layer creation so that marked items are
now tracked.
Scripting, as implemented, requires access to a complete document/viewer in order to work. Hence it doesn't really make sense to keep the `enableScripting`-option on `PDFPageView`-instances.[1]
---
[1] Note that there's the `PDFSinglePageViewer`, which can be used in cases where you want access to all features/functionality of the viewer but only display *one* page at a time.
Note how we purposely don't expose the `AnnotationStorage`-class directly in the official API (see `src/pdf.js`), since trying to use *multiple* ones simultaneously doesn't really make sense (e.g. in the viewer).
Instead we lazily initialize, and cache, just *one* instance via `PDFDocumentProxy.annotationStorage` which should thus be available internally in the API itself without having to be manually passed to various methods.
To support these changes, the `AnnotationStorage`-instance initialization is moved into the `WorkerTransport`-class to allow both `PDFDocumentProxy` and `PDFPageProxy` to access it.
This patch implements the following simplifications:
- Remove the `annotationStorage`-parameter from `PDFDocumentProxy.saveDocument`, since it's already available internally.
Furthermore, while it's currently possible to call that method without an `AnnotationStorage`-instance, that really does *not* make any sense at all. In this case you're effectively reducing `PDFDocumentProxy.saveDocument` to a "regular" `PDFDocumentProxy.getData` call, but with *a lot* more overhead, which was obviously not the intention of the `PDFDocumentProxy.saveDocument`-method.
- Try to discourage third-party users from calling `PDFDocumentProxy.saveDocument` unconditionally, as a replacement for `PDFDocumentProxy.getData` (note the previous point).
- Replace the `annotationStorage`-parameter, in `PDFPageProxy.render`, with a boolean `includeAnnotationStorage`-parameter which simply indicates if the (internally available) `AnnotationStorage`-instance should be used during rendering (e.g. for printing).
- By removing the need to *manually* provide `annotationStorage`-parameters to various API-methods, using the API should become simpler (e.g. for third-parties) since you no longer need to worry about manually fetching and passing around this data.
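To illustrate the shape of these changes, here's a heavily simplified sketch. The `Transport`, `DocumentProxy` and `PageProxy` classes are hypothetical, synchronous stand-ins for `WorkerTransport`, `PDFDocumentProxy` and `PDFPageProxy`, and the returned objects are invented purely for the example.

```javascript
// Simplified stand-in for the storage of form values.
class AnnotationStorage {
  constructor() {
    this._values = new Map();
  }

  setValue(key, value) {
    this._values.set(key, value);
  }

  getAll() {
    return this._values.size > 0 ? Object.fromEntries(this._values) : null;
  }
}

// The transport lazily creates, and caches, just *one* storage instance.
class Transport {
  constructor() {
    this._annotationStorage = null;
  }

  get annotationStorage() {
    if (!this._annotationStorage) {
      this._annotationStorage = new AnnotationStorage();
    }
    return this._annotationStorage;
  }
}

class PageProxy {
  constructor(transport) {
    this._transport = transport;
  }

  // A boolean replaces the old `annotationStorage`-parameter.
  render({ includeAnnotationStorage = false } = {}) {
    const formData = includeAnnotationStorage
      ? this._transport.annotationStorage.getAll()
      : null;
    return { kind: "render", formData };
  }
}

class DocumentProxy {
  constructor(transport) {
    this._transport = transport;
  }

  get annotationStorage() {
    return this._transport.annotationStorage;
  }

  // No parameter needed; the storage is available internally.
  saveDocument() {
    return { kind: "save", formData: this._transport.annotationStorage.getAll() };
  }

  getPage() {
    return new PageProxy(this._transport);
  }
}

const doc = new DocumentProxy(new Transport());
doc.annotationStorage.setValue("field1", "abc");
console.log(doc.getPage().render({ includeAnnotationStorage: true }));
console.log(doc.saveDocument());
```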
The reason for the fairly large discrepancy, in the thumbnail quality, between the `draw`/`setImage`-methods is that in the former case we *directly* render the thumbnails at the final size that they'll appear at in the sidebar. In the latter case, we instead downsize the (generally) much larger "regular" pages.
To address this, I'm thus proposing that we let `PDFThumbnailView.draw` render thumbnails at *twice* their intended size and then downsize them to the final size.
Obviously this will increase *peak* memory usage during thumbnail rendering in `PDFThumbnailView.draw`, since doubling the width/height of a `canvas` will lead to its pixel-count increasing by a factor of `4`. Furthermore, since you need four components per pixel (given that it's RGBA-data), the *temporary* thumbnail `canvas` will thus need `16` bytes for every pixel of the final thumbnail during rendering. This is also why rendering thumbnails at their "original" scale, i.e. using something like `PDFPageProxy.getViewport({ scale: 1 });`, would be an absolutely terrible idea!
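The arithmetic can be checked with a small sketch. The thumbnail and page dimensions below are made-up values; only the four bytes per RGBA pixel comes from the reasoning above.

```javascript
const BYTES_PER_PIXEL = 4; // One byte each for R, G, B and A.

function canvasBytes(width, height) {
  return width * height * BYTES_PER_PIXEL;
}

// Hypothetical final thumbnail size, in pixels.
const finalWidth = 98;
const finalHeight = 126;
const finalPixels = finalWidth * finalHeight;

// Temporary canvas when rendering at twice the intended size.
const doubledBytes = canvasBytes(2 * finalWidth, 2 * finalHeight);

// Doubling both dimensions quadruples the pixel count...
const pixelFactor = (2 * finalWidth * 2 * finalHeight) / finalPixels;
// ...and with four bytes per pixel, that's 16 bytes of temporary canvas
// data for every pixel of the final thumbnail.
const bytesPerFinalPixel = doubledBytes / finalPixels;

// For comparison: a hypothetical 612 x 792 page rendered at `scale: 1`.
const fullPageBytes = canvasBytes(612, 792);

console.log({ pixelFactor, bytesPerFinalPixel, doubledBytes, fullPageBytes });
```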
To reduce the size and scope of these changes, I've tried to re-factor and re-use as much of the existing downsizing-implementation already present in `PDFThumbnailView` as possible.
While this will generally *not* make thumbnails rendered by `PDFThumbnailView.draw` look *identical* to those based on the rendered pages (via `PDFThumbnailView.setImage`), it's a considerable improvement as far as I'm concerned and enough to call the issue fixed.
*Please note:* This patch will not lead to *any* additional overhead, in either memory usage or parsing, for thumbnails which are based on the rendered pages.
A loop is less efficient than just overwriting the content, which is what we've generally been using (for years) in other parts of the code-base (see e.g. `BaseViewer` and `PDFThumbnailViewer`).
These properties are always updated/used together, and there's no other methods which depend on just one of them, hence they're changed into local variables instead.
Looking through the history of this code, it seems they were converted *from* local variables *to* properties all the way back in PR 2914; however, as far as I can tell from that diff, it doesn't seem to have been necessary even back then!?
As discussed in the issue, this is a small/simple patch that should help to prevent *outright* data loss in forms when a new document is opened in the GENERIC viewer.
While the implementation is perhaps a bit "simplistic", it does seem to work and should be fine given that this is an edge-case only relevant for the GENERIC viewer.
In the next patch we'll need to be able to actually wait for saving to complete, hence it's necessary to slightly re-factor the `save`-method.
As part of these changes, we can reduce some duplication in the `save`-method and slightly improve the overall code. For consistency, the `download`-method is updated similarly to improve the code (this functionality is *very* old, even pre-dating the introduction of Promises in the code-base).
As mentioned in the JSDoc comment, this should not be used unless you know what you're doing, since it will lead to increased memory usage. However, in some situations (e.g. SVG-rendering), we still want to be able to run general clean-up on both the main/worker-thread while keeping loaded fonts attached to the DOM.[1]
As part of these changes, `WorkerTransport.startCleanup` is converted to an async method and we'll also skip clean-up when destruction has started (since it's redundant).
---
[1] The SVG-rendering mode is obviously not officially supported, since it's both rather incomplete and inherently slower. However with recent changes, whereby we cache repeated images on the document rather than the page level, memory usage can be *a lot* worse than before if we never attempt to release e.g. cached image-data when the viewer is in SVG-rendering mode.
* JS - Correctly handle the hierarchy of fields
- it aims to fix #13132;
- annotations can inherit their actions from the parent field;
- there are some fields which act as a container for other fields:
  - they can be accessed through JS, so we need to add them with an empty type (nothing in the spec about that, but checked in Acrobat);
  - the calculation order list (CO) can reference them, so we need to make them available through `this.getField`;
  - the `getArray` method must return kids.
- field values are number, string, ... depending on their type, but there's nothing in the spec on how to know what the type is:
  - according to the comment for Canonical Format: https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=461
  - it seems that this "type" can be guessed from the JS action Format (when setting a type in Acrobat DC, the only affected thing is this action).
- `util.scand` with an empty string returns the current date.
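The container-field behaviour from the list above can be sketched as follows. The `Field` class and `buildFieldTree` helper are hypothetical and only model the lookup logic; in particular, `getArray` returning the *direct* kids is a simplification made for this example.

```javascript
class Field {
  constructor(name, type = "") {
    this.name = name;
    this.type = type; // An empty type marks a pure container field.
    this.kids = [];
  }

  getArray() {
    return this.kids.slice();
  }
}

// Builds a name -> Field map from fully qualified (dotted) field names,
// adding a container field with an empty type for every missing parent.
function buildFieldTree(terminalFields) {
  const fields = new Map();
  for (const { name, type } of terminalFields) {
    fields.set(name, new Field(name, type));
  }
  for (const { name } of terminalFields) {
    let child = fields.get(name);
    let index = name.lastIndexOf(".");
    while (index !== -1) {
      const parentName = name.substring(0, index);
      let parent = fields.get(parentName);
      if (!parent) {
        parent = new Field(parentName);
        fields.set(parentName, parent);
      }
      if (!parent.kids.includes(child)) {
        parent.kids.push(child);
      }
      child = parent;
      index = parentName.lastIndexOf(".");
    }
  }
  return fields;
}

const fields = buildFieldTree([
  { name: "address.street", type: "text" },
  { name: "address.city", type: "text" },
  { name: "total", type: "text" },
]);
// Stand-in for `this.getField` in the scripting sandbox.
const getField = name => fields.get(name) || null;

console.log(getField("address").getArray().map(field => field.name));
```

With this shape, an entry in the calculation order list that names `address` resolves to a field object even though no widget with that exact name exists.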
Given that *all* data has been loaded on the main-thread, and then transferred to the worker-thread, ever since PR 8617 (almost four years ago) it should no longer be necessary to keep this special-case around.
Given that the `webViewerOpenFileViaURL` helper function is being defined in *all* builds anyway, the current pre-processor usage doesn't really improve readability in my opinion.
Currently `destRef === null`, which will only happen in documents with corrupt destinations, will (unsurprisingly) throw when trying to look up the pageNumber. To avoid this, we can simply use the same format as in 1a2cdaffc5/web/pdf_link_service.js (L128)
The rotation handling that's currently living in `PDFViewerApplication` is *very* old, and pre-dates the introduction of the viewer components by years.
As can be seen in the `BaseViewer.pagesRotation` setter, we're not actually normalizing the rotation as intended and instead rely on the caller to handle that correctly. This is first of all inconsistent, given how other setters are implemented, and secondly it could also lead to the rotation being set to a value outside of the `[0, 360)`-range.
Finally, for improved consistency the rotation handling in `PageViewport` is updated similarly. Please note that in this case, it's *not* changing the pre-existing logic.
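A minimal sketch of normalizing inside the setter, so that callers don't have to. The `normalizeRotation` helper and the `Viewer` class are hypothetical, and rejecting values that aren't multiples of 90 is an assumption of this example.

```javascript
// Maps any multiple of 90 degrees into the [0, 360) range.
function normalizeRotation(rotation) {
  if (!Number.isInteger(rotation) || rotation % 90 !== 0) {
    throw new Error("Invalid rotation: must be a multiple of 90 degrees.");
  }
  // The extra `+ 360` handles negative input, since `%` keeps the sign
  // of the dividend in JavaScript.
  return ((rotation % 360) + 360) % 360;
}

class Viewer {
  constructor() {
    this._pagesRotation = 0;
  }

  get pagesRotation() {
    return this._pagesRotation;
  }

  set pagesRotation(rotation) {
    // Normalize here, rather than relying on every caller to do it.
    this._pagesRotation = normalizeRotation(rotation);
  }
}

const viewer = new Viewer();
viewer.pagesRotation = -90;
console.log(viewer.pagesRotation); // 270
```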
This improves and simplifies #13102 in order to make printing of test-cases
like the one in bug 1698414 (where the real page is bigger than the target
page) much better, see incoming screenshots.
The reason why we need to stop setting .style.width / .style.height is to get
the right auto-sizing behavior in both axes. This shouldn't change behavior as
long as the print resolution is >= the CSS resolution, which seems like a
reasonable assumption.
If you try to print with a lower resolution than CSS, then instead of a
stretched canvas, you'd get a centered CSS-quality canvas, which seems
sensible. This could maybe be fixed with some CSS hackery (some combination of
min / max and viewport units perhaps?), but I think it's more trouble than it's
worth.
- implement a few positioning properties: position, width, height, anchor;
- implement font element;
- implement fill element (used by font) and its children (linear, radial, ...);
- font property is inherited from ancestor container (see https://www.pdfa.org/wp-content/uploads/2020/07/XFA-3_3.pdf#page=43) so let CSS handle that stuff;
- in order to reduce the number of properties to set, only set non-default properties and put the defaults in CSS;
- set a background to some containers to be able to see them (will be removed in a future commit).
The intention, in PR 12493, was that the page we're adding to the browser history should behave as if it were a "regular" internal destination (to properly convey user intent).
Unfortunately, since I didn't consider all the edge-cases correctly, it ended up behaving like a URL-hash instead which obviously wasn't intended. Note that currently this isn't a problem, however it can become an issue (in some cases) with upcoming re-factoring around `PDFHistory` and OpenAction support[1].
---
[1] I've started working on fixing the following TODO, which will require a couple of smaller tweaks here and there: 9d0ce6e79f/web/app.js (L1680-L1681)
In the `getAll`-method, we can have just one *explicit* loop rather than two indirect ones via the old `Object.assign`-call.
Also, change the `get`-method to be slightly more compact (while keeping the logic intact).
Looking at this now, I cannot understand why we'd need to initialize `this.prefs` with all of the values from `this.defaults`.
Not only does this *indirectly* require one extra loop, via the `Object.assign`-call, but it also means that in GENERIC-builds changes to default-preference values might not be picked up unless the existing user-prefs are cleared (if the user had *manually* set prefs previously).
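A simplified, synchronous sketch of the idea follows. This `Preferences` class is a hypothetical stand-in with made-up preference names, not the viewer's actual implementation: `prefs` only holds values that were explicitly set, and the defaults are consulted at read time.

```javascript
class Preferences {
  constructor(defaults, storedPrefs = {}) {
    this.defaults = Object.freeze({ ...defaults });
    // Only values the user has explicitly set; *not* a copy of the defaults.
    this.prefs = Object.create(null);
    for (const name of Object.keys(storedPrefs)) {
      if (name in this.defaults) {
        this.prefs[name] = storedPrefs[name];
      }
    }
  }

  get(name) {
    const defaultValue = this.defaults[name];
    if (defaultValue === undefined) {
      throw new Error(`Get preference: "${name}" is undefined.`);
    }
    const value = this.prefs[name];
    return value !== undefined ? value : defaultValue;
  }

  // One explicit loop, falling back to the default for unset prefs.
  getAll() {
    const obj = Object.create(null);
    for (const name in this.defaults) {
      const value = this.prefs[name];
      obj[name] = value !== undefined ? value : this.defaults[name];
    }
    return obj;
  }
}

const stored = { sidebarView: 2 };
const prefs = new Preferences({ zoom: "auto", sidebarView: 0 }, stored);
console.log(prefs.getAll());

// A changed default is picked up, since it was never copied into `prefs`.
const updated = new Preferences({ zoom: "page-fit", sidebarView: 0 }, stored);
console.log(updated.get("zoom"));
```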
Given that the `enableXfa` parameter must be passed to the API/Worker, and thus included in the `getDocument` call, it's not necessary to include it when initializing the `PDFViewer`-instance used in the default viewer. (Also, in `AppOptions`, the parameter is clearly marked with `OptionKind.API`.)
Furthermore, we probably don't want to display the fallback bar (in Firefox) for XFA documents when `enableXfa = true` is set.
While it's still not entirely clear if this would've prevented the issue as reported, given that the particular use-case reported apparently no longer applies, this small change really cannot hurt in general *and* it won't affect "regular" viewer builds in any way.
Given how the compatibility-values are being handled, it's not actually possible to override a *truthy* default-value with a *falsy* compatibility-value.
This is a simple oversight on my part, and with modern ECMAScript features this is very easy to support.
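The difference can be shown with a small sketch. The option table and both helper functions are hypothetical, and the `||`-based lookup is only assumed here as the kind of check that causes the problem; the nullish coalescing operator (`??`) is the modern ECMAScript feature used for the fix in this example.

```javascript
// Hypothetical option table: `compatibility` holds the value to use in
// environments that need the compatibility fallback.
const optionTable = {
  enableFancyFeature: { value: true, compatibility: false },
  maxPixels: { value: 4096, compatibility: 1024 },
  cursorTool: { value: 0 },
};

// `||` skips *every* falsy compatibility value, so `false` can never
// override a truthy default.
function defaultsWithOr(table) {
  const result = {};
  for (const name in table) {
    const option = table[name];
    result[name] = option.compatibility || option.value;
  }
  return result;
}

// `??` only falls back when the compatibility value is null/undefined.
function defaultsWithNullish(table) {
  const result = {};
  for (const name in table) {
    const option = table[name];
    result[name] = option.compatibility ?? option.value;
  }
  return result;
}

console.log(defaultsWithOr(optionTable).enableFancyFeature); // true
console.log(defaultsWithNullish(optionTable).enableFancyFeature); // false
```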
With the changes made in the previous patch, we can now list "disableTelemetry" in the `AppOptions` only for the `CHROME`-builds and thus remove the special-casing in the `checkChromePreferencesFile` helper function.
Originally the default preferences were simply placed in a JSON-file, checked into the repository, which over time became impractical, annoying, and error-prone to maintain; please see PR 10548.
While that improved the overall situation a fair bit, it did inherit one quite unfortunate property of the old JSON-based solution[1]: It's still not possible for *different* build targets to specify their *own* default preference values.
With some preferences, such as e.g. `enableScripting`, it's not inconceivable that you'd want to (at least) support build-specific default preference values. Currently that's not really possible, which is why this PR re-factors the default preferences generation to support this.
---
[1] This fact isn't really clear from the `AppOptions` implementation, unless you're familiar with the `gulpfile.js` code, which could lead to some confusion for those new to this part of the code-base.