Currently, with `enablePrintAutoRotate = true` set, we're forced to loop through all the pages *twice* when checking for any landscape pages.
This seems completely unnecessary now, and using only *one* loop should be marginally more efficient in general.
Currently landscape pages are rotated *clockwise*, which for most documents feel wrong since holding the printed pages at their *left* edge causes the landscape pages to be viewed "upside down".
In general, since most documents are LTR ones, it feels more appropriate to instead rotate landscape pages *counterclockwise* for printing.
- add an option to enable XFA rendering if any;
- for now, let the canvas layer: it could be useful to implement XFAF forms (embedded pdf in xml stream for the background and xfa form for the foreground);
- ui elements in template DOM are pretty close to their html counterpart so we generate a fake html DOM from template one:
- it makes easier to translate template properties to html ones;
- it makes faster the creation of the html element in the main thread.
*Please note:* Given the pre-existing issues raised in PR 13056, which seem to block immediate progress there, this patch extracts some *overall* improvements of the scripting/sandbox destruction in `PDFScriptingManager`.
As can be seen in `BaseViewer.setDocument`, it's currently necessary to *manually* delay the `PDFScriptingManager`-destruction in order for things to work correctly. This is, in hindsight, obviously an *extremely poor* design choice on my part; sorry about the churn here!
In order to improve things overall, the `PDFScriptingManager._destroyScripting`-method is re-factored to wait for the relevant events to be dispatched *before* sandbox-destruction occurs.
To avoid the scripting/sandbox-destruction hanging indefinitely, we utilize a timeout to force-destroy the sandbox after a short time (currently set to 1 second).
By moving this code from the `BaseViewer` and into `PDFScriptingManager`, all of the scripting initialization/handling code is now limited to just one file/class which help overall readability (in my opinion). Also, this patch is a *net reduction* in number of lines of code which can never hurt.
As part of these changes, the intermediary "pageopen"/"pageclose" events are now removed in favor of using the "regular" viewer events directly in `PDFScriptingManager`. Hence this removes some (strictly unnecessary) indirection in the current code, when handling PageOpen/PageClose events, which leads to overall fewer function calls in this part of the code.
The *main* purpose of this patch is to allow scripting to be used together with the viewer components, note the updated "simpleviewer"/"singlepageviewer" examples, in addition to the full default viewer.
Given how the scripting functionality is currently implemented in the default viewer, trying to re-use this with the standalone viewer components would be *very* hard and ideally you'd want it to work out-of-the-box.
For an initial implementation, in the default viewer, of the scripting functionality it probably made sense to simply dump all of the code in the `app.js` file, however that cannot be used with the viewer components.
To address this, the functionality is moved into a new `PDFScriptingManager` class which can thus be handled in the same way as all other viewer components (and e.g. be passed to the `BaseViewer`-implementations).
Obviously the scripting functionality needs quite a lot of data, during its initialization, and for the default viewer we want to maintain the current way of doing the lookups since that helps avoid a number of redundant API-calls.
To that end, the `PDFScriptingManager` implementation accepts (optional) factories/functions such that we can maintain the current behaviour for the default viewer. For the viewer components specifically, fallback code-paths are provided to ensure that scripting will "just work"[1].
Besides moving the viewer handling of the scripting code to its own file/class, this patch also takes the opportunity to re-factor the functionality into a number of helper methods to improve overall readability[2].
Note that it's definitely possible that the `PDFScriptingManager` class could be improved even further (e.g. for general re-use), since it's still heavily tailored to the default viewer use-case, however I believe that this patch is still a good step forward overall.
---
[1] Obviously *all* the relevant document properties might not be available in the viewer components use-case (e.g. the various URLs), but most things should work just fine.
[2] The old `PDFViewerApplication._initializeJavaScript` method, where everything was simply inlined, have over time (in my opinion) become quite large and somewhat difficult to *easily* reason about.
Rather than having to spell out the English fallback strings at *every* single `IL10n.get` call-site throughout the viewer, we can simplify things by collecting them in *one* central spot.
This provides a much better overview of the fallback l10n strings used, which makes future changes easier and ensures that fallback strings occuring in multiple places cannot accidentally get out of sync.
Furthermore, by making the `fallback` parameter of the `IL10n.get` method *optional*[1] many of the call-sites (and their surrounding code) become a lot less verbose.
---
[1] It's obviously still possible to pass in a fallback string, it's just not required.
I happened to look at this code, and I can't for the life of me figure out why I didn't just implement it like this patch in the first place (since the current format feels overly verbose).
- Mark `BaseViewer.initializeScriptingEvents` as an `async` method, since that's actually how it's being used in the default viewer (see `PDFViewerApplication-_initializeJavaScript`).
- Change `BaseViewer._pageWidthScaleFactor` to access the *internal* scroll/spread-modes directly, rather than using the getters, since that's consistent with the rest of the code (and not just for these properties).
Given that `getComputedStyle` only works on visible elements, the result of PR 12354 is that if the viewer is placed in a *hidden* `iframe` the viewer will now be broken. This obviously wasn't the intention of that PR, hence I believe that we should limit the `position: absolute;` check slightly to avoid this.
Currently it's not *immediately* clear from the code itself, unless you look at the definition of `this._pageLabels`, that the default value is `null`.[1]
We can improve this, and also reduce the amount of code, by using modern ECMAScript features such as optional chaining and nullish coalescing.
---
[1] Keep in mind that an *empty* string is actually a valid page label, according to the PDF specification.
- For wrapped scrolling, we unfortunately need to do a fair bit of parsing of the *current* page layout. Compared to e.g. the spread-modes, where we can easily tell how the pages are laid out, with wrapped scrolling we cannot tell without actually checking. In particular documents with varying page sizes require some care, since we need to check all pages on the "row" of the current page are visible and that there aren't any "holes" present. Otherwise, in the general case, there's a risk that we'd skip over pages if we'd simply always advance to the previous/next "row" in wrapped scrolling.
- For horizontal scrolling, this patch simply maintains the current behaviour of advancing *one* page at a time. The reason for this is to prevent inconsistent behaviour for the next and previous cases, since those cannot be handled identically. For the next-case, it'd obviously be simple to advance to the first not completely visible page. However for the previous-case, we'd only be able to go back *one* page since it's not possible to (easily) determine the page layout of non-visible pages (documents with varying page sizes being a particular issue).
- For vertical scrolling, this patch maintains the current behaviour by default. When spread-modes are being used, we'll now attempt to advance to the next *spread*, rather than just the next page, whenever possible. To prevent skipping over a page, this two-page advance will only apply when both pages of the current spread are visible (to avoid breaking documents with varying page sizes) and when the second page in the current spread is fully visible *horizontally* (to handle larger zoom values).
In order to reduce the performance impact of these changes, note that the previous/next-functionality will only call `getVisibleElements` for the scroll/spread-modes where that's necessary and that "normal" vertical scrolling is thus unaffected by these changes.
To support these changes, the `getVisibleElements` helper function will now also include the `widthPercent` in addition to the existing `percent` property.
The `PDFViewer._updateHelper` method is changed slightly w.r.t. updating the `currentPageNumber` for the non-vertical/spread modes, i.e. won't affect "normal" vertical scrolling, since that helped simplify the overall calculation of the page advance.
Finally, these new `BaseViewer` methods also allow (some) simplification of previous/next-page functionality in various viewer components.
*Please note:* There's one thing that this patch does not attempt to change, namely disabling of the previous/next toolbarButtons respectively the firstPage/lastPage secondaryToolbarButtons. The reason for this is that doing so would add quite a bit of complexity in general, and if for some reason `BaseViewer._getPageAdvance` would get things wrong we could end up incorrectly disabling the buttons. Hence it seemed overall safer to *not* touch this, and accept that the buttons won't be `disabled` despite in some edge-cases no further scrolling being possible.
Note first of all how the `PDFDocumentProxy.getJSActions` method in the API caches the result, which makes repeated lookups cheap enough to not really be an issue.
Secondly, with the previous patch, we're now only dispatching "pageopen"/"pageclose"-events when there's actually a sandbox that listens for them.
All-in-all, with these changes we can thus simplify the default-viewer "pageopen"-event handler a fair bit.
The "pageopen"/"pageclose"-events are only necessary if, and only if, there's actually a sandbox to dispatch the events in. Hence we shouldn't dispatch those events unconditionally, as soon as `enableScripting` is set, but rather initialize that functionality only when needed.
Furthermore, in `web/app.js`, there's currently a bug since we're attempting to *manually* simulate a "pageopen"-event for a page that may not actually have been rendered at the time. With the modified `BaseViewer.initializeScriptingEvents` method, we'll now dispatch a correct "pageopen"-event here.
With the code dispatching a "pageopen" event on the existing (general) `BaseViewer` event "pagesinit", in practice this means that the `Set` is always being created. Hence we can simplify the method overall, by always initializing the `this._pageOpenPendingSet` property.
Given that "pageopen" events are not guaranteed to occur, if the page becomes inactive *before* it finishes rendering, we should probably also avoid dispatching a "pageclose" event in that case to avoid confusing/inconsistent state in any event handlers.
- Actually remove the `isDown` property when destroying the scripting-instance.
- Mark all `mouseState` usage as "private" in the various classes.
- Ensure that the `AnnotationLayer` actually treats the parameter as properly *optional*, the same way that the viewer components do.
- For now remove the `mouseState` parameter from the `PDFPageView` class, and keep it only on the `BaseViewer`, since it's questionable if all of the scripting-functionality will work all that well without e.g. a full `BaseViewer`.
- Append the `mouseState` to the JSDoc for the `AnnotationElement` class, and just move its definition into the base-`AnnotationElement` class.
This new event essentially mirrors the existing "pagesinit" event, and will allow e.g. a custom implementation of the viewer to be notified before the current PDF document is removed from the viewer.
By using this new event, we're thus able to dispatch a "pageclose" event for JavaScript actions when closing the existing document.
Having looked at the Acrobat JavaScript specification, see https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/AcrobatDC_js_api_reference.pdf#G5.1963437, I suppose that introducing these two new events is probably the easiest solution overall.
However there's a number of things that, as far as I'm concerned, will help the overall implementation:
- Only dispatch these new events when `enableScripting = true` is set.
- Handle them *separately* from the existing "pagechanging" event dispatching, to avoid too much clutter.
- Don't dispatch either of the events if the page didn't actually change.
- When waiting for pages to render, don't dispatch "pageopen" if the page is no longer active when rendering finishes.
- Ensure that we only use *one* "pagerendered" event listener.
- Ensure that "pageopen" is actually dispatched when the document loads.
I suppose that we *could* avoid adding the "pageclose" event, and use the existing "pagechanging" event instead, however having a separate event might allow more flexibility in the future. (E.g. I don't know if we'll possibly want to dispatch "pageclose" on document close, as mentioned briefly in the specification.)
- Add support for logical assignment operators, i.e. `&&=`, `||=`, and `??=`, with a Babel-plugin. Given that these required incrementing the ECMAScript version in the ESLint and Acorn configurations, and that platform/browser support is still fairly limited, always transpiling them seems appropriate for now.
- Cache the `hasJSActions` promise in the API, similar to the existing `getAnnotations` caching. With this implemented, the lookup should now be cheap enough that it can be called unconditionally in the viewer.
- Slightly improve cleanup of resources when destroying the `WorkerTransport`.
- Remove the `annotationStorage`-property from the `PDFPageView` constructor, since it's not necessary and also brings it more inline with the `BaseViewer`.
- Update the `BaseViewer.createAnnotationLayerBuilder` method to actaually agree with the `IPDFAnnotationLayerFactory` interface.[1]
- Slightly tweak a couple of JSDoc comments.
---
[1] We probably ought to re-factor both the `IPDFTextLayerFactory` and `IPDFAnnotationLayerFactory` interfaces to take parameter objects instead, since especially the `IPDFAnnotationLayerFactory` one is becoming quite unwieldy. Given that that would likely be a breaking change for any custom viewer-components implementation, this probably requires careful deprecation.
This patch will help reduce memory usage, especially for longer documents, when the user scrolls around in the thumbnailView (in the sidebar).
Note how the `PDFPageProxy.cleanup` method will, assuming it's safe to do so, release main-thread resources associated with the page. These include things such as e.g. image data (which can be arbitrarily large), and also the operatorList (which can also be quite large).
Hence when pages are evicted from the `PDFPageViewBuffer`, on the `BaseViewer`-instance, the `PDFPageView.destroy` method is invoked which will (among other things) call `PDFPageProxy.cleanup` in the API.
However, looking at the `PDFThumbnailViewer`/`PDFThumbnailView` classes you'll notice that there's no attempt to ever call `PDFPageProxy.cleanup`, which implies that in certain circumstances we'll essentially keep all resources allocated permanently on the `PDFPageProxy`-instances in the API.
In particular, this happens when the users opens the sidebar and starts scrolling around in the thumbnails. Generally speaking you obviously need to keep all thumbnail *images* around, since otherwise the thumbnailView is useless, but there's still room for improvement here.
Please note that the case where a *rendered page* is used to create the thumbnail is (obviously) completely unaffected by the issues described above, and this rather only applies to thumbnails being explicitly rendered by the `PDFThumbnailView.draw` method.
For the latter case, we can fix these issues simply by calling `PDFPageProxy.cleanup` once rendering has finished. To prevent *accidentally* pulling the rug out from under `PDFPageViewBuffer` in the viewer, which expects data to be available, this required adding a couple of new methods[1] to enable checking that it's indeed safe to call `PDFPageProxy.cleanup` from the `PDFThumbnailView.draw` method.
It's really quite fascinating that no one has noticed this issue before, since it's been around since basically "forever".
---
[1] While it should be *very* rare for `PDFThumbnailView.draw` to be called for a pageView that's also in the `PDFPageViewBuffer`, given that pages are rendered before thumbnails and that the *rendered page* is used to create the thumbnail, it can still happen since rendering is asynchronous.
Furthermore, it's also possible for `PDFThumbnailView.setImage` to be disabled, in which case checking the `PDFPageViewBuffer` for active pageViews *really* matters.
* When no actions then set it to null instead of empty object
* Even if a field has no actions, it needs to listen to events from the sandbox in order to be updated if an action changes something in it.
Given the number of parameters, and the fact that many of them are booleans, the call-sites are no longer particularly easy to read and understand. Furthermore, this slightly improves the formatting of the JSDoc-comment, since it needed updating as part of these changes anyway.
Finally, this removes an unnecessary `numViews === 0` check from `getVisibleElements`, since that should be *very* rare and more importantly that the `binarySearchFirstItem` function already has a fast-path for that particular case.
This patch addresses a review comment, which pointed out that we should *also* handle the pageNumber-input, from PR 12493.
Given that a user *manually* changing pages using the pageNumber-input, on the toolbar, could be regarded as a pretty strong indication of user-intent w.r.t. navigation in the document, hence I suppose that updating the browser history in this case as well probably won't hurt.
Given that we're now accessing certain API-functionality *directly* in this file, e.g. the AnnotationStorage and Optional Content configuration, ensuring that there's not a version mismatch definitely seem like a good idea to prevent any *subtle* future bugs.
The `getVisibleElements` helper function currently requires the viewerContainer to be absolutely positioned; possibly fixing this is tracked in issue 11626.
Without `position: absolute;` set, in the CSS, there's a number of things that won't work correctly such as e.g.
- Determining which pages are currently visible, thus forcing all of them to render on load and increasing resource usage significantly; note https://github.com/mozilla/pdf.js/wiki/Frequently-Asked-Questions#allthepages
- Scrolling pages into view, by using the `BaseViewer.currentPageNumber` setter or similar.
Based on the number of opened issues over the years, the fact that `position: absolute;` is required has shown to be something that users can very easily overlook unless they follow e.g. the `simpleviewer` example to the letter.
Hence, to improve things until such a time that issue 11626 is fixed, we'll now refuse to initialize a `BaseViewer` instance unless the `container` has the required CSS set. (Forcibly setting `position: absolute;` on the viewerContainer element is bound to cause significantly more issues/confusion, hence the current approach of throwing an Error.)
This should help prevent future issues, caused by the user omitting the `viewer` option and/or providing an incorrect `container` option, when initializing a `BaseViewer`-instance.
*Besides, obviously, adding viewer support:* This patch attempts to improve the general API for Optional Content Groups slightly, by adding a couple of new methods for interacting with the (more complex) data structures of `OptionalContentConfig`-instances. (Thus allowing us to mark some of the data as "private", given that it probably shouldn't be manipulated directly.)
By utilizing not just the "raw" Optional Content Groups, but the data from the `/Order` array when available, we can thus display the Layers in a proper tree-structure with collapsible headings for PDF documents that utilizes that feature.
Note that it's possible to reset all Optional Content Groups to their default visibility state, simply by double-clicking on the Layers-button in the sidebar.
(Currently that's indicated in the Layers-button tooltip, which is obviously easy to overlook, however it's probably the best we can do for now without adding more buttons, or even a dropdown-toolbar, to the sidebar.)
Also, the current Layers-button icons are a little rough around the edges, quite literally, but given that the viewer will soon have its UI modernized anyway they hopefully suffice in the meantime.
To give users *full* control of the visibility of the various Optional Content Groups, even those which according to the `/Order` array should not (by default) be toggleable in the UI, this patch will place those under a *custom* heading which:
- Is collapsed by default, and placed at the bottom of the Layers-tree, to be a bit less obtrusive.
- Uses a slightly different formatting, compared to the "regular" headings.
- Is localizable.
Finally, note that the thumbnails are *purposely* always rendered with all Optional Content Groups at their default visibility state, since that seems the most useful and it's also consistent with other viewers.
To ensure that this works as intended, we'll thus disable the `PDFThumbnailView.setImage` functionality when the Optional Content Groups have been changed in the viewer. (This obviously means that we'll re-render thumbnails instead of using the rendered pages. However, this situation ought to be rare enough for this to not really be a problem.)
Given that `renderInteractiveForms` is now enabled by default in "full" viewer, it seems reasonable to enable it by default in the viewer components as well.
Especially considering that it's simple to disable, when creating the affected components, for anyone implementing their own viewer.
The current behavior for `getPagesOverview` assumes we want to only
auto-rotate if:
- `enablePrintAutoRotate` is `true`
- `isFirstPagePortrait !== isPortraitOrientation(size)`
This second check is what is breaking #9297. The two PDFs linked have a
landscape orientation first page, as well as subsequent pages. Since
`false === false`, we print portrait.
Let's drop the comparison with `isFirstPagePortrait`, and print
landscape if `!isPortraitOrientation(size)`.
Fixes#9297.
*Please note:* These changes were done automatically, using the `gulp lint --fix` command.
This rule is already enabled in mozilla-central, see https://searchfox.org/mozilla-central/rev/567b68b8ff4b6d607ba34a6f1926873d21a7b4d7/tools/lint/eslint/eslint-plugin-mozilla/lib/configs/recommended.js#103-104
The main advantage, besides improved consistency, of this rule is that it reduces the size of the code (by 3 bytes for each case). In the PDF.js code-base there's close to 8000 instances being fixed by the `dot-notation` ESLint rule, which end up reducing the size of even the *built* files significantly; the total size of the `gulp mozcentral` build target changes from `3 247 456` to `3 224 278` bytes, which is a *reduction* of `23 178` bytes (or ~0.7%) for a completely mechanical change.
A large number of these changes affect the (large) lookup tables used on the worker-thread, but given that they are still initialized lazily I don't *think* that the new formatting this patch introduces should undo any of the improvements from PR 6915.
Please find additional details about the ESLint rule at https://eslint.org/docs/rules/dot-notation
Please note that these changes were done automatically, using `gulp lint --fix`.
Given that the major version number was increased, there's a fair number of (primarily whitespace) changes; please see https://prettier.io/blog/2020/03/21/2.0.0.html
In order to reduce the size of these changes somewhat, this patch maintains the old "arrowParens" style for now (once mozilla-central updates Prettier we can simply choose the same formatting, assuming it will differ here).
The correct/intended way of working with the "viewer components" is by providing an `EventBus` instance upon initialization, and the `getGlobalEventBus` was only added for backwards compatibility.
Note, for example, that using `getGlobalEventBus` doesn't really work at all well with a use-case where there's *multiple* `PDFViewer` instances on a one page, since it may then be difficult/impossible to tell which viewer a particular event originated from.
All of the "viewer components" examples have been previously updated, such that there's no longer any code/examples which relies on the now removed `getGlobalEventBus` functionality.
For reasons that I don't even pretend to understand, the `textLayerFactory` property is determined for *every single* page in the PDF document.
Given that the `TextLayerMode` should be consistent for *all* pages in a document, we obviously could/should define `textLayerFactory` just once instead.
There's a couple of issues with this functionality:
- The respective `PromiseCapability` instances are not being reset, in `BaseViewer._resetView`, when the document is closed which is inconsistent with all other state.
- While the default viewer depends on these promises, and they thus ought to be considered part of e.g. the `PDFViewer` API-surface, they're not really defined in a particularily user-visible way (being that they're attached to the `BaseViewer` instance *inline* in `BaseViewer.setDocument`).
- There's some internal `BaseViewer` state, e.g. `BaseViewer._pageViewsReady`, which is tracked manually and could instead be tracked indirectly via the relevant `PromiseCapability`, thus reducing the need to track state *twice* since that's always best to avoid.
*Please note:* In the existing implementation, these promises are not defined *until* the `BaseViewer.setDocument` method has been called.
While it would've been simple to lift that restriction in this patch, I'm purposely choosing *not* to do so since this ensures that any Promise handlers added inside of `BaseViewer.setDocument` are always invoked *before* any external ones (and keeping that behaviour seems generally reasonable).
Since the goal has always been, essentially since the `EventBus` abstraction was added, to remove all dispatching of DOM events[1] from the viewer components this patch tries to address one thing that came up when updating the examples:
The DOM events are always dispatched last, and it's thus guaranteed that all internal event listeners have been invoked first.
However, there's no such guarantees with the general `EventBus` functionality and the order in which event listeners are invoked is *not* specified. With the promotion of the `EventBus` in the examples, over DOM events, it seems like a good idea to at least *try* to keep this ordering invariant[2] intact.
Obviously this won't prevent anyone from manually calling the new *internal* viewer component methods on the `EventBus`, but hopefully that won't be too common since any existing third-party code would obviously use the `on`/`off` methods and that all of the examples shows the *correct* usage (which should be similarily documented on the "Third party viewer usage" Wiki-page).
---
[1] Looking at the various Firefox-tests, I'm not sure that it'll be possible to (easily) re-write all of them to not rely on DOM events (since getting access to `PDFViewerApplication` might be generally difficult/messy depending on scopes).
In any case, even if technically feasible, it would most likely add *a lot* of complication that may not be desireable in the various Firefox-tests. All-in-all, I'd be fine with keeping the DOM events only for the `MOZCENTRAL` target and gated on `Cu.isInAutomation` (or similar) rather than a preference.
[2] I wouldn't expect any *real* bugs in a custom implementation, simply based on event ordering, but it nonetheless seem like a good idea if any "external" events are still handled last.