Note first of all how the `PDFDocumentProxy.getJSActions` method in the API caches the result, which makes repeated lookups cheap enough to not really be an issue.
Secondly, with the previous patch, we're now only dispatching "pageopen"/"pageclose"-events when there's actually a sandbox that listens for them.
All-in-all, with these changes we can thus simplify the default-viewer "pageopen"-event handler a fair bit.
This patch is a rebased *and* refactored version of PR 9448, such that it applies cleanly given that `PDFFindController` has changed since that PR was opened; obviously keeping the original author information intact.
This patch will thus ensure that e.g. fractions, and other things that we normalize before searching, will still be highlighted correctly in the textLayer.
Furthermore, this patch also adds basic unit-tests for this functionality.
*Note:* The `[api-minor]` tag is added, since third-party implementations of the `PDFFindController` must now always use the `pageMatchesLength` property to get accurate length information (see the `web/text_layer_builder.js` changes).
Co-authored-by: Ross Johnson <ross@mazira.com>
Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>
The "pageopen"/"pageclose"-events are only necessary if, and only if, there's actually a sandbox to dispatch the events in. Hence we shouldn't dispatch those events unconditionally, as soon as `enableScripting` is set, but rather initialize that functionality only when needed.
Furthermore, in `web/app.js`, there's currently a bug since we're attempting to *manually* simulate a "pageopen"-event for a page that may not actually have been rendered at the time. With the modified `BaseViewer.initializeScriptingEvents` method, we'll now dispatch a correct "pageopen"-event here.
Not only was long text in popups no longer wrapped correctly, the
alignment was also center instead of left (or right, depending on the
locale used) for both text in popups and the other parts within the
annotation's section, such as the icon.
Note that these changes were done automatically, using `gulp lint --fix`.
With this rule, we'll thus enforce a *consistent* formatting of zero-lengths in our CSS files.
Please find additional details about the Stylelint rule at https://stylelint.io/user-guide/rules/length-zero-no-unit
With the updated default viewer UI, some `dir`-dependent CSS rules are now redundant since *identical* rules are being specified for both LTR and RTL mode; after PR 12807 landed I've found even more of these cases.
Note in particular that the findbar-button rules can be simplified quite a bit, since there's a fair amount of unnecessary duplication in the CSS.
There's built-in ESLint rule, see `sort-imports`, to ensure that all `import`-statements are sorted alphabetically, since that often helps with readability.
Unfortunately there's no corresponding rule to sort `export`-statements alphabetically, however there's an ESLint plugin which does this; please see https://www.npmjs.com/package/eslint-plugin-sort-exports
The only downside here is that it's not automatically fixable, but the re-ordering is a one-time "cost" and the plugin will help maintain a *consistent* ordering of `export`-statements in the future.
*Note:* To reduce the possibility of introducing any errors here, the re-ordering was done by simply selecting the relevant lines and then using the built-in sort-functionality of my editor.
This implementation is inspired by the behaviour in (recent versions of) Adobe Reader, since it leads to reasonably simple and straightforward code as far as I'm concerned.
*Specifically:* We'll only consider *one* destination per page when finding/highlighting the current outline item, which is similar to e.g. Adobe Reader, and we choose the *first* outline item at the *lowest* level of the outline tree.
Given that this functionality requires not only parsing of the `outline`, but looking up *all* of the destinations in the document, this feature can when initialized have a non-trivial performance overhead for larger PDF documents.
In an attempt to reduce the performance impact, the following steps are taken here:
- The "find current outline item"-functionality will only be enabled once *one* page has rendered and *all* the pages have been loaded[1], to prevent it interfering with data regular fetching/parsing early on during document loading and viewer initialization.
- With the exception of a couple of small and simple `eventBus`-listeners, in `PDFOutlineViewer`, this new functionality is initialized *lazily* the first time that the user clicks on the `currentOutlineItem`-button.
- The entire "find current outline item"-functionality is disabled when `disableAutoFetch = true` is set, since it can easily lead to the setting becoming essentially pointless[2] by triggering *a lot* of data fetching from a relatively minor viewer-feature.
- Fetch the destinations *individually*, since that's generally more efficient than using `PDFDocumentProxy.getDestinations` to fetch them all at once. Despite making the overall parsing code *more* asynchronous, and leading to a lot more main/worker-thread message passing, in practice this seems faster for larger documents.
Finally, we'll now always highlight an outline item that the user manually clicked on, since only highlighting when the new "find current outline item"-functionality is used seemed inconsistent.
---
[1] Keep in mind that the `outline` itself already isn't fetched/parsed until at least *one* page has been rendered in the viewer.
[2] And also quite slow, since it can take a fair amount of time to fetch all of the necessary `destinations` data when `disableAutoFetch = true` is set.
With the code dispatching a "pageopen" event on the existing (general) `BaseViewer` event "pagesinit", in practice this means that the `Set` is always being created. Hence we can simplify the method overall, by always initializing the `this._pageOpenPendingSet` property.
Given that "pageopen" events are not guaranteed to occur, if the page becomes inactive *before* it finishes rendering, we should probably also avoid dispatching a "pageclose" event in that case to avoid confusing/inconsistent state in any event handlers.
Ensure that `PDFViewerApplication._contentLength` is always updated with the *correct* length, as returned by `PDFDocumentProxy.getDownloadInfo`, and only let the `PDFViewerApplication._initializeMetadata` method overwrite if it's not already been set.
Finally, in `PDFViewerApplication._initializeJavaScript`, the fallback `_contentLength` handling is now moved to just after the fallback `documentInfo` handling, such that all the fallback code is in one place within the method.
With the updated default viewer UI, a couple of `dir`-dependent CSS rules have now become redundant since *identical* rules are being specified for both LTR and RTL mode.
Furthermore, there's also some unnecessary re-defining of the `toolbarButton`/`secondaryToolbarButton`-icon related CSS rules.
Finally, for the toggle-buttons there's a particular styling applied to the `:hover:active` state, however the color wasn't defined with CSS variables.
With the updated default viewer UI, a couple of the toolbarButton icons are now *vertically* symmetrical; hence we can remove some now unneeded `transform: scaleX(-1);` rules from the viewer CSS.
Note how the `onerror` functionality is not being used in the GENERIC `DownloadManager`, since we have no way of knowing if downloading succeeded.
Hence this functionality is only *possibly* useful in MOZCENTRAL builds, however as outlined in the existing comments it's unlikely to be helpful in practice. Generally speaking, if downloading failed once in [`PdfStreamConverter.jsm`](https://searchfox.org/mozilla-central/rev/809ac3660845fef6faf18ec210232fdadc0f1ad9/toolkit/components/pdfjs/content/PdfStreamConverter.jsm#294-406) it seems very likely that it would fail again; all-in-all I'm thus suggesting that we just remove the `onerror` functionality altogether here.
Currently this code is duplicated no less than three times in the `web/app.js` file, and by introducing a helper method we can avoid unnecessary repetition.
There's a fair number of cases where `FirefoxCom.request`-calls are manually wrapped in a Promise to make it asynchronous. We can reduce the amount of boilerplate code in these cases by introducing a new `FirefoxCom.requestAsync` method instead.
Furthermore, a couple of `FirefoxCom.request`-calls in the `DownloadManager` are also changed to be asynchronous rather than using callback-functions.
With this patch, we're thus able to replace a lot of *direct* usages of `FirefoxCom.request` with the new `FirefoxCom.requestAsync` method instead.
*Please note:* It's highly recommended to ignore whitespace-only changes when looking at this patch.
Besides modernizing this code, by converting it to a standard class, the existing JSDoc comments are updated to actually agree better with the way that this functionality is used now. (The next patch will reduce usage of `FirefoxCom.request` significantly, hence the JSDocs for the optional `callback` is removed to not unnecessarily advertise that functionality.)
Finally, the unnecessary/unused `return` statement at the end of `FirefoxCom.request` is also removed.
This is the "modern" way of removing a node from the DOM, which has the benefit of being a lot shorter and more concise.
Also, this patch removes the `return` statement from the "pdf.js.response" event listener, since it's always `undefined`, given that none of the `callback`-functions used here ever return anything (and don't need to either). Generally speaking, returning a value from an event listener isn't normally necessary either.
This method currently accepts a callback-function, which does feel a bit old fashioned now. At the time that this code was introduced, native Promises didn't exist yet and there's a custom Promise-implementation used instead.
However, today with Promises and async/await being used *a lot* it seems reasonable to change `DefaultExternalServices.fallback` to an `async` method instead such that the callback-function can be removed.
Note how the end of the `{PDFOutlineViewer, PDFAttachmentViewer, PDFLayerViewer}.render` methods share *almost* identical code, hence we can reduce some duplication by introducing the new `BaseTreeViewer` helper method here.
Furthermore, setting `this._lastToggleIsShow` can be made ever so slightly more efficient, since we don't care about the number of ".treeItemsHidden"-classes but only want to know if at least one exists.
This follows the same principle as the `once` option that exists in the native `addEventListener` method, and will thus automatically remove an `EventBus` listener when it's invoked; see https://developer.mozilla.org/en-US/docs/Web/API/EventTarget/addEventListener#Parameters
Finally, this patch also tweaks some the existing `EventBus`-code to use modern features such as optional chaining and logical assignment operators.
Given that we already have a `PresentationModeState`-enumeration, we should use that with the "presentationmodechanged" event rather than including separate properties. Note that this new behaviour, of including an enumeration-value in the event, is consistent with lots of other existing viewer-events.
To hopefully avoid issues in custom implementations of the default viewer, any attempt to access the removed properties will now throw.
Similar to e.g. the "locale" option, this in *only* done for those build-targets where the "sandboxBundleSrc" is actually defined.
With these changes we can remove an `AppOptions` dependency from the `web/generic_scripting.js` file, thus limiting *direct* `AppOptions` usage in the default viewer files.
Given that the `dispatchEventInSandbox` method (on the scripting-classes) is asynchronous, there's a very real risk that the events won't be dispatched/handled until *after* their associated functionality has actually run (with the "Will..." events being particularily susceptible to this issue).
To reduce the likelihood of that happening, we can simply `await` the `dispatchEventInSandbox` calls as necessary. A couple of methods are now marked as `async` to support these changes, however that shouldn't be a problem as far as I can tell.
*Please note:* Given that the browser "beforeprint"/"afterprint" events are *synchronous*, we unfortunately cannot await the `WillPrint`/`DidPrint` event dispatching. To fix this properly the web-platform would need support for asynchronous printing, and we'll thus have to hope that things work correctly anyway.
Note that currently the `DidSave` event is not *guaranteed* to actually be dispatched if there's any errors during saving, which is easily fixed by simply moving it to occur in the `finally`-handler in `PDFViewerApplication.save` method.
For the `WillPrint`/`DidPrint` events, things are unfortunately more complicated. Currently these events will *only* be dispatched iff the printing request comes from within the viewer itself (e.g. by the user clicking on the "Print" toolbar button), however printing can be triggered in a few additional ways:
- In the GENERIC viewer:
- By the <kbd>Ctrl</kbd>+<kbd>P</kbd> keyboard shortcut.
- In the MOZCENTRAL viewer, i.e. the Firefox built-in viewer:
- By the <kbd>Ctrl</kbd>+<kbd>P</kbd> keyboard shortcut.
- By the "Print" item, as found in either the Firefox "Hamburger menu" or in the browser-window menu.
In either of the cases described above, no `WillPrint`/`DidPrint` events will be dispatched. In order to *guarantee* that things work in the general case, we thus have to move the `dispatchEventInSandbox` calls to the "beforeprint"/"afterprint" event handlers instead.
Rather than calling `getJavaScript` in the API and then ignoring the result, when "enableScripting" is set, it should be more efficient/faster to simply skip it altogether instead.
Finally, the `setTimeout` call at the end of `PDFViewerApplication._initializeAutoPrint` is removed, since it doesn't seem necessary any more as far as I can tell.[1]
Note that when this functionality was originally added, back in PR 2839, it seems that `pagesPromise` simply waited for the `getPage` calls of *all* pages to resolve. Today, on the other hand, the viewer fetches *and* renders the first page *before* doing the remaining `getPage` calls, and only afterwards is `pagesPromise` resolved. Hence it's not really clear why we now need to delay printing even further with a `setTimeout` call.
---
[1] The patch was tested with the following documents: https://github.com/mozilla/pdf.js/blob/master/test/pdfs/bug1001080.pdf and https://github.com/mozilla/pdf.js/blob/master/test/pdfs/issue6106.pdf
These callbacks should not be necessary *before* the document has been initialized. Furthermore, move the functionality to a new helper-method since `PDFViewerApplication.load` is already quite large.
Given that this relies on accessing properties on the `PDFDocumentProxy`-instance, it seems more appropriate for this code to live in `PDFViewerApplication`.
It seems that the timeout is way too short in practice, since this new integration-test failed *intermittently* already in PR 12702 (which is where the test was added).
The ideal solution here would be to simply await an event, dispatched by the viewer, however that unfortunately doesn't appear to be supported by Puppeteer.
Instead, the solution implemented here is to add a new method in `PDFViewerApplication` which Puppeteer can query to check if the scripting/sandbox has been fully initialized.
There's really no point, as far as I can tell, to attempt to dispatch an event in a non-existent sandbox. Generally speaking, even trying to do this *could* possibly even lead to errors in some cases.
Furthermore, utilize optional chaining to simplify some `dispatchEventInSandbox` calls throughout the viewer.
Finally, replace superfluous `return` statements with `break` in the switch-statement in the `updateFromSandbox` event-handler.
There's no really compelling reason, as far as I can tell, to introduce the `ENABLE_SCRIPTING` build-target, instead of simply re-using the existing `TESTING` build-target for the new `gulp integrationtest` task.
In general there should be no problem with just always enable scripting in TESTING-builds, and if I were to *guess* the reason that this didn't seem to work was most likely because the Preferences ended up over-writing the `AppOptions`.
As it turns out the GENERIC-viewer has already has built-in support for disabling of Preferences, via the `AppOptions`, and this can be utilized in TESTING-builds as well to ensure that whatever `AppOptions` are set they're always respected.
For DOM events all event names are lower-case, and the newly added PDF.js scripting-events thus "stick out" quite a bit. Even more so, considering that our internal `eventBus`-events follow the same naming convention.
Hence this patch, which changes the "updateFromSandbox"/"dispatchEventInSandbox" events to be lower-case instead.
Furthermore, using DOM events for communication *within* the PDF.js code itself (i.e. between code in `web/app.js` and `src/display/annotation_layer.js/`) feels *really* out of place.
That's exactly the reason that we have the `EventBus` abstraction, since it allowed us to remove prior use of DOM events, and this patch thus re-factors the code to make use of the `EventBus` instead for scripting-related events.
Obviously for events targeting a *specific element* using DOM events is still fine, but the "updatefromsandbox"/"dispatcheventinsandbox" ones should be using the `EventBus` internally.
*Drive-by change:* Use the `BaseViewer.currentScaleValue` setter unconditionally in `PDFViewerApplication._initializeJavaScript`, since it accepts either a string or a number.
- Actually remove the `isDown` property when destroying the scripting-instance.
- Mark all `mouseState` usage as "private" in the various classes.
- Ensure that the `AnnotationLayer` actually treats the parameter as properly *optional*, the same way that the viewer components do.
- For now remove the `mouseState` parameter from the `PDFPageView` class, and keep it only on the `BaseViewer`, since it's questionable if all of the scripting-functionality will work all that well without e.g. a full `BaseViewer`.
- Append the `mouseState` to the JSDoc for the `AnnotationElement` class, and just move its definition into the base-`AnnotationElement` class.
* the goal is to execute actions like Open or OpenAction
* can be tested with issue6106.pdf (auto-print)
* once #12701 is merged, we can add page actions