Given that we're (ab)using spread-modes in order to ensure that pages are centered *vertically* in PresentationMode, this re-factoring simplifies the code slightly.
Furthermore, in the event that we *possibly* want to try and support spread-modes in PresentationMode[1] this re-factoring will also prevent future duplication.
---
[1] Note that I'm not particularly keen on doing that, since documents with varying page sizes will be annoying to handle.
There's no point in having these variables defined (implicitly) as `undefined` in e.g. the Firefox PDF Viewer.
By defining them with `var` and using ESList ignores, rather than `let`, we can move them into the relevant pre-processor block instead. Note that since the entire viewer-code is placed, by Webpack, in a top-level closure these variables will thus *not* become globally accessible.
Given that `webViewerOpenFileViaURL` only has a single call-site, and also isn't a particularly large/complex function, it doesn't seem necessary for this to be a separate function and hence it's simply inlined instead.
Also, changes the "no valid build-target was set"-case to throw unconditionally since the only way that it could ever be hit is if there are bugs in the `gulpfile`-code.
Currently we're using *both* ids and classes when specifying the button icons, which seems completely unnecessary and only serves to bloat the CSS code.
In general you always need to be careful about CSS specificity, but in these cases that should not actually be a problem.
Also, while not a button, this patch makes a similar simplification for the `pageNumber`-input.
Note how in the `viewer.html` file we're specifying class names for most buttons, despite that not really being necessary. First of all, in many cases those class names are *identical* to the element-ids. Secondly, looking through the CSS rules they are only ever used when specifying button icons.
All-in-all, we should be able to simplify the HTML-code and use the element-ids in the CSS rules instead. Obviously ids have a higher CSS specificity than classes, but given how the old classes were being used that shouldn't be a problem here.
Also, the patch generalizes the styling for buttons (e.g. `viewBookmark`) that are *actually* link-elements.
Finally, while slightly unrelated, this patch also removes a little bit of duplication when specifying the border for `toolbarField`s.
Searching through all of the files in the `web/`-folder has no *other* hits for the "textButton" string. Hence it's clear that there are no DOM elements actually using this class, and it's thus dead code.
This re-factors the various toolbar separators to *explicitly* specify both their dimensions and margins. Also, for the `horizontalToolbarSeparator`-class we can just set the `background-color` rather than using `border-top`.
Note that the `splitToolbarButtonSeparator`-class currently sets a number of unnecessary CSS rules, since as mentioned by the Firefox Devtools both the `display`- and `z-index`-properties are being ignored because `float` is used.
Finally, there's also no need to set a `z-index` for some of the `:hover`-rules. It's possible that this *was* necessary before the re-design, since the buttons had borders then.
Note how both of the openFile-buttons are always hidden during viewer initialization in the MOZCENTRAL build, i.e. the *built-in* Firefox PDF Viewer. Despite that we still include HTML, CSS, and JavaScript code for these buttons in the build.
This patch *reduces* the size of the `gulp mozcentral` output by `1679` bytes, which isn't a lot but still cannot hurt.
These rules became unnecessary with PR 7697, over five years ago, since printing is now done from a `printContainer`-element rather than "directly" using the viewer.
Note how the *entire* `outerContainer`, which contains all of the DOM elements that were being manually hidden, is now being hidden during printing. Furthermore, note also how the print-canvases/images and their containers are using custom CSS-classes[1] rather than re-using the ones from the viewer.
---
[1] See the `printedPage` respectively `xfaPrintedPage` classes.
Rather than modifying the leading/trailing `margin` on the actual toolbar buttons, to achieve appropriate spacing at the left/right edges of the toolbar(s), it seems much more appropriate (and simpler) to just specify an explicit `padding` for the relevant toolbar containers.
Also, for toolbar buttons placed in `splitToolbarButton`-classes we can reduce some complexity around setting the `margin` (since it should always be zero now).
With these changes, we're thus able to get rid of a couple more `!important`-rules.
- Remove a redundant `margin-top` rule for the `.dropdownToolbarButton`. After the (somewhat) recent UI-refresh all buttons now use `margin: 2px 1px;`, which renders the override unnecessary (and getting rid of an `!important`-rule can't hurt).
- Combine two `.toggled::before` rules, since they're identical.
Note that both the `errorWrapper` HTML and JavaScript code is being ignored in the MOZCENTRAL build, i.e. the *built-in* Firefox PDF Viewer, however the CSS rules are still being included.
That seems totally unnecessary, and while we currently don't have full build-target support in the CSS pre-processor we can actually improve things quite easily anyway. By (ab)using the existing CSS pre-processor, which will remove any non-Firefox CSS rules for the MOZCENTRAL build, it's possible to easily stop bundling any CSS rules by using comments that include a `-webkit`-string.
*Please note:* To easily test that this doesn't break the `errorWrapper` in GENERIC builds, try running e.g. `PDFViewerApplication._otherError("test");` in the web-console.
This special-case was added because the original PresentationMode-implementation used some CSS-tricks to hide everything except the current page. With the changes in PR 14112, which added a PAGE scroll-mode, many of the old PresentationMode-specific hacks could thus be removed from both the JS and CSS code.
This patch is yet another (small) clean-up step, to reduce the number of PresentationMode special-cases used throughout the `BaseViewer`. Note in particular that `BaseViewer.update` now works just fine in PresentationMode[1], and that we only need to ensure that the active page won't *accidentally* change because of the PresentationMode-specific zooming that occurs during page-switching.
---
[1] In the event that we ever want to try and support spread-modes in PresentationMode, which I'm really not keen on doing since documents with varying page sizes will be annoying to handle, these changes would be necessary as well.
The styling of the previous/next-buttons and the findInput, with the elements being "glued" together, was supposed to mimic the styling used in the Firefox *browser* findbar. However, after the most recent re-styling of the Firefox browser UI these elements are now visually separated.
Hence it makes sense, as far as I'm concerned, to try and follow this styling for the findbar used in the GENERIC viewer. One benefit of doing this is that we get more consistent styling, since the buttons now look/behave identically in both the main toolbar and the findbar. Additionally this also simplifies the CSS a bit, since a lot of the existing findbar-specific rules can be removed.
The spread-mode code in `BaseViewer.#ensurePageViewVisible`-method was initially copied from the `BaseViewer._updateSpreadMode`-method, which means that it's slightly more complicated than actually necessary.
In particular, in the PAGE scroll-mode there can only be *one* spread active at a time and we thus don't need to handle insertion of multiple spread-divs.
Given that no HTML element has used the `loadingBox`-id for many years, we obviously don't need to try and hide a non-existent element during printing.
Furthermore, we also shouldn't need to change the `overflow`-value for the `viewerContainer`-element during printing. Originally, many years ago now, we printed "directly" using the viewer and then this apparently made sense.
Given that none of these CSS rules are used at all, unless debugging is enabled, it seems completely unnecessary to load them *unconditionally* for all users.[1]
Note that if *both* the `textLayer` and `pdfBug` debugging hash-parameters are specified simultaneously, we'll now load the `PDFBug`-file *twice* (since the code is simpler that way). However, given first of all that none of this is enabled by default and secondly that using those parameters together isn't helpful[2], potentially loading that file twice is hopefully not an issue.
For the `gulp mozcentral` target, the size of the *built* `viewer.css` file is reduced `> 3%` with this patch.
---
[1] For the Firefox built-in PDF Viewer, in order to even be able to access the `PDFBug` functionality, you need to first of all set `pdfjs.pdfBugEnabled = true` manually in `about:config`. Secondly, you then also need to append the `pdfBug=...` hash-parameter to the URL when *initially* loading the document.
[2] Note how the `textLayer`-settings are already, since essentially forever, overriding the highlighting-features of the "FontInspector"-tab.
This CSS variable is only used together with the `annotationCanvasMap`-functionality in the canvas-code, however its value can be *trivially* computed by using the older `--zoom-factor` CSS variable together with the `PixelsPerInch`-structure.
Rather than having *two different* CSS variables that are this closely linked, it seems better to simplify things by using just one CSS variable instead.
According to the CSS, there should be a visible "divider" after the "Page Width" zoom-option. However, this is being ignored in both Mozilla Firefox[1] and Google Chrome hence the rule is effectively useless now.
Furthermore, the "custom" zoom-option is already being `hidden` using the attribute (in the HTML code) and there should thus be no reason to duplicate this in the CSS as well.
---
[1] Support for *detailed* styling of `<select>`-elements was removed as part of the E10s project.
With just a couple of exceptions, for the `thumbnailView`, all of the sidebarViews share the same basic styling which thus allows for some simplification.
- For the findbar/secondaryToolbar case, the `min-width` rule doesn't really make sense since it's way too small to be useful. Furthermore, the findbar is already specifying its own `min-width` and the secondaryToolbar will (thanks to its buttons) receive a correct/useful width.
- The pageNumber-input already has an *explicit* `width` set, hence setting the `min-width` rule as well is completely unnecessary.
- The treeItem-links are supposed to *compute* their `min-width`, and the static value was only added as a fallback for older browsers without `calc()` support.
In a couple of spots, mostly related to the debugging tools, we're unnecessarily using a somewhat "complex" `background`-format only to specify a solid color. This can be simplified by using `background-color` instead, and the patch also removes a `color`-rule that's being ignored anyway.
After the changes in https://bugzilla.mozilla.org/show_bug.cgi?id=1761839, we no longer need this CSS work-around to prevent the entire `<dialog>` contents from becoming selected when the backdrop is clicked.
When the viewer becomes narrow enough that the sidebar is overlaying the document, which means that the `viewerContainer` is not moving when opening/closing the sidebar, we're currently not removing the `sidebarMoving` CSS class as intended.
While this doesn't cause any *visible* issues, it's nonetheless wrong and should be fixed.
With the changes in PR 8993, a number of the `@media`-related CSS rules became unnecessary. However, it appears that some of these rule were *accidentally* left behind despite being unused now.
Note that previously, when opening the sidebar shifted the position of the main toolbar, we had to take both the sidebar opened *and* closed cases into account in these `@media` rules.
*This is yet another installment in a never-ending series of patches that attempt to simplify and improve old code.*
The `fileInput`-element is used to support the "Open file"-button in the `GENERIC` viewer, however this is very old code.
Rather than creating the element dynamically in JavaScript, we can simply define it conditionally in the HTML code thanks to the pre-processor. Furthermore, the `fileInput`-element currently has a number of unnecessary CSS rules, since the element is *purposely* never made visibly.
Note that with these changes, the `fileInput`-element will now *always* have `display: none;` set. This shouldn't matter, since we can still trigger the `click`-event of the element just fine (via JavaScript) and this patch has been successfully tested in both Mozilla Firefox and Google Chrome.
With the changes in the previous patch, we can simplify the state-tracking by using the `PresentationModeState`-values directly in the `PDFPresentationMode` class.
The `pdfOpenParams` parameter has never really made sense in PresentationMode, since e.g. the zoom-value doesn't (generally) agree with the value chosen by the user prior to entering PresentationMode.
This has never mattered all that much, since the `viewBookmark`-button isn't visible in PresentationMode (nor is any other toolbar button for that matter). However, in the `PDFHistory`-implementation we're currently forced to handle this case specifically since we don't want to populate the browser history with nonsensical state.
Hence it makes overall sense, as far as I'm concerned, to tweak the "updateviewarea" event to include a *simplified* `pdfOpenParams` parameter when PresentationMode is active. Given that the `viewer components` don't include PresentationMode functionality, this change thus shouldn't matter for third-party users.
This method was originally added specifically to work-around bugs/issues related to PresentationMode in Google Chrome. Note that prior to PR 14112 we were using some CSS hacks to only show the current page in PresentationMode, and that could lead to the `getVisibleElements` function not always finding the correct elements.
However, after the changes in PR 14112 we're now using the Page-scrolling mode in PresentationMode and consequently there'll only be *a single* page visible at a time. Hence then `BaseViewer._getCurrentVisiblePage` helper method should no longer be needed, and when testing (locally) in Google Chrome everything seems to work correctly now.
The various functionality in `web/debugger.js` is currently *indirectly* added to the global scope, since that's how `var` works when it's used outside of any functions/closures.
Given how this functionality is being accessed/used, not just in the viewer but also in the API and textLayer, simply converting the entire file to a module isn't trivial[1]. However, we can at least export the `PDFBug`-part properly and then `import` that dynamically in the viewer.
Also, to improve the code a little bit, we're now *explicitly* exporting the necessary functionality globally.
According to MDN, see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/import#browser_compatibility, all the browsers that we now support have dynamic `imports` implementations.
---
[1] We could probably pass around references to the necessary functionality, but given how this is being used I'm just not sure it's worth the effort. Also, adding *official* support for these viewer-specific debugging tools in the API feels both unnecessary and unfortunate.
Please note that this patch is purposely quite basic, e.g. it doesn't add the polyfill-CSS in order to simplify the build process, and things such as `::backdrop` thus isn't working.
However, this patch does ensure that older browsers can at least still *access* all of the previous overlays and that things like e.g. opening of password-protected documents respectively printing still works.
This will hopefully improve the a11y a little bit in these dialogs, however there's most definately more things that can be done here (by someone more knowledgeable about a11y).
This way we're able to store the `<dialog>` elements directly, which removes the need to use manually specified name-strings thus simplifying both the `OverlayManager` itself and its calling code.
This replaces our *custom* overlays with standard `<dialog>` DOM elements, see https://developer.mozilla.org/en-US/docs/Web/HTML/Element/dialog, thus simplifying the related CSS, HTML, and JavaScript code.
With these changes, some of the functionality of the `OverlayManager` class is now handled natively (e.g. `Esc` to close the dialog). However, since we still need to be able to prevent dialogs from overlaying one another, it still makes sense to keep this functionality (as far as I'm concerned).
PDFScriptingManager uses the `mousedown` and `mouseup` listeners to keep
track of whether the mouse pointer is pressed in the `isDown` flag.
These listeners were registered to run during the bubbling phase of the
event dispatch, which can be interrupted if any of the previous event
listeners stopped the event propagation. An example of that is by
`GrabToPan` in web/grab_to_pan.js.
Since the mousedown (and mouseup) listeners of PDFScriptingManager are
free of side effects, and the intention is to always run them, it makes
most sense to register them with the capture flag.
After the recent round of patches, I figured that we'd gone as far as possible in replacing `dir`-dependent CSS rules for the viewer.
However, it occurred that me that we could actually use a bit of CSS-trickery to get rid of the remaining ones. More specifically, this was done by defining a CSS variable whose value depends on the document direction and then using that variable together with `calc()` in the affected rules.
*Please note:* I suppose that this could perhaps be seen as a bit too "magical", hence I understand if this patch is ultimately rejected, however this is probably the only simple way to get rid of the remaining `dir`-dependent CSS rules.
The old code would allow an overlay to force close *itself*, before immediately re-opening itself, which actually isn't very helpful in practice since that won't re-run any overlay-specific initialization code.
Given how the overlays are being used this really shouldn't have caused any issues, but it's a bug that we should fix nonetheless.
*Please note:* This is another step in what will, time permitting, become a series of patches to simplify/modernize the viewer CSS.
Rather than having to manually specify ltr/rtl-specific float-values in the CSS, we can use logical `inline-start`/`inline-end` instead (and similar for some related left/right occurrences).
These logical properties depend on, among other things, the direction of the HTML document which we *always* specify in the viewer.
Given that most of these logical CSS properties are fairly new, and that cross-browser support is thus somewhat limited (see below), we rely on PostCSS plugins in order to support this in the GENERIC viewer.
- https://developer.mozilla.org/en-US/docs/Web/CSS/float#browser_compatibility
- https://developer.mozilla.org/en-US/docs/Web/CSS/inset-inline-end#browser_compatibility
Currently we're resolving the Promises in the `_extractTextPromises` Array with the page-index, despite that not really being necessary since the Promises in the Array are explicitly inserted in the correct order.
Furthermore, we can replace the standard `for`-loop with a `for...of`-loop which results in ever so slightly more compact code.
Given that none of these methods were ever intended to be accessed directly from the outside, we can use modern ECMAScript features to ensure that they are indeed private.
This patch also makes `fieldData` private, to remove the old hack used to prevent it from being modified from the outside.
Given that we're now *building* the `web/viewer.css` file used in the development viewer, i.e. with `gulp server`, we no longer need to hard-code these `-webkit`-prefixed rules and can instead let Autoprefixer handle that for us.
To allow using modern CSS features that currently only Mozilla Firefox supports[1], while still enabling development/testing in recent Google Chrome versions, we'll have to start building the `web/viewer.css` file with `gulp server` as well.
In my testing, building the development CSS (and copying the images) takes *less than* `200 ms` on average which is hopefully an acceptable overhead for this sort of feature.
---
[1] In particular `float`, with `inline-start`/`inline-end` values.
This patch fixes an old inconsistency, when using `BasePreferences.{reset, set}`, where the internal Preference values would be kept even if writing them to storage failed.
Given that none of these fields were ever intended to be accessed directly from the *outside*, since that will lead to inconsistent/broken state, we can use modern ECMAScript features to ensure that they are indeed private.
*Please note:* I don't really know anything about a11y, hence it's possible that this patch either doesn't work correctly or at least isn't a complete solution.
In both the SecondaryToolbar and the Sidebar we have "button groups" that functionally acts essentially like radio-buttons. Based on skimming through [this MDN article](https://developer.mozilla.org/en-US/docs/Web/Accessibility/ARIA/Roles/radio_role) it thus appears that we should tag them as such, using `role="radiogroup"` and `role="radio"`, and then utilize the `aria-checked` attribute to indicate to a11y software which button is currently active.
Unfortunately simply using `appearance: textfield;`, or even `-webkit-appearance: textfield;`, doesn't actually hide the "spinner" in number-input elements in e.g. Google Chrome.
Hence we need to use a work-around with the `-webkit-inner-spin-button` rule, however in our CSS code we also have `-webkit-outer-spin-button` currently. According to both [the MDN compatibility data](https://developer.mozilla.org/en-US/docs/Web/CSS/::-webkit-outer-spin-button#browser_compatibility) and also manual testing in Google Chrome Beta 99, the `-webkit-outer-spin-button` rule is no longer necessary and we can thus clean-up the CSS a tiny bit.
This was added in PR 4516 specifically for Safari on iOS devices, but according to MDN it should no longer be necessary now; see https://developer.mozilla.org/en-US/docs/Web/CSS/-webkit-overflow-scrolling#browser_compatibility
According to the MDN compatibility data, this CSS feature:
- Was never implemented anywhere *except* for Safari on iOS.
- Was never standardized and thus never existed in an *unprefixed* version.
- Has now been removed, starting with Safari version 13.
Given that [the FAQ](https://github.com/mozilla/pdf.js/wiki/Frequently-Asked-Questions#faq-support) already lists Safari as "Mostly" supported, and that the default viewer is written primarily for Mozilla Firefox, it ought to be fine to remove these CSS rules now.
At this point in time, after recent rounds of clean-up, the `webkit`-prefixed Fullscreen API is the only remaining *browser-specific* compatibility hack in the `web/`-folder JavaScript code.
The standard, and thus unprefixed, Fullscreen API has been supported for *over three years* in both Mozilla Firefox and Google Chrome. [According to MDN](https://developer.mozilla.org/en-US/docs/Web/API/Fullscreen_API#browser_compatibility), the unprefixed Fullscreen API has been available since:
- Mozilla Firefox 64, released on 2018-12-11; see https://wiki.mozilla.org/Release_Management/Calendar#Past_branch_dates
- Google Chrome 71, released on 2018-12-04; see https://en.wikipedia.org/wiki/Google_Chrome_version_history
Hence *only* Safari now requires using a prefixed Fullscreen API, and it's thus (significantly) lagging behind other browsers in this regard.
Considering that the default viewer is written *specifically* to be the UI for the Firefox PDF Viewer, and that we ask users to not just use it as-is[1], I think that we should only support the standard Fullscreen API now.
Furthermore, note also that the FAQ already lists Safari as "Mostly" supported; see https://github.com/mozilla/pdf.js/wiki/Frequently-Asked-Questions#faq-support
---
[1] Note e.g. http://mozilla.github.io/pdf.js/getting_started/#introduction
> The viewer is built on the display layer and is the UI for PDF viewer in Firefox and the other browser extensions within the project. It can be a good starting point for building your own viewer. *However, we do ask if you plan to embed the viewer in your own site, that it not just be an unmodified version. Please re-skin it or build upon it.*
Now that the there's ECMAScript support for properly private methods on `class`es, we can use that instead and thus remove all of the `@private` JSDocs comments.
In some cases, in the `PDFPageView` implementation, we're modifying the `sx`/`sy` properties when CSS-only zooming is being used.
Currently this requires that you remember to *manually* update the `scaled` property to prevent issues, which doesn't feel all that nice and also seems error-prone. By replacing the `scaled` property with a getter, this is now handled automatically instead.
The `CanvasRenderingContext2D.backingStorePixelRatio` property was never standardized, and only Safari set (its prefixed version of) it to anything other than `1`.
Note that e.g. MDN doesn't contain any information about this property, and one of the few sources of information (at this point) is the following post: https://stackoverflow.com/questions/24332639/why-context2d-backingstorepixelratio-deprecated
Hence we can simplify the `getOutputScale` helper function, by removing some dead code, and now it no longer requires any parameters when called.
Given that the `Navigator` interface has been available since "forever", please see https://developer.mozilla.org/en-US/docs/Web/API/Navigator#browser_compatibility, it's somewhat difficult to see why these checks are actually necessary since the viewer is only intended for usage in browsers.
Looking at the history of the code, this functionality was originally placed in the general `src/shared/compatibility.js` file which could thus run in e.g. worker-threads and Node.js environments (where the `Navigator` interface isn't available).
- it aims to fix#14562;
- 'X-\n' were not correctly positioned;
- when X is a diacritic (e.g. in "sä-\n", which is decomposed into "sa¨-\n") we must handle both things:
- diacritics on the one hand;
- "-\n" on the other hand.
This is essentially a *continuation* of PR 7926, where we added support for rejecting the current `PDFDocumentLoadingTask`-promise by throwing inside of the `onPassword`-callback.
Hence the naive way to address [bug 1754421](https://bugzilla.mozilla.org/show_bug.cgi?id=1754421) would be to simply throw in the `onPassword`-callback used in the default viewer. However it unfortunately turns out to not work, since the password input/validation is asynchronous, and we thus need another approach.
The simplest solution that I can come up with here, is thus to *extend* the `onPassword`-callback to also reject the current `PDFDocumentLoadingTask`-instance if an `Error` is explicitly passed as the input to the callback function. (This doesn't feel great, but I cannot see a better solution that isn't really complicated.)
Note that the *browser* findbar in Firefox uses "Title Case" for the labels, and it thus seem like a good idea to ensure that `PDFFindBar` in consistent with that.
Furthermore, the new label added in PR #13261 uses the "Title Case" format which means that currently the default viewer findbar looks inconsistent.
*Please note:* Based on the official Firefox localization docs, see https://firefox-source-docs.mozilla.org/l10n/overview.html#string-updates, changing only the casing should *not* require updating the key:
> 1) If the change is minor, like fixing a spelling error or case, the developer should update the en-US translation without changing the l10n-id.
When the viewer becomes narrow, the `PDFFindBar` will (forcibly) wrap its elements to prevent it from extending to the full window width.
Currently, after PR 13261, this now leads to the `findResultsCount` span taking up vertical space *unconditionally* when the findbar is wrapped. To avoid a blank space being shown in this case, before searching has begun, place the `findResultsCount` span in a "message" rather than an "options" container.
- get original index in using a dichotomic seach instead of a linear one;
- normalize the text in using NFD;
- convert the query string into a RegExp;
- replace whitespaces in the query with \s+;
- handle hyphens at eol use to break a word;
- add some \s* around punctuation signs
With these changes, we'll now *always* replace all whitespaces with standard spaces (0x20). This behaviour is already, since many years, the default in both the viewer and the browser-tests.
This `disableCreateObjectURL` option was originally introduced all the way back in PR 4103 (eight years ago), in order to work-around `URL.createObjectURL()`-bugs specific to Internet Explorer.
In PR 8081 (five years ago) the `disableCreateObjectURL` option was extended to cover Google Chrome on iOS-devices as well, since that configuration apparently also suffered from `URL.createObjectURL()`-bugs.[1]
At this point in time, I thus think that it makes sense to re-evaluate if we should still keep the `disableCreateObjectURL` option.
- For Internet Explorer, support was explicitly removed in PDF.js version `2.7.570` which was released one year ago and all IE-specific compatibility code (and polyfills) have since been removed.
- For Google Chrome on iOS-devices, while we still "support" such configurations, it's *not* the focus of any development and platform-specific bugs are thus often closed as WONTFIX.
Note here that at this point in time, the `disableCreateObjectURL` option is *only* being used in the viewer and any `URL.createObjectURL()`-bugs browser/platform bugs will thus not affect the main PDF.js library. Furthermore, given where the `disableCreateObjectURL` option is being used in the viewer the basic functionality should also remain unaffected by these changes.[2]
Furthermore, it's also possible that the `URL.createObjectURL()`-bugs have been fixed in *browser* itself since PR 8081 was submitted.[3]
Obviously you could argue that this isn't a lot of code, w.r.t. number of lines, and you'd be technically correct. However, it does add additional complexity in a few different viewer components which thus add overhead when reading and working with this code.
Finally, assuming the `URL.createObjectURL()`-bugs are still present in Google Chrome on iOS-devices, I think that we should ask ourselves if it's reasonable for the PDF.js project (and its contributors) to keep attempting to support a configuration if the *browser* developers still haven't fixed these kind of bugs!?
---
[1] According to https://groups.google.com/a/chromium.org/forum/#!topic/chromium-html5/RKQ0ZJIj7c4, which is linked in PR 8081, that bug was mentioned/reported as early as the 2014 (eight years ago).
[2] Viewer functionality such as e.g. downloading and printing may be affected.
[3] I don't have access to any iOS-devices to test with.
So that Firefox doesn't switch to pixel mode for compat with other
browsers.
This should fix https://github.com/mozilla/pdf.js/issues/14476, in terms
of restoring the previous behavior.
We probably want to change the pixel-based scrolling code to not scroll
so much (the deltaMode stuff normalizes to +/-1 tick for each wheel
event, perhaps the pixel-based value should do the same).
- it aims to fix issue #14307;
- this event has been added recently in Firefox and we can now use it;
- fix few bugs in aform.js or in annotation_layer.js;
- add some integration tests to test keystroke events (see `AFSpecial_Keystroke`);
- make dispatchEvent in the quickjs sandbox async.
*Please note:* This is a tentative patch, since I don't know if this is deemed important enough to fix.
The new event could be seen as a *supplement* to the existing "documentinit" and "documentloaded" events, but for the case when a PDF document fails to load.
To make the "documenterror" event generally useful, it'll include both the localized error message as well as the original reason for the error (when that exists).
This helper function has never been used in e.g. the worker-thread, hence its placement in `src/shared/util.js` led to a *small* amount of unnecessary duplication.
After the previous patches this helper function is now *only* used in the viewer, hence it no longer seems necessary to expose it through the official API.
*Please note:* It seems somewhat unlikely that third-party users were relying *directly* on this helper function, which is why it's not being exported as part of the viewer components. (If necessary, we can always change this later on.)
As part of the changes/improvement in PR 14092, we're no longer using the `addLinkAttributes` directly in e.g. the AnnotationLayer-code.
Given that the helper function is now *only* used in the viewer, hence it no longer seems necessary to expose it through the official API.
*Please note:* It seems somewhat unlikely that third-party users were relying *directly* on the helper function, which is why it's not being exported as part of the viewer components. (If necessary, we can always change this later on.)
This structure contains *almost* exclusively references to DOM elements (and a couple of simple strings), rather than complete classes/functions. Hence the `eventBus`-option sticks out a fair bit, and I'd guess that it's *mostly* unused in e.g. third-party implementations.
Given that we, in multiple places, mention that the default viewer shouldn't be used as-is I really don't think that we need to keep this special `eventBus`-option around. Furthermore, nowadays it's also a lot easier to (safely) access the existing `EventBus`-instance in the viewer; see https://github.com/mozilla/pdf.js/wiki/Third-party-viewer-usage#initialization-promise which shows how to listen for the default viewer being initialized (and its `eventBus` thus being available).
Besides converting `Catalog.getPageDict` to an `async` method, thus simplifying the code, this patch also allows us to pro-actively fix a existing issue.
Note how we're looking up References in such a way that `MissingDataException`s won't cause trouble, however it's *technically possible* that the entries (i.e. /Count, /Kids, and /Type) in a /Pages Dictionary could actually be indirect objects as well. In the existing code this could lead to *some*, or even all, pages failing to load/render as intended.
In practice that doesn't *appear* to happen in real-world PDF documents, but given all the weird things that PDF software do I'd prefer to fix this pro-actively (rather than waiting for a bug report).
With `Catalog.getPageDict` being `async` this is now really simple to address, however I didn't want to introduce a bunch more *unconditional* asynchronicity in this method if it could be avoided (since that could slow things down). Hence we'll *synchronously* lookup the *raw* data in a /Pages Dictionary, and only fallback to asynchronous data lookup when a Reference was encountered.
In addition to the above, this patch also makes the following notable changes:
- Let `Catalog.getPageDict` *consistently* reject with the actual error, regardless of what data we're fetching. Previously we'd "swallow" the actual errors except when looking up Dictionary entries, which is inconsistent and thus seem unfortunate. As can be seen from the updated unit-tests this change is API-observable, hence why the patch is tagged `[api-minor]`.
- Improve the consistency of the Dictionary /Type-checks in both the `Catalog.getPageDict` and `Catalog.getAllPageDicts` methods.
In `Catalog.getPageDict` there's a fallback code-path where we're *incorrectly* checking the /Page Dictionary for a /Contents-entry, which is wrong since a /Page Dictionary doesn't need to have a /Contents-entry in order to be valid.
For consistency the `Catalog.getAllPageDicts` method is also updated to handle errors in the /Type-lookup correctly.
- Reduce the `PagesCountLimit.PAUSE_EAGER_PAGE_INIT` viewer constant, to further improve loading/rendering performance of the *second* page during initialization of very long documents; PR 14359 follow-up.
*This addresses the following case missing from the previous patch:*
The viewer is loaded in an *active* window/tab, and enough time is allowed to pass in order to allow rendering to start. However, if the user then switches to another tab (or another program) *before* rendering has finished, the "load" event also needs to be unblocked.
Given that `requestAnimationFrame` is being used, see the `src/diplay/api.js` file, an inactive window/tab means that rendering will not run and we'll thus not fetch all pages. The latter is a requirement for the "load" event to be unblocked, in the MOZCENTRAL-version of, the default viewer.
This patch is a *partial* solution, since it only addresses the following situations:
- A *background* tab (containing the viewer) is reloaded, e.g. via the tab-bar context menu.
- The viewer is loaded in a active tab, but the user switches away from it (or switches to another program window) *before* rendering has started.
This provides a work-around for badly generated PDF documents that contain *negative* /FitH parameters (in the referenced issue the value `-32768` is used).
This patch, first of all, removes circular dependencies in the TypeScript definitions. Secondly, it also moves `RenderingStates` into `web/ui_utils.js` to break another type-dependency and directly use the `XfaLayerBuilder` during XFA-printing.
Finally, note that this patch *slightly* reduces the size of the default viewer (e.g. in the `MOZCENTRAL` build) by not having to bundle code which is completely unused.
This patch circumvents the issues seen when trying to update TypeScript to version `4.5`, by "simply" fixing the broken/missing JSDocs and `typedef`s such that `gulp typestest` now passes.
As always, given that I don't really know anything about TypeScript, I cannot tell if this is a "correct" and/or proper way of doing things; we'll need TypeScript users to help out with testing!
*Please note:* I'm sorry about the size of this patch, but given how intertwined all of this unfortunately is it just didn't seem easy to split this into smaller parts.
However, one good thing about this TypeScript update is that it helped uncover a number of pre-existing bugs in our JSDocs comments.
The size of the `web/ui_utils.js` file has increased over time, as more code has been added to (or moved into) that file. To reduce its size slightly, this patch moves the event-related functionality into a separate file.
In PR 14114 this was only added to the default viewer, which means that in the viewer components the user would need to *manually* implement /Lang handling. This was (obviously) a bad choice, since the viewer components already support e.g. structTrees by default; sorry about overlooking this!
To avoid having to make *two* `getMetadata` API-calls[1] very early during initialization, in the default viewer, the API will now cache its result. This will also come in handy elsewhere in the default viewer, e.g. by reducing parsing when opening the "document properties" dialog.
---
[1] This not only includes a round-trip to the worker-thread, but also having to re-parse the /Metadata-entry when it exists.
By making this API-call *unconditionally*, we introduce a (slight) delay in the initialization of *all* documents.
That seems quite unfortunate, since `pdfjs.enablePermissions` is off by default, and it thus seem better only do the API-call when actually needed; sorry about this!
For encrypted PDF documents without the required permissions set, this patch adds support for disabling of form editing. However, please note that it also requires that the `pdfjs.enablePermissions` preference is set to `true`[1] (since PDF document permissions could be seen as user hostile).
Based on https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G6.1942134, this condition hopefully makes sense.
---
[1] Either manually with `about:config`, or using e.g. a [Group Policy](https://github.com/mozilla/policy-templates).
This patch is essentially *another* continuation of PR 11263, which tried to improve loading/initialization performance of *very* large/long documents.
For most documents, unless they're *very* long, we'll eagerly initialize all of the pages in the viewer. For shorter documents having all pages loaded/initialized early provides overall better performance/UX in the viewer, however there's cases where it can instead *hurt* performance.
For documents with a couple of thousand pages[1], the parsing and pre-rendering of the *second* page of the document can be delayed (quite a bit). The reason for this is that we trigger `PDFDocumentProxy.getPage` for *all pages* early during the viewer initialization, which causes the worker-thread to be swamped with handling (potentially) thousands of `getPage`-calls and leaving very little time for other parsing (such as e.g. of operatorLists).
To address this situation, this patch thus proposes temporarily "pausing" the eager `PDFDocumentProxy.getPage`-calls once a threshold has been reached, to give the worker-thread a change to handle other requests.[2]
Obviously this may *slightly* delay the "pagesloaded" event in longer documents, but considering that it's already the result of asynchronous parsing that'll hopefully not be seen as a blocker for these changes.[3]
---
[1] A particularly problematic example is https://github.com/mozilla/pdf.js/files/876321/kjv.pdf (16 MB large), which is a document with 2236 pages and a /Pages-tree that's only *one* level deep.
[2] Please note that I initially considered simply chaining the `PDFDocumentProxy.getPage`-calls, however that'd slowed things down for all documents which didn't seem appropriate.
[3] This patch will *hopefully* also make it possible to re-visit PR 11312, since it seems that changing `Catalog.getPageDict` to an `async` method wasn't the problem in itself. Rather it appears that it leads to slightly different timings, thus exacerbating the already existing issues with the worker-thread being overloaded by `getPage`-calls.
Having recently worked with that method, there's a couple of (very old) issues that I'd also like to address and having `Catalog.getPageDict` be `async` would simplify things a great deal.
These changes improves the consistency ever so slightly in the `PDFOutlineViewer._dispatchEvent` method, by making sure that we can tell the following two cases apart:
- The "pagesloaded" event has *not yet* been fired.
- The "pagesloaded" event has been fired, but no pages were available.
*This patch can be tested e.g. with the `poppler-85140-0.pdf` document from the test-suite.*
For some sufficiently corrupt documents the `getDocument` call will succeed, but fetching even the very first page fails. Currently we only print error messages (in the console) from the `{BaseViewer, PDFThumbnailViewer}.setDocument` methods, but don't actually provide these errors to allow the viewer to handle them properly.
In practice this means that the GENERIC viewer won't display the `errorWrapper`, and in the MOZCENTRAL viewer the *browser* loading indicator is never hidden (since we never unblock the "load" event).
This patch is essentially a continuation of PR 11263, which tried to improve loading/initialization performance of *very* large/long documents.
Note that browsers, in general, don't handle a huge amount of DOM-elements very well, with really poor (e.g. sluggish scrolling) performance once the number gets "large". Furthermore, at least in Firefox, it seems that DOM-elements towards the bottom of a HTML-page can effectively be ignored; for the PDF.js viewer that means that pages at the end of the document can become impossible to access.
Hence, in order to improve things for these *very* large/long documents, this patch will now enforce usage of the (recently added) PAGE-scrolling mode for these documents. As implemented, this will only happen once the number of pages *exceed* 15000 (which is hopefully rare in practice).
While this might feel a bit jarring to users being *forced* to use PAGE-scrolling, it seems all things considered like a better idea to ensure that the entire document actually remains accessible and with (hopefully) more acceptable performance.
Fixes [bug 1588435](https://bugzilla.mozilla.org/show_bug.cgi?id=1588435), to the extent that doing so is possible since the document contains 25560 pages (and is 197 MB large).
This was added on the assumption that the viewer would (eventually) start using the `PDFSinglePageViewer` for e.g. PAGE-scrolling mode and PresentationMode. However, having both a `PDFViewer` and a `PDFSinglePageViewer` side-by-side in the viewer would've been tricky to implement well, which is why PR 14112 implemented PAGE-scrolling for the general `BaseViewer` instead.
Given that the default viewer is no longer (potentially) going to use `PDFSinglePageViewer`, there's code in the `SecondaryToolbar` (and related CSS rules) which is now unnecessary.
*This patch can be tested e.g. with the `sizes.pdf` document in the test-suite.*
While this patch isn't necessarily the best solution, e.g. it might be possible to solve this with *only* CSS, it's what I was able to come up with to address an old issue.
The solution here re-uses the `spread`-class in PresentationMode, since that one already takes care of centering pages *vertically*, together with a dummy-page that takes up the entire height of the window.
Finally, some PresentationMode-related CSS-rules are also simplified slightly, since the changes in PR 14112 (using Page-scrolling) allows some clean-up here.
The issue that this patch fixes has existed ever since the viewer was first re-factored into components, however it only really affects the `disableAutoFetch = true` mode.
By default we're fetching all pages in `BaseViewer.setDocument`, and as part of the parsing/initialization we're also populating the `PDFLinkService`-pagesRefCache. The purpose of that cache is to make navigating to any internal destinations faster, by not having to (asynchronously) lookup the pageNumber via the API when handling the destination.
In comparison, when the `disableAutoFetch = true` mode is being used we're instead *lazily* initializing the pages in the `BaseViewer.#ensurePdfPageLoaded`-method. For some reason, that I can only assume is a simple oversight, we're not attempting to update the `PDFLinkService`-pagesRefCache in that case.
In the `BaseViewer` this cache is mostly relevant in the `disableAutoFetch = true` mode, since the pages are being initialized *lazily* in that case.
In the `PDFThumbnailViewer` this cache is mostly used for thumbnails that are actually being rendered, as opposed to those created directly from the "regular" pages.
Please note that I'm not suggesting that we remove these caches because they're only used in some situations, but rather because they're for all intents and purposes actually *redundant*. In the API itself, we're already caching both the page-promises and the actual pages themselves on the `WorkerTransport`-instance.
Hence these viewer-caches aren't really necessary in practice, and adds what to me mostly seems like an unnecessary level of indirection.[1]
Given that the viewer now relies on caching in the API itself, this patch also adds a new unit-test to ensure that page-caching works (and keep working) as expected.
---
[1] In the `WorkerTransport.getPage`-method the parameter is being validated on every call, but that's hardly enough code to warrant keeping the "duplicate" caches in the viewer in my opinion.
*Please note:* These changes will primarily benefit longer documents, somewhat at the expense of e.g. one-page documents.
The existing `PDFDocumentProxy.getStats` function, which in the default viewer is called for each rendered page, requires a round-trip to the worker-thread in order to obtain the current document stats. In the default viewer, we currently make one such API-call for *every rendered* page.
This patch proposes replacing that method with a *synchronous* `PDFDocumentProxy.stats` getter instead, combined with re-factoring the worker-thread code by adding a `DocStats`-class to track Stream/Font-types and *only send* them to the main-thread *the first time* that a type is encountered.
Note that in practice most PDF documents only use a fairly limited number of Stream/Font-types, which means that in longer documents most of the `PDFDocumentProxy.getStats`-calls will return the same data.[1]
This re-factoring will obviously benefit longer document the most[2], and could actually be seen as a regression for one-page documents, since in practice there'll usually be a couple of "DocStats" messages sent during the parsing of the first page. However, if the user zooms/rotates the document (which causes re-rendering), note that even a one-page document would start to benefit from these changes.
Another benefit of having the data available/cached in the API is that unless the document stats change during parsing, repeated `PDFDocumentProxy.stats`-calls will return *the same identical* object.
This is something that we can easily take advantage of in the default viewer, by now *only* reporting "documentStats" telemetry[3] when the data actually have changed rather than once per rendered page (again beneficial in longer documents).
---
[1] Furthermore, the maximium number of `StreamType`/`FontType` are `10` respectively `12`, which means that regardless of the complexity and page count in a PDF document there'll never be more than twenty-two "DocStats" messages sent; see 41ac3f0c07/src/shared/util.js (L206-L232)
[2] One example is the `pdf.pdf` document in the test-suite, where rendering all of its 1310 pages only result in a total of seven "DocStats" messages being sent from the worker-thread.
[3] Reporting telemetry, in Firefox, includes using `JSON.stringify` on the data and then sending an event to the `PdfStreamConverter.jsm`-code.
In that code the event is handled and `JSON.parse` is used to retrieve the data, and in the "documentStats"-case we'll then iterate through the data to avoid double-reporting telemetry; see https://searchfox.org/mozilla-central/rev/8f4c180b87e52f3345ef8a3432d6e54bd1eb18dc/toolkit/components/pdfjs/content/PdfStreamConverter.jsm#515-549
It shouldn't be necessary to iterate through *all* pages when using a non-default `spreadMode`, since we already know which page(s) should become visible.
This code is a left-over from the initial (local) implementation that resulted in PR 14112, however I forgot to clean-up some things such as e.g. this loop.
Also fixes an outdated comment, see PR 14204 which removed the mentioned data-structure.
This patch preserves the old behaviour of appending a `loadingIcon`-div to all pages that are not yet loaded/rendered. However, the actual `loadingIcon`-spinner (i.e. the `loading-icon.gif` image) will only be displayed on *visible* pages to improve performance.
To avoid having to iterate through all pages in the document, which doesn't seem like a good idea for a PDF document with thousands of pages, we use a combination of the currently visible *and* cached pages to toggle the `loadingIcon`-spinner.
This code is the last piece[1] of the viewer that's not using standard `class`es, and by converting this code we get rid of some now unneeded boilerplate code (slightly reducing the size of the *built* `web/viewer.js` file).
Note that while this code was originally imported from a separate repository, it was last sync-ed with upstream *five years* ago which is why this re-factoring should be OK as far as I'm concerned (and we've done some other clean-up since then as well).
---
[1] Technically the `web/debugger.js` file is left as well, however that code is first of all not bundled in the *built* `web/viewer.js` file and secondly it's not even loaded by default either.
- First step to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1737260;
- several interactive pdfs use the possibility to hide/show buttons to show different icons;
- render pushbuttons on their own canvas and then insert it the annotation_layer;
- update test/driver.js in order to convert canvases for pushbuttons into images.
Reporting telemetry, in Firefox, includes using `JSON.stringify` on the data and then sending an event to the `PdfStreamConverter.jsm`-code.
In that code the event is handled and `JSON.parse` is used to retrieve the data, and in the "pageInfo"-case we'll then proceed to ignore everything except *the first* such event; see https://searchfox.org/mozilla-central/rev/24fac1ad31fb9c6e9c4c767c6a7ff45d226078f3/toolkit/components/pdfjs/content/PdfStreamConverter.jsm#509-514
All-in-all, sending the "pageInfo" telemetry for each rendered page is thus unnecessary and this patch makes the viewer send it only *once* instead.
*This is a tentative patch, since we unfortunately cannot easily test it (as far as I can tell).*
In Firefox this (obviously) works as-is, but in Google Chrome the `markedContent` spans are inserted within the regular text-content (in the DOM) and with non-zero heights.
*This is a tentative patch, since I don't have the necessary hardware to test it.*
See https://developer.mozilla.org/en-US/docs/Web/CSS/text-size-adjust, which is currently ignored in Firefox.
It seems overall safer, and more future-proof, to simply add this to the *entire* `textLayer` rather than its individual elements.
With the previous patch, this helper function is no longer used and keeping it around will simply increase the size of the builds.
This removal is purposely done *separately*, to make it easy to revert the patch in the future if this helper function would become useful again.
This relies on the fact that `Set`s preserve the insertion order[1], which means that we can utilize an iterator to access the *first* stored view.
Note that in the `resize`-method, we can now move the visible pages to the back of the buffer using a single loop (hence we don't need to use the `moveToEndOfArray` helper function any more).
---
[1] This applies to `Map`s as well, although that's not entirely relevant here.
The `PDFPageViewBuffer`-code is very important for the correct function of the viewer, but it's currently not tested at all.
While the `PDFPageViewBuffer` is obviously intended to be used with `PDFPageView`-instances, it only accesses a couple of `PDFPageView` properties/methods and consequently it's fairly easy to unit-test this code with dummy-data.
These unit-tests should help improve our confidence in this code, and will also come in handy with other changes that I'm working on (regarding modernizing and re-factoring the `PDFPageViewBuffer`-code).
The way that we're currently handling the last-`id` is very old, and there's no longer any good reason to special-case things when only one thumbnail is visible.
Furthermore, we can also modernize the loop slightly by using `for...of` instead of `Array.prototype.some()` when checking for fully visible thumbnails.
Note how in `PDFPageViewBuffer.resize` we're manually iterating through the visible pages in order to build a Set of the visible page `id`s. By instead moving the building of this Set into the `getVisibleElements` helper function, as part of the existing parsing, this code becomes *ever so slightly* more efficient.
Furthermore, more direct access to the visible page `id`s also come in handy in other parts of the viewer as well.
In the `BaseViewer.isPageVisible` method we no longer need to loop through the visible pages, but can instead directly check if the pageNumber is visible.
In the `PDFRenderingQueue.getHighestPriority` method, when checking for "holes" in the page layout, we can also avoid some unnecessary look-ups this way.
*Sometimes I'll hopefully learn to optimize my code directly when writing it, rather than having to do multiple clean-up passes; sorry about the churn here!*
For most page layouts there won't be any "holes" in the visible pages (or thumbnails), and in those cases it'd obviously be preferable not having to repeat any checks of already rendered pages.
Rather than only checking the "distance" between the first/last pages, we can instead compare the theoretical number of pages (between first/last) with the actually visible number of pages instead. This way, we're able to better detect the "holes"-case and can skip unnecessary parsing in the common case.
I missed this one spot in PR 12870, when converting the other cases in the "keydown" event handler. However, given that it only matters in PresentationMode and/or when "page-fit" zooming is enabled, this oversight shouldn't have had any user-observable impact (but we should fix it nonetheless).
This is a follow-up to PR 10675, since there I completely overlooked that we also need to handle the case where a PDF document has *failed* to load when the "supportsRangedLoading" message is sent to the viewer.
There were some links not working in some XFA files,I realized that the anchor tag that contains the link has an inline display and couldn't receive any height, solved this by adding a "position: absolute". Tested with two different files in Firefox Nightly and Chrome and now all links are working perfectly fine. Added reftest to avoid future regressions
The only reason for using a `DocumentFragment` in the first place, originally added in PR 8724, was to prevent errors in the `PDFPageView`-constructor. However, we should be able to simply make its `container`-option *optional* instead, since it's not being used for anything else in the class.
Note that pre-rendering still works correctly in my testing, and given that the `BaseViewer` keeps references to all `PDFPageView`-instances (via its `_pages` Array) it also shouldn't be possible to "lose" any pages/canvases this way.
Unfortunately there exist PDF documents where all pageLabels are empty strings, see e.g. http://www.cs.cornell.edu/~ragarwal/pubs/blk-switch.pdf (taken from an old issue), which result in the pageNumber-input being completely blank. That doesn't seem very helpful, and this patch simply extends the approach used to ignore pageLabels that are identical to standard page numbering.
In the GENERIC viewer, e.g. when dragging-and-dropping a new PDF document which automatically opens the outline, there can now be breaking errors in the `{BaseViewer, PDFThumbnailViewer}.#getScrollAhead` methods since there's no visible pages/thumbs during loading; sorry about the breakage!