**STR:**
1. Open the default viewer, with the `tracemonkey` file.
2. Open the findbar, and search for "trace".
3. Enable the "Highlight All" option.
4. Close the findbar.
5. Re-open the findbar, and click on the "findNext" button.
6. Scroll down to the *second* page of the document.
**ER:**
Since "Highlight All" is active, all matches on the *second* page should be highlighted.
**AR:**
No matches are highlighted on the *second* page.
This is relevant for e.g. `PDFSinglePageViewer`, and `PDFViewer` with Presentation Mode active.
By moving this code to a helper method in `BaseViewer`, it's thus possible to reduce the amount of duplicate code that currently needed in `PDFViewer` and `PDFSinglePageViewer`.
For a short document, such as e.g. the `tracemonkey` file, this repeated normalization won't matter much, but for documents with a couple of thousand pages it seems completely unnecessary (and wasteful) to keep repeating the normalization whenever for every single page.
Currently the text-content is normalized every time that a new search operation is started, which seems completely useless considering that the "raw" text-content is never used for anything.
For a short document, such as e.g. the `tracemonkey` file, this repeated normalization won't matter much, but for documents with a couple of thousand pages it seems completely unnecessary (and wasteful) to keep repeating the normalization whenever e.g. a new search operation starts.
In the event that multiple instances of `PDFFindController` ever exists simultaneously, they will all be able to share just one `normalize` function in this way. Furthermore, the regular expression is now created lazily rather than at class construction time.
Despite all highlighted matches being removed in response to the 'findbarclose' event, there's a risk that a match could still be scrolled into view *after* the findbar has been closed[1].
Hence we need to ensure that long running searches, particularily those happening in large and/or slow loading documents[2], are ignored as well.
---
[1] The match is hidden, as expected, but the document could still scroll unexpectedly.
[2] Large documents loaded with `disableAutoFetch = true` and `disableStream = true` set are particularily susceptible to this issue.
Given that dispatching the 'updatetextlayermatches' event with `pageIndex = -1` set is now used to target the textLayers of *all* pages, there's no need to send individual events to every single page during `_nextMatch`. Since there can be an arbitrary number of pages in a document, this small/simple optimization seems too easy to ignore.
This patch does four things:
- Change the search related methods in `TextLayerBuilder` to be "private", since there're only called from within the class itself now.
- Use `const` for local variables not intended to change in the search related methods in `TextLayerBuilder`.
- Finally, removes most `this.findController` checks since they are redundant. Note how both `this._convertMatches` and `this._renderMatches` are *only* ever called, from `this._updateMatches`, when `this.findController` is actually defined. Hence there's really no need to repeat those checks all over the place, especially with all the relevant methods now being marked as "private".
- Always initialize the `this._pageMatchesLength` property with an empty array, to simplify the code in `TextLayerBuilder`.
This should, hopefully, cover all the possible ways[1] in which "fake workers" are loaded. Given the different code-paths, adding unit-tests might not be that simple.
Note that in order to make this work, the various `fakeWorkerFilesLoader` functions were converted to return `Promises`.
---
[1] Unfortunately there's lots of them, for various build targets and configurations.
Rather than having every invocation of `Toolbar._updateUIState` compute a valid `pageScaleValue`, it seems easier to simply ensure that it happens when the value is actually updated.
Looking at the history of this code, this parameter has never been used.
I'm guessing that most likely the code in `web/toolbar.js` began life as a copy of `web/secondary_toolbar.js`, which would probably explain why that parameter exists.
*This patch is based on something that I noticed while working on PR 10126.*
The recent re-factoring of `PDFFindController` brought many improvements, among those the fact that access to `BaseViewer` is no longer required. However, with these changes there's one thing which now strikes me as not particularly user-friendly[1]: The fact that in order for searching to actually work, `PDFFindController.setDocument` must be called *and* a 'pagesinit' event must be dispatched (from somewhere).
For all other viewer components, calling the `setDocument` method[2] is enough in order for the component to actually be usable.
The `PDFFindController` thus stands out quite a bit, and it also becomes difficult to work with in any sort of custom implementation. For example: Imagine someone trying to use `PDFFindController` separately from the viewer[3], which *should* now be relatively simple given the re-factoring, and thus having to (somehow) figure out that they'll also need to manually dispatch a 'pagesinit' event for searching to work.
Note that the above even affects the unit-tests, where an out-of-place 'pagesinit' event is being used.
To attempt to address these problems, I'm thus suggesting that *only* `setDocument` should be used to indicate that searching may start. For the default viewer and/or the viewer components, `BaseViewer.setDocument` will now call `PDFFindController.setDocument` when the document is ready, thus requiring no outside configuration anymore[4]. For custom implementation, and the unit-tests, it's now as simple as just calling `PDFFindController.setDocument` to allow searching to start.
---
[1] I should have caught this during review of PR 10099, but unfortunately it's sometimes not until you actually work with the code in question that things like these become clear.
[2] Assuming, obviously, that the viewer component in question actually implements such a method :-)
[3] There's even a very recent issue, filed by someone trying to do just that.
[4] Short of providing a `PDFFindController` instance when creating a `BaseViewer` instance, of course.
Since searching itself is an asynchronous operation, removal of highlights needs to be asynchronous too since otherwise there's a risk that the events happen in the wrong order and find highlights thus remain visible.
Also, this patch will now ensure that only 'findbarclose' events for the *current* document is handled since other ones doesn't really matter. Note in particular that when no document is loaded text-layers are, obviously, not present and subsequently it's unnecessary to attempt to hide non-existent find highlights.
For many years it's been possible to enter a search term into the findbar(s) before the document has finised loading, such that searching starts immediately once it has loaded.
PR 10099 accidentally broke that, which I unfortunately missed during reviewing.
Since searching is asynchronous you cannot directly check in `executeCommand` if the document is loaded/current, but need to wait until searching is actually enabled first.
Furthermore this patch also ensures that the `_findTimeout` is always correctly cleared given that it adds further asynchronous behaviour to searching, since you obviously only want to deal with searches relevant to the current document.
This is similar to the format used by a number of other viewer components, and should simplify the `PDFSidebar` initialization slightly.
Furthermore, by using the `eventBus` it's no longer necessary for `PDFSidebar` to have a direct dependency on `PDFOutlineViewer`.
There's still room for improvement here, but this patch is at least a start (since it's not clear to me how best to handle the viewers).
This attempts to reduced the level of indirection, and the amount of code, when dispatching `fileattachmentannotation` events, by removing the `PDFLinkService.onFileAttachmentAnnotation` method and just accessing `PDFLinkService.eventBus` directly in the `FileAttachmentAnnotationElement` constructor.
Given that other properties, such as `externalLinkTarget`/`externalLinkRel`, are already being accessed directly this pattern seems fine here as well.
This commit shows that we can now unit test the find controller and
that executing regular queries works. Note that this is only a first
step and not a complete suite of unit tests for all possible options
of the find controller.
While writing this unit test, I found two smaller issues that I
addressed directly. The first one is that in the previous find
controller refactoring I forgot to rename some occurrences of a now
private member variable. Fortunately this did not cause any bugs since
we did have a public getter and the fetched value may be changed by
reference, but it's nevertheless good to fix. The second issue is that
some entries in the `test/unit/clitests.json` file were not correct,
resulting in these tests not being executed on e.g., Travis CI.
With `PDFFindController` instances no longer (directly) depending on
`BaseViewer` instances, we can pass a single `findController` when
initializing a viewer, similar to other components.
This removes the dependency on a `PDFViewer` instance from the find
controller, which makes it more similar to other components and makes it
easier to unit test with a mock link service.
Finally, we remove the search capabilities from the SVG example since it
doesn't work there because there is no separate text layer.
Now it follows the same pattern as e.g., the document properties
component, which allows us to have one instance of the find controller
and set a new document to search upon switching documents.
Moreover, this allows us to get rid of the dependency on `pdfViewer` in
order to fetch the text content for a page. This is working towards
getting rid of the `pdfViewer` dependency upon initializing the
component entirely in future commits.
Finally, we make the `reset` method private since it's not supposed to
be used from the outside anymore now that `setDocument` takes care of
this, similar to other components.
This way the resetting of `PDFLinkService`/`PDFDocumentProperties` instances, as is done in `PDFViewerApplication.close`, only requires passing in *one* `null` argument instead of two.
The built-in PDF Viewer (in Firefox) cannot use the browser findbar when PDF files are embedded in e.g. iframe/object tags, and the PDF.js findbar (i.e. `PDFFindBar`) will thus be used instead in those cases.
This is slightly problematic, since the `MOZCENTRAL` version of the viewer uses a special, slimmed down, version of the `l10n.js` file that doesn't (currently) support plural forms. To prevent the matchesCounter from breaking completely in this edge-case, temporarily hard-code the plural form to use the default `[other]` version of the locale strings.
This prevents the findbar from intermittently displaying `0 of {number} matches`, which *could* theoretically happen for large and/or slow loading documents.
The find controller should only coordinate finding a string in the
document and should not be responsible for presenting the matches to the
user. The text layer builder already contains the logic to render the
matches in the viewer, so it should also take care of scrolling the
selected match into view.
The find controller already has quite a lot of state to maintain. We can
avoid keeping track of this member variable because when the find
controller is reset, so is the extract text promises array. Therefore,
we can just check if that array contains items or not to determine if
text extraction already started.
Moreover, there is no need to reset the `pageContents` array since the
`reset` method already takes care of that.
This is an unfortunate oversight on my part, which I stumbled upon when (locally) testing the `mozilla-central` follow-up patch necessary to enable the matches counter in the built-in PDF viewer.
The property is intended to contain a reference to a DOM element, which not only is nowhere to be found *now* but appears to never have existed in the first place.
As outlined in https://bugzilla.mozilla.org/show_bug.cgi?id=1282759 the internal Firefox name for the feature is `entireWord`, hence that name is used here as well for consistency (with "Whole words" being limited to the UI).
Given existing limitations of the PDF.js search functionality, e.g. the existing problems of searching across "new lines", there's some edge-cases where "Whole words" searching will ignore (valid) results.
However, considering that this is a pre-existing issue related to the way that the find controller joins text-content together, that shouldn't have to block this new feature in my opionion.
*Please note:* In order to enable this feature in the `MOZCENTRAL` version, a small follow-up patch for [PdfjsChromeUtils.jsm](https://hg.mozilla.org/mozilla-central/file/tip/browser/extensions/pdfjs/content/PdfjsChromeUtils.jsm) will be required once this has landed in `mozilla-central`.
For the `PDFFindBar` implementation, similar to the native Firefox findbar, the matches count displayed is now limited to a (hopefully) reasonable value.
*Please note:* In order to enable this feature in the `MOZCENTRAL` version, a follow-up patch will be required once this has landed in `mozilla-central`.
This patch is the first step to be able to eventually get rid of the `attachDOMEventsToEventBus` function, by allowing `EventBus` instances to simply re-dispatch most[1] events to the DOM.
Note that the re-dispatching is purposely implemented to occur *after* all registered `EventBus` listeners have been serviced, to prevent the ordering issues that necessitated the duplicated page/scale-change events.
The DOM events are currently necessary for the `mozilla-central` tests, see https://hg.mozilla.org/mozilla-central/file/tip/browser/extensions/pdfjs/test, and perhaps also for custom deployments of the PDF.js default viewer.
Once this have landed, and been successfully uplifted to `mozilla-central`, I intent to submit a patch to update the test-code to utilize the new preference. This will thus, eventually, make it possible to remove the `attachDOMEventsToEventBus` functionality.
*Please note:* I've successfully ran all `mozilla-central` tests locally, with these patches applied.
---
[1] The exception being events that originated on the `window` or `document`, since those are already globally available anyway.
The new events follow the same naming pattern as the 'pagesinit'/'pagesloaded' events dispatched on `BaseViewer` instances, and the intention is to allow the eventual removal of 'documentload'.
[api-minor] Add an `IsLinearized` property to the `PDFDocument.documentInfo` getter, to allow accessing the linearization status through the API (via `PDFDocumentProxy.getMetadata`)
When updating Preferences using the `set` method, the input is carefully validated. However, no validation is (currently) done when a `BasePreferences` instance is created, which probably isn't that great. Hence this patch that simply ignores, to not unnecessarily break loading of the viewer itself, any invalid Preferences.
Given that the various Preferences are currently, and have been for quite some time, only used when initializing `PDFViewerApplication` re-loading them when a new PDF file is opened in the viewer is essentially a no-op.
Furthermore, with the only usage of `BasePreferences.reload` now gone, the value of that method seems questionable at best. In the event that the functionality is actually needed again, similar to the `ViewHistory`, it'd probably make more sense to simply replace `PDFViewerApplication.preferences` with a new `BasePreferences` instance instead (using e.g. `DefaultExternalServices.createPreferences`).
Given that *all* Preferences are already fetched in `PDFViewerApplication._readPreferences`, the amount of boilerplate/duplication can be considerably reduced with the addition of a `BasePreferences.getAll` method.
The only reason that this check ever existed in the first place, is that originally there was a global `PDFJS.openExternalLinkInNewWindow` option which was then subsumed by the (more generic) `PDFJS.externalLinkTarget` option. (The `externalLinkTarget` has since been moved into a `PDFLinkService` option, as part of PDF.js version `2.0`.)
Hence, during the period where both `PDFJS.openExternalLinkInNewWindow` and `PDFJS.externalLinkTarget` existed side-by-side, there was a need to allow the former one to override the latter one (for backward compatibility purposes). However, that's no longer the case, and this extra `externalLinkTarget` check can now be removed.
*This was a stupid error on my part; sorry about breaking this!*
With the current code, the value of the `externalLinkTarget` option is now (potentially) updated *after* the viewer components have been initialized. For the "viewer in iframe/object tag" case, the result is that the value of the `externalLinkTarget` option isn't adjusted as intended any more.
Without providing useful (custom) error messages for the `no-restricted-globals` rule, see https://eslint.org/docs/rules/no-restricted-globals, it's quite likely that the rule will be incorrectly disabled rather than the required globals being imported as intended.
To reduced duplication of the `no-restricted-globals` rule in multiple `.eslintrc` files, it's instead moved to the top-level `.eslintrc` file and disabled as needed on a folder/file basis outside of `/src` and `/web`.
*Another small piece of clean-up of code I've previously written; follow-up to PR 8775.*
Importing `createPromiseCapability`, and then using it in just *one* spot, seems unnecessary since the `waitOnEventOrTimeout` function may just as well return a regular `Promise` directly.
Given that the non-default Spread modes (currently) doesn't affect the page layout when horizontal scrolling is enabled, having the Spread buttons appear active when clicking them appears to do *nothing* is probably confusing rather than helpful to users.
If the current viewer is a `PDFSinglePageViewer` instance the Scroll/Spread modes are no-ops, hence displaying buttons that do *nothing* when clicked will probably do very little besides confuse users.
The names 'resetscrollmode'/'resetspreadmode' were probably *not* great choices, given that the only thing being reset are toolbar buttons and not the actual Scroll/Spread modes. Furthermore, there's really no need for two separate events here.
The patch also adds a comment that ought to have been included in PR 9040, to prevent future refactoring/removing of what may appear to be an unnecessary `Promise.resolve` call.
This moves/exposes the `URL` polyfill similarily to the existing `ReadableStream` polyfill, rather than exposing it globally, to avoid interfering with any "outside" code.
Both the `URL` and `ReadableStream` polyfills are now exposed on the `pdfjsLib` object, such that they are accessible to the viewer components.
Furthermore, the `no-restricted-globals` ESLint rule is also enabled to prevent accidental usage of the native `URL`/`ReadableStream` implementations directly in the `src/` and `web/` folders; see also https://eslint.org/docs/rules/no-restricted-globals
Addresses the remaining TODO in https://github.com/mozilla/pdf.js/projects/6
Rather than having to manually call a method on `PDFFindController` instances from `BaseViewer.setDocument`, thus essentially having to resolve the private `_firstPagePromise` from the "outside", this can be done easily with the 'pagesinit' event dispatched on the `eventBus` instead.
Please note this particular `PDFFindController` code pre-dates the `eventBus` by almost three years, which should explain why the code looks the way it does.
Since the Scroll/Spread modes are now document specific, as all other properties such as page/scale/rotation, ensure that the toolbar is always correctly reset.
Since other viewer state, such as the current page/scale/rotation[1], are not available as `BaseViewer` constructor options, this makes the Scroll/Spread modes stand out quite a bit. Hence it probably makes sense to remove/deprecate this, to avoid inconsistent and possibly confusing state in this code.
---
[1] These properties are *purposely* not available in the constructor, since attempting to set them before a document is loaded has number of issues; please refer to https://github.com/mozilla/pdf.js/pull/8539#issuecomment-309706629 for additional details.
Since all the other viewer methods use the getter/setter pattern, e.g. for setting page/scale/rotation, the way that the Scroll/Spread modes are set thus stands out. For consistency, this really ought to use the same pattern as the rest of the `BaseViewer`. (To avoid breaking third-party implementations, the old methods are kept around as aliases.)
Note how the other "...OnLoad" preferences will allow a *non-default* value to always override a history entry. To improve overall consistency for the viewer options, and to reduce possibly confusing behaviour, this patch changes the `scrollModeOnLoad`/`spreadModeOnLoad` preferences to behave as all the other ones.
For some reason, these weren't added to `AppOptions` despite actually being set and read from `web/app.js`. Not adding them creates inconsistencies, since all other options *are* present in `web/app_options.js`.
This property isn't accessed anywhere in the `web/app.js` file, and is also not being reset in `PDFViewerApplication.close`. Hence it seems that it can simply be removed, especially since the fingerprint is already synchronously available through `PDFViewerApplication.pdfDocument.fingerprint` (provided that a document is loaded).
With the current code, the location in the viewer could change *well* after the user has started to interact with the viewer (for very large and/or slow loading documents). Attempt to reduce the likelyhood of that happening, by adding an upper bound to the time spent waiting before attempting to re-apply the initial position.
Rather than using a "special" property to check if a `ViewHistory` database entry existed on load, it seems that you could just as well check for the existence of one of the actually needed properties instead (here 'page' is used).
This way we can avoid storing what, more of less, amounts to useless state, which will help reduce the size of the `ViewHistory` database. Given that we don't directly, nor need to, validate the `ViewHistory` data but rely on other parts of the code-base to do so, the existance of an 'exists' property doesn't in and of itself really add much utility.
Finally, to simplify the implementation the 'exists' property won't be actively removed from the `ViewHistory` data. Instead we'll simply stop adding 'exists' when writing `ViewHistory` data, and rely on the existing pruning of old entries to eventually remove any remnants of 'exists' from storage.
Note how, in the current code, only *one* old history entry would ever be removed. That would mean that if e.g. the `cacheSize` is reduced, it would potentially require loading of multiple files before the database would be correctly pruned.
Furthermore, in the case where the database was empty on load there's no need to attempt to shrink it, since trying to reduce the size of an *empty* array won't do much :-)
Since the current page will be explicitly scrolled into view *directly* afterwards anyway (compare with e.g. the `pagesRotation` code), trying to maintain the current position when re-applying the zoom level during changing of Scroll modes is redundant.
Note how in `BaseViewer.forceRendering` the Scroll mode is used to determine how pre-rendering will work. Currently this is broken in Presentation Mode, if horizontal scrolling was enabled prior to entering fullscreen.
Furthermore, there's a few additional cases where the `this.scrollMode === ScrollMode.HORIZONTAL` check is pointless either in Presentation Mode or when a `PDFSinglePageViewer` instance is used.
Since the Scroll/Spread modes doesn't make (any) sense in `PDFSinglePageViewer` instances, the general structure of these methods can be improved to reflect that.
Given that this method is a no-op in `PDFSinglePageViewer`, similar to `_regroupSpreads`, let's improve the general code structure by simply moving the method.
There's no good reason to iterate through an arbitrary number of DOM elements this way, since a document could contain thousands of pages, when everything can be easily removed at once; compare with e.g. `BaseViewer._resetView` and `PDFThumbnailViewer._resetView`.
Furthermore given that it's a `PDFViewer` instance, the `this.viewer` property can be accessed directly. Besides, `_setDocumentViewerElement` only exists as a helper method for `setDocument` in the base class and none of this code applies for `PDFSinglePageViewer` instances either.
The Secondary Toolbar buttons for, not to mention the actual toggling of, Scroll/Spread modes are currently completely broken in older browsers (such as IE11).
As a follow-up, it'd probably be a good idea to try and find a *feature complete* `classList` polyfill that could be used instead, but this patch at least addresses the immediate regression.
Please refer to the compatibility information in https://developer.mozilla.org/en-US/docs/Web/API/Element/classList#Browser_compatibility
The `onOpenWithURL` method, in `PDFViewerApplication.initPassiveLoading`, accepts a `originalURL` parameter which is then passed on to `PDFViewerApplication.open` as is. However, the latter method expects the name of the parameter to be `originalUrl` (note the casing), meaning that `getDocument` will fail in this case.
For consistency, and to avoid confusion, the renaming is done in `web/chromecom.js` as well.
*The danger of fixing one bug is that it can, sometimes too easily, cause another one in the process; sorry for not catching this when testing PR 9816 locally.*
When the *entire* viewer becomes narrow enough, as controlled by the `@media all and (max-width: 840px)` media query, the sidebar should (semi-transparently) overlay the `viewerContainer` instead of moving it horizontally. Unfortunately the changes made in PR 9816 caused the relevant CSS rules to be skipped, because of how the inheritance model works in CSS.
I'm well aware that `!important` is usually advised against, since it "breaks" the CSS inheritance model. However in this case it seemed reasonable to use it, to not only fix the bug at hand but to also prevent similar bugs from occurring in the future.
This is a regression from PR 8993; it causes the pages to be offset horizontally in Presentation Mode, if and only if the sidebar is currently open when the user triggers Presentation Mode.
Please note that while this doesn't seem to affect Firefox, both Chrome and IE are however affected.
Interestingly enough, despite the Chrome extension being affected as well, I cannot find any issue filed about this. (Either Presentation Mode isn't used much at all, or users don't open the sidebar before entering Presentation Mode.)
It appears that Microsoft silently fixed the problem that required disabling of fullscreen mode, in e.g. `iframe`s, in IE 11; please see issue 4711 and PR 5525 for historical context.
Unfortunately my Google-fu isn't strong enough to find any *official* information regarding the fixing of the browser bug in IE. However testing of the default viewer in IE 11, with this patch applied, it now appears that Presentation Mode is working correctly even in an `iframe` in IE 11.
Further anecdotal evidence that the bug is in fact fixed, is for example that jQuery previously contained a work-around for the IE bug. However, that's removed over two years ago now; see ff1a0822f7 and the issues referenced there.
Given that the default viewer isn't intended to be used as-is anyway (in custom deployments), it didn't seem necessary to keep the `disableFullscreen` option around since it was *only* ever added for compatibility purposes.
Fixes 9585.
Not only is the `Util.loadScript` helper function unused on the Worker side, even trying to use it there would throw an Error (since `document` isn't defined/available in Workers).
Hence this helper function is moved, and its code modernized slightly by having it return a Promise rather than needing a callback function.
Finally, to reduced code duplication, the "new" loadScript function is exported and used in the viewer.
This should provide a better out-of-the-box experience when using PDF.js in a Node.js environment, since it's missing native support for both `@font-face` and `Image`.
Please note that this change *only* affects the default values, hence it's still possible for an API consumer to override those values when calling `getDocument`.
Also, prevents "ReferenceError: document is not defined" errors, when running the unit-tests in Node.js/Travis.
This commit adds `scrollModeOnLoad` and `spreadModeOnLoad` preferences
that control the default viewer state when opening a new document for
the first time.
This commit also contains a minor refactoring of some of the option UI
rendering code in extensions/chromium/options/options.js, as I couldn't
bear creating two more functions nearly identical to the four that
already existed.
This builds on the scrolling mode work to add three buttons for joining
page spreads together: one for the default view, with no page spreads,
and two for spreads starting on odd-numbered or even-numbered pages.
Prior to this commit, if the vertical scroll bar is absent and the horizontal
scroll bar is present, a link to a particular point on the page which should
induce a horizontal scroll did not do so, because the absence of a vertical
scroll bar meant that the viewer was not recognized as the nearest scrolling
ancestor. This commit adds a check for horizontal scroll bars when searching
for the scrolling ancestor.
*This is a final piece of clean-up of code that I recently wrote, after which I'm done :-)*
When the `getMainThreadWorkerMessageHandler` function was added, in PR 9385, it did so by basically introducing a `web/app.js` dependency in `src/display/api.js` through the `window.pdfjsNonProductionPdfWorker` property[1]. Even though this is limited to non-`PRODUCTION` mode, i.e. `gulp server`, it still seems unfortunate to have that sort of viewer dependency in the API code itself.
With the new, much nicer and shorter, names introduced in PR 9565 we can remove this non-`PRODUCTION` hack and just use `window.pdfjsWorker` in both the viewer and the API regardless of the build mode.
---
[1] It didn't seem correct to piggy-back on the `window.pdfjsDistBuildPdfWorker` property in non-`PRODUCTION` mode.
Please note that this patch *purposely* doesn't add every standard (or semi-standard) page name in existence, but rather only a few common ones. This is done to lessen the burden on localizers, since it's quite possible that all of the page names could need translation (depending on locale).
It's easy to add more standard page sizes in the future, but we should take care to *only* add those that are very commonly used in actual PDF files.
This uses a whitelist, based on the locale, to determine where non-metric units should be used.
Note that the behaviour implemented here seem consistent with desktop PDF viewers (e.g. Adobe Reader), where the pageSizes are *always* displayed with locale dependent units rather than pageSize dependent ones (since the latter would probably be quite confusing).
A couple of basic unit-tests are added, and a manual `isLandscape` check (in `web/base_viewer.js`) is also converted to use the helper function instead.
Test case to exercise the different encodings:
1. Create a file "some file#@%M<br>%25 .pdf"
2. Build the extension with `gulp chromium` and load it in Chrome.
3. Go to `chrome://extensions/` and ensure that the
"Allow access to file URLs" is disabled.
4. Try to open the file from step 1 in Chrome (maybe reload once).
5. PDF.js should be showing a file chooser button.
6. Click on that button and select a different file.
Test: Check that a confirmation dialog pops up that warns about
a different file name. Cancel the dialog.
7. Click on the button again and select the original file.
Test: Check that the file opens as expected.
The units are currently repeated after each dimension, which seems unnecessary and is also not done in other PDF viewers (such as e.g. Adobe Reader).
Furthermore, the name of the l10n arguments can be simplified slightly, since the name of the strings themselves should be enough information.
Finally, the `width`/`height` should be formatted according to the current locale, as is already done for other strings in the document properties dialog.
The `getPageSizeInches` method was implemented on `PDFDocumentProxy`, which seems conceptually wrong since the size property isn't global to the document but rather specific to each page. Hence the method is moved into `PDFPageProxy`, as `get pageSizeInches` instead to address this.
Despite the fact that new API functionality was implemented, no unit-tests were added. To prevent issues later on, we should *always* ensure that new functionality has at least some test-coverage; something that this patch also takes care of.
The new `PDFDocumentProperties._parsePageSize` method seemed unnecessary convoluted. Furthermore, in the "no data provided"-case it even returned incorrect data (an array, rather than the expected object).
Finally, the fallback strings didn't actually agree with the `en-US` locale. This inconsistency doesn't look too great, and it's thus addressed here as well.
PR #9493 moved from `appConfig.defaultUrl` to `AppOptions.get('defaultUrl')`.
However, it forgot to replace `appConfig.defaultUrl` in chromecom.js,
and as a result the extension is not able to open any PDF file.
This patch fixes that issue.
One additional complication with removing this option from the global `PDFJS` object, is that the viewer currently needs to check `disableAutoFetch` in a couple of places. To address this I'm thus proposing adding a getter in `PDFDocumentProxy`, to allow checking the *actually* used values for a particular `getDocument` invocation.
The `appConfig` contains (mostly) references to various DOM elements, used when initializing the viewer components.
Hence `defaultUrl` seem like a slightly better fit for the new `AppOptions` abstraction, not to mention that it should thus be easier to set/modify it for custom deployments of the default viewer.
The way that various options are handled in the default viewer is currently a bit of a mess (to say the least). Some viewer options reside in the global `PDFJS` object, while others reside in `Preferences`. To make matters worse, some options even exist in both of the two.
Since the goal, with PDF.js version `2.0`, is to reduce our usage of the global `PDFJS` object, we'll instead want pass in the options when initializing the viewer components and when calling API methods (such as `getDocument`).
However given the current state of things in the default viewer, this wouldn't be exactly easy to implement. Hence this patch, which attempts to consolidate the way that viewer (and later API) options are handled by introducing a `AppOptions` singleton that provides *one* centralized way of interacting with the various options in the default viewer.
In a1cfa5f4d7, the textLayerMode
preference was introduced, to replace the disableTextLayer and
enhanceTextSelection preferences.
As a result, the text selection preference was no longer visible
in Chrome (because preferences are only rendered by default for
boolean preferences, not for enumerations).
This commit adds the necessary bits to
extensions/chromium/options/options.{html,js}
so that the textLayerMode preference can be changed again.
Also, migration logic has been added to move over preferences
from the old to the new names:
- In web/chromecom.js, the logic is added to translate
preferences that were set by an administrator (it is read-only,
so this layer is unavoidable).
- In extensions/chromium/options/migration.js, similar logic is
added, except in this case the preference storage is writable,
so this migration logic happens only once.
The "enhanced text selection" mode is still experimental, so it
has been marked as experimental to signal that there may be bugs.
The list of tasks that block promotion to stable is at #7584.
This partially reverts df0836b9b8.
The entry in preferences_schema.json is restored because that is
required to make managed preferences visible to the extension code.
The default key is still removed from default_preferences.json,
because this change only concerns the Chrome extension, not the
other parts of PDF.js. To account for the missing key, the
deprecated key was added back in chromecom.js
The key needs to be restored in preferences_schema.json too,
because that's the only way to make managed preferences visible.
I'm using `Object.assign`, which was introduced in Chrome 45,
so the preference module will break in Chrome 45 and earlier.
This is fine, because we do not support Chrome before 49.
In order to simplify things, the undocumented `enableStats` option was removed and `pdfBug` is now instead used to enabled general debugging *and* page request/rendering stats.
Considering that in the default viewer the `stats` was only used when debugging was also enabled, this simplification (code wise) definitely seem worthwhile to me.
Rather than having two different (but connected) options for the textLayer, I think that it makes sense to try and unify this. For example: currently if `disableTextLayer === true`, then the value of `enhanceTextSelection` is simply ignored.
Since PDF.js version `2.0` already won't be backwards compatible in lots of ways, I don't think that we need to worry about migrating existing preferences here.
This removes the `PDFJS.externalLinkTarget`/`PDFJS.externalLinkRel` dependency from the viewer components, but please note that as a *temporary* solution the default viewer still uses it.
This removes the `PDFJS.imageResourcesPath` dependency from the viewer components and the test-suite, but please note that as a *temporary* solution the default viewer still uses it.
This removes the `PDFJS.maxCanvasPixels` dependency from the viewer components, but please note that as a *temporary* solution the default viewer still uses it.
This removes the `PDFJS.useOnlyCssZoom` dependency from the viewer components, but please note that as a *temporary* solution the default viewer still uses it.
Unfortunately, as far as I can tell, we still need the ability to adjust certain viewer options depending on the browser environment in PDF.js version `2.0`. However, we should be able to separate this from the general compatibility code in the `src/shared/compatibility.js` file.
This rule is available from https://www.npmjs.com/package/eslint-plugin-mozilla, and is enforced in mozilla-central. Note that we have the necessary `Array`/`String` polyfills and that most cases have already been fixed, see PRs 9032 and 9434.
This rule is available from https://www.npmjs.com/package/eslint-plugin-mozilla, and is enforced in mozilla-central. Note that we have a polyfill for `ChildNode.remove()` and that most cases have already been fixed, see PRs 8056 and 8138.
The code responsible for highlighting/scrolling thumbnails is quite old, and parts of it hasn't really changed much over time.
In particular, the `scrollThumbnailIntoView` method is currently using `document.querySelector` in order to find both the new/current thumbnail element. This seems totally unnessary, since we can easily keep track of the current thumbnail (similar to the "regular" viewer) and thus avoid unnecessary DOM look-ups.
Furthermore, note how the `PDFThumbnailView` constructor is currently highlighting the *first* thumbnail. This is yet another leftover from older viewer code, which we can now remove and instead take care of in `PDFThumbnailViewer.setDocument`.
Given that `PDFThumbnailView` does not, nor should it, have any concept of which thumbnail is currently selected, this change also improves the general structure of this code a tiny bit.
Despite this patch removing the `disableWorker` option itself, please note that we'll still fallback to loading the worker file(s) on the main-thread when running in environments without proper Web Worker support.
Furthermore it's still possible, even with this patch, to force the use of fake workers by manually loading the necessary file using a `<script>` tag on the main-thread.[1]
That way, the functionality of the now removed `SINGLE_FILE` build target and the resulting `build/pdf.combined.js` file can still be achieved simply by adding e.g. `<script src="build/pdf.worker.js"></script>` to the HTML (obviously with the path adjusted as needed).
Finally note that the `disableWorker` option is a performance footgun, and unfortunately many existing third-party examples actually use it without providing any sort of warning/justification.
---
[1] This approach is used in the default viewer, since certain kind of debugging may be easier if the code is running directly on the main-thread.
https://crbug.com/362061 was fixed in Chrome 36, and the lowest
supported Chrome version in the extension is Chrome 49, so the
work-around for a filesystem:-bug in chromecom can be removed.
This was introduced in #3582 to work around https://crbug.com/274024 .
The bug in Chrome has been fixed a long time ago (at least 33), so let's
simplify the code.
This patch updates the `IPDFStreamReader` interface and ensures that the interface/implementation of `network.js`, `fetch_stream.js`, `node_stream.js`, and `transport_stream.js` all match properly.
The unit-tests are also adjusted, to more closely replicate the actual behaviour of the various actual `IPDFStreamReader` implementations.
Finally, this patch adjusts the use of the Content-Disposition filename when setting the title in the viewer, and adds `PDFDocumentProperties` support as well.
When navigating away from the viewer, there's no good reason for disallowing replacement of a *temporary* history entry (in addition to empty ones).
Given that the current position is temporarily added to the browser history using a (short) timeout, the history entry will most likely already be correct when the 'pagehide' event fires. However, if the user is quick enough that might not always be the case, in which case the adjusted logic may help.
Since multiple empty lines is virtually unused in the code-base, and the few cases that do exist look like "typos", let's enforce greater consistency here; please see https://eslint.org/docs/rules/no-multiple-empty-lines.
It is only used in a few places to handle prefixing style properties if
necessary. However, we used it only for `transform`, `transformOrigin`
and `borderRadius`, which according to Can I Use are supported natively
(unprefixed) in the browsers that PDF.js 2.0 supports. Therefore, we can
remove this class, which should help performance too since this avoids
extra function calls in parts of the code that are called often.
Unless the debugging tools (i.e. `PDFBug`) are enabled, or the `browsertest` is running, the `PDFPageProxy.stats` aren't actually used for anything.
Rather than initializing unnecessary `StatTimer` instances, we can simply re-use *one* dummy class (with static methods) for every page. Note that by using a dummy `StatTimer` in this way, rather than letting `PDFPageProxy.stats` be undefined, we don't need to guard *every* single stats collection callsite.
Since it wouldn't make much sense to attempt to use `PDFPageProxy.stats` when stat collection is disabled, it was instead changed to a "private" property (i.e. `PDFPageProxy._stats`) and a getter was added for accessing `PDFPageProxy.stats`. This getter will now return `null` when stat collection is disabled, making that case easy to handle.
For benchmarking purposes, the test-suite used to re-create the `StatTimer` after loading/rendering each page. However, modifying properties on various API code from the outside in this way seems very error-prone, and is an anti-pattern that we really should avoid at all cost. Hence the `PDFPageProxy.cleanup` method was modified to accept an optional parameter, which will take care of resetting `this.stats` when necessary, and `test/driver.js` was updated accordingly.
Finally, a tiny bit more validation was added on the viewer side, to ensure that all the code we're attempting to access is defined when handling `PDFPageProxy` stats.
For sufficiently large page sizes, always limiting the minimum zoom level to 25% seem a bit too high. One example is pages, e.g. the first one, in:
Hence I think that it makes sense to lower `MIN_SCALE` slightly, since other PDF viewers (e.g. Adobe Reader) isn't limiting the minimum zoom level as aggressively. Obviously this will allow a greater number of pages to be visible at the same time in the viewer, but given that they will be small that shouldn't be an issue.
Note also that e.g. the `page-fit`/`page-width` zoom levels already allow `< MIN_SCALE` values, so I don't see why we shouldn't allow users the same functionality directly.
This was added, during the refactoring in PR 8556, to avoid outright breaking third-party users of the default viewer.
With PDF.js version `2.0`, where we're making API changes that aren't backwards compatible, we ought to be able to remove this piece of viewer code as well.
We've never attempted to limit the errors displayed in the default viewer to the current PDF file, but that's not really been a problem before. However after PR 7926, it's now possible to get password related error messages for *previously* opened PDF files in the default viewer.
**STR:**
1. Open a password protected PDF file, e.g. `issue6010_1.pdf` from the test-suite.
2. Cancel the password prompt.
3. Open any new PDF file in the viewer.
**AR:**
The error UI is displayed, with a `No password given` message.
**ER:**
No error displayed, since it's only relevent for a now closed PDF file.
This is obviously a minor issue, caused by us now rejecting the still pending `pdfLoadingTask` during the `PDFViewerApplication.close` call, but I don't think that it (generally) makes sense to show errors if they're not relevant to the *currently* displayed PDF file.
In order to move viewer related options from the global `PDFJS` object and into the initialization of the relevant components, we'll need to parse the hash parameters *before* calling `PDFViewerApplication._initializeViewerComponents`.
The only reason for adding this parameter in the first place, all the way back in PR 4074, was that the "maintain document position on zooming" feature was landed and backed out a couple of times before it finally stuck.
Hence it seemed, at the time, like a good idea to have a simple way to disable that behaviour. However, that was almost four years ago, and it's just not likely that we'd want/need to ever disable it now.
Furthermore I really cannot imagine why anyone would actually *want* to reset the position whenever zooming occurs, since it results in a quite annoying UX.
*So, to summarize:* Based on the above, I think that we should try to remove this parameter now. On the off chance that anyone complains, re-adding it shouldn't be difficult.
If the sidebar is resized such that the thumbnails are displayed in multiple columns, then scrolling the currently active thumbnail into view doesn't work correctly in some cases.
The reason is that the code in `PDFThumbnailViewer.scrollThumbnailIntoView` implicitly assumes that the thumbnails will be present in just *one* column. Since that may no longer be the case, it's not sufficient to simply check if the thumbnail is visible. Instead we must explicitly check that *all*, i.e. 100 percent, of the thumbnail is already visible, and otherwise scroll it into view.
By making use of modern CSS features, in this case [CSS variables](https://developer.mozilla.org/en-US/docs/Web/CSS/Using_CSS_variables), implementing sidebar resizing is actually quite simple. Not only will the amount of added code be fairly small, but it should also be easy to maintain since there's no need for complicated JavaScript hacks in order to update the CSS. Another benefit is that the JavaScript code doesn't need to make detailed assumptions about the exact structure of the HTML/CSS code.
Obviously this will not work in older browsers, such as IE, that lack support for CSS variables. In those cases sidebar resizing is simply disabled (via feature detection), and the resizing DOM element hidden, and the behaviour is thus *identical* to the current (fixed-width) sidebar.
However, considering the simplicity of the implementation, I really don't see why limiting this feature to "modern" browsers is a problem.
Finally, note that a few edge-cases meant that the patch is a bit larger than what the basic functionality would dictate. Among those is first of all proper RTL support, and secondly (automatic) resizing of the sidebar when the width of the *entire* viewer changes. Another, pre-existing, issue fixed here is the incomplete interface of `NullL10n`.
*Please note:* This patch has been successfully tested in both LTR and RTL viewer locales, in recent versions of Firefox and Chrome.
Fixes 2072.
https://eslint.org/docs/rules/no-var
Please note that two files were excluded:
1. `web/debugger.js`, since there's code in other files that currently depend on the global availability of code in `web/debugger.js`. Furthermore, since that file isn't used in production, doing a ES6 conversion probably isn't a priority.
2. `web/grab_to_pan.js`, since that file could be considered to be "external" code. We have made smaller changes to that file over the years, however doing a full ES6 `class` conversion might be a step too far!?
Nothing uses this option anymore, so setting it is a no-op now. We can
safely remove it.
Use `SKIP_BABEL` (instead of `PDFJS_NEXT`) now if you want to skip Babel
translation for a build.
If we want to (eventually) make it possible to resize the sidebar, then having its width indirectly affect the toolbar is going to wreck havoc on the media queries used to show/hide buttons in the main toolbar (since many of them depend on the toolbar state, and thus its width).
Updating all of the media queries dynamically with JavaScript seems like a non-starter, given that it'd cause *very* messy code. It thus seem to me that we'd need to fix the position of the sidebar, to have any hope of (in the short term) addressing issue 2072.
Hence, I'm suggesting that the we always layout the sidebar in a consistent vertical position, and only animate the `viewerContainer` rather than the entire `mainContainer`.
Fixes 4052.
Fixes bug 850591.
Some PDF files contain JavaScript actions that consist of nothing more that one, or possibly several, empty string(s). At least to me, printing a warning/showing the fallback seems completely unnecessary in that case.
Furthermore, this patch also makes use of an early `return`, so that we no longer will attempt to check for printing instructions when no JavaScript is present in the PDF file.
*Note:* It would perhaps make sense to change the API/core code, such that we ignore empty entries there instead. However, that would probably be considered a breaking changing with respect to backwards compatibility, hence this simple viewer only solution.
Fixes 5767.
I don't know if this is a regression, but I noticed earlier today that depending on the initial scale *and* sidebar state, the `annotationLayer` of the first rendered page may end up duplicated; please see screen-shot below.
[screen-shot]
I can reproduce this reliable with e.g. https://arxiv.org/pdf/1112.0542v1.pdf#zoom=page-width&pagemode=bookmarks.
When the document loads, rendering of the first page begins immediately. When the sidebar is then opened, that forces re-rendering which thus aborts rendering of the first page.
Note that calling `PDFPageView.draw()` will always, provided an `AnnotationLayerFactory` instance exists, call `AnnotationLayerBuilder.render()`. Hence the events described above will result in *two* such calls, where the actual annotation rendering/updating happens asynchronously.
For reasons that I don't (at all) understand, when multiple `pdfPage.getAnnotations()` promises are handled back-to-back (in `AnnotationLayerBuilder.render()`), the `this.div` property seems to not update in time for the subsequent calls.
This thus, at least in Firefox, result in double rendering of all annotations on the first page.
Obviously it'd be good to find out why it breaks, since it *really* shouldn't, but this patch at least provides a (hopefully) acceptable work-around by ignoring `getAnnotations()` calls for `AnnotationLayerBuilder` instances that we're destroying (in `PDFPageView.reset()`).
*It appears that this accidentally broken in PR 8775.*
Note that `PDFHistory.forward` is only used with certain named actions, and these aren't that commonly used, which ought to explain why this error managed to sneak in.
Steps to reproduce the issue (and verify the fix):
1. Navigate to e.g. http://mirrors.ctan.org/info/lshort/english/lshort.pdf
2. Click on a couple of links, or outline items, such that the history is populated with a few entries.
3. In the console, execute `PDFViewerApplication.pdfHistory.back()` one or more times, thus navigating back to a previous viewer position.
4. In the console, execute `PDFViewerApplication.pdfHistory.forward() one or more times.
At the last step above, no (forward) navigation happens with the current `master`; now compare with this patch.
The new `PDFSinglePageViewer` class extends the previously created abstract `BaseViewer` class.
There's *a lot* of existing functionality in `PDFViewer` that depends on all the pages being loaded and synchronously available, once the `setDocument` method has been called.
Given that initializing `PDFPageView` instances requires passing a DOM element to which the page is attached, the simplest solution I could come up with is to append all pages to a (hidden) document fragment and just swap them (one at a time) into the viewer when page switching occurs.
This patch introduces an abstract `BaseViewer` class, that the existing `PDFViewer` then extends. *Please note:* This lays the necessary foundation for the next patch.
Rather that registering a 'change' event listener on the `window`, which will thus (unnecessarily) fire in *a number* of other situations such as e.g. when the user changes the pageNumber or the current search term, we could/should just register it directly on the dynamically created `fileInput` DOM element instead.
I can see no really compelling reason why we actually need to listen for `file` changes on the `window` itself, and this way we're also able to keep the `fileInput` related code confined to one part of the code which should aid readability.
Furthermore, in custom deployments, there's less risk that we're going to interfere with "outside" code this way.
Finally, preprocessor guards were added to the `webViewerOpenFile` function, since that code doesn't make sense in e.g. the extension builds.
Rather than (basically) duplicating the `SimpleLinkService` in `test/driver.js`, with potential test failuires if you forget to update the test mock, it seems much nicer to just re-use the viewer component.
Note that `SimpleLinkService` is already bundled into the `build/components/pdf_viewer.js` file. Hence we only need to expose it similar to the other viewer components in that file, and make sure that the `gulp components` command runs as part of the test-setup.
reference tests
This patch allows us to use the common styles as used by the viewer as a
baseline for the annotation layer reference tests. They are extended
with a small set of overrides to ensure that all elements are visible
during the test.
The overrides file now only contains the absolutely necessary rules to
make all elements visible and is therefore no longer an almost verbatim
copy of the common styles.
This changes both `PDFViewer` and `PDFThumbnailViewer` to return early in the `pagesRotation` setters if the rotation doesn't change.
It also fixes an existing issue, in `PDFViewer`, that would cause errors if the rotation changes *before* the scale has been set to a non-default value.
Finally, in preparation for subsequent patches, it also refactors the rotation code in `web/app.js` to update the thumbnails and trigger rendering with the new `rotationchanging` event.
When testing the new `PDFHistory` implementation in practice, I felt that the current value of `UPDATE_VIEWAREA_TIMEOUT` is too large to be truly useful.
The purpose of the timeout is to attempt to address (the PDF.js part of) https://bugzilla.mozilla.org/show_bug.cgi?id=1153393, and it's currently fairly easy for the user e.g. close the browser before the timeout had a change to finish.
Obviously, the timeout is a best-effort solution, but with the current value of `UPDATE_VIEWAREA_TIMEOUT` it's not as useful as one would want.
Please note that lowering it shouldn't be a problem, since it still prevents the browser history from updating at *every* 'updateviewarea' event or during (quick) scrolling, which is all that's really needed to not impact the UX negatively.
---
Furthermore, with this lower timeout, we can also simplify the part of the 'popstate' event handler that attempted to update the browser history with the current position before moving back. In most cases, the current position will now already exist in the history, and this *greatly* decreases the complexity of this code path.
The main impetus for this change though, is that I unfortunately found that given the asynchronous nature of updating the browser history, there is some *edge* cases where that code could cause history corruption.
In practice, the user could thus get "stuck" at a particular history entry and not be able to move back. I haven't got any reliable STR for this, since it's so difficult to trigger, but it involved navigating around in a document such that a number of destinations are added to the browser history and then changing the rotation before going back/forward in the history.
Rather that attempting to patch this code, and making it even more difficult to understand than it already is or adding more asynchronous behaviour, by far the easiest solution is to remove it and simply rely on the (lowered) `UPDATE_VIEWAREA_TIMEOUT` instead.
Since e.g. zooming can occur when navigating to a new destionation, ensure that a resulting 'updateviewarea' event doesn't trigger adding of a *temporary* position to the browser history at a bad time.
By using the (heuristic) `POSITION_UPDATED_THRESHOLD` constant, we can ensure that the current document position will be added to the browser history when a sufficiently "large" number of `updateviewarea` events have been dispatched.
This patch attempts to address an issue in the old `PDFHistory` implementation, where the current position wouldn't be correctly saved when the browser was closed.
In theory this *should* already be working, however as the discussion in https://bugzilla.mozilla.org/show_bug.cgi?id=1153393 showed, it seems that both `pagehide` and `beforeunload` arrive to late to successfully update the history during closing.
Hence a timeout is used to *temporarily* add the current position to the browser history when the viewer is idle.
Note that we need to take care not to update the browser history too often, since that would render the viewer more or unusable. Furthermore, if the timeout is *too* long it may end up effectively disable this whole functionality.
The `UPDATE_VIEWAREA_TIMEOUT` constant is thus a heuristic value, which we may need to tweak taking the above into account.
This patch completely re-implements `PDFHistory` to get rid of various bugs currently present, and to hopefully make maintenance slightly easier. Most of the interface is similar to the existing one, but it should be somewhat simplified.
The new implementation should be more robust against failure, compared to the old one. Previously, it was too easy to end up in a state which basically caused the browser history to lock-up, preventing the user from navigating back/forward. (In the new implementation, the browser history should not be updated rather than breaking if things go wrong.)
Given that the code has to deal with various edge-cases, it's still not as simple as I would have liked, but it should now be somewhat easier to deal with.
The main source of complication in the code is actually that we allow the user to change the hash of a already loaded document (we'll no longer try to navigate back-and-forth in this case, since the next commit contains a workaround).
In the new code, there's also *a lot* more comments (perhaps too many?) to attempt to explain the logic. This is something that the old implementation was serverly lacking, which is a one of the reasons why it was so difficult to maintain.
One particular thing to note is that the new code uses the `pagehide` event rather than `beforeunload`, since the latter seems to be a bad idea based on https://bugzilla.mozilla.org/show_bug.cgi?id=1336763.
The current implementation of `PDFHistory` contains a number of smaller bugs, which are *very* difficult to address without breaking other parts of its code.
Possibly the main issue with the current implementation, is that I wrote it quite some time ago, and at the time my understanding of the various edge-cases the code has to deal with was quite limited.
Currently `PDFHistory` may, despite most of those cases being fixed, in certain edge-cases lock-up the browser history, essentially preventing the user from navigating back/forward.
Hence rather than trying to iterate on `PDFHistory` to make it better, the only viable approach is unfortunately rip it out in its entirety and re-write it from scratch.
*It appears that this accidentally broke with PR 8394.*
Currently, the following will be printed in the console:
```
An error occurred while loading the PDF.
[object Promise],[object Promise]
```
With this patch we'll again get proper output, e.g. something with this format:
```
An error occurred while loading the PDF.
PDF.js v? (build: ?)
Message: unknown encryption method
```
Since the very early days of the viewer, it's been possible to pass in a `scale` when opening a PDF file. However, most of the time it was/is actually being ignored, which limits its usefulness considerably.
In older versions of the viewer, if a document hash was present (i.e. `PDFViewerApplication.initialBookmark` being set) or if the document existed in the `ViewHistory`, the `scale` passed to `PDFViewerApplication.open` would thus always be ignored.
In addition to the above, in the current viewer there's even more cases where the `scale` parameter will be ignored: if a (valid) browser history entry exists on document load, or if the `defaultZoomValue` preference is set to a non-default value.
Hence the result is that in most situation, a `scale` passed to `PDFViewerApplication.open` will be completely ignored.
A much better, not to mention supported, way of setting the initial scale is by using the `defaultZoomLevel` preference. In comparision, this also has the advantage of being used in situations where the `scale` would be ignored.
All in all this leads to the current situation where we have code which is essentially dead, since no part of the viewer (by default) relies on it.
To clean up this code, and to avoid having to pass (basically) unused parameters around, I'd thus like to remove the ability to pass a `scale` to `PDFViewerApplication.open`.
The method signature was improved in PR 7440, which has now been present in a number of releases (starting with `v1.6.210`).
Hence we should be able to remove this now, and just print an error message if the old format is used.
This was added in PR 7793, which has now been present in a number of PDF.js releases (from version `v1.7.225`).
Hence we should be able to remove it now, considering that the migration code was only intended as a best effort solution to avoid wiping out all existing user data at once. Also, keep in mind that `ViewHistory` is already limited with regards to the number of documents it will simultaneous store data for.
As discussed in PR 8673, we cannot solve the general issue (since that would require parsing every single page). However, we can mitigate the effect somewhat, by waiting for the FileAttachment annotations of the initially rendered page.
Rather than duplicating initialization code, we can just call `this.reset()` instead (which is also similar to other existing code, e.g. `PDFViewer`). This will also help ensure that the DOM is completely reset, before any outline items or attachments are displayed.
At this point, the default viewer is already not usable in older browsers in `gulp server` mode, since we only run the code through Babel as part of the build step.
Hence there shouldn't be much point in manually loading `compatibility.js` in `viewer.html` the way that we've been doing, especially considering that it's already being loaded by `src/shared/util.js`.
Note that the PageMode, as specified in the API, will only be honoured when either: the user hasn't set the `sidebarViewOnLoad` preference to a non-default value, or a non-default `sidebarView` entry doesn't exist in the view history, or the "pagemode" hash parameter is included in the URL.
Since this is new functionality, the patch also includes a preference (`disablePageMode`), to make it easy to opt-out of this functionality if the user/implementor so wishes.
Add UI for the cursorToolOnLoad pref in the UI of the Chrome extension.
Add logic to migrate the enableHandToolOnLoad pref to cursorToolOnLoad.
For past values in the mutable extension storage area:
1. If enableHandToolOnLoad=true, save cursorToolOnLoad=1.
2. Remove enableHandToolOnLoad.
For the managed extension storage, which is immutable since it is based
on administrative policies, use the following logic:
1. If enableHandToolOnLoad=true and cursorToolOnLoad=0 (default).
set cursorToolOnLoad=0 and assume enableHandToolOnLoad=false.
2. As usual, managed preferences can (and will) be overridden by the user.
The first migration logic is in extensions/chromium/options/migration.js
and can be removed after a few months / less than many years.
The second migration logic is in web/chromecom.js, and should be kept
around for a long while (many years).
The need for this migration logic arises from the change by:
https://github.com/mozilla/pdf.js/pull/7635
* Check for undefined
new URL(file, window.location.href) throws the following error in IE11 + iPad Safari:
Unable to get property 'replace' of undefined or null reference
* Adapting previous change to pdf.js code standards
Added curly braces
* Moved check for undefined above try/catch
`__pdfjsdev_webpack__` was used to skip evaluating part of an AST,
in order to not mangle some `require` symbols.
This commit removes `__pdfjsdev_webpack__`, and:
- Uses `__non_webpack_require__` when one wants the output to
contain `require` instead of `__webpack_require__`.
- Adds options to the webpack config to prevent "polyfills" for
some Node.js-specific APIs to be added.
- Use `// eslint-disable-next-line no-undef` instead of `/* globals ... */`
for variables that are not meant to be used globally.
Since we no longer, after PR 8555, allow changing the scale until the document is loaded, that hack is no longer necessary. Furthermore, no part of that event handling function needs to run unless a document is loaded.
The reason that this hack was initially added, is that previously the `ViewHistory` might be updated *before* `PDFViewerApplication.setInitialView` had run (in some cases leading to incorrect inital document scale). Since that is no longer possible, this is now dead code.
Don't allow setting various properties, such as `currentPageNumber`/`currentScale`/`currentScaleValue`/`pagesRotation`, before `{PDFViewer, PDFThumbnailViewer}.setDocument` has been called
After PR 8394, where the l10n service was converted to be asynchronous, we're no longer calling `_adjustWidth` after updating the `findMsg` label. Hence it's currently possible that the width of the findbar won't be correct. The solution is simple though, just call `_adjustWidth` after the `findMsg` label has been (asynchronously) updated.
Another existing issue, which was an oversight in PR 8132, is that `PDFFindBar.updateResultsCount` may be called directly from `PDFFindController`. In that case, we're not calling `_adjustWidth` at all, which means that the findbar may also not have the correct width.
The simple solution here is to always call `_adjustWidth` at the end of `updateResultsCount` (which is why we no longer need the `_adjustWidth` call at the end of `updateUIState`).
Currently we're *only* hiding the label, but not actually resetting it until a new match is found.
Obviously it's being hidden, but it seems that it really ought to be completely reset as well (since e.g. `PDFFindBar.reset` won't technically reset *all* state otherwise).
Note that these files were among the first to be converted to ES6 classes, so it probably makes sense to do another pass to bring them inline with the most recent ES6 conversions.
These changes consists mainly of replacing `var` with `let`/`const`, adding a couple of default parameters to function signatures, and finally converting `EventBus`/`ProgressBar` to proper classes.
After PR 8510, we now always lookup the localized `page_scale_percent` string to prevent any possible ordering issues. Since the scaleSelect dropdown is updated asynchronous, there's really no point in having a helper function any more, hence this code can rather be placed inline in `Toolbar._updateUIState`.
*This is an existing issue that I noticed while testing PR 8552.*
When zooming or rotation occurs, we'll try to use the current canvas as a (CSS transformed) preview until the page has been completely re-drawn.
If you manage to change the scale (or rotation) *very* quickly, it's possible that `PDFPageView.update` can be called *before* a previous `render` operation has progressed far enough to remove the `hidden` property from the canvas.
The result is thus that a page may be *entirely* black during zooming or rotation, which doesn't look very good. This effect can be a bit difficult to spot, but it does manifest even in the default viewer.
Part of the rotation handling code, in what's now `web/app.js`, hasn't really changed since before the viewer was split into multiple files/components.
Similar to other properties, such as current page/scale, we should probably avoid tracking state in multiple places. Hence I'm suggesting that we don't store the rotation in `PDFViewerApplication`, and access the value in `PDFViewer` instead.
Since `PDFViewerApplication.pageRotation` has existed for a very long time, a getter was added to avoid outright breaking third-party code that may depend on it.
Currently a number of these properties do not work correctly if set *before* calling `setDocument`; please refer to the discussion starting in https://github.com/mozilla/pdf.js/pull/8539#issuecomment-309706629.
Rather than trying to have *some* of these methods working, but not others, it seems much more consistent to simply always require that `setDocument` has been called.
Since this call occurs *before* the `PDFViewer.setDocument` call, it won't actually cause any scale change.
Furthermore, moving it should not be necessary, since the `scale` is already used as the fallback case in `PDFViewerApplication.setInitialView` (provided it's non-zero, which isn't even the case in the default viewer).
Hence this patch should cause no functional changes at all, since it simply removes a piece of unnecessary code.
This patch adds Streams API support in getTextContent
so that we can stream data in chunks instead of fetching
whole data from worker thread to main thread. This patch
supports Streams API without changing the core functionality
of getTextContent.
Enqueue textContent directly at getTextContent in partialEvaluator.
Adds desiredSize and ready property in streamSink.
Currently, these properties are reset in what appears to be somewhat arbitrary locations (within the `load` and `open` methods respectively). The explanation is probably that both of these properties predates the existence of any centralized clean-up code in the viewer.
Hence I think that it makes sense to move the resetting of these properties to the `close` method, since that improves the overview of what's actually cleaned-up/reset when changing documents in the viewer.
This method is currently called from `PDFViewer._scrollUpdate` on *every* scroll event in the viewer.
However, I cannot see why this code is now necessary (assuming that it once was), since text-selection and searching still works *exactly* the same way with this patch as with the current `master`.
When `PDFPageView.updatePosition` is called, the page can be in either of these states:
1. The page hasn't been rendered, in which case the `textLayer` property doesn't exist yet.
2. The page is currently rendering, meaning that the `textLayer` property exists. Given that the `textContent` won't be fetched until the page has been successfully rendered, `TextLayerBuilder.render` will return immediately and effectively be a no-op (since there's nothing to render yet).
3. The has been been rendered, and the `textLayer` is currently rendering.
4. The page, and its `textLayer`, has been completely rendered. In this case, `TextLayerBuilder.render` will return immediately and effectively be a no-op.
Here, only the *third* case seem to require any further analysis:
When scrolling occurs while the `textLayer` is rendering, `TextLayerBuilder.render` will via a helper method call `TextLayerRenderTask.cancel` (in src/display/text_layer.js) to stop processing.
However, due to the run-to-completion nature of JavaScript, once `TextLayerRenderTask._render` has been invoked `appendText` will always run.[1]
So even though we cancel rendering of pending `textLayer`s during scrolling, via the repeated `TextLayerBuilder.render` calls from within the `PDFPageView.updatePosition` method, that does *not* prevent us from running the code inside of `TextLayerRenderTask._render` over and over for the *same* page; which all seems *very* inefficient to me.[2]
All this will thus have the effect of delaying the *actual* rendering of a `textLayer` ever so slightly while scrolling in the viewer. However, it does so at the expense of potentially hundreds of unnecessary `appendText` calls.[3]
Hence it seems to me that it's less resource intensive overall to simply let rendering of the `textLayer` complete, once it has started. Obviously, we still abort all rendering of a page, and its `textLayer`, when it's being destroyed (e.g. by being evicted from the page cache).
In case that there's any worry that the patch could affect e.g. highlighting of search results, please note that the existing code in `TextLayerBuilder.render` already calls `updateMatches` when the `TextLayerTask` resolves successfully.
*I'm sorry that this became quite long, but to try and summarize:*
`PDFPageView.updatePosition` doesn't actually do anything in *most* cases. In the one case where it matters, it seems that it's actually doing more harm than good; which is why I'm proposing that we just remove it.
---
[1] Although we may be able to skip the `render` call, provided that it happens *after* a `timeout` (as is the case in the default viewer).
[2] With current work being done to support streaming of `TextContent`, we'd also need to add just as many duplicate API calls to `PDFPageView.updatePosition`.
[3] The number of duplicate `appendText` calls is directly proportional not only to the scroll speed, but also to the number of pages in the document.
Since the localization service is now asynchronous, depending on the load the browser is under, there's a small risk that the lookup of the 'page_scale_percent' string could be delayed slightly.
If the scale would change a couple of times in *very* quick succession, there's perhaps a *theoretical* possibility that the Zoom dropdown would display an incorrect value.
Consider the following, somewhat contrived, theoretical example of two zoom commands being executed *right* after one another:
```javascript
PDFViewerApplication.pdfViewer.currentScale = 1.23;
PDFViewerApplication.pdfViewer.currentScaleValue = 'page-width';
```
Only the `currentScale` call will currently trigger a l10n lookup in `selectScaleOption`. However, as far as I understand, there's no *guarantee* that the l10n string is resolved *before* `selectScaleOption` is called again as a result of the `currentScaleValue` call.
This thus has the possibility of putting the Zoom dropdown into an inconsistent state, since it's currently updated synchronously for one code-path and asynchronously for another.
To avoid these issues, I'm proposing that we *always* update the Zoom dropdown asynchronously, such that we can guarantee that the ordering is correct.
After PR 8394, at least in Firefox, the Zoom dropdown now frequently displays an old custom scale instead of the correct one. To see this behaviour, the following STR works for me:
1. Open https://mozilla.github.io/pdf.js/web/viewer.html.
2. Zoom in, by clicking on the corresponding button in the toolbar.
3. Run `PDFViewerApplication.pdfViewer.currentScaleValue` in the console.
4. Compare what's displayed in the Zoom dropdown with what's printed in the console. (If no difference can be observed, try repeating steps 2 through 4 a couple of times.)
I really don't understand why this happens, but it seems that waiting until a custom scale has been set *before* selecting it fixes things in Firefox (and works fine in e.g. Chrome as well).
Note that this patch thus makes this particular piece of the code consistent with the state prior to PR 8394.
The click handler used in Presentation Mode didn't check if the first/last page was already reached, which after PR 7529 now causes an unnecessary console error.
Hence we should simply use the already existing `_goToPreviousPage`/`_goToNextPage` methods instead, since they do the necessary bounds checking.
Fixes 8498.
Moreover, rename `FindStates` to `FindState` since enumeration names are
usually not in plural, for readability and consistency with the ones in
`src/shared/util.js`.
This patch intends to simplify future ES6 refactoring of the thumbnail code, since the current code isn't going to work well together with proper `Class`es.
With the current way that the `HandTool` is implemented, if someone would try to also add a Zoom tool (as issue 1260 asks for) that probably wouldn't work very well given that you'd then have two cursor tools which may not play nice together.
Hence this patch, which attempts to refactor things so that it should be simpler to add e.g. a Zoom tool as well (given that that issue is marked as "good-beginner-bug", and I'm not sure if that really applies considering the current state of the code).
Note that I personally have no interest in implementing a Zoom tool (similar to Adobe Reader) since I wouldn't use it, but I figured that it can't hurt to make this code a bit more future proof.
PR 7341 added special handling for `nameddest`s that look like pageNumbers, to prevent issues since we previously *incorrectly* supported specifying a pageNumber directly in the hash; i.e. `#10` versus the correct `#page=10` format.
Since this behaviour wasn't correct, PR 7757 fixed and deprecated the old format, which means that we no longer need to maintain the `nameddest` hack in multiple files.
This patch replaces a `var self = this;` statement with arrow functions, and `var` with `let` in `PDFLinkService.navigateTo`.
Furthermore, when I started looking at this method, it quickly became clear that its code is somewhat of a mess. Since I'm one of the persons that have touched this code over the years, I figured that it'd be a good idea to try and clean it up a bit.
Also replaces `var` with `let` in code that's touched in the patch. Please note that this should be completely safe, for two separate reasons, since trying to access let in a scope where it's not defined is first of all a runtime error and second of all an ESLint error (thanks to the `no-undef` rule).
Currently this method first uses a loop to build a temporary array to hold Promises, which are then resolved from a recursive helper function once the textContent is fetched for each page.
To me, this is unncessarily complicated, since we can do everything within one loop by simply chaining the asynchronous calls to retrieve the textContent. (Note that this guarantees that the textContent of the pages is still fetched sequentially.)
Also replaces `var` with `let` in the functions/methods that are touched in the patch. Please note that this should be completely safe, for two separate reasons, since trying to access `let` in a scope where it's not defined is first of all a runtime error and second of all an ESLint error (thanks to the `no-undef` rule).
This patch contains the following improvements:
- Only fetch the various document properties *once* per PDF file opened, and cache the result (in a frozen object).
- Always update the *entire* dialog at once, to prevent inconsistent UI state (issue 8371).
- Ensure that the dialog, and all its internal properties, are reset when `PDFViewerApplication.close` is called.
- Inline, and re-factor, the `getProperties` method in `open`, since that's the only call-site.
- Always overwrite the fileSize with the value obtained from `pdfDocument.getDownloadInfo`, to ensure that it's correct.
- ES6-ify the code that's touched in this patch.
Fixes 8371.
Please see http://eslint.org/docs/rules/object-shorthand.
For the most part, these changes are of the search-and-replace kind, and the previously enabled `no-undef` rule should complement the tests in helping ensure that no stupid errors crept into to the patch.
Rather than having to manually use `initializedPromise`, which really ought to be a private property, to ensure that setting/getting values in the `ViewHistory` works as intended, this re-factoring simply changes all of its methods to be asynchronous.
Furthermore, a `getMultiple` method (mirroring the existing `setMultiple` one) is also added to `ViewHistory`.
Finally, this patch also addresses an existing issue, where certain preferences (e.g. the default zoom level) would be ignored when calling `setInitialView` if reading from the `ViewHistory` fails for some reason.
With the exception of just one test-case, all the current `ui_utils` unit-tests can run successfully on Node.js (since most of them doesn't rely on the DOM).
To get this working, I had to first of all add a new `LIB` build flag such that `gulp lib` produces a `web/pdfjs.js` file that is able to load `pdf.js` successfully.
Second of all, since neither `document` nor `navigator` is available in Node.js, `web/ui_utils.js` was adjusted slightly to avoid errors.
[Firefox addon] Remove the `window.FirefoxCom` hack from `web/viewer.js`, since it was made redunant by the `setExternalLocalizerServices` refactoring (PR 7202 follow-up)
This prevents issues with the filename detection being skipped, when trying to download the opened PDF attachment, since `getPDFFileNameFromURL` ignores `data:` URLs for performance reasons.
The patch also changes the `defaultFilename` to use the ES6 default parameter notation, and fixes the formatting of the JSDoc comment.
Finally, since `getPDFFileNameFromURL` currently has no unit-tests, a few basic ones are added to avoid regressions.
In the first commit in PR 8203, I changed how the `DownloadManager` was included/initialized in `GENERIC`/`CHROME` builds.
The change was prompted by the fact that you cannot have conditional `import`s with ES6 modules, and I wanted to avoid bundling the general `DownloadManager` into the various Firefox specific build targets.
What I completely missed though, is that the new code meant that `download_manager.js` will now be pulling in the *entire* viewer (through `app.js`).
This is a *really* stupid mistake on my part, since it causes the `dist/build/pdf_viewer.js` used with the viewer components to now include basically the entire default viewer.
The simplest solution that I could come up with, is to add a `genericcom.js` file (similar to the `firefoxcom.js`/`chromecom.js` files) which will be responsible for importing/initializing the `DownloadManager`.
The browsers have become smarter and made this hack no longer
functional. Since this is now enforced by practically all browsers,
there is nothing more we can do about this. It is up to the user to
serve the viewer over HTTPS or deal with the warning.
Note that this is in no way specific for PDF.js. Any site with password
inputs served over HTTP has this problem right now. This hack was
provided as a convenience for the users, but nothing more than that.
Since calling `PDFPageView.setPdfPage` will in turn call `PDFPageView.reset`, which cancels all rendering and completely resets the page, it's thus possible that we currently cause some unnecessary re-rendering during the initial loading phase of the viewer.
Depending on the order in which data arrives, it's possible (and in practice always seem to happen) that the `pdfPage` property of the *second* page has already been set during `PDFViewer.setDocument`, by the time that the request for the `pdfPage` is resolved in `PDFViewer._ensurePdfPageLoaded`.
Also, note how the `setPdfPage` call in `PDFViewer.setDocument` is already guarded by this kind of check.
In various viewer files, there's a number of cases where we basically duplicate the functionality of `createPromiseCapability` manually.
As far as I can tell, a couple of these cases have existed for a very long time, and notable even before the `createPromiseCapability` utility function existed.
Also, since we can write ES6 code now, the patch also replaces a couple of `bind` usages with arrow functions in code that's touched in the patch.
This patch implements support for line annotations. Other viewers only
show the popup annotation when hovering over the line, which may have
any orientation. To make this possible, we render an invisible line (SVG
element) over the line on the canvas that acts as the trigger for the
popup annotation. This invisible line has the same starting coordinates,
ending coordinates and width of the line on the canvas.