Some arabic chars like \ufe94 could be searched in a pdf, hence it must be normalized
when creating the search query. So to avoid to duplicate the normalization code,
everything is moved in the find controller.
The previous code to normalize text was using NFKC but with a hardcoded map, hence it
has been replaced by the use of normalize("NFKC") (it helps to reduce the bundle size
by 30kb).
In playing with this \ufe94 char, I noticed that the bidi algorithm wasn't taking into
account some RTL unicode ranges, the generated font wasn't embedding the mapping this
char and the unicode ranges in the OS/2 table weren't up-to-date.
When normalized some chars can be replaced by several ones and it induced to have
some extra chars in the text layer. To avoid any regression, when copying some text
from the text layer, a copied string is normalized (NFKC) before being put in the
clipboard (it works like this in either Acrobat or Chrome).
After the previous patch we now have only *a single* `PRODUCTION` occurrence in the entire code-base, more specifically in the `web/viewer.html` file.
This special build-target can be replaced with any condition that always evaluate to `false`, such as e.g. a comment.
*Please note:* This patch might be considered too hacky, hence I completely understand if it's rejected.
This *special* build-target is very old, and was introduced with the first pre-processor that only uses comments to enable/disable code.
When the new pre-processor was added `PRODUCTION` effectively became redundant, at least in JavaScript code, since `typeof PDFJSDev === "undefined"` checks now do the same thing.
This patch proposes that we remove `PRODUCTION` from the JavaScript code, since that simplifies the conditions and thus improves readability in many cases.
*Please note:* There's not, nor has there ever been, any gulp-task that set `PRODUCTION = false` during building.
To make this functionality work out-of-the-box in custom implementations, see e.g. the "viewer components" examples, it'd be slightly easier if we dynamically create/insert the "hiddenCopyElement" in the `PDFViewer` constructor.
Given that the "copy all text" feature still appears to work just as before with this patch, hopefully I'm not overlooking any reason why doing this would be a bad idea.
I was playing with the new "copy all text" feature, and stumbled upon one document where the copied text was truncated; see http://mirrors.ctan.org/info/lshort/english/lshort.pdf
The problem turns out to be that on [page 83](https://ftp.acc.umu.se/mirror/CTAN/info/lshort/english/lshort.pdf#page=83) the textLayer contains `\u0000` and apparently copying just stops when a null char is encountered.
To fix this we can simply use an existing helper function, and with this patch we're able to successfully copy all the text in that document.
*Please note:* This patch only extends the `PDFFindController` implementation itself to support this functionality, however it's *purposely* not exposed in the default viewer.
This replaces the previous `phraseSearch`-parameter, and a `query`-string will now always be interpreted as a phrase-search.
To enable searching for individual words, the `query`-parameter must instead consist of an Array of strings. This way it's now also possible to combine phrase/word searches, with a `query`-parameter looking something like `["Lorem ipsum", "foo", "bar"]` which will search for the phrase "Lorem ipsum" *and* the words "foo" respectively "bar".
By getting the width/height of the first page initially, we can slightly reduce the amount of code needed both in the `hasEqualPageSizes`-check and when building the print-styles.
For the moment there is no real consensus on how we should download a pdf on Android.
Hence we keep this solution for the moment but behind a pref (which will be true on
nightly only).
Currently we repeat the same code in lots of places, to update the "toggled" class and "aria-checked" attribute, when various toolbar buttons are clicked.
For the MOZCENTRAL build-target this patch reduces the size of the *built* `web/viewer.js` file by just over `1.2` kilo-bytes.
Currently `float: inline-start/inline-end` is only supported in Firefox, see https://developer.mozilla.org/en-US/docs/Web/CSS/float#browser_compatibility, and in order to support other browsers we're thus forced to jump through some hoops.
This leads to slightly less nice code in the *built-in* Firefox PDF Viewer, and this patch attempts to improve the current situation:
- Use Stylelint to forbid direct use of `float: inline-start/inline-end` in the CSS files, to prevent future bugs in the general PDF.js viewer.
- Do a build-time replacement, only in MOZCENTRAL builds, to replace the CSS-variables with raw `float: inline-start/inline-end` instances.
This effectively implements some of the changes from https://phabricator.services.mozilla.com/D170496, but using our existing "direction aware" CSS-variable to limit the amount of code changes needed.
With the changes in PR 16153 we're no longer setting a `<base href>` in the Firefox PDF Viewer, hence it shouldn't be necessary to keep setting a `baseUrl` in the `PDFLinkService`-class.
Given that the original document URL is now kept, the browser itself will handle relative URLs and we can thus slightly reduce the amount of string parsing required when handling various links in the viewer.
Currently if you e.g. enable the `useOnlyCssZoom` option rendering may no longer finish as intended. To reproduce:
- Enable the `useOnlyCssZoom` option.
- Load https://github.com/mozilla/pdf.js/files/1522715/wuppertal_2012.pdf (in the development viewer).
- When rendering starts, *immediately* change the zoom-level.
In this case the document will never finish rendering, since the `postponeDrawing`-functionality will (here incorrectly) abort rendering and with CSS-only zooming rendering is only expected to happen once per page.
To fix this we'll simply ignore any `drawingDelay` when CSS-only zooming is used (regardless if it's triggered via the option or the zoom-level being very large).
Currently the `zoomLayer` isn't rotated correctly in all cases. To reproduce:
- Load https://github.com/mozilla/pdf.js/files/1522715/wuppertal_2012.pdf
- Let the document render.
- Rotate the document *four* times, such that the original rotation is restored.
The easiest solution, as far as I can tell, is that we always set the `transform` just as we did (for years) prior to the changes in PR 15812.
- Reduce a little bit of duplication by enforcing the max/min scale-values once, at the end, in the `increaseScale`/`decreaseScale` methods.
- Convert the "private" `PDFViewer` scale-related methods into actually private ones, now that JavaScript supports that.
Given that the viewer always set the `dir`-attribute, to either LTR or RTL, we should be able to use this logical CSS property to (very slightly) reduce the size of the CSS; please see https://developer.mozilla.org/en-US/docs/Web/CSS/inset-block
The signatures of these methods were changed in PR 15886, which has now been included in a couple of releases, hence it should hopefully be OK to remove the fallback code-paths now.
Also, the methods are updated slightly to be explicit about what options are supported and we'll no longer pass along any arbitrary options to the "private" methods.
Some of these pre-processor statements are *many* years old, and could thus do with some clean-up. Note that the pre-processor originally didn't support else-if statements, and by using those the code becomes a bit less verbose.
The idea is to apply an overall filter on each page: the main advantage
is to have some filtered images which could help to make them visible for
some users.
The tag <base> is used to resolve relative URIs within the document.
Newly added SVG filters use a relative URI which then use the URI in base
but this one mismatches with the document URI and consequently filters are
not found in the Firefox viewer.
So this patch just removes <base> and replace few relative URLs by absolute
ones.
In order to help to identify a link, we add a border around it with the LinkText color.
And backdrop colors are inverted when the mouse pointer hovers them, this way it should
help to identify the link where the pointer is.
Given that the debugging hash-parameters will only be used when the `pdfBugEnabled` option is manually set[1], we can skip a *tiny* bit of asynchronicity for "regular" users.
---
[1] Note that it's enabled by default in the development viewer, i.e. in `gulp server` mode.
In the mac case we don't want to care about the scaleFactor threshold
because else if too big another move could start and then subsequent
events aren't considered as wheel events.
It isn't really ideal and at some point we'll need to find a way at
least for the Firefox case to get the real events instead of the fake
wheel ones.
In looking at a profile, I noticed in Marker chart that there's an animation
for loading-icon.gif even if this icon isn't visible.
This patch doesn't completely remove it but just slightly postpones it.
This further extends the web-specific import maps introduced in PR 16009, to allow removing *most* of the build-time `require` statements from the viewer. The few remaining ones are fallbacks used for the COMPONENTS respectively the `legacy` GENERIC builds.
After the compatibility updates in PR 15968 it's no longer strictly necessary to build the `viewer.css` file in order for the *development viewer* to work in Chromium-based browsers.
*Please note:* Given that Chromium-based browsers still don't support the *unprefixed* `mask-image` property the icons won't look right, however the development viewer itself works.
Given that Firefox is the *primary* development target, and that running `gulp generic` locally will generate polyfilled CSS, it seems reasonable to make this simplification here.
Currently there's no toolbar in the GV-viewer, hence invoking the pageLabels functionality isn't meaningful and just leads to unnecessary parsing on both the main- and worker-threads. (And if a toolbar is added at some point, it's not clear to me if we'd want to support pageLabels in the GV-viewer anyway.)
Currently we have a couple of pre-processor checks, specifically for the GV-viewer, spread throughout the code. This works fine when *building* the viewer, however they're obviously ignored in development mode (i.e. `gulp server`).
This leads to a situation where the GV development viewer, i.e. http://localhost:8888/web/viewer-geckoview.html, behaves subtly different from its built version. This could easily lead to bugs, hence this patch introduces a development mode constant to hopefully improve things here.
Finally, in a follow-up to PR 15842, also ignores the `pageMode`-state since there's no sidebar available.
Currently there's no UI for this functionality in the GV-viewer, however we still call the API methods. This potentially leads to a bunch of worker-thread parsing, for PDF documents with these features, despite the result being completely unused.
Given that mobile devices are usually more resource constrained than desktop/laptop computers, not to mentioned battery life, we can avoid doing work that'll just be ignored anyway.
The `DownloadManager.openOrDownloadData` method is written for the default-viewer specifically, assuming a viewer able to handle e.g. URL search/hash parameters. In the viewer components there's obviously no such functionality, and we should thus trigger downloading of PDF attachments directly instead.
Given that the GV-viewer isn't using most of the UI-related components of the default-viewer, we can avoid including them in the *built* viewer to save space.[1]
The least "invasive" way of implementing this, at least that I could come up with, is to leverage import maps with suitable stubs for the GV-viewer.
The one slightly annoying thing is that we now have larger import maps across multiple html-files, and you'll need to remember to update all of them when making future changes.
---
[1] With this patch, the built `viewer.js` size is 391 kB and `viewer-geckoview.js` is 285 kB.
This is very old code, where we loop through the user-provided options and build an internal parameter object. To prevent errors we also need to ensure that the parameters are correct/valid, which is especially important for the ones that are sent to the worker-thread such that structured cloning won't fail.[1]
Over the years this has led to more and more code being added in `getDocument` to validate the user-provided options, and at this point *most* of them have at least basic validation. However the way that this is implemented feels slightly backwards, since we first build the internal parameter object and only *afterwards* validate those parameters.[2]
Hence this patch changes the `getDocument` function to instead check/validate the supported options upfront, and then *explicitly* build the internal parameter object with only the needed properties.
---
[1] Note the supported types at https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm#supported_types
[2] The internal parameter object may also, because of the loop, end up with lots of unnecessary properties since anything that the user provides is being copied.
These functions invoke the `PDFViewer.currentPageNumber` setter, which already checks that a `pdfDocument` is currently active. Also, given that they're event handlers for the First/Last-page buttons (in the SecondaryToolbar) they can't be invoked before the viewer has been fully initalized.
The default value of the `--scale-select-width` CSS variable has been choosen such that it should be large enough for most locales. This means that in many locales we don't even update the CSS variable at all, and for those locales where we do the update happens *one time* early during the viewer initialization (i.e. before the PDF document has loaded).
*Please note:* Compared to other recent PRs, the effect of these changes ought to be really tiny and are mostly done to promote better coding patterns.
The reasons for making this change are:
- There's no UI available to toggle the cursor-tools in the GeckoView-specific viewer.
- The `HandTool`-implementation basically *simulates* touch scrolling, and is thus unlikely to be helpful/useful anyway.
- PR 15831 already changed the relevant call-sites to handle `PDFViewerApplication.pdfCursorTools` being undefined.
Some of the code in this method is *very* old, and we could thus modernize it a little bit by removing a couple of the loops used to build the `getDocument` argument.
These `@media` rules were most likely just copy-pasted from the regular viewer, however none of them are currently necessary since the GeckoView-specific viewer doesn't have any toolbars.
Note that the whole purpose of these CSS rules are to make the toolbar, of the regular viewer, responsive. If we in the future add toolbars for the GeckoView-specific viewer, these rules most likely wouldn't be usable as-is anyway.
This option was added specifically for third-party users, but has never been used in the PDF.js project itself. Furthermore there's no preference that can be used to enable it, and you need to provide the `removePageBorders` option when initializing a `PDFViewer`-instance.
This patch thus get rid of a little bit more unused code in the Firefox PDF Viewer.
*Unfortunately I missed this during testing/reviewing of PR 15992.*
With the changes in PR 15992 we're now only adding the `loadingIcon`-class when rendering is actually `RUNNING`, in order to improve overall performance.
However when resetting the page, i.e. the `INITIAL` state, we also need to remove the `loadingIcon` completely. Without this patch if you scroll through a document where the pages don't load instantaneously, see e.g. issue 2504, we'll leave the `loadingIcon`-class attached to pages that have had their rendering cancelled *and* also been evicted from the `PDFPageViewBuffer`-instance.
This way we don't have a lot of useless divs and we let the css engine handle the
creation/destruction of the :after pseudo-element.
It'll help to slightly improve performance when zooming.
The only parameter that we actually need here is the `PDFDataRangeTransport`-instance, since the others are not necessary.
- The `url` parameter, as passed to the `getDocument` function in the API, is simply being ignored; see 2d87a2eb1c/src/display/api.js (L447-L458)
- The `length` parameter, as passed to the `getDocument` function in the API, is always being overwritten; see 2d87a2eb1c/src/display/api.js (L519-L525)
Until PR 12563 is deemed safe to land, I'd still like to be able to use worker-modules in the viewer during local development.
Hence this patch which *temporarily* adds a new `workerModules` hash-parameter, only available in non-PRODUCTION mode, that allows using worker-modules in the development viewer.
To enable this functionality, simply use http://localhost:8888/web/viewer.html#workerModules=true
If this method was added today, I really can't imagine that we'd support anything *except* objects. Unfortunately we cannot just remove this now, since the code has existed since "forever", however we can deprecate this and limit it to only the GENERIC build.
Furthermore, we can avoid a redundant `PDFViewerApplication.setTitleUsingUrl` call in the Firefox PDF Viewer since the title has already been set previously in that case.
This function is only used in PresentationMode these days, but we can still improve it a little bit:
- Use the existing web-platform `deltaMode` constants, rather than defining our own constants for those values.
- Access the `deltaMode` first, before the `delta{X, Y}` properties, to avoid being affected by bug 1392460 (similar to the default viewer).
It seems nicer overall, since we're exporting the `ProgressBar` in the viewer-components, to move this functionality into the `ProgressBar`-class itself rather than handling it "manually" in the default-viewer.
At the beginning of a search we can an update can be triggered with 0 over 0
found matches.
In the GeckoView context, we can't update the finder whenever we want but only
when it has been required.
Rather than adding `@media (forced-colors: active) { ... }`-blocks throughout the CSS code, we should utilize CSS variables instead as in our other CSS files.
When a css variable is update in a node then all the children under this
node are updated.
In order to avoid to update all the UI when a page is rescaling, this
patch moves the --scale-factor from the :root to the viewer container.
After the changes in PR 15812 we'll now *intermittently* display completely black canvases during zooming. To reproduce this, try switching to wrapped-scrolling and zoom in/out very quickly using either the mouse-wheel or pinching.
This patch removes the recently introduced `transferPdfData` API-option, and simply enables transferring of TypedArray data *by default* instead of copying it. This will help reduce main-thread memory usage, however it will take ownership of the TypedArrays. Currently this only applies to the following cases:
- TypedArrays passed to the `getDocument`-function in the API, in order to open PDF documents from binary data.
- TypedArrays passed to a `PDFDataRangeTransport`-instance, used to support custom PDF document fetching/loading (see e.g. the Firefox PDF Viewer).
*PLEASE NOTE:* To avoid being affected by this, please simply *copy* any TypedArray data before passing it to either of the functions/methods mentioned above.
Now that we transfer TypedArray data that we previously only copied, we need to be more careful with input validation. Given how the `{IPDFStreamReader, IPDFStreamRangeReader}.read` methods will always return ArrayBuffer data, which is then transferred to the worker-thread[1], the actual TypedArray data passed to the API thus need to have the same exact size as its underlying ArrayBuffer to prevent issues.
Hence we'll check for this and only allow transferring of *safe* TypedArray data, and fallback to simply copying the data just as before. This obviously shouldn't be an issue in the Firefox PDF Viewer, but for the general PDF.js library we need to be more careful here.
---
[1] See e09ad99973/src/display/api.js (L2492-L2506) respectively e09ad99973/src/display/api.js (L2578-L2590)
- Scale factor is rounded to only scale by integer percent, hence the unused
ticks are accumulated (like we already do for zoom with the mouse wheel).
- Use the same thing for the pinch-to-zoom on a touchscreen: it led to slightly
refactor the code because it happened to ignore a not so small scale which
led to a not so smooth zooming.
Also, removes the `initialData`-parameter JSDocs for the `getDocument`-function given that this parameter has been completely unused since PR 8982 (over five years ago). Note that the `initialData`-parameter is, and always was, intended to be provided when initializing a `PDFDataRangeTransport`-instance.
After the changes in PR 15850, the `background-color` of the sidebar is now unnecessarily dark in the light-theme. Hence, we can simply remove this CSS rule to improve things overall (and these changes don't affect the dark-theme much at all).
This is even an overall consistency improvement, given the existing `--sidebar-narrow-bg-color` values.
In most of the cases, showing the loading icon is useless because
it's for a very short time, consequently it doesn't bring any useful
information for the user.
After a delay (400ms), the icon is shown in order to inform the user
that the viewer isn't stuck but it's doing something.
In GeckoView, on an event, a callback must be executed with the result of an action,
but the callback can be used only one time.
So for each FindInPage event, we must trigger only one matches count update.