The built-in PDF Viewer (in Firefox) cannot use the browser findbar when PDF files are embedded in e.g. iframe/object tags, and the PDF.js findbar (i.e. `PDFFindBar`) will thus be used instead in those cases.
This is slightly problematic, since the `MOZCENTRAL` version of the viewer uses a special, slimmed down, version of the `l10n.js` file that doesn't (currently) support plural forms. To prevent the matchesCounter from breaking completely in this edge-case, temporarily hard-code the plural form to use the default `[other]` version of the locale strings.
This prevents the findbar from intermittently displaying `0 of {number} matches`, which *could* theoretically happen for large and/or slow loading documents.
The find controller should only coordinate finding a string in the
document and should not be responsible for presenting the matches to the
user. The text layer builder already contains the logic to render the
matches in the viewer, so it should also take care of scrolling the
selected match into view.
Currently there's only a single spot in the code-base where `JpegImage.getData` is called, however it nonetheless seem like a good idea to ensure during tests that the `isSourcePDF` parameter is correctly set. (Especially considering that the PDF use-cases will break without it.)
Additionally, in `JpegImage._getLinearizedBlockData`, the code can be made a tiny bit more efficient by checking the value of `isSourcePDF` *first* to avoid useless checks (for the default PDF use-cases).
The find controller already has quite a lot of state to maintain. We can
avoid keeping track of this member variable because when the find
controller is reset, so is the extract text promises array. Therefore,
we can just check if that array contains items or not to determine if
text extraction already started.
Moreover, there is no need to reset the `pageContents` array since the
`reset` method already takes care of that.
Implement unit tests for the `isSameOrigin` and `createValidAbsoluteUrl` utility functions and use the `const` keyword for constants in `src/shared/util.js`
This is an unfortunate oversight on my part, which I stumbled upon when (locally) testing the `mozilla-central` follow-up patch necessary to enable the matches counter in the built-in PDF viewer.
The property is intended to contain a reference to a DOM element, which not only is nowhere to be found *now* but appears to never have existed in the first place.
As outlined in https://bugzilla.mozilla.org/show_bug.cgi?id=1282759 the internal Firefox name for the feature is `entireWord`, hence that name is used here as well for consistency (with "Whole words" being limited to the UI).
Given existing limitations of the PDF.js search functionality, e.g. the existing problems of searching across "new lines", there's some edge-cases where "Whole words" searching will ignore (valid) results.
However, considering that this is a pre-existing issue related to the way that the find controller joins text-content together, that shouldn't have to block this new feature in my opionion.
*Please note:* In order to enable this feature in the `MOZCENTRAL` version, a small follow-up patch for [PdfjsChromeUtils.jsm](https://hg.mozilla.org/mozilla-central/file/tip/browser/extensions/pdfjs/content/PdfjsChromeUtils.jsm) will be required once this has landed in `mozilla-central`.
For the `PDFFindBar` implementation, similar to the native Firefox findbar, the matches count displayed is now limited to a (hopefully) reasonable value.
*Please note:* In order to enable this feature in the `MOZCENTRAL` version, a follow-up patch will be required once this has landed in `mozilla-central`.
There have been lots of problems with trying to map glyphs to their unicode
values. It's more reliable to just use the private use areas so the browser's
font renderer doesn't mess with the glyphs.
Using the private use area for all glyphs did highlight other issues that this
patch also had to fix:
* small private use area - Previously, only the BMP private use area was used
which can't map many glyphs. Now, the (much bigger) PUP 16 area can also be
used.
* glyph zero not shown - Browsers will not use the glyph from a font if it is
glyph id = 0. This issue was less prevalent when we mapped to unicode values
since the fallback font would be used. However, when using the private use
area, the glyph would not be drawn at all. This is illustrated in one of the
current test cases (issue #8234) where there's an "ä" glyph at position
zero. The PDF looked like it rendered correctly, but it was actually not
using the glyph from the font. To properly show the first glyph it is always
duplicated and appended to the glyphs and the maps are adjusted.
* supplementary characters - The private use area PUP 16 is 4 bytes, so
String.fromCodePoint must be used where we previously used
String.fromCharCode. This is actually an issue that should have been fixed
regardless of this patch.
* charset - Freetype fails to load fonts when the charset size doesn't match
number of glyphs in the font. We now write out a fake charset with the
correct length. This also brought up the issue that glyphs with seac/endchar
should only ever write a standard charset, but we now write a custom one.
To get around this the seac analysis is permanently enabled so those glyphs
are instead always drawn as two glyphs.