In the PDF file in question, some of the 'name' table entries have `record.length === 0`. This becomes problematic in the non-unicode case, since `font.getBytes(0)` will fetch the *entire* stream.
Given that OTS rejects 'name' entries larger than `2^16`, this thus explain the sanitizer errors.
Fixes 7020.
Functionality wise, `ensureThumbnailVisible` is essentially just a shorthand for `scrollThumbnailIntoView`. (And note that `PDFViewer` doesn't implement a `ensurePageVisible` method.)
The only remaining usage of `PDFThumbnailViewer_ensureThumbnailVisible` is inside `PDFPresentationMode`, which introduces an otherwise unnecessary `PDFThumbnailViewer` dependency there.
We're already relying on the `presentationmodechanged` event, in various files, to track the state of Presentation Mode. Thus we can simply listen for that event in `PDFSidebar` too, and update the thumbnails if necessary.
The sidebar code has, except for minor fixes/additions (such as attachments), been largely untouch for years.
To avoid having a bunch of sidebar code sprinkled throughout viewer.js, this patch moves the sidebar code into a separate file (pdf_sidebar.js), similar to how most other functionality has been moved in the last few years.
Besides simply moving code around, this patch also has the added benefit that we now keep track of the sidebar state (not just opened/closed).
This now makes it possible to handle both `Preferences` *and* `ViewHistory` settings for the sidebar state in a cleaner way, preventing strange and confusing interactions between the two.
Changes `PDFOutlineView`/`PDFAttachmentView` to be initialized once, since we're always creating them, and refactor their `render` methods to instead pass in the `outline`/`attachments`.
For consistency with other "classes", the `PDFOutlineView`/`PDFAttachmentView` are renamed to `PDFOutlineViewer`/`PDFAttachmentViewer`.
Also, make sure that the outline/attachments are reset when the document is closed. Currently we keep the old ones around until the `getOutline`/`getAttachments` API calls are resolved for a new document.
Re: issue 7017.
This should prevent the error, but does not fix the broken rendering.
The rendering issues are caused by `svg.js` not supporting the various text rendering modes, in this case specifically "invisible" (as indicated in the console `Warning: Unimplemented method setTextRenderingMode`).
Compared to the canvas case, where we just ignore invisible text, a smarter solution is probably required for the SVG case. Since just ignoring invisible text in `svg.js` would mean that text-selection isn't possible.
To reduce code duplication, the initialization code now uses the `reset` method.
Also, this patch moves `charactersToNormalize` out of `PDFFindController`, since it seemed better suited to be a "constant".
Open http://www.puolustusvoimat.fi/wcm/61ba4180411e702ea19ee9e364705c96/luonnonmuonaohjelmalumo1985.pdf?MOD=AJPERES#pagemode=bookmarks.
Note how the outline looks entirely empty, but hovering over it you'll see that there are entries present. There's two separate issues here, first of all the fact that you cannot visually make out the outline items, and secondly that the lack of text means that the clickable area becomes too small.
In Adobe Reader this issue is somewhat mitigated, since they use an icon for every item. For PDF.js, the easist way to address this seems to be making use of a default title.
This issue should be *very* rare in practice, but given the simplicity of the solution I think that we should fix this.
The `hasImage` property is a left-over from older thumbnail code, and has been made obsolete by `renderingState`.
Having two different properties tracking (basically) the same state is asking for trouble, since it's very easy to forget to update one of them, with annoying bugs as the result.
This is required to be able to use it in the annotation display code,
where we now apply it to sanitize the filename of the FileAttachment
annotation. The PDF file from https://bugzilla.mozilla.org/show_bug.cgi?id=1230933 has shown that some PDF generators include the path of the file rather than the filename, which causes filenames with weird initial characters. PDF viewers handle this differently (for example Foxit Reader just replaces forward slashes with spaces), but we think it's better to only show the filename as intended.
Additionally we add unit tests for the `getFilenameFromUrl` helper
function.
*A more robust solution for issue 6066.*
As a temporary work-around for (the upstream) [bug 1164199](https://bugzilla.mozilla.org/show_bug.cgi?id=1164199), we parsed *all* images in the Firefox addon during a short time.
Doing so uncovered an issue with our image handling (see 6066), for JPEG images with a `DeviceGray` ColorSpace *and* `bpc !== 1` (bits per component).
As long as we let the browser handle image decoding in this case, this isn't going to be an issue, but I do think that we should proactively fix this to avoid future issues if we change where the images are decoded (in `jpg.js` vs in browser).
Also, we currently don't seem to have a test-case for that kind of image data.
Currently the `C` entry in an outline item is returned as is, which is neither particularly useful nor what the API documentation claims.
This patch also adds unit-tests for both the color handling, and the `F` entry (bold/italic flags).
`Dict_getAll` is problematic for a number of reasons. First of all, as issue 6961 shows, it can be really bad for performance, since it dereferences all indirect objects.
Second of all, all the derefencing can lead to data being unncessarily requested when ranged/chunked loading is used, thus unnecessarily delaying rendering.
Note: For cases where `Dict_getAll` was previously used, `Dict_getKeys` in combination with `Dict_get` can be used instead. This has the advantage that data isn't requested until it's actually needed.