Note that in the PDF files provided by the reporter, this issue was limited to `Rect` arrays in AcroForm entries (which we currently don't support).
However, since a bad PDF generator could create this problem in *any* kind of annotation, the reduced test-case included here uses a simple LinkAnnotation instead.
Fixes 7115.
This patch adds a `getUnicodeForGlyph` helper function, which is used to recover Unicode values for non-standard glyph names.
Some PDF generators, e.g. Scribus PDF, use improper `uniXXXX` glyph names which breaks the glyph mapping. We can avoid this by converting them to "standard" glyph names instead.
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1132849.
Fixes 6893.
Fixes 6894.
In the PDF file in question, some of the 'name' table entries have `record.length === 0`. This becomes problematic in the non-unicode case, since `font.getBytes(0)` will fetch the *entire* stream.
Given that OTS rejects 'name' entries larger than `2^16`, this thus explain the sanitizer errors.
Fixes 7020.
Some bad PDF generators, in particular "Scribus PDF", duplicates resources *a lot* at various levels of the PDF files. This can lead to `PartialEvaluator_hasBlendModes` taking an unreasonable amount of time to complete.
The reason is that the current code is using `Dict_getAll`, which recursively dereferences *all* indirect objects, which can be really slow. This patch instead uses `Dict_getKeys`, and then manually looks up only the necessary indirect objects.
I've added the PDF file as a `load` test. The most important thing here is probably to ensure that the file remains available in the repo, and the comment should help reduced the chance of regressions. (Note that locally, the `load` test times out without this patch, but we cannot really assume that that always happens.)
Fixes 6961.
Re: issue 5089.
(Note that since there are other outline features that we currently don't support, e.g. bold/italic text and custom colours, I thus think we can keep the referenced issue open.)
As part of the link cleanup in issue 6854, obtaining this file through the Internet Archive didn't work.
However, given that the file was added in order to test an issue with `CropBox/MediaBox`, a reduced test-case should do just fine instead.
Please refer to issue 1155, and PR 1212.
It seems to be fairly common for OCR software to include incomplete TrueType fonts, notable missing the "glyf" table, in PDF files. Since we currently reject such fonts, the result is that text-selection/copying is broken.
This patch contains a suggested approach to try and use these kind of broken fonts, by using existing code in `sanitizeGlyphLocations` to replace a missing "glyf" table with dummy data.
Fixes 4684.
Fixes 6007.
Fixes 6829.
*This patch follows a similar idea as PR 5756.*
The patch is based on the nice debugging done by Brendan in the referenced issue 6782.
A better way to handle this, and similar issues, would probably be to completely ignore what the PDF file claims about font type/subtype, and just check the actual data. But until that kind of rewrite happens, this patch should help.
Fixes 6782.
Currently we're not applying Patterns for text, but only for graphics.
This patch is unfortunately not a complete solution, but rather a step on the way, since there are still some PDF files where the Patterns look more like a solid colour, rather than the intended gradient.
I've been unable to fix these issues completely, and I've not managed to determine if the remaining issues are caused either by the pattern code, the canvas code, or perhaps both.
However, given that even this simple patch improves the current situation quite a bit, I figured that it couldn't hurt to submit it as-is.
- Fixes 5804.
- Fixes 6130.
- Improves 3988 a lot, since the text is now visible. However, it looks like the text is *one* solid colour, instead of the correct gradient.
- Improves 5432, since the text is no longer gray. (This file also suffers from the same problem as the previous one.)
Most code for Popup annotations is already present for Text annotations.
This patch extracts the popup creation logic from the Text annotation
code so it can be reused for Popup annotations.
Not only does this add support for Popup annotations, the Text
annotation code is also considerably easier. If a `Popup` entry is
available for a Text annotation, it will not be more than an image. The
popup will be handled by the Popup annotation. However, it is also
possible for Text annotations to not have a separate Popup annotation,
in which case the Text annotation handles the popup creation itself.
In `Font_checkAndRepair` we can decide that a font isn't TrueType, and instead parse it as CFF. In that case it's quite possible that the `fontMatrix` will be changed, and without calling `adjustWidths` we're failing to update the glyph widths correctly.
Fixes 5027.
Fixes 5084.
Fixes 6556.
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1204903.
After PR 6590, `font.spaceWidth` is now called in more cases than before (in `PartialEvaluator_getTextContent`), which exposed an underlying issue with `IdentityToUnicodeMap_charCodeOf` throwing an error.
This breaks text-selection in some PDF files found in the wild, hence this patch replaces the `error` with an actual function instead (modelled after `IdentityCMap_charCodeOf`).
This patch goes a bit further than issue 6612 requires, and replaces all kinds of whitespace with standard spaces.
When testing this locally, it actually seemed to slightly improve two existing test-cases (`tracemonkey-text` and `taro-text`).
Fixes 6612.