2011-10-07 00:01:22 +09:00
|
|
|
*.pdf
|
2013-05-31 06:54:49 +09:00
|
|
|
*.error
|
2011-10-07 00:01:22 +09:00
|
|
|
|
Fallback gracefully when encountering corrupt PDF files with empty /MediaBox and /CropBox entries
This is based on a real-world PDF file I encountered very recently[1], although I'm currently unable to recall where I saw it.
Note that different PDF viewers handle these sorts of errors differently, with Adobe Reader outright failing to render the attached PDF file whereas PDFium mostly handles it "correctly".
The patch makes the following notable changes:
- Refactor the `cropBox` and `mediaBox` getters, on the `Page`, to reduce unnecessary duplication. (This will also help in the future, if support for extracting additional page bounding boxes is added to the API.)
- Ensure that the page bounding boxes, i.e. `cropBox` and `mediaBox`, are never empty to prevent issues/weirdness in the viewer.
- Ensure that the `view` getter on the `Page` will never return an empty intersection of the `cropBox` and `mediaBox`.
- Add an *optional* parameter to `Util.intersect`, to allow checking that the computed intersection isn't actually empty.
- Change `Util.intersect` to have consistent return types, since Arrays are of type `Object` and falling back to returning a `Boolean` thus seems strange.
---
[1] In that case I believe that only the `cropBox` was empty, but it seemed like a good idea to attempt to fix a bunch of related cases all at once.
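The bounding-box handling described above could be sketched roughly as follows. This is only a minimal illustration, not the actual PDF.js code: `intersectRects`, `nonEmptyBox`, `computeView`, and the `FALLBACK_BOX` value are all hypothetical names and assumptions made for this example.

```javascript
// Hypothetical helpers for illustration; not the actual PDF.js code.
// A rectangle is [x1, y1, x2, y2].

// Assumed default used when a page box is empty (US Letter, in points).
const FALLBACK_BOX = [0, 0, 612, 792];

// Returns the overlap of two rectangles as an array, or `null` when there
// is none, so the return type is consistently "Array or null".
// With `rejectEmpty` set, a zero-width or zero-height overlap also gives `null`.
function intersectRects(rectA, rectB, rejectEmpty = false) {
  const xLow = Math.max(Math.min(rectA[0], rectA[2]), Math.min(rectB[0], rectB[2]));
  const xHigh = Math.min(Math.max(rectA[0], rectA[2]), Math.max(rectB[0], rectB[2]));
  const yLow = Math.max(Math.min(rectA[1], rectA[3]), Math.min(rectB[1], rectB[3]));
  const yHigh = Math.min(Math.max(rectA[1], rectA[3]), Math.max(rectB[1], rectB[3]));
  if (xLow > xHigh || yLow > yHigh) {
    return null; // The rectangles do not touch at all.
  }
  if (rejectEmpty && (xLow === xHigh || yLow === yHigh)) {
    return null; // The overlap exists but has zero area.
  }
  return [xLow, yLow, xHigh, yHigh];
}

// Replaces an empty (zero-area) box with the assumed fallback.
function nonEmptyBox(box) {
  const isEmpty = box[0] === box[2] || box[1] === box[3];
  return isEmpty ? FALLBACK_BOX : box;
}

// The visible area: the intersection of crop box and media box, or the
// media box itself if that intersection would be empty.
function computeView(mediaBox, cropBox) {
  const media = nonEmptyBox(mediaBox);
  const crop = nonEmptyBox(cropBox);
  return intersectRects(crop, media, true) || media;
}
```

With these assumptions, `computeView([0, 0, 612, 792], [0, 0, 0, 0])` yields the full media box rather than an empty view.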
2019-08-08 22:54:46 +09:00
|
|
|
!boundingBox_invalid.pdf
|
2011-10-07 00:01:22 +09:00
|
|
|
!tracemonkey.pdf
|
2015-11-08 21:18:23 +09:00
|
|
|
!TrueType_without_cmap.pdf
|
2015-03-09 22:36:45 +09:00
|
|
|
!franz.pdf
|
2015-09-28 22:09:24 +09:00
|
|
|
!franz_2.pdf
|
2021-01-12 23:21:19 +09:00
|
|
|
!fraction-highlight.pdf
|
2015-11-16 21:15:36 +09:00
|
|
|
!german-umlaut-r.pdf
|
2021-04-21 02:21:52 +09:00
|
|
|
!issue13269.pdf
|
2015-10-01 21:46:03 +09:00
|
|
|
!xref_command_missing.pdf
|
2016-01-16 23:51:10 +09:00
|
|
|
!issue1155r.pdf
|
Add a couple more, mostly `text`, reference tests for non-embedded symbolic fonts without included encoding information
I've started to look into how we can fix issue 7580, but quickly became worried that fixing it could easily mean that we'd trade one fixed PDF file for a multitude of broken ones.
Hence I started going through the history of the code that chooses the fallback encoding, and noticed that it has been changed a number of times over the years to deal with various cases of weirdness/errors in non-embedded fonts.
To my relief it turned out that almost all the PRs, please see a possibly incomplete [list here], that changed this code actually included `eq` test-cases.
However, in one case it appears that a PR failed to add a test-case. Furthermore, since the fallback encoding may also be the only source for creating a `toUnicode` map, changing the encoding could possibly regress only the text-selection despite a PDF file still rendering correctly.
Therefore, this PR adds one new `eq` test, and also a number of additional `text` tests for PDF files already present in the test-suite.
Note that it's obviously possible that there's a certain overlap between the added tests, but I'd be *a whole lot* more concerned with causing regressions.
2016-09-10 20:10:15 +09:00
|
|
|
!issue2017r.pdf
|
2021-08-31 04:51:59 +09:00
|
|
|
!bug1727053.pdf
|
2021-05-24 02:03:53 +09:00
|
|
|
!issue11913.pdf
|
2013-01-11 11:00:44 +09:00
|
|
|
!issue2391-1.pdf
|
2013-01-12 04:04:56 +09:00
|
|
|
!issue2391-2.pdf
|
2021-09-25 00:30:56 +09:00
|
|
|
!issue14046.pdf
|
2021-10-20 12:14:48 +09:00
|
|
|
!issue7891_bc1.pdf
|
2022-05-06 18:11:38 +09:00
|
|
|
!issue14881.pdf
|
2015-12-22 20:59:23 +09:00
|
|
|
!issue3214.pdf
|
2015-11-07 21:04:03 +09:00
|
|
|
!issue4665.pdf
|
2023-01-16 20:56:43 +09:00
|
|
|
!checkbox-bad-appearance.pdf
|
2016-01-09 19:50:48 +09:00
|
|
|
!issue4684.pdf
|
2019-05-24 09:45:25 +09:00
|
|
|
!issue8092.pdf
|
2016-06-27 20:51:11 +09:00
|
|
|
!issue5256.pdf
|
2015-03-07 19:39:10 +09:00
|
|
|
!issue5801.pdf
|
2015-12-28 22:10:30 +09:00
|
|
|
!issue5946.pdf
|
2015-05-10 18:28:15 +09:00
|
|
|
!issue5972.pdf
|
2015-04-30 19:40:54 +09:00
|
|
|
!issue5874.pdf
|
2016-06-01 06:01:35 +09:00
|
|
|
!issue5808.pdf
|
2019-11-10 22:37:42 +09:00
|
|
|
!issue6179_reduced.pdf
|
2016-03-29 22:46:21 +09:00
|
|
|
!issue6204.pdf
|
[api-minor] Always allow e.g. rendering to continue even if there are errors, and add a `stopAtErrors` parameter to `getDocument` to opt-out of this behaviour (issue 6342, issue 3795, bug 1130815)
Other PDF readers, e.g. Adobe Reader and PDFium (in Chrome), will attempt to render as much of a page as possible even if there are errors present.
Currently we just bail as soon as the first error is hit, which means that we'll usually not render anything in these cases and just display a blank page instead.
NOTE: This patch changes the default behaviour of the PDF.js API to always attempt to recover as much data as possible, even when encountering errors during e.g. `getOperatorList`/`getTextContent`, which thus improves our handling of corrupt PDF files and allows the default viewer to handle errors slightly more gracefully.
In the event that an API consumer wishes to use the old behaviour, where we stop parsing as soon as an error is encountered, the `stopAtErrors` parameter can be set at `getDocument`.
Fixes, inasmuch as it's possible since the PDF files are corrupt, e.g. issue 6342, issue 3795, and [bug 1130815](https://bugzilla.mozilla.org/show_bug.cgi?id=1130815) (and probably others too).
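To illustrate the difference between the two behaviours, here is a much simplified sketch. `buildOperatorList` is a hypothetical stand-in and not the real PDF.js evaluator; only the `stopAtErrors` option name is taken from the description above, and skipping just the failing operation is an assumption made to keep the example short.

```javascript
// Hypothetical stand-in for building an operator list; each "operation" is
// a function that either returns a result or throws on corrupt data.
function buildOperatorList(operations, { stopAtErrors = false } = {}) {
  const rendered = [];
  const warnings = [];
  for (const operation of operations) {
    try {
      rendered.push(operation());
    } catch (reason) {
      if (stopAtErrors) {
        // Old behaviour: bail out as soon as the first error is hit.
        throw reason;
      }
      // New default: remember the problem and keep whatever can be recovered.
      warnings.push(reason.message);
    }
  }
  return { rendered, warnings };
}

// A page with one corrupt operation in the middle.
const sampleOperations = [
  () => "drawText",
  () => {
    throw new Error("corrupt image stream");
  },
  () => "drawPath",
];
```

Calling `buildOperatorList(sampleOperations)` keeps the two valid operations and records one warning, whereas passing `{ stopAtErrors: true }` rethrows the first error.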
2017-02-19 22:03:08 +09:00
|
|
|
!issue6342.pdf
|
2017-04-11 03:58:02 +09:00
|
|
|
!issue6652.pdf
|
2016-01-06 10:07:21 +09:00
|
|
|
!issue6782.pdf
|
For embedded Type1 fonts without included `ToUnicode`/`Encoding` data, attempt to improve text selection by using the `builtInEncoding` to amend the `toUnicode` map (issue 6901, issue 7182, issue 7217, bug 917796, bug 1242142)
Note that in order to prevent any possible issues, this patch does *not* try to amend the `toUnicode` data for Type1 fonts that contain either `ToUnicode` or `Encoding` entries in the font dictionary.
Fixes, or at least improves, issues/bugs such as e.g. 6658, 6901, 7182, 7217, bug 917796, bug 1242142.
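The general idea can be sketched as below. This is a simplified illustration under stated assumptions: `amendToUnicode`, the `fontDict` flags, and the tiny `GLYPH_TO_UNICODE` table are hypothetical, and letting the built-in encoding overwrite existing entries is an assumption of this example rather than something the description above specifies.

```javascript
// Hypothetical, tiny glyph-name table; a real one has thousands of entries.
const GLYPH_TO_UNICODE = new Map([
  ["A", 0x41],
  ["B", 0x42],
  ["fi", 0xfb01],
]);

// `toUnicode` is a Map from character code to string.
// `builtInEncoding` is an object mapping character codes to glyph names,
// as found inside the embedded Type1 font program.
function amendToUnicode(toUnicode, builtInEncoding, fontDict) {
  // Leave fonts that carry their own `ToUnicode` or `Encoding` data untouched.
  if (fontDict.hasToUnicode || fontDict.hasEncoding) {
    return toUnicode;
  }
  const amended = new Map(toUnicode);
  for (const [charCode, glyphName] of Object.entries(builtInEncoding)) {
    const codePoint = GLYPH_TO_UNICODE.get(glyphName);
    if (codePoint !== undefined) {
      amended.set(Number(charCode), String.fromCodePoint(codePoint));
    }
  }
  return amended;
}
```

Glyph names that the table does not know are simply skipped, so the map is only ever extended with entries that can actually be resolved.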
2016-08-18 01:33:06 +09:00
|
|
|
!issue6901.pdf
|
2016-02-10 01:09:17 +09:00
|
|
|
!issue6961.pdf
|
2016-05-15 05:13:12 +09:00
|
|
|
!issue6962.pdf
|
2016-03-02 05:39:33 +09:00
|
|
|
!issue7020.pdf
|
2016-03-23 21:52:30 +09:00
|
|
|
!issue7101.pdf
|
2016-03-26 22:41:15 +09:00
|
|
|
!issue7115.pdf
|
2016-06-08 05:40:06 +09:00
|
|
|
!issue7180.pdf
|
2017-10-19 06:10:40 +09:00
|
|
|
!issue7769.pdf
|
2016-04-15 21:22:36 +09:00
|
|
|
!issue7200.pdf
|
2016-04-21 22:10:40 +09:00
|
|
|
!issue7229.pdf
|
2016-06-13 21:22:15 +09:00
|
|
|
!issue7403.pdf
|
2016-07-23 20:04:27 +09:00
|
|
|
!issue7406.pdf
|
2016-07-17 23:41:53 +09:00
|
|
|
!issue7426.pdf
|
2016-06-25 19:41:26 +09:00
|
|
|
!issue7439.pdf
|
2021-05-11 09:43:37 +09:00
|
|
|
!issue7847_radial.pdf
|
2022-10-04 00:55:13 +09:00
|
|
|
!issue8844.pdf
|
2022-05-24 19:12:53 +09:00
|
|
|
!issue14953.pdf
|
2022-08-31 01:40:27 +09:00
|
|
|
!issue15367.pdf
|
2022-09-01 00:50:28 +09:00
|
|
|
!issue15372.pdf
|
2016-07-24 21:32:48 +09:00
|
|
|
!issue7446.pdf
|
Catch errors and continue parsing in `parseCMap` (issue 7492)
After PR 7039, the PDF file in issue 7492 no longer renders at all, but note that text selection wasn't working correctly previously.
The problem with the PDF file in issue 7492 is that the `cMap`, in the `toUnicode` entry in the font, contains an invalid name:
```
/CMapName /-usr-share-fonts-truetype-Panton-Panton Family-Fontfabric - Panton.otf,000-UTF16 def
```
When we parse that line, things obviously break because there are spaces present in the wrong places.
To avoid that issue, the patch simply lets `parseCMap` continue when errors are encountered, to try and recover usable data. Note that by not aborting immediately when an error is encountered, we are also able to fix the text selection.
Obviously, it could be argued that we should just immediately reject a corrupt `cMap`. But given that they usually are correct, it seems that trying to recover as much data as possible from a corrupt one can only be a good thing for both glyph mapping and text selection.
Fixes 7492.
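A minimal sketch of the "catch errors and continue" approach follows. It does not reproduce the real `parseCMap`; `parseEntry` and `parseEntriesLeniently` are hypothetical helpers that only understand simplified `<source> <destination>` hex pairs.

```javascript
// Parses one simplified entry of the form "<01> <0041>" and throws on
// anything else (hypothetical format, for illustration only).
function parseEntry(line) {
  const match = /^<([0-9a-fA-F]+)>\s+<([0-9a-fA-F]+)>$/.exec(line.trim());
  if (!match) {
    throw new Error(`Malformed CMap entry: ${line}`);
  }
  const charCode = parseInt(match[1], 16);
  const text = String.fromCodePoint(parseInt(match[2], 16));
  return [charCode, text];
}

// Keeps going after a bad line, so that the valid entries are still usable.
function parseEntriesLeniently(lines) {
  const mapping = new Map();
  const warnings = [];
  for (const line of lines) {
    try {
      const [charCode, text] = parseEntry(line);
      mapping.set(charCode, text);
    } catch (error) {
      warnings.push(error.message);
    }
  }
  return { mapping, warnings };
}
```

Given the lines `"<01> <0041>"`, an invalid name line, and `"<02> <0042>"`, both valid mappings survive and a single warning is recorded.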
2016-07-18 23:01:02 +09:00
|
|
|
!issue7492.pdf
|
2016-08-16 19:23:53 +09:00
|
|
|
!issue7544.pdf
|
2017-09-26 09:24:21 +09:00
|
|
|
!issue7507.pdf
|
2020-08-17 15:49:19 +09:00
|
|
|
!issue6931_reduced.pdf
|
2022-04-27 17:34:31 +09:00
|
|
|
!issue14847.pdf
|
2020-12-24 02:57:44 +09:00
|
|
|
!doc_actions.pdf
|
Fallback to the `StandardEncoding` for Nonsymbolic fonts without `/Encoding` entry (issue 7580)
Even though this patch passes all tests (unit/font/reference) locally, including the new ones that I added in PR 7621, I'm still a bit nervous about modifying the code that chooses the fallback encoding for fonts without an `/Encoding` entry.
Note that over the years this code has been changed on a number of occasions, see a possibly incomplete [list here], to deal with various cases of incorrect font data.
According to the PDF specification, see http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G8.1904184, it seems that we should fall back to the `StandardEncoding` for Nonsymbolic fonts.
There's obviously a risk that fixing this particular issue *could* break other PDF files for which we don't have tests. However I've tried to change the logic as little as possible in this patch, to hopefully reduce possible breakage.
Based on debugging numerous font issues, it seems that a lot of fonts actually set the Symbolic flag, even when they are in fact *not* Symbolic. Fonts actually marked as Nonsymbolic seem to be somewhat less common, which I hope should reduce the risk of the patch somewhat.
Fixes 7580.
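The decision could be sketched like this. The flag values follow the font descriptor flags in the PDF specification (Symbolic is bit 3, Nonsymbolic is bit 6); `chooseBaseEncoding` and the `"FontBuiltIn"` placeholder are hypothetical and only stand in for the considerably more involved real logic.

```javascript
// Font descriptor flag bits, as defined by the PDF specification.
const FontFlags = {
  Symbolic: 1 << 2, // bit 3
  Nonsymbolic: 1 << 5, // bit 6
};

// Picks the base encoding name for a simple font.
// `explicitEncoding` is the name from the `/Encoding` entry, or `null`.
function chooseBaseEncoding(flags, explicitEncoding) {
  if (explicitEncoding !== null) {
    return explicitEncoding;
  }
  const isSymbolic = (flags & FontFlags.Symbolic) !== 0;
  const isNonsymbolic = (flags & FontFlags.Nonsymbolic) !== 0;
  if (isNonsymbolic && !isSymbolic) {
    // No `/Encoding` entry and a Nonsymbolic font: use StandardEncoding.
    return "StandardEncoding";
  }
  // Hypothetical placeholder for "whatever the font program itself provides".
  return "FontBuiltIn";
}
```

Fonts that set the Symbolic flag, correctly or not, are deliberately left on the pre-existing path in this sketch, which mirrors the intent of changing the logic as little as possible.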
2016-09-13 20:43:23 +09:00
|
|
|
!issue7580.pdf
|
2016-09-06 16:56:18 +09:00
|
|
|
!issue7598.pdf
|
2020-12-18 21:17:23 +09:00
|
|
|
!issue12750.pdf
|
2016-09-23 21:11:08 +09:00
|
|
|
!issue7665.pdf
|
2017-07-24 16:51:40 +09:00
|
|
|
!issue7696.pdf
|
2016-11-22 21:48:06 +09:00
|
|
|
!issue7835.pdf
|
2020-05-22 21:07:28 +09:00
|
|
|
!issue11922_reduced.pdf
|
2016-11-30 02:28:32 +09:00
|
|
|
!issue7855.pdf
|
2020-07-15 07:17:27 +09:00
|
|
|
!issue11144_reduced.pdf
|
2016-12-06 18:21:42 +09:00
|
|
|
!issue7872.pdf
|
2016-12-21 00:42:15 +09:00
|
|
|
!issue7901.pdf
|
2017-02-12 03:11:52 +09:00
|
|
|
!issue8061.pdf
|
2021-07-22 04:27:39 +09:00
|
|
|
!bug1721218_reduced.pdf
|
2017-02-23 21:28:50 +09:00
|
|
|
!issue8088.pdf
|
2017-03-03 20:22:55 +09:00
|
|
|
!issue8125.pdf
|
Build a fallback `ToUnicode` map for simple fonts (issue 8229)
In some fonts, the included `ToUnicode` data is incomplete, causing text-selection to not work properly. For simple fonts that contain encoding data, we can manually build a `ToUnicode` map to attempt to improve things.
Please note that since we're currently using the `ToUnicode` data during glyph mapping, in an attempt to avoid rendering regressions, I purposely didn't want to amend the original `ToUnicode` data for this text-selection edge-case.
Instead, I opted for the current solution, which will (hopefully) give slightly better text-extraction results in PDF files with incomplete `ToUnicode` data.
According to the PDF specification, see [section 9.10.2](http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G8.1873172):
> A conforming reader can use these methods, in the priority given, to map a character code to a Unicode value.
> ...
Reading that paragraph literally, it doesn't seem too unreasonable to use *different* methods for different charcodes.
Fixes 8229.
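Using different methods for different charcodes might look roughly as follows. This is a sketch under assumptions: `buildFallbackToUnicode`, `lookupUnicode`, and the small `GLYPH_UNICODE_TABLE` are hypothetical, and the original `ToUnicode` map is only read, never modified.

```javascript
// Hypothetical, tiny glyph-name table used to build the fallback map.
const GLYPH_UNICODE_TABLE = new Map([
  ["A", "A"],
  ["B", "B"],
  ["space", " "],
]);

// `encoding` is a sparse array: index = character code, value = glyph name.
function buildFallbackToUnicode(encoding) {
  const fallback = new Map();
  encoding.forEach((glyphName, charCode) => {
    if (GLYPH_UNICODE_TABLE.has(glyphName)) {
      fallback.set(charCode, GLYPH_UNICODE_TABLE.get(glyphName));
    }
  });
  return fallback;
}

// Per character code: prefer the font's own `ToUnicode` data, and only
// consult the fallback map when that data has no entry.
function lookupUnicode(charCode, toUnicode, fallbackToUnicode) {
  if (toUnicode.has(charCode)) {
    return toUnicode.get(charCode);
  }
  if (fallbackToUnicode.has(charCode)) {
    return fallbackToUnicode.get(charCode);
  }
  return "";
}
```

Because the lookup order is decided per character code, a complete `ToUnicode` map never sees the fallback at all, which keeps the glyph mapping path unaffected.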
2017-11-26 21:29:43 +09:00
|
|
|
!issue8229.pdf
|
2019-01-29 22:24:48 +09:00
|
|
|
!issue8276_reduced.pdf
|
2017-05-25 00:36:39 +09:00
|
|
|
!issue8372.pdf
|
2021-09-09 09:31:10 +09:00
|
|
|
!issue9713.pdf
|
2021-06-25 21:31:55 +09:00
|
|
|
!xfa_filled_imm1344e.pdf
|
2017-05-23 23:08:02 +09:00
|
|
|
!issue8424.pdf
|
2017-06-10 01:51:31 +09:00
|
|
|
!issue8480.pdf
|
2020-08-08 05:04:53 +09:00
|
|
|
!bug1650302_reduced.pdf
|
2017-06-29 02:34:36 +09:00
|
|
|
!issue8570.pdf
|
2017-07-25 18:57:38 +09:00
|
|
|
!issue8697.pdf
|
2017-09-17 20:35:18 +09:00
|
|
|
!issue8702.pdf
|
2021-04-01 07:07:02 +09:00
|
|
|
!structure_simple.pdf
|
2021-01-07 19:25:09 +09:00
|
|
|
!issue12823.pdf
|
2017-08-04 14:19:36 +09:00
|
|
|
!issue8707.pdf
|
2017-08-25 00:32:53 +09:00
|
|
|
!issue8798r.pdf
|
2017-08-26 19:09:49 +09:00
|
|
|
!issue8823.pdf
|
2017-10-31 21:01:29 +09:00
|
|
|
!issue9084.pdf
|
2021-02-06 20:23:35 +09:00
|
|
|
!issue12963.pdf
|
2017-12-09 00:37:12 +09:00
|
|
|
!issue9105_reduced.pdf
|
2022-10-19 19:28:25 +09:00
|
|
|
!issue9105_other.pdf
|
2018-06-19 16:37:56 +09:00
|
|
|
!issue9252.pdf
|
2017-12-15 08:23:56 +09:00
|
|
|
!issue9262_reduced.pdf
|
2017-12-25 01:25:43 +09:00
|
|
|
!issue9291.pdf
|
2018-06-19 18:31:31 +09:00
|
|
|
!issue9418.pdf
|
2018-02-08 04:53:44 +09:00
|
|
|
!issue9458.pdf
|
2019-09-30 06:50:58 +09:00
|
|
|
!issue9655_reduced.pdf
|
2018-07-27 23:57:58 +09:00
|
|
|
!issue9915_reduced.pdf
|
2021-04-10 23:53:17 +09:00
|
|
|
!bug854315.pdf
|
2018-08-03 02:16:42 +09:00
|
|
|
!issue9940.pdf
|
2019-01-13 04:31:23 +09:00
|
|
|
!issue10388_reduced.pdf
|
2019-01-11 01:49:33 +09:00
|
|
|
!issue10438_reduced.pdf
|
2019-02-09 15:53:16 +09:00
|
|
|
!issue10529.pdf
|
2019-04-22 00:03:38 +09:00
|
|
|
!issue10542_reduced.pdf
|
2019-03-27 08:25:34 +09:00
|
|
|
!issue10665_reduced.pdf
|
2019-07-30 23:48:27 +09:00
|
|
|
!issue11016_reduced.pdf
|
2022-09-27 22:19:57 +09:00
|
|
|
!issue15516_reduced.pdf
|
2019-08-05 21:40:48 +09:00
|
|
|
!issue11045.pdf
|
2020-08-22 07:25:07 +09:00
|
|
|
!bug1057544.pdf
|
2019-09-18 16:44:18 +09:00
|
|
|
!issue11150_reduced.pdf
|
2021-01-15 19:56:20 +09:00
|
|
|
!issue6127.pdf
|
2021-10-20 12:14:48 +09:00
|
|
|
!issue7891_bc0.pdf
|
2019-10-15 21:06:54 +09:00
|
|
|
!issue11242_reduced.pdf
|
2023-03-21 20:24:21 +09:00
|
|
|
!issue16176.pdf
|
2019-10-26 18:57:56 +09:00
|
|
|
!issue11279.pdf
|
2020-01-26 00:53:34 +09:00
|
|
|
!issue11362.pdf
|
2021-05-11 09:43:37 +09:00
|
|
|
!issue13325_reduced.pdf
|
2020-02-09 03:51:16 +09:00
|
|
|
!issue11578_reduced.pdf
|
2020-03-02 23:34:00 +09:00
|
|
|
!issue11651.pdf
|
Attempt to cache repeated images at the document, rather than the page, level (issue 11878)
Currently image resources, as opposed to e.g. font resources, are handled exclusively on a page-specific basis. Generally speaking this makes sense, since pages are separate from each other, however there are PDF documents where many (or even all) pages actually reference exactly the same image resources (through the XRef table). Hence, in some cases, we're decoding the *same* images over and over for every page which is obviously slow and wasting both CPU and memory resources better used elsewhere.[1]
Obviously we cannot simply treat all image resources as-if they're used throughout the entire PDF document, since that would end up increasing memory usage too much.[2]
However, by introducing a `GlobalImageCache` in the worker we can track image resources that appear on more than one page. Hence we can switch image resources from being page-specific to being document-specific, once the image resource has been seen on more than a certain number of pages.
In many cases, such as e.g. the referenced issue, this patch will thus lead to reduced memory usage for image resources. Scrolling through all pages of the document, there are now only a few main-thread copies of the same image data, as opposed to one for each rendered page (i.e. there could theoretically be *twenty* copies of the image data).
While this obviously benefits both CPU and memory usage in this case, for *very* large image data this patch *may* possibly increase persistent main-thread memory usage a tiny bit. Thus to avoid negatively affecting memory usage too much in general, particularly on the main-thread, the `GlobalImageCache` will *only* cache a certain number of image resources at the document level and otherwise simply fall back to the default behaviour.
Unfortunately the asynchronous nature of the code, with ranged/streamed loading of data, actually makes all of this much more complicated than if all data could be assumed to be immediately available.[3]
*Please note:* The patch will lead to *small* movement in some existing test-cases, since we're now using the built-in PDF.js JPEG decoder more. This was done in order to simplify the overall implementation, especially on the main-thread, by limiting it to only the `OPS.paintImageXObject` operator.
---
[1] There are e.g. PDF documents that use the same image as background on all pages.
[2] Given that data stored in the `commonObjs`, on the main-thread, is only cleared manually through `PDFDocumentProxy.cleanup`, as opposed to data stored in the `objs` of each page, which is automatically removed when the page is cleaned-up e.g. by being evicted from the cache in the default viewer.
[3] If the latter case were true, we could simply check for repeat images *before* parsing started and thus avoid handling *any* duplicate image resources.
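The tracking idea can be sketched as below. `RepeatedImageTracker` is a hypothetical, synchronous simplification and not the actual `GlobalImageCache`; the default threshold and limit values are assumptions chosen for the example, and the asynchronous loading concerns mentioned above are ignored entirely.

```javascript
// Hypothetical, simplified take on tracking images that repeat across pages.
class RepeatedImageTracker {
  constructor({ pageThreshold = 2, maxCachedImages = 10 } = {}) {
    this.pageThreshold = pageThreshold; // Pages needed before caching globally.
    this.maxCachedImages = maxCachedImages; // Upper bound on cached images.
    this.pagesByRef = new Map(); // Image reference -> Set of page indexes.
    this.cachedData = new Map(); // Image reference -> decoded image data.
  }

  // Records that `ref` is used on `pageIndex`, and reports whether the image
  // now qualifies for document-level (rather than page-level) caching.
  recordUse(ref, pageIndex) {
    let pages = this.pagesByRef.get(ref);
    if (!pages) {
      pages = new Set();
      this.pagesByRef.set(ref, pages);
    }
    pages.add(pageIndex);
    return this.#qualifies(ref);
  }

  // Stores decoded data at the document level, if the image qualifies.
  store(ref, data) {
    if (!this.#qualifies(ref)) {
      return false; // Fall back to the default, page-specific, handling.
    }
    this.cachedData.set(ref, data);
    return true;
  }

  lookup(ref) {
    return this.cachedData.has(ref) ? this.cachedData.get(ref) : null;
  }

  #qualifies(ref) {
    const pages = this.pagesByRef.get(ref);
    const seenOften = pages !== undefined && pages.size >= this.pageThreshold;
    const hasRoom =
      this.cachedData.has(ref) || this.cachedData.size < this.maxCachedImages;
    return seenOften && hasRoom;
  }
}
```

An image seen twice on the same page does not count as repeated, since the page indexes are kept in a `Set`; once the limit is reached, further images simply stay page-specific.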
2020-05-18 21:17:56 +09:00
|
|
|
!issue11878.pdf
|
2021-08-21 06:17:53 +09:00
|
|
|
!issue13916.pdf
|
2021-09-17 18:29:58 +09:00
|
|
|
!issue14023.pdf
|
2022-01-11 21:43:16 +09:00
|
|
|
!issue14438.pdf
|
2022-07-05 21:08:53 +09:00
|
|
|
!issue14999_reduced.pdf
|
2016-11-04 03:48:08 +09:00
|
|
|
!bad-PageLabels.pdf
|
2017-12-29 22:39:29 +09:00
|
|
|
!decodeACSuccessive.pdf
|
2021-09-30 05:07:07 +09:00
|
|
|
!issue13003.pdf
|
2013-02-07 08:19:29 +09:00
|
|
|
!filled-background.pdf
|
2011-10-07 00:01:22 +09:00
|
|
|
!ArabicCIDTrueType.pdf
|
|
|
|
!ThuluthFeatures.pdf
|
|
|
|
!arial_unicode_ab_cidfont.pdf
|
|
|
|
!arial_unicode_en_cidfont.pdf
|
|
|
|
!asciihexdecode.pdf
|
2016-10-01 19:05:07 +09:00
|
|
|
!bug766086.pdf
|
2015-12-26 05:57:08 +09:00
|
|
|
!bug793632.pdf
|
2022-04-22 18:40:13 +09:00
|
|
|
!issue14821.pdf
|
2015-10-05 06:47:03 +09:00
|
|
|
!bug1020858.pdf
|
2020-09-20 00:47:38 +09:00
|
|
|
!prefilled_f1040.pdf
|
2015-04-02 23:26:14 +09:00
|
|
|
!bug1050040.pdf
|
2015-09-01 20:31:02 +09:00
|
|
|
!bug1200096.pdf
|
2016-05-08 01:23:47 +09:00
|
|
|
!bug1068432.pdf
|
2021-01-04 22:25:30 +09:00
|
|
|
!issue12295.pdf
|
2016-05-27 00:34:00 +09:00
|
|
|
!bug1146106.pdf
|
2021-05-27 17:06:13 +09:00
|
|
|
!issue13447.pdf
|
2019-02-22 00:25:34 +09:00
|
|
|
!bug1245391_reduced.pdf
|
2016-09-17 21:36:42 +09:00
|
|
|
!bug1252420.pdf
|
2019-03-08 20:55:44 +09:00
|
|
|
!bug1513120_reduced.pdf
|
2020-08-08 03:46:41 +09:00
|
|
|
!bug1538111.pdf
|
2019-06-01 03:44:24 +09:00
|
|
|
!bug1552113.pdf
|
2021-05-14 20:38:26 +09:00
|
|
|
!issue6132.pdf
|
2018-08-04 08:47:45 +09:00
|
|
|
!issue9949.pdf
|
2016-10-08 03:51:02 +09:00
|
|
|
!bug1308536.pdf
|
Always choose a (3, 1) cmap table for TrueType fonts that have an encoding specified, regardless of the Symbolic font flag (bug 1337429)
This patch basically reverts one aspect of TrueType (3, 1) cmap parsing to the state prior to PR 4259. After that PR, a number of regressions occurred in this particular code-path, which necessitated a number of follow-ups such as PRs 5703, 5743, and 6425.
The empirical data suggests, at least to me, that we should always prefer a (3, 1) cmap for TrueType fonts when they have an encoding, regardless of the Symbolic font flag.
Obviously this patch passes all unit/font/reference tests locally, and I made sure that all the PRs mentioned above landed with test-cases included.
However, in my opinion, there's still a very real possibility that this patch could potentially cause new regressions.
Given that the PDF file in bug 1337429 has been broken for almost *three* years before anyone noticed, and considering that the code-path in question has been the source of numerous regressions, I do *not* intend to request uplift of this patch to previous Firefox versions (assuming that it's even accepted).
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1337429.
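In sketch form, the selection could look like this. The platform/encoding pairs are the standard TrueType ones ((3, 1) is Windows Unicode BMP, (3, 0) is Windows Symbol, (1, 0) is Macintosh Roman), but `chooseCmapTable` and the fallback order after the (3, 1) check are assumptions of this example and not the actual PDF.js code.

```javascript
// Each table is described as { platformId, encodingId }.
function chooseCmapTable(tables, { hasEncoding, isSymbolic }) {
  const find = (platformId, encodingId) =>
    tables.find(
      table => table.platformId === platformId && table.encodingId === encodingId
    );

  if (hasEncoding) {
    // Always prefer (3, 1) when the font has an encoding specified,
    // regardless of the Symbolic font flag.
    const unicodeTable = find(3, 1);
    if (unicodeTable) {
      return unicodeTable;
    }
  }
  if (isSymbolic) {
    const symbolTable = find(3, 0);
    if (symbolTable) {
      return symbolTable;
    }
  }
  // Assumed last resorts: Macintosh Roman, then whatever comes first.
  return find(1, 0) || tables[0] || null;
}
```

Note that `isSymbolic` is only consulted after the (3, 1) check, which is the point of the change described above.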
2017-02-15 22:18:42 +09:00
|
|
|
!bug1337429.pdf
|
2020-02-05 21:59:47 +09:00
|
|
|
!bug1606566.pdf
|
2015-12-03 09:47:20 +09:00
|
|
|
!issue5564_reduced.pdf
|
2011-10-07 00:01:22 +09:00
|
|
|
!canvas.pdf
|
2016-03-08 04:56:15 +09:00
|
|
|
!bug1132849.pdf
|
|
|
|
!issue6894.pdf
|
Apply Patterns, if necessary, when rendering text
Currently we're not applying Patterns for text, but only for graphics.
This patch is unfortunately not a complete solution, but rather a step on the way, since there are still some PDF files where the Patterns look more like a solid colour, rather than the intended gradient.
I've been unable to fix these issues completely, and I've not managed to determine if the remaining issues are caused either by the pattern code, the canvas code, or perhaps both.
However, given that even this simple patch improves the current situation quite a bit, I figured that it couldn't hurt to submit it as-is.
- Fixes 5804.
- Fixes 6130.
- Improves 3988 a lot, since the text is now visible. However, it looks like the text is *one* solid colour, instead of the correct gradient.
- Improves 5432, since the text is no longer gray. (This file also suffers from the same problem as the previous one.)
2015-12-30 01:57:10 +09:00
|
|
|
!issue5804.pdf
|
2020-08-08 05:04:53 +09:00
|
|
|
!issue11131_reduced.pdf
|
2020-02-09 01:43:53 +09:00
|
|
|
!Pages-tree-refs.pdf
|
2015-12-30 01:57:10 +09:00
|
|
|
!ShowText-ShadingPattern.pdf
|
2011-10-07 00:01:22 +09:00
|
|
|
!complex_ttf_font.pdf
|
2016-01-06 02:53:31 +09:00
|
|
|
!issue3694_reduced.pdf
|
2011-10-07 00:01:22 +09:00
|
|
|
!extgstate.pdf
|
2021-09-09 09:31:10 +09:00
|
|
|
!issue4706.pdf
|
2011-10-07 00:01:22 +09:00
|
|
|
!rotation.pdf
|
|
|
|
!simpletype3font.pdf
|
|
|
|
!sizes.pdf
|
2015-09-16 20:52:57 +09:00
|
|
|
!javauninstall-7r.pdf
|
2017-09-02 17:06:44 +09:00
|
|
|
!file_url_link.pdf
|
2016-09-25 19:19:22 +09:00
|
|
|
!multiple-filters-length-zero.pdf
|
2016-10-08 21:06:28 +09:00
|
|
|
!non-embedded-NuptialScript.pdf
|
2015-09-10 23:59:30 +09:00
|
|
|
!issue3205r.pdf
|
2015-11-15 21:27:48 +09:00
|
|
|
!issue3207r.pdf
|
2015-11-14 00:47:02 +09:00
|
|
|
!issue3263r.pdf
|
2015-11-01 20:38:59 +09:00
|
|
|
!issue3879r.pdf
|
2016-03-06 06:32:54 +09:00
|
|
|
!issue5686.pdf
|
|
|
|
!issue3928.pdf
|
2021-05-13 03:41:36 +09:00
|
|
|
!issue8565.pdf
|
Move svg:clipPath generation from clip to endPath
In the PDF from issue 8527, the clip operator (W) shows up before a path
is defined. The current SVG backend however expects a path to exist
before generating a `<svg:clipPath>` element.
In the example, the path was defined after the clip, followed by an
endPath operator (n).
So this commit fixes the bug by moving the path generation logic from
clip to endPath.
Our canvas backend appears to use similar logic:
`CanvasGraphics_endPath` calls `consumePath`, which in turn draws the
clip and resets the `pendingClip` state. The canvas backend calls
`consumePath` from multiple other places, so we probably need to check
whether doing so is also necessary for the SVG backend.
I scanned our corpus of PDF files in test/pdfs, and found that in every
instance (except for one), the "W" PDF operator (clip) is immediately
followed by "n" (endPath). The new test from this commit (clippath.pdf)
starts with "W", followed by a path definition and then "n".
# Commands used to find some of the clipping commands:
grep -ra '^W$' -C7 | less -S
grep -ra '^W ' -C7 | less -S
grep -ra ' W$' -C7 | less -S
test/pdfs/issue6413.pdf is the only file where "W" (at line 55) is not
followed by "n". In fact, the "W" is the last operation of a series of
XObject painting operations, and removing it does not have any effect
on the rendered PDF (confirmed by looking at the output of PDF.js's
canvas backend, and ImageMagick's convert command).
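The ordering problem and its fix can be sketched as follows. `SketchSvgBackend` is a hypothetical, heavily reduced model: it only records path segments and a pending-clip flag, and it stands in for neither the real SVG backend nor the canvas backend.

```javascript
// Hypothetical model of deferring clip-path generation until endPath.
class SketchSvgBackend {
  constructor() {
    this.currentPath = []; // Path segments collected so far.
    this.pendingClip = false; // Set by "W", consumed by "n".
    this.clipPaths = []; // Generated clip paths (as path data strings).
  }

  // Path construction operators (e.g. "m", "l") append segments.
  appendPath(segment) {
    this.currentPath.push(segment);
  }

  // "W": only remember that a clip was requested; the path may not exist yet.
  clip() {
    this.pendingClip = true;
  }

  // "n": now the path is complete, so the clip path can be generated.
  endPath() {
    if (this.pendingClip && this.currentPath.length > 0) {
      this.clipPaths.push(this.currentPath.join(" "));
    }
    this.pendingClip = false;
    this.currentPath = [];
  }
}
```

With this structure, both "path, W, n" and "W, path, n" produce the same clip path, since nothing is generated until `endPath` runs.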
2017-06-19 19:40:48 +09:00
|
|
|
!clippath.pdf
|
2018-01-05 07:43:07 +09:00
|
|
|
!issue8795_reduced.pdf
|
2022-02-16 09:44:50 +09:00
|
|
|
!bug1755507.pdf
|
2011-10-28 22:38:55 +09:00
|
|
|
!close-path-bug.pdf
|
2015-11-04 00:03:08 +09:00
|
|
|
!issue6019.pdf
|
2015-11-13 05:41:16 +09:00
|
|
|
!issue6621.pdf
|
2016-01-06 02:53:31 +09:00
|
|
|
!issue6286.pdf
|
2021-03-31 22:32:30 +09:00
|
|
|
!issue13107_reduced.pdf
|
2015-11-18 21:47:56 +09:00
|
|
|
!issue1055r.pdf
|
2020-03-22 22:09:08 +09:00
|
|
|
!issue11713.pdf
|
2015-10-28 02:20:29 +09:00
|
|
|
!issue1293r.pdf
|
2020-06-04 15:43:46 +09:00
|
|
|
!issue11931.pdf
|
2015-11-15 23:32:11 +09:00
|
|
|
!issue1655r.pdf
|
2015-10-21 03:22:06 +09:00
|
|
|
!issue6541.pdf
|
2021-05-24 02:03:53 +09:00
|
|
|
!issue10640.pdf
|
2015-07-31 20:59:02 +09:00
|
|
|
!issue2948.pdf
|
2015-08-01 00:55:51 +09:00
|
|
|
!issue6231_1.pdf
|
2019-01-05 08:13:13 +09:00
|
|
|
!issue10402.pdf
|
2017-12-14 09:10:14 +09:00
|
|
|
!issue7074_reduced.pdf
|
2015-09-04 05:29:12 +09:00
|
|
|
!issue6413.pdf
|
2014-04-17 21:52:33 +09:00
|
|
|
!issue4630.pdf
|
2015-06-26 05:16:08 +09:00
|
|
|
!issue4909.pdf
|
2019-05-24 08:47:22 +09:00
|
|
|
!scorecard_reduced.pdf
|
2015-12-08 06:30:09 +09:00
|
|
|
!issue5084.pdf
|
2017-10-19 10:33:35 +09:00
|
|
|
!issue8960_reduced.pdf
|
2015-01-02 22:21:56 +09:00
|
|
|
!issue5202.pdf
|
2017-11-29 04:24:27 +09:00
|
|
|
!images_1bit_grayscale.pdf
|
2014-09-09 22:29:31 +09:00
|
|
|
!issue5280.pdf
|
2020-10-05 23:38:01 +09:00
|
|
|
!issue12399_reduced.pdf
|
2021-04-27 18:49:03 +09:00
|
|
|
!annotation-ink-without-appearance.pdf
|
2015-06-19 04:53:15 +09:00
|
|
|
!issue5677.pdf
|
2015-05-20 22:08:55 +09:00
|
|
|
!issue5954.pdf
|
2015-11-24 00:57:43 +09:00
|
|
|
!issue6612.pdf
|
2011-10-29 06:11:14 +09:00
|
|
|
!alphatrans.pdf
|
2017-11-29 02:40:22 +09:00
|
|
|
!pattern_text_embedded_font.pdf
|
2011-11-12 07:44:47 +09:00
|
|
|
!devicen.pdf
|
2011-11-10 02:39:55 +09:00
|
|
|
!cmykjpeg.pdf
|
2011-12-02 21:55:04 +09:00
|
|
|
!issue840.pdf
|
2020-11-04 00:04:08 +09:00
|
|
|
!160F-2019.pdf
|
2016-03-02 10:05:33 +09:00
|
|
|
!issue4402_reduced.pdf
|
2015-11-17 07:38:23 +09:00
|
|
|
!issue845r.pdf
|
2015-11-17 01:03:59 +09:00
|
|
|
!issue3405r.pdf
|
2021-11-02 14:35:08 +09:00
|
|
|
!issue14130.pdf
|
2019-09-18 03:01:17 +09:00
|
|
|
!issue7339_reduced.pdf
|
2013-11-03 07:07:13 +09:00
|
|
|
!issue3438.pdf
|
2020-01-22 03:36:41 +09:00
|
|
|
!issue11403_reduced.pdf
|
2022-01-14 01:53:22 +09:00
|
|
|
!ContentStreamNoCycleType3insideType3.pdf
|
2022-01-14 01:36:36 +09:00
|
|
|
!ContentStreamCycleType3insideType3.pdf
|
2012-10-14 04:21:15 +09:00
|
|
|
!issue2074.pdf
|
2011-12-09 10:17:24 +09:00
|
|
|
!scan-bad.pdf
|
2021-07-01 07:09:07 +09:00
|
|
|
!issue13561_reduced.pdf
|
2014-10-13 05:36:50 +09:00
|
|
|
!bug847420.pdf
|
2013-10-31 23:10:08 +09:00
|
|
|
!bug860632.pdf
|
2015-02-19 21:58:03 +09:00
|
|
|
!bug894572.pdf
|
2015-04-09 23:09:24 +09:00
|
|
|
!bug911034.pdf
|
2015-02-07 05:58:01 +09:00
|
|
|
!bug1108301.pdf
|
2021-11-01 00:48:53 +09:00
|
|
|
!issue10301.pdf
|
2015-04-25 20:27:10 +09:00
|
|
|
!bug1157493.pdf
|
2023-01-17 23:26:58 +09:00
|
|
|
!issue15910.pdf
|
2020-06-30 16:24:01 +09:00
|
|
|
!issue4260_reduced.pdf
|
2016-02-25 01:56:28 +09:00
|
|
|
!bug1250079.pdf
|
2018-07-10 05:11:46 +09:00
|
|
|
!bug1473809.pdf
|
2020-08-22 07:25:07 +09:00
|
|
|
!issue12120_reduced.pdf
|
2012-04-25 08:53:11 +09:00
|
|
|
!pdfjsbad1586.pdf
|
2020-12-11 10:32:18 +09:00
|
|
|
!standard_fonts.pdf
|
2011-12-08 12:38:34 +09:00
|
|
|
!freeculture.pdf
|
2022-04-19 23:53:44 +09:00
|
|
|
!issue14802.pdf
|
2015-11-16 08:26:42 +09:00
|
|
|
!issue6006.pdf
|
2012-03-02 12:23:36 +09:00
|
|
|
!pdfkit_compressed.pdf
|
2012-03-18 07:35:04 +09:00
|
|
|
!TAMReview.pdf
|
2015-09-05 19:29:16 +09:00
|
|
|
!pr4922.pdf
|
2015-10-17 01:48:26 +09:00
|
|
|
!pr6531_1.pdf
|
|
|
|
!pr6531_2.pdf
|
2016-05-25 00:35:45 +09:00
|
|
|
!pr7352.pdf
|
2013-11-18 22:48:06 +09:00
|
|
|
!bug900822.pdf
|
2017-09-21 03:37:56 +09:00
|
|
|
!bug1392647.pdf
|
2011-12-13 12:42:39 +09:00
|
|
|
!issue918.pdf
|
2021-09-22 07:44:12 +09:00
|
|
|
!bug920426.pdf
|
2012-10-27 13:30:01 +09:00
|
|
|
!issue1905.pdf
|
2013-11-02 01:33:30 +09:00
|
|
|
!issue2833.pdf
|
2014-07-31 20:46:11 +09:00
|
|
|
!issue2931.pdf
|
|
|
|
!issue3323.pdf
|
|
|
|
!issue4304.pdf
|
2021-10-20 12:14:48 +09:00
|
|
|
!issue9017_reduced.pdf
|
2014-12-30 00:28:23 +09:00
|
|
|
!issue4379.pdf
|
2014-04-11 04:36:37 +09:00
|
|
|
!issue4550.pdf
|
2021-09-04 01:10:27 +09:00
|
|
|
!issue13316_reduced.pdf
|
2023-01-30 20:04:10 +09:00
|
|
|
!issue15977_reduced.pdf
|
2015-10-23 22:15:06 +09:00
|
|
|
!issue4575.pdf
|
2023-02-28 02:34:12 +09:00
|
|
|
!colorspace_atan.pdf
|
2014-08-19 07:57:52 +09:00
|
|
|
!bug1011159.pdf
|
2015-02-14 19:59:10 +09:00
|
|
|
!issue5734.pdf
|
2014-08-04 01:34:52 +09:00
|
|
|
!issue4875.pdf
|
2020-03-25 22:06:01 +09:00
|
|
|
!issue11740_reduced.pdf
|
2020-12-10 22:22:05 +09:00
|
|
|
!issue12705.pdf
|
2014-08-04 01:34:52 +09:00
|
|
|
!issue4881.pdf
|
2015-05-11 04:48:17 +09:00
|
|
|
!issue5994.pdf
|
2015-10-24 01:20:57 +09:00
|
|
|
!issue6151.pdf
|
2013-06-21 07:03:30 +09:00
|
|
|
!rotated.pdf
|
2012-02-20 11:12:57 +09:00
|
|
|
!issue1249.pdf
|
2013-11-02 07:13:31 +09:00
|
|
|
!issue1171.pdf
|
2011-12-17 03:54:31 +09:00
|
|
|
!smaskdim.pdf
|
2013-03-05 05:28:04 +09:00
|
|
|
!endchar.pdf
|
2011-12-29 14:23:17 +09:00
|
|
|
!type4psfunc.pdf
|
2012-03-20 01:09:42 +09:00
|
|
|
!issue1350.pdf
|
2012-01-12 11:14:49 +09:00
|
|
|
!S2.pdf
|
2014-05-04 00:28:30 +09:00
|
|
|
!glyph_accent.pdf
|
2014-01-27 22:17:14 +09:00
|
|
|
!personwithdog.pdf
|
2021-04-28 17:15:31 +09:00
|
|
|
!find_all.pdf
|
2013-08-24 02:57:11 +09:00
|
|
|
!helloworld-bad.pdf
|
2012-01-18 13:50:49 +09:00
|
|
|
!zerowidthline.pdf
|
2023-02-28 02:34:12 +09:00
|
|
|
!colorspace_cos.pdf
|
2021-05-28 20:30:18 +09:00
|
|
|
!issue13242.pdf
|
2021-02-20 23:23:54 +09:00
|
|
|
!js-colors.pdf
|
2021-03-16 00:16:49 +09:00
|
|
|
!annotation-line-without-appearance-empty-Rect.pdf
|
2021-01-10 04:07:08 +09:00
|
|
|
!issue12841_reduced.pdf
|
2013-11-03 08:56:48 +09:00
|
|
|
!bug868745.pdf
|
2014-09-18 23:10:46 +09:00
|
|
|
!mmtype1.pdf
|
2015-09-10 19:49:41 +09:00
|
|
|
!issue4436r.pdf
|
2015-02-07 08:13:41 +09:00
|
|
|
!issue5704.pdf
|
2015-02-24 00:01:08 +09:00
|
|
|
!issue5751.pdf
|
2015-02-07 08:13:41 +09:00
|
|
|
!bug893730.pdf
|
2014-01-03 09:44:11 +09:00
|
|
|
!bug864847.pdf
|
2022-10-27 00:59:48 +09:00
|
|
|
!issue15629.pdf
|
2012-03-11 12:12:33 +09:00
|
|
|
!issue1002.pdf
|
2012-02-20 15:12:22 +09:00
|
|
|
!issue925.pdf
|
2014-09-27 20:14:25 +09:00
|
|
|
!issue2840.pdf
|
2015-01-02 22:21:56 +09:00
|
|
|
!issue4061.pdf
|
2014-04-25 01:48:18 +09:00
|
|
|
!issue4668.pdf
|
2021-05-24 02:03:53 +09:00
|
|
|
!issue13226.pdf
|
2017-10-30 11:07:02 +09:00
|
|
|
!PDFJS-7562-reduced.pdf
|
2020-04-15 20:34:13 +09:00
|
|
|
!issue11768_reduced.pdf
|
2014-07-24 21:59:21 +09:00
|
|
|
!issue5039.pdf
|
2021-10-13 07:24:18 +09:00
|
|
|
!issue14117.pdf
|
2014-07-29 01:41:47 +09:00
|
|
|
!issue5070.pdf
|
2014-08-31 21:03:25 +09:00
|
|
|
!issue5238.pdf
|
2022-02-26 20:27:39 +09:00
|
|
|
!jp2k-resetprob.pdf
|
2014-08-29 21:15:19 +09:00
|
|
|
!issue5244.pdf
|
2014-10-25 18:35:13 +09:00
|
|
|
!issue5291.pdf
|
2020-08-08 18:43:46 +09:00
|
|
|
!issue4398.pdf
|
2014-10-19 05:29:21 +09:00
|
|
|
!issue5421.pdf
|
2014-11-05 00:16:48 +09:00
|
|
|
!issue5470.pdf
|
2015-01-11 22:54:12 +09:00
|
|
|
!issue5501.pdf
|
2014-12-30 23:43:04 +09:00
|
|
|
!issue5599.pdf
|
2015-02-20 19:19:21 +09:00
|
|
|
!issue5747.pdf
|
2015-06-10 00:52:36 +09:00
|
|
|
!issue6099.pdf
|
2015-08-09 19:31:05 +09:00
|
|
|
!issue6336.pdf
|
2015-08-28 20:42:01 +09:00
|
|
|
!issue6387.pdf
|
2015-09-07 00:16:31 +09:00
|
|
|
!issue6410.pdf
|
2020-06-26 19:36:28 +09:00
|
|
|
!issue11124.pdf
|
2017-06-30 03:52:49 +09:00
|
|
|
!issue8586.pdf
|
2017-09-20 05:43:23 +09:00
|
|
|
!jbig2_symbol_offset.pdf
|
2012-03-30 00:53:51 +09:00
|
|
|
!gradientfill.pdf
|
2013-11-14 04:45:59 +09:00
|
|
|
!bug903856.pdf
|
2022-03-03 22:14:22 +09:00
|
|
|
!issue14618.pdf
|
2014-04-08 20:07:29 +09:00
|
|
|
!bug850854.pdf
|
2021-01-04 22:25:30 +09:00
|
|
|
!issue12810.pdf
|
2014-05-18 07:57:06 +09:00
|
|
|
!bug866395.pdf
|
2020-06-30 19:18:06 +09:00
|
|
|
!issue12010_reduced.pdf
|
2020-03-24 22:33:43 +09:00
|
|
|
!issue11718_reduced.pdf
|
2014-07-29 23:48:01 +09:00
|
|
|
!bug1027533.pdf
|
2014-06-28 19:38:25 +09:00
|
|
|
!bug1028735.pdf
|
2015-03-26 20:40:37 +09:00
|
|
|
!bug1046314.pdf
|
2014-10-09 04:11:41 +09:00
|
|
|
!bug1065245.pdf
|
2021-05-11 09:43:37 +09:00
|
|
|
!issue6769.pdf
|
2015-07-03 20:14:41 +09:00
|
|
|
!bug1151216.pdf
|
2021-11-05 08:45:48 +09:00
|
|
|
!issue8111.pdf
|
2015-07-03 20:14:41 +09:00
|
|
|
!bug1175962.pdf
|
2015-10-03 01:04:08 +09:00
|
|
|
!bug1020226.pdf
|
2019-03-27 02:38:44 +09:00
|
|
|
!issue9534_reduced.pdf
|
2019-08-25 02:57:35 +09:00
|
|
|
!attachment.pdf
|
2012-04-13 09:59:30 +09:00
|
|
|
!basicapi.pdf
|
2022-10-20 23:24:17 +09:00
|
|
|
!issue15590.pdf
|
2022-10-20 00:49:40 +09:00
|
|
|
!issue15594_reduced.pdf
|
2017-10-24 09:25:50 +09:00
|
|
|
!issue2884_reduced.pdf
|
2012-04-18 02:39:17 +09:00
|
|
|
!mixedfonts.pdf
|
2012-06-08 08:00:07 +09:00
|
|
|
!shading_extend.pdf
|
2013-01-13 04:21:30 +09:00
|
|
|
!noembed-identity.pdf
|
2013-01-24 01:15:02 +09:00
|
|
|
!noembed-identity-2.pdf
|
2013-01-17 00:13:34 +09:00
|
|
|
!noembed-jis7.pdf
|
2020-11-06 01:49:32 +09:00
|
|
|
!issue12504.pdf
|
2013-01-17 00:13:34 +09:00
|
|
|
!noembed-eucjp.pdf
|
2021-10-24 18:51:57 +09:00
|
|
|
!bug1627427_reduced.pdf
|
2013-01-17 00:13:34 +09:00
|
|
|
!noembed-sjis.pdf
|
2013-02-08 21:29:22 +09:00
|
|
|
!vertical.pdf
|
2021-05-07 18:49:58 +09:00
|
|
|
!issue13343.pdf
|
2014-09-04 04:57:57 +09:00
|
|
|
!ZapfDingbats.pdf
|
2014-04-08 03:50:27 +09:00
|
|
|
!bug878026.pdf
|
2015-11-06 23:54:50 +09:00
|
|
|
!issue1045.pdf
|
2014-06-27 07:41:44 +09:00
|
|
|
!issue5010.pdf
|
2019-01-23 02:59:36 +09:00
|
|
|
!issue10339_reduced.pdf
|
2022-10-10 16:40:30 +09:00
|
|
|
!issue15557.pdf
|
2014-06-15 05:51:13 +09:00
|
|
|
!issue4934.pdf
|
2014-04-29 03:09:00 +09:00
|
|
|
!issue4650.pdf
|
2016-02-25 03:48:02 +09:00
|
|
|
!issue6721_reduced.pdf
|
2013-11-02 06:30:28 +09:00
|
|
|
!issue3025.pdf
|
2021-04-19 06:37:22 +09:00
|
|
|
!french_diacritics.pdf
|
2013-01-12 10:10:09 +09:00
|
|
|
!issue2099-1.pdf
|
2013-06-24 03:20:47 +09:00
|
|
|
!issue3371.pdf
|
2013-03-18 22:06:59 +09:00
|
|
|
!issue2956.pdf
|
2014-05-03 04:04:16 +09:00
|
|
|
!issue2537r.pdf
|
2020-07-15 07:17:27 +09:00
|
|
|
!issue269_1.pdf
|
2013-12-17 08:19:31 +09:00
|
|
|
!bug946506.pdf
|
2013-12-18 05:32:24 +09:00
|
|
|
!issue3885.pdf
|
2020-03-14 06:09:27 +09:00
|
|
|
!issue11697_reduced.pdf
|
2014-01-21 09:44:46 +09:00
|
|
|
!bug859204.pdf
|
2016-09-18 22:35:12 +09:00
|
|
|
!annotation-tx.pdf
|
|
|
|
!annotation-tx2.pdf
|
|
|
|
!annotation-tx3.pdf
|
2015-08-04 00:34:30 +09:00
|
|
|
!coons-allflags-withfunction.pdf
|
2015-08-05 06:55:55 +09:00
|
|
|
!tensor-allflags-withfunction.pdf
|
2018-11-21 01:50:37 +09:00
|
|
|
!issue10084_reduced.pdf
|
2014-02-06 03:58:14 +09:00
|
|
|
!issue4246.pdf
|
2021-09-15 18:06:25 +09:00
|
|
|
!issue11915.pdf
|
2021-01-09 02:40:09 +09:00
|
|
|
!js-authors.pdf
|
2014-03-18 22:07:54 +09:00
|
|
|
!issue4461.pdf
|
2014-04-12 01:55:39 +09:00
|
|
|
!issue4573.pdf
|
2014-12-12 01:29:26 +09:00
|
|
|
!issue4722.pdf
|
2023-01-21 20:00:17 +09:00
|
|
|
!bug1811668_reduced.pdf
|
2014-07-31 05:15:06 +09:00
|
|
|
!issue4800.pdf
|
2021-11-05 08:45:48 +09:00
|
|
|
!issue9243.pdf
|
2021-03-27 02:23:18 +09:00
|
|
|
!issue13147.pdf
|
2020-03-04 04:55:51 +09:00
|
|
|
!issue11477_reduced.pdf
|
2018-04-10 22:44:42 +09:00
|
|
|
!text_clip_cff_cid.pdf
|
2014-06-24 03:55:51 +09:00
|
|
|
!issue4801.pdf
|
2015-02-28 20:58:53 +09:00
|
|
|
!issue5334.pdf
|
2018-09-30 23:29:16 +09:00
|
|
|
!annotation-caret-ink.pdf
|
2015-07-25 19:26:36 +09:00
|
|
|
!bug1186827.pdf
|
2020-12-11 06:45:14 +09:00
|
|
|
!issue12706.pdf
|
2015-07-25 19:26:36 +09:00
|
|
|
!issue215.pdf
|
2015-12-05 03:52:45 +09:00
|
|
|
!issue5044.pdf
|
2015-11-16 21:15:36 +09:00
|
|
|
!issue1512r.pdf
|
2015-11-09 20:57:20 +09:00
|
|
|
!issue2128r.pdf
|
2021-10-20 12:14:48 +09:00
|
|
|
!bug1703683_page2_reduced.pdf
|
2015-07-03 06:47:47 +09:00
|
|
|
!issue5540.pdf
|
2023-01-06 18:27:20 +09:00
|
|
|
!issue15893_reduced.pdf
|
2014-12-18 06:42:06 +09:00
|
|
|
!issue5549.pdf
|
2021-04-14 20:58:43 +09:00
|
|
|
!visibility_expressions.pdf
|
2014-12-18 06:46:47 +09:00
|
|
|
!issue5475.pdf
|
2019-05-08 00:44:37 +09:00
|
|
|
!issue10519_reduced.pdf
|
2014-12-26 05:04:01 +09:00
|
|
|
!annotation-border-styles.pdf
|
2023-02-28 02:34:12 +09:00
|
|
|
!colorspace_sin.pdf
|
2015-12-05 20:22:09 +09:00
|
|
|
!IdentityToUnicodeMap_charCodeOf.pdf
|
2018-01-05 07:43:07 +09:00
|
|
|
!PDFJS-9279-reduced.pdf
|
2014-12-19 05:26:02 +09:00
|
|
|
!issue5481.pdf
|
2021-09-27 04:04:11 +09:00
|
|
|
!resetform.pdf
|
2015-02-10 07:32:16 +09:00
|
|
|
!issue5567.pdf
|
2015-02-05 23:25:23 +09:00
|
|
|
!issue5701.pdf
|
2021-05-11 09:43:37 +09:00
|
|
|
!issue6769_no_matrix.pdf
|
2020-07-15 07:17:27 +09:00
|
|
|
!issue12007_reduced.pdf
|
2015-05-14 21:08:43 +09:00
|
|
|
!issue5896.pdf
|
2015-05-14 00:25:42 +09:00
|
|
|
!issue6010_1.pdf
|
|
|
|
!issue6010_2.pdf
|
2015-06-07 01:00:14 +09:00
|
|
|
!issue6068.pdf
|
2015-06-05 04:28:14 +09:00
|
|
|
!issue6081.pdf
|
2015-07-11 03:18:53 +09:00
|
|
|
!issue6069.pdf
|
2015-07-21 01:25:02 +09:00
|
|
|
!issue6106.pdf
|
2015-11-25 16:44:06 +09:00
|
|
|
!issue6296.pdf
|
2016-04-11 06:39:15 +09:00
|
|
|
!bug852992_reduced.pdf
|
2021-04-21 01:32:23 +09:00
|
|
|
!issue13271.pdf
|
2015-12-24 02:17:23 +09:00
|
|
|
!issue6298.pdf
|
2016-01-18 06:03:21 +09:00
|
|
|
!issue6889.pdf
|
2021-05-11 09:43:37 +09:00
|
|
|
!issue11473.pdf
|
2015-07-21 01:25:02 +09:00
|
|
|
!bug1001080.pdf
|
2022-11-23 20:58:00 +09:00
|
|
|
!issue15716.pdf
|
2020-12-11 08:29:07 +09:00
|
|
|
!bug1671312_reduced.pdf
|
2021-01-14 23:34:44 +09:00
|
|
|
!bug1671312_ArialNarrow.pdf
|
2015-07-11 19:15:43 +09:00
|
|
|
!issue6108.pdf
|
2015-08-30 08:23:52 +09:00
|
|
|
!issue6113.pdf
|
2015-11-16 04:07:54 +09:00
|
|
|
!openoffice.pdf
|
2020-11-19 02:54:26 +09:00
|
|
|
!js-buttons.pdf
|
2016-02-23 08:21:28 +09:00
|
|
|
!issue7014.pdf
|
2017-06-30 09:14:58 +09:00
|
|
|
!issue8187.pdf
|
2015-12-23 05:31:56 +09:00
|
|
|
!annotation-link-text-popup.pdf
|
2018-01-05 07:43:07 +09:00
|
|
|
!issue9278.pdf
|
2015-12-23 05:31:56 +09:00
|
|
|
!annotation-text-without-popup.pdf
|
2015-12-28 08:33:41 +09:00
|
|
|
!annotation-underline.pdf
|
2021-04-07 21:06:22 +09:00
|
|
|
!issue13193.pdf
|
2020-08-08 03:46:41 +09:00
|
|
|
!annotation-underline-without-appearance.pdf
|
2020-07-15 07:17:27 +09:00
|
|
|
!issue269_2.pdf
|
2021-06-15 13:09:45 +09:00
|
|
|
!issue13372.pdf
|
2015-12-29 23:09:28 +09:00
|
|
|
!annotation-strikeout.pdf
|
2020-08-08 03:46:41 +09:00
|
|
|
!annotation-strikeout-without-appearance.pdf
|
2015-12-30 23:28:26 +09:00
|
|
|
!annotation-squiggly.pdf
|
2021-11-10 06:39:21 +09:00
|
|
|
!issue14256.pdf
|
2020-08-08 03:46:41 +09:00
|
|
|
!annotation-squiggly-without-appearance.pdf
|
2016-01-01 23:31:46 +09:00
|
|
|
!annotation-highlight.pdf
|
2020-08-08 03:46:41 +09:00
|
|
|
!annotation-highlight-without-appearance.pdf
|
2020-09-28 21:39:48 +09:00
|
|
|
!issue12418_reduced.pdf
|
2019-04-14 01:45:22 +09:00
|
|
|
!annotation-freetext.pdf
|
2017-04-03 03:50:17 +09:00
|
|
|
!annotation-line.pdf
|
2021-04-30 23:43:27 +09:00
|
|
|
!evaljs.pdf
|
2021-10-20 12:14:48 +09:00
|
|
|
!issue12798_page1_reduced.pdf
|
2021-03-01 02:51:37 +09:00
|
|
|
!annotation-line-without-appearance.pdf
|
2020-10-22 00:21:33 +09:00
|
|
|
!bug1669099.pdf
|
2017-07-24 07:34:39 +09:00
|
|
|
!annotation-square-circle.pdf
|
2021-02-22 01:10:35 +09:00
|
|
|
!annotation-square-circle-without-appearance.pdf
|
2017-09-16 23:37:50 +09:00
|
|
|
!annotation-stamp.pdf
|
2021-09-19 05:28:23 +09:00
|
|
|
!issue14048.pdf
|
2021-11-01 01:46:42 +09:00
|
|
|
!issue11656.pdf
|
2016-02-15 05:27:53 +09:00
|
|
|
!annotation-fileattachment.pdf
|
2016-09-15 04:51:21 +09:00
|
|
|
!annotation-text-widget.pdf
|
2022-09-16 05:19:16 +09:00
|
|
|
!issue15443.pdf
|
2016-09-26 00:08:17 +09:00
|
|
|
!annotation-choice-widget.pdf
|
2021-05-24 02:03:53 +09:00
|
|
|
!issue10900.pdf
|
2016-12-16 06:15:38 +09:00
|
|
|
!annotation-button-widget.pdf
|
2017-09-24 00:01:19 +09:00
|
|
|
!annotation-polyline-polygon.pdf
|
2021-04-28 02:02:20 +09:00
|
|
|
!annotation-polyline-polygon-without-appearance.pdf
|
2016-03-03 11:10:15 +09:00
|
|
|
!zero_descent.pdf
|
2016-10-13 20:47:17 +09:00
|
|
|
!operator-in-TJ-array.pdf
|
2016-12-07 07:07:16 +09:00
|
|
|
!issue7878.pdf
|
2017-02-11 15:25:05 +09:00
|
|
|
!font_ascent_descent.pdf
|
2021-01-26 07:40:57 +09:00
|
|
|
!listbox_actions.pdf
|
2019-12-25 07:42:42 +09:00
|
|
|
!issue11442_reduced.pdf
|
2020-01-30 21:13:51 +09:00
|
|
|
!issue11549_reduced.pdf
|
2017-03-07 09:17:27 +09:00
|
|
|
!issue8097_reduced.pdf
|
2022-08-02 20:50:40 +09:00
|
|
|
!issue15262.pdf
|
2022-01-25 08:26:45 +09:00
|
|
|
!bug1743245.pdf
|
2020-12-06 05:27:38 +09:00
|
|
|
!quadpoints.pdf
|
2017-05-16 20:01:03 +09:00
|
|
|
!transparent.pdf
|
2021-08-24 18:30:19 +09:00
|
|
|
!issue13931.pdf
|
2017-07-06 22:08:37 +09:00
|
|
|
!xobject-image.pdf
|
2022-09-15 18:35:21 +09:00
|
|
|
!issue15441.pdf
|
2021-05-30 01:06:49 +09:00
|
|
|
!issue6605.pdf
|
2017-09-20 04:19:40 +09:00
|
|
|
!ccitt_EndOfBlock_false.pdf
|
2018-08-27 04:37:05 +09:00
|
|
|
!issue9972-1.pdf
|
|
|
|
!issue9972-2.pdf
|
|
|
|
!issue9972-3.pdf
|
2015-11-07 23:02:37 +09:00
|
|
|
!tiling-pattern-box.pdf
|
2015-10-04 00:02:19 +09:00
|
|
|
!tiling-pattern-large-steps.pdf
|
2021-05-24 02:03:53 +09:00
|
|
|
!issue13201.pdf
|
2022-01-18 02:53:03 +09:00
|
|
|
!issue14462_reduced.pdf
|
2020-01-31 23:22:54 +09:00
|
|
|
!issue11555.pdf
|
2020-09-12 23:52:38 +09:00
|
|
|
!issue12337.pdf
|
2020-11-03 15:44:21 +09:00
|
|
|
!pr12564.pdf
|
2021-01-22 06:33:43 +09:00
|
|
|
!pr12828.pdf
|
2021-10-25 23:09:26 +09:00
|
|
|
!secHandler.pdf
|
2022-02-16 09:44:50 +09:00
|
|
|
!issue14297.pdf
|
2021-10-25 00:29:30 +09:00
|
|
|
!rc_annotation.pdf
|
2021-11-25 02:55:28 +09:00
|
|
|
!issue14267.pdf
|
|
|
|
!PDFBOX-4352-0.pdf
|
2021-11-25 21:28:24 +09:00
|
|
|
!REDHAT-1531897-0.pdf
|
2021-11-28 01:34:05 +09:00
|
|
|
!xfa_issue14315.pdf
|
[api-minor] Validate the /Pages-tree /Count entry during document initialization (issue 14303)
*This patch basically extends the approach from PR 10392, by also checking the last page.*
Currently, in e.g. the `Catalog.numPages`-getter, we're simply assuming that if the /Pages-tree has an *integer* /Count entry it must also be correct/valid.
As can be seen in the referenced PDF documents, that entry may be completely bogus, which causes general parsing to break down elsewhere in the worker-thread (and hangs the browser).
Rather than hoping that the /Count entry is correct we obviously need to validate it, just like all other data found in PDF documents. This turns out to be a little less straightforward than one would like, since the only way to do this (as far as I know) is to parse the *entire* /Pages-tree and essentially count the pages.
To avoid doing that for all documents, this patch tries to take a short-cut by checking if the last page (based on the /Count entry) can be successfully fetched. If so, we assume that the /Count entry is correct and use it as-is; otherwise we'll iterate through (potentially) the *entire* /Pages-tree to determine the number of pages.
Unfortunately these changes will have a number of *somewhat* negative side-effects (please see a possibly incomplete list below); however, I cannot see a better way to address this bug.
- This will slow down initial loading/rendering of all documents, at least by some amount, since we now need to fetch/parse more of the /Pages-tree in order to be able to access the *last* page of the PDF documents.
 - For poorly generated PDF documents, where the entire /Pages-tree only has *one* level, we'll unfortunately need to fetch/parse the *entire* /Pages-tree to get to the last page. While there's a cache to help reduce repeated data lookups, this will affect initial loading/rendering of *some* long PDF documents.
- This will affect the `disableAutoFetch = true` mode negatively, since we now need to fetch/parse more data during document initialization. While the `disableAutoFetch = true` mode should still be helpful in larger/longer PDF documents, for smaller ones the effect/usefulness may unfortunately be lost.
As one *small* additional bonus, we should now also be able to support opening PDF documents where the /Pages-tree /Count entry is completely invalid (e.g. contains a non-integer value).
Fixes two of the issues listed in issue 14303, namely the `poppler-67295-0.pdf` and `poppler-85140-0.pdf` documents.
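The short-cut described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual patch: the tree is a hypothetical in-memory structure (`{ type: "Page" }` leaves and `{ type: "Pages", kids: [...] }` nodes), and `countPages`, `getPage` and `resolveNumPages` are made-up names rather than PDF.js APIs.

```javascript
// Hypothetical stand-in for a /Pages-tree; not the PDF.js data model.
// A node is either { type: "Page" } or { type: "Pages", kids: [...] }.

// Slow path: walk the whole tree and count the leaf pages.
function countPages(node) {
  if (node.type === "Page") {
    return 1;
  }
  let total = 0;
  for (const kid of node.kids) {
    total += countPages(kid);
  }
  return total;
}

// Return the page at `index` (0-based), or null if the tree has no such page.
function getPage(root, index) {
  let remaining = index;
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    if (node.type === "Page") {
      if (remaining === 0) {
        return node;
      }
      remaining--;
      continue;
    }
    // Push the kids in reverse so that they are visited in document order.
    for (let i = node.kids.length - 1; i >= 0; i--) {
      stack.push(node.kids[i]);
    }
  }
  return null;
}

// Trust the declared /Count only if the page it implies is last can be
// fetched; otherwise fall back to counting the entire tree.
function resolveNumPages(root, declaredCount) {
  if (Number.isInteger(declaredCount) && declaredCount > 0) {
    if (getPage(root, declaredCount - 1) !== null) {
      return declaredCount;
    }
  }
  return countPages(root);
}
```

Note that, as in the description above, a successful fetch of the "last" page is taken as sufficient evidence; a declared count that is too small would still be accepted by this sketch.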
2021-11-26 02:34:11 +09:00
|
|
|
!poppler-67295-0.pdf
|
|
|
|
!poppler-85140-0.pdf
|
2022-06-09 21:06:51 +09:00
|
|
|
!issue15012.pdf
|
2022-07-08 19:06:25 +09:00
|
|
|
!issue15150.pdf
|
2021-12-02 09:40:52 +09:00
|
|
|
!poppler-395-0-fuzzed.pdf
|
|
|
|
!GHOSTSCRIPT-698804-1-fuzzed.pdf
|
2022-04-21 18:57:12 +09:00
|
|
|
!issue14814.pdf
|
Prevent circular references in XRef tables from hanging the worker-thread (issue 14303)
*Please note:* This patch on its own is sufficient to prevent the worker-thread from hanging; in combination with PR 14311, however, these PDF documents will both load *and* render correctly.
Rather than focusing on the particular structure of these PDF documents, it seemed (at least to me) to make sense to try and prevent all circular references when fetching/looking-up data using the XRef table.
To avoid a solution that required tracking the references manually everywhere, the implementation settled on here instead handles that internally in the `XRef.fetch`-method. This should work, since that method *and* the `Parser`/`Lexer`-implementations are completely synchronous.
Note also that the existing `XRef`-caching, used for all data-types *except* Streams, should hopefully help to lessen the performance impact of these changes.
One *potential* problem with these changes could be certain *browser* exceptions, since those are generally not catchable in JavaScript code; however, those would most likely "stop" worker-thread parsing anyway (at least I hope so).
Finally, note that I settled on returning dummy-data rather than throwing an exception. This was done to allow parsing, for the rest of the document, to continue such that *one* bad reference doesn't prevent an entire document from loading.
Fixes two of the issues listed in issue 14303, namely the `poppler-91414-0.zip-2.gz-53.pdf` and `poppler-91414-0.zip-2.gz-54.pdf` documents.
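The idea of handling circular references inside a synchronous fetch method can be illustrated with a minimal sketch. Everything here is an assumption made for the example: `TinyXRef`, its `entries` map, the `{ ref: n }` indirection and the choice of `null` as dummy-data are hypothetical and do not reflect the real `XRef` implementation.

```javascript
// Hypothetical object table: each entry is either a plain value or
// { ref: n }, meaning "look up object n instead".
class TinyXRef {
  constructor(entries) {
    this.entries = entries; // Map<number, any>
    this.pending = new Set(); // object numbers currently being fetched
  }

  fetch(num) {
    if (this.pending.has(num)) {
      // Circular reference: return dummy-data (null, by assumption here)
      // rather than throwing, so the rest of the document can still load.
      return null;
    }
    this.pending.add(num);
    try {
      const entry = this.entries.get(num);
      if (entry !== null && typeof entry === "object" && "ref" in entry) {
        return this.fetch(entry.ref);
      }
      return entry;
    } finally {
      // Always clear the marker, even if the lookup throws.
      this.pending.delete(num);
    }
  }
}
```

Tracking in-flight object numbers in one shared set only works because the lookup is fully synchronous: no other fetch can interleave between `add` and `delete`.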
2021-11-26 22:11:39 +09:00
|
|
|
!poppler-91414-0-53.pdf
|
|
|
|
!poppler-91414-0-54.pdf
|
2021-12-02 03:35:02 +09:00
|
|
|
!poppler-742-0-fuzzed.pdf
|
|
|
|
!poppler-937-0-fuzzed.pdf
|
2021-12-04 17:35:40 +09:00
|
|
|
!PDFBOX-3148-2-fuzzed.pdf
|
2021-12-07 21:16:38 +09:00
|
|
|
!poppler-90-0-fuzzed.pdf
|
2022-01-11 21:43:16 +09:00
|
|
|
!issue14415.pdf
|
2022-01-09 01:57:06 +09:00
|
|
|
!issue14307.pdf
|
2022-01-26 23:35:46 +09:00
|
|
|
!issue14497.pdf
|
2022-11-10 22:00:23 +09:00
|
|
|
!bug1799927.pdf
|
2022-01-28 06:51:30 +09:00
|
|
|
!issue14502.pdf
|
2022-02-06 06:33:54 +09:00
|
|
|
!issue13211.pdf
|
2022-03-22 06:10:46 +09:00
|
|
|
!issue14627.pdf
|
2022-05-03 02:28:00 +09:00
|
|
|
!issue14862.pdf
|
|
|
|
!issue14705.pdf
|
2022-05-28 21:08:43 +09:00
|
|
|
!bug1771477.pdf
|
2022-06-07 21:44:17 +09:00
|
|
|
!bug1724918.pdf
|
2022-06-17 22:20:40 +09:00
|
|
|
!issue15053.pdf
|
2022-06-19 23:39:54 +09:00
|
|
|
!bug1675139.pdf
|
2022-06-24 21:23:06 +09:00
|
|
|
!issue15092.pdf
|
2022-07-29 19:29:19 +09:00
|
|
|
!bug1782186.pdf
|
2022-07-29 00:59:03 +09:00
|
|
|
!tracemonkey_a11y.pdf
|
2022-08-19 02:27:53 +09:00
|
|
|
!bug1782564.pdf
|
2022-09-04 19:47:45 +09:00
|
|
|
!issue15340.pdf
|
2022-10-14 20:14:13 +09:00
|
|
|
!bug1795263.pdf
|
2022-10-20 06:01:36 +09:00
|
|
|
!issue15597.pdf
|
2022-10-24 22:26:00 +09:00
|
|
|
!bug1796741.pdf
|
2022-10-19 00:07:47 +09:00
|
|
|
!textfields.pdf
|
|
|
|
!freetext_no_appearance.pdf
|
2022-11-15 02:14:08 +09:00
|
|
|
!issue15690.pdf
|
2022-11-28 23:53:17 +09:00
|
|
|
!bug1802888.pdf
|
2022-11-29 18:46:48 +09:00
|
|
|
!issue15759.pdf
|
2022-11-30 00:37:02 +09:00
|
|
|
!issue15753.pdf
|
2022-12-08 02:27:32 +09:00
|
|
|
!issue15789.pdf
|
2022-12-13 22:33:58 +09:00
|
|
|
!fields_order.pdf
|
2022-12-13 08:07:45 +09:00
|
|
|
!issue15815.pdf
|
2022-12-14 00:08:36 +09:00
|
|
|
!issue15818.pdf
|
2023-01-05 03:49:31 +09:00
|
|
|
!autoprint.pdf
|
2023-01-26 21:04:48 +09:00
|
|
|
!bug1811694.pdf
|
|
|
|
!bug1811510.pdf
|
2023-02-08 03:26:46 +09:00
|
|
|
!bug1815476.pdf
|
2023-02-08 05:16:14 +09:00
|
|
|
!issue16021.pdf
|
2023-02-09 22:58:41 +09:00
|
|
|
!bug1770750.pdf
|
2023-02-16 19:25:15 +09:00
|
|
|
!issue16063.pdf
|
2023-02-20 00:33:05 +09:00
|
|
|
!issue16067.pdf
|
2023-03-09 22:13:28 +09:00
|
|
|
!bug1820909.1.pdf
|
2023-03-28 19:00:53 +09:00
|
|
|
!issue16221.pdf
|