Initial import of first test harness

The harness (test.py) operates as follows. First it locates executable browsers
(or symlinks or scripts) named "[browser][version]", e.g. "firefox4".
It then launches the located browsers and asks them to load the file
test_slave.html. At the same time, test.py sets up an HTTP server on
localhost:8080 (there's currently a race condition here ;). After
test_slave loads in the browser(s), it fetches the task manifest
(test_manifest.json). The entries in the manifest specify which PDF
to load and how many times to cycle through page rendering. This will
probably evolve over time. test_slave then performs the requested
tasks and POSTs the results back to test.py, which saves them. When
all the results for a task are in, test.py checks them.
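
As a rough illustration of the manifest handling described above, the following Python sketch parses manifest text and lists the rendering rounds each entry asks for. The helper names (load_manifest, plan_rounds, md5_matches) are hypothetical and not taken from test.py. The field names follow the manifest entries, and using the "md5" field to verify a PDF's bytes is an assumption.

```python
import hashlib
import json

REQUIRED_FIELDS = {"id", "file", "rounds", "type"}


def load_manifest(text):
    """Parse manifest JSON text and check that each entry has the core fields."""
    tasks = json.loads(text)
    for task in tasks:
        missing = REQUIRED_FIELDS - set(task)
        if missing:
            raise ValueError(f"task {task.get('id')!r} is missing {sorted(missing)}")
    return tasks


def md5_matches(data, expected_hex):
    """Return True if the MD5 digest of the PDF bytes equals the "md5" value.

    Assumption: the manifest's "md5" field is a hex digest of the PDF file.
    """
    return hashlib.md5(data).hexdigest() == expected_hex


def plan_rounds(tasks):
    """Yield (task id, round index) pairs in manifest order."""
    for task in tasks:
        for round_index in range(task["rounds"]):
            yield task["id"], round_index
```

An entry with "rounds": 2 therefore yields two pairs, one per pass through the PDF's pages.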

There are currently three types of tests. "==" tests compare the
rendering of a PDF against a master copy. This is not yet implemented
because setting up a master copy is complicated. "fbf" tests render
all of a PDF's pages, then go back to page 1 and render all pages a
second time. The renderings from the first round must match the ones
from the second round. "load" tests just check that a PDF's pages
load without errors.
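
The "fbf" check can be sketched in Python as a page-by-page comparison of two rounds. This is only an illustration: check_fbf is a hypothetical helper, the snapshots are assumed to be comparable strings, and the layout of the result lines beyond the TEST-PASS / TEST-UNEXPECTED-FAIL tokens is invented for the example.

```python
def check_fbf(task_id, first_round, second_round):
    """Compare two rounds of page snapshots and return one result line per page.

    first_round and second_round are lists of snapshot strings, one per page
    (an assumed representation, not necessarily what test.py stores).
    """
    if len(first_round) != len(second_round):
        return [f"TEST-UNEXPECTED-FAIL | fbf {task_id} | page counts differ"]
    results = []
    for page, (before, after) in enumerate(zip(first_round, second_round), start=1):
        status = "TEST-PASS" if before == after else "TEST-UNEXPECTED-FAIL"
        results.append(f"{status} | fbf {task_id} | page {page}")
    return results
```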

Currently the test harness will only launch a "firefox4" target. This
can be a bash script in your pdf.js checkout, pdf.js/firefox4,
something like the following:

    #!/bin/bash
    dist="/path/to/firefox4/installation"
    profile=$(mktemp -dt 'pdf.js-test-ff-profile-XXXXXXXXXX')
    "$dist/firefox" -no-remote -profile "$profile" "$@"
    rm -rf "$profile"

(Yes, this script doesn't clean up properly on early termination.)
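
One possible way around the cleanup problem, sketched in Python rather than bash, is to let a context manager own the temporary profile. This is an assumption about how a launcher could be written, not part of the harness; run_with_temp_profile is a hypothetical name, and only the -no-remote and -profile flags come from the script above.

```python
import subprocess
import tempfile


def run_with_temp_profile(browser_cmd, extra_args=()):
    """Run a browser command with a throwaway profile directory.

    The directory is removed when the `with` block exits, including when an
    exception such as KeyboardInterrupt is raised (but not if the Python
    process itself is killed outright).
    """
    with tempfile.TemporaryDirectory(prefix="pdf.js-test-ff-profile-") as profile:
        cmd = [*browser_cmd, "-no-remote", "-profile", profile, *extra_args]
        exit_code = subprocess.call(cmd)
    return exit_code, profile
```

Calling it with something like ["/path/to/firefox4/installation/firefox"] would mirror the bash wrapper.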

It's possible to run the tests in a normal browsing session, but that
might be annoying. With that set up, run the harness like so:

    python test.py

If all goes well, you'll see only "TEST-PASS" messages printed to
stdout. If something goes wrong, you'll see "TEST-UNEXPECTED-FAIL"
printed to stdout.
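
A small Python sketch of how that output could be tallied, assuming each result occupies one line of captured stdout (summarize is a hypothetical helper, not part of test.py):

```python
def summarize(output):
    """Count TEST-PASS and TEST-UNEXPECTED-FAIL lines in captured stdout text."""
    passed = failed = 0
    for line in output.splitlines():
        if "TEST-UNEXPECTED-FAIL" in line:
            failed += 1
        elif "TEST-PASS" in line:
            passed += 1
    return passed, failed
```

A run is clean when the second number is zero.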
2011-06-19 10:09:21 +09:00
[
    { "id": "filled-background-range",
      "file": "pdfs/filled-background.pdf",
      "md5": "2e3120255d9c3e79b96d2543b12d2589",
      "rounds": 1,
      "type": "eq"
    },
    { "id": "tracemonkey-eq",
      "file": "pdfs/tracemonkey.pdf",
      "md5": "9a192d8b1a7dc652a19835f6f08098bd",
      "rounds": 1,
      "type": "eq"
    },
    { "id": "tracemonkey-renderTaskOnContinue",
      "file": "pdfs/tracemonkey.pdf",
      "md5": "9a192d8b1a7dc652a19835f6f08098bd",
      "rounds": 1,
      "type": "eq",
      "renderTaskOnContinue": true
    },
    { "id": "tracemonkey-fbf",
      "file": "pdfs/tracemonkey.pdf",
      "md5": "9a192d8b1a7dc652a19835f6f08098bd",
      "rounds": 2,
      "type": "fbf"
    },
    { "id": "tracemonkey-text",
      "file": "pdfs/tracemonkey.pdf",
      "md5": "9a192d8b1a7dc652a19835f6f08098bd",
      "rounds": 1,
      "type": "text"
    },
    { "id": "tracemonkey-text-enhance",
      "file": "pdfs/tracemonkey.pdf",
      "md5": "9a192d8b1a7dc652a19835f6f08098bd",
      "rounds": 1,
      "enhance": true,
      "type": "text"
    },
    { "id": "issue3925",
      "file": "pdfs/issue3925.pdf",
      "md5": "c5c895deecf7a7565393587e0d61be2b",
      "rounds": 1,
      "link": true,
      "firstPage": 1,
      "lastPage": 1,
      "type": "text"
    },
Fall back gracefully when encountering corrupt PDF files with empty /MediaBox and /CropBox entries

This is based on a real-world PDF file I encountered very recently[1], although I'm currently unable to recall where I saw it.
Note that different PDF viewers handle these sorts of errors differently, with Adobe Reader outright failing to render the attached PDF file whereas PDFium mostly handles it "correctly".

The patch makes the following notable changes:
- Refactor the `cropBox` and `mediaBox` getters, on the `Page`, to reduce unnecessary duplication. (This will also help in the future, if support for extracting additional page bounding boxes is added to the API.)
- Ensure that the page bounding boxes, i.e. `cropBox` and `mediaBox`, are never empty to prevent issues/weirdness in the viewer.
- Ensure that the `view` getter on the `Page` will never return an empty intersection of the `cropBox` and `mediaBox`.
- Add an *optional* parameter to `Util.intersect`, to allow checking that the computed intersection isn't actually empty.
- Change `Util.intersect` to have consistent return types, since Arrays are of type `Object` and falling back to returning a `Boolean` thus seems strange.

---
[1] In that case I believe that only the `cropBox` was empty, but it seemed like a good idea to attempt to fix a bunch of related cases all at once.
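
The idea behind the bounding-box changes can be sketched in Python. This is an illustration only, not the JavaScript in PDF.js: the function names are hypothetical, returning None for an empty intersection is just one way to get a consistent "rectangle or nothing" result, and the US Letter default box is an assumption.

```python
DEFAULT_BOX = [0, 0, 612, 792]  # assumption: US Letter in points as the fallback


def valid_box(box):
    """Return the box if it has four numbers and a positive area, else the default."""
    if box is None or len(box) != 4:
        return DEFAULT_BOX
    width = abs(box[2] - box[0])
    height = abs(box[3] - box[1])
    return box if width > 0 and height > 0 else DEFAULT_BOX


def intersect(rect1, rect2):
    """Return [x_min, y_min, x_max, y_max] of the overlap, or None if it is empty."""
    x_low = max(min(rect1[0], rect1[2]), min(rect2[0], rect2[2]))
    x_high = min(max(rect1[0], rect1[2]), max(rect2[0], rect2[2]))
    y_low = max(min(rect1[1], rect1[3]), min(rect2[1], rect2[3]))
    y_high = min(max(rect1[1], rect1[3]), max(rect2[1], rect2[3]))
    if x_low >= x_high or y_low >= y_high:
        return None
    return [x_low, y_low, x_high, y_high]


def page_view(media_box, crop_box):
    """Intersect crop box and media box, falling back to the media box if empty."""
    media_box = valid_box(media_box)
    if crop_box is None:
        return media_box
    return intersect(crop_box, media_box) or media_box
```

With this shape, callers never see an empty view: a degenerate crop box simply yields the media box.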
2019-08-08 22:54:46 +09:00
    { "id": "boundingBox_invalid",
      "file": "pdfs/boundingBox_invalid.pdf",
      "md5": "f6dfc471bf43abac00cdcf05d81cb070",
      "rounds": 1,
      "link": false,
      "type": "eq"
    },
    { "id": "issue11016",
      "file": "pdfs/issue11016_reduced.pdf",
      "md5": "b75578bd052d2e6acdcc85b615eab6b1",
      "rounds": 1,
      "link": false,
      "type": "text"
    },
    { "id": "issue11150",
      "file": "pdfs/issue11150_reduced.pdf",
      "md5": "8b86381089a9ec28723791245a9adfa6",
      "rounds": 1,
      "link": false,
      "type": "eq"
    },
    { "id": "issue11549",
      "file": "pdfs/issue11549_reduced.pdf",
      "md5": "a1ea636f413e02e10dbdf379ab4a99ae",
      "rounds": 1,
      "link": false,
      "type": "eq"
    },
    { "id": "issue11651-eq",
      "file": "pdfs/issue11651.pdf",
      "md5": "375233ad8dc4181a06148f8412f35b91",
      "rounds": 1,
      "link": false,
      "type": "eq"
    },
    { "id": "issue11651-text",
      "file": "pdfs/issue11651.pdf",
      "md5": "375233ad8dc4181a06148f8412f35b91",
      "rounds": 1,
      "link": false,
      "type": "text"
    },
    { "id": "issue1293",
      "file": "pdfs/issue1293r.pdf",
      "md5": "4a098f5051f34fab036f5bbe88f8deef",
      "rounds": 1,
      "link": false,
      "type": "load",
      "about": "PDF with undefined stream length."
    },
    { "id": "issue5564_reduced",
      "file": "pdfs/issue5564_reduced.pdf",
      "md5": "097853614b56fc10bfbf7e56daa0c66b",
      "rounds": 1,
      "link": false,
      "firstPage": 1,
      "lastPage": 1,
      "type": "eq",
      "about": "Causes cmap to be created with invalid glyph ids."
    },
    { "id": "bug946506",
      "file": "pdfs/bug946506.pdf",
      "md5": "c28911b5c31bdc337c2ce404c5971cfc",
      "rounds": 1,
      "link": false,
      "firstPage": 1,
      "lastPage": 1,
      "type": "eq",
      "about": "Fonts referenced only by name and not by an object identifier."
    },
    { "id": "bug911034",
      "file": "pdfs/bug911034.pdf",
      "md5": "54ee432a4e16b26b242fbf549cdad177",
      "rounds": 1,
      "link": false,
      "type": "eq"
    },
    { "id": "bug921760",
      "file": "pdfs/bug921760.pdf",
      "md5": "1aa136d786a65b0d7cce7bdb3c58c6c3",
      "rounds": 1,
      "link": true,
      "firstPage": 1,
      "lastPage": 1,
      "type": "eq"
    },
    { "id": "issue3879",
      "file": "pdfs/issue3879r.pdf",
      "md5": "756cdaf5eb54c7d78b0aa737236b9a4f",
      "rounds": 1,
      "link": false,
      "type": "eq"
    },
    { "id": "issue3885",
      "file": "pdfs/issue3885.pdf",
      "md5": "319c998910453bc44d40c7748cd2cb79",
      "rounds": 1,
      "firstPage": 1,
      "lastPage": 1,
      "type": "eq"
    },
    { "id": "issue2833",
      "file": "pdfs/issue2833.pdf",
      "md5": "7bc6e17c41586155c188d7408bcb9ab5",
      "rounds": 1,
      "lastPage": 1,
      "type": "eq"
    },
    { "id": "issue2881",
      "file": "pdfs/issue2881.pdf",
      "md5": "ea6ade27d2cb146676d23dcd6605d5ee",
      "link": true,
      "rounds": 1,
      "lastPage": 1,
      "type": "eq"
    },
    { "id": "issue3903",
      "file": "pdfs/issue3903.pdf",
      "md5": "c9a4a8012e15cf3b10d671982ce35a92",
      "rounds": 1,
      "link": true,
      "firstPage": 1,
      "lastPage": 1,
      "type": "eq"
    },
    { "id": "issue8795_reduced",
      "file": "pdfs/issue8795_reduced.pdf",
      "md5": "3ce58fa4aff351d46c42e0677d582099",
      "rounds": 1,
      "type": "eq"
    },
    { "id": "issue2391-1",
      "file": "pdfs/issue2391-1.pdf",
      "md5": "25ae9cb959612e7b343b55da63af2716",
      "rounds": 1,
Error, rather than warn, once a number of invalid path operators are encountered in `EvaluatorPreprocessor.read` (bug 1443140)
Incomplete path operators, in particular, can result in fairly chaotic rendering artifacts, as can be observed on page four of the referenced PDF file.
The initial (naive) solution that was attempted was to simply throw a `FormatError` as soon as any invalid (i.e. too short) operator was found and rely on the existing `ignoreErrors` code-paths. However, doing so would have caused regressions in some files; see the existing `issue2391-1` test-case, which was promoted to an `eq` test to help prevent future bugs.
Hence this patch, which adds special handling for invalid path operators since those may cause quite bad rendering artifacts.
You could, in all fairness, argue that the patch is a handwavy solution and I wouldn't object. However, given that this only concerns *corrupt* PDF files, the way that PDF viewers (PDF.js included) try to gracefully deal with those could probably be described as a best-effort solution anyway.
This patch also adjusts the existing `warn`/`info` messages to print the command name according to the PDF specification, rather than an internal PDF.js enumeration value. The former should be much more useful for debugging purposes.
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1443140.
2018-06-24 16:53:32 +09:00
      "type": "eq"
    },
    { "id": "issue2391-2",
      "file": "pdfs/issue2391-2.pdf",
      "md5": "7e68756d11021a087383eaac95ba45dd",
      "rounds": 1,
      "type": "eq"
    },
    { "id": "bug1443140",
      "file": "pdfs/bug1443140.pdf",
      "md5": "8f9347b0d5620537850b24b8385b0982",
      "rounds": 1,
      "link": true,
      "firstPage": 4,
      "lastPage": 4,
      "type": "eq"
    },
    { "id": "bug1473809",
      "file": "pdfs/bug1473809.pdf",
      "md5": "4b1ca51cf8cad58a1ce0618667341c76",
      "rounds": 1,
      "link": false,
      "type": "eq"
    },
    { "id": "issue2531",
      "file": "pdfs/issue2531.pdf",
      "md5": "c58e6642d8a6e2ddd5e07a543ef8f30d",
      "link": true,
      "rounds": 1,
      "firstPage": 4,
      "lastPage": 4,
      "type": "eq"
    },
    { "id": "issue3999",
      "file": "pdfs/issue3999.pdf",
      "md5": "0a59cd612e93758aa9f104470f45574b",
      "link": true,
      "rounds": 1,
      "firstPage": 1,
      "lastPage": 1,
      "type": "eq"
    },
    { "id": "issue2537",
      "file": "pdfs/issue2537r.pdf",
      "md5": "0f47a8bda08eebd986c254e65dcc2a76",
      "rounds": 1,
      "type": "eq"
    },
    { "id": "html5-canvas-cheat-sheet-load",
      "file": "pdfs/canvas.pdf",
      "md5": "59510028561daf62e00bf9f6f066b033",
      "rounds": 1,
      "type": "load"
    },
    { "id": "intelisa-eq",
      "file": "pdfs/intelisa.pdf",
      "md5": "24643ebe348a568cfe6a532055c71493",
      "link": true,
      "lastPage": 100,
      "rounds": 1,
      "type": "eq"
    },
    { "id": "issue2128",
      "file": "pdfs/issue2128r.pdf",
      "md5": "64118f4e74590b88bd476b4178a516d5",
      "link": false,
      "rounds": 1,
      "type": "eq"
    },
    { "id": "german-umlaut",
      "file": "pdfs/german-umlaut-r.pdf",
      "md5": "baa2cd74c76473cf7b914a17403a7f9a",
      "link": false,
      "rounds": 1,
      "type": "eq"
    },
    { "id": "bug859204",
      "file": "pdfs/bug859204.pdf",
      "md5": "ac1ea1dbfa6ac9d5b13167483049af0b",
      "link": false,
      "rounds": 1,
      "type": "eq"
    },
    { "id": "bug1027533",
      "file": "pdfs/bug1027533.pdf",
      "md5": "07235b2bb0e03f8d727072d48fae3b0a",
      "rounds": 1,
      "link": false,
      "type": "eq"
    },
    { "id": "bug1028735",
      "file": "pdfs/bug1028735.pdf",
      "md5": "5d1a2a87d176ff3b24e66af3cb2365be",
      "rounds": 1,
      "link": false,
      "firstPage": 1,
      "lastPage": 1,
      "type": "eq"
    },
    { "id": "bug1068432",
      "file": "pdfs/bug1068432.pdf",
      "md5": "b76ac8d7d0ef471f28535c881f421e33",
      "rounds": 1,
      "link": false,
      "type": "eq"
    },
    { "id": "bug1146106",
      "file": "pdfs/bug1146106.pdf",
      "md5": "a323d3766da49ee40f7d5dff0aeb0cc1",
      "rounds": 1,
      "link": false,
      "type": "eq"
    },
    { "id": "issue1512",
      "file": "pdfs/issue1512r.pdf",
      "md5": "af48ede2658d99cca423147085c6609b",
      "link": false,
      "rounds": 1,
      "type": "eq"
    },
    { "id": "pdfspec-load",
      "file": "pdfs/pdf.pdf",
      "md5": "dbdb23c939d2be09b43126c3c56060c7",
      "link": true,
      "lastPage": 500,
      "rounds": 1,
      "type": "load"
    },
    { "id": "issue2129",
      "file": "pdfs/issue2129.pdf",
      "md5": "b082dd2cb3648f979fd668f498af14d6",
      "link": true,
      "lastPage": 1,
      "rounds": 1,
      "type": "load"
    },
    { "id": "shavian-load",
      "file": "pdfs/shavian.pdf",
      "md5": "4fabf0a03e82693007435020bc446f9b",
      "link": true,
      "rounds": 1,
      "type": "load"
    },
    { "id": "sizes",
      "file": "pdfs/sizes.pdf",
      "md5": "c101ba7b44aee36048e1ac7b98f302ea",
      "rounds": 1,
      "type": "eq"
    },
    { "id": "plusminus",
      "file": "pdfs/Test-plusminus.pdf",
      "md5": "1ec7ade5b95ac9aaba3a618af28d34c7",
      "rounds": 1,
      "type": "eq"
    },
    { "id": "mmtype1",
      "file": "pdfs/mmtype1.pdf",
      "md5": "7d632263d28bc2ff05ee0cc426966a5a",
      "link": false,
      "rounds": 1,
      "type": "eq"
    },
    { "id": "openoffice-pdf",
      "file": "pdfs/openoffice.pdf",
      "md5": "ecb0225c7ce2df2cd9a5323a349f8f84",
      "link": false,
      "rounds": 1,
      "type": "load"
    },
    { "id": "openofficecidtruetype-pdf",
      "file": "pdfs/arial_unicode_en_cidfont.pdf",
      "md5": "03591cdf20214fb0b2dd5e5c3dd32d8c",
      "rounds": 1,
      "type": "load"
    },
    { "id": "openofficearabiccidtruetype-pdf",
      "file": "pdfs/arial_unicode_ab_cidfont.pdf",
      "md5": "35090fa7d29e7196ae3421812e554988",
      "rounds": 1,
      "type": "load"
    },
    { "id": "arabiccidtruetype-pdf",
      "file": "pdfs/ArabicCIDTrueType.pdf",
      "md5": "d66dbd18bdb572d3ac8b88b32de2ece6",
      "rounds": 1,
      "type": "load"
    },
    { "id": "arabiccidtruetype-text",
      "file": "pdfs/ArabicCIDTrueType.pdf",
      "md5": "d66dbd18bdb572d3ac8b88b32de2ece6",
      "rounds": 1,
      "type": "text"
    },
    { "id": "complexttffont-pdf",
      "file": "pdfs/complex_ttf_font.pdf",
      "md5": "76de93f9116b01b693bf8583b3e76d91",
      "rounds": 1,
      "type": "load"
    },
    { "id": "bug868745",
      "file": "pdfs/bug868745.pdf",
      "md5": "86111ea5097dd7daffcdea891ad1b348",
      "rounds": 1,
      "type": "load"
    },
    { "id": "thuluthfont-pdf",
      "file": "pdfs/ThuluthFeatures.pdf",
      "md5": "b7e18bf7a3d6a9c82aefa12d721072fc",
      "rounds": 1,
      "type": "eq"
    },
2015-08-29 22:44:26 +09:00
|
|
|
{ "id": "taro",
|
|
|
|
"file": "pdfs/TaroUTR50SortedList112.pdf",
|
|
|
|
"md5": "ce63eab622ff473a43f8a8de85ef8a46",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 4,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-06-21 07:03:30 +09:00
|
|
|
{ "id": "taro-text",
|
|
|
|
"file": "pdfs/TaroUTR50SortedList112.pdf",
|
|
|
|
"md5": "ce63eab622ff473a43f8a8de85ef8a46",
|
|
|
|
"link":true,
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 4,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2016-08-18 02:56:43 +09:00
|
|
|
{ "id": "taro-text-enhance",
|
|
|
|
"file": "pdfs/TaroUTR50SortedList112.pdf",
|
|
|
|
"md5": "ce63eab622ff473a43f8a8de85ef8a46",
|
|
|
|
"link":true,
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 4,
|
|
|
|
"enhance": true,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2013-06-21 07:03:30 +09:00
|
|
|
{ "id": "rotated-text",
|
|
|
|
"file": "pdfs/rotated.pdf",
|
|
|
|
"md5": "aed187f53e969ccdcbab0bb4c59f9e46",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2016-08-18 02:56:43 +09:00
|
|
|
{ "id": "rotated-text-enhance",
|
|
|
|
"file": "pdfs/rotated.pdf",
|
|
|
|
"md5": "aed187f53e969ccdcbab0bb4c59f9e46",
|
|
|
|
"rounds": 1,
|
|
|
|
"enhance": true,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2013-04-20 05:07:08 +09:00
|
|
|
{
|
|
|
|
"id": "issue3115",
|
2015-09-15 22:05:37 +09:00
|
|
|
"file": "pdfs/issue3115r.pdf",
|
2013-04-20 05:07:08 +09:00
|
|
|
"md5": "ea10f4131202b9b8f2a6cb7770d3f185",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"link": true,
|
2014-08-24 23:05:11 +09:00
|
|
|
"lastPage": 1,
|
|
|
|
"about": "The same file as issue2337."
|
2013-04-20 05:07:08 +09:00
|
|
|
},
|
2011-12-08 12:38:34 +09:00
|
|
|
{ "id": "freeculture",
|
|
|
|
"file": "pdfs/freeculture.pdf",
|
|
|
|
"md5": "dcdf3a8268e6a18938a42d5149efcfca",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 5,
|
2011-12-08 12:38:34 +09:00
|
|
|
"type": "eq"
|
|
|
|
},
|
2011-09-18 07:13:22 +09:00
|
|
|
{ "id": "wnv_chinese-pdf",
|
|
|
|
"file": "pdfs/wnv_chinese.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "db682638e68391125e8982d3c984841e",
|
2011-09-18 07:13:22 +09:00
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2011-07-21 13:17:31 +09:00
|
|
|
{ "id": "i9-pdf",
|
|
|
|
"file": "pdfs/i9.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "ba7cd54fdff083bb389295bc0415f6c5",
|
2011-07-21 13:17:31 +09:00
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2011-08-04 14:22:24 +09:00
|
|
|
},
|
|
|
|
{ "id": "hmm-pdf",
|
|
|
|
"file": "pdfs/hmm.pdf",
|
2016-01-14 00:30:26 +09:00
|
|
|
"md5": "e08467e60101ee5f4a59716e86db6dc9",
|
2011-08-04 14:22:24 +09:00
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
2013-03-05 12:33:50 +09:00
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 2,
|
2011-08-04 14:22:24 +09:00
|
|
|
"type": "load"
|
2011-08-07 06:41:18 +09:00
|
|
|
},
|
|
|
|
{ "id": "rotation",
|
|
|
|
"file": "pdfs/rotation.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "4fb25ada00ce7528569d9791c14decf5",
|
2011-08-07 06:41:18 +09:00
|
|
|
"rounds": 1,
|
2011-10-14 06:57:56 +09:00
|
|
|
"type": "eq"
|
2011-08-14 22:40:22 +09:00
|
|
|
},
|
|
|
|
{ "id": "ecma262-pdf",
|
|
|
|
"file": "pdfs/ecma262.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "763ead98f535578842891e5574e0af0f",
|
2011-08-14 22:40:22 +09:00
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "load"
|
2014-04-08 03:50:27 +09:00
|
|
|
},
|
2016-01-06 02:53:31 +09:00
|
|
|
{ "id": "issue6286",
|
|
|
|
"file": "pdfs/issue6286.pdf",
|
|
|
|
"md5": "d13fd1b98fb1c9980356314fd1d3a91b",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-10-30 11:07:02 +09:00
|
|
|
{ "id": "PDFJS-7562-reduced",
|
|
|
|
"file": "pdfs/PDFJS-7562-reduced.pdf",
|
|
|
|
"md5": "ddfb96fd492599fe54adbc685493ba3a",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-01-06 02:53:31 +09:00
|
|
|
{ "id": "issue3694_reduced",
|
2016-01-06 05:21:14 +09:00
|
|
|
"file": "pdfs/issue3694_reduced.pdf",
|
2016-01-06 02:53:31 +09:00
|
|
|
"md5": "c1438c7bad12d70c4cd684f8ce04448f",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-10-13 05:36:50 +09:00
|
|
|
{ "id": "bug847420",
|
|
|
|
"file": "pdfs/bug847420.pdf",
|
|
|
|
"md5": "0decd96fec4ef858c2c663a6de24e887",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-04-08 03:50:27 +09:00
|
|
|
{ "id": "bug878026",
|
|
|
|
"file": "pdfs/bug878026.pdf",
|
|
|
|
"md5": "13072db0586f2b4e96de189e23fc7395",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
2011-08-15 02:26:48 +09:00
|
|
|
},
|
2014-04-11 04:36:37 +09:00
|
|
|
{ "id": "issue4550-text",
|
|
|
|
"file": "pdfs/issue4550.pdf",
|
|
|
|
"md5": "d64cfc4b50e225f596130d9938e8d5cc",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2019-09-30 06:50:58 +09:00
|
|
|
{ "id": "issue9655-text",
|
|
|
|
"file": "pdfs/issue9655_reduced.pdf",
|
|
|
|
"md5": "87259a82cf3cda18e240517ca53c312a",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2011-08-15 02:26:48 +09:00
|
|
|
{ "id": "jai-pdf",
|
|
|
|
"file": "pdfs/jai.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "1f5dd128c3757420a881a155f2f8ace3",
|
2011-08-15 02:26:48 +09:00
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "load"
|
2011-08-29 06:36:58 +09:00
|
|
|
},
|
|
|
|
{ "id": "cable",
|
|
|
|
"file": "pdfs/cable.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "09a41b9a759d60c698228224ab85b46d",
|
2011-08-29 06:36:58 +09:00
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2011-08-31 12:32:56 +09:00
|
|
|
},
|
|
|
|
{ "id": "pdkids",
|
|
|
|
"file": "pdfs/pdkids.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "278982bf016dbe46d2066f9245d9b3e6",
|
2011-08-31 12:32:56 +09:00
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2011-09-15 11:29:32 +09:00
|
|
|
},
|
2011-09-14 09:23:49 +09:00
|
|
|
{ "id": "artofwar",
|
|
|
|
"file": "pdfs/artofwar.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "7bdd51c327b74f1f7abdd90eedb2f912",
|
2011-09-14 09:23:49 +09:00
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2011-09-15 11:45:12 +09:00
|
|
|
},
|
2013-06-24 03:20:47 +09:00
|
|
|
{ "id": "issue3371",
|
|
|
|
"file": "pdfs/issue3371.pdf",
|
|
|
|
"password": "ELXRTQWS",
|
|
|
|
"md5": "db2fedbd36d6fa27d4e52f9bd2d96b8c",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2015-02-28 20:58:53 +09:00
|
|
|
{ "id": "issue5334",
|
|
|
|
"file": "pdfs/issue5334.pdf",
|
|
|
|
"md5": "5575020f37f6e5b3c43b8183bf7f96ae",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-10-19 06:10:40 +09:00
|
|
|
{ "id": "issue7769",
|
|
|
|
"file": "pdfs/issue7769.pdf",
|
|
|
|
"md5": "814f167b8437eb8e4e4b6e89743011d5",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-12-05 03:52:45 +09:00
|
|
|
{ "id": "issue5044",
|
|
|
|
"file": "pdfs/issue5044.pdf",
|
|
|
|
"md5": "44788cd31dcb4a2495ded34a84c4a765",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
2015-02-28 20:58:53 +09:00
|
|
|
},
|
2015-07-25 19:26:36 +09:00
|
|
|
{ "id": "bug1186827",
|
|
|
|
"file": "pdfs/bug1186827.pdf",
|
|
|
|
"md5": "6c5526ae1a9d66cb517153001afc196e",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "issue215",
|
|
|
|
"file": "pdfs/issue215.pdf",
|
|
|
|
"md5": "31f3dc60ecf008987d970edfd2b1df61",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-04-08 20:07:29 +09:00
|
|
|
{ "id": "bug850854",
|
|
|
|
"file": "pdfs/bug850854.pdf",
|
|
|
|
"md5": "346a034a80120d123b9fefc42bcb11da",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2011-09-18 00:32:42 +09:00
|
|
|
{ "id": "wdsg_fitc",
|
|
|
|
"file": "pdfs/wdsg_fitc.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "5bb1c2b83705d4cdfc43197ee74f07f9",
|
2011-09-18 00:32:42 +09:00
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2011-09-18 01:23:34 +09:00
|
|
|
},
|
2011-10-02 11:24:11 +09:00
|
|
|
{ "id": "unix01",
|
|
|
|
"file": "pdfs/unix01.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "2742999f0bf9b9c035dbb0736096e220",
|
2011-10-02 11:24:11 +09:00
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-06-26 05:16:08 +09:00
|
|
|
{ "id": "issue4909",
|
|
|
|
"file": "pdfs/issue4909.pdf",
|
|
|
|
"md5": "ecd62637532aa8383837ef11d72a96b9",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-09-26 23:02:12 +09:00
|
|
|
{ "id": "issue4914",
|
|
|
|
"file": "pdfs/issue4914.pdf",
|
|
|
|
"md5": "6e1da9c5283d9acef0314d87ce953659",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq",
|
2015-12-21 23:15:26 +09:00
|
|
|
"annotations": true,
|
2015-09-26 23:02:12 +09:00
|
|
|
"about": "PDF with annotations, some of which have the Hidden flag set."
|
|
|
|
},
|
2018-08-04 08:47:45 +09:00
|
|
|
{ "id": "issue9949",
|
|
|
|
"file": "pdfs/issue9949.pdf",
|
|
|
|
"md5": "55d24d7fc71b849818ea91d3b9eaf302",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-11-07 21:04:03 +09:00
|
|
|
{ "id": "issue4665-text",
|
|
|
|
"file": "pdfs/issue4665.pdf",
|
|
|
|
"md5": "0de1308432819c101881df7ca4424575",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2011-10-12 08:26:25 +09:00
|
|
|
{ "id": "fit11-talk",
|
|
|
|
"file": "pdfs/fit11-talk.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "eb7b224107205db4fea9f7df0185f77d",
|
2011-10-12 08:26:25 +09:00
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2011-09-15 11:29:32 +09:00
|
|
|
{ "id": "fips197",
|
|
|
|
"file": "pdfs/fips197.pdf",
|
2012-10-17 00:30:14 +09:00
|
|
|
"md5": "4742c3f470cd8c4686a0dbb3da808b71",
|
2011-09-15 11:29:32 +09:00
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
2011-10-13 07:52:11 +09:00
|
|
|
"type": "eq"
|
2011-09-21 07:42:52 +09:00
|
|
|
},
|
2014-04-29 03:09:00 +09:00
|
|
|
{ "id": "issue4650",
|
|
|
|
"file": "pdfs/issue4650.pdf",
|
|
|
|
"md5": "ad736804f57f9f96f5ac108e514e1686",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
2014-06-15 05:51:13 +09:00
|
|
|
},
|
2020-06-13 21:41:34 +09:00
|
|
|
{ "id": "issue6707",
|
|
|
|
"file": "pdfs/issue6707.pdf",
|
|
|
|
"md5": "068ceaec23d265b1d38dfa6ab279f017",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-02-25 03:48:02 +09:00
|
|
|
{ "id": "issue6721_reduced",
|
|
|
|
"file": "pdfs/issue6721_reduced.pdf",
|
|
|
|
"md5": "719aa66d8081a15e3ba6032ed4279237",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-12-27 21:35:18 +09:00
|
|
|
{ "id": "issue6737",
|
|
|
|
"file": "pdfs/issue6737.pdf",
|
|
|
|
"md5": "6f091967ad15ba63855c56049e86c68b",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-06-27 07:41:44 +09:00
|
|
|
{ "id": "issue5010",
|
|
|
|
"file": "pdfs/issue5010.pdf",
|
|
|
|
"md5": "419f4b13403a0871c463ec69d96e342c",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-03-09 22:36:45 +09:00
|
|
|
{ "id": "franz",
|
|
|
|
"file": "pdfs/franz.pdf",
|
|
|
|
"md5": "12f0bbdec09900cbdd86e1737ed5f992",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Type1 font with |Ref|s in the Differences array of the Encoding dictionary."
|
|
|
|
},
|
2020-06-20 18:34:41 +09:00
|
|
|
{ "id": "issue8078",
|
|
|
|
"file": "pdfs/issue8078.pdf",
|
|
|
|
"md5": "8b7d74bc24b4157393e4e88a511c05f1",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2019-05-24 09:45:25 +09:00
|
|
|
{ "id": "issue8092",
|
|
|
|
"file": "pdfs/issue8092.pdf",
|
|
|
|
"md5": "e4f3376b35fd132580246c3db1fbd738",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2015-03-09 22:36:45 +09:00
|
|
|
},
|
2015-09-28 22:09:24 +09:00
|
|
|
{ "id": "franz_2",
|
|
|
|
"file": "pdfs/franz_2.pdf",
|
|
|
|
"md5": "9d301ed8816e879891115b5cc3c39559",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "XObject with BBox array containing indirect object."
|
|
|
|
},
|
2016-03-26 22:41:15 +09:00
|
|
|
{ "id": "issue7115",
|
|
|
|
"file": "pdfs/issue7115.pdf",
|
|
|
|
"md5": "63c78e25a0433dd5a01ddff6ec720f29",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"annotations": true,
|
|
|
|
"about": "Annotation with Rect array containing indirect objects."
|
|
|
|
},
|
2020-05-22 21:07:28 +09:00
|
|
|
{ "id": "issue11922",
|
|
|
|
"file": "pdfs/issue11922_reduced.pdf",
|
|
|
|
"md5": "711bb4b4bf92d967644b4f88d11a93db",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2017-09-02 17:06:44 +09:00
|
|
|
{ "id": "file_url_link",
|
|
|
|
"file": "pdfs/file_url_link.pdf",
|
|
|
|
"md5": "b0253c96c38d43bc49259bbf36db938a",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"annotations": true,
|
|
|
|
"about": "Annotation with (unsupported) file:// URL."
|
|
|
|
},
|
2020-08-08 03:46:41 +09:00
|
|
|
{ "id": "bug1538111",
|
|
|
|
"file": "pdfs/bug1538111.pdf",
|
|
|
|
"md5": "3f3635cfc25d132fb1054042e520e297",
|
|
|
|
"rounds": 1,
|
|
|
|
"annotations": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2019-06-01 03:44:24 +09:00
|
|
|
{ "id": "bug1552113",
|
|
|
|
"file": "pdfs/bug1552113.pdf",
|
|
|
|
"md5": "dafb7ba1328e8deaab2e3619c94bf974",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"annotations": true,
|
|
|
|
"about": "Annotation with (ridiculously) large border width."
|
|
|
|
},
|
2020-09-24 17:28:29 +09:00
|
|
|
{ "id": "bug1627030",
|
|
|
|
"file": "pdfs/bug1627030.pdf",
|
|
|
|
"md5": "4cde6134daa80449c43defd02c1393e2",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 3,
|
|
|
|
"lastPage": 3,
|
|
|
|
"type": "eq",
|
|
|
|
"annotations": true
|
|
|
|
},
|
2014-06-15 05:51:13 +09:00
|
|
|
{ "id": "issue4934",
|
|
|
|
"file": "pdfs/issue4934.pdf",
|
|
|
|
"md5": "6099da44f677702ae65a648b51a2226d",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
2014-04-29 03:09:00 +09:00
|
|
|
},
|
2015-01-02 22:21:56 +09:00
|
|
|
{ "id": "issue4061",
|
|
|
|
"file": "pdfs/issue4061.pdf",
|
|
|
|
"md5": "236aaa8840a47c3c061f8e3034549764",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-09-22 17:11:27 +09:00
|
|
|
{ "id": "issue4090",
|
|
|
|
"file": "pdfs/issue4090.pdf",
|
|
|
|
"md5": "8cec73e090985acf6094c683d7944425",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-01-02 22:21:56 +09:00
|
|
|
{ "id": "issue5202",
|
|
|
|
"file": "pdfs/issue5202.pdf",
|
|
|
|
"md5": "bb9cc69211112e66aab40828086a4e5a",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-08-31 21:03:25 +09:00
|
|
|
{ "id": "issue5238",
|
|
|
|
"file": "pdfs/issue5238.pdf",
|
|
|
|
"md5": "6ddecda00893be1793de20a70c83a3c2",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-11-05 00:16:48 +09:00
|
|
|
{ "id": "issue5470",
|
|
|
|
"file": "pdfs/issue5470.pdf",
|
|
|
|
"md5": "4805fdcd7e142e8df3c04c6ba06025af",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-10-01 21:46:03 +09:00
|
|
|
{ "id": "xref_command_missing",
|
|
|
|
"file": "pdfs/xref_command_missing.pdf",
|
|
|
|
"md5": "06cdb0f13cfeff41d6bfb24b7bbe1268",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
2019-10-13 01:15:55 +09:00
|
|
|
"type": "eq"
|
2015-10-01 21:46:03 +09:00
|
|
|
},
|
2020-11-25 21:44:57 +09:00
|
|
|
{ "id": "issue12402",
|
|
|
|
"file": "pdfs/issue12402.pdf",
|
|
|
|
"md5": "70031cf610e24cc7164fb6ecd6980c8e",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 8,
|
|
|
|
"lastPage": 8,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2018-08-26 08:49:31 +09:00
|
|
|
{ "id": "issue10004",
|
|
|
|
"file": "pdfs/issue10004.pdf",
|
|
|
|
"md5": "64d1853060cefe3be50e5c4617dd0505",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2019-01-13 04:31:23 +09:00
|
|
|
{ "id": "issue10388",
|
|
|
|
"file": "pdfs/issue10388_reduced.pdf",
|
|
|
|
"md5": "62a6b2adbea1535432bd94a3516e2d4c",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-09-26 09:24:21 +09:00
|
|
|
{ "id": "issue7507",
|
|
|
|
"file": "pdfs/issue7507.pdf",
|
|
|
|
"md5": "f7aeaafe0c89b94436e94eaa63307303",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2018-02-08 04:53:44 +09:00
|
|
|
{ "id": "issue9458",
|
|
|
|
"file": "pdfs/issue9458.pdf",
|
|
|
|
"md5": "ee54358d8b2fdc75dc8da5220cf8e8da",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-01-11 22:54:12 +09:00
|
|
|
{ "id": "issue5501",
|
|
|
|
"file": "pdfs/issue5501.pdf",
|
|
|
|
"md5": "55a60699728fc92f491a2d7d490474e4",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
2017-08-04 14:19:36 +09:00
|
|
|
},
|
2017-10-31 21:01:29 +09:00
|
|
|
{ "id": "issue9084",
|
|
|
|
"file": "pdfs/issue9084.pdf",
|
|
|
|
"md5": "5570ec01cc869d299fec1b2f68926a08",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
Add basic support for non-embedded Calibri fonts (issue 9195)
There are a number of issues with the fonts in the referenced PDF file. First of all, they contain broken `ToUnicode` data (`NUL` bytes all over the place). However, even if you skip those, the `ToUnicode` data appears to contain nothing but an `IdentityH` CMap, which won't help provide a proper glyph mapping.
The real issue actually turns out to be that the PDF file uses the "Calibri" font[1], but doesn't include any font files. Since that one isn't a standard font, and uses a fairly different CID to GID map compared to the standard fonts, we're not able to render the file even remotely correctly.
To work around this, I'm thus proposing that we include an (incomplete) glyph map for Calibri, and fall back to the standard Helvetica font. Obviously this isn't going to look perfect, but it's really the best that we can hope to achieve given that the PDF file is missing the necessary font data.
Finally, please note that none of the PDF readers I've tried (Adobe Reader, PDFium in Chrome) were able to extract the text (which isn't very surprising, given the broken `ToUnicode` data).
Fixes 9195.
---
[1] According to Wikipedia, see https://en.wikipedia.org/wiki/Calibri, Calibri is (primarily) a Windows font.
2017-12-03 22:02:22 +09:00
|
|
|
{ "id": "issue9195",
|
|
|
|
"file": "pdfs/issue9195.pdf",
|
|
|
|
"md5": "90e78a11abdc6c5ae79b8b95cfbb1895",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2018-06-19 16:37:56 +09:00
|
|
|
{ "id": "issue9252",
|
|
|
|
"file": "pdfs/issue9252.pdf",
|
|
|
|
"md5": "c7d039d808d9344a95d2c9cfa7586ca3",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2018-06-19 18:31:31 +09:00
|
|
|
{ "id": "issue9418",
|
|
|
|
"file": "pdfs/issue9418.pdf",
|
|
|
|
"md5": "32ecad8098acb1938539d47944ecb54b",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-12-15 08:23:56 +09:00
|
|
|
{ "id": "issue9262",
|
|
|
|
"file": "pdfs/issue9262_reduced.pdf",
|
|
|
|
"md5": "5347ce2d7b3866625c22e115fd90e0de",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
Attempt to actually resolve ColourSpace names in accordance with the specification (issue 9285)
Please refer to the PDF specification, in particular http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G7.3801570
> A colour space shall be specified in one of two ways:
> - Within a content stream, the CS or cs operator establishes the current colour space parameter in the graphics state. The operand shall always be name object, which either identifies one of the colour spaces that need no additional parameters (DeviceGray, DeviceRGB, DeviceCMYK, or some cases of Pattern) or shall be used as a key in the ColorSpace subdictionary of the current resource dictionary (see 7.8.3, "Resource Dictionaries"). In the latter case, the value of the dictionary entry in turn shall be a colour space array or name. A colour space array shall never be inline within a content stream.
>
> - Outside a content stream, certain objects, such as image XObjects, shall specify a colour space as an explicit parameter, often associated with the key ColorSpace. In this case, the colour space array or name shall always be defined directly as a PDF object, not by an entry in the ColorSpace resource subdictionary. This convention also applies when colour spaces are defined in terms of other colour spaces.
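The two cases quoted above can be sketched roughly as follows. This is a minimal Python illustration of the lookup rule only, not the pdf.js implementation; the function names, the plain-dict representation of the resource dictionary, and the choice to let the parameterless names take precedence over a same-named resource entry are all assumptions made for the example.

```python
# Hypothetical sketch of colour space name resolution; none of these names
# come from pdf.js. Resource dictionaries are modelled as plain dicts.

# Names that need no additional parameters. The specification says only
# "some cases of Pattern" qualify; this sketch does not distinguish them.
PARAMETERLESS_SPACES = {"DeviceGray", "DeviceRGB", "DeviceCMYK", "Pattern"}


def resolve_content_stream_cs(name, resources):
    """Resolve the name operand of a CS/cs operator inside a content stream."""
    if name in PARAMETERLESS_SPACES:
        # Assumption: a parameterless name is used directly.
        return name
    # Otherwise the name is a key in the ColorSpace subdictionary of the
    # current resource dictionary; its value is a colour space array or name.
    color_spaces = resources.get("ColorSpace", {})
    if name not in color_spaces:
        raise KeyError("unknown colour space name: " + name)
    return color_spaces[name]


def resolve_object_cs(value):
    """Resolve a colour space given outside a content stream.

    Here the array or name is defined directly on the object (for example an
    image XObject), so it is never looked up in the resource dictionary.
    """
    return value
```

For example, with `resources = {"ColorSpace": {"CS0": ["ICCBased", "stream-ref"]}}`, the operand `CS0` resolves to the array stored under that key, while `DeviceRGB` is returned unchanged.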
2017-12-20 23:35:04 +09:00
|
|
|
{ "id": "issue9285",
|
|
|
|
"file": "pdfs/issue9285.pdf",
|
|
|
|
"md5": "aa53ad98a72fd76c414101927951448b",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-08-04 14:19:36 +09:00
|
|
|
{ "id": "issue8707",
|
|
|
|
"file": "pdfs/issue8707.pdf",
|
|
|
|
"md5": "d3dc670adde9ec9fb82c974027033029",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
2015-01-11 22:54:12 +09:00
|
|
|
},
|
2017-09-16 04:23:47 +09:00
|
|
|
{ "id": "issue8895",
|
|
|
|
"file": "pdfs/issue8895.pdf",
|
|
|
|
"md5": "098658008fc2bf7d433fd0d6d468a9e1",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-12-09 00:37:12 +09:00
|
|
|
{ "id": "issue9105",
|
|
|
|
"file": "pdfs/issue9105_reduced.pdf",
|
|
|
|
"md5": "f3889f7c7b60e1ab998aac430cc7e08e",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2020-07-15 07:17:27 +09:00
|
|
|
{ "id": "issue269_1",
|
|
|
|
"file": "pdfs/issue269_1.pdf",
|
|
|
|
"md5": "ab932f697b4d2e2bf700de15a8efea9c",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Optional marked content."
|
|
|
|
},
|
|
|
|
{ "id": "issue269_2",
|
|
|
|
"file": "pdfs/issue269_2.pdf",
|
|
|
|
"md5": "0f553510850ee17c87fbab3fac564165",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Optional marked content."
|
|
|
|
},
|
|
|
|
{ "id": "issue11144_reduced",
|
|
|
|
"file": "pdfs/issue11144_reduced.pdf",
|
|
|
|
"md5": "09e3e771ebd6867558074e900adb54b9",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Optional marked content."
|
|
|
|
},
|
|
|
|
{ "id": "issue12007_reduced",
|
|
|
|
"file": "pdfs/issue12007_reduced.pdf",
|
|
|
|
"md5": "3aa9d8a0c5ff8594245149f9c7379613",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Optional marked content."
|
|
|
|
},
|
2019-01-11 01:49:33 +09:00
|
|
|
{ "id": "issue10438",
|
|
|
|
"file": "pdfs/issue10438_reduced.pdf",
|
|
|
|
"md5": "bb26f68493e33af17b256a6ffe777a24",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2020-03-25 22:06:01 +09:00
|
|
|
{ "id": "issue11740",
|
|
|
|
"file": "pdfs/issue11740_reduced.pdf",
|
|
|
|
"md5": "f3f2957f171af52229c6e749e8a5572b",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2019-01-25 06:27:36 +09:00
|
|
|
{ "id": "issue10491",
|
|
|
|
"file": "pdfs/issue10491.pdf",
|
|
|
|
"md5": "0759ec46739b13bb0b66170a18d33d4f",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 2,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2019-04-22 00:03:38 +09:00
|
|
|
{ "id": "issue10542",
|
|
|
|
"file": "pdfs/issue10542_reduced.pdf",
|
|
|
|
"md5": "92406cb903be6c7a63221ba61fcb8eaf",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-09-18 01:06:43 +09:00
|
|
|
{ "id": "issue6289",
|
|
|
|
"file": "pdfs/issue6289.pdf",
|
|
|
|
"md5": "0869f3d147c734ec484ffd492104095d",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-12-01 05:15:18 +09:00
|
|
|
{ "id": "issue5509",
|
|
|
|
"file": "pdfs/issue5509.pdf",
|
|
|
|
"md5": "1975ef8db7355b1d691bc79d0749574b",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2017-09-06 19:41:48 +09:00
|
|
|
{ "id": "pr8808",
|
|
|
|
"file": "pdfs/pr8808.pdf",
|
|
|
|
"md5": "bdac6051a98fd8dcfc5344b05fed06f4",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-12-30 23:43:04 +09:00
|
|
|
{ "id": "issue5599",
|
|
|
|
"file": "pdfs/issue5599.pdf",
|
|
|
|
"md5": "529a4a9409ac024aeb57a047210280fe",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2015-11-06 23:54:50 +09:00
|
|
|
{ "id": "issue1045",
|
|
|
|
"file": "pdfs/issue1045.pdf",
|
|
|
|
"md5": "61d7e9bfbc03cd457dcefeab3e78a687",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2015-06-19 04:53:15 +09:00
|
|
|
{ "id": "issue5677",
|
|
|
|
"file": "pdfs/issue5677.pdf",
|
|
|
|
"md5": "c9101578fcb806269145132724d24ac1",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-12-28 22:10:30 +09:00
|
|
|
{ "id": "issue5946",
|
|
|
|
"file": "pdfs/issue5946.pdf",
|
|
|
|
"md5": "1217a3c8e2ee4ceb96d85a2f27e437b4",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-10-19 10:33:35 +09:00
|
|
|
{ "id": "issue8960_reduced",
|
|
|
|
"file": "pdfs/issue8960_reduced.pdf",
|
|
|
|
"md5": "12ccf71307f4b5bd4148d5f985ffde07",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-05-20 22:08:55 +09:00
|
|
|
{ "id": "issue5954",
|
|
|
|
"file": "pdfs/issue5954.pdf",
|
|
|
|
"md5": "4f60ec0d9bbeec845b681242b8982361",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-03-03 20:22:55 +09:00
|
|
|
{ "id": "issue8125",
|
|
|
|
"file": "pdfs/issue8125.pdf",
|
|
|
|
"md5": "2073d699ea82156682542f811300b3e8",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-03-17 18:29:08 +09:00
|
|
|
{ "id": "issue8169",
|
|
|
|
"file": "pdfs/issue8169.pdf",
|
|
|
|
"md5": "62fd6479f9e1c8c5ce8cba6b1781d0a5",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
2017-03-24 00:08:59 +09:00
|
|
|
},
|
|
|
|
{ "id": "issue8182",
|
|
|
|
"file": "pdfs/issue8182.pdf",
|
|
|
|
"md5": "e295ae13dcbefd449f9a4957aed5e582",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
2017-03-17 18:29:08 +09:00
|
|
|
},
|
2018-02-01 18:35:38 +09:00
|
|
|
{ "id": "issue9425",
|
|
|
|
"file": "pdfs/issue9425.pdf",
|
|
|
|
"md5": "cb5e99c9ada308304ca2dfcb7f72e3a0",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2018-03-12 22:00:37 +09:00
|
|
|
{ "id": "issue9540",
|
|
|
|
"file": "pdfs/issue9540.pdf",
|
|
|
|
"md5": "7de7979270c9136bdd737428185fbbed",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2011-09-21 07:42:52 +09:00
|
|
|
{ "id": "txt2pdf",
|
|
|
|
"file": "pdfs/txt2pdf.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "02cefa0f5e8d96313bb05163b2f88c8c",
|
2011-09-21 07:42:52 +09:00
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "load"
|
2011-09-22 02:04:21 +09:00
|
|
|
},
|
|
|
|
{ "id": "f1040",
|
|
|
|
"file": "pdfs/f1040.pdf",
|
2013-12-17 09:37:10 +09:00
|
|
|
"md5": "7323b50c6d28d959b8b4b92c469b2469",
|
2011-09-22 02:04:21 +09:00
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "load"
|
2011-09-23 05:17:28 +09:00
|
|
|
},
|
2016-09-18 22:35:12 +09:00
|
|
|
{ "id": "f1040-annotations",
|
|
|
|
"file": "pdfs/f1040.pdf",
|
|
|
|
"md5": "7323b50c6d28d959b8b4b92c469b2469",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"annotations": true
|
|
|
|
},
|
2016-09-07 23:06:18 +09:00
|
|
|
{ "id": "f1040-forms",
|
|
|
|
"file": "pdfs/f1040.pdf",
|
|
|
|
"md5": "7323b50c6d28d959b8b4b92c469b2469",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"forms": true
|
|
|
|
},
|
2017-09-20 05:43:23 +09:00
|
|
|
{ "id": "jbig2_symbol_offset",
|
|
|
|
"file": "pdfs/jbig2_symbol_offset.pdf",
|
|
|
|
"md5": "6b22a0f838008fa4d8cb5b40ba095c48",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-03-26 20:40:37 +09:00
|
|
|
{ "id": "bug1046314",
|
|
|
|
"file": "pdfs/bug1046314.pdf",
|
|
|
|
"md5": "fc658439f44cd2dd27c8bee7e7a8344e",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-04-02 23:26:14 +09:00
|
|
|
{ "id": "bug1050040",
|
|
|
|
"file": "pdfs/bug1050040.pdf",
|
|
|
|
"md5": "9076b29bd157e2646b457f29a4472a07",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-09-01 20:31:02 +09:00
|
|
|
{ "id": "bug1200096",
|
|
|
|
"file": "pdfs/bug1200096.pdf",
|
|
|
|
"md5": "b6bd8df094b5d511c13ed095d2a07515",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2019-02-22 00:25:34 +09:00
|
|
|
{ "id": "bug1245391",
|
|
|
|
"file": "pdfs/bug1245391_reduced.pdf",
|
|
|
|
"md5": "6c946045ee0f2f663f269717c0f1614a",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2019-03-08 20:55:44 +09:00
|
|
|
{ "id": "bug1245391-text",
|
|
|
|
"file": "pdfs/bug1245391_reduced.pdf",
|
|
|
|
"md5": "6c946045ee0f2f663f269717c0f1614a",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "text"
|
|
|
|
},
|
|
|
|
{ "id": "bug1513120-text",
|
|
|
|
"file": "pdfs/bug1513120_reduced.pdf",
|
|
|
|
"md5": "e88ad8b5bb385296f475ca51ce0d216d",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2020-03-22 22:09:08 +09:00
|
|
|
{ "id": "issue11713",
|
|
|
|
"file": "pdfs/issue11713.pdf",
|
|
|
|
"md5": "bafe5801234feeb95969da106f2ce6d8",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2016-02-25 01:56:28 +09:00
|
|
|
{ "id": "bug1250079",
|
|
|
|
"file": "pdfs/bug1250079.pdf",
|
|
|
|
"md5": "a1dd21a70ae7097d96273e85a80b26ef",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
2019-10-13 01:15:55 +09:00
|
|
|
"type": "eq"
|
2016-02-25 01:56:28 +09:00
|
|
|
},
|
2016-10-08 03:51:02 +09:00
|
|
|
{ "id": "bug1308536",
|
|
|
|
"file": "pdfs/bug1308536.pdf",
|
|
|
|
"md5": "cc2258981e33ad8d96acbf87318716d5",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-08-25 02:14:33 +09:00
|
|
|
{ "id": "bug1393476",
|
|
|
|
"file": "pdfs/bug1393476.pdf",
|
|
|
|
"md5": "163ee8727c77f27ee651eec777bb20a9",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2020-05-04 23:51:08 +09:00
|
|
|
{ "id": "issue11871",
|
|
|
|
"file": "pdfs/issue11871.pdf",
|
|
|
|
"md5": "9c533eacd0ca892df4191360848668a2",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 2,
|
|
|
|
"lastPage": 2,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-09-17 21:36:42 +09:00
|
|
|
{ "id": "bug1252420",
|
|
|
|
"file": "pdfs/bug1252420.pdf",
|
|
|
|
"md5": "f21c911b9b655972b06ef782a1fa6a17",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-09-21 03:37:56 +09:00
|
|
|
{ "id": "bug1392647",
|
|
|
|
"file": "pdfs/bug1392647.pdf",
|
|
|
|
"md5": "9770ea476630ca7d560b7c39430f8850",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2011-09-23 05:17:28 +09:00
|
|
|
{ "id": "hudsonsurvey",
|
|
|
|
"file": "pdfs/hudsonsurvey.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "bf0e6576a7b6c2fe7485bce1b78e006f",
|
2011-09-23 05:17:28 +09:00
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "load"
|
2011-09-24 23:44:50 +09:00
|
|
|
},
|
2011-09-24 06:37:44 +09:00
|
|
|
{ "id": "extgstate",
|
|
|
|
"file": "pdfs/extgstate.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "001bb4ec04463a01d93aad748361f049",
|
2011-09-24 06:37:44 +09:00
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
2011-10-14 06:57:56 +09:00
|
|
|
"type": "eq"
|
2011-09-24 23:57:59 +09:00
|
|
|
},
|
2016-06-01 06:40:19 +09:00
|
|
|
{ "id": "extgstate-text",
|
|
|
|
"file": "pdfs/extgstate.pdf",
|
|
|
|
"md5": "001bb4ec04463a01d93aad748361f049",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2011-10-08 12:24:26 +09:00
|
|
|
{ "id": "usmanm-bad",
|
|
|
|
"file": "pdfs/usmanm-bad.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "38afb822433aaf07fc8f54807cd4f61a",
|
2011-10-08 12:24:26 +09:00
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-03-08 04:56:15 +09:00
|
|
|
{ "id": "bug1132849",
|
|
|
|
"file": "pdfs/bug1132849.pdf",
|
|
|
|
"md5": "aedfbead1f8feb35cf2e38b279133b47",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "issue6894",
|
|
|
|
"file": "pdfs/issue6894.pdf",
|
|
|
|
"md5": "bb84f2025c11f23cf436170049f81215",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-01-27 22:17:14 +09:00
|
|
|
{ "id": "personwithdog",
|
|
|
|
"file": "pdfs/personwithdog.pdf",
|
|
|
|
"md5": "cd68fb2ce00dab97801b3e51495b99e3",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-07-31 20:59:02 +09:00
|
|
|
{ "id": "issue2948",
|
|
|
|
"file": "pdfs/issue2948.pdf",
|
|
|
|
"md5": "26210bed6a57d5466042aff22f0249f0",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-10-21 03:22:06 +09:00
|
|
|
{ "id": "issue6541",
|
|
|
|
"file": "pdfs/issue6541.pdf",
|
|
|
|
"md5": "81bc5b146404207ea40f2c55301b2bb6",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-08-01 00:55:51 +09:00
|
|
|
{ "id": "issue6231_1",
|
|
|
|
"file": "pdfs/issue6231_1.pdf",
|
|
|
|
"md5": "eb13a9366a5142833a858472c68b4749",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-04-13 03:37:49 +09:00
|
|
|
{ "id": "usmanm-bad-auto-fetch",
|
|
|
|
"file": "pdfs/usmanm-bad.pdf",
|
|
|
|
"md5": "38afb822433aaf07fc8f54807cd4f61a",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"enableAutoFetch": true
|
|
|
|
},
|
2011-10-02 09:01:58 +09:00
|
|
|
{ "id": "vesta-bad",
|
|
|
|
"file": "pdfs/vesta.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "0afebc109b7c17b95619ea3fab5eafe6",
|
2011-10-02 09:01:58 +09:00
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2015-11-13 05:41:16 +09:00
|
|
|
{ "id": "issue6621",
|
|
|
|
"file": "pdfs/issue6621.pdf",
|
|
|
|
"md5": "8079ce514fb2cdded4251eade6380ba9",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-12-08 06:30:09 +09:00
|
|
|
{ "id": "issue5084",
|
|
|
|
"file": "pdfs/issue5084.pdf",
|
|
|
|
"md5": "a42a076ba90e20e3aae9af869eb4de45",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2011-12-09 10:17:24 +09:00
|
|
|
{ "id": "scan-bad",
|
|
|
|
"file": "pdfs/scan-bad.pdf",
|
|
|
|
"md5": "4cf988f01ab83f61aca57f406dfd6584",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2016-02-13 02:05:19 +09:00
|
|
|
{ "id": "issue6549",
|
|
|
|
"file": "pdfs/issue6549.pdf",
|
|
|
|
"md5": "699aeea73a6f45375022ffc6cc80f12a",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"firstPage": 2,
|
|
|
|
"lastPage": 3,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2011-09-24 23:44:50 +09:00
|
|
|
{ "id": "ibwa-bad",
|
|
|
|
"file": "pdfs/ibwa-bad.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "6ca059d32b74ac2688ae06f727fee755",
|
2011-09-24 23:44:50 +09:00
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "load"
|
2011-09-29 09:54:40 +09:00
|
|
|
},
|
2012-04-18 02:39:17 +09:00
|
|
|
{ "id": "mixedfonts",
|
|
|
|
"file": "pdfs/mixedfonts.pdf",
|
|
|
|
"md5": "a582b83fa1b3a25a6f13803a367c71ec",
|
|
|
|
"link": false,
|
2011-09-29 09:54:40 +09:00
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2011-10-05 00:52:15 +09:00
|
|
|
},
|
2011-10-12 09:45:55 +09:00
|
|
|
{ "id": "pal-o47",
|
|
|
|
"file": "pdfs/pal-o47.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "81ae15e539e89f0f0b41169d923b611b",
|
2011-10-12 09:45:55 +09:00
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-07-31 05:15:06 +09:00
|
|
|
{ "id": "issue4800",
|
|
|
|
"file": "pdfs/issue4800.pdf",
|
|
|
|
"md5": "80d285dfdb8e8e0dd66a2353f0c78b05",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "TrueType font with (0, 1) cmap."
|
|
|
|
},
|
2014-06-24 03:55:51 +09:00
|
|
|
{ "id": "issue4801",
|
|
|
|
"file": "pdfs/issue4801.pdf",
|
|
|
|
"md5": "7f32764717447a8b5c8eac08c9ab8380",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
Always choose a (3, 1) cmap table for TrueType fonts that have an encoding specified, regardless of the Symbolic font flag (bug 1337429)
This patch basically reverts one aspect of TrueType (3, 1) cmap parsing to the state prior to PR 4259. After that PR, a number of regressions occurred in this particular code-path, which necessitated a number of follow-ups such as PRs 5703, 5743, and 6425.
The empirical data suggests, at least to me, that we should always prefer a (3, 1) cmap for TrueType fonts when they have an encoding, regardless of the Symbolic font flag.
Obviously this patch passes all unit/font/reference tests locally, and I made sure that all the PRs mentioned above landed with test-cases included.
However, in my opinion, there's still a very real possibility that this patch could cause new regressions.
Given that the PDF file in bug 1337429 has been broken for almost *three* years before anyone noticed, and considering that the code-path in question has been the source of numerous regressions, I do *not* intend to request uplift of this patch to previous Firefox versions (assuming that it's even accepted).
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1337429.
2017-02-15 22:18:42 +09:00
|
|
|
{ "id": "bug1337429",
|
|
|
|
"file": "pdfs/bug1337429.pdf",
|
|
|
|
"md5": "4e6e4dfdab884e9465bdce657b590028",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-05-04 00:28:30 +09:00
|
|
|
{ "id": "glyph_accent",
|
|
|
|
"file": "pdfs/glyph_accent.pdf",
|
|
|
|
"md5": "1526e4edaa3ec439ebf156d0a0b385aa",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Glyph accent drawn as curves."
|
|
|
|
},
|
2014-10-03 02:58:56 +09:00
|
|
|
{ "id": "issue5138",
|
|
|
|
"file": "pdfs/issue5138.pdf",
|
|
|
|
"md5": "9931686d7dee0df62640fbf58bed3323",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Glyph that gets mapped to unicode non-breaking-space."
|
|
|
|
},
|
2011-10-04 08:36:01 +09:00
|
|
|
{ "id": "simpletype3font",
|
|
|
|
"file": "pdfs/simpletype3font.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "b374c7543920840c61999e9e86939f99",
|
2011-10-04 08:36:01 +09:00
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
2011-10-05 01:06:51 +09:00
|
|
|
"type": "eq"
|
2011-10-29 06:11:14 +09:00
|
|
|
},
|
2012-09-15 03:37:00 +09:00
|
|
|
{ "id": "simpletype3font-text",
|
|
|
|
"file": "pdfs/simpletype3font.pdf",
|
|
|
|
"md5": "b374c7543920840c61999e9e86939f99",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2019-04-11 19:26:15 +09:00
|
|
|
{ "id": "issue10717",
|
|
|
|
"file": "pdfs/issue10717.pdf",
|
|
|
|
"md5": "6d2ed03db798cc6beb3c7bdf103f5c1a",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 2,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Type3 fonts with image resources; both pages need to be tested, otherwise the bug won't manifest."
|
|
|
|
},
|
2020-11-06 01:49:32 +09:00
|
|
|
{ "id": "issue12504",
|
|
|
|
"file": "pdfs/issue12504.pdf",
|
|
|
|
"md5": "04fcc87f3e7e9e925e3ef83cf0bf49f4",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"annotations": true
|
|
|
|
},
|
2011-10-28 22:38:55 +09:00
|
|
|
{ "id": "close-path-bug",
|
|
|
|
"file": "pdfs/close-path-bug.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "48dd17ef58393857d2d038d33699cac5",
|
2011-10-28 22:38:55 +09:00
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2011-11-01 00:51:45 +09:00
|
|
|
},
|
2011-10-29 06:11:14 +09:00
|
|
|
{ "id": "alphatrans",
|
|
|
|
"file": "pdfs/alphatrans.pdf",
|
2011-11-05 01:14:52 +09:00
|
|
|
"md5": "5ca2d3da0c5f20b3a5a14e895ad24b65",
|
2011-10-29 06:11:14 +09:00
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2011-11-10 02:39:55 +09:00
|
|
|
},
|
2020-09-28 21:39:48 +09:00
|
|
|
{ "id": "issue12418",
|
|
|
|
"file": "pdfs/issue12418_reduced.pdf",
|
|
|
|
"md5": "596b70f00a5f88ff58f4f4d06fcf75f1",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-11-26 21:27:12 +09:00
|
|
|
{ "id": "issue6692",
|
|
|
|
"file": "pdfs/issue6692.pdf",
|
|
|
|
"md5": "ba078e0ddd59cda4b6c51ea10599f49a",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"firstPage": 11,
|
|
|
|
"lastPage": 11,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2011-11-12 07:44:47 +09:00
|
|
|
{ "id": "devicen",
|
|
|
|
"file": "pdfs/devicen.pdf",
|
2011-11-23 02:06:53 +09:00
|
|
|
"md5": "aac6a91725435d1376c6ff492dc5cb75",
|
2011-11-12 07:44:47 +09:00
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2011-11-23 03:39:26 +09:00
|
|
|
},
|
2015-06-14 04:05:13 +09:00
|
|
|
{ "id": "issue5939",
|
|
|
|
"file": "pdfs/issue5939.pdf",
|
|
|
|
"md5": "43c61e06ad407c158763f0860c99bb04",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2011-11-10 02:39:55 +09:00
|
|
|
{ "id": "cmykjpeg",
|
|
|
|
"file": "pdfs/cmykjpeg.pdf",
|
2011-11-23 03:39:26 +09:00
|
|
|
"md5": "85d162b48ce98503a382d96f574f70a2",
|
2011-11-10 02:39:55 +09:00
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2011-11-30 08:47:53 +09:00
|
|
|
},
|
2016-03-02 10:05:33 +09:00
|
|
|
{ "id": "issue4402_reduced",
|
|
|
|
"file": "pdfs/issue4402_reduced.pdf",
|
|
|
|
"md5": "6cc7e61a581889eec3ed7402d87161c4",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-06-13 21:22:15 +09:00
|
|
|
{ "id": "issue7403",
|
|
|
|
"file": "pdfs/issue7403.pdf",
|
|
|
|
"md5": "0f7bb6b3c58e33bbf76ce5161cd665c3",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-07-23 20:04:27 +09:00
|
|
|
{ "id": "issue7406",
|
|
|
|
"file": "pdfs/issue7406.pdf",
|
|
|
|
"md5": "7a3d322d7c595a36b4470cfb6a54a2b7",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
Allow over-writing entries, in `XRef.indexObjects`, only when the generation number matches (issues 11230, 11139, 9552, 9129, 7303)
This patch is making me somewhat worried about future regressions, since it's certainly easy to imagine this completely breaking certain kinds of corrupt/edited PDF documents while fixing others.[1]
Obviously it passes all existing reference tests (and even improves one); however, compared to many other patches there's no telling how much it could break.
The only reason that I'm even submitting this patch is the number of open issues that it would address.
Generally speaking though, the best course of action would probably be to re-write `XRef.indexObjects` to be much more robust (since it currently feels somewhat hand-wavy in parts), e.g. by actually checking/validating more of the objects before committing to them.
---
[1] Especially given that it's reverting part of PR 5910, however in the case of issue 5909 it seems that other (more recent) changes have actually made that PR redundant.
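The rule in the title can be illustrated with a minimal sketch. This is not the actual `XRef.indexObjects` code: the `recordEntry` helper and the `Map`-based `entries` table are hypothetical names used only for illustration, and the real implementation may differ in its details.

```javascript
// Hypothetical sketch; not the actual `XRef.indexObjects` code in PDF.js.
// `entries` maps an object number to the { offset, gen } pair found so far.
function recordEntry(entries, num, gen, offset) {
  const existing = entries.get(num);
  // Accept the first sighting of an object number, or a later one whose
  // generation number matches the entry that is already stored.
  if (existing === undefined || existing.gen === gen) {
    entries.set(num, { offset, gen });
    return true;
  }
  // A mismatching generation number leaves the stored entry untouched.
  return false;
}

const entries = new Map();
recordEntry(entries, 5, 0, 100); // first sighting: stored
recordEntry(entries, 5, 0, 900); // same generation: overwritten
recordEntry(entries, 5, 1, 1500); // different generation: ignored
```

After these three calls the table still holds generation 0 for object 5, at the offset from the second call.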
2019-10-12 03:39:02 +09:00
|
|
|
{ "id": "issue7303",
|
|
|
|
"file": "pdfs/issue7303.pdf",
|
|
|
|
"md5": "3a5a4ab6755d6c3b0c490996b83d69d2",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 2,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
Check that the first page can be successfully loaded, to try and ascertain the validity of the XRef table (issue 7496, issue 10326)
For PDF documents with sufficiently broken XRef tables, it's usually quite obvious when you need to fall back to indexing the entire file. However, for certain kinds of corrupted PDF documents the XRef table will, for all intents and purposes, appear to be valid. It's not until you actually try to fetch various objects that things will start to break, which is the case in the referenced issues[1].
Since there's generally a real effort being made in PDF.js to load even corrupt PDF documents, this patch contains a suggested approach to attempt to do a bit more validation of the XRef table during the initial document loading phase.
Here the choice is made to attempt to load the *first* page, as a basic sanity check of the validity of the XRef table. Please note that attempting to load a more-or-less arbitrarily chosen object, without any context of what it's supposed to be, isn't very useful, which is why this particular choice was made.
Obviously, just because the first page can be loaded successfully, that doesn't guarantee that the *entire* XRef table is valid; however, if even the first page fails to load you can be reasonably sure that the document is *not* valid[2].
Even though this patch won't cause any significant increase in the amount of parsing required during initial loading of the document[3], it will require loading of more data upfront, which thus delays the initial `getDocument` call.
Whether or not this is a problem depends very much on what you actually measure; please consider the following examples:
```javascript
console.time('first');
getDocument(...).promise.then((pdfDocument) => {
console.timeEnd('first');
});
console.time('second');
getDocument(...).promise.then((pdfDocument) => {
pdfDocument.getPage(1).then((pdfPage) => { // Note: the API uses `pageNumber >= 1`, the Worker uses `pageIndex >= 0`.
console.timeEnd('second');
});
});
```
The first case is pretty much guaranteed to show a small regression; however, the second case won't be affected at all since the Worker caches the result of `getPage` calls. Again, please remember that the second case is what matters for the standard PDF.js use-case, which is why I'm hoping that this patch is deemed acceptable.
---
[1] In issue 7496, the problem is that the document is edited without the XRef table being correctly updated.
In issue 10326, the generator was sorting the XRef table according to the offsets rather than the objects.
[2] The idea of checking the first page in particular came from the "standard" use-case for the PDF.js library, i.e. the default viewer, where a failure to load the first page basically means that nothing will work; note how `{BaseViewer, PDFThumbnailViewer}.setDocument` depends completely on being able to fetch the *first* page.
[3] The only extra parsing is caused by, potentially, having to traverse *part* of the `Pages` tree to find the first page.
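The control flow described above (trust the XRef table, try the first page, and fall back to indexing the entire file on failure) might be sketched as follows. The `openWithSanityCheck` function and its two callbacks are hypothetical stand-ins supplied by the caller, not PDF.js APIs, and the sketch is synchronous purely to keep it short.

```javascript
// Hypothetical sketch; `parseXRef` and `loadFirstPage` are stand-ins supplied
// by the caller, not PDF.js functions.
function openWithSanityCheck(parseXRef, loadFirstPage) {
  try {
    parseXRef(false); // trust the XRef table as stored in the file
    loadFirstPage(); // throws if the table only *looked* valid
    return { recoveryMode: false };
  } catch (err) {
    parseXRef(true); // fall back to indexing the entire file
    loadFirstPage(); // a second failure propagates to the caller
    return { recoveryMode: true };
  }
}

// Simulated document whose stored XRef table is wrong: the first page can be
// loaded only after the whole file has been re-indexed.
let indexed = false;
const result = openWithSanityCheck(
  (recoveryMode) => { indexed = recoveryMode; },
  () => { if (!indexed) throw new Error("first page not found"); }
);
```

In this simulated case `result.recoveryMode` ends up `true`, since the first attempt fails and only the re-indexed attempt succeeds.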
2018-12-05 05:51:27 +09:00
|
|
|
{ "id": "issue7496",
|
|
|
|
"file": "pdfs/issue7496.pdf",
|
|
|
|
"md5": "b422981ae781166e75c0fb4c3634ed96",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
2019-10-13 01:15:55 +09:00
|
|
|
"lastPage": 2,
|
2019-10-12 03:39:02 +09:00
|
|
|
"type": "eq",
|
|
|
|
"annotations": true
|
|
|
|
},
|
|
|
|
{ "id": "issue9129",
|
|
|
|
"file": "pdfs/issue9129.pdf",
|
|
|
|
"md5": "939ffc8d6d29b1d74e9d0f98b227b97f",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
2019-10-13 01:15:55 +09:00
|
|
|
"type": "eq"
|
2018-12-05 05:51:27 +09:00
|
|
|
},
|
2019-10-12 03:39:02 +09:00
|
|
|
{ "id": "issue9552",
|
|
|
|
"file": "pdfs/issue9552.pdf",
|
|
|
|
"md5": "7f80fd5b426926f88fd2a9fdc02cd3bd",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"annotations": true
|
|
|
|
},
|
2018-12-05 05:51:27 +09:00
|
|
|
{ "id": "issue10326",
|
|
|
|
"file": "pdfs/issue10326.pdf",
|
|
|
|
"md5": "015c13b09ef735ea1204f38992c60487",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
2019-10-13 01:15:55 +09:00
|
|
|
"type": "eq"
|
2018-12-05 05:51:27 +09:00
|
|
|
},
|
2020-01-22 03:36:41 +09:00
|
|
|
{ "id": "issue11403",
|
|
|
|
"file": "pdfs/issue11403_reduced.pdf",
|
|
|
|
"md5": "08287b64f442cb7c329b97c4774aa1cd",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2019-10-12 03:39:02 +09:00
|
|
|
{ "id": "issue11139",
|
|
|
|
"file": "pdfs/issue11139.pdf",
|
|
|
|
"md5": "006dd4f4bb1878bc14a12072d81a4524",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "issue11230",
|
|
|
|
"file": "pdfs/issue11230.pdf",
|
|
|
|
"md5": "db0a1464d8f9f3ce079b52e0cacdccd3",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"firstPage": 100,
|
|
|
|
"lastPage": 100,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-08-16 19:23:53 +09:00
|
|
|
{ "id": "issue7544",
|
|
|
|
"file": "pdfs/issue7544.pdf",
|
|
|
|
"md5": "87e3a9fc7d6a6c1bd5b53af6926ce48e",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2020-03-04 04:55:51 +09:00
|
|
|
{ "id": "issue11477",
|
|
|
|
"file": "pdfs/issue11477_reduced.pdf",
|
|
|
|
"md5": "f4e5735569afce79e52b3fd56d10ae13",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-09-23 21:11:08 +09:00
|
|
|
{ "id": "issue7665",
|
|
|
|
"file": "pdfs/issue7665.pdf",
|
|
|
|
"md5": "f1199d16195a61e8232e2d1e742ed46b",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "load",
|
|
|
|
"about": "Encrypted file with indirect objects in the /Encrypt dictionary."
|
|
|
|
},
|
2011-11-30 08:47:53 +09:00
|
|
|
{ "id": "protectip",
|
|
|
|
"file": "pdfs/protectip.pdf",
|
|
|
|
"md5": "676e7a7b8f96d04825361832b1838a93",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2011-11-30 13:22:30 +09:00
|
|
|
},
|
2013-07-30 05:24:32 +09:00
|
|
|
{ "id": "issue3427",
|
|
|
|
"file": "pdfs/issue3427.pdf",
|
|
|
|
"md5": "61979ede77f4557c65d4eb3c1a6dceeb",
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2011-11-30 13:22:30 +09:00
|
|
|
{ "id": "piperine",
|
|
|
|
"file": "pdfs/piperine.pdf",
|
|
|
|
"md5": "603ca43dc5732dbba1579f122958c0c2",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2011-12-02 21:55:04 +09:00
|
|
|
},
|
|
|
|
{ "id": "issue840",
|
|
|
|
"file": "pdfs/issue840.pdf",
|
|
|
|
"md5": "20d88011dd7e3c4fb5274979094dab93",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2011-12-06 11:42:39 +09:00
|
|
|
},
|
2016-01-26 06:56:34 +09:00
|
|
|
{ "id": "issue1536",
|
|
|
|
"file": "pdfs/issue1536.pdf",
|
|
|
|
"md5": "bcaa52e6216399592ad5aa9fc49f1436",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 6,
|
|
|
|
"lastPage": 6,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "issue2098",
|
|
|
|
"file": "pdfs/issue2098.pdf",
|
|
|
|
"md5": "e9fa2b7cb935ffb95b510322d1e047e1",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2011-12-08 14:30:48 +09:00
|
|
|
{ "id": "bpl13210",
|
|
|
|
"file": "pdfs/bpl13210.pdf",
|
|
|
|
"md5": "8a08512baa9fa95378d9ad4b995947c7",
|
|
|
|
"link": true,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 5,
|
2011-12-08 14:30:48 +09:00
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-12-21 00:42:15 +09:00
|
|
|
{ "id": "issue7901",
|
|
|
|
"file": "pdfs/issue7901.pdf",
|
|
|
|
"md5": "16059a3af6e81ae9272daa57ea03e6e9",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2019-10-26 18:57:56 +09:00
|
|
|
{ "id": "issue11279",
|
|
|
|
"file": "pdfs/issue11279.pdf",
|
|
|
|
"md5": "03361d24f3ed63b93f77abf731f8fc73",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-06-10 01:51:31 +09:00
|
|
|
{ "id": "issue8480",
|
|
|
|
"file": "pdfs/issue8480.pdf",
|
|
|
|
"md5": "769bc07bf8041d95667f2d32aaf75665",
|
|
|
|
"rounds": 1,
|
2016-12-21 00:42:15 +09:00
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2018-07-27 23:57:58 +09:00
|
|
|
{ "id": "issue9915",
|
|
|
|
"file": "pdfs/issue9915_reduced.pdf",
|
|
|
|
"md5": "c56dabe5066a6c821901920e09dffe00",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-06-29 02:34:36 +09:00
|
|
|
{ "id": "issue8570",
|
|
|
|
"file": "pdfs/issue8570.pdf",
|
|
|
|
"md5": "0355731adb72df233eaa10464dcc8c51",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-07-24 16:51:40 +09:00
|
|
|
{ "id": "issue7696",
|
|
|
|
"file": "pdfs/issue7696.pdf",
|
|
|
|
"md5": "0593f52d03251164caa219d704a15e4c",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2011-12-06 11:42:39 +09:00
|
|
|
{ "id": "tutorial",
|
|
|
|
"file": "pdfs/tutorial.pdf",
|
|
|
|
"md5": "6e122f618c27f3aa9a689423e3be6b8d",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2011-12-07 13:33:59 +09:00
|
|
|
},
|
|
|
|
{ "id": "geothermal.pdf",
|
|
|
|
"file": "pdfs/geothermal.pdf",
|
|
|
|
"md5": "ecffc0ce38ffdf1e90dc952f186e9a91",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 5,
|
2011-12-07 13:33:59 +09:00
|
|
|
"type": "eq"
|
2011-12-08 11:59:44 +09:00
|
|
|
},
|
2011-12-10 12:21:58 +09:00
|
|
|
{ "id": "issue919",
|
|
|
|
"file": "pdfs/issue919.pdf",
|
|
|
|
"md5": "3a1716a512aca4d7a8d6106bd4885d14",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 3,
|
2011-12-10 12:21:58 +09:00
|
|
|
"type": "eq"
|
2011-12-13 08:34:11 +09:00
|
|
|
},
|
2016-02-08 21:49:30 +09:00
|
|
|
{ "id": "issue6066",
|
|
|
|
"file": "pdfs/issue6066.pdf",
|
|
|
|
"md5": "b26eb08fc5ab2518ba8fde603bdfc46b",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 2,
|
|
|
|
"lastPage": 2,
|
[api-minor] Decode all JPEG images with the built-in PDF.js decoder in `src/core/jpg.js`
Currently some JPEG images are decoded by the built-in PDF.js decoder in `src/core/jpg.js`, while others attempt to use the browser JPEG decoder. This inconsistency seems unfortunate for a number of reasons:
- It adds, compared to the other image formats supported in the PDF specification, a fair amount of code/complexity to the image handling in the PDF.js library.
- The PDF specification supports JPEG images with features, e.g. certain ColorSpaces, that browsers are unable to decode natively. Hence, determining if a JPEG image is possible to decode natively in the browser requires a non-trivial amount of parsing. In particular, we're parsing (part of) the raw JPEG data to extract certain marker data and we also need to parse the ColorSpace for the JPEG image.
- While some JPEG images may, for all intents and purposes, appear to be natively supported, there are still cases where the browser may fail to decode some JPEG images. In order to support those cases, we've had to implement a fallback to the PDF.js JPEG decoder if there are any issues during the native decoding. This also means that it's no longer possible to simply send the JPEG image to the main-thread and continue parsing, but you now need to actually wait for the main-thread to indicate success/failure first.
In practice this means that there's a code-path where the worker-thread is forced to wait for the main-thread, while the reverse should *always* be the case.
- The native decoding, for anything except the *simplest* of JPEG images, results in increased peak memory usage because there's a handful of short-lived copies of the JPEG data (see PR 11707).
Furthermore this also leads to data being *parsed* on the main-thread, rather than the worker-thread, which you usually want to avoid for e.g. performance and UI-responsiveness reasons.
- Not all environments, e.g. Node.js, fully support native JPEG decoding. This has, historically, led to some issues and support requests.
- Different browsers may use different JPEG decoders, possibly leading to images being rendered slightly differently depending on the platform/browser where the PDF.js library is used.
Originally the implementation in `src/core/jpg.js` was unable to handle all of the JPEG images in the test-suite, but over the last couple of years I've fixed (hopefully) all of those issues.
At this point in time, there are two kinds of failures with this patch:
- Changes which are basically imperceptible to the naked eye, where some pixels in the images are essentially off-by-one (in all components), which could probably be attributed to things such as different rounding behaviour in the browser/PDF.js JPEG decoder.
This type of "failure" accounts for the *vast* majority of the total number of changes in the reference tests.
- Changes where the JPEG images now look *ever so slightly* blurrier than with the native browser decoder. For quite some time I've just assumed that this pointed to a general deficiency in the `src/core/jpg.js` implementation, however I've discovered when comparing two viewers side-by-side that the differences vanish at higher zoom levels (usually around 200% is enough).
Basically if you disable [this downscaling in canvas.js](https://github.com/mozilla/pdf.js/blob/8fb82e939cf0c8618a4e775ff17fc96f726872b5/src/display/canvas.js#L2356-L2395), which is what happens when zooming in, the differences simply vanish!
Hence I'm pretty satisfied that there are no significant problems with the `src/core/jpg.js` implementation, and the problems are rather tied to the general quality of the downscaling algorithm used. It could even be seen as a positive that *all* images now share the same downscaling behaviour, since this actually fixes one old bug; see issue 7041.
2020-01-20 20:10:16 +09:00
|
|
|
"type": "eq"
|
2016-02-08 21:49:30 +09:00
|
|
|
},
|
2019-02-09 15:53:16 +09:00
|
|
|
{ "id": "issue10529",
|
|
|
|
"file": "pdfs/issue10529.pdf",
|
|
|
|
"md5": "1a4d404a137c610ff0c747cbea3b8666",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2019-03-05 07:17:21 +09:00
|
|
|
{ "id": "issue10614",
|
|
|
|
"file": "pdfs/issue10614.pdf",
|
|
|
|
"md5": "c41da60ce9af100cb78e1c2a6ba18232",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2020-01-26 00:53:34 +09:00
|
|
|
{ "id": "issue11362",
|
|
|
|
"file": "pdfs/issue11362.pdf",
|
|
|
|
"md5": "bcb08162d4bff1d32ead8b563e866c93",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq",
|
|
|
|
"annotations": true
|
|
|
|
},
|
2019-12-05 23:44:51 +09:00
|
|
|
{ "id": "issue11385",
|
|
|
|
"file": "pdfs/issue11385.pdf",
|
|
|
|
"md5": "cc04b23b845366857426a5aa6acf227b",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2020-08-08 18:43:46 +09:00
|
|
|
{ "id": "issue4398",
|
|
|
|
"file": "pdfs/issue4398.pdf",
|
|
|
|
"md5": "f3c1f967e99a1f5659e0e196f4293706",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-12-05 04:44:12 +09:00
|
|
|
{ "id": "issue6071",
|
|
|
|
"file": "pdfs/issue6071.pdf",
|
|
|
|
"md5": "2e08526d8e7c9ba4269fc12ef488d3eb",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-10-27 13:30:01 +09:00
|
|
|
{ "id": "issue1905",
|
|
|
|
"file": "pdfs/issue1905.pdf",
|
|
|
|
"md5": "b1bbd72ca6522ae1502aa26320f81994",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2011-12-13 12:42:39 +09:00
|
|
|
{ "id": "issue918",
|
|
|
|
"file": "pdfs/issue918.pdf",
|
|
|
|
"md5": "d582cc0f2592ae82936589ced2a47e55",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2011-12-15 06:41:36 +09:00
|
|
|
},
|
2013-05-04 03:13:45 +09:00
|
|
|
{ "id": "issue3188",
|
|
|
|
"file": "pdfs/issue3188.pdf",
|
|
|
|
"md5": "161b72604d86f40ab2f765ddd3b61227",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-04-24 12:14:58 +09:00
|
|
|
{ "id": "issue1586",
|
2012-04-25 08:53:11 +09:00
|
|
|
"file": "pdfs/pdfjsbad1586.pdf",
|
|
|
|
"md5": "793d0870f0b0c613799b0677d64daca4",
|
2012-04-24 12:14:58 +09:00
|
|
|
"rounds": 1,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2016-01-12 20:55:31 +09:00
|
|
|
{ "id": "aboutstacks",
|
|
|
|
"file": "pdfs/aboutstacks.pdf",
|
|
|
|
"md5": "6e7c8416a293ba2d83bc8dd20c6ccf51",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-10-24 09:25:50 +09:00
|
|
|
{ "id": "issue2884_reduced",
|
|
|
|
"file": "pdfs/issue2884_reduced.pdf",
|
|
|
|
"md5": "18386542fc82affa2a5d3722549f8211",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-09-22 21:07:20 +09:00
|
|
|
{ "id": "bug956965",
|
|
|
|
"file": "pdfs/bug956965.pdf",
|
|
|
|
"md5": "9b2f1176c797ee84e989a507e745f89d",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 33,
|
|
|
|
"lastPage": 33,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2011-12-17 03:54:31 +09:00
|
|
|
{ "id": "smaskdim",
|
|
|
|
"file": "pdfs/smaskdim.pdf",
|
2011-12-19 10:28:25 +09:00
|
|
|
"md5": "de80aeca7cbf79940189fd34d59671ee",
|
2011-12-17 03:54:31 +09:00
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2011-12-30 05:39:00 +09:00
|
|
|
},
|
2013-03-05 05:28:04 +09:00
|
|
|
{ "id": "endchar",
|
|
|
|
"file": "pdfs/endchar.pdf",
|
|
|
|
"md5": "90a59acdd62252fdb4fefa482e12d6b3",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-04-06 07:16:48 +09:00
|
|
|
{ "id": "issue8234",
|
|
|
|
"file": "pdfs/issue8234.pdf",
|
|
|
|
"md5": "32650fc60c51a9813b98bc9876dc15af",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
Cache JPEG images, just as we do for other image formats, in `evaluator.js` (issue 8380)
For some reason, we're putting all kinds of images *except* JPEG into the `imageCache` in `evaluator.js`.[1]
This means that in the PDF file in issue 8380, we'll keep sending the *same* two small images[2] to the main-thread and decoding them over and over. This is obviously hugely inefficient!
As can be seen from the discussion in the issue, the performance becomes *extremely* bad if the user has the addon "Adblock Plus" installed. However, even in a clean Firefox profile, the performance isn't that great.
This patch not only addresses the performance implications of the "Adblock Plus" addon together with that particular PDF file, but it *also* improves the rendering times considerably for *all* users.
Locally, with a clean profile, the rendering times are reduced from `~2000 ms` to `~500 ms` for my setup!
Obviously, the general structure of the PDF file and its operator sequence is still hugely inefficient, however I'd say that the performance with this patch is good enough to consider the issue (as it stands) resolved.[3]
Fixes 8380.
---
[1] Not technically true, since inline images are cached from `parser.js`, but whatever :-)
[2] The two JPEG images have dimensions 1x2, respectively 4x2.
[3] To make this even more efficient, a new state would have to be added to the `QueueOptimizer`. Given that PDF files this stupid fortunately aren't too common, I'm not convinced that it's worth doing.
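The caching idea described in this commit message can be sketched roughly as follows. This is a language-agnostic illustration written in Python, not the actual `evaluator.js` code: the `ImageCache` class, its `get` method, and the reference-string keys are all hypothetical names chosen for the example.

```python
from typing import Callable, Dict


class ImageCache:
    """Hypothetical cache keyed by an image's object reference.

    Illustrative only: this is not the PDF.js `imageCache` API.
    """

    def __init__(self, decode: Callable[[bytes], object]) -> None:
        self._decode = decode
        self._entries: Dict[str, object] = {}
        self.decode_calls = 0  # counts how often real decoding happens

    def get(self, ref: str, data: bytes) -> object:
        # Decode only on the first request for a given reference;
        # later requests reuse the stored result.
        if ref not in self._entries:
            self.decode_calls += 1
            self._entries[ref] = self._decode(data)
        return self._entries[ref]
```

With a cache like this, a page that paints the same small image many times pays the decoding cost once per distinct image rather than once per paint operation.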
2017-05-07 19:34:47 +09:00
|
|
|
{ "id": "issue8380",
|
|
|
|
"file": "pdfs/issue8380.pdf",
|
|
|
|
"md5": "2782af6a4d0540fcea3897560f842094",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-05-23 23:08:02 +09:00
|
|
|
{ "id": "issue8424",
|
|
|
|
"file": "pdfs/issue8424.pdf",
|
|
|
|
"md5": "3de1ea4c085e8fe8e156153418058955",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2011-12-31 02:24:13 +09:00
|
|
|
{ "id": "type4psfunc",
|
2011-12-30 05:39:00 +09:00
|
|
|
"file": "pdfs/type4psfunc.pdf",
|
|
|
|
"md5": "7e6027a02ff78577f74dccdf84e37189",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2012-01-05 10:55:04 +09:00
|
|
|
},
|
2017-05-24 19:12:14 +09:00
|
|
|
{ "id": "issue8330",
|
|
|
|
"file": "pdfs/issue8330.pdf",
|
|
|
|
"md5": "9010093d07dd62d3b49378fd48cf45f9",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-04-12 01:55:39 +09:00
|
|
|
{ "id": "issue4573",
|
|
|
|
"file": "pdfs/issue4573.pdf",
|
|
|
|
"md5": "34b0c4fdee19e57033275b766c5f57a3",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Seac with differences array that messes up mapping."
|
|
|
|
},
|
2014-09-27 20:14:25 +09:00
|
|
|
{ "id": "issue2840",
|
|
|
|
"file": "pdfs/issue2840.pdf",
|
|
|
|
"md5": "d9df49f6d62668d099e0fb7e74f8f337",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
Apply Patterns, if necessary, when rendering text
Currently we're not applying Patterns for text, but only for graphics.
This patch is unfortunately not a complete solution, but rather a step on the way, since there are still some PDF files where the Patterns look more like a solid colour, rather than the intended gradient.
I've been unable to fix these issues completely, and I've not managed to determine if the remaining issues are caused either by the pattern code, the canvas code, or perhaps both.
However, given that even this simple patch improves the current situation quite a bit, I figured that it couldn't hurt to submit it as-is.
- Fixes 5804.
- Fixes 6130.
- Improves 3988 a lot, since the text is now visible. However, it looks like the text is *one* solid colour, instead of the correct gradient.
- Improves 5432, since the text is no longer gray. (This file also suffers from the same problem as the previous one.)
2015-12-30 01:57:10 +09:00
|
|
|
{ "id": "issue5804",
|
|
|
|
"file": "pdfs/issue5804.pdf",
|
|
|
|
"md5": "442f27939edb6aaf173ceff38d69bb14",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-07-24 21:32:48 +09:00
|
|
|
{ "id": "issue7446",
|
|
|
|
"file": "pdfs/issue7446.pdf",
|
|
|
|
"md5": "1b5bdd8a806e6ab9eda7c1c707b75fc6",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "load",
|
|
|
|
"about": "Annotation without the required /Subtype."
|
|
|
|
},
|
Catch errors and continue parsing in `parseCMap` (issue 7492)
After PR 7039, the PDF file in issue 7492 no longer renders at all, but note that text selection wasn't working correctly previously.
The problem with the PDF file in issue 7492 is that the `cMap`, in the `toUnicode` entry in the font, contains an invalid name:
```
/CMapName /-usr-share-fonts-truetype-Panton-Panton Family-Fontfabric - Panton.otf,000-UTF16 def
```
When we parse that line, things obviously break because there are spaces present in the wrong places.
To avoid that issue, the patch simply lets `parseCMap` continue when errors are encountered, to try and recover usable data. Note that by not aborting immediately when an error is encountered, we are also able to fix the text selection.
Obviously, it could be argued that we should just immediately reject a corrupt `cMap`. But given that they usually are correct, it seems that trying to recover as much data as possible from a corrupt one can only be a good thing for both glyph mapping and text selection.
Fixes 7492.
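The recover-and-continue approach can be illustrated with a small sketch. It is written in Python and uses a made-up two-column hex format rather than real CMap syntax; `parse_mapping_lines` is a hypothetical helper, not the `parseCMap` function from the patch.

```python
from typing import Dict, Iterable, List, Tuple


def parse_mapping_lines(lines: Iterable[str]) -> Tuple[Dict[int, int], List[int]]:
    """Parse "<src-hex> <dst-hex>" lines, skipping malformed ones.

    Returns the recovered mapping and the 1-based numbers of skipped lines.
    The input format is an assumption made for this example only.
    """
    mapping: Dict[int, int] = {}
    skipped: List[int] = []
    for line_no, line in enumerate(lines, start=1):
        try:
            src, dst = line.split()  # wrong token count raises ValueError
            mapping[int(src, 16)] = int(dst, 16)  # non-hex raises ValueError
        except ValueError:
            # Record the problem and keep going instead of aborting.
            skipped.append(line_no)
    return mapping, skipped
```

The design trade-off is the one the message names: a strict parser would reject the whole input at the first bad line, while this one keeps every entry it could read.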
2016-07-18 23:01:02 +09:00
|
|
|
{ "id": "issue7492-eq",
|
|
|
|
"file": "pdfs/issue7492.pdf",
|
|
|
|
"md5": "7b0b28919c1088a2a5a0aeedbaa4c3ca",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "issue7492-text",
|
|
|
|
"file": "pdfs/issue7492.pdf",
|
|
|
|
"md5": "7b0b28919c1088a2a5a0aeedbaa4c3ca",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "text"
|
|
|
|
},
|
Build a fallback `ToUnicode` map for simple fonts (issue 8229)
In some fonts, the included `ToUnicode` data is incomplete causing text-selection to not work properly. For simple fonts that contain encoding data, we can manually build a `ToUnicode` map to attempt to improve things.
Please note that since we're currently using the `ToUnicode` data during glyph mapping, in an attempt to avoid rendering regressions, I purposely didn't want to amend the original `ToUnicode` data for this text-selection edge-case.
Instead, I opted for the current solution, which will (hopefully) give slightly better text-extraction results in PDF files with incomplete `ToUnicode` data.
According to the PDF specification, see [section 9.10.2](http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G8.1873172):
> A conforming reader can use these methods, in the priority given, to map a character code to a Unicode value.
> ...
Reading that paragraph literally, it doesn't seem too unreasonable to use *different* methods for different charcodes.
Fixes 8229.
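A minimal sketch of the per-charcode fallback idea, in Python. The function names and the shape of the `encoding` and `glyph_to_unicode` tables are assumptions made for illustration; they do not mirror the PDF.js data structures. The original map is left unmodified, as the message says, and the fallback is consulted only for charcodes it lacks.

```python
from typing import Dict, Optional


def build_fallback_map(
    encoding: Dict[int, str], glyph_to_unicode: Dict[str, str]
) -> Dict[int, str]:
    """Build a charcode -> Unicode map from encoding glyph names.

    Glyph names without a known Unicode value are simply left out.
    """
    return {
        code: glyph_to_unicode[name]
        for code, name in encoding.items()
        if name in glyph_to_unicode
    }


def lookup_unicode(
    charcode: int, to_unicode: Dict[int, str], fallback: Dict[int, str]
) -> Optional[str]:
    """Prefer the font's own ToUnicode data; use the fallback otherwise."""
    if charcode in to_unicode:
        return to_unicode[charcode]
    return fallback.get(charcode)
```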
2017-11-26 21:29:43 +09:00
|
|
|
{ "id": "issue8229",
|
|
|
|
"file": "pdfs/issue8229.pdf",
|
|
|
|
"md5": "a729f663782e87ebc1efad0755ebf6a5",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2015-12-30 01:57:10 +09:00
|
|
|
{ "id": "ShowText-ShadingPattern",
|
|
|
|
"file": "pdfs/ShowText-ShadingPattern.pdf",
|
|
|
|
"md5": "fe683725db037ffe19d390969610a652",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Please note that this file currently renders incorrectly."
|
|
|
|
},
|
2020-10-05 23:38:01 +09:00
|
|
|
{ "id": "issue12399",
|
|
|
|
"file": "pdfs/issue12399_reduced.pdf",
|
|
|
|
"md5": "01bdd258be93e10f8399708eecedbfd6",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-06-01 06:01:35 +09:00
|
|
|
{ "id": "issue5808-text",
|
|
|
|
"file": "pdfs/issue5808.pdf",
|
|
|
|
"md5": "e0584dd540d7859d6c191aa53379692e",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "text"
|
|
|
|
},
|
For embedded Type1 fonts without included `ToUnicode`/`Encoding` data, attempt to improve text selection by using the `builtInEncoding` to amend the `toUnicode` map (issue 6901, issue 7182, issue 7217, bug 917796, bug 1242142)
Note that in order to prevent any possible issues, this patch does *not* try to amend the `toUnicode` data for Type1 fonts that contain either `ToUnicode` or `Encoding` entries in the font dictionary.
Fixes, or at least improves, issues/bugs such as 6658, 6901, 7182, 7217, bug 917796, bug 1242142.
2016-08-18 01:33:06 +09:00
|
|
|
{ "id": "issue6901-eq",
|
|
|
|
"file": "pdfs/issue6901.pdf",
|
|
|
|
"md5": "1a0604b1a7a3aaf2162b425a9a84230b",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "issue6901-text",
|
|
|
|
"file": "pdfs/issue6901.pdf",
|
|
|
|
"md5": "1a0604b1a7a3aaf2162b425a9a84230b",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2016-05-15 05:13:12 +09:00
|
|
|
{ "id": "issue6962",
|
|
|
|
"file": "pdfs/issue6962.pdf",
|
|
|
|
"md5": "d40e871ecca68baf93114bd28c782148",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2015-01-17 20:53:32 +09:00
|
|
|
{ "id": "issue5644",
|
|
|
|
"file": "pdfs/issue5644.pdf",
|
|
|
|
"md5": "6f9313c5043b3ecb0ab2df321d3e1847",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 6,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-02-23 21:28:50 +09:00
|
|
|
{ "id": "issue8088",
|
|
|
|
"file": "pdfs/issue8088.pdf",
|
|
|
|
"md5": "5bbc33c7433799487518eb0d8094348c",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 3,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-05-18 07:57:06 +09:00
|
|
|
{ "id": "bug866395",
|
|
|
|
"file": "pdfs/bug866395.pdf",
|
|
|
|
"md5": "f03bc77e84637241980b09a0a220f575",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Font with an empty font file."
|
|
|
|
},
|
2016-05-09 01:28:18 +09:00
|
|
|
{ "id": "bug1037816",
|
|
|
|
"file": "pdfs/bug1037816.pdf",
|
|
|
|
"md5": "8a45299d7b102a9c1cadb8883b8debc9",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "load",
|
|
|
|
"about": "ObjStm stream containing 'endobj' commands."
|
|
|
|
},
|
2012-01-05 10:55:04 +09:00
|
|
|
{ "id": "ocs",
|
|
|
|
"file": "pdfs/ocs.pdf",
|
|
|
|
"md5": "2ade57e954ae7632749cf328daeaa7a8",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "load"
|
2012-01-05 10:57:08 +09:00
|
|
|
},
|
2013-02-23 23:04:17 +09:00
|
|
|
{ "id": "issue2139",
|
|
|
|
"file": "pdfs/issue2139.pdf",
|
|
|
|
"md5": "ee4072992e7c5ffd5063181916a2fcae",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-01-12 13:19:21 +09:00
|
|
|
{ "id": "issue1010",
|
|
|
|
"file": "pdfs/issue1010.pdf",
|
|
|
|
"md5": "f991ef093484a107fe9f59dff18fc155",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-05-19 01:14:09 +09:00
|
|
|
{ "id": "issue1709",
|
|
|
|
"file": "pdfs/issue1709.pdf",
|
|
|
|
"md5": "84497bd23b7c82d03d2681a1cb1d9ed0",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 10,
|
2012-05-19 01:14:09 +09:00
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-12-06 18:21:42 +09:00
|
|
|
{ "id": "issue7872",
|
|
|
|
"file": "pdfs/issue7872.pdf",
|
|
|
|
"md5": "81781dfecfcb7e9cd9cc7e60f8b747b7",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "MediaBox and CropBox with indirect objects."
|
|
|
|
},
|
2020-08-22 07:25:07 +09:00
|
|
|
{ "id": "bug1057544",
|
|
|
|
"file": "pdfs/bug1057544.pdf",
|
|
|
|
"md5": "49ad71b82ead1ee0fe4ddb41aa9e30b4",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-02-23 13:08:46 +09:00
|
|
|
{ "id": "issue2642",
|
|
|
|
"file": "pdfs/issue2642.pdf",
|
|
|
|
"md5": "b6679861fdce3bbab0c1fa51bb7f5077",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-11-14 08:27:46 +09:00
|
|
|
{ "id": "issue3848",
|
|
|
|
"file": "pdfs/issue3848.pdf",
|
|
|
|
"md5": "2498cf0650cc97ceca3e24dfa0425a73",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "load",
|
|
|
|
"about": "Document tree with pages and page nodes on the same level."
|
|
|
|
},
|
2012-01-05 04:49:37 +09:00
|
|
|
{ "id": "issue1015",
|
|
|
|
"file": "pdfs/issue1015.pdf",
|
|
|
|
"md5": "b61503d1b445742b665212866afb60e2",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
2012-01-10 11:37:39 +09:00
|
|
|
},
|
2012-01-22 08:18:36 +09:00
|
|
|
{ "id": "issue1096",
|
|
|
|
"file": "pdfs/issue1096.pdf",
|
|
|
|
"md5": "7f75d2b4b93c78d401ff39e8c1b00612",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 9,
|
2012-01-22 08:18:36 +09:00
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-01-27 09:51:58 +09:00
|
|
|
{ "id": "issue1127",
|
|
|
|
"file": "pdfs/issue1127.pdf",
|
|
|
|
"md5": "4fb2be5ffefeafda4ba977de2a1bb4d8",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
Add a couple more, mostly `text`, reference tests for non-embedded symbolic fonts without included encoding information
I've started to look into how we can fix issue 7580, but quickly became worried that fixing it could easily mean that we'd trade one fixed PDF file for a multitude of broken ones.
Hence I started going through the history of the code that chooses the fallback encoding, and noticed that it has been changed a number of times over the years to deal with various cases of weirdness/errors in non-embedded fonts.
To my relief it turned out that almost all the PRs, please see a possibly incomplete [list here], that changed this code actually included `eq` test-cases.
However, in one case it appears that a PR failed to add a test-case. Furthermore, since the fallback encoding may also be the only source for creating a `toUnicode` map, changing the encoding could possibly regress only the text-selection despite a PDF file still rendering correctly.
Therefore, this PR adds one new `eq` test, and also a number of additional `text` tests for PDF files already present in the test-suite.
Note that it's obviously possible that there's a certain overlap between the added tests, but I'd be *a whole lot* more concerned with causing regressions.
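As an illustration of what adding a `text` test for a file already in the manifest amounts to, here is a small Python sketch. `derive_text_entry` is a hypothetical helper, not part of the test harness; it only mirrors the pattern visible in the manifest, where a `-text` entry reuses the `file` and `md5` of the existing `eq` entry.

```python
from typing import Any, Dict, Optional


def derive_text_entry(
    entry: Dict[str, Any], last_page: Optional[int] = None
) -> Dict[str, Any]:
    """Return a new manifest entry of type "text" based on an existing one.

    The input entry is not modified.
    """
    text_entry = dict(entry)  # shallow copy keeps file/md5/rounds/link
    text_entry["id"] = entry["id"] + "-text"
    text_entry.pop("type", None)
    if last_page is not None:
        text_entry["lastPage"] = last_page  # limit the test to early pages
    text_entry["type"] = "text"  # re-added last, matching manifest ordering
    return text_entry
```

Applied to the `issue1127` entry with `last_page=1`, this yields an entry equivalent to the `issue1127-text` one that follows.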
2016-09-10 20:10:15 +09:00
|
|
|
{ "id": "issue1127-text",
|
|
|
|
"file": "pdfs/issue1127.pdf",
|
|
|
|
"md5": "4fb2be5ffefeafda4ba977de2a1bb4d8",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2014-03-18 22:07:54 +09:00
|
|
|
{ "id": "issue4461-load",
|
|
|
|
"file": "pdfs/issue4461.pdf",
|
|
|
|
"md5": "9df4ecaae429adb5dc93d9342a53159d",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "load",
|
|
|
|
"about": "Document without /Resources entry in page dictionary"
|
|
|
|
},
|
2012-02-20 11:12:57 +09:00
|
|
|
{ "id": "issue1249-load",
|
|
|
|
"file": "pdfs/issue1249.pdf",
|
|
|
|
"md5": "4f81339fa09422a7db980f34ea963609",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2012-01-10 11:37:39 +09:00
|
|
|
{ "id": "liveprogramming",
|
|
|
|
"file": "pdfs/liveprogramming.pdf",
|
|
|
|
"md5": "7bd4dad1188232ef597d36fd72c33e52",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 3,
|
2012-01-10 11:37:39 +09:00
|
|
|
"link": true,
|
|
|
|
"type": "load"
|
2012-01-12 11:14:49 +09:00
|
|
|
},
|
2019-09-18 03:01:17 +09:00
|
|
|
{ "id": "issue7339",
|
|
|
|
"file": "pdfs/issue7339_reduced.pdf",
|
|
|
|
"md5": "7092ab6a1acc31db9d1caaa0447334a0",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-04-30 19:40:54 +09:00
|
|
|
{ "id": "bug1142033",
|
|
|
|
"file": "pdfs/bug1142033.pdf",
|
|
|
|
"md5": "1d9afd397e89a0f52c056f449ec93daa",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-05-02 21:14:03 +09:00
|
|
|
{ "id": "bug1354114",
|
|
|
|
"file": "pdfs/bug1354114.pdf",
|
|
|
|
"md5": "ad718d04702d29a37792c7f222fe1fa6",
|
|
|
|
"rounds": 1,
|
|
|
|
"firstPage": 30,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-04-30 19:40:54 +09:00
|
|
|
{ "id": "issue5874",
|
|
|
|
"file": "pdfs/issue5874.pdf",
|
|
|
|
"md5": "25922edf223aa91bc259663d0a34a6ab",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-08-31 23:55:25 +09:00
|
|
|
{ "id": "pr4731",
|
|
|
|
"file": "pdfs/pr4731.pdf",
|
|
|
|
"md5": "0121642027e525c4b95357f1b5669e64",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2018-09-30 23:29:16 +09:00
|
|
|
{ "id": "annotation-caret-ink",
|
|
|
|
"file": "pdfs/annotation-caret-ink.pdf",
|
|
|
|
"md5": "6218ca235580d1975474c979e0128c2d",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"annotations": true
|
|
|
|
},
|
2019-11-10 22:37:42 +09:00
|
|
|
{ "id": "issue6179_reduced",
|
|
|
|
"file": "pdfs/issue6179_reduced.pdf",
|
|
|
|
"md5": "457ff3561b83346ae3caf91acd252040",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"annotations": true
|
|
|
|
},
|
[api-minor] Always allow e.g. rendering to continue even if there are errors, and add a `stopAtErrors` parameter to `getDocument` to opt-out of this behaviour (issue 6342, issue 3795, bug 1130815)
Other PDF readers, e.g. Adobe Reader and PDFium (in Chrome), will attempt to render as much of a page as possible even if there are errors present.
Currently we just bail as soon as the first error is hit, which means that we'll usually not render anything in these cases and just display a blank page instead.
NOTE: This patch changes the default behaviour of the PDF.js API to always attempt to recover as much data as possible, even when encountering errors during e.g. `getOperatorList`/`getTextContent`, which thus improves our handling of corrupt PDF files and allows the default viewer to handle errors slightly more gracefully.
In the event that an API consumer wishes to use the old behaviour, where we stop parsing as soon as an error is encountered, the `stopAtErrors` parameter can be set at `getDocument`.
Fixes, inasmuch as it's possible since the PDF files are corrupt, e.g. issue 6342, issue 3795, and [bug 1130815](https://bugzilla.mozilla.org/show_bug.cgi?id=1130815) (and probably others too).
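The behavioural switch described here can be sketched in Python as below. This is only an analogy for the `stopAtErrors` option: `run_operators` and its `stop_at_errors` argument are hypothetical names, and the "operators" are plain callables standing in for whatever the real evaluator processes.

```python
from typing import Any, Callable, List, Sequence, Tuple


def run_operators(
    operators: Sequence[Callable[[], Any]], stop_at_errors: bool = False
) -> Tuple[List[Any], List[Exception]]:
    """Run each operator, collecting results and any errors encountered.

    By default errors are recorded and processing continues, so as much
    output as possible is recovered. With stop_at_errors=True the first
    error is re-raised, mimicking the older bail-out behaviour.
    """
    results: List[Any] = []
    errors: List[Exception] = []
    for operator in operators:
        try:
            results.append(operator())
        except Exception as exc:
            if stop_at_errors:
                raise
            errors.append(exc)  # remember the failure, keep rendering
    return results, errors
```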
2017-02-19 22:03:08 +09:00
|
|
|
{ "id": "bug1130815-eq",
|
|
|
|
"file": "pdfs/bug1130815.pdf",
|
|
|
|
"md5": "3ff3b550c3af766991b2a1b11d00de85",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "bug1130815-text",
|
|
|
|
"file": "pdfs/bug1130815.pdf",
|
|
|
|
"md5": "3ff3b550c3af766991b2a1b11d00de85",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2016-04-21 22:27:18 +09:00
|
|
|
{ "id": "issue3248",
|
|
|
|
"file": "pdfs/issue3248.pdf",
|
|
|
|
"md5": "970767ed68de46c316d74de67965999b",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2017-09-17 20:35:18 +09:00
|
|
|
{ "id": "issue8702-eq",
|
|
|
|
"file": "pdfs/issue8702.pdf",
|
|
|
|
"md5": "59d501ed1518d78ef6ee442cf824b0f6",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "issue8702-text",
|
|
|
|
"file": "pdfs/issue8702.pdf",
|
|
|
|
"md5": "59d501ed1518d78ef6ee442cf824b0f6",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2015-10-04 20:45:07 +09:00
|
|
|
{ "id": "pr4897",
|
|
|
|
"file": "pdfs/pr4897.pdf",
|
|
|
|
"md5": "26897633eea5e6d10345a130b1c1777c",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-04-21 22:10:40 +09:00
|
|
|
{ "id": "issue7229",
|
|
|
|
"file": "pdfs/issue7229.pdf",
|
|
|
|
"md5": "480e51aae0ac271780e4603d1561d15e",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2015-09-04 20:19:32 +09:00
|
|
|
{ "id": "issue1940",
|
|
|
|
"file": "pdfs/issue1940.pdf",
|
|
|
|
"md5": "4f0a0b92c1b5e6e86e1a82490087e6e5",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2020-11-04 00:04:08 +09:00
|
|
|
{ "id": "160F-2019",
|
|
|
|
"file": "pdfs/160F-2019.pdf",
|
|
|
|
"md5": "71591f11ee717e12887f529c84d5ae89",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"print": true,
|
|
|
|
"annotationStorage": {
|
|
|
|
"427R": {
|
|
|
|
"hidden": false,
|
|
|
|
"value": "hello world"
|
|
|
|
}
|
|
|
|
}
|
|
|
|
},
|
2017-02-19 22:03:08 +09:00
|
|
|
{ "id": "issue6342-eq",
|
|
|
|
"file": "pdfs/issue6342.pdf",
|
|
|
|
"md5": "2ea85ca8d17117798f105be88bdb2bfd",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "issue6342-text",
|
|
|
|
"file": "pdfs/issue6342.pdf",
|
|
|
|
"md5": "2ea85ca8d17117798f105be88bdb2bfd",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2016-03-02 05:39:33 +09:00
|
|
|
{ "id": "issue7020",
|
|
|
|
"file": "pdfs/issue7020.pdf",
|
|
|
|
"md5": "93b464e21c649e64ae92eeafe99fc31b",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-06-30 03:52:49 +09:00
|
|
|
{ "id": "issue8586",
|
|
|
|
"file": "pdfs/issue8586.pdf",
|
|
|
|
"md5": "16b5230364017d3b0d2d65978eb35816",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-03-23 21:52:30 +09:00
|
|
|
{ "id": "issue7101",
|
|
|
|
"file": "pdfs/issue7101.pdf",
|
|
|
|
"md5": "cc9cabe12ac9ad49e5372ef33d10aeb4",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2016-04-15 21:22:36 +09:00
|
|
|
},
|
|
|
|
{ "id": "issue7200",
|
|
|
|
"file": "pdfs/issue7200.pdf",
|
|
|
|
"md5": "ddae17424ea23930eecf8b612a66ed0f",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2016-03-23 21:52:30 +09:00
|
|
|
},
|
2015-09-04 20:19:32 +09:00
|
|
|
{ "id": "pr4606",
|
|
|
|
"file": "pdfs/pr4606.pdf",
|
|
|
|
"md5": "6574fde2314648600056bd0e229df98c",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2012-01-12 11:14:49 +09:00
|
|
|
{ "id": "S2-eq",
|
|
|
|
"file": "pdfs/S2.pdf",
|
|
|
|
"md5": "d0b6137846df6e0fe058f234a87fb588",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2012-01-19 11:14:43 +09:00
|
|
|
},
|
2012-01-18 04:40:52 +09:00
|
|
|
{ "id": "issue1055",
|
2015-11-18 21:47:56 +09:00
|
|
|
"file": "pdfs/issue1055r.pdf",
|
|
|
|
"md5": "4aa72bb4779e3f301c45492f2a590459",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
2012-01-18 13:50:49 +09:00
|
|
|
},
|
2012-05-18 08:34:31 +09:00
|
|
|
{ "id": "issue1629",
|
|
|
|
"file": "pdfs/issue1629.pdf",
|
|
|
|
"md5": "0f2cbbf268383a377e95e6bbe36c6a9a",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-09-07 00:16:31 +09:00
|
|
|
{ "id": "issue6410",
|
|
|
|
"file": "pdfs/issue6410.pdf",
|
|
|
|
"md5": "fd5c5898d5b9754bb546724b7d31bf59",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-09-04 05:29:12 +09:00
|
|
|
{ "id": "issue6413",
|
|
|
|
"file": "pdfs/issue6413.pdf",
|
|
|
|
"md5": "08926ac7a46e27a4abbb31256b3a7b29",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-08-12 13:05:41 +09:00
|
|
|
{ "id": "issue1685",
|
|
|
|
"file": "pdfs/issue1685.pdf",
|
|
|
|
"md5": "b22c3741e6bd0e613d3eb3325ad31f7d",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 2,
|
2012-08-12 13:05:41 +09:00
|
|
|
"link": true,
|
2012-05-18 08:34:31 +09:00
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-02-05 03:45:18 +09:00
|
|
|
{ "id": "issue1169",
|
|
|
|
"file": "pdfs/issue1169.pdf",
|
|
|
|
"md5": "3df3ed21fd43ac7fdb21e2015c8a7809",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
Add a couple more, mostly `text`, reference tests for non-embedded symbolic fonts without included encoding information
I've started to look into how we can fix issue 7580, but quickly became worried that fixing it could easily mean that we'd trade one fixed PDF file for a multitude of broken ones.
Hence I started going through the history of the code that chooses the fallback encoding, and noticed that it has been changed a number of times over the years to deal with various cases of weirdness/errors in non-embedded fonts.
To my relief it turned out that almost all the PRs, please see a possibly incomplete [list here], that changed this code actually included `eq` test-cases.
However, in one case it appears that a PR failed to add a test-case. Furthermore, since the fallback encoding may also be the only source for creating a `toUnicode` map, changing the encoding could possibly regress only the text-selection despite a PDF file still rendering correctly.
Therefore, this PR adds one new `eq` test, and also a number of additional `text` tests for PDF files already present in the test-suite.
Note that it's obviously possible that there's a certain overlap between the added tests, but I'd be *a whole lot* more concerned with causing regressions.
2016-09-10 20:10:15 +09:00
|
|
|
{ "id": "issue2017-eq",
|
|
|
|
"file": "pdfs/issue2017r.pdf",
|
|
|
|
"md5": "5d2bc43b386496cfb8841ca677db7046",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "issue2017-text",
|
|
|
|
"file": "pdfs/issue2017r.pdf",
|
|
|
|
"md5": "5d2bc43b386496cfb8841ca677db7046",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2016-06-27 20:51:11 +09:00
|
|
|
{ "id": "issue5256",
|
|
|
|
"file": "pdfs/issue5256.pdf",
|
|
|
|
"md5": "9383e17ced31f9afc940fb7898df8e68",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-08-21 06:01:48 +09:00
|
|
|
{ "id": "issue6360",
|
|
|
|
"file": "pdfs/issue6360.pdf",
|
|
|
|
"md5": "58c5455ffd84b1c07ad2d0fa90cd5e26",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-09-25 19:19:22 +09:00
|
|
|
{ "id": "multiple-filters-length-zero",
|
|
|
|
"file": "pdfs/multiple-filters-length-zero.pdf",
|
|
|
|
"md5": "c273c3a6fb79cbf3034fe1b62b204728",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-08-21 23:57:08 +09:00
|
|
|
{ "id": "issue5752",
|
|
|
|
"file": "pdfs/issue5752.pdf",
|
|
|
|
"md5": "aa20ad7cff71e9481c0cd623ddbff3b7",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-07-31 20:46:11 +09:00
|
|
|
{ "id": "issue2931",
|
|
|
|
"file": "pdfs/issue2931.pdf",
|
|
|
|
"md5": "ea40940eaf3541b312bda9329167da11",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2020-06-30 16:24:01 +09:00
|
|
|
{ "id": "issue4260",
|
|
|
|
"file": "pdfs/issue4260_reduced.pdf",
|
|
|
|
"md5": "6b39ebc8c4dddc41f1b031c070c212ad",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-09-04 04:57:57 +09:00
|
|
|
{ "id": "ZapfDingbats",
|
|
|
|
"file": "pdfs/ZapfDingbats.pdf",
|
|
|
|
"md5": "980df9b1c86715a3d8aa2d3c807a2b2c",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-11-08 21:18:23 +09:00
|
|
|
{ "id": "TrueType_without_cmap",
|
|
|
|
"file": "pdfs/TrueType_without_cmap.pdf",
|
|
|
|
"md5": "afca8bb11f2e1f7298b4e5dd85785fb0",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-07-25 18:57:38 +09:00
|
|
|
{ "id": "issue8697",
|
|
|
|
"file": "pdfs/issue8697.pdf",
|
|
|
|
"md5": "65c6f0d861d49fa685051a64c3f78694",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-10-08 21:06:28 +09:00
|
|
|
{ "id": "non-embedded-NuptialScript",
|
|
|
|
"file": "pdfs/non-embedded-NuptialScript.pdf",
|
|
|
|
"md5": "94225085d3fbf5d2d12b8be1c52bb3c1",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2019-10-15 21:06:54 +09:00
|
|
|
{ "id": "issue11242",
|
|
|
|
"file": "pdfs/issue11242_reduced.pdf",
|
|
|
|
"md5": "ba50b6ee537f3e815ccfe0c99e598e05",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-07-31 20:46:11 +09:00
|
|
|
{ "id": "issue3323",
|
|
|
|
"file": "pdfs/issue3323.pdf",
|
|
|
|
"md5": "1a14ff574013caeafa9d598269988764",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "issue4304",
|
|
|
|
"file": "pdfs/issue4304.pdf",
|
|
|
|
"md5": "1b1205bf0d7c1ad22a154b60da8e694d",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-12-30 00:28:23 +09:00
|
|
|
{ "id": "issue4379",
|
|
|
|
"file": "pdfs/issue4379.pdf",
|
|
|
|
"md5": "09715ec1a7b0f3a7ae02b3046f627b9f",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-10-04 20:12:07 +09:00
|
|
|
{ "id": "issue4387",
|
|
|
|
"file": "pdfs/issue4387.pdf",
|
|
|
|
"md5": "e06da177b5b9e36016fa2442510d62da",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2014-12-12 01:29:26 +09:00
|
|
|
{ "id": "issue4722",
|
|
|
|
"file": "pdfs/issue4722.pdf",
|
|
|
|
"md5": "a42ca858af7d179358f92f47e57c0fed",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-08-04 01:34:52 +09:00
|
|
|
{ "id": "issue4875",
|
|
|
|
"file": "pdfs/issue4875.pdf",
|
|
|
|
"md5": "9a558e18029a42c0ef4e9a8755e24733",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "issue4881",
|
|
|
|
"file": "pdfs/issue4881.pdf",
|
|
|
|
"md5": "e1f06be05eda4ddf734e9764f3f067f1",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-10-25 18:35:13 +09:00
|
|
|
{ "id": "issue5291",
|
|
|
|
"file": "pdfs/issue5291.pdf",
|
|
|
|
"md5": "edae085495c702069ffdbf785a826556",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-08-29 21:15:19 +09:00
|
|
|
{ "id": "issue5244",
|
|
|
|
"file": "pdfs/issue5244.pdf",
|
|
|
|
"md5": "a50cd364c3976c744627b4b9bb90c761",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-05-11 04:48:17 +09:00
|
|
|
{ "id": "issue5994",
|
|
|
|
"file": "pdfs/issue5994.pdf",
|
|
|
|
"md5": "6799733a39d29b3828d6628bf2c5c382",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-02-12 03:11:52 +09:00
|
|
|
{ "id": "issue8061",
|
|
|
|
"file": "pdfs/issue8061.pdf",
|
|
|
|
"md5": "d61fe1dcdcd55bca00b351b2fc2c6dc7",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
Fallback to the `StandardEncoding` for Nonsymbolic fonts without `/Encoding` entry (issue 7580)
Even though this patch passes all tests (unit/font/reference) locally, including the new ones that I added in PR 7621, I'm still a bit nervous about modifying the code that chooses the fallback encoding for fonts without an `/Encoding` entry.
Note that over the years this code has been changed on a number of occasions, see a possibly incomplete [list here], to deal with various cases of incorrect font data.
According to the PDF specification, see http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G8.1904184, it seems that we should fall back to the `StandardEncoding` for Nonsymbolic fonts.
There's obviously a risk that fixing this particular issue *could* break other PDF files for which we don't have tests. However, I've tried to change the logic as little as possible in this patch, to hopefully reduce possible breakage.
Based on debugging numerous font issues, it seems that a lot of fonts actually set the Symbolic flag, even when they are in fact *not* Symbolic. Fonts actually marked as Nonsymbolic seem to be somewhat less common, which I hope should reduce the risk of the patch somewhat.
Fixes 7580.
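A minimal sketch of the decision described above, assuming a simplified font object with a numeric `flags` field and an optional `encoding` name. The `pickFallbackEncoding` helper and the object shape are hypothetical and do not reflect the real PDF.js code; the two flag bit positions are the Symbolic and Nonsymbolic font descriptor flags from the PDF specification.

```javascript
// Font descriptor flag bits from the PDF specification.
const FLAG_SYMBOLIC = 1 << 2; // bit position 3
const FLAG_NONSYMBOLIC = 1 << 5; // bit position 6

// Hypothetical helper (not a PDF.js function): choose an encoding name for a
// font, falling back to StandardEncoding only for Nonsymbolic fonts that
// lack an /Encoding entry.
function pickFallbackEncoding(font) {
  if (font.encoding) {
    return font.encoding; // An explicit /Encoding entry always wins.
  }
  const isSymbolic = (font.flags & FLAG_SYMBOLIC) !== 0;
  const isNonsymbolic = (font.flags & FLAG_NONSYMBOLIC) !== 0;
  if (isNonsymbolic && !isSymbolic) {
    return "StandardEncoding";
  }
  // Anything else is left to whatever handling already exists.
  return null;
}
```

Returning `null` for every other case is a deliberate simplification: it keeps the sketch limited to the one rule this patch introduces, in line with changing the existing logic as little as possible.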
2016-09-13 20:43:23 +09:00
|
|
|
{ "id": "issue7580-eq",
|
|
|
|
"file": "pdfs/issue7580.pdf",
|
|
|
|
"md5": "44dd5a9b4373fcab9890cf567722a766",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "issue7580-text",
|
|
|
|
"file": "pdfs/issue7580.pdf",
|
|
|
|
"md5": "44dd5a9b4373fcab9890cf567722a766",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2015-11-24 00:57:43 +09:00
|
|
|
{ "id": "issue6612-text",
|
|
|
|
"file": "pdfs/issue6612.pdf",
|
|
|
|
"md5": "657f33236496916597cd70ef1222509a",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2012-01-18 13:50:49 +09:00
|
|
|
{ "id": "zerowidthline",
|
|
|
|
"file": "pdfs/zerowidthline.pdf",
|
|
|
|
"md5": "295d26e61a85635433f8e4b768953f60",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
2012-02-02 07:48:44 +09:00
|
|
|
},
|
2017-06-30 09:14:58 +09:00
|
|
|
{ "id": "issue8187",
|
|
|
|
"file": "pdfs/issue8187.pdf",
|
|
|
|
"md5": "1724dcada47b90c9217ee0139d8352a8",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-03-06 06:32:54 +09:00
|
|
|
{ "id": "issue5686",
|
|
|
|
"file": "pdfs/issue5686.pdf",
|
|
|
|
"md5": "78d16b9df07a355ad00d70504a9194f8",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Type1 font where Length1/Length2 are slightly incorrect."
|
|
|
|
},
|
|
|
|
{ "id": "issue3928",
|
|
|
|
"file": "pdfs/issue3928.pdf",
|
|
|
|
"md5": "1963493f843e981cbe768b707ef7f08a",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Type1 font where Length1/Length2 are several orders of magnitude too large."
|
|
|
|
},
|
2017-11-29 04:24:27 +09:00
|
|
|
{ "id": "images_1bit_grayscale",
|
|
|
|
"file": "pdfs/images_1bit_grayscale.pdf",
|
|
|
|
"md5": "e1c36a19563944891bd30cfc0199d07f",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-02-13 12:11:44 +09:00
|
|
|
{ "id": "html5checker",
|
|
|
|
"file": "pdfs/html5checker.pdf",
|
|
|
|
"md5": "74bbd80d1e7eb5f2951582233ef9ebab",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 7,
|
2012-02-13 12:11:44 +09:00
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-02-07 08:13:41 +09:00
|
|
|
{ "id": "issue5704",
|
|
|
|
"file": "pdfs/issue5704.pdf",
|
|
|
|
"md5": "6e0b62585feef24dff2d7e7687cd8128",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-02-20 19:19:21 +09:00
|
|
|
{ "id": "issue5747",
|
|
|
|
"file": "pdfs/issue5747.pdf",
|
|
|
|
"md5": "5c36afc931dd1a3321ffa2e88952d174",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
2015-02-07 08:13:41 +09:00
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-06-10 00:52:36 +09:00
|
|
|
{ "id": "issue6099",
|
|
|
|
"file": "pdfs/issue6099.pdf",
|
|
|
|
"md5": "f8e9a26637c1b0d35f2bc653122a9c73",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-06-08 05:40:06 +09:00
|
|
|
{ "id": "issue7180-text",
|
|
|
|
"file": "pdfs/issue7180.pdf",
|
|
|
|
"md5": "73ed92d7ca55475f1f31d1d75fee3283",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2015-12-04 05:34:12 +09:00
|
|
|
{
|
2015-12-06 03:33:25 +09:00
|
|
|
"id": "bug1123803",
|
|
|
|
"file": "pdfs/bug1123803.pdf",
|
|
|
|
"md5": "0f3870ac5ad8899aec34d734e045a514",
|
2015-12-04 05:34:12 +09:00
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2020-04-12 19:27:11 +09:00
|
|
|
{ "id": "issue11794",
|
|
|
|
"file": "pdfs/issue11794.pdf",
|
|
|
|
"md5": "00d17b10a5fd7c06cddd7a0d2066ecdd",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-04-11 06:39:15 +09:00
|
|
|
{
|
|
|
|
"id": "bug852992",
|
|
|
|
"file": "pdfs/bug852992_reduced.pdf",
|
|
|
|
"md5": "c11439fe3b7f8bc39d89dcff58c50a0c",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"id": "bug1199237",
|
|
|
|
"file": "pdfs/bug1199237.pdf",
|
|
|
|
"md5": "e9a63d3207ccc65a4955d5723546e962",
|
|
|
|
"rounds": 1,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{
|
|
|
|
"id": "issue6165",
|
|
|
|
"file": "pdfs/issue6165.pdf",
|
|
|
|
"md5": "84ebde43b9121aa2ef8026388a4f4244",
|
|
|
|
"rounds": 1,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-11-04 00:03:08 +09:00
|
|
|
{
|
|
|
|
"id": "issue6019-text",
|
|
|
|
"file": "pdfs/issue6019.pdf",
|
|
|
|
"md5": "7a2e5dda3b0fc5c2e9060e378a8cdc4e",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2015-02-07 08:13:41 +09:00
|
|
|
{ "id": "bug893730",
|
|
|
|
"file": "pdfs/bug893730.pdf",
|
|
|
|
"md5": "2587379fb1b3bbff89c14f0863e78383",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-08-09 19:31:05 +09:00
|
|
|
{ "id": "issue6336",
|
|
|
|
"file": "pdfs/issue6336.pdf",
|
|
|
|
"md5": "1c457c12b3606e1de610235d6768bd78",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-04-11 03:58:02 +09:00
|
|
|
{ "id": "issue6652",
|
|
|
|
"file": "pdfs/issue6652.pdf",
|
|
|
|
"md5": "1c8e2953f84623bc773eb720c87a9331",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-03-07 19:39:10 +09:00
|
|
|
{ "id": "issue5801",
|
|
|
|
"file": "pdfs/issue5801.pdf",
|
|
|
|
"md5": "e9548650ad40e13e00d2a486bbc2bb1b",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-11-22 21:48:06 +09:00
|
|
|
{ "id": "issue7835",
|
|
|
|
"file": "pdfs/issue7835.pdf",
|
|
|
|
"md5": "afb3206a83ee3fd19f3dea0480f942ec",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-05-10 18:28:15 +09:00
|
|
|
{ "id": "issue5972",
|
|
|
|
"file": "pdfs/issue5972.pdf",
|
|
|
|
"md5": "51f03e1d38410b04c9dda7e75fe8a0a3",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2012-03-02 12:23:36 +09:00
|
|
|
{ "id": "pdfkit_compressed",
|
|
|
|
"file": "pdfs/pdfkit_compressed.pdf",
|
|
|
|
"md5": "ffe9c571d0a1572e234253e6c7cdee6c",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-03-17 05:41:13 +09:00
|
|
|
{ "id": "issue8117",
|
|
|
|
"file": "pdfs/issue8117.pdf",
|
|
|
|
"md5": "0c805ae480dd523148a16fe7ed0fc867",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-11-30 02:28:32 +09:00
|
|
|
{ "id": "issue7855",
|
|
|
|
"file": "pdfs/issue7855.pdf",
|
|
|
|
"md5": "290d4d5da041ffbcb1ea5d3b0ed8ee91",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-06-07 01:00:14 +09:00
|
|
|
{ "id": "issue6068",
|
|
|
|
"file": "pdfs/issue6068.pdf",
|
|
|
|
"md5": "bbcedb94776b40352729c16940a5b2bd",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-01-18 06:03:21 +09:00
|
|
|
{ "id": "issue6889",
|
|
|
|
"file": "pdfs/issue6889.pdf",
|
|
|
|
"md5": "397fa92da1a8bfa83dc8c20287854d15",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-03-18 07:35:04 +09:00
|
|
|
{ "id": "tamreview",
|
|
|
|
"file": "pdfs/TAMReview.pdf",
|
|
|
|
"md5": "8039aba56790d3597d2bc8c794a51301",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 5,
|
2012-03-18 07:35:04 +09:00
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2018-04-10 22:44:42 +09:00
|
|
|
{ "id": "text_clip_cff_cid",
|
|
|
|
"file": "pdfs/text_clip_cff_cid.pdf",
|
|
|
|
"md5": "92d4920586f177cc0e83326e5b5d2ee1",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2018-06-13 18:02:06 +09:00
|
|
|
{ "id": "issue4244",
|
|
|
|
"file": "pdfs/issue4244.pdf",
|
|
|
|
"md5": "26845274a32a537182ced1fd693a38b2",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-03-26 06:30:44 +09:00
|
|
|
{ "id": "preistabelle",
|
|
|
|
"file": "pdfs/preistabelle.pdf",
|
|
|
|
"md5": "d2f0b2086160d4f3d325c79a5dc1fb4d",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 2,
|
2012-03-26 06:30:44 +09:00
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-09-10 20:10:15 +09:00
|
|
|
{ "id": "preistabelle-text",
|
|
|
|
"file": "pdfs/preistabelle.pdf",
|
|
|
|
"md5": "d2f0b2086160d4f3d325c79a5dc1fb4d",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2013-03-03 21:36:44 +09:00
|
|
|
{ "id": "ichiji",
|
|
|
|
"file": "pdfs/ichiji.pdf",
|
|
|
|
"md5": "66b645802d33513cd598886e017392b8",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-03-20 01:09:42 +09:00
|
|
|
{ "id": "issue1350",
|
|
|
|
"file": "pdfs/issue1350.pdf",
|
|
|
|
"md5": "92f72a04a4d9d05b2dd433b51f32ab1f",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-01-03 09:44:11 +09:00
|
|
|
{ "id": "bug864847",
|
|
|
|
"file": "pdfs/bug864847.pdf",
|
|
|
|
"md5": "2b62cbba5d40a769be8e611eb5b61bfe",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-09-10 20:10:15 +09:00
|
|
|
{ "id": "bug864847-text",
|
|
|
|
"file": "pdfs/bug864847.pdf",
|
|
|
|
"md5": "2b62cbba5d40a769be8e611eb5b61bfe",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2012-02-20 15:12:22 +09:00
|
|
|
{ "id": "issue925",
|
|
|
|
"file": "pdfs/issue925.pdf",
|
|
|
|
"md5": "f58fe943090aff89dcc8e771bc0db4c2",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-12-25 01:25:43 +09:00
|
|
|
{ "id": "issue9291",
|
|
|
|
"file": "pdfs/issue9291.pdf",
|
|
|
|
"md5": "8c3f4edff54c1ca0b67700a05ea0ccee",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-02-24 00:01:08 +09:00
|
|
|
{ "id": "issue5751",
|
|
|
|
"file": "pdfs/issue5751.pdf",
|
|
|
|
"md5": "9334a52dc85747f1e3422767e0cd3ee9",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-07-17 23:41:53 +09:00
|
|
|
{ "id": "issue7426",
|
|
|
|
"file": "pdfs/issue7426.pdf",
|
|
|
|
"md5": "304e6cae18fdc07f66bd621fbe16b6cb",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2016-01-06 10:07:21 +09:00
|
|
|
{ "id": "issue6782",
|
|
|
|
"file": "pdfs/issue6782.pdf",
|
|
|
|
"md5": "b423f709600daa5745cc6d8234f7c608",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
2016-09-06 16:56:18 +09:00
|
|
|
},
|
2020-06-30 19:18:06 +09:00
|
|
|
{ "id": "issue12010",
|
|
|
|
"file": "pdfs/issue12010_reduced.pdf",
|
|
|
|
"md5": "8894ec63069dcf92c9f56baec05c0425",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
Ensure that all necessary /Font resources are included when saving a `WidgetAnnotation`-instance (issue 12294)
This patch contains a possible approach for fixing issue 12294, which compared to other PRs is purposely limited to the affected `WidgetAnnotation` code.
As mentioned elsewhere, considering that we're (at least for now) trying to fix *one specific* case, I think that we should avoid modifying the `Dict` primitive[1] and/or avoid a solution that (indirectly) modifies an existing `Dict`-instance[2].
This patch simply fixes the issue at hand, since that seems easiest for now, and I'd suggest that we worry about a more general approach if/when that actually becomes necessary.
Hence the solution implemented here, for `WidgetAnnotation`, is to simply use a combination of the local *and* AcroForm /DR resources during OperatorList-parsing to ensure that things work correctly regardless of where a particular /Font resource is found.
For saving of form-data, on the other hand, we want to avoid increasing the file-size unnecessarily and need to be smarter than just merging all of the available resources. To achieve this, a new `WidgetAnnotation._getSaveFieldResources` method will, when necessary, produce a combined resources `Dict` with only the minimum amount of data from the AcroForm /DR resources included.
---
[1] You want to avoid anything that could cause the general `Dict` implementation to become slower, or more complex, just for handling an edge-case in my opinion.
[2] If an existing `Dict`-instance is modified unexpectedly, that could very easily lead to problems elsewhere since e.g. `Dict`-instances created during parsing are not expected to be changed.
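The "minimum amount of data" idea can be sketched as follows, using plain `Map` objects in place of PDF.js `Dict` instances. The `getSaveFieldFonts` helper, the font names, and the reference strings are all hypothetical and only illustrate the approach; this is not the actual `_getSaveFieldResources` implementation.

```javascript
// Hypothetical sketch: plain Maps stand in for `Dict` instances.
function getSaveFieldFonts(localFonts, acroFormFonts, fontName) {
  if (localFonts.has(fontName) || !acroFormFonts.has(fontName)) {
    // Nothing needs to be added, so the local resources are reused as-is.
    return localFonts;
  }
  // Build a NEW map so that neither input is modified, and copy over only
  // the single font that is actually needed.
  const combined = new Map(localFonts);
  combined.set(fontName, acroFormFonts.get(fontName));
  return combined;
}

// Invented example data.
const localFonts = new Map([["Helv", "font-ref-1"]]);
const acroFormFonts = new Map([
  ["Helv", "font-ref-1"],
  ["KozMin", "font-ref-2"],
  ["ZaDb", "font-ref-3"],
]);

const saveFonts = getSaveFieldFonts(localFonts, acroFormFonts, "KozMin");
```

Here `saveFonts` contains two entries rather than all three AcroForm fonts, and `localFonts` still has its single original entry, which matches the two concerns in the footnotes: no unnecessary growth and no mutation of existing instances.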
2020-09-10 17:21:34 +09:00
|
|
|
{ "id": "issue12294-print",
|
|
|
|
"file": "pdfs/issue12294.pdf",
|
|
|
|
"md5": "a0ac5e03be38b5fb7a7a615e30024b28",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq",
|
|
|
|
"print": true,
|
|
|
|
"annotationStorage": {
|
2020-11-04 00:04:08 +09:00
|
|
|
"2795R": {
|
|
|
|
"value": "氏 名 又 は 名 称 Full name"
|
|
|
|
}
|
2020-09-10 17:21:34 +09:00
|
|
|
}
|
|
|
|
},
|
2016-09-06 16:56:18 +09:00
|
|
|
{ "id": "issue7598",
|
|
|
|
"file": "pdfs/issue7598.pdf",
|
|
|
|
"md5": "c5bc5a779bfcb4b234f853231b56cf60",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
2016-01-06 10:07:21 +09:00
|
|
|
},
|
2016-06-25 19:41:26 +09:00
|
|
|
{ "id": "issue7439",
|
|
|
|
"file": "pdfs/issue7439.pdf",
|
|
|
|
"md5": "56682657990a894c66db26560d3039d7",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-06-22 08:02:58 +09:00
|
|
|
{ "id": "bug867484",
|
|
|
|
"file": "pdfs/bug867484.pdf",
|
|
|
|
"md5": "347af7b0ef7279b1a7f43b03bfda4548",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2013-10-31 23:10:08 +09:00
|
|
|
{ "id": "bug860632",
|
|
|
|
"file": "pdfs/bug860632.pdf",
|
|
|
|
"md5": "b3cabf9249c8fee76f61f6b3b7fdd5fd",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-12-05 20:22:09 +09:00
|
|
|
{ "id": "IdentityToUnicodeMap_charCodeOf",
|
|
|
|
"file": "pdfs/IdentityToUnicodeMap_charCodeOf.pdf",
|
|
|
|
"md5": "da030686418c5e37d889127a05dafb83",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2017-05-25 00:36:39 +09:00
|
|
|
{ "id": "issue8372-text",
|
|
|
|
"file": "pdfs/issue8372.pdf",
|
|
|
|
"md5": "b02fb07364dd00ad5044bd259860da97",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2018-01-05 07:43:07 +09:00
|
|
|
{ "id": "issue9278",
|
|
|
|
"file": "pdfs/issue9278.pdf",
|
|
|
|
"md5": "9819c3a5715c1a46ea5a6740f9ead3da",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-02-19 21:58:03 +09:00
|
|
|
{ "id": "bug894572",
|
|
|
|
"file": "pdfs/bug894572.pdf",
|
|
|
|
"md5": "e54a6b0451939f685ed37e3d46e16158",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-02-07 05:58:01 +09:00
|
|
|
{ "id": "bug1108301",
|
|
|
|
"file": "pdfs/bug1108301.pdf",
|
|
|
|
"md5": "cc94cc7e5f5e281dfa7e21020dd90cc7",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-10-05 06:47:03 +09:00
|
|
|
{ "id": "bug1020858",
|
|
|
|
"file": "pdfs/bug1020858.pdf",
|
|
|
|
"md5": "cde53bcf75df14ff59c8a5a96fe437b9",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2015-04-25 20:27:10 +09:00
|
|
|
{ "id": "bug1157493",
|
|
|
|
"file": "pdfs/bug1157493.pdf",
|
|
|
|
"md5": "df96eddacf186c28a91e699800180c4f",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2019-03-27 08:25:34 +09:00
|
|
|
{ "id": "issue10665",
|
|
|
|
"file": "pdfs/issue10665_reduced.pdf",
|
|
|
|
"md5": "4c8938c808153f6b3840e8a5eb68b804",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
Ensure that `MurmurHash3_64.update` handles `ArrayBuffer` input correctly, to avoid hash-collisions (issue 12533)
Different fonts incorrectly end up with *identical* hashes, despite having different /ToUnicode data.
The issue, and it's very interesting that we've apparently not seen it before, appears to be caused by the fact that different /ToUnicode entries share the *same* underlying `ArrayBuffer`, which thus becomes problematic at the `const dataUint32 = new Uint32Array(data.buffer, 0, blockCounts);` line. The simplest solution thus seems to be to just *copy* the input, when it's an `ArrayBuffer`, rather than using it as-is. (Note that if we'd stringified the input, when calling `MurmurHash3_64.update`, the issue would also have been fixed. In this case, we're already creating a unique TypedArray.)
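The pitfall can be reproduced in isolation with two typed-array views onto one shared `ArrayBuffer`. This is a simplified sketch, assuming the inputs are `Uint8Array` views at different offsets; the helper names are invented here and this is not the `MurmurHash3_64` code itself.

```javascript
// Two logically different inputs that live in ONE shared ArrayBuffer.
const sharedBuffer = new Uint8Array([1, 2, 3, 4, 5, 6, 7, 8]).buffer;
const firstInput = new Uint8Array(sharedBuffer, 0, 4); // bytes 1..4
const secondInput = new Uint8Array(sharedBuffer, 4, 4); // bytes 5..8

// Problematic pattern: `data.buffer` is the WHOLE shared buffer, and the
// hard-coded offset 0 ignores `data.byteOffset`.
function wordsIgnoringOffset(data) {
  return Array.from(new Uint32Array(data.buffer, 0, data.byteLength >> 2));
}

// Safer pattern: copy first, so the copy owns a buffer that starts at the
// beginning of the input's own bytes.
function wordsFromCopy(data) {
  const copy = data.slice();
  return Array.from(new Uint32Array(copy.buffer, 0, copy.byteLength >> 2));
}

const collidingA = wordsIgnoringOffset(firstInput);
const collidingB = wordsIgnoringOffset(secondInput);
const distinctA = wordsFromCopy(firstInput);
const distinctB = wordsFromCopy(secondInput);
```

`collidingA` and `collidingB` are identical even though the inputs differ, which is how two different fonts could end up with the same hash, while `distinctA` and `distinctB` differ as expected.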
2020-10-27 00:20:55 +09:00
|
|
|
{ "id": "issue12533",
|
|
|
|
"file": "pdfs/issue12533.pdf",
|
|
|
|
"md5": "9824904320f884eee20d4e4573008e6f",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 17,
|
|
|
|
"lastPage": 18,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-04-05 01:50:20 +09:00
|
|
|
{ "id": "issue1466",
|
|
|
|
"file": "pdfs/issue1466.pdf",
|
|
|
|
"md5": "8a8877432e5bb10cfd50d60488d947bb",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-12-06 22:12:37 +09:00
|
|
|
{ "id": "bug1064894",
|
|
|
|
"file": "pdfs/bug1064894.pdf",
|
|
|
|
"md5": "22971b6a24912bca9c773379c10ef18a",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-10-09 04:11:41 +09:00
|
|
|
{ "id": "bug1065245",
|
|
|
|
"file": "pdfs/bug1065245.pdf",
|
|
|
|
"md5": "844b3af0a1d338a2e1bbe742f474bbb7",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Inline JPEG images."
|
|
|
|
},
|
Don't read past the EOI marker for JPEG images with non-default restart interval (issue 7828)
*After browsing through (a version of) the JPEG specification, see https://www.w3.org/Graphics/JPEG/itu-t81.pdf, I hope that this patch makes sense.*
Note that while issue 7828 became a problem after PR 7661, it isn't really a regression from that PR. The explanation is rather that we're now relying on `core/jpg.js` instead of the Native Image decoder in more situations than before, which thus exposed an *existing* issue in our JPEG decoder.
Another factor also seems to be that in many JPEG images, the DRI (Define Restart Interval) marker isn't present, in which case this bug won't manifest either.
According to https://www.w3.org/Graphics/JPEG/itu-t81.pdf#page=89 (at the bottom of the page):
"NOTE – The final restart interval may be smaller than the size specified by the DRI marker segment, as it includes only the number of MCUs remaining in the scan."
Furthermore, according to https://www.w3.org/Graphics/JPEG/itu-t81.pdf#page=39 (in the middle of the page):
"[...] If restart is enabled and the restart interval is defined to be Ri, each entropy-coded segment except the last one shall contain Ri MCUs. The last one shall contain whatever number of MCUs completes the scan."
Based on the above, it thus seems to me that we should simply ensure that we're not attempting to continue to parse Scan data once we've found all MCUs (Minimum Coded Units) of the image.
Fixes 7828.
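The rule quoted from the specification can be illustrated with a short calculation. This is only a sketch under stated assumptions: `restartIntervalSizes` is a hypothetical helper, not a function in `core/jpg.js`, and it models nothing but the MCU counts per entropy-coded segment. Every segment except the last holds Ri MCUs, and the last holds whatever remains, so a decoder that has consumed the sum of these sizes has no more Scan data to parse.

```javascript
// Hypothetical helper: returns the number of MCUs in each entropy-coded
// segment of a scan, given the total MCU count and the restart interval Ri.
function restartIntervalSizes(totalMcus, restartInterval) {
  // No DRI marker (or Ri = 0) means restart is disabled: one single segment.
  if (restartInterval <= 0) {
    return [totalMcus];
  }
  const sizes = [];
  for (let done = 0; done < totalMcus; done += restartInterval) {
    // The final segment may be smaller than Ri; it only completes the scan.
    sizes.push(Math.min(restartInterval, totalMcus - done));
  }
  return sizes;
}

const uneven = restartIntervalSizes(10, 4); // last segment is shorter than Ri
const even = restartIntervalSizes(8, 4);    // total divides evenly by Ri
const disabled = restartIntervalSizes(10, 0);

console.log(uneven, even, disabled);
```

With 10 MCUs and Ri = 4, the final segment contains only 2 MCUs. Attempting to read a full third interval of 4 would run past the end of the image data, which is the failure mode the commit message describes.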
2017-03-16 00:36:28 +09:00
|
|
|
{ "id": "issue7828",
|
|
|
|
"file": "pdfs/issue7828.pdf",
|
|
|
|
"md5": "462f96c877f5761fc3176156e3526184",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-08-12 10:10:31 +09:00
|
|
|
{ "id": "issue1655",
|
2015-11-15 23:32:11 +09:00
|
|
|
"file": "pdfs/issue1655r.pdf",
|
|
|
|
"md5": "569f48449ba57c15c4f9ade151a651c5",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
2012-08-12 10:10:31 +09:00
|
|
|
},
|
2012-10-13 12:33:56 +09:00
|
|
|
{ "id": "issue1687",
|
|
|
|
"file": "pdfs/issue1687.pdf",
|
|
|
|
"md5": "ea79d83821d1dd0663414b037080add5",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-02-02 07:48:44 +09:00
|
|
|
{ "id": "issue1133",
|
|
|
|
"file": "pdfs/issue1133.pdf",
|
|
|
|
"md5": "d1b61580cb100e3df93d33703af1773a",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
2012-02-07 09:11:52 +09:00
|
|
|
},
|
2012-08-21 05:57:21 +09:00
|
|
|
{ "id": "issue1658",
|
|
|
|
"file": "pdfs/issue1658.pdf",
|
|
|
|
"md5": "b71a0f641e83ad427b8bcfc180899a05",
|
|
|
|
"rounds": 1,
|
2019-10-17 20:15:59 +09:00
|
|
|
"firstPage": 10,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 10,
|
2012-08-21 05:57:21 +09:00
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-01-21 08:41:01 +09:00
|
|
|
{ "id": "issue1049",
|
|
|
|
"file": "pdfs/issue1049.pdf",
|
|
|
|
"md5": "15473fffcdde9fb8f3756a4cf1aab347",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
2012-02-15 04:55:39 +09:00
|
|
|
},
|
2015-06-16 19:47:50 +09:00
|
|
|
{ "id": "issue6117",
|
|
|
|
"file": "pdfs/issue6117.pdf",
|
|
|
|
"md5": "691f5f8268e07f3831e8293258a68da7",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 6,
|
|
|
|
"lastPage": 6,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-07-21 06:30:05 +09:00
|
|
|
{ "id": "issue6238",
|
|
|
|
"file": "pdfs/issue6238.pdf",
|
|
|
|
"md5": "6d7731ee22fbbdf746c8da01b8922d50",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 3,
|
|
|
|
"lastPage": 3,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-03-14 04:06:09 +09:00
|
|
|
{ "id": "cid_cff",
|
|
|
|
"file": "pdfs/cid_cff.pdf",
|
|
|
|
"md5": "a19a18eaa626262cc45e0760004d6de9",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-02-15 04:55:39 +09:00
|
|
|
{ "id": "issue1155",
|
2016-01-16 23:51:10 +09:00
|
|
|
"file": "pdfs/issue1155r.pdf",
|
|
|
|
"md5": "8d772a3c6109bda860b8d80d42d4c08c",
|
2012-02-15 04:55:39 +09:00
|
|
|
"rounds": 1,
|
2016-01-16 23:51:10 +09:00
|
|
|
"link": false,
|
2012-02-15 04:55:39 +09:00
|
|
|
"type": "eq"
|
2012-02-19 06:01:53 +09:00
|
|
|
},
|
2013-03-26 18:41:56 +09:00
|
|
|
{ "id": "link-annotation-border",
|
|
|
|
"file": "pdfs/link-annotation-border.pdf",
|
2013-05-29 07:31:54 +09:00
|
|
|
"md5": "a0550889b010df9fabe4e2107662c8c4",
|
2013-03-26 18:41:56 +09:00
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"link": true,
|
2015-12-19 06:29:22 +09:00
|
|
|
"type": "eq",
|
|
|
|
"annotations": true
|
2013-03-26 18:41:56 +09:00
|
|
|
},
|
2016-05-25 00:35:45 +09:00
|
|
|
{ "id": "pr7352",
|
|
|
|
"file": "pdfs/pr7352.pdf",
|
|
|
|
"md5": "336abca4b313cb215b0569883f1f683d",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"annotations": true
|
|
|
|
},
|
2012-03-11 12:12:33 +09:00
|
|
|
{ "id": "issue1002",
|
|
|
|
"file": "pdfs/issue1002.pdf",
|
|
|
|
"md5": "af62d6cd95079322d4af18edd960d15c",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2019-01-23 02:59:36 +09:00
|
|
|
{ "id": "issue10339",
|
|
|
|
"file": "pdfs/issue10339_reduced.pdf",
|
|
|
|
"md5": "e34ef74f188080f8194c7d8e8b68c562",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-05-22 12:15:09 +09:00
|
|
|
{ "id": "issue1721",
|
|
|
|
"file": "pdfs/issue1721.pdf",
|
|
|
|
"md5": "b47177f9e5197a76ec498733ecab60e6",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 2,
|
2012-05-22 12:15:09 +09:00
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2018-01-24 00:04:07 +09:00
|
|
|
{ "id": "jbig2_huffman_1",
|
|
|
|
"file": "pdfs/jbig2_huffman_1.pdf",
|
|
|
|
"md5": "93ccd85c5686bea27d3e6c3b41921b21",
|
|
|
|
"lastPage": 1,
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "jbig2_huffman_2",
|
|
|
|
"file": "pdfs/jbig2_huffman_2.pdf",
|
|
|
|
"md5": "f019e2722bd64684e2093a0933e390f4",
|
|
|
|
"firstPage": 7,
|
|
|
|
"lastPage": 7,
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2019-03-27 02:38:44 +09:00
|
|
|
{ "id": "issue9534_reduced",
|
|
|
|
"file": "pdfs/issue9534_reduced.pdf",
|
|
|
|
"md5": "f9a47805555de5bc0f9f5f5188df6bad",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-10-14 01:41:44 +09:00
|
|
|
{ "id": "issue1233",
|
|
|
|
"file": "pdfs/issue1233.pdf",
|
|
|
|
"md5": "2d3565b0a286e29955796c37c66326c1",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 1,
|
2012-10-14 01:41:44 +09:00
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-02-19 06:01:53 +09:00
|
|
|
{ "id": "issue1243",
|
|
|
|
"file": "pdfs/issue1243.pdf",
|
|
|
|
"md5": "130c849b83513d5ac5e03c6421fc7489",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 2,
|
2012-02-19 06:01:53 +09:00
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
2012-02-21 08:19:12 +09:00
|
|
|
},
|
2014-07-24 21:59:21 +09:00
|
|
|
{ "id": "issue5039",
|
|
|
|
"file": "pdfs/issue5039.pdf",
|
|
|
|
"md5": "5c131f458ee6b65cc096ccaf0474ee3a",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-07-29 01:41:47 +09:00
|
|
|
{ "id": "issue5070",
|
|
|
|
"file": "pdfs/issue5070.pdf",
|
|
|
|
"md5": "ec2ca0b4954c8390a5b3b0ffd79a8e92",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-02-21 08:19:12 +09:00
|
|
|
{ "id": "issue1257",
|
|
|
|
"file": "pdfs/issue1257.pdf",
|
|
|
|
"md5": "9111533826bc21ed774e8e01603a2f54",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 2,
|
2012-02-21 08:19:12 +09:00
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
2012-03-18 13:13:54 +09:00
|
|
|
},
|
|
|
|
{ "id": "issue1309",
|
|
|
|
"file": "pdfs/issue1309.pdf",
|
|
|
|
"md5": "e835fb7f3dab3073ad37d0bd3c6399fa",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-11-16 08:26:42 +09:00
|
|
|
{
|
|
|
|
"id": "issue6006",
|
|
|
|
"file": "pdfs/issue6006.pdf",
|
|
|
|
"md5": "65558dcdd2f20d4372458419c10bfa72",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-06-22 19:59:53 +09:00
|
|
|
{ "id": "issue1810",
|
|
|
|
"file": "pdfs/issue1810.pdf",
|
|
|
|
"md5": "b173a9dfb7bf00e1a298c6e8cb95c03e",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 3,
|
2012-06-22 19:59:53 +09:00
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2020-01-13 21:32:23 +09:00
|
|
|
{ "id": "issue1877",
|
|
|
|
"file": "pdfs/issue1877.pdf",
|
|
|
|
"md5": "feac01f414f2e6792e4d3174944622f5",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-04-26 22:47:59 +09:00
|
|
|
{ "id": "issue7308",
|
|
|
|
"file": "pdfs/issue7308.pdf",
|
|
|
|
"md5": "ba2e23d3af93ac2c634d77ccbe2e79d5",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-04-24 07:50:02 +09:00
|
|
|
{ "id": "issue1597",
|
|
|
|
"file": "pdfs/issue1597.pdf",
|
|
|
|
"md5": "a5ebef467fd6e2fc0aeb56c9eb725ae3",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-07-20 02:58:07 +09:00
|
|
|
{ "id": "issue1419.pdf",
|
|
|
|
"file": "pdfs/issue1419.pdf",
|
|
|
|
"md5": "b5b6c6405d7b48418bccf97277957664",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
2019-10-17 20:15:59 +09:00
|
|
|
"firstPage": 2,
|
2015-08-28 23:05:34 +09:00
|
|
|
"lastPage": 2,
|
2012-07-20 02:58:07 +09:00
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-03-18 13:13:54 +09:00
|
|
|
{ "id": "issue1317",
|
|
|
|
"file": "pdfs/issue1317.pdf",
|
|
|
|
"md5": "6fb46275b30c48c8985617d4f86199e3",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
2012-03-30 00:53:51 +09:00
|
|
|
},
|
2015-08-19 05:25:37 +09:00
|
|
|
{ "id": "issue6364",
|
|
|
|
"file": "pdfs/issue6364.pdf",
|
|
|
|
"md5": "b290328531ecdddf6b4c794b4b2fec28",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-08-28 20:42:01 +09:00
|
|
|
{ "id": "issue6387-canvas",
|
|
|
|
"file": "pdfs/issue6387.pdf",
|
|
|
|
"md5": "08c39ac6d0aab1596e6e59793eaf3ee4",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "issue6387-text",
|
|
|
|
"file": "pdfs/issue6387.pdf",
|
|
|
|
"md5": "08c39ac6d0aab1596e6e59793eaf3ee4",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "text",
|
|
|
|
"about": "Note that the text layer seems to be off to the right."
|
|
|
|
},
|
2018-11-21 01:50:37 +09:00
|
|
|
{ "id": "issue10084",
|
|
|
|
"file": "pdfs/issue10084_reduced.pdf",
|
|
|
|
"md5": "ae37cf36f2e319688c608e4086836824",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2019-08-05 21:40:48 +09:00
|
|
|
},
|
|
|
|
{ "id": "issue11045",
|
|
|
|
"file": "pdfs/issue11045.pdf",
|
|
|
|
"md5": "101d4cb649cc006e0f2b14923e8d97d6",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
2018-11-21 01:50:37 +09:00
|
|
|
},
|
2019-10-31 23:53:51 +09:00
|
|
|
{ "id": "issue11287",
|
|
|
|
"file": "pdfs/issue11287.pdf",
|
|
|
|
"md5": "d7d6a7c124fad7b00f79112b71ee09d6",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2019-11-16 10:07:02 +09:00
|
|
|
{ "id": "issue11330",
|
|
|
|
"file": "pdfs/issue11330.pdf",
|
|
|
|
"md5": "03a8a53d4b0dc825e08554f5c0178308",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2020-02-09 03:51:16 +09:00
|
|
|
{ "id": "issue11578",
|
|
|
|
"file": "pdfs/issue11578_reduced.pdf",
|
|
|
|
"md5": "6cefb4bdfae2fa25e5585374735e321f",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2020-03-09 19:37:33 +09:00
|
|
|
{ "id": "issue11678",
|
|
|
|
"file": "pdfs/issue11678.pdf",
|
|
|
|
"md5": "e2efadeb91932f4c21e4fc682cce7de9",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 2,
|
|
|
|
"lastPage": 2,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-08-19 05:33:09 +09:00
|
|
|
{ "id": "issue4890",
|
|
|
|
"file": "pdfs/issue4890.pdf",
|
|
|
|
"md5": "1666feb4cd26318c2bdbea6a175dce87",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-09-26 02:32:04 +09:00
|
|
|
{ "id": "bug898853.pdf",
|
|
|
|
"file": "pdfs/bug898853.pdf",
|
|
|
|
"md5": "37c37702bf98d33f9f74e2380c4d1a3f",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Has a multi-byte char codes."
|
|
|
|
},
|
2016-01-09 19:50:48 +09:00
|
|
|
{ "id": "issue4684-text",
|
|
|
|
"file": "pdfs/issue4684.pdf",
|
|
|
|
"md5": "af5056fcdfb08bd7adc1710d36e4b5b5",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "text",
|
|
|
|
"about": "Invisible (and broken) TrueType font used for text-selection."
|
|
|
|
},
|
2020-06-26 19:36:28 +09:00
|
|
|
{ "id": "issue11124",
|
|
|
|
"file": "pdfs/issue11124.pdf",
|
|
|
|
"md5": "9bde831515dc6b8bb2c7c00c8189aca9",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2020-04-15 20:34:13 +09:00
|
|
|
{ "id": "issue11768",
|
|
|
|
"file": "pdfs/issue11768_reduced.pdf",
|
|
|
|
"md5": "0cafde97d78bb6883531a325a996a5ef",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-10-14 05:35:23 +09:00
|
|
|
{ "id": "issue1912",
|
|
|
|
"file": "pdfs/issue1912.pdf",
|
|
|
|
"md5": "15305b7c2cba971e7423de3f6ad38fef",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-03-30 00:53:51 +09:00
|
|
|
{ "id": "gradientfill",
|
|
|
|
"file": "pdfs/gradientfill.pdf",
|
|
|
|
"md5": "cbc1988e4803f647fa83467a85f0e231",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2012-06-14 02:29:02 +09:00
|
|
|
},
|
|
|
|
{ "id": "issue1796",
|
|
|
|
"file": "pdfs/issue1796.pdf",
|
|
|
|
"md5": "9b9b60dc2a4cc3ea05932785d71304fe",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 2,
|
2012-06-14 02:29:02 +09:00
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
2012-07-12 08:29:07 +09:00
|
|
|
},
|
2014-04-17 21:52:33 +09:00
|
|
|
{ "id": "issue4630",
|
|
|
|
"file": "pdfs/issue4630.pdf",
|
|
|
|
"md5": "46690a12b953c9ad660f3f6453f8785c",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-02-14 19:59:10 +09:00
|
|
|
{ "id": "issue5734-text",
|
|
|
|
"file": "pdfs/issue5734.pdf",
|
|
|
|
"md5": "2cf88f1786b039ba080623085c87beb9",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2015-05-14 21:08:43 +09:00
|
|
|
{ "id": "issue5896-text",
|
|
|
|
"file": "pdfs/issue5896.pdf",
|
|
|
|
"md5": "08f69084d72dabc5dfdcf5c1ff2a719f",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2012-08-12 01:57:42 +09:00
|
|
|
{ "id": "issue845",
|
2015-11-17 07:38:23 +09:00
|
|
|
"file": "pdfs/issue845r.pdf",
|
|
|
|
"md5": "b5f8fe4005cf3fb685fdb4a4c44ee4a2",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
2012-08-12 01:57:42 +09:00
|
|
|
},
|
2013-08-24 02:57:11 +09:00
|
|
|
{ "id": "helloworld-bad",
|
|
|
|
"file": "pdfs/helloworld-bad.pdf",
|
|
|
|
"md5": "bf5ab1cf7fe3a502c3754f55e6ceeabd",
|
|
|
|
"rounds": 1,
|
2019-10-13 01:15:55 +09:00
|
|
|
"type": "eq"
|
2013-08-24 02:57:11 +09:00
|
|
|
},
|
2017-05-24 00:57:26 +09:00
|
|
|
{ "id": "issue8047",
|
|
|
|
"file": "pdfs/issue8047.pdf",
|
|
|
|
"md5": "83f1b9f7e95caa8e2625390afd7c7276",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2020-08-08 05:04:53 +09:00
|
|
|
{ "id": "issue11131_reduced",
|
|
|
|
"file": "pdfs/issue11131_reduced.pdf",
|
|
|
|
"md5": "004b7e7d2b133a8dc4fe64aaf3dc9533",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-07-12 08:29:07 +09:00
|
|
|
{ "id": "issue818",
|
|
|
|
"file": "pdfs/issue818.pdf",
|
|
|
|
"md5": "dd2f8a5bd65164ad74da2b45a6ca90cc",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 1,
|
2012-07-12 08:29:07 +09:00
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
2012-08-04 08:11:43 +09:00
|
|
|
},
|
2013-10-31 00:54:19 +09:00
|
|
|
{ "id": "issue3405",
|
2015-11-17 01:03:59 +09:00
|
|
|
"file": "pdfs/issue3405r.pdf",
|
|
|
|
"md5": "12e8c6b3437e659f9138d892d10c7d3d",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
2013-10-31 00:54:19 +09:00
|
|
|
},
|
2012-11-25 06:13:13 +09:00
|
|
|
{ "id": "issue2006",
|
|
|
|
"file": "pdfs/issue2006.pdf",
|
|
|
|
"md5": "71ec73831ece9b508ad20efa6ff28642",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 1,
|
2012-11-25 06:13:13 +09:00
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-08-04 08:11:43 +09:00
|
|
|
{ "id": "issue1729",
|
|
|
|
"file": "pdfs/issue1729.pdf",
|
|
|
|
"md5": "29b0eddc3e1dcb23a44384037032d470",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 1,
|
2012-08-04 08:11:43 +09:00
|
|
|
"link": true,
|
|
|
|
"type": "load"
|
2012-08-17 07:22:28 +09:00
|
|
|
},
|
|
|
|
{ "id": "issue1985",
|
|
|
|
"file": "pdfs/issue1985.pdf",
|
|
|
|
"md5": "2ac7c68e26a8ef797aead15e4875cc6d",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "load"
|
2012-08-29 09:19:31 +09:00
|
|
|
},
|
2017-12-14 09:10:14 +09:00
|
|
|
{ "id": "issue7074_reduced",
|
|
|
|
"file": "pdfs/issue7074_reduced.pdf",
|
|
|
|
"md5": "46893f8aa33620a05acdc27e3b79469d",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-07-03 06:47:47 +09:00
|
|
|
{ "id": "issue5540",
|
|
|
|
"file": "pdfs/issue5540.pdf",
|
|
|
|
"md5": "12b69b19e366232422812ad8b2534f37",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-12-15 05:44:39 +09:00
|
|
|
{ "id": "issue2176",
|
|
|
|
"file": "pdfs/issue2176.pdf",
|
|
|
|
"md5": "ca5cbbc7e2b717997f0b24ffa485eac6",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-12-21 04:14:10 +09:00
|
|
|
{ "id": "issue1453",
|
|
|
|
"file": "pdfs/issue1453.pdf",
|
|
|
|
"md5": "cee0ee8ea3a0643cbd716d57fd44f628",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-11-29 02:40:22 +09:00
|
|
|
{ "id": "pattern_text_embedded_font",
|
|
|
|
"file": "pdfs/pattern_text_embedded_font.pdf",
|
|
|
|
"md5": "763b1b9efaecb2b5aefea71c39233f56",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-08-30 08:23:52 +09:00
|
|
|
{ "id": "issue6113",
|
|
|
|
"file": "pdfs/issue6113.pdf",
|
|
|
|
"md5": "365fa2d369c51ee3ff195dae907b6e25",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-11-07 23:02:37 +09:00
|
|
|
{
|
|
|
|
"id": "tiling-pattern-box",
|
|
|
|
"file": "pdfs/tiling-pattern-box.pdf",
|
|
|
|
"md5": "09100872824fc14012bd8f9bf4dbc632",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-10-04 00:02:19 +09:00
|
|
|
{
|
|
|
|
"id": "tiling-pattern-large-steps",
|
|
|
|
"file": "pdfs/tiling-pattern-large-steps.pdf",
|
|
|
|
"md5": "569aac1303c97004aab6a720d9b259b4",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-10-24 01:20:57 +09:00
|
|
|
{ "id": "issue6151",
|
|
|
|
"file": "pdfs/issue6151.pdf",
|
|
|
|
"md5": "926f8c6b25e6f0978759f7947d70e079",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2020-08-08 05:04:53 +09:00
|
|
|
{ "id": "bug1650302_reduced",
|
|
|
|
"file": "pdfs/bug1650302_reduced.pdf",
|
|
|
|
"md5": "d918c9ec936486e8b6656e10dd909014",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-03-12 02:23:47 +09:00
|
|
|
{ "id": "blendmode",
|
|
|
|
"file": "pdfs/blendmode.pdf",
|
2013-04-03 00:23:25 +09:00
|
|
|
"md5": "5a86e7e9333e93c58abc3f382e1e6ea2",
|
2013-03-12 02:23:47 +09:00
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Every blend mode that PDF supports."
|
|
|
|
},
|
2013-03-13 09:20:38 +09:00
|
|
|
{ "id": "transparency_group",
|
|
|
|
"file": "pdfs/transparency_group.pdf",
|
|
|
|
"md5": "10391f76434128e5da70cff5fc485ff0",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Rotated transparency group with blend mode."
|
|
|
|
},
|
2015-05-14 00:25:42 +09:00
|
|
|
{ "id": "issue6010_1",
|
|
|
|
"file": "pdfs/issue6010_1.pdf",
|
|
|
|
"md5": "b58adce5dbb08936ddb0d904f0da8716",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "load",
|
|
|
|
"password": "abc"
|
|
|
|
},
|
|
|
|
{ "id": "issue6010_2",
|
|
|
|
"file": "pdfs/issue6010_2.pdf",
|
|
|
|
"md5": "73a8091d0ab2a47af5ca45047f04da99",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "load",
|
|
|
|
"password": "\u00E6\u00F8\u00E5",
|
2016-01-14 00:40:21 +09:00
|
|
|
"about": "The password (æøå) is UTF8 encoded."
|
2015-05-14 00:25:42 +09:00
|
|
|
},
|
2015-09-28 05:43:23 +09:00
|
|
|
{ "id": "High-Pressure-Measurement-WP-001287",
|
|
|
|
"file": "pdfs/High-Pressure-Measurement-WP-001287.pdf",
|
|
|
|
"md5": "aeba7e47bbe50cbf08bb8bdff78fec8c",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 3,
|
|
|
|
"lastPage": 3,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-07-23 06:52:44 +09:00
|
|
|
{ "id": "issue3458.pdf",
|
|
|
|
"file": "pdfs/issue3458.pdf",
|
|
|
|
"md5": "dab8bd3ad1acfc8dc82a8381a3c8ff94",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Pattern with current transform different than base transform."
|
|
|
|
},
|
2015-06-05 04:28:14 +09:00
|
|
|
{ "id": "issue6081",
|
|
|
|
"file": "pdfs/issue6081.pdf",
|
|
|
|
"md5": "854326ce9178d10ff4a0ff2aedf67e45",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-12-21 04:14:10 +09:00
|
|
|
{ "id": "issue2462",
|
|
|
|
"file": "pdfs/issue2462.pdf",
|
|
|
|
"md5": "d4e3dddfdd35464c71cf0310bff29b42",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-12-01 00:07:39 +09:00
|
|
|
{ "id": "issue1998",
|
|
|
|
"file": "pdfs/issue1998.pdf",
|
|
|
|
"md5": "586e0213be2f461360ec26770b5a4e48",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 2,
|
2012-12-01 00:07:39 +09:00
|
|
|
"link": true,
|
2013-04-04 02:36:09 +09:00
|
|
|
"type": "eq"
|
2012-12-01 00:07:39 +09:00
|
|
|
},
|
2013-01-31 09:20:04 +09:00
|
|
|
{ "id": "issue2627",
|
|
|
|
"file": "pdfs/issue2627.pdf",
|
|
|
|
"md5": "1b6b2f19f4e1e1b926afb353b41fe6b2",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 4,
|
|
|
|
"lastPage": 4,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2019-05-08 00:44:37 +09:00
|
|
|
{ "id": "issue10519",
|
|
|
|
"file": "pdfs/issue10519_reduced.pdf",
|
|
|
|
"md5": "8a2dae43c0ef47b0734bedaaa24f8c09",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2020-03-24 22:33:43 +09:00
|
|
|
{ "id": "issue11718",
|
|
|
|
"file": "pdfs/issue11718_reduced.pdf",
|
|
|
|
"md5": "a0deea064b4171bb8ea9f6e8a523e594",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-04-16 08:14:07 +09:00
|
|
|
{ "id": "issue3061",
|
|
|
|
"file": "pdfs/issue3061.pdf",
|
|
|
|
"md5": "696a7cb1b194d095ca3f7861779a606b",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "CFF CID font with font matrices in main top dict and sub top dict."
|
|
|
|
},
|
2013-08-23 02:12:16 +09:00
|
|
|
{ "id": "issue3566",
|
|
|
|
"file": "pdfs/issue3566.pdf",
|
|
|
|
"md5": "e9ab02aa769f4c040a6fa52f00d6e3f0",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"lastPage": 1,
|
|
|
|
"about": "CFF font with multiply-encoded glyph and no pdf encoding dict."
|
2013-04-16 08:14:07 +09:00
|
|
|
},
|
2015-07-03 20:14:41 +09:00
|
|
|
{ "id": "bug1151216",
|
|
|
|
"file": "pdfs/bug1151216.pdf",
|
|
|
|
"md5": "e66ea6ee0e0cdd5119224eb073055eca",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "bug1175962",
|
|
|
|
"file": "pdfs/bug1175962.pdf",
|
|
|
|
"md5": "012a5ea0d2733408ac7f8b88f7352bba",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "The same PDF file as in bug1175943."
|
|
|
|
},
|
2020-03-14 06:09:27 +09:00
|
|
|
{ "id": "issue11697",
|
|
|
|
"file": "pdfs/issue11697_reduced.pdf",
|
|
|
|
"md5": "5b3793a76f92b357bd8ccc02e1c54ba0",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-11-04 13:03:52 +09:00
|
|
|
{ "id": "issue1878",
|
|
|
|
"file": "pdfs/issue1878.pdf",
|
|
|
|
"md5": "b4fb0ce7c19368e7104dce3d0d34bcb3",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2013-10-11 04:41:11 +09:00
|
|
|
{ "id": "bigboundingbox",
|
|
|
|
"file": "pdfs/bigboundingbox.pdf",
|
|
|
|
"md5": "e5c5e2cb80826d6ebf535413865270cd",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "When the bounding box for the xobject is transformed it creates a huge bounding box."
|
|
|
|
},
|
2012-12-09 14:04:37 +09:00
|
|
|
{ "id": "issue2386",
|
|
|
|
"file": "pdfs/issue2386.pdf",
|
|
|
|
"md5": "7dc787639aa6765214e9ff5494d231ed",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 2,
|
2012-12-09 14:04:37 +09:00
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-01-10 10:33:59 +09:00
|
|
|
{ "id": "issue1936",
|
|
|
|
"file": "pdfs/issue1936.pdf",
|
|
|
|
"md5": "7302eb9b6a626308e2a933aaed9e1756",
|
|
|
|
"rounds": 1,
|
2013-01-31 04:31:08 +09:00
|
|
|
"lastPage": 1,
|
2013-01-10 10:33:59 +09:00
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
Add a couple more, mostly `text`, reference tests for non-embedded symbolic fonts without included encoding information
I've started to look into how we can fix issue 7580, but quickly became worried that fixing it could easily mean that we'd trade one fixed PDF file for a multitude of broken ones.
Hence I started going through the history of the code that chooses the fallback encoding, and noticed that it has been changed a number of times over the years to deal with various cases of weirdness/errors in non-embedded fonts.
To my relief it turned out that almost all the PRs, please see a possibly incomplete [list here], that changed this code actually included `eq` test-cases.
However, in one case it appears that a PR neglected to add a test-case. Furthermore, since the fallback encoding may also be the only source for creating a `toUnicode` map, changing the encoding could possibly regress only the text-selection despite a PDF file still rendering correctly.
Therefore, this PR adds one new `eq` test, and also a number of additional `text` tests for PDF files already present in the test-suite.
Note that it's obviously possible that there's a certain overlap between the added tests, but I'd be *a whole lot* more concerned with causing regressions.
2016-09-10 20:10:15 +09:00
|
|
|
{ "id": "issue1936-text",
|
|
|
|
"file": "pdfs/issue1936.pdf",
|
|
|
|
"md5": "7302eb9b6a626308e2a933aaed9e1756",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2015-12-30 06:10:38 +09:00
|
|
|
{ "id": "bug951051",
|
|
|
|
"file": "pdfs/bug951051.pdf",
|
|
|
|
"md5": "05d325a5112bd3f6022367dab7bc07b9",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2016-04-12 08:21:44 +09:00
|
|
|
{ "id": "bug1260585",
|
|
|
|
"file": "pdfs/bug1260585.pdf",
|
|
|
|
"md5": "9415b1eb00a43c97c15328cd4c8d136a",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2013-07-09 11:03:59 +09:00
|
|
|
{ "id": "issue3062",
|
|
|
|
"file": "pdfs/issue3062.pdf",
|
|
|
|
"md5": "206715f1258f0e117df4180d98dd4d68",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "issue2442",
|
|
|
|
"file": "pdfs/issue2442.pdf",
|
|
|
|
"md5": "4656e36c44f44c71a499f02ce6c781d3",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-10-14 04:21:15 +09:00
|
|
|
{ "id": "issue2074",
|
|
|
|
"file": "pdfs/issue2074.pdf",
|
|
|
|
"md5": "5e4ba2241fc35d20e44eb52289a569ab",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-08-29 09:19:31 +09:00
|
|
|
{ "id": "colorkeymask",
|
|
|
|
"file": "pdfs/colorkeymask.pdf",
|
|
|
|
"md5": "9f11e815b485f7f0e1fa5c116c636cf9",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2013-01-13 04:21:30 +09:00
|
|
|
},
|
2013-03-14 04:24:55 +09:00
|
|
|
{ "id": "annotation-as",
|
|
|
|
"file": "pdfs/annotation-as.pdf",
|
|
|
|
"md5": "e51500c8adc9edcdcc8ebc6a575c90ab",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-07-03 04:27:06 +09:00
|
|
|
{ "id": "bug888437",
|
|
|
|
"file": "pdfs/bug888437.pdf",
|
|
|
|
"md5": "93370cd589a08732a05601441c899bcb",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-08-20 08:33:20 +09:00
|
|
|
{ "id": "issue3584",
|
|
|
|
"file": "pdfs/issue3584.pdf",
|
|
|
|
"md5": "7a00646865a840eefc76f05c588b60ce",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "CFF font that is drawn with clipping."
|
|
|
|
},
|
2020-09-20 00:47:38 +09:00
|
|
|
{ "id": "prefilled_f1040",
|
|
|
|
"file": "pdfs/prefilled_f1040.pdf",
|
|
|
|
"md5": "2335da66fb7c2c3b84971597f27785e2",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"print": true,
|
|
|
|
"annotationStorage": {
|
2020-11-04 00:04:08 +09:00
|
|
|
"1605R": {
|
|
|
|
"value": true
|
|
|
|
}
|
2020-09-20 00:47:38 +09:00
|
|
|
}
|
|
|
|
},
|
Move svg:clipPath generation from clip to endPath
In the PDF from issue 8527, the clip operator (W) shows up before a path
is defined. The current SVG backend however expects a path to exist
before generating a `<svg:clipPath>` element.
In the example, the path was defined after the clip, followed by an
endPath operator (n).
So this commit fixes the bug by moving the path generation logic from
clip to endPath.
Our canvas backend appears to use similar logic:
`CanvasGraphics_endPath` calls `consumePath`, which in turn draws the
clip and resets the `pendingClip` state. The canvas backend calls
`consumePath` from multiple other places, so we probably need to check
whether doing so is also necessary for the SVG backend.
I scanned our corpus of PDF files in test/pdfs, and found that in every
instance (except for one), the "W" PDF operator (clip) is immediately
followed by "n" (endPath). The new test from this commit (clippath.pdf)
starts with "W", followed by a path definition and then "n".
# Commands used to find some of the clipping commands:
grep -ra '^W$' -C7 | less -S
grep -ra '^W ' -C7 | less -S
grep -ra ' W$' -C7 | less -S
test/pdfs/issue6413.pdf is the only file where "W" (at line 55) is not
followed by "n". In fact, the "W" is the last operation of a series of
XObject painting operations, and removing it does not have any effect
on the rendered PDF (confirmed by looking at the output of PDF.js's
canvas backend, and ImageMagick's convert command).
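The ordering problem can be sketched with a small state tracker. This is an illustration only, assuming a hypothetical `ClipTracker` class. It is not the SVG backend's code, and the path strings are made up. The point is that `clip` merely records a pending clip, and the clip path is generated in `endPath`, by which time the path exists regardless of whether "W" came before or after the path definition.

```javascript
// Hypothetical tracker: defers clip-path generation until endPath.
class ClipTracker {
  constructor() {
    this.currentPath = null;  // path data accumulated so far, if any
    this.pendingClip = false; // set by clip(), consumed by endPath()
    this.clipPaths = [];      // generated clip paths, in order
  }

  // Appends a path segment (e.g. from a rectangle or lineTo operator).
  appendPath(segment) {
    this.currentPath =
      this.currentPath === null ? segment : this.currentPath + " " + segment;
  }

  // "W" operator: only remember that a clip was requested.
  clip() {
    this.pendingClip = true;
  }

  // "n" operator: generate the clip path now, then reset the state.
  endPath() {
    if (this.pendingClip && this.currentPath !== null) {
      this.clipPaths.push(this.currentPath);
    }
    this.pendingClip = false;
    this.currentPath = null;
  }
}

// Clip requested before the path is defined, as in clippath.pdf.
const clipFirst = new ClipTracker();
clipFirst.clip();
clipFirst.appendPath("M0 0 L10 0 L10 10 Z");
clipFirst.endPath();

// The common order: path, then clip, then endPath.
const pathFirst = new ClipTracker();
pathFirst.appendPath("M0 0 L10 0 L10 10 Z");
pathFirst.clip();
pathFirst.endPath();

console.log(clipFirst.clipPaths, pathFirst.clipPaths);
```

Both orderings end up with the same single clip path. Generating it inside `clip` would have found no path in the first case.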
2017-06-19 19:40:48 +09:00
|
|
|
{ "id": "clippath",
|
|
|
|
"file": "pdfs/clippath.pdf",
|
|
|
|
"md5": "7ab95c0f106dccd90d6569f241fe8771",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Clipping before a path exists, followed by adding a path and then drawing a rectangle."
|
|
|
|
},
|
2013-06-01 06:13:23 +09:00
|
|
|
{ "id": "annotation-tx",
|
|
|
|
"file": "pdfs/annotation-tx.pdf",
|
|
|
|
"md5": "56321ea830be9c4f8437ca17ac535b2d",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
2015-12-19 06:29:22 +09:00
|
|
|
"annotations": true,
|
2016-09-18 22:35:12 +09:00
|
|
|
"about": "Text widget annotation without appearance streams."
|
|
|
|
},
|
|
|
|
{ "id": "annotation-tx-forms",
|
|
|
|
"file": "pdfs/annotation-tx.pdf",
|
|
|
|
"md5": "56321ea830be9c4f8437ca17ac535b2d",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"forms": true,
|
|
|
|
"about": "Text widget annotation without appearance streams."
|
|
|
|
},
|
|
|
|
{ "id": "annotation-tx2-annotations",
|
|
|
|
"file": "pdfs/annotation-tx2.pdf",
|
|
|
|
"md5": "b7a32a751895d394fc07bb6ddb40c420",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"annotations": true
|
|
|
|
},
|
|
|
|
{ "id": "annotation-tx2-forms",
|
|
|
|
"file": "pdfs/annotation-tx2.pdf",
|
|
|
|
"md5": "b7a32a751895d394fc07bb6ddb40c420",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"forms": true
|
|
|
|
},
|
|
|
|
{ "id": "annotation-tx3-annotations",
|
|
|
|
"file": "pdfs/annotation-tx3.pdf",
|
|
|
|
"md5": "3aec45c6465ca4959c25df17c4356a1c",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"annotations": true
|
|
|
|
},
|
|
|
|
{ "id": "annotation-tx3-forms",
|
|
|
|
"file": "pdfs/annotation-tx3.pdf",
|
|
|
|
"md5": "3aec45c6465ca4959c25df17c4356a1c",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"forms": true
|
2013-06-01 06:13:23 +09:00
|
|
|
},
|
2013-03-19 21:37:31 +09:00
|
|
|
{ "id": "gesamt",
|
|
|
|
"file": "pdfs/gesamt.pdf",
|
2013-12-17 09:37:10 +09:00
|
|
|
"md5": "743aaa6f46ed0a42864f079d632d942e",
|
2013-03-19 21:37:31 +09:00
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-03-03 13:38:40 +09:00
|
|
|
{ "id": "javauninstall-7",
|
2015-09-16 20:52:57 +09:00
|
|
|
"file": "pdfs/javauninstall-7r.pdf",
|
|
|
|
"md5": "c0089082a86826fa0195eeeb73f7f895",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
2013-03-03 13:38:40 +09:00
|
|
|
},
|
2013-02-23 20:46:20 +09:00
|
|
|
{ "id": "jst2007-5",
|
|
|
|
"file": "pdfs/JST2007-5.pdf",
|
|
|
|
"md5": "9efa6c37fc771b36a60535036d1910bb",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 2,
|
|
|
|
"lastPage": 2,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-03-24 17:48:39 +09:00
|
|
|
{ "id": "kdchart",
|
|
|
|
"file": "pdfs/kdchart.pdf",
|
|
|
|
"md5": "2556a1d197d7cbe1f5edfc5c3557582b",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-03-16 19:44:54 +09:00
|
|
|
{ "id": "mao",
|
|
|
|
"file": "pdfs/mao.pdf",
|
|
|
|
"md5": "797093d67c4d4d4231ac6e1fb66bf6c3",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2017-03-25 01:24:30 +09:00
|
|
|
{ "id": "mao-text",
|
|
|
|
"file": "pdfs/mao.pdf",
|
|
|
|
"md5": "797093d67c4d4d4231ac6e1fb66bf6c3",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2013-01-13 04:21:30 +09:00
|
|
|
{ "id": "noembed-identity",
|
|
|
|
"file": "pdfs/noembed-identity.pdf",
|
|
|
|
"md5": "05d3803b6c22451e18cb60d8d8c75c0c",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2013-01-12 10:10:09 +09:00
|
|
|
},
|
2013-07-02 23:46:14 +09:00
|
|
|
{ "id": "bug887152",
|
|
|
|
"file": "pdfs/bug887152.pdf",
|
|
|
|
"md5": "783a3e7b1de2cf40a47ffe1f36a41d4f",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-03-10 21:07:09 +09:00
|
|
|
{ "id": "bug1140761",
|
|
|
|
"file": "pdfs/bug1140761.pdf",
|
|
|
|
"md5": "b74eced7634d4f248dc6265f8225d432",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-08-19 07:57:52 +09:00
|
|
|
{ "id": "bug1011159",
|
|
|
|
"file": "pdfs/bug1011159.pdf",
|
|
|
|
"md5": "4532e22deb92d4cd2992d0cfe255582a",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Type3 font with negative HScale and font size"
|
|
|
|
},
|
2014-10-05 07:12:47 +09:00
|
|
|
{ "id": "bug1077808",
|
|
|
|
"file": "pdfs/bug1077808.pdf",
|
|
|
|
"md5": "4a4bfc27e3fafe2f74e7a4a4cd04b8dc",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Inline image with ASCII85Decode filter."
|
|
|
|
},
|
2017-08-21 21:34:15 +09:00
|
|
|
{ "id": "issue8798",
|
2017-08-25 00:32:53 +09:00
|
|
|
"file": "pdfs/issue8798r.pdf",
|
|
|
|
"md5": "3a0e29f013d9edcceb5d852e37738a77",
|
2017-08-26 19:09:49 +09:00
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "issue8823",
|
|
|
|
"file": "pdfs/issue8823.pdf",
|
|
|
|
"md5": "ad02d4aa374b315bf1766038d002d57a",
|
|
|
|
"link": false,
|
2017-08-21 21:34:15 +09:00
|
|
|
"rounds": 1,
|
Take the dictionary, and not just the image data, into account when caching inline images (issue 9398)
The reason for the bug is that we only compute a checksum of the image data itself and completely ignore the inline dictionary. The latter is important, since in practice it's not uncommon for inline images to be identical but use e.g. different ColourSpaces.
There are obviously a couple of different ways that we could compute a hash/checksum of the dictionary.
Initially I tried using `MurmurHash3_64` to compute a hash of the keys/values in the dictionary. Unfortunately this approach turned out to be *way* too slow in practice, especially for PDF files with a huge number of inline images; in particular issue 2618 would regress quite badly with this solution.
The solution that is instead implemented in this patch is to compute a checksum of the dictionary contents. While this is a much simpler, not to mention a lot more efficient, solution, there's one drawback associated with it:
If the contents of inline image dictionaries are ordered differently, they will not be considered equal with this approach, which could thus lead to failures to cache repeated inline images. In practice this doesn't seem to be a problem in any of the PDF files I've tested, and generally I'd rather err on the side of *not* caching given that too aggressive caching can easily lead to rendering bugs.
One small, but somewhat annoying, complication is that by the time `Parser.makeInlineImage` is called, we no longer know the *exact* stream position where the inline image dictionary starts. Having access to that information is crucial here, and the easiest solution I could come up with is to track this in the current `Lexer` instance.[1]
With the patch, we're thus able to fix the referenced issues without incurring large regressions in problematic cases such as issue 2618.
Fixes 9398; also improves/fixes the `issue8823` reference test.
---
[1] Obviously I'd have preferred if this patch could be limited to `Parser.makeInlineImage`, without the need for this "hack", but I'm not sure what that'd look like here.
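The caching idea in this message can be illustrated with a short sketch. It is not the PDF.js implementation: the function name `inline_image_cache_key`, the key format, and the use of `zlib.adler32` as the checksum are all assumptions made for the example. It shows the two properties the message describes: the dictionary bytes now influence the key, and a reordered dictionary yields a different key (a cache miss rather than a wrong hit).

```python
import zlib


def inline_image_cache_key(dict_bytes, image_bytes):
    """Build a cache key from the raw dictionary bytes and the image data.

    Hypothetical helper: both parts are checksummed as raw bytes, so no
    dictionary parsing or key sorting is needed.
    """
    dict_sum = zlib.adler32(dict_bytes)
    image_sum = zlib.adler32(image_bytes)
    return f"{dict_sum}_{image_sum}"


image = bytes(range(12))                # identical image data in every case
gray = b"/W 2 /H 2 /BPC 8 /CS /G"
rgb = b"/W 2 /H 2 /BPC 8 /CS /RGB"      # same data, different colour space
reordered = b"/H 2 /W 2 /BPC 8 /CS /G"  # same entries as `gray`, other order

key = inline_image_cache_key(gray, image)
print(key == inline_image_cache_key(gray, image))       # True
print(key == inline_image_cache_key(rgb, image))        # False
print(key == inline_image_cache_key(reordered, image))  # False
```

Checksumming raw bytes avoids walking the keys and values of every dictionary, which is the cost the message attributes to the hashing approach; the trade-off is the order sensitivity shown in the last line.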
2018-01-30 20:26:33 +09:00
|
|
|
"type": "eq",
|
|
|
|
"about": "Also tests issue9398."
|
2017-08-21 21:34:15 +09:00
|
|
|
},
|
2017-07-06 02:38:44 +09:00
|
|
|
{ "id": "issue8613",
|
|
|
|
"file": "pdfs/issue8613.pdf",
|
|
|
|
"md5": "bc7ad2db75710aa9916c5769e0c02123",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2018-02-02 04:43:01 +09:00
|
|
|
{ "id": "issue8614",
|
|
|
|
"file": "pdfs/issue8614.pdf",
|
|
|
|
"md5": "7e8b66cf674ac2b79d6b267d0c6f2fa2",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2020-01-18 20:53:40 +09:00
|
|
|
{ "id": "issue10880",
|
|
|
|
"file": "pdfs/issue10880.pdf",
|
|
|
|
"md5": "244ee5ee3ab88db8d8eb51d4416e2c97",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 7,
|
|
|
|
"lastPage": 7,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2020-07-06 20:02:41 +09:00
|
|
|
{ "id": "issue10989",
|
|
|
|
"file": "pdfs/issue10989.pdf",
|
|
|
|
"md5": "c16de154d9ae6dbeec0a113911957efe",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2018-07-16 01:53:08 +09:00
|
|
|
{ "id": "issue9650",
|
|
|
|
"file": "pdfs/issue9650.pdf",
|
|
|
|
"md5": "20d50bda6b1080b6d9088811299c791e",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2018-05-16 21:26:12 +09:00
|
|
|
{ "id": "issue9679",
|
|
|
|
"file": "pdfs/issue9679.pdf",
|
|
|
|
"md5": "3077d06add3875705aa1021c7b116023",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2019-08-08 17:31:48 +09:00
|
|
|
{ "id": "issue11052",
|
|
|
|
"file": "pdfs/issue11052.pdf",
|
|
|
|
"md5": "67c20e5bec51183c9283baa0cd49593f",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-12-09 09:07:38 +09:00
|
|
|
{ "id": "bug1108753",
|
|
|
|
"file": "pdfs/bug1108753.pdf",
|
|
|
|
"md5": "a7aaf92d55b4602afb0ca3d75198b56b",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-02-14 05:17:27 +09:00
|
|
|
{ "id": "issue5726",
|
|
|
|
"file": "pdfs/issue5726.pdf",
|
|
|
|
"md5": "f52f31ad3da316b599cade875ab049db",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-05-02 08:42:25 +09:00
|
|
|
{ "id": "bug816075",
|
|
|
|
"file": "pdfs/bug816075.pdf",
|
|
|
|
"md5": "7ec87c115c1f9ec41234cc7002555e82",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "A CIDFontType0 font with a CFF font that isn't actually CID."
|
|
|
|
},
|
2019-05-24 08:47:22 +09:00
|
|
|
{ "id": "scorecard_reduced",
|
|
|
|
"file": "pdfs/scorecard_reduced.pdf",
|
|
|
|
"md5": "aa8ed0827092c963eea64adb718a3806",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-10-01 13:40:28 +09:00
|
|
|
{ "id": "bug921409",
|
|
|
|
"file": "pdfs/bug921409.pdf",
|
|
|
|
"md5": "920e88dde0f5436ebe0df0281e1c30ca",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "A CIDFontType0 font that actually has a Type1C font file."
|
|
|
|
},
|
2013-01-24 01:15:02 +09:00
|
|
|
{ "id": "noembed-identity-2",
|
|
|
|
"file": "pdfs/noembed-identity-2.pdf",
|
2013-01-30 02:46:17 +09:00
|
|
|
"md5": "13b7d9ab9579d45c10bc8d499d087f21",
|
2013-01-24 01:15:02 +09:00
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-01-17 00:13:34 +09:00
|
|
|
{ "id": "noembed-jis7",
|
|
|
|
"file": "pdfs/noembed-jis7.pdf",
|
|
|
|
"md5": "a0f6cf5a830f23d0c35994a6aaf92b3d",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "noembed-eucjp",
|
|
|
|
"file": "pdfs/noembed-eucjp.pdf",
|
|
|
|
"md5": "d270f2d46db99b70235b4d37cbc313ad",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
|
|
|
{ "id": "noembed-sjis",
|
|
|
|
"file": "pdfs/noembed-sjis.pdf",
|
|
|
|
"md5": "51f9d150bf4afe498019b3029d451072",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-03-02 15:28:27 +09:00
|
|
|
{ "id": "ohkubo-ss04",
|
|
|
|
"file": "pdfs/ohkubo-SS04.pdf",
|
|
|
|
"md5": "b8c334073ff5be74fac1f36130943ea5",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-03-15 06:06:44 +09:00
|
|
|
{ "id": "issue2770",
|
|
|
|
"file": "pdfs/issue2770.pdf",
|
|
|
|
"md5": "36070d756d06eaa35c2227efb069fb1b",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Has a 4 bit per component image with mask and decode."
|
|
|
|
},
|
2013-06-26 02:33:53 +09:00
|
|
|
{ "id": "issue2984",
|
|
|
|
"file": "pdfs/issue2984.pdf",
|
2013-12-17 09:37:10 +09:00
|
|
|
"md5": "a8e81e7b7c6d8b0ec26bbfee9954f110",
|
2013-06-26 02:33:53 +09:00
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq",
|
|
|
|
"lastPage": 1,
|
|
|
|
"about": "Type3 fonts with lots of switching between them."
|
|
|
|
},
|
2013-05-01 07:29:25 +09:00
|
|
|
{ "id": "bug808084",
|
|
|
|
"file": "pdfs/bug808084.pdf",
|
|
|
|
"md5": "b1c400de699af29ea3f1983bb26870ab",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2013-04-11 01:51:06 +09:00
|
|
|
{ "id": "issue3064",
|
|
|
|
"file": "pdfs/issue3064.pdf",
|
|
|
|
"md5": "0307415b7d69b13acaf8bd4285d9544b",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "True type font with encoding dict with no base encoding but with differences."
|
|
|
|
},
|
2019-01-05 08:13:13 +09:00
|
|
|
{ "id": "issue10402",
|
|
|
|
"file": "pdfs/issue10402.pdf",
|
|
|
|
"md5": "7936bd34d7c0aebd0a864b5aa98aa1b4",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
Add a couple more, mostly `text`, reference tests for non-embedded symbolic fonts without included encoding information
I've started to look into how we can fix issue 7580, but quickly became worried that fixing it could easily mean that we'd trade one fixed PDF file for a multitude of broken ones.
Hence I started going through the history of the code that chooses the fallback encoding, and noticed that it has been changed a number of times over the years to deal with various cases of weirdness/errors in non-embedded fonts.
To my relief it turned out that almost all the PRs, please see a possibly incomplete [list here], that changed this code actually included `eq` test-cases.
However, in one case it appears that a PR failed to add a test-case. Furthermore, since the fallback encoding may also be the only source for creating a `toUnicode` map, changing the encoding could possibly regress only the text-selection despite a PDF file still rendering correctly.
Therefore, this PR adds one new `eq` test, and also a number of additional `text` tests for PDF files already present in the test-suite.
Note that it's obviously possible that there's a certain overlap between the added tests, but I'd be *a whole lot* more concerned with causing regressions.
2016-09-10 20:10:15 +09:00
|
|
|
{ "id": "issue3064-text",
|
|
|
|
"file": "pdfs/issue3064.pdf",
|
|
|
|
"md5": "0307415b7d69b13acaf8bd4285d9544b",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2014-10-19 05:29:21 +09:00
|
|
|
{ "id": "issue5421",
|
|
|
|
"file": "pdfs/issue5421.pdf",
|
|
|
|
"md5": "273f6813758a2349090003c7c8a0d85e",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Invisible Type3 font used for text selection and searching."
|
|
|
|
},
|
2015-03-03 20:58:20 +09:00
|
|
|
{ "id": "issue5421-text",
|
|
|
|
"file": "pdfs/issue5421.pdf",
|
|
|
|
"md5": "273f6813758a2349090003c7c8a0d85e",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "text"
|
|
|
|
},
|
2015-09-05 19:29:16 +09:00
|
|
|
{ "id": "pr4922",
|
|
|
|
"file": "pdfs/pr4922.pdf",
|
|
|
|
"md5": "178a55510931bc6ecd2f0f848a8fcacc",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Type3 font with a missing /CharProcs resource."
|
|
|
|
},
|
2015-02-05 23:25:23 +09:00
|
|
|
{ "id": "issue5701",
|
|
|
|
"file": "pdfs/issue5701.pdf",
|
|
|
|
"md5": "7ec476aee12e8bd6be79140223d329c1",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-09-09 22:29:31 +09:00
|
|
|
{ "id": "issue5280",
|
|
|
|
"file": "pdfs/issue5280.pdf",
|
|
|
|
"md5": "0ea1230e2964e74cb6db063a89b78803",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "DecodeParams can be an indirect object"
|
|
|
|
},
|
2013-06-22 07:03:03 +09:00
|
|
|
{ "id": "bug878194",
|
|
|
|
"file": "pdfs/bug878194.pdf",
|
|
|
|
"md5": "c616b21fd2a1a65acc2de0f41e59a8b5",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"firstPage": 4,
|
|
|
|
"lastPage": 4,
|
2014-10-23 21:19:05 +09:00
|
|
|
"type": "eq"
|
2013-06-22 07:03:03 +09:00
|
|
|
},
|
2013-03-02 11:02:35 +09:00
|
|
|
{ "id": "p020121130574743273239",
|
|
|
|
"file": "pdfs/P020121130574743273239.pdf",
|
|
|
|
"md5": "271b65885d42d174cbc597ca89becb1a",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-02-26 21:06:07 +09:00
|
|
|
{ "id": "sfaa_japanese",
|
|
|
|
"file": "pdfs/SFAA_Japanese.pdf",
|
|
|
|
"md5": "b961bbc0d05bdd6d91041bca60ec8e8b",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-02-08 21:29:22 +09:00
|
|
|
{ "id": "vertical",
|
|
|
|
"file": "pdfs/vertical.pdf",
|
|
|
|
"md5": "8a74d33504701edcefeef2afd022765e",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2018-01-05 07:43:07 +09:00
|
|
|
{ "id": "bug1425312",
|
|
|
|
"file": "pdfs/bug1425312.pdf",
|
|
|
|
"md5": "5b1e7d3e4ba7792fab2b69d1836df5a9",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"firstPage": 2,
|
|
|
|
"lastPage": 2,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-11-03 07:07:13 +09:00
|
|
|
{ "id": "issue3438",
|
|
|
|
"file": "pdfs/issue3438.pdf",
|
|
|
|
"md5": "5aa3340b0920b65a377f697587668f89",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-10-16 05:56:29 +09:00
|
|
|
{ "id": "bug1072164",
|
|
|
|
"file": "pdfs/bug1072164.pdf",
|
|
|
|
"md5": "cfee3c51e8464aa44218f4eaf27e084b",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "CMYK jpeg with mask"
|
|
|
|
},
|
2013-06-26 06:22:03 +09:00
|
|
|
{ "id": "bug886717",
|
|
|
|
"file": "pdfs/bug886717.pdf",
|
|
|
|
"md5": "8ba614192797a1324765610231a1bc9d",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "load",
|
|
|
|
"about": "Annotation that has no resources."
|
|
|
|
},
|
2013-05-17 01:39:39 +09:00
|
|
|
{ "id": "issue3263",
|
2015-11-14 00:47:02 +09:00
|
|
|
"file": "pdfs/issue3263r.pdf",
|
|
|
|
"md5": "0ef82d7a6998c3919cf5595dd47b31a6",
|
2013-05-17 01:39:39 +09:00
|
|
|
"rounds": 1,
|
2015-11-14 00:47:02 +09:00
|
|
|
"link": false,
|
2013-05-17 01:39:39 +09:00
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-04-04 02:36:09 +09:00
|
|
|
{ "id": "issue2761",
|
|
|
|
"file": "pdfs/issue2761.pdf",
|
|
|
|
"md5": "35df0b8cff4afec0c08f08c6a5bc9857",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Image with indexed colorspace that has a base lab colorspace."
|
|
|
|
},
|
2013-03-10 11:21:44 +09:00
|
|
|
{ "id": "20130226130259",
|
|
|
|
"file": "pdfs/20130226130259.pdf",
|
|
|
|
"md5": "c33e90a1b369c508573023d2434b950f",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-01-12 10:10:09 +09:00
|
|
|
{ "id": "issue2099-1",
|
|
|
|
"file": "pdfs/issue2099-1.pdf",
|
|
|
|
"md5": "c7eca682d70a976dfc4b7e64d3e9f1ce",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2013-02-28 09:56:12 +09:00
|
|
|
},
|
2013-05-17 01:21:47 +09:00
|
|
|
{ "id": "issue3207",
|
2015-11-15 21:27:48 +09:00
|
|
|
"file": "pdfs/issue3207r.pdf",
|
|
|
|
"md5": "617bcf475c109365a113cf9fb6711b0a",
|
|
|
|
"link": false,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2013-05-17 01:21:47 +09:00
|
|
|
},
|
2014-06-14 08:22:22 +09:00
|
|
|
{ "id": "issue3591",
|
|
|
|
"file": "pdfs/issue3591.pdf",
|
|
|
|
"md5": "f76b3e9d1a44621b73063cf10556c6ff",
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "JPX with 0xFF55 marker"
|
|
|
|
},
|
2014-10-22 17:40:50 +09:00
|
|
|
{ "id": "bug865858",
|
|
|
|
"file": "pdfs/bug865858.pdf",
|
|
|
|
"md5": "7a81bd987dc1d95e9a0be46b7c3f2e18",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "JPX packets"
|
|
|
|
},
|
2013-06-26 02:35:34 +09:00
|
|
|
{ "id": "bug766138",
|
|
|
|
"file": "pdfs/bug766138.pdf",
|
|
|
|
"md5": "b171f5cf8d9834348112fba60ee54f8c",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-07-12 03:33:29 +09:00
|
|
|
{ "id": "bug889327",
|
|
|
|
"file": "pdfs/bug889327.pdf",
|
|
|
|
"md5": "b45cd63419241c40731f98d0e1dac082",
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2020-10-22 00:21:33 +09:00
|
|
|
{ "id": "bug1669099",
|
|
|
|
"file": "pdfs/bug1669099.pdf",
|
|
|
|
"md5": "34421549d58e2b6eeddc674759381f7d",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"print": true,
|
|
|
|
"annotationStorage": {
|
2020-11-04 00:04:08 +09:00
|
|
|
"29R": {
|
|
|
|
"value": true
|
|
|
|
},
|
|
|
|
"33R": {
|
|
|
|
"value": true
|
|
|
|
},
|
|
|
|
"37R": {
|
|
|
|
"value": true
|
|
|
|
},
|
|
|
|
"65R": {
|
|
|
|
"value": true
|
|
|
|
},
|
|
|
|
"69R": {
|
|
|
|
"value": true
|
|
|
|
}
|
2020-10-22 00:21:33 +09:00
|
|
|
}
|
|
|
|
},
|
2013-11-02 07:13:31 +09:00
|
|
|
{ "id": "issue1171.pdf",
|
|
|
|
"file": "pdfs/issue1171.pdf",
|
|
|
|
"md5": "2a6188a42a5874c7874b88eebd4acaf0",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-08-11 17:35:56 +09:00
|
|
|
{ "id": "issue3521.pdf",
|
|
|
|
"file": "pdfs/issue3521.pdf",
|
|
|
|
"md5": "df95d31443e20a38efa29c3a635a045b",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Support for CMap GBKp-EUC-H"
|
|
|
|
},
|
2013-02-28 09:56:12 +09:00
|
|
|
{ "id": "issue2829",
|
|
|
|
"file": "pdfs/issue2829.pdf",
|
|
|
|
"md5": "f32b28cf8792f6ccc470446bfbb38584",
|
|
|
|
"link": true,
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2012-12-27 16:35:25 +09:00
|
|
|
},
|
2013-03-18 22:06:59 +09:00
|
|
|
{ "id": "issue2956",
|
|
|
|
"file": "pdfs/issue2956.pdf",
|
|
|
|
"md5": "d8f68cbbb4bf54cde9f7f878acb6d7cd",
|
|
|
|
"rounds": 1,
|
2013-11-02 06:30:28 +09:00
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-11-09 02:38:36 +09:00
|
|
|
{ "id": "issue2799",
|
|
|
|
"file": "pdfs/issue2799.pdf",
|
|
|
|
"md5": "3d3224eae54bbae5fc76224a2af49486",
|
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-11-02 06:30:28 +09:00
|
|
|
{ "id": "issue3025",
|
|
|
|
"file": "pdfs/issue3025.pdf",
|
|
|
|
"md5": "8e4e8eacbd7c4c248deeca0ec49d38da",
|
|
|
|
"rounds": 1,
|
2013-03-18 22:06:59 +09:00
|
|
|
"type": "eq"
|
|
|
|
},
|
2012-12-27 16:35:25 +09:00
|
|
|
{ "id": "issue2177-eq",
|
|
|
|
"file": "pdfs/issue2177.pdf",
|
|
|
|
"md5": "48a808278bf31de8414c4e03ecd0900a",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
2013-06-25 00:21:12 +09:00
|
|
|
},
|
|
|
|
{ "id": "issue3384",
|
|
|
|
"file": "pdfs/issue3384.pdf",
|
|
|
|
"md5": "57e31e83c165f16609528ad5ec5825ba",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
2013-08-09 02:02:11 +09:00
|
|
|
},
|
2013-04-26 04:26:42 +09:00
|
|
|
{ "id": "calgray",
|
|
|
|
"file": "pdfs/calgray.pdf",
|
|
|
|
"md5": "ee784999bfa1ed373f55cdabbb580df1",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-03-18 03:38:00 +09:00
|
|
|
{ "id": "calrgb",
|
|
|
|
"file": "pdfs/calrgb.pdf",
|
2014-08-16 17:37:52 +09:00
|
|
|
"md5": "625068e9a7dd80e4f70b24ce97b3ec5c",
|
2014-03-18 03:38:00 +09:00
|
|
|
"rounds": 1,
|
|
|
|
"lastPage": 8,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-11-18 22:48:06 +09:00
|
|
|
{ "id": "bug900822",
|
|
|
|
"file": "pdfs/bug900822.pdf",
|
|
|
|
"md5": "70e2a3c5922574eeda169c955cf9d084",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "load"
|
|
|
|
},
|
2013-08-09 02:02:11 +09:00
|
|
|
{ "id": "issue2853",
|
|
|
|
"file": "pdfs/issue2853.pdf",
|
|
|
|
"md5": "9f0ad95ef0b243ee8813c4eca0f7a042",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "load"
|
2013-09-27 03:49:02 +09:00
|
|
|
},
|
2014-04-25 01:48:18 +09:00
|
|
|
{ "id": "issue4668",
|
|
|
|
"file": "pdfs/issue4668.pdf",
|
|
|
|
"md5": "a749d5ca995ad745411406d29156b04e",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-09-27 03:49:02 +09:00
|
|
|
{ "id": "issue3666",
|
|
|
|
"file": "pdfs/issue3666.pdf",
|
2016-01-12 05:24:50 +09:00
|
|
|
"md5": "cbcaf533d8a4e825d7f12cb4f137babd",
|
2013-09-27 03:49:02 +09:00
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 2,
|
|
|
|
"type": "eq"
|
2013-11-07 02:06:00 +09:00
|
|
|
},
|
2020-08-22 07:25:07 +09:00
|
|
|
{ "id": "issue12120_reduced",
|
|
|
|
"file": "pdfs/issue12120_reduced.pdf",
|
|
|
|
"md5": "b4570dcee26ac3121ad3322e19ed1a6a",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2020-09-19 22:29:55 +09:00
|
|
|
{ "id": "issue12392",
|
|
|
|
"file": "pdfs/issue12392.pdf",
|
|
|
|
"md5": "76c3a34c6520940c45c66c92f7df2de5",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-06-03 22:56:16 +09:00
|
|
|
{ "id": "issue4883",
|
|
|
|
"file": "pdfs/issue4883.pdf",
|
|
|
|
"md5": "2fac0d9a189ca5fcef8626153d050be8",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "CMYK JPEG with Decode"
|
|
|
|
},
|
2013-11-14 04:45:59 +09:00
|
|
|
{ "id": "bug903856",
|
|
|
|
"file": "pdfs/bug903856.pdf",
|
|
|
|
"md5": "286eaa9d06a5809f4f08f2093cef8f3f",
|
|
|
|
"rounds": 1,
|
|
|
|
"firstPage": 1,
|
|
|
|
"lastPage": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2013-11-07 02:06:00 +09:00
|
|
|
{ "id": "issue3205",
|
2015-09-10 23:59:30 +09:00
|
|
|
"file": "pdfs/issue3205r.pdf",
|
|
|
|
"md5": "379cd3f2f0d651215c6df5ac6182d013",
|
2013-11-07 02:06:00 +09:00
|
|
|
"rounds": 1,
|
2015-09-10 23:59:30 +09:00
|
|
|
"link": false,
|
2013-11-07 02:06:00 +09:00
|
|
|
"type": "eq"
|
2014-02-06 03:58:14 +09:00
|
|
|
},
|
2015-08-04 00:34:30 +09:00
|
|
|
{ "id": "issue4227",
|
|
|
|
"file": "pdfs/coons-allflags-withfunction.pdf",
|
|
|
|
"md5": "c5f79c24bf9eb66698be0e4ecaa1bdf8",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-10-23 22:15:06 +09:00
|
|
|
{ "id": "issue4575",
|
|
|
|
"file": "pdfs/issue4575.pdf",
|
|
|
|
"md5": "9ea15032afd330916a4d7475cbdb55f6",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-08-05 06:55:55 +09:00
|
|
|
{ "id": "issue6305-part-1",
|
|
|
|
"file": "pdfs/tensor-allflags-withfunction.pdf",
|
|
|
|
"md5": "b47260d50f6a0f26aaccbfa3b7176ae8",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-02-06 03:58:14 +09:00
|
|
|
{ "id": "issue4246",
|
|
|
|
"file": "pdfs/issue4246.pdf",
|
|
|
|
"md5": "ed81787b83cc317c9f049643b853bea3",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Image mask in higher resolution than the image itself"
|
2014-03-19 21:25:46 +09:00
|
|
|
},
|
2018-01-05 07:43:07 +09:00
|
|
|
{ "id": "PDFJS-9279-reduced",
|
|
|
|
"file": "pdfs/PDFJS-9279-reduced.pdf",
|
|
|
|
"md5": "a562a25596e9fe571ac6fb5b9f561974",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-03-19 21:25:46 +09:00
|
|
|
{ "id": "issue4436",
|
2015-09-10 19:49:41 +09:00
|
|
|
"file": "pdfs/issue4436r.pdf",
|
|
|
|
"md5": "4e43d692d213f56674fcac92110c7364",
|
2014-03-19 21:25:46 +09:00
|
|
|
"rounds": 1,
|
2015-09-10 19:49:41 +09:00
|
|
|
"link": false,
|
2014-03-19 21:25:46 +09:00
|
|
|
"type": "eq"
|
2014-06-13 09:37:41 +09:00
|
|
|
},
|
|
|
|
{ "id": "issue4926",
|
|
|
|
"file": "pdfs/issue4926.pdf",
|
|
|
|
"md5": "ed881c8ea2f9bc4be94ecb7f2b2c149b",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
2014-12-18 06:42:06 +09:00
|
|
|
},
|
2017-12-29 22:39:29 +09:00
|
|
|
{ "id": "decodeACSuccessive",
|
|
|
|
"file": "pdfs/decodeACSuccessive.pdf",
|
|
|
|
"md5": "7749c032624fe27ab8e8d7d5e9a4a93f",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": false,
|
[api-minor] Decode all JPEG images with the built-in PDF.js decoder in `src/core/jpg.js`
Currently some JPEG images are decoded by the built-in PDF.js decoder in `src/core/jpg.js`, while others attempt to use the browser JPEG decoder. This inconsistency seems unfortunate for a number of reasons:
- It adds, compared to the other image formats supported in the PDF specification, a fair amount of code/complexity to the image handling in the PDF.js library.
- The PDF specification supports JPEG images with features, e.g. certain ColorSpaces, that browsers are unable to decode natively. Hence, determining if a JPEG image is possible to decode natively in the browser requires a non-trivial amount of parsing. In particular, we're parsing (part of) the raw JPEG data to extract certain marker data and we also need to parse the ColorSpace for the JPEG image.
- While some JPEG images may, for all intents and purposes, appear to be natively supported, there are still cases where the browser may fail to decode some JPEG images. In order to support those cases, we've had to implement a fallback to the PDF.js JPEG decoder if there are any issues during the native decoding. This also means that it's no longer possible to simply send the JPEG image to the main-thread and continue parsing, but you now need to actually wait for the main-thread to indicate success/failure first.
In practice this means that there's a code-path where the worker-thread is forced to wait for the main-thread, while the reverse should *always* be the case.
- The native decoding, for anything except the *simplest* of JPEG images, results in increased peak memory usage because there's a handful of short-lived copies of the JPEG data (see PR 11707).
Furthermore this also leads to data being *parsed* on the main-thread, rather than the worker-thread, which you usually want to avoid for e.g. performance and UI-responsiveness reasons.
- Not all environments, e.g. Node.js, fully support native JPEG decoding. This has, historically, led to some issues and support requests.
- Different browsers may use different JPEG decoders, possibly leading to images being rendered slightly differently depending on the platform/browser where the PDF.js library is used.
Originally the implementation in `src/core/jpg.js` was unable to handle all of the JPEG images in the test-suite, but over the last couple of years I've fixed (hopefully) all of those issues.
At this point in time, there are two kinds of failure with this patch:
- Changes which are basically imperceptible to the naked eye, where some pixels in the images are essentially off-by-one (in all components), which could probably be attributed to things such as different rounding behaviour in the browser/PDF.js JPEG decoder.
This type of "failure" accounts for the *vast* majority of the total number of changes in the reference tests.
- Changes where the JPEG images now look *ever so slightly* blurrier than with the native browser decoder. For quite some time I've just assumed that this pointed to a general deficiency in the `src/core/jpg.js` implementation; however, I've discovered when comparing two viewers side-by-side that the differences vanish at higher zoom levels (usually around 200% is enough).
Basically if you disable [this downscaling in canvas.js](https://github.com/mozilla/pdf.js/blob/8fb82e939cf0c8618a4e775ff17fc96f726872b5/src/display/canvas.js#L2356-L2395), which is what happens when zooming in, the differences simply vanish!
Hence I'm pretty satisfied that there are no significant problems with the `src/core/jpg.js` implementation, and the problems are rather tied to the general quality of the downscaling algorithm used. It could even be seen as a positive that *all* images now share the same downscaling behaviour, since this actually fixes one old bug; see issue 7041.
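The first kind of failure listed above (pixels that are off by one in each component) can be told apart from real rendering changes with a small check like the following. This is a sketch only, not how the PDF.js reference tests compare snapshots; the helper names and the flat RGBA byte layout are assumptions made for the example.

```python
def max_channel_delta(a, b):
    """Largest absolute per-component difference between two pixel buffers.

    Assumption: both buffers are flat byte sequences of equal length,
    e.g. RGBA data taken from two renderings of the same page.
    """
    if len(a) != len(b):
        raise ValueError("buffers must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0)


def is_rounding_noise(a, b, tolerance=1):
    """True if no component differs by more than `tolerance`."""
    return max_channel_delta(a, b) <= tolerance


# Two RGBA pixels; the first differs by one in each colour component.
native = bytes([120, 64, 200, 255, 10, 10, 10, 255])
builtin = bytes([121, 65, 201, 255, 10, 10, 10, 255])
print(max_channel_delta(native, builtin))  # 1
print(is_rounding_noise(native, builtin))  # True
```

A check like this would not flag the second kind of failure as harmless, since blur from downscaling typically changes components by more than one.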
2020-01-20 20:10:16 +09:00
|
|
|
"type": "eq"
|
2017-12-29 22:39:29 +09:00
|
|
|
},
|
2014-12-30 06:28:03 +09:00
|
|
|
{ "id": "issue5592",
|
|
|
|
"file": "pdfs/issue5592.pdf",
|
|
|
|
"md5": "a0750f95afa80c880f7966df7062616c",
|
|
|
|
"rounds": 1,
|
|
|
|
"link": true,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-11-25 16:44:06 +09:00
|
|
|
{ "id": "issue6296.pdf",
|
|
|
|
"file": "pdfs/issue6296.pdf",
|
|
|
|
"md5": "734e191aab1372e6fd7523ca7751fcf0",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2015-12-24 02:17:23 +09:00
|
|
|
{ "id": "issue6298.pdf",
|
|
|
|
"file": "pdfs/issue6298.pdf",
|
|
|
|
"md5": "214340be34f463611fc3127ad0695034",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq"
|
|
|
|
},
|
2014-12-18 06:42:06 +09:00
|
|
|
{ "id": "issue5549.pdf",
|
|
|
|
"file": "pdfs/issue5549.pdf",
|
|
|
|
"md5": "6c36df6ebc583c9e18aad0ad00d257b8",
|
|
|
|
"rounds": 1,
|
|
|
|
"type": "eq",
|
|
|
|
"about": "Free image obtained from www.unsplash.com"
|
2014-12-18 06:46:47 +09:00
|
|
|
},
    { "id": "issue5475.pdf",
      "file": "pdfs/issue5475.pdf",
      "md5": "bda962373570ac4dfe0fbd1ad4f0d9ef",
      "rounds": 1,
      "type": "eq",
      "about": "Free image obtained from www.unsplash.com"
    },
    { "id": "annotation-border-styles.pdf",
      "file": "pdfs/annotation-border-styles.pdf",
      "md5": "22930fc09c7386e1131b14d936e554af",
      "rounds": 1,
      "type": "eq",
      "annotations": true
    },
    { "id": "issue5481.pdf",
      "file": "pdfs/issue5481.pdf",
      "md5": "cf00bd25b15b7e23542b48a626585c36",
      "rounds": 1,
      "type": "eq",
      "about": "Free image obtained from www.unsplash.com"
    },
    { "id": "issue5567",
      "file": "pdfs/issue5567.pdf",
      "md5": "d5b37f8bf1b3aafa1b4fcf19ebdc7c74",
      "rounds": 1,
      "type": "eq"
    },
    { "id": "issue5909_original",
      "file": "pdfs/issue5909_original.pdf",
      "md5": "65c169b6f540b27ac0ff2738a80d1e14",
      "link": true,
      "rounds": 1,
      "lastPage": 2,
      "type": "eq"
    },
    { "id": "issue6069",
      "file": "pdfs/issue6069.pdf",
      "md5": "d0ad8871f4116bca8e39513ffa8b7d8e",
      "rounds": 1,
      "type": "load"
    },
    { "id": "issue7014",
      "file": "pdfs/issue7014.pdf",
      "md5": "b410891d7a01af791364e9c530d61b17",
      "rounds": 1,
      "type": "eq",
      "annotations": true
    },
    { "id": "annotation-link-text-popup",
      "file": "pdfs/annotation-link-text-popup.pdf",
      "md5": "4bbf56e81d47232de5f305124ab0ba27",
      "rounds": 1,
      "type": "eq",
      "annotations": true
    },
    { "id": "annotation-text-without-popup",
      "file": "pdfs/annotation-text-without-popup.pdf",
      "md5": "7c2d241babe00139e34b9f8369a909eb",
      "rounds": 1,
      "type": "eq",
      "annotations": true,
      "about": "Text annotation without a separate Popup annotation"
    },
    { "id": "annotation-underline",
      "file": "pdfs/annotation-underline.pdf",
      "md5": "c24b3aba771de52f9bac25e854c39458",
      "rounds": 1,
      "type": "eq",
      "annotations": true
    },
    { "id": "annotation-underline-without-appearance",
      "file": "pdfs/annotation-underline-without-appearance.pdf",
      "md5": "dd5be5e9a8e6bdbf67c175ca170f7cb7",
      "rounds": 1,
      "annotations": true,
      "type": "eq"
    },
    { "id": "annotation-strikeout",
      "file": "pdfs/annotation-strikeout.pdf",
      "md5": "6624e6b5bedd2f2855b6ab12bbf93c57",
      "rounds": 1,
      "type": "eq",
      "annotations": true
    },
    { "id": "annotation-strikeout-without-appearance",
      "file": "pdfs/annotation-strikeout-without-appearance.pdf",
      "md5": "1dc751ab83e8deb3094bfc580289b097",
      "rounds": 1,
      "annotations": true,
      "type": "eq"
    },
    { "id": "annotation-squiggly",
      "file": "pdfs/annotation-squiggly.pdf",
      "md5": "38661e731ac6c525af5894d2d20c6e71",
      "rounds": 1,
      "type": "eq",
      "annotations": true
    },
    { "id": "annotation-squiggly-without-appearance",
      "file": "pdfs/annotation-squiggly-without-appearance.pdf",
      "md5": "6546f22a06a5e51d0e835c677cdbc705",
      "rounds": 1,
      "annotations": true,
      "type": "eq"
    },
    { "id": "annotation-highlight",
      "file": "pdfs/annotation-highlight.pdf",
      "md5": "e13e198e3a69c32dc9ebdc704d3105e1",
      "rounds": 1,
      "type": "eq",
      "annotations": true
    },
    { "id": "annotation-highlight-without-appearance",
      "file": "pdfs/annotation-highlight-without-appearance.pdf",
      "md5": "a1f2811324fa1ff0c9f1778697413dad",
      "rounds": 1,
      "annotations": true,
      "type": "eq"
    },
    { "id": "annotation-freetext",
      "file": "pdfs/annotation-freetext.pdf",
      "md5": "6ca19ce632ead3aed08f22e588510e2f",
      "rounds": 1,
      "type": "eq",
      "annotations": true
    },
    { "id": "annotation-line",
      "file": "pdfs/annotation-line.pdf",
      "md5": "fde60608be2748f10fb6522cba425ca1",
      "rounds": 1,
      "type": "eq",
      "annotations": true
    },
    { "id": "annotation-square-circle",
      "file": "pdfs/annotation-square-circle.pdf",
      "md5": "cfd3c302f68d61e1d55ed9c7896046c3",
      "rounds": 1,
      "type": "eq",
      "annotations": true
    },
    { "id": "annotation-stamp",
      "file": "pdfs/annotation-stamp.pdf",
      "md5": "0a04d7ce1ad103cb3c033d26855d6ec7",
      "rounds": 1,
      "type": "eq",
      "annotations": true
    },
    { "id": "annotation-fileattachment",
      "file": "pdfs/annotation-fileattachment.pdf",
      "md5": "d20ecee4b53c81b2dd44c8715a1b4a83",
      "rounds": 1,
      "type": "eq",
      "annotations": true
    },
    { "id": "annotation-text-widget-annotations",
      "file": "pdfs/annotation-text-widget.pdf",
      "md5": "b7b8923a12998fca8603fae53f73f19b",
      "rounds": 1,
      "type": "eq",
      "annotations": true
    },
    { "id": "annotation-text-widget-forms",
      "file": "pdfs/annotation-text-widget.pdf",
      "md5": "b7b8923a12998fca8603fae53f73f19b",
      "rounds": 1,
      "type": "eq",
      "forms": true
    },
    { "id": "annotation-text-widget-print",
      "file": "pdfs/annotation-text-widget.pdf",
      "md5": "b7b8923a12998fca8603fae53f73f19b",
      "rounds": 1,
      "type": "eq",
      "print": true,
      "annotationStorage": {
        "61R": {
          "value": "Single line, unlimited length"
        },
        "62R": {
          "value": "Single lin"
        },
        "63R": {
          "value": "Single line, center aligned"
        },
        "64R": {
          "value": "Single line, right aligned"
        },
        "65R": {
          "value": ""
        },
        "66R": {
          "value": "zyxwvutsrqponmlkjihgfedcba"
        },
        "67R": {
          "value": "Multiline\nstring"
        }
      }
    },
    { "id": "annotation-choice-widget-annotations",
      "file": "pdfs/annotation-choice-widget.pdf",
      "md5": "7dfb0d743a0da0f4a71b209ab43b0be5",
      "rounds": 1,
      "type": "eq",
      "annotations": true
    },
    { "id": "annotation-choice-widget-forms",
      "file": "pdfs/annotation-choice-widget.pdf",
      "md5": "7dfb0d743a0da0f4a71b209ab43b0be5",
      "rounds": 1,
      "type": "eq",
      "forms": true
    },
    { "id": "annotation-choice-widget-print",
      "file": "pdfs/annotation-choice-widget.pdf",
      "md5": "7dfb0d743a0da0f4a71b209ab43b0be5",
      "rounds": 1,
      "type": "eq",
      "print": true,
      "annotationStorage": {
        "57R": {
          "value": "Ipsum"
        },
        "58R": {
          "value": "Lorem"
        },
        "59R": {
          "value": "Dolor"
        },
        "62R": {
          "value": "Sit"
        },
        "63R": {
          "value": ""
        }
      }
    },
    { "id": "issue12233-forms",
      "file": "pdfs/issue12233.pdf",
      "md5": "6099fc695fe018ce444752929d86f9c8",
      "link": true,
      "rounds": 1,
      "type": "eq",
      "forms": true
    },
    { "id": "issue12233-print",
      "file": "pdfs/issue12233.pdf",
      "md5": "6099fc695fe018ce444752929d86f9c8",
      "link": true,
      "rounds": 1,
      "type": "eq",
      "print": true,
      "annotationStorage": {
        "20R": {
          "value": true
        }
      }
    },
    { "id": "issue11931",
      "file": "pdfs/issue11931.pdf",
      "md5": "9ea233037992e1f10280420a49e72845",
      "rounds": 1,
      "type": "eq"
    },
    { "id": "issue6931",
      "file": "pdfs/issue6931_reduced.pdf",
      "md5": "e61388913821a5e044bf85a5846d6d9a",
      "rounds": 1,
      "type": "eq"
    },
    { "id": "annotation-button-widget-annotations",
      "file": "pdfs/annotation-button-widget.pdf",
      "md5": "5cf23adfff84256d9cfe261bea96dade",
      "rounds": 1,
      "type": "eq",
      "annotations": true
    },
    { "id": "annotation-button-widget-forms",
      "file": "pdfs/annotation-button-widget.pdf",
      "md5": "5cf23adfff84256d9cfe261bea96dade",
      "rounds": 1,
      "type": "eq",
      "forms": true
    },
    { "id": "annotation-button-widget-print",
      "file": "pdfs/annotation-button-widget.pdf",
      "md5": "5cf23adfff84256d9cfe261bea96dade",
      "rounds": 1,
      "type": "eq",
      "print": true,
      "annotationStorage": {
        "105R": {
          "value": true
        },
        "106R": {
          "value": false
        },
        "107R": {
          "value": false
        },
        "108R": {
          "value": true
        },
        "109R": {
          "value": false
        },
        "110R": {
          "value": false
        },
        "111R": {
          "value": true
        },
        "112R": {
          "value": false
        },
        "113R": {
          "value": false
        }
      }
    },
    { "id": "annotation-polyline-polygon",
      "file": "pdfs/annotation-polyline-polygon.pdf",
      "md5": "e68611602f58c8ca70cc40575ba3b04e",
      "rounds": 1,
      "type": "eq",
      "annotations": true
    },
    { "id": "issue4872",
      "file": "pdfs/issue4872.pdf",
      "md5": "21c6cbc682140d6f6017bbeb45892053",
      "rounds": 1,
      "link": true,
      "firstPage": 1,
      "lastPage": 1,
      "type": "eq",
      "annotations": true
    },
    { "id": "issue4872-forms",
      "file": "pdfs/issue4872.pdf",
      "md5": "21c6cbc682140d6f6017bbeb45892053",
      "rounds": 1,
      "link": true,
      "firstPage": 1,
      "lastPage": 1,
      "type": "eq",
      "forms": true
    },
    { "id": "issue6108",
      "file": "pdfs/issue6108.pdf",
      "md5": "8961cb55149495989a80bf0487e0f076",
      "rounds": 1,
      "type": "load"
    },
    { "id": "zero_descent",
      "file": "pdfs/zero_descent.pdf",
      "md5": "32805ab28be1d0e91d27d9742c66eccf",
      "link": false,
      "rounds": 1,
      "lastPage": 1,
      "type": "text"
    },
    { "id": "operator-in-TJ-array",
      "file": "pdfs/operator-in-TJ-array.pdf",
      "md5": "dfe0f15a45be18eca142adaf760984ee",
      "link": false,
      "rounds": 1,
      "type": "text"
    },
    { "id": "issue7878",
      "file": "pdfs/issue7878.pdf",
      "md5": "59194e30037e8c09ae846ddd0ace4c81",
      "link": false,
      "rounds": 1,
      "type": "text"
    },
    { "id": "font_ascent_descent",
      "file": "pdfs/font_ascent_descent.pdf",
      "md5": "c0048a7735010002b998c112335e47bf",
      "rounds": 1,
      "type": "eq"
    },
    { "id": "issue8097",
      "file": "pdfs/issue8097_reduced.pdf",
      "md5": "ced0e2d88cfd5b4d3a55d937ea288af1",
      "rounds": 1,
      "type": "eq"
    },
    { "id": "pr8491",
      "file": "pdfs/pr8491.pdf",
      "md5": "36ea2e28cd77e9e70731f574ab27cbe0",
      "rounds": 1,
      "link": true,
      "type": "eq"
    },
    { "id": "ccitt_EndOfBlock_false",
      "file": "pdfs/ccitt_EndOfBlock_false.pdf",
      "md5": "ce718efe601cd7491dd00651b4790329",
      "rounds": 1,
      "type": "eq"
    },
    { "id": "issue9940",
      "file": "pdfs/issue9940.pdf",
      "md5": "6ffef210c4b6cfe423e20430d8af168a",
      "rounds": 1,
      "link": false,
      "type": "eq"
    },
    { "id": "issue11526",
      "file": "pdfs/issue11526.pdf",
      "md5": "9babc771fc8792f43e4ada46b0daff8c",
      "rounds": 1,
      "lastPage": 1,
      "link": true,
      "type": "eq"
    },
    { "id": "issue11555",
      "file": "pdfs/issue11555.pdf",
      "md5": "f84ce8b7414f6a18e75a6ce69c902501",
      "rounds": 1,
      "type": "eq"
    },
    { "id": "issue12337",
      "file": "pdfs/issue12337.pdf",
      "md5": "9165772d5b860bcbcc2478f32e311eb0",
      "rounds": 2,
      "lastPage": 1,
      "type": "fbf"
    },
    { "id": "pr12564",
      "file": "pdfs/pr12564.pdf",
      "md5": "24a19949a2541b960363832cf141f2f2",
      "rounds": 1,
      "type": "eq",
      "annotations": true
    }
]