Other PDF readers, e.g. Adobe Reader and PDFium (in Chrome), will attempt to render as much of a page as possible even if there are errors present.
Currently we just bail as soon the first error is hit, which means that we'll usually not render anything in these cases and just display a blank page instead.
NOTE: This patch changes the default behaviour of the PDF.js API to always attempt to recover as much data as possible, even when encountering errors during e.g. `getOperatorList`/`getTextContent`, which thus improve our handling of corrupt PDF files and allow the default viewer to handle errors slightly more gracefully.
In the event that an API consumer wishes to use the old behaviour, where we stop parsing as soon as an error is encountered, the `stopAtErrors` parameter can be set at `getDocument`.
Fixes, inasmuch it's possible since the PDF files are corrupt, e.g. issue 6342, issue 3795, and [bug 1130815](https://bugzilla.mozilla.org/show_bug.cgi?id=1130815) (and probably others too).
Considering how extremely simple this patch turned out to be, I'm almost worried that I completely misunderstood why the current code looks like it does...
It doesn't really make sense to attempt to utilize the `NativeImageDecoder` in Node, since there's no native image support available, hence building on PR 8035 we can easily disable it in the example.
Fixes 7901.
The existing implementation of fakeRequestAnimationFrame
did not return a timer ID, so the frame could not be cancelled
if you wanted to cancel it. But if you do want to cancel it,
it needs to be cancelled through clearTimeout instead of
cancelAnimationFrame, because the timer IDs are different.
Signed-off-by: Jonathan Barnes <jbarnes@pivotal.io>
If we only invoke the bootstrap-enabled script when PdfJs.enabled is
true, then we don't need to check it again in the script.
This avoids a sync IPC call to the parent process.
It also keeps PdfJs.jsm from importing PdfjsContentUtils.jsm in the
child process until it is actually needed, which is one steps towards
not loading it until it is really needed.
pdfjschildbootstrap.js will always be run, but
pdfjschildbootstrap-enabled.js will only be run if PdfJs.enabled is
true. This will let us avoid some work in the child process in the
next patch.
This will need to be landed in the mozilla-central repository at the
same time as a change to nsBrowserGlue.js. See bug 1352218.
Refactor removing of the `zoomLayer` into a helper method, and use that in `PDFPageView.reset` to ensure that the entire `zoomLayer` is actually removed (issue 8209)
I happened to notice that the error handling wasn't that great, which I missed previously since there were no unit-tests for failure to load built-in CMap files.
Hence this patch, which improves the error handling *and* adds tests.
I found that PR 8105 unfortunately causes a *very serious* performance regression in long PDF documents where the `Pages` tree only has one level; my apologies for this!
Obviously we cannot revert that PR, since that would cause more issues than it solves. Hence it seems to me that the only viable solution here, is to add a simple `RefSetCache` to reduce the amount of redundant lookups.
Previously in PR 8105 caching was thought to be unnecessary, but as it turns out I don't think that we really have a choice in the matter any more.
See http://eslint.org/docs/rules/#ecmascript-6.
To try and enforce consistent rules and to help avoid some possible errors in ES6 code from the start, this patch adds a few basic ESLint rules.
Note that a two of the rules, `no-shadow` and `object-shorthand`, are currently disabled. While it'd certainly be nice to enable both of them, it's currently impossible since that would result in close to one thousand lint errors.
Remove unnecessary `xref` parameters from various method signatures in `PartialEvaluator`, since `this.xref` is already available in the relevant scope
For reasons I don't pretend to understand, we're passing around `xref` arguments to a bunch of methods despite `this.xref` being available in `PartialEvaluator`.
This patch is a small first small step towards cleaning up the, often unwieldy, signatures of methods in `PartialEvaluator`.
I really cannot understand why this change is necessary, since modern browsers such as Firefox and Chrome work just fine with the old code.
Hence this is patch is yet another "hack" that's needed just because IE apparently cannot just work like you'd expect.
For consistency, the Node factory used in the CMap unit-tests is changed as well.
Fixes 8193.
*My apologies for inadvertently breaking this in PR 8064; apparently we don't have any tests that cover this use-case :(*
Without this patch `getTextContent` will fail if called before `getOperatorList`, since loading of fonts during text-extraction may require fetching of built-in CMap files.
*Please note:* The `text` test added here, which uses an already existing PDF file, fails without this patch.
In core/document.js: `PDFDocument.prototype.parse` accesses a dictionary
property, which could throw if the underlying data is not yet available.
In core/obj.js: `get Catalog.prototype.metadata` calls
`stream.getBytes`, which can throw MissingDataException too when the
stream is a ChunkedStream.