The existing implementation of fakeRequestAnimationFrame
did not return a timer ID, so the frame could not be cancelled
if you wanted to cancel it. But if you do want to cancel it,
it needs to be cancelled through clearTimeout instead of
cancelAnimationFrame, because the timer IDs are different.
Signed-off-by: Jonathan Barnes <jbarnes@pivotal.io>
I happened to notice that the error handling wasn't that great, which I missed previously since there were no unit-tests for failure to load built-in CMap files.
Hence this patch, which improves the error handling *and* adds tests.
I found that PR 8105 unfortunately causes a *very serious* performance regression in long PDF documents where the `Pages` tree only has one level; my apologies for this!
Obviously we cannot revert that PR, since that would cause more issues than it solves. Hence it seems to me that the only viable solution here, is to add a simple `RefSetCache` to reduce the amount of redundant lookups.
Previously in PR 8105 caching was thought to be unnecessary, but as it turns out I don't think that we really have a choice in the matter any more.
For reasons I don't pretend to understand, we're passing around `xref` arguments to a bunch of methods despite `this.xref` being available in `PartialEvaluator`.
This patch is a small first small step towards cleaning up the, often unwieldy, signatures of methods in `PartialEvaluator`.
I really cannot understand why this change is necessary, since modern browsers such as Firefox and Chrome work just fine with the old code.
Hence this is patch is yet another "hack" that's needed just because IE apparently cannot just work like you'd expect.
For consistency, the Node factory used in the CMap unit-tests is changed as well.
Fixes 8193.
*My apologies for inadvertently breaking this in PR 8064; apparently we don't have any tests that cover this use-case :(*
Without this patch `getTextContent` will fail if called before `getOperatorList`, since loading of fonts during text-extraction may require fetching of built-in CMap files.
*Please note:* The `text` test added here, which uses an already existing PDF file, fails without this patch.
In core/document.js: `PDFDocument.prototype.parse` accesses a dictionary
property, which could throw if the underlying data is not yet available.
In core/obj.js: `get Catalog.prototype.metadata` calls
`stream.getBytes`, which can throw MissingDataException too when the
stream is a ChunkedStream.
Similar to other `try-catch` statements in `/core` code, we must re-throw `MissingDataException` to prevent issues with missing data during document loading.
Note that I'm not sure if/how we can test this, which is why the patch doesn't include any test(s).
Fixes 8180.
*After browsing through (a version of) the JPEG specification, see https://www.w3.org/Graphics/JPEG/itu-t81.pdf, I hope that this patch makes sense.*
Note that while issue 7828 became a problem after PR 7661, it isn't really a regression from than PR. The explanation is rather that we're now relying on `core/jpg.js` instead of the Native Image decoder in more situations than before, which thus exposed an *existing* issue in our JPEG decoder.
Another factor also seems to be that in many JPEG images, the DRI (Define Restart Interval) marker isn't present, in which case this bug won't manifest either.
According to https://www.w3.org/Graphics/JPEG/itu-t81.pdf#page=89 (at the bottom of the page):
"NOTE – The final restart interval may be smaller than the size specified by the DRI marker segment, as it includes only the number of MCUs remaining in the scan."
Furthermore, according to https://www.w3.org/Graphics/JPEG/itu-t81.pdf#page=39 (in the middle of the page):
"[...] If restart is enabled and the restart interval is defined to be Ri, each entropy-coded segment except the last one shall contain Ri MCUs. The last one shall contain whatever number of MCUs completes the scan."
Based on the above, it thus seem to me that we should simply ensure that we're not attempting to continue to parse Scan data once we've found all MCUs (Minimum Coded Unit) of the image.
Fixes 7828.
I happened to notice that some inequalities had the wrong order, and was surprised since I thought that the `yoda` rule should have caught that.
However, reading http://eslint.org/docs/rules/yoda#options a bit more closely than previously, it's quite obvious that the `onlyEquality` option does *exactly* what its name suggests. Hence I think that it makes sense to adjust the options such that only ranges are allowed instead.
This patch gets rid of the only case in the code-base where we're throwing a plain `string`, rather than an `Error`, which besides better/more consistent error handling also allows us to enable the [`no-throw-literal`](http://eslint.org/docs/rules/no-throw-literal) ESLint rule.
*This is something that I noticed while working on PR 8126, which is (more) fallout from PR 6065.*
In general, it's actually *not* correct to return `Dict.empty` as the default value for non-existent properties. Please note that a prior PR, see https://github.com/mozilla/pdf.js/pull/5957#issuecomment-103112698, asked for that behaviour but I don't think that's right.
Obviously for properties that are (or should) be `Dict`s it makes sense, however certain properties can be e.g. Strings or Arrays instead. In the latter case, returning `Dict.empty` is just plain wrong, and it's quite fascinating that this hasn't caused any errors in practice. (The existing validation in the various getters has actually saved us here.)
Also, when looking at this code again, it seemed unnecessary to duplicate the `MAX_LOOP_COUNT` check since we could just return immediately instead.
When using content security headers to restrict connections to the same origin,
you may not make connections to `example.com`. This feature detection also
works with a request to the current location.
It appears that I accidentally broke this in PR 6065, sorry about that!
The issue in this particular PDF file is that there's `/Rotate` entries on different levels of the `/Pages` tree. We're supposed to use the `/Rotate` entry in the `/Page` dict (which is `0`), but because of an incorrect condition we instead ended up with the one from the `/Pages` dict (which is `180`).
Fixes 8125.
Ideally, the `Annotation` class should not have anything to do with the
page's operator list. How annotations are added to the page's operator
list is logic that belongs in `src/core/document.js` instead where the
operator list is constructed.
Moreover, some comments have been added to clarify the intent of the
code.
Even though the PDF specification does not state that `Opt` fields are
inheritable, in practice there are PDF generators that let annotations
inherit the options from a parent.
This fixes something that I noticed while working with the code in `Catalog.getPageDict` when debugging issue 8088.
Note that while I don't have an example where this patch really matters, given that e.g. `PartialEvaluator.hasBlendModes` depends on the `objId` to avoid cyclic references this patch could potentially help for some PDF files.
As discussed on IRC, we need to check all nodes at the *bottom* of the tree to ensure that we find the correct `Page` dict.
Furthermore, this patch also gets rid of the caching present in a previous version, since it's not clear if that really helps.
Note that this patch purposely adds an `eq` test, using a reduced test-case, so that we can be sure that the algorithm actually finds the correct `Page` dict for each `pageIndex`.
Fixes 8088.
[api-minor] Refactor fetching of built-in CMaps to utilize a factory on the `display` side instead, to allow users of the API to provide a custom CMap loading factory (e.g. for use with Node.js)