I'm slightly surprised that this hasn't actually caused any (known) bugs, but that may be more luck than anything else since it fortunately doesn't seem common for Streams to be defined inside of an 'ObjStm'.[1]
Note that in the `XRef.fetchUncompressed` method we're *not* caching Streams, and that for very good reasons too.
- Streams, especially the `DecodeStream` ones, can become *very* large once read. Hence caching them really isn't a good idea simply because of the (potential) memory impact of doing so.
- Attempting to read from the *same* Stream more than once won't work, unless it's `reset` in between, since using any method such as e.g. `getBytes` always starts at the current data position.
- Given that even the `src/core/` code is now fairly asynchronous, see e.g. the `PartialEvaluator`, it's generally impossible to assert that any one Stream isn't being accessed "concurrently" by e.g. different `getOperatorList` calls. Hence `reset`-ing a cached Streams isn't going to work in the general case.
All in all, I cannot understand why it'd ever be correct to cache Streams in the `XRef.fetchCompressed` method.
---
[1] One example where that happens is the `issue3115r.pdf` file in the test-suite, where the streams in question are not actually used for anything within the PDF.js code.
- Change all occurences of `var` to `let`/`const`.
- Initialize the (temporary) Arrays with the correct sizes upfront.
- Inline the `isCmd` check. Obviously this won't make a huge difference, but given that the check is only relevant for corrupt documents it cannot hurt.
Having ran the entire test-suite locally with these `Number.isInteger` checks removed, there wasn't a single test failure anywhere; see also PR 8857.
Hence everything points to this being completely unnecessary now, and by removing this code there's thus fewer function calls being made in `XRef.fetchUncompressed`.
The contents of this comment hasn't been correct for *years*, ever since the library was properly split into main/worker-threads, so it's probably high time for this to be updated.
This code was originally added to support IE10 (and below), however with those browsers *explicitly* unsupported since PDF.js version `2.0` this code is now dead.
Obviously the `_pagesRequests` functionality is *mainly* used when `disableAutoFetch` is set, but it will also be used during ranged/streamed loading of documents.
However, the `_pagesRequests` property is currently an Array which seems a bit strange:
- Arrays are zero-indexed, but the first element will never actually be set in the code.
- The `_pagesRequests` Array is never cleared, unless a new document is loaded, and once the `PDFDocumentProxy.getPage` call has resolved/rejected the element is just replaced by `null`.
- Unless the document is browsed *in order* the resulting `_pagesRequests` Array can also be arbitrarily sparse.
All in all, I don't believe that an Array is an appropriate data structure to use for this purpose.
This patch simply restores the behaviour that existed prior to PR 7697, since I cannot imagine that that was changed other than by pure accident.
As mentioned by a comment in `BaseViewer.setDocument`: "Printing is semi-broken with auto fetch disabled.", and note that since triggering of printing is a synchronous operation there's generally no easy way to load the missing data.
https://github.com/mozilla/pdf.js/pull/7697/files#diff-529d1853ee1bba753a0fcb40ea778723L1114-L1118
Considering just how small/simple this code is, it doesn't seem necessary to have a separate method for it (even more so when there's only one call-site).
For documents with a Linearization dictionary the computed `startXRef` position will be relative to the raw file, rather than the actual PDF document itself (which begins with `%PDF-`).
Hence it's necessary to subtract `stream.start` in this case, since otherwise the `XRef.readXRef` method will increment the position too far resulting in parsing errors.
*Please note:* A a similar change was attempted in PR 5005, but it was subsequently backed out (in PR 5069) since other parts of the patch caused issues.
With these changes, it's possible to replace repeated function calls within a loop with just a single function call and subsequent assignment instead.
I've always disliked the solution in PR 10461, since it required changes to the `PDFHistory` code itself to deal with a bug in IE11.
Now that IE11 support is limited, it seems reasonable to remove these `pushState`/`replaceState` hacks from the main code-base and simply use polyfills instead.
For Popup annotation trigger elements consisting of an arbitrary polyline, you need to ensure that the 'stroke-width' is always non-zero since otherwise it's impossible to actually open/close the popup.
Unfortunately I don't believe that any of the test-suites can be used to test this, hence why no tests are included in the patch.
As we've seen in numerous other cases, avoiding unnecessary function calls is never a bad thing (even if the effect is probably tiny here).
With these changes we also avoid potentially two back-to-back `isDict` checks when evaluating possible Page nodes, and can also no longer accidentally pick a dictionary with an incorrect /Type.
For certain canvas-related errors (and probably others), the browser rendering exceptions may be propagated "as-is" to the PDF.js code. In this case, the exceptions are of the somewhat cryptic `NS_ERROR_FAILURE` type.
Unfortunately these aren't actual `Error`s, which thus ends up unintentionally triggering the `assert` in `PDFPageProxy._abortOperatorList`; sorry about that!