Commit Graph

942 Commits

Author SHA1 Message Date
Tim van der Meij
cf7ce0aa7e
Merge pull request #14600 from Snuffleupagus/getPageIndex-more-validation
[api-minor] Add validation for the  `PDFDocumentProxy.getPageIndex` method
2022-02-26 15:30:00 +01:00
Jonas Jenwald
172d007598 [api-minor] Add validation for the PDFDocumentProxy.getPageIndex method
Currently we'll happily attempt to send any argument passed to this method over to the worker-thread, without doing any sort of validation.
That could obviously be quite bad, since there's first of all no protection against sending unclonable data. Secondly, it's also possible to pass data that will cause the `Ref.get` call in the worker-thread to fail immediately.

In order to address all of these issues, we'll now properly validate the argument passed to `PDFDocumentProxy.getPageIndex` and when necessary reject already on the main-thread instead.
2022-02-24 12:01:51 +01:00
Jonas Jenwald
2be8036eb7 [api-minor] Reduce duplication in the "gets non-existent page" unit-test 2022-02-24 11:25:21 +01:00
Jonas Jenwald
ec87995050 Ensure that Cmd/Name is only initialized with string arguments
Trying to use a non-string argument in either a `Cmd` or a `Name` is not intended, and would basically be an implementation error. Hence we can add a non-PRODUCTION check to enforce this, similar to the existing one used e.g. in the `Dict.set` method.
2022-02-23 22:39:12 +01:00
Tim van der Meij
2bb96a708c
Merge pull request #14598 from Snuffleupagus/rm-isBool
Re-factor the `Catalog.viewerPreferences` method and remove the `isBool` helper function
2022-02-23 20:36:56 +01:00
Jonas Jenwald
3704283f5b Remove the isBool helper function
The call-sites are replaced by direct `typeof`-checks instead, which removes unnecessary function calls.
2022-02-23 13:31:03 +01:00
Jonas Jenwald
a2f9031e9a Ensure that Dict.set only accepts string keys
Trying to use a non-string `key` in a `Dict` is not intended, and would basically be an implementation error. Hence we can add a non-PRODUCTION check to enforce this, complementing the existing `value` check added in PR 11672.
2022-02-22 16:35:20 +01:00
Jonas Jenwald
05edd91bdb Remove the isNum helper function
The call-sites are replaced by direct `typeof`-checks instead, which removes unnecessary function calls. Note that in the `src/`-folder we already had more `typeof`-cases than `isNum`-calls.

These changes were *mostly* done using regular expression search-and-replace, with two exceptions:
 - In `Font._charToGlyph` we no longer unconditionally update the `width`, since that seems completely unnecessary.
 - In `PDFDocument.documentInfo`, when parsing custom entries, we now do the `typeof`-check once.
2022-02-22 11:55:34 +01:00
Jonas Jenwald
b282814e38 Prefer instanceof Name rather than calling isName() with one argument
Unless you actually need to check that something is both a `Name` and also of the *correct* type, using `instanceof Name` directly should be a tiny bit more efficient since it avoids one function call and an unnecessary `undefined` check.

This patch uses ESLint to enforce this, since we obviously still want to keep the `isName` helper function for where it makes sense.
2022-02-21 12:45:00 +01:00
Jonas Jenwald
4df82ad31e Prefer instanceof Dict rather than calling isDict() with one argument
Unless you actually need to check that something is both a `Dict` and also of the *correct* type, using `instanceof Dict` directly should be a tiny bit more efficient since it avoids one function call and an unnecessary `undefined` check.

This patch uses ESLint to enforce this, since we obviously still want to keep the `isDict` helper function for where it makes sense.
2022-02-21 12:44:56 +01:00
Jonas Jenwald
67b658e8d5 Prefer instanceof Cmd rather than calling isCmd() with *one* argument
Unless you actually need to check that something is both a `Cmd` and also of the *correct* type, using `instanceof Cmd` directly should be a tiny bit more efficient since it avoids one function call and an unnecessary `undefined` check.

This patch uses ESLint to enforce this, since we obviously still want to keep the `isCmd` helper function for where it makes sense.
2022-02-21 12:44:51 +01:00
Jonas Jenwald
2cb2f633ac Remove the isRef helper function
This helper function is not really needed, since it's just a wrapper around a simple `instanceof` check, and it only adds unnecessary indirection in the code.
2022-02-19 15:33:42 +01:00
Jonas Jenwald
1a31855977 Remove the isStream helper function
At this point all the various Stream-classes extends an abstract base-class, hence this helper function is no longer necessary and only adds unnecessary indirection in the code.
2022-02-17 13:51:36 +01:00
Calixte Denizet
18f4e560ae [Search] Some matches were incorrectly shifted because of some '-\n'
- it aims to fix #14562;
- 'X-\n' were not correctly positioned;
- when X is a diacritic (e.g. in "sä-\n", which is decomposed into "sa¨-\n") we must handle both things:
  - diacritics on the one hand;
  - "-\n" on the other hand.
2022-02-14 10:12:33 +01:00
Calixte Denizet
18e3a98c2b [api-minor] Don't add in the text content the chars which are out-of-page (bug 1755201)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1755201;
- if the glyph position is not within the view then skip it.
2022-02-13 21:07:11 +01:00
Jonas Jenwald
0daab88a48 Update two display_utils unit-tests to use native functionality rather than the createObjectURL helper function
Given that most of the code-base is already using native functionality, we can update these unit-tests similarily as well.
 - For the `blob:`-URL test, we simply use `URL.createObjectURL(...)` and `Blob` directly instead.
 - For the `data:`-URL test, we simply use `btoa` to do the Base64 encoding and then build the final URL-string.
2022-02-10 12:01:29 +01:00
Brendan Dahl
f8b2a99ddc
Merge pull request #14543 from Snuffleupagus/bug-1753983
Let `Lexer.getNumber` treat a single minus sign as zero (bug 1753983)
2022-02-09 14:06:35 -08:00
Jonas Jenwald
1f0fb270b1 [api-minor] Ensure that the PDFDocumentLoadingTask-promise is rejected when cancelling the PasswordPrompt (bug 1754421)
This is essentially a *continuation* of PR 7926, where we added support for rejecting the current `PDFDocumentLoadingTask`-promise by throwing inside of the `onPassword`-callback.
Hence the naive way to address [bug 1754421](https://bugzilla.mozilla.org/show_bug.cgi?id=1754421) would be to simply throw in the `onPassword`-callback used in the default viewer. However it unfortunately turns out to not work, since the password input/validation is asynchronous, and we thus need another approach.

The simplest solution that I can come up with here, is thus to *extend* the `onPassword`-callback to also reject the current `PDFDocumentLoadingTask`-instance if an `Error` is explicitly passed as the input to the callback function. (This doesn't feel great, but I cannot see a better solution that isn't really complicated.)
2022-02-09 15:09:20 +01:00
Jonas Jenwald
64f3dbeb48 Let Lexer.getNumber treat a single minus sign as zero (bug 1753983)
This appears to be consistent with the behaviour in both Adobe Reader and PDFium (in Google Chrome); this is essentially the same approach as used for a single decimal point in PR 9827.
2022-02-07 17:09:47 +01:00
Calixte Denizet
1f41028fcb Support search with or without diacritics (bug 1508345, bug 916883, bug 1651113)
- get original index in using a dichotomic seach instead of a linear one;
  - normalize the text in using NFD;
  - convert the query string into a RegExp;
  - replace whitespaces in the query with \s+;
  - handle hyphens at eol use to break a word;
  - add some \s* around punctuation signs
2022-02-03 15:42:55 +01:00
Jonas Jenwald
403baa7bba [api-minor] Remove the normalizeWhitespace option in the PDFPageProxy.{getTextContent, streamTextContent} methods (issue 14519, PR 14428 follow-up)
With these changes, we'll now *always* replace all whitespaces with standard spaces (0x20). This behaviour is already, since many years, the default in both the viewer and the browser-tests.
2022-02-03 09:17:22 +01:00
Calixte Denizet
ae842e1c3a [api-minor] Annotations - Adjust the font size in text field in considering the total width (bug 1721335)
- it aims to fix #14502 and bug 1721335;
 - Acrobat and Pdfium do the same;
 - it'll avoid to have truncated data when printed;
 - change the factor to compute font size in using field height: lineHeight = 1.35*fontSize
  - this is the value used by Acrobat.
 - in order to not have truncated strings on the bottom, add few basic metrics for standard fonts.
2022-01-30 15:53:31 +01:00
calixteman
9367d54009
Merge pull request #14483 from calixteman/200B
Remove the invisible format marks from the text chunks
2022-01-24 17:52:06 +01:00
Calixte Denizet
880ac6037c Fix scripting test related to keystroke event 2022-01-24 17:04:50 +01:00
Calixte Denizet
e1d3a3b414 Remove the invisible format marks from the text chunks
- it aims to fix issue #9186.
2022-01-24 13:47:24 +01:00
Tim van der Meij
e08fd5e389
Implement a unit test for getCharUnicodeCategory in src/core/unicode.js (PR 14428 follow-up)
Given that the other functions in this file are already covered by unit
tests, we should also cover this newly added function.
2022-01-16 15:18:05 +01:00
Tim van der Meij
625f829842
Merge pull request #14446 from Snuffleupagus/issue-14435
Expose even more API-functionality in the TypeScript definitions (issue 14435, PR 14013 follow-up)
2022-01-15 13:46:11 +01:00
Jonas Jenwald
76444888fb Add (basic) UTF-8 support in the stringToPDFString helper function (issue 14449)
This patch implements this by looking for the UTF-8 BOM, i.e. `\xEF\xBB\xBF`, in order to determine the encoding.[1]
The actual conversion is done using the `TextDecoder` interface, which should be available in all environments/browsers that we support; please see https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder#browser_compatibility

---
[1] Assuming that everything lacking a UTF-16 BOM would have to be UTF-8 encoded really doesn't seem correct.
2022-01-14 18:57:07 +01:00
Jonas Jenwald
b9849e38b8 Expose even more API-functionality in the TypeScript definitions (issue 14435, PR 14013 follow-up)
While `PageViewport` apparently makes sense in TypeScript environments, given that it's being returned by the `PDFPageProxy.getViewport`-method in the API, we really don't want to extend the *public* API by simply exporting the class directly in `src/pdf.js` since it should never be called/initialized manually.
Hence we follow the same pattern as in PR 14013, and also extend the API unit-tests to ensure that `PDFPageProxy.getViewport` always returns a `PageViewport`-instance as expected.
2022-01-13 12:05:40 +01:00
Jonas Jenwald
457ff0d54a Update Jasmine to version 4
For the unit-tests that were updated in this patch, note that I settled on simply using `toEqual` comparisons rather than updating the custom matchers (since those don't seem necessary any more).

Please refer to the following resources for additional information:
 - https://github.com/jasmine/jasmine/blob/main/release_notes/4.0.0.md
 - https://github.com/jasmine/jasmine-npm/blob/main/release_notes/4.0.0.md
 - https://jasmine.github.io/tutorials/upgrading_to_Jasmine_4.0
2022-01-09 11:32:34 +01:00
Tim van der Meij
8ac0ccc227
Merge pull request #14424 from Snuffleupagus/mv-addLinkAttributes
[api-minor] Move `addLinkAttributes`, `LinkTarget`, and `removeNullCharacters` into the viewer (PR 14092 follow-up)
2022-01-08 13:19:11 +01:00
Calixte Denizet
6369617e6f [JS] Fix few errors around AFSpecial_Keystroke
- @cincodenada found some errors which are fixed in this patch;
 - it partially fixes issue #14306;
 - add some tests.
2022-01-08 12:34:56 +01:00
Jonas Jenwald
7b8794b37e [api-minor] Move removeNullCharacters into the viewer
This helper function has never been used in e.g. the worker-thread, hence its placement in `src/shared/util.js` led to a *small* amount of unnecessary duplication.
After the previous patches this helper function is now *only* used in the viewer, hence it no longer seems necessary to expose it through the official API.

*Please note:* It seems somewhat unlikely that third-party users were relying *directly* on this helper function, which is why it's not being exported as part of the viewer components. (If necessary, we can always change this later on.)
2022-01-06 12:25:33 +01:00
Jonas Jenwald
b99927e1ee Improve the API unit-tests for scripting-related functionality
I happened to notice that we didn't have *any* unit-tests for either `getFieldObjects` or `getCalculationOrderIds`, on the `PDFDocumentProxy` class, which seems unfortunate since it's API functionality that we depend on in e.g. the viewer.
2021-12-29 12:57:32 +01:00
Tim van der Meij
e42d54e1b5
Merge pull request #14400 from Snuffleupagus/getPageDict-async
[api-minor] Convert `Catalog.getPageDict` to an asynchronous method
2021-12-28 19:40:34 +01:00
Jonas Jenwald
b513c64d9d [api-minor] Convert Catalog.getPageDict to an asynchronous method
Besides converting `Catalog.getPageDict` to an `async` method, thus simplifying the code, this patch also allows us to pro-actively fix a existing issue.
Note how we're looking up References in such a way that `MissingDataException`s won't cause trouble, however it's *technically possible* that the entries (i.e. /Count, /Kids, and /Type) in a /Pages Dictionary could actually be indirect objects as well. In the existing code this could lead to *some*, or even all, pages failing to load/render as intended.
In practice that doesn't *appear* to happen in real-world PDF documents, but given all the weird things that PDF software do I'd prefer to fix this pro-actively (rather than waiting for a bug report).
With `Catalog.getPageDict` being `async` this is now really simple to address, however I didn't want to introduce a bunch more *unconditional* asynchronicity in this method if it could be avoided (since that could slow things down). Hence we'll *synchronously* lookup the *raw* data in a /Pages Dictionary, and only fallback to asynchronous data lookup when a Reference was encountered.

In addition to the above, this patch also makes the following notable changes:
 - Let `Catalog.getPageDict` *consistently* reject with the actual error, regardless of what data we're fetching. Previously we'd "swallow" the actual errors except when looking up Dictionary entries, which is inconsistent and thus seem unfortunate. As can be seen from the updated unit-tests this change is API-observable, hence why the patch is tagged `[api-minor]`.

 - Improve the consistency of the Dictionary /Type-checks in both the `Catalog.getPageDict` and `Catalog.getAllPageDicts` methods.
   In `Catalog.getPageDict` there's a fallback code-path where we're *incorrectly* checking the /Page Dictionary for a /Contents-entry, which is wrong since a /Page Dictionary doesn't need to have a /Contents-entry in order to be valid.
   For consistency the `Catalog.getAllPageDicts` method is also updated to handle errors in the /Type-lookup correctly.

 - Reduce the `PagesCountLimit.PAUSE_EAGER_PAGE_INIT` viewer constant, to further improve loading/rendering performance of the *second* page during initialization of very long documents; PR 14359 follow-up.
2021-12-25 15:22:48 +01:00
KouWakai
98158b67a3 Handle non-integer Annotation border widths correctly (issue 14203)
The existing code appears to be wrong, since according to the PDF specification the border width of an Annotation only has to be a number and not specifically an integer. Please see:
 - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=392
 - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2096210
 - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G6.1965562
2021-12-24 22:10:19 +09:00
Jonas Jenwald
0a19ef6864 Move the EventBus, and related functionality, into its own file
The size of the `web/ui_utils.js` file has increased over time, as more code has been added to (or moved into) that file. To reduce its size slightly, this patch moves the event-related functionality into a separate file.
2021-12-15 17:18:57 +01:00
Jonas Jenwald
e8562173b8 Prevent an infinite loop when parsing corrupt /CCITTFaxDecode data (issue 14305)
Fixes one of the documents in issue 14305.
2021-12-07 13:57:25 +01:00
Tim van der Meij
335c4c8a43
Merge pull request #14338 from Snuffleupagus/XRef-more-Pages-validation
[api-minor] Clear all caches in `XRef.indexObjects`, and improve /Root dictionary validation in `XRef.parse` (issue 14303)
2021-12-04 13:23:40 +01:00
Jonas Jenwald
40291d1943 Handle errors when fetching the raw /Metadata (issue 14305)
Currently the `Catalog.metadata` getter only handles errors during parsing, however in a *corrupt* PDF document fetching of the raw /Metadata can obviously fail as well.
Without this patch the `PDFDocumentProxy.getMetadata` method, in the API, can thus fail which it *never* should and this will cause the viewer to not initialize all state as expected.

Fixes one of the documents in issue 14305.
2021-12-04 09:41:42 +01:00
Jonas Jenwald
ad3a271fc4 [api-minor] Clear all caches in XRef.indexObjects, and improve /Root dictionary validation in XRef.parse (issue 14303)
*This patch improves handling of a couple of PDF documents from issue 14303.*

 - Update `XRef.indexObjects` to actually clear *all* XRef-caches. Invalid XRef tables *usually* cause issues early enough during parsing that we've not populated the XRef-cache, however to prevent any issues we obviously need to clear that one as well.

 - Improve the /Root dictionary validation in `XRef.parse` (PR 9827 follow-up). In addition to checking that a /Pages entry exists, we'll now also check that it can be successfully fetched *and* that it's of the correct type. There's really no point trying to use a /Root dictionary that e.g. `Catalog.toplevelPagesDict` will reject, and this way we'll be able to fallback to indexing the objects in corrupt documents.

 - Throw an `InvalidPDFException`, rather than a general `FormatError`, in `XRef.parse` when no usable /Root dictionary could be found. That really seems more appropriate overall, since all attempts at parsing/recovery have failed. (This part of the patch is API-observable, hence the tag.)

With these changes, two existing test-cases are improved and the unit-tests are updated/re-factored to highlight that. In particular `GHOSTSCRIPT-698804-1-fuzzed.pdf` will now both load and "render" correctly, whereas `poppler-395-0-fuzzed.pdf` will now fail immediately upon loading (rather than *appearing* to work).
2021-12-03 11:57:38 +01:00
Jonas Jenwald
1fac6371d3 [Regression] Eagerly fetch/parse the entire /Pages-tree in corrupt documents (issue 14303, PR 14311 follow-up)
*Please note:* This is similar to the method that existed prior to PR 3848, but the new method will *only* be used as a fallback when parsing of corrupt PDF documents.

The implementation in PR 14311 unfortunately turned out to be *way* too simplistic, as evident by the recently added test-files in issue 14303, since it may *cause* infinite loops in `PDFDocument.checkLastPage` for some corrupt PDF documents.[1]
To avoid this, the easiest solution that I could come up with was to fallback to eagerly parsing the *entire* /Pages-tree when the /Count-entry validation fails during document initialization.

Fixes *at least* two of the issues listed in issue 14303, namely the `poppler-395-0.pdf...` and `GHOSTSCRIPT-698804-1.pdf...` documents.

---
[1] The whole point of PR 14311 was obviously to *get rid of* infinte loops during document initialization, not to introduce any more of those.
2021-12-02 14:31:04 +01:00
Jonas Jenwald
8ea740c800 Slightly extend the "creates pdf doc from PDF file with bad XRef table" unit-test (PR 14304 follow-up)
Given that we're able to "render" this document, let's extend the unit-test to actually check that we're able to obtain the operatorList; although given the overall issues in the document it'll be empty.
2021-12-02 11:51:40 +01:00
Jonas Jenwald
63be23f05b Handle errors correctly when data lookup fails during /Pages-tree parsing (issue 14303)
This only applies to severely corrupt documents, where it's possible that the `Parser` throws when we try to access e.g. a /Kids-entry in the /Pages-tree.

Fixes two of the issues listed in issue 14303, namely the `poppler-742-0.pdf...` and `poppler-937-0.pdf...` documents.
2021-12-02 10:54:40 +01:00
Jonas Jenwald
a807ffe907 Prevent circular references in XRef tables from hanging the worker-thread (issue 14303)
*Please note:* While this patch on its own is sufficient to prevent the worker-thread from hanging, however in combination with PR 14311 these PDF documents will both load *and* render correctly.

Rather than focusing on the particular structure of these PDF documents, it seemed (at least to me) to make sense to try and prevent all circular references when fetching/looking-up data using the XRef table.
To avoid a solution that required tracking the references manually everywhere, the implementation settled on here instead handles that internally in the `XRef.fetch`-method. This should work, since that method *and* the `Parser`/`Lexer`-implementations are completely synchronous.

Note also that the existing `XRef`-caching, used for all data-types *except* Streams, should hopefully help to lessen the performance impact of these changes.
One *potential* problem with these changes could be certain *browser* exceptions, since those are generally not catchable in JavaScript code, however those would most likely "stop" worker-thread parsing anyway (at least I hope so).

Finally, note that I settled on returning dummy-data rather than throwing an exception. This was done to allow parsing, for the rest of the document, to continue such that *one* bad reference doesn't prevent an entire document from loading.

Fixes two of the issues listed in issue 14303, namely the `poppler-91414-0.zip-2.gz-53.pdf` and `poppler-91414-0.zip-2.gz-54.pdf` documents.
2021-11-27 23:50:26 +01:00
Jonas Jenwald
d0c4bbd828 [api-minor] Validate the /Pages-tree /Count entry during document initialization (issue 14303)
*This patch basically extends the approach from PR 10392, by also checking the last page.*

Currently, in e.g. the `Catalog.numPages`-getter, we're simply assuming that if the /Pages-tree has an *integer* /Count entry it must also be correct/valid.
As can be seen in the referenced PDF documents, that entry may be completely bogus which causes general parsing to breaking down elsewhere in the worker-thread (and hanging the browser).

Rather than hoping that the /Count entry is correct, similar to all other data found in PDF documents, we obviously need to validate it. This turns out to be a little less straightforward than one would like, since the only way to do this (as far as I know) is to parse the *entire* /Pages-tree and essentially counting the pages.
To avoid doing that for all documents, this patch tries to take a short-cut by checking if the last page (based on the /Count entry) can be successfully fetched. If so, we assume that the /Count entry is correct and use it as-is, otherwise we'll iterate through (potentially) the *entire* /Pages-tree to determine the number of pages.

Unfortunately these changes will have a number of *somewhat* negative side-effects, please see a possibly incomplete list below, however I cannot see a better way to address this bug.
 - This will slow down initial loading/rendering of all documents, at least by some amount, since we now need to fetch/parse more of the /Pages-tree in order to be able to access the *last* page of the PDF documents.
 - For poorly generated PDF documents, where the entire /Pages-tree only has *one* level, we'll unfortunately need to fetch/parse the *entire* /Pages-tree to get to the last page. While there's a cache to help reduce repeated data lookups, this will affect initial loading/rendering of *some* long PDF documents,
 - This will affect the `disableAutoFetch = true` mode negatively, since we now need to fetch/parse more data during document initialization. While the `disableAutoFetch = true` mode should still be helpful in larger/longer PDF documents, for smaller ones the effect/usefulness may unfortunately be lost.

As one *small* additional bonus, we should now also be able to support opening PDF documents where the /Pages-tree /Count entry is completely invalid (e.g. contains a non-integer value).

Fixes two of the issues listed in issue 14303, namely the `poppler-67295-0.pdf` and `poppler-85140-0.pdf` documents.
2021-11-27 21:57:35 +01:00
Jonas Jenwald
ca8d2bdce4 Abort parsing when the XRef /W-array contain bogus entries (issue 14303)
For this particular PDF document, we have `/W [1 2 166666666666666666666666666]` which obviously makes no sense.

While this patch makes no attempt at actually validating the entries in the /W-array, we'll now simply abort all processing when the end of the PDF document has been reached (thus preventing hanging the browser).
Please note that this patch doesn't enable the PDF document to be loaded/rendered, but at least it fails "correctly" now.

Fixes one of the issues listed in issue 14303, namely the `REDHAT-1531897-0.pdf`document.
2021-11-25 18:35:08 +01:00
Jonas Jenwald
ae4f1ae3e7 Ensure that ChunkedStream won't attempt to request data *beyond* the document size (issue 14303)
This bug was surprisingly difficult to track down, since it didn't just depend on range-requests being used but also on how quickly the document was loaded. To even be able to reproduce this locally, I had to use a very small `rangeChunkSize`-value (note the unit-test).

The cause of this bug is a bogus entry in the XRef-table, causing us to attempt to request data from *beyond* the actual document size and thus getting into an infinite loop.

Fixes *one* of the issues listed in issue 14303, namely the `PDFBOX-4352-0.pdf` document.
2021-11-24 19:19:43 +01:00
Jonas Jenwald
0ebac67a9f Remove the {BaseViewer, PDFThumbnailViewer}._pagesRequests caches
In the `BaseViewer` this cache is mostly relevant in the `disableAutoFetch = true` mode, since the pages are being initialized *lazily* in that case.
In the `PDFThumbnailViewer` this cache is mostly used for thumbnails that are actually being rendered, as opposed to those created directly from the "regular" pages.

Please note that I'm not suggesting that we remove these caches because they're only used in some situations, but rather because they're for all intents and purposes actually *redundant*. In the API itself, we're already caching both the page-promises and the actual pages themselves on the `WorkerTransport`-instance.
Hence these viewer-caches aren't really necessary in practice, and adds what to me mostly seems like an unnecessary level of indirection.[1]

Given that the viewer now relies on caching in the API itself, this patch also adds a new unit-test to ensure that page-caching works (and keep working) as expected.

---
[1] In the `WorkerTransport.getPage`-method the parameter is being validated on every call, but that's hardly enough code to warrant keeping the "duplicate" caches in the viewer in my opinion.
2021-11-21 11:40:45 +01:00