pdf.js/src
Jonas Jenwald 6da0944fc7 [api-minor] Replace PDFDocumentProxy.getStats with a synchronous PDFDocumentProxy.stats getter
*Please note:* These changes will primarily benefit longer documents, somewhat at the expense of e.g. one-page documents.

The existing `PDFDocumentProxy.getStats` function, which in the default viewer is called for each rendered page, requires a round-trip to the worker-thread in order to obtain the current document stats. In the default viewer, we currently make one such API-call for *every rendered* page.
This patch proposes replacing that method with a *synchronous* `PDFDocumentProxy.stats` getter instead, combined with re-factoring the worker-thread code by adding a `DocStats`-class to track Stream/Font-types and *only send* them to the main-thread *the first time* that a type is encountered.

Note that in practice most PDF documents only use a fairly limited number of Stream/Font-types, which means that in longer documents most of the `PDFDocumentProxy.getStats`-calls will return the same data.[1]
This re-factoring will obviously benefit longer document the most[2], and could actually be seen as a regression for one-page documents, since in practice there'll usually be a couple of "DocStats" messages sent during the parsing of the first page. However, if the user zooms/rotates the document (which causes re-rendering), note that even a one-page document would start to benefit from these changes.

Another benefit of having the data available/cached in the API is that unless the document stats change during parsing, repeated `PDFDocumentProxy.stats`-calls will return *the same identical* object.
This is something that we can easily take advantage of in the default viewer, by now *only* reporting "documentStats" telemetry[3] when the data actually have changed rather than once per rendered page (again beneficial in longer documents).

---
[1] Furthermore, the maximium number of `StreamType`/`FontType` are `10` respectively `12`, which means that regardless of the complexity and page count in a PDF document there'll never be more than twenty-two "DocStats" messages sent; see 41ac3f0c07/src/shared/util.js (L206-L232)

[2] One example is the `pdf.pdf` document in the test-suite, where rendering all of its 1310 pages only result in a total of seven "DocStats" messages being sent from the worker-thread.

[3] Reporting telemetry, in Firefox, includes using `JSON.stringify` on the data and then sending an event to the `PdfStreamConverter.jsm`-code.
In that code the event is handled and `JSON.parse` is used to retrieve the data, and in the "documentStats"-case we'll then iterate through the data to avoid double-reporting telemetry; see https://searchfox.org/mozilla-central/rev/8f4c180b87e52f3345ef8a3432d6e54bd1eb18dc/toolkit/components/pdfjs/content/PdfStreamConverter.jsm#515-549
2021-11-20 12:20:55 +01:00
..
core [api-minor] Replace PDFDocumentProxy.getStats with a synchronous PDFDocumentProxy.stats getter 2021-11-20 12:20:55 +01:00
display [api-minor] Replace PDFDocumentProxy.getStats with a synchronous PDFDocumentProxy.stats getter 2021-11-20 12:20:55 +01:00
images Vectorize the logo. 2012-10-29 14:08:52 -04:00
scripting_api JS - Avoid a popup to ask for specific version of Acrobat 2021-10-29 23:09:59 +10:00
shared [api-minor] Only use Workers when postMessage transfers are supported (PR 11123 follow-up) 2021-11-19 16:47:58 +01:00
doc_helper.js [api-major] Completely remove the global PDFJS object 2018-03-01 18:13:27 +01:00
interfaces.js Use ESLint to ensure that exports are sorted alphabetically 2021-01-09 20:37:51 +01:00
license_header_libre.js Update the year in the license_header files 2021-02-11 17:52:26 +01:00
license_header.js Update the year in the license_header files 2021-02-11 17:52:26 +01:00
pdf.image_decoders.js Fix the Jbig2Image export for the gulp image_decoders build (PR 9729 follow-up, issue 13367) 2021-05-12 19:41:29 +02:00
pdf.js Try to expose more API-functionality in the TypeScript definitions 2021-09-13 13:57:56 +02:00
pdf.sandbox.external.js [JS] Fix several issues found in pdf in #13269 2021-05-04 19:21:51 +02:00
pdf.sandbox.js Clean-up usage of the TESTING-define in src/pdf.sandbox.js 2021-05-11 12:39:33 +02:00
pdf.scripting.js Tweak the pdf.scripting.js bundling, to improve overall consistency 2020-10-25 16:36:56 +01:00
pdf.worker.entry.js Update the year in the license_header files 2021-02-11 17:52:26 +01:00
pdf.worker.js Convert the src/pdf.js and src/pdf.worker.js files to use standard import/export statements 2020-05-20 13:18:23 +02:00
worker_loader.js Update Prettier to version 2.0 2020-04-14 12:28:14 +02:00