Commit Graph

543 Commits

Author SHA1 Message Date
calixteman
8c53bf8647
Merge pull request #13437 from calixteman/xfa_mv_root
XFA - Move the fake HTML representation of XFA from the worker to the main thread
2021-05-31 10:14:15 +02:00
Jonas Jenwald
d8a7c75b4a Revert "Add deprecated handling of the now removed AnnotationStorage API-parameters" (PR 13207 follow-up)
This reverts commit 737a8e846d, since it's included in the latest beta version `2.9.359`.
2021-05-30 16:38:33 +02:00
Jonas Jenwald
70c79c6f69 Fix the JSDocs for PDFDocumentProxy.getPageIndex (issue 13449) 2021-05-27 16:41:08 +02:00
Calixte Denizet
45c3f00a27 XFA - Move the fake HTML representation of XFA from the worker to the main thread
- the only goal of this patch is to be able to get synchronously the fake html when printing from firefox:
    - in order to print we need to inject some html in beforeprint callback but we cannot block in waiting for all the pages.
  - from a memory point of view: it doesn't change anything since the fake HTML is deleted in the worker;
  - this way we don't break any assumptions.
2021-05-25 19:33:07 +02:00
Jonas Jenwald
0dba468e60 Don't allow the LoopbackPort to "clone" a URL
Note that `URL`s aren't supported by the structured clone algorithm, see https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm, and any attempt to send a `URL` using `postMessage` is rejected by the browser. Hence, for consistency when workers are disabled, the `LoopbackPort` should obviously also reject any `URL`s.
2021-05-22 10:11:31 +02:00
Jonas Jenwald
8d5689387b Improve handling of named destinations in out-of-order NameTrees (PR 10274 follow-up)
According to the specification, see https://web.archive.org/web/20210404042322if_/https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.2384179, the keys of a NameTree/NumberTree should be ordered.
For corrupt PDF files, which violate this assumption, it's thus possible that trying to lookup a single entry fails.

Previously, in PR 10274, we implemented a fallback that only applies to the "bottom" node of a NameTree/NumberTree, which in general might not actually help for sufficiently corrupt NameTree/NumberTree data.
Instead we remove the current *limited* fallback from `NameOrNumberTree.get`, and defer to the call-site to handle this case explicitly e.g. by using `NameOrNumberTree.getAll` for data where that makes sense. For well-formed documents, these changes should *not* lead to any additional data fetching/parsing.

Finally, as part of these changes, the validation of named destination data is improved in the `Catalog` and a new unit-test is also added.
2021-05-21 15:48:37 +02:00
Jonas Jenwald
8943bcd3c3 Account for formatting changes in Prettier version 2.3.0
With the exception of one tweaked `eslint-disable` comment, in `web/generic_scripting.js`, this patch was generated automatically using `gulp lint --fix`.

Please find additional information at:
 - https://github.com/prettier/prettier/releases/tag/2.3.0
 - https://prettier.io/blog/2021/05/09/2.3.0.html
2021-05-16 11:44:05 +02:00
Jonas Jenwald
2ba4b65ca8 [api-minor] Remove the WebGL implementation
Reasons for the removal include:
 - This functionality was always somewhat experimental and has never been enabled by default, partly because of worries about rendering bugs caused by e.g. bad/outdated graphics drivers.

 - After the initial implementation, in PR 4286 (back in 2014), no additional functionality has been added to the WebGL implementation.

 - The vast majority of all documents do not benefit from WebGL rendering, since only a couple of *specific* features are supported (e.g. some Soft Masks and Patterns).

 - There is, and has always been, *zero* test-coverage for the WebGL implementation.

 - Overall performance, in the PDF.js library, has improved since the experimental WebGL implementation was added.

Rather than shipping unused *and* untested code, it seems reasonable to simply remove the WebGL implementation for now; thanks to version control it's always possible to bring back the code should the need ever arise.
2021-05-09 16:38:44 +02:00
Jonas Jenwald
da22146b95 Replace a bunch of Array.prototype.forEach() cases with for...of loops instead
Using `for...of` is a modern and generally much nicer pattern, since it gets rid of unnecessary callback-functions. (In a couple of spots, a "regular" `for` loop had to be used.)
2021-04-24 13:00:19 +02:00
Jonas Jenwald
5adee0cdd1 [api-minor] Let PDFPageProxy.getStructTree return null, rather than an empty structTree, for documents without any accessibility data (PR 13171 follow-up)
This is first of all consistent with existing API-methods, where we return `null` when the data in question doesn't exist. Secondly, it should also be (slightly) more efficient since there's less dummy-data that we need to transfer between threads.
Finally, this prevents us from adding an empty/unnecessary span to *every* single page even in documents without any structure tree data.
2021-04-11 12:35:33 +02:00
Tim van der Meij
03c8c89002
Merge pull request #13171 from brendandahl/struct-tree
[api-minor] Add support for basic structure tree for accessibility.
2021-04-09 21:32:44 +02:00
Brendan Dahl
fc9501a637 Add support for basic structure tree for accessibility.
When a PDF is "marked" we now generate a separate DOM that represents
the structure tree from the PDF.  This DOM is inserted into the <canvas>
element and allows screen readers to walk the tree and have more
information about headings, images, links, etc. To link the structure
tree DOM (which is empty) to the text layer aria-owns is used. This
required modifying the text layer creation so that marked items are
now tracked.
2021-04-09 09:56:28 -07:00
Jonas Jenwald
737a8e846d Add deprecated handling of the now removed AnnotationStorage API-parameters
These changes are done separately, to make it easier to remove them in the future.
2021-04-09 13:25:03 +02:00
Jonas Jenwald
72ef183085 [api-minor] Remove the manual passing of an AnnotationStorage-instance when calling various API-method
Note how we purposely don't expose the `AnnotationStorage`-class directly in the official API (see `src/pdf.js`), since trying to use *multiple* ones simultaneously doesn't really make sense (e.g. in the viewer).
Instead we lazily initialize, and cache, just *one* instance via `PDFDocumentProxy.annotationStorage` which should thus be available internally in the API itself without having to be manually passed to various methods.

To support these changes, the `AnnotationStorage`-instance initialization is moved into the `WorkerTransport`-class to allow both `PDFDocumentProxy` and `PDFPageProxy` to access it.
This patch implements the following simplifications:
 - Remove the `annotationStorage`-parameter from `PDFDocumentProxy.saveDocument`, since it's already available internally.
   Furthermore, while it's currently possible to call that method without an `AnnotationStorage`-instance, that really does *not* make any sense at all. In this case you're effectively reducing `PDFDocumentProxy.saveDocument` to a "regular" `PDFDocumentProxy.getData` call, but with *a lot* more overhead, which was obviously not the intention of the `PDFDocumentProxy.saveDocument`-method.

 - Try to discourage third-party users from calling `PDFDocumentProxy.saveDocument` unconditionally, as a replacement for `PDFDocumentProxy.getData` (note the previous point).

 - Replace the `annotationStorage`-parameter, in `PDFPageProxy.render`, with a boolean `includeAnnotationStorage`-parameter which simply indicates if the (internally available) `AnnotationStorage`-instance should be used during rendering (e.g. for printing).

 - By removing the need to *manually* provide `annotationStorage`-parameters to various API-methods, using the API should become simpler (e.g. for third-parties) since you no longer need to worry about manually fetching and passing around this data.
2021-04-09 13:24:25 +02:00
Tim van der Meij
228adbf673
Merge pull request #13172 from Snuffleupagus/cleanup-keepFonts
[api-minor] Add an option, in `PDFDocumentProxy.cleanup`, to allow fonts to remain attached to the DOM
2021-04-05 14:21:34 +02:00
Jonas Jenwald
16fd838f52 Convert the renderTasks, used in PDFPageProxy.render/PDFPageProxy.getOperatorList, to a Set
When removing tasks we're currently forced to *indirectly* iterate through the array, which can be avoided by using a Set instead.
Furthermore, we can also (slightly) modernize the code responsible for initializing the `renderTasks`.
2021-04-05 10:51:28 +02:00
Jonas Jenwald
a2bc6481a0 [api-minor] Add an option, in PDFDocumentProxy.cleanup, to allow fonts to remain attached to the DOM
As mentioned in the JSDoc comment, this should not be used unless you know what you're doing, since it will lead to increased memory usage. However, in some situations (e.g. SVG-rendering), we still want to be able to run general clean-up on both the main/worker-thread while keeping loaded fonts attached to the DOM.[1]

As part of these changes, `WorkerTransport.startCleanup` is converted to an async method and we'll also skip clean-up when destruction has started (since it's redundant).

---
[1] The SVG-rendering mode is obviously not officially supported, since it's both rather incomplete and inherently slower. However with recent changes, whereby we cache repeated images on the document rather than the page level, memory usage can be *a lot* worse than before if we never attempt to release e.g. cached image-data when the viewer is in SVG-rendering mode.
2021-04-02 12:32:31 +02:00
Jonas Jenwald
48ff20493f Mark some internal PDFDocumentProxy-properties as "private"
These two properties were *never* intended to be anything but "private", hence it really cannot hurt to actually indicate that they're *not* part of any official API.
2021-04-02 12:26:32 +02:00
Jonas Jenwald
db1e1612df [api-minor] Support proper URL-objects, in addition to URL-strings, in getDocument
Currently only URL-strings are officially supported by `getDocument`, however at this point in time I cannot really see any compelling reason to not support `URL`-objects as well.

Most likely the reason that we've don't already support `URL`-objects, in `getDocument`, is that historically `URL` wasn't fully implemented across browsers and our old polyfill wasn't perfect; see https://developer.mozilla.org/en-US/docs/Web/API/URL/URL#browser_compatibility

*Please note:* Because of how the `url` parameter is currently handled, there's actually *some* cases where passing a `URL`-object to `getDocument` already works. That, in my opinion, provides additional motivation for supporting `URL`-objects officially, since it makes the API more consistent.

The following is an attempt to summarize the *current* situation, based on the actual code rather than the JSDocs:
 - `getDocument("url string")` works and is documented.[1]
 - `getDocument({ url: "url string", })` works and is documented.[1]
 - `getDocument(new URL(...))` throws immediately, since no supported parameters are found.
 - `getDocument({ url: new URL(...), })` actually works even though it's not documented.[1] Originally, when data was fetched on the worker-thread, this would likely have thrown since `URL` isn't clonable.[2]
 - `getDocument({ url: { abc: 123, }, })`, or some similarily meaningless input, will be "accepted" by `getDocument` and then throw a `MissingPDFException` when attempting to fetch the bogus data.

With the changes in this patch, not only is `URL`-objects now officially supported and documented when calling `getDocument`, but we'll also do a much better job at actually validating any URL-data passed to `getDocument` (and instead fail early).

---
[1] In *browsers*, we create a valid URL thus indirectly validating the input. In Node.js environments, on the other hand, no validation is done since obtaining a baseUrl is more difficult (and PDF.js is primarily written for browsers anyway).

[2] https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm#supported_types
2021-03-31 16:21:41 +02:00
Jonas Jenwald
27add0f1f3 Re-factor the source parsing, in getDocument, to use switch rather than if...else
Given the number of parameters that we now need to parse here, this code is no longer as readable as one would like. Hence this re-factoring, which will improve overall readability and also help with the next patch.
2021-03-31 16:21:37 +02:00
Jonas Jenwald
9c6770748c Move the PDFDocumentStats typedef closer to its usage
Currently this typedef appears slightly out-of-place, in the middle of the arguably much more important `getDocument` JSDocs.
2021-03-31 16:21:22 +02:00
Tim van der Meij
8269ddbd16
Merge pull request #13105 from Snuffleupagus/BasePdfManager-parseDocBaseUrl
Improve memory usage around the `BasePdfManager.docBaseUrl` parameter (PR 7689 follow-up)
2021-03-19 23:03:20 +01:00
Jonas Jenwald
57e7557235
Actually reset the PDFPageProxy._xfaPromise property as intended (PR 13069 follow-up) (#13119)
Similar to the existing `annotationsPromise` and `_jsActionsPromise` properties, the new `_xfaPromise` should obviously also be reset, since otherwise you might end up holding onto a lot of data for pages that are no longer active.

(That caching wasn't present in the original version of PR 13069, which is why I didn't spot it until now.)
2021-03-19 11:31:54 +01:00
calixteman
24e598a895
XFA - Add a layer to display XFA forms (#13069)
- add an option to enable XFA rendering if any;
  - for now, let the canvas layer: it could be useful to implement XFAF forms (embedded pdf in xml stream for the background and xfa form for the foreground);
  - ui elements in template DOM are pretty close to their html counterpart so we generate a fake html DOM from template one:
    - it makes easier to translate template properties to html ones;
    - it makes faster the creation of the html element in the main thread.
2021-03-19 10:11:40 +01:00
Jonas Jenwald
c4c7216171 Improve memory usage around the BasePdfManager.docBaseUrl parameter (PR 7689 follow-up)
While there is nothing *outright* wrong with the existing implementation, it can however lead to increased memory usage in one particular case (that I completely overlooked when implementing this):
For "data:"-URLs, which by definition contains the entire PDF document and can thus be arbitrarily large, we obviously want to avoid sending, storing, and/or logging the "raw" docBaseUrl in that case.

To address this, this patch makes the following changes:
 - Ignore any non-string in the `docBaseUrl` option passed to `getDocument`, since those are unsupported anyway, already on the main-thread.

 - Ignore "data:"-URLs in the `docBaseUrl` option passed to `getDocument`, to avoid having to send what could potentially be a *very* long string to the worker-thread.

 - Parse the `docBaseUrl` option *directly* in the `BasePdfManager`-constructors, on the worker-thread, to avoid having to store the "raw" docBaseUrl in the first place.
2021-03-17 15:48:24 +01:00
Jonas Jenwald
50681d71c8 Ensure that getDocument handles Node.js Buffers more gracefully (issue 13075)
While the JSDocs have never advertised `getDocument` as supporting Node.js `Buffer`s, that apparently doesn't stop users from passing such data structures to `getDocument`.
In theory the existing `instanceof Uint8Array` check ought to have caught Node.js `Buffer`s, however for reasons that I don't even pretend to understand that check actually passes. Hence this patch which, *only* in Node.js environments, will special-case `Buffer`s to hopefully provide a slightly better out-of-the-box behaviour in Node.js environments[1].

---
[1] Although I'm not sure that we necessarily want to advertise this in the JSDocs, given the specialized use-case.
2021-03-13 10:52:38 +01:00
Jonas Jenwald
6fd899dc44 [api-minor] Support the Content-Disposition filename in the Firefox PDF Viewer (bug 1694556, PR 9379 follow-up)
As can be seen [in the mozilla-central code](https://searchfox.org/mozilla-central/rev/a6db3bd67367aa9ddd9505690cab09b47e65a762/toolkit/components/pdfjs/content/PdfStreamConverter.jsm#1222-1225), we're already getting the Content-Disposition filename. However, that data isn't passed through to the viewer nor to the `PDFDataTransportStream`-implementation, which explains why it's currently being ignored.

*Please note:* This will also require a small mozilla-central patch, see https://bugzilla.mozilla.org/show_bug.cgi?id=1694556, to forward the necessary data to the viewer.
2021-02-26 10:50:29 +01:00
Jonas Jenwald
d69cf702f3 Add a this-bound method for InternalRenderTask.cancel
This is similar to the other methods, and the only reason for this not having been done originally is that the `cancel` functionality is a later addition.
2021-02-20 14:47:57 +01:00
Jonas Jenwald
e9038cc3d1 Send the AnnotationStorage-data to the worker-thread as a Map
Rather than converting the `AnnotationStorage`-data to an Object, before sending it to the worker-thread, we should be able to simply send the internal `Map` directly.
The "structured clone algorithm" doesn't have a problem with `Map`s, however the `LoopbackPort` used when workers are *disabled* (e.g. in Node.js environments) didn't use to support them. With PR 12997 having lifted that restriction, we should now be able to simply send the `AnnotationStorage`-data as-is rather than having to iterate through it to first create an Object.

*Please note:* The changes in `src/core/annotation.js` could have been a lot more compact if we were able to use optional chaining in the `src/core` folder. Unfortunately that's still not possible, since SystemJS is being used in the development viewer (i.g. `gulp server`) and fixing that is *still* blocked by [bug 1247687](https://bugzilla.mozilla.org/show_bug.cgi?id=1247687).
2021-02-18 17:13:43 +01:00
Tim van der Meij
4619b1b568
Merge pull request #12997 from Snuffleupagus/metadata-worker
Move the Metadata parsing to the worker-thread
2021-02-17 20:57:46 +01:00
Jonas Jenwald
3398070e26 [api-minor] Remove support for synchronous event dispatching in LoopbackPort
*Please note:* The `defer` parameter has been enabled by default ever since PR 9777 (in 2018), which first shipped in PDF.js release `2.0.943`.
With workers *disabled*, e.g. in Node.js environments, this has been used ever since without any problems reported[1].

The impetus for this change was that I happened to notice that *if* the `LoopbackPort` was used with synchronous event dispatching, we'd simply send that data as-is to the listeners. This created an inconsistency in the data returned from the `pdf.worker.js` file, since `postMessage` used with *actual* workers (or the `LoopbackPort` with `defer = true`) will ignore/throw when encountering unclonable data.
Originally my intention was simply to just call `cloneValue` regardless of the event dispatching used in `LoopbackPort`, however looking at the use-cases (or lack thereof) of the `LoopbackPort` it seemed reasonable to simply remove the `defer` parameter instead.

This patch is tagged "[api-minor]" since the `LoopbackPort` is still exposed in the API, although I really hope that no third-party is using this (since disabling workers leads to bad performance).

Finally, this patch changes a `forEach` loop to `for...of` and makes uses of optional changing in existing code.

---
[1] As evident by the `npm test` command run by Github Actions, and previously by Travis.
2021-02-17 16:12:29 +01:00
Jonas Jenwald
73bf45e64b Support Map and Set, with postMessage, when workers are disabled
The `LoopbackPort` currently doesn't support `Map` and `Set`, which it should since the "structured clone algorithm" used in browsers does support both of them; please see https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm#supported_types
2021-02-17 13:11:59 +01:00
Jonas Jenwald
298ee5cfbb Replace some ternary operators with optional chaining, and nullish coalescing, in the src/display/-folder
This way, we can further reduce unnecessary code-repetition in some cases.
2021-01-19 17:20:02 +01:00
Jonas Jenwald
13742eb82d Inlude the JS actions for the page when dispatching the "pageopen"-event in the BaseViewer
Note first of all how the `PDFDocumentProxy.getJSActions` method in the API caches the result, which makes repeated lookups cheap enough to not really be an issue.
Secondly, with the previous patch, we're now only dispatching "pageopen"/"pageclose"-events when there's actually a sandbox that listens for them.

All-in-all, with these changes we can thus simplify the default-viewer "pageopen"-event handler a fair bit.
2021-01-12 20:28:50 +01:00
Jonas Jenwald
81525fd446 Use ESLint to ensure that exports are sorted alphabetically
There's built-in ESLint rule, see `sort-imports`, to ensure that all `import`-statements are sorted alphabetically, since that often helps with readability.
Unfortunately there's no corresponding rule to sort `export`-statements alphabetically, however there's an ESLint plugin which does this; please see https://www.npmjs.com/package/eslint-plugin-sort-exports

The only downside here is that it's not automatically fixable, but the re-ordering is a one-time "cost" and the plugin will help maintain a *consistent* ordering of `export`-statements in the future.
*Note:* To reduce the possibility of introducing any errors here, the re-ordering was done by simply selecting the relevant lines and then using the built-in sort-functionality of my editor.
2021-01-09 20:37:51 +01:00
Jonas Jenwald
941b65f683 Remove unncessary CanvasFactory/CMapReaderFactory/FileReaderFactory duplication in unit-tests
Given that the API will now, after PR 12039, automatically pick the correct factories to use depending on the environment (browser vs. Node.js), we can utilize that in the unit-tests as well. This way we don't have to manually repeat the same initialization code in *multiple* unit-tests.
*Note:* The *official* PDF.js API is defined in `src/pdf.js`, hence the new exports in `src/display/api.js` will not affect that.

Also, updates the unit-test `FileReaderFactory` helpers similarily.

*Drive-by change:* Fix the `CMapReaderFactory` usage in the annotation unit-tests, since the cache should only contain raw data and not a Promise. While this obviously works as-is, having unit-tests that "abuse" the intended data format can easily lead to unnecessary failures if changes are made to the relevant `src/core/` code.
2021-01-08 17:33:59 +01:00
Calixte Denizet
6523f8880b JS -- Plug PageOpen and PageClose actions 2021-01-06 13:31:15 +01:00
Jonas Jenwald
f9530e56da Run AnnotationStorage.resetModified when destroying the PDFDocumentLoadingTask/PDFDocumentProxy
This will, in a very simple way using the existing events, thus allow the viewer to remove the "beforeunload" `window` event listener when the document is closed.
Generally speaking we want to avoid having *global* event listeners for the PDF document instance, which is why the `EventBus` exists, and instead reserve global events for the viewer itself. However, the `AnnotationStorage` "beforeunload" event unfortunately needs to be document-specific and we should thus ensure that it's correctly removed when the document is destroyed.
2020-12-19 14:05:31 +01:00
Calixte Denizet
1e2173f038 JS - Collect and execute actions at doc and pages level
* the goal is to execute actions like Open or OpenAction
 * can be tested with issue6106.pdf (auto-print)
 * once #12701 is merged, we can add page actions
2020-12-18 20:03:59 +01:00
Jonas Jenwald
01d12b465c [api-minor] Add "contentLength" to the information returned by the getMetadata method
Given that we already include the "Content-Disposition"-header filename, when it exists, it shouldn't hurt to also include the information from the "Content-Length"-header.
For PDF documents opened via a URL, which should be a very common way for the PDF.js library to be used, this will[1] thus provide a way of getting the PDF filesize without having to wait for the `getDownloadInfo`-promise to resolve[2].

With these API improvements, we can also simplify the filesize handling in the `PDFDocumentProperties` class.

---
[1] Assuming that the server is correctly configured, of course.

[2] Since that's not *guaranteed* to happen in general, with e.g. `disableAutoFetch = true` set.
2020-11-20 15:30:36 +01:00
Jonas Jenwald
de628cec59 Some hasJSActions, and general annotation-code, related cleanup in the viewer and API
- Add support for logical assignment operators, i.e. `&&=`, `||=`, and `??=`, with a Babel-plugin. Given that these required incrementing the ECMAScript version in the ESLint and Acorn configurations, and that platform/browser support is still fairly limited, always transpiling them seems appropriate for now.

 - Cache the `hasJSActions` promise in the API, similar to the existing `getAnnotations` caching. With this implemented, the lookup should now be cheap enough that it can be called unconditionally in the viewer.

 - Slightly improve cleanup of resources when destroying the `WorkerTransport`.

 - Remove the `annotationStorage`-property from the `PDFPageView` constructor, since it's not necessary and also brings it more inline with the `BaseViewer`.

 - Update the `BaseViewer.createAnnotationLayerBuilder` method to actaually agree with the `IPDFAnnotationLayerFactory` interface.[1]

 - Slightly tweak a couple of JSDoc comments.

---
[1] We probably ought to re-factor both the `IPDFTextLayerFactory` and `IPDFAnnotationLayerFactory` interfaces to take parameter objects instead, since especially the `IPDFAnnotationLayerFactory` one is becoming quite unwieldy. Given that that would likely be a breaking change for any custom viewer-components implementation, this probably requires careful deprecation.
2020-11-14 13:58:35 +01:00
Calixte Denizet
a5279897a7 JS -- Add listener for sandbox events only if there are some actions
* When no actions then set it to null instead of empty object
* Even if a field has no actions, it needs to listen to events from the sandbox in order to be updated if an action changes something in it.
2020-11-09 18:37:59 +01:00
Jonas Jenwald
1dad255784 Convert files in the src/display/-folder to use optional chaining where possible
By using optional chaining, see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Optional_chaining, it's possible to reduce unnecessary code-repetition in many cases.
Note that these changes also reduce the size of the *built* `pdf.js` file, when `SKIP_BABEL == true` is set, and for the `MOZCENTRAL` build-target that result in a `0.1%` filesize reduction from a simple and mostly mechanical code change.
2020-11-07 13:22:06 +01:00
Tim van der Meij
e341e6e542
Merge pull request #12525 from brendandahl/mark-info
[api-minor] Implement API to get MarkInfo from the catalog.
2020-10-31 00:05:19 +01:00
Brendan Dahl
f5c821e9c3 [api-minor] Implement API to get MarkInfo from the catalog. 2020-10-30 10:59:45 -07:00
Jonas Jenwald
c293fc2b8f Add (some) optional chaining usage in src/display/api.js
Since we no longer use SystemJS to load the unit-tests, there's now nothing that prevents us from using optional chaining and nullish coalescing in the `src/display/` directory.
2020-10-26 11:11:48 +01:00
Jonas Jenwald
d9084c0be2 Load the fake worker, in non-PRODUCTION mode, with native async import
This removes the last SystemJS usage from both the API and the default viewer.
2020-10-26 11:11:48 +01:00
Calixte Denizet
c30a3a94f0 JS - Add a function in api to get the fields ids in AcroForm::CO 2020-10-17 12:56:40 +02:00
Jonas Jenwald
3351d3476d Don't store complex data in PDFDocument.formInfo, and replace the fields object with a hasFields boolean instead
*This patch is based on a couple of smaller things that I noticed when working on PR 12479.*

 - Don't store the /Fields on the `formInfo` getter, since that feels like overloading it with unintended (and too complex) data, and utilize a `hasFields` boolean instead.
   This functionality was originally added in PR 12271, to help determine what kind of form data a PDF document contains, and I think that we should ensure that the return value of `formInfo` only consists of "simple" data.
   With these changes the `fieldObjects` getter instead has to look-up the /Fields manually, however that shouldn't be a problem since the access is guarded by a `formInfo.hasFields` check which ensures that the data both exists and is valid. Furthermore, most documents doesn't even have any /AcroForm data anyway.

 - Determine the `hasFields` property *first*, to ensure that it's always correct even if there's errors when checking e.g. the /XFA or /SigFlags entires, since the `fieldObjects` getter depends on it.

 - Simplify a loop in `fieldObjects`, since the object being accessed is a `Map` and those have built-in iteration support.

 - Use a higher logging level for errors in the `formInfo` getter, and include the actual error message, since that'd have helped with fixing PR 12479 a lot quicker.

 - Update the JSDoc comment in `src/display/api.js` to list the return values correctly, and also slightly extend/improve the description.
2020-10-16 12:47:27 +02:00
Calixte Denizet
71ecc3129b Add the possibility to collect Javascript actions 2020-10-14 10:44:16 +02:00
Jonas Jenwald
2a8983d76b Enable the ESLint no-var rule in the src/display/ folder
Previously this rule has been enabled in the `web/` folder, and in select files in the `src/` sub-folders.
Note that a number of the files in the `src/display/` folder were already enforcing the `no-var` rule, and thanks to Prettier the necessary re-writing will be (mostly) handled automatically.

Please find additional details about the ESLint rule at https://eslint.org/docs/rules/no-var
2020-10-02 16:16:23 +02:00
Jonas Jenwald
2393443e73 Include the /Order array, if available, when parsing the Optional Content configuration
The `/Order` array is used to improve the display of Optional Content groups in PDF viewers, and it allows a PDF document to e.g. specify that Optional Content groups should be displayed as a (collapsable) tree-structure rather than as just a list.

Note that not all available Optional Content groups must be present in the `/Order` array, and PDF viewers will often (by default) hide those toggles in the UI.
To allow us to improve the UX around toggling of Optional Content groups, in the default viewer, these hidden-by-default groups are thus appended to the parsed `/Order` array under a *custom* nesting level (with `name == null`).

Finally, the patch also slightly tweaks an `OptionalContentConfig` related JSDoc-comment in the API.
2020-08-30 16:28:40 +02:00
Jonas Jenwald
1f5021d76a Prevent errors if PDFDocumentProxy.saveDocument is called without the annotationStorage parameter (PR 12241 follow-up)
Obviously it doesn't make sense to call that method without providing an `AnnotationStorage`-instance, however we should ensure that doing so won't cause errors.
Hence we need to check that `annotationStorage` is actually defined, before attempting to call its `resetModified` method.
2020-08-22 18:09:17 +02:00
Brendan Dahl
8023175103 Support file save triggered from the Firefox integrated version.
Related to https://bugzilla.mozilla.org/show_bug.cgi?id=1659753

This allows Firefox trigger a "save" event from ctrl/cmd+s or the "Save
Page As" context menu, which in turn lets pdf.js generate a new PDF if
there is form data to save.

I also now use `sourceEventType` on downloads so Firefox can determine if
it should launch the "open with" dialog or "save as" dialog.
2020-08-20 18:05:08 -07:00
Aki Sasaki
83365a3756 confirm if leaving a modified form without saving 2020-08-20 17:23:06 -07:00
Jonas Jenwald
b26d736809 Ensure that the "DocException" message handler, in the API, will always either error or warn (depending on the build) if a valid Error isn't found
Having this present would have made debugging issues 11941 and 12209 so much quicker and easier.
2020-08-13 13:17:30 +02:00
Calixte Denizet
1a6816ba98 Add support for saving forms 2020-08-12 10:32:59 +02:00
Jonas Jenwald
4d351eab93 A couple of (small) tweaks of the AnnotationStorage (PR 12173 follow-up)
- Initialize the `AnnotationStorage`-instance, on `PDFDocumentProxy`, lazily.
 - Change the `AnnotationStorage` to use a `Map` internally, rather than a regular Object (simplifies the following points).
 - Let `AnnotationStorage.getAll` return `null` when there's no data stored, to avoid unnecessary parsing on the worker-thread. This ought to "just work", since the worker-thread code *should* already handle the `!annotationStorage` case everywhere.
 - Add a new `AnnotationStorage.size` getter, to be able to easily tell if there's any data stored.
2020-08-10 17:07:24 +02:00
Jonathan Grimes
ac723a1760 Allow loading pdf fonts into another document. 2020-08-08 02:52:32 +00:00
Takashi Tamura
4ac62d8787 Fix the type of PDFDocumentLoadingTask.destroy. 2020-08-07 16:10:19 +09:00
Jonas Jenwald
5e44b241b2 [api-minor] Fix the annotationStorage parameter in PDFPageProxy.render
While the parameter name (clearly) suggests that an `AnnotationStorage`-instance is expected, looking at the only call-sites that include the parameter (i.e. the `PDFPrintServiceFactory` instances) it actually contains just a normal Object.

Hence it seems much more reasonable to actually pass a valid `AnnotationStorage`-instance, as the name suggests, and simply have `PDFPageProxy.render` do the `annotationStorage.getAll()` call. (Since we cannot send an `AnnotationStorage`-instance as-is to the worker-thread, given the "structured clone algorithm".)
2020-08-05 23:02:30 +02:00
Takashi Tamura
a0f0ab78f3 Fix the type definition of TypedArray. 2020-08-05 17:01:08 +09:00
Tim van der Meij
56ca027c08
Improve consistency for the API documentation comments
Over time we used multiple different formats for JSDoc comments. This
commit standardizes those formats to the one we used most often.

Moreover, this removes the example in the outline endpoint documentation
since it now has a proper type definition and it didn't render correctly
in JSDoc.
2020-08-04 23:27:22 +02:00
Tim van der Meij
ba4a07ce07
Fix incorrect types in the API documentation 2020-08-04 23:19:59 +02:00
Tim van der Meij
3116216e1d
Improve the API documentation for PDFDocumentLoadingTask
This commit:
- formats the documentation block according to the standards;
- replaces the callback definitions with the `function` type (we have
  that for other definitions already and the callback type was not
  rendered correctly by JSDoc);
- synchronizes the type documentation and the class documentation;
- fixes the documentation by making it easier to read and making sure
  that all optional properties are indicated as such;
- uses the `@link` tag to indicate links to other code.

The `typestest` still passes and JSDoc now renders this class correctly.
2020-08-04 23:17:24 +02:00
Brendan Dahl
ac494a2278 Add support for optional marked content.
Add a new method to the API to get the optional content configuration. Add
a new render task param that accepts the above configuration.
For now, the optional content is not controllable by the user in
the viewer, but renders with the default configuration in the PDF.

All of the test files added exhibit different uses of optional content.

Fixes #269.

Fix test to work with optional content.

- Change the stopAtErrors test to ensure the operator list has something,
  instead of asserting the exact number of operators.
2020-08-04 09:26:55 -07:00
Tim van der Meij
00a8b42e67
Merge pull request #12102 from ineiti/add_types_annotations
Add types annotations
2020-08-02 16:45:37 +02:00
Jonas Jenwald
05baa4c89f
Revert "[api-minor] Allow loading pdf fonts into another document." 2020-08-01 12:52:39 +02:00
Tim van der Meij
173b92a873
Merge pull request #12131 from jsg2021/issue-8271
[api-minor] Allow loading pdf fonts into another document.
2020-08-01 01:13:41 +02:00
Jonas Jenwald
6d192f987e Prevent Uncaught (in promise) AbortException when running the unit-tests
These errors can/will occur if data is still loading when the document is destroyed, which is the case in the API unit-tests that load the `tracemonkey.pdf` file.
While this patch prevents these kind of problems, and thus allows us to update Jasmine again, I cannot help but thinking that it's slightly "hacky". Basically, we'll simply catch and ignore (some) rejected promises once the document is destroyed and/or its data loading is aborted. However, I don't *think* that these changes should cause issues in general, since we don't really care about errors once document destruction has started (note e.g. the fair number of `catch` handlers ignoring `AbortException`s already).
2020-07-31 23:29:05 +02:00
Jonathan Grimes
9b16b8ef71
Allow loading pdf fonts into another document. 2020-07-31 11:41:48 -05:00
Linus Gasser
f1bbfdc16d Add typescript definitions
This PR adds typescript definitions from the JSDoc already present.
It adds a new gulp-target 'types' that calls 'tsc', the typescript
compiler, to create the definitions.

To use the definitions, users can simply do the following:

```
import {getDocument, GlobalWorkerOptions} from "pdfjs-dist";
import pdfjsWorker from "pdfjs-dist/build/pdf.worker.entry";
GlobalWorkerOptions.workerSrc = pdfjsWorker;

const pdf = await getDocument("file:///some.pdf").promise;
```

Co-authored-by: @oBusk
Co-authored-by: @tamuratak
2020-07-30 11:10:37 +02:00
Jonas Jenwald
f3ff526019 Send/receive Type3 images the same way as other globally-cached images
There's quite frankly no particular reason to special-case Type3-fonts with image resources, which are very rare anyway, now that we have a general mechanism for sending/receiving images globally.
2020-07-27 13:20:15 +02:00
Calixte Denizet
584902dbf8 Add an annotation storage in order to save annotation data in acroforms 2020-07-24 10:50:11 +02:00
Jonas Jenwald
4a7e29865d [api-minor] Use the NodeCanvasFactory/NodeCMapReaderFactory classes as defaults in Node.js environments (issue 11900)
This moves, and slightly simplifies, code that's currently residing in the unit-test utils into the actual library, such that it's bundled with `GENERIC`-builds and used in e.g. the API-code.

As an added bonus, this also brings out-of-the-box support for CMaps in e.g. the Node.js examples.
2020-07-02 04:44:23 +02:00
Jonas Jenwald
4cb0c032f3 Convert the PDFPageProxy.intentStates property from an Object to a Map
As can be seen in the code there's a handful of places where this structure needs to be iterated, something that becomes cumbersome when dealing with `Object`s. Hence, by changing this to a `Map` instead we can both simplify the code and avoid creating unnecessary closures.

Particularily the `PDFPageProxy._tryCleanup` method becomes a lot more readable, at least in my opinion.

Finally, since this property is intended to be "private" the name is adjusted to reflect that.
2020-06-21 17:02:42 +02:00
Jonas Jenwald
cabc2cc4fc Add a InternalRenderTask.completed getter and use it to simplify PDFPageProxy._destroy
This patch aims to simplify the `PDFPageProxy._destroy` method, by:
 - Replacing the unnecessary `forEach` with a "regular" `for`-loop instead.
 - Use a more appropriate variable name, since `intentState.renderTasks` contain instances of `InternalRenderTask`.
 - Move the "is rendering completed"-handling to a new `InternalRenderTask.completed` getter, to abstract away some (mostly) internal `InternalRenderTask` state.
2020-06-21 15:56:14 +02:00
Jonas Jenwald
64378fc366 [api-minor] Remove the deprecated PDFDocumentProxy.getOpenActionDestination method (PR 11644 follow-up)
This method has been printing a `deprecated` warning in two releases, hence it should hopefully be safe to remove now.
2020-06-02 12:28:00 +02:00
Jonas Jenwald
18e0b10d3c [api-minor] Remove the disableCreateObjectURL option from the getDocument parameters, since it's now unused in the API
With the changes in previous patches, the `disableCreateObjectURL` option/functionality is no longer used for anything in the API and/or in the Worker code.

Note however that there's some functionality, mainly related to file loading/downloading, in the GENERIC version of the default viewer which still depends on this option.
Hence the `disableCreateObjectURL` option (and related compatibility code) is moved into the viewer, see e.g. `web/app_options.js`, such that it's still available in the default viewer.
2020-05-22 00:22:48 +02:00
Jonas Jenwald
cc4cc8b11b Remove the, now unused, releaseImageResources helper function
With the changes in the previous patch, this is now dead code which should thus be removed.
2020-05-22 00:22:48 +02:00
Jonas Jenwald
0351852d74 [api-minor] Decode all JPEG images with the built-in PDF.js decoder in src/core/jpg.js
Currently some JPEG images are decoded by the built-in PDF.js decoder in `src/core/jpg.js`, while others attempt to use the browser JPEG decoder. This inconsistency seem unfortunate for a number of reasons:

 - It adds, compared to the other image formats supported in the PDF specification, a fair amount of code/complexity to the image handling in the PDF.js library.

 - The PDF specification support JPEG images with features, e.g. certain ColorSpaces, that browsers are unable to decode natively. Hence, determining if a JPEG image is possible to decode natively in the browser require a non-trivial amount of parsing. In particular, we're parsing (part of) the raw JPEG data to extract certain marker data and we also need to parse the ColorSpace for the JPEG image.

 - While some JPEG images may, for all intents and purposes, appear to be natively supported there's still cases where the browser may fail to decode some JPEG images. In order to support those cases, we've had to implement a fallback to the PDF.js JPEG decoder if there's any issues during the native decoding. This also means that it's no longer possible to simply send the JPEG image to the main-thread and continue parsing, but you now need to actually wait for the main-thread to indicate success/failure first.
   In practice this means that there's a code-path where the worker-thread is forced to wait for the main-thread, while the reverse should *always* be the case.

 - The native decoding, for anything except the *simplest* of JPEG images, result in increased peak memory usage because there's a handful of short-lived copies of the JPEG data (see PR 11707).
Furthermore this also leads to data being *parsed* on the main-thread, rather than the worker-thread, which you usually want to avoid for e.g. performance and UI-reponsiveness reasons.

 - Not all environments, e.g. Node.js, fully support native JPEG decoding. This has, historically, lead to some issues and support requests.

 - Different browsers may use different JPEG decoders, possibly leading to images being rendered slightly differently depending on the platform/browser where the PDF.js library is used.

Originally the implementation in `src/core/jpg.js` were unable to handle all of the JPEG images in the test-suite, but over the last couple of years I've fixed (hopefully) all of those issues.
At this point in time, there's two kinds of failure with this patch:

 - Changes which are basically imperceivable to the naked eye, where some pixels in the images are essentially off-by-one (in all components), which could probably be attributed to things such as different rounding behaviour in the browser/PDF.js JPEG decoder.
   This type of "failure" accounts for the *vast* majority of the total number of changes in the reference tests.

 - Changes where the JPEG images now looks *ever so slightly* blurrier than with the native browser decoder. For quite some time I've just assumed that this pointed to a general deficiency in the `src/core/jpg.js` implementation, however I've discovered when comparing two viewers side-by-side that the differences vanish at higher zoom levels (usually around 200% is enough).
   Basically if you disable [this downscaling in canvas.js](8fb82e939c/src/display/canvas.js (L2356-L2395)), which is what happens when zooming in, the differences simply vanish!
   Hence I'm pretty satisfied that there's no significant problems with the `src/core/jpg.js` implementation, and the problems are rather tied to the general quality of the downscaling algorithm used. It could even be seen as a positive that *all* images now share the same downscaling behaviour, since this actually fixes one old bug; see issue 7041.
2020-05-22 00:22:48 +02:00
Jonas Jenwald
dda6626f40 Attempt to cache repeated images at the document, rather than the page, level (issue 11878)
Currently image resources, as opposed to e.g. font resources, are handled exclusively on a page-specific basis. Generally speaking this makes sense, since pages are separate from each other, however there's PDF documents where many (or even all) pages actually references exactly the same image resources (through the XRef table). Hence, in some cases, we're decoding the *same* images over and over for every page which is obviously slow and wasting both CPU and memory resources better used elsewhere.[1]

Obviously we cannot simply treat all image resources as-if they're used throughout the entire PDF document, since that would end up increasing memory usage too much.[2]
However, by introducing a `GlobalImageCache` in the worker we can track image resources that appear on more than one page. Hence we can switch image resources from being page-specific to being document-specific, once the image resource has been seen on more than a certain number of pages.

In many cases, such as e.g. the referenced issue, this patch will thus lead to reduced memory usage for image resources. Scrolling through all pages of the document, there's now only a few main-thread copies of the same image data, as opposed to one for each rendered page (i.e. there could theoretically be *twenty* copies of the image data).
While this obviously benefit both CPU and memory usage in this case, for *very* large image data this patch *may* possibly increase persistent main-thread memory usage a tiny bit. Thus to avoid negatively affecting memory usage too much in general, particularly on the main-thread, the `GlobalImageCache` will *only* cache a certain number of image resources at the document level and simply fallback to the default behaviour.

Unfortunately the asynchronous nature of the code, with ranged/streamed loading of data, actually makes all of this much more complicated than if all data could be assumed to be immediately available.[3]

*Please note:* The patch will lead to *small* movement in some existing test-cases, since we're now using the built-in PDF.js JPEG decoder more. This was done in order to simplify the overall implementation, especially on the main-thread, by limiting it to only the `OPS.paintImageXObject` operator.

---
[1] There's e.g. PDF documents that use the same image as background on all pages.

[2] Given that data stored in the `commonObjs`, on the main-thread, are only cleared manually through `PDFDocumentProxy.cleanup`. This as opposed to data stored in the `objs` of each page, which is automatically removed when the page is cleaned-up e.g. by being evicted from the cache in the default viewer.

[3] If the latter case were true, we could simply check for repeat images *before* parsing started and thus avoid handling *any* duplicate image resources.
2020-05-21 18:13:45 +02:00
Jonas Jenwald
8d56a69e74 Reduce usage of SystemJS, in the development viewer, even further
With these changes SystemJS is now only used, during development, on the worker-thread and in the unit/font-tests, since Firefox is currently missing support for worker modules; please see https://bugzilla.mozilla.org/show_bug.cgi?id=1247687

Hence all the JavaScript files in the `web/` and `src/display/` folders are now loaded *natively* by the browser (during development) using standard `import` statements/calls, thanks to a nice `import-maps` polyfill.

*Please note:* As soon as https://bugzilla.mozilla.org/show_bug.cgi?id=1247687 is fixed in Firefox, we should be able to remove all traces of SystemJS and thus finally be able to use every possible modern JavaScript feature.
2020-05-20 13:36:52 +02:00
Jonas Jenwald
d4d933538b Re-factor setPDFNetworkStreamFactory, in src/display/api.js, to also accept an asynchronous function
As part of trying to reduce the usage of SystemJS in the development viewer, this patch is a necessary step that will allow removal of some `require` statements.

Currently this uses `SystemJS.import` in non-PRODUCTION mode, but it should be possible to replace those with standard *dynamic* `import` calls in the future.
2020-05-20 13:18:18 +02:00
Jonas Jenwald
e1f340a0c2 Use the ESLint no-restricted-syntax rule to ensure that assert is always called with two arguments
Having `assert` calls without a message string isn't very helpful when debugging, and it turns out that it's easy enough to make use of ESLint to enforce better `assert` call-sites.
In a couple of cases the `assert` calls were changed to "regular" throwing of errors instead, since that seemed more appropriate.

Please find additional details about the ESLint rule at https://eslint.org/docs/rules/no-restricted-syntax
2020-05-05 13:40:05 +02:00
Jonas Jenwald
c355f91d2e [api-minor] Immediately release the font.data property once the font been attached to the DOM (PR 11777 follow-up)
*This patch implements https://github.com/mozilla/pdf.js/pull/11777#issuecomment-609741348*

This extends the work from PR 11773 and 11777 further, by immediately releasing the `font.data` property once the font been attached to the DOM. By not unnecessarily holding onto this data on the main-thread, we'll thus reduce the memory usage of fonts even further (especially beneficial in longer documents with composite fonts).

The new behaviour is controlled by the recently added `fontExtraProperties` API option (adding a new option just for this patch didn't seem necessary), since there's one edge-case in the SVG renderer where the `font.data` property is necessary (see the `pdf2svg` example).

Note that while the default viewer does run clean-up with an idle timeout, that timeout will be reset whenever rendering occurs *or* when scrolling happens in the viewer. In practice this means that unless the user doesn't interact with the viewer in *any* way during an extended period of time, currently set to 30 seconds, the `PDFDocumentProxy.cleanup` method will never be called and font resources will thus not be cleaned-up.
2020-04-23 13:04:57 +02:00
Jonas Jenwald
746eaf3154 [api-minor] Fix the return value of PDFDocumentProxy.getViewerPreferences when no viewer preferences are present (PR 10738 follow-up)
This patch fixes yet another instalment in the never-ending series of "what the *bleep* was I thinking", by changing the `PDFDocumentProxy.getViewerPreferences` method to return `null` by default.
Not only is this method now consistent with many other API methods, for the data not present case, but it also avoids having to e.g. loop through an object to check if it's actually empty (note the old unit-test).
2020-04-14 23:25:50 +02:00
Jonas Jenwald
426945b480 Update Prettier to version 2.0
Please note that these changes were done automatically, using `gulp lint --fix`.

Given that the major version number was increased, there's a fair number of (primarily whitespace) changes; please see https://prettier.io/blog/2020/03/21/2.0.0.html
In order to reduce the size of these changes somewhat, this patch maintains the old "arrowParens" style for now (once mozilla-central updates Prettier we can simply choose the same formatting, assuming it will differ here).
2020-04-14 12:28:14 +02:00
Jonas Jenwald
2d46230d23 [api-minor] Change Font.exportData to, by default, stop exporting properties which are completely unused on the main-thread and/or in the API (PR 11773 follow-up)
For years now, the `Font.exportData` method has (because of its previous implementation) been exporting many properties despite them being completely unused on the main-thread and/or in the API.
This is unfortunate, since among those properties there's a number of potentially very large data-structures, containing e.g. Arrays and Objects, which thus have to be first structured cloned and then stored on the main-thread.

With the changes in this patch, we'll thus by default save memory for *every* `Font` instance created (there can be a lot in longer documents). The memory savings obviously depends a lot on the actual font data, but some approximate figures are: For non-embedded fonts it can save a couple of kilobytes, for simple embedded fonts a handful of kilobytes, and for composite fonts the size of this auxiliary can even be larger than the actual font program itself.

All-in-all, there's no good reason to keep exporting these properties by default when they're unused. However, since we cannot be sure that every property is unused in custom implementations of the PDF.js library, this patch adds a new `getDocument` option (named `fontExtraProperties`) that still allows access to the following properties:

 - "cMap": An internal data structure, only used with composite fonts and never really intended to be exposed on the main-thread and/or in the API.
   Note also that the `CMap`/`IdentityCMap` classes are a lot more complex than simple Objects, but only their "internal" properties survive the structured cloning used to send data to the main-thread. Given that CMaps can often be *very* large, not exporting them can also save a fair bit of memory.

 - "defaultEncoding": An internal property used with simple fonts, and used when building the glyph mapping on the worker-thread. Considering how complex that topic is, and given that not all font types are handled identically, exposing this on the main-thread and/or in the API most likely isn't useful.

 - "differences": An internal property used with simple fonts, and used when building the glyph mapping on the worker-thread. Considering how complex that topic is, and given that not all font types are handled identically, exposing this on the main-thread and/or in the API most likely isn't useful.

 - "isSymbolicFont": An internal property, used during font parsing and building of the glyph mapping on the worker-thread.

  - "seacMap": An internal map, only potentially used with *some* Type1/CFF fonts and never intended to be exposed in the API. The existing `Font.{charToGlyph, charToGlyphs}` functionality already takes this data into account when handling text.

 - "toFontChar": The glyph map, necessary for mapping characters to glyphs in the font, which is built upon the various encoding information contained in the font dictionary and/or font program. This is not directly used on the main-thread and/or in the API.

 - "toUnicode": The unicode map, necessary for text-extraction to work correctly, which is built upon the ToUnicode/CMap information contained in the font dictionary, but not directly used on the main-thread and/or in the API.

 - "vmetrics": An array of width data used with fonts which are composite *and* vertical, but not directly used on the main-thread and/or in the API.

 - "widths": An array of width data used with most fonts, but not directly used on the main-thread and/or in the API.
2020-04-06 11:47:09 +02:00
Jonas Jenwald
dcb16af968 Whitelist closure related cases to address the remaining no-shadow linting errors
Given the way that "classes" were previously implemented in PDF.js, using regular functions and closures, there's a fair number of false positives when the `no-shadow` ESLint rule was enabled.

Note that while *some* of these `eslint-disable` statements can be removed if/when the relevant code is converted to proper `class`es, we'll probably never be able to get rid of all of them given our naming/coding conventions (however I don't really see this being a problem).
2020-03-25 11:57:12 +01:00
Jonas Jenwald
5eabe08c74 Exclude the setPDFNetworkStreamFactory function from the built API docs
Please note that the `setPDFNetworkStreamFactory` functionality isn't exposed in the public API, i.e. not listed among the exports in the `src/pdf.js` file, and that even if it were it wouldn't really be useful considering that none of the `PDFNetworkStream`/`PDFFetchStream`/`PDFNodeStream` classes are exported either.
2020-03-23 16:41:06 +01:00
Jonas Jenwald
c3c197d87a Remove old API methods which were previously converted to throwing (PR 11219 follow-up)
These methods were deprecated already in PDF.js version `2.1.266`, see PRs 10246 and 10369, and were converted to throw `Error`s upon invocation in PDF.js version `2.4.456`, see PR 11219.
Hence it ought to be possible to remove these methods now.
2020-03-23 16:41:03 +01:00
Jonas Jenwald
3539a17d2a Remove variable shadowing from the JavaScript files in the src/display/ folder
*This is part of a series of patches that will try to split PR 11566 into smaller chunks, to make reviewing more feasible.*

Once all the code has been fixed, we'll be able to eventually enable the ESLint no-shadow rule; see https://eslint.org/docs/rules/no-shadow
2020-03-20 23:09:41 +01:00
Jonas Jenwald
ae2900e510 [api-minor] Change the pageIndex, on PDFPageProxy instances, to a private property
This property has never been documented and/or *intentionally* exposed through the API, instead the `PDFPageProxy.pageNumber` property is the documented/intended API to use here.
Hence pageIndex is changed to a "private" property on `PDFPageProxy` instances, and internal API functionality is also updated to *consistently* use `this._pageIndex` rather than a mix of formats.
2020-03-19 15:47:11 +01:00
Jonas Jenwald
01fb309a2a [api-minor] Add more general OpenAction support (PR 10334 follow-up, issue 11642)
This patch deprecates the existing `getOpenActionDestination` API method, in favor of a better and more general `getOpenAction` method instead. (For now JavaScript actions, related to printing, are still handled as before.)

By clearly separating "regular" Print actions from the JavaScript handling, it's thus possible to get rid of the somewhat annoying and strictly incorrect warning when the viewer loads.
2020-03-06 13:03:00 +01:00
Jonas Jenwald
a5db4e985a Remove LoopbackPort.postMessage special-case for polyfilled TypedArrays
Given that all `TypedArray` polyfills were removed in PDF.js version `2.0`, since native support is now required, this branch has been dead code for awhile.
2020-02-13 12:50:41 +01:00
Jonas Jenwald
7117ee03d6 [api-minor] Change PDFDocumentProxy.cleanup/PDFPageProxy.cleanup to return data
This patch makes the following changes, to improve these API methods:

 - Let `PDFPageProxy.cleanup` return a boolean indicating if clean-up actually happened, since ongoing rendering will block clean-up.
   Besides being used in other parts of this patch, it seems that an API user may also be interested in the return value given that clean-up isn't *guaranteed* to happen.

 - Let `PDFDocumentProxy.cleanup` return the promise indicating when clean-up is finished.

 - Improve the JSDoc comment for `PDFDocumentProxy.cleanup` to mention that clean-up is triggered on *both* threads (without going into unnecessary specifics regarding what *exactly* said data actually is).
   Add a note in the JSDoc comment about not calling this method when rendering is ongoing.

 - Change `WorkerTransport.startCleanup` to throw an `Error` if it's called when rendering is ongoing, to prevent rendering from breaking.
   Please note that this won't stop *worker-thread* clean-up from happening (since there's no general "something is rendering"-flag), however I'm not sure if that's really a problem; but please don't quote me on that :-)
   All of the caches that's being cleared in `Catalog.cleanup`, on the worker-thread, *should* be re-filled automatically even if cleared *during* parsing/rendering, and the only thing that probably happens is that e.g. font data would have to be re-parsed.
  On the main-thread, on the other hand, clearing the caches is more-or-less guaranteed to cause rendering errors, since the rendering code in `src/display/canvas.js` isn't able to re-request any image/font data that's suddenly being pulled out from under it.

 - Last, but not least, add a couple of basic unit-tests for the clean-up functionality.
2020-02-07 17:00:29 +01:00
Jonas Jenwald
427df2dfd7 Send the verbosity level when setting up fake workers (issue 11536)
Interestingly the viewer already seem to work correctly as-is, with workers disabled and a non-standard `verbosity` level.
Hence this is possibly Node.js specific, but given that the issue is lacking *both* the PDF file in question and a runnable test-case, so this patch is essentially a best-effort guess at what the problem could be.
2020-01-26 12:37:45 +01:00
Jonas Jenwald
9e262ae7fa Enable the ESLint prefer-const rule globally (PR 11450 follow-up)
Please find additional details about the ESLint rule at https://eslint.org/docs/rules/prefer-const

With the recent introduction of Prettier this sort of mass enabling of ESLint rules becomes a lot easier, since the code will be automatically reformatted as necessary to account for e.g. changed line lengths.

Note that this patch is generated automatically, by using the ESLint `--fix` argument, and will thus require some additional clean-up (which is done separately).
2020-01-25 00:20:22 +01:00
Jonas Jenwald
a39943554a Simplify, and tweak, a couple of PDFJSDev checks
This removes a couple of, thanks to preceeding code, unnecessary `typeof PDFJSDev` checks, and also fixes a couple of incorrectly implemented (my fault) checks intended for `TESTING` builds.
2020-01-21 00:06:15 +01:00