Commit Graph

438 Commits

Author SHA1 Message Date
Jonas Jenwald
449c7763d5 [api-minor] Only support browsers/environments that have *basic* support for Promise natively
Based on https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise#Browser_compatibility and https://caniuse.com/#feat=promises, all even remotely modern browsers already support *basic* `Promise` functionality natively.

The only reason for keeping the `Promise` polyfill (at all) is to be able to support recent additions to the specification, such as e.g. `finally` and `allSettled`.
Note that this patch will, on its own, remove support for IE 11/Edge (non-Chromium based) in both the general PDF.js library and the default viewer.
2020-09-06 13:45:56 +02:00
Jonas Jenwald
784a420027 Add support, in Dict.merge, for merging of "sub"-dictionaries
This allows for merging of dictionaries one level deeper than previously. This could be useful e.g. for /Resources dictionaries, where you want to e.g. merge their respective /Font dictionaries (and other) together rather than picking just the first one.
2020-08-30 23:18:32 +02:00
Calixte Denizet
1a6816ba98 Add support for saving forms 2020-08-12 10:32:59 +02:00
Tim van der Meij
8c162f57f7
Merge pull request #12175 from calixteman/textfield
Support textfield and choice widgets for printing
2020-08-07 00:20:29 +02:00
Calixte Denizet
1747d259f9 Support textfield and choice widgets for printing 2020-08-06 14:45:23 +02:00
Jonas Jenwald
16fa9dc4ea Add support for Object.fromEntries
This provides a simpler way of creating an `Object` from e.g. a `Map`, without having to manually iterate over it.
Please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/fromEntries
2020-08-06 14:39:51 +02:00
Brendan Dahl
ac494a2278 Add support for optional marked content.
Add a new method to the API to get the optional content configuration. Add
a new render task param that accepts the above configuration.
For now, the optional content is not controllable by the user in
the viewer, but renders with the default configuration in the PDF.

All of the test files added exhibit different uses of optional content.

Fixes #269.

Fix test to work with optional content.

- Change the stopAtErrors test to ensure the operator list has something,
  instead of asserting the exact number of operators.
2020-08-04 09:26:55 -07:00
Takashi Tamura
bc4648c0a6 Add types to functions exported as API in src/pdf.js. 2020-08-03 19:19:48 +09:00
Linus Gasser
f1bbfdc16d Add typescript definitions
This PR adds typescript definitions from the JSDoc already present.
It adds a new gulp-target 'types' that calls 'tsc', the typescript
compiler, to create the definitions.

To use the definitions, users can simply do the following:

```
import {getDocument, GlobalWorkerOptions} from "pdfjs-dist";
import pdfjsWorker from "pdfjs-dist/build/pdf.worker.entry";
GlobalWorkerOptions.workerSrc = pdfjsWorker;

const pdf = await getDocument("file:///some.pdf").promise;
```

Co-authored-by: @oBusk
Co-authored-by: @tamuratak
2020-07-30 11:10:37 +02:00
Takashi Tamura
473ea1f1a4 Make the detection of Node.js environments on Electron strict.
The main process and its child processes should be detected as Node.js environments.
2020-07-12 10:56:17 +09:00
Wojciech Maj
78970bbbe1
Replace non-inclusive "whitelist" term with "allowlist" 2020-06-29 17:15:14 +02:00
Jonas Jenwald
88fdb482b0 Move the isEmptyObj helper function from src/shared/util.js to test/unit/test_utils.js
Since this helper function is no longer used anywhere in the main code-base, but only in a couple of unit-tests, it's thus being moved to a more appropriate spot.

Finally, the implementation of `isEmptyObj` is also tweaked slightly by removing the manual loop.
2020-06-09 17:50:16 +02:00
Jonas Jenwald
0351852d74 [api-minor] Decode all JPEG images with the built-in PDF.js decoder in src/core/jpg.js
Currently some JPEG images are decoded by the built-in PDF.js decoder in `src/core/jpg.js`, while others attempt to use the browser JPEG decoder. This inconsistency seem unfortunate for a number of reasons:

 - It adds, compared to the other image formats supported in the PDF specification, a fair amount of code/complexity to the image handling in the PDF.js library.

 - The PDF specification support JPEG images with features, e.g. certain ColorSpaces, that browsers are unable to decode natively. Hence, determining if a JPEG image is possible to decode natively in the browser require a non-trivial amount of parsing. In particular, we're parsing (part of) the raw JPEG data to extract certain marker data and we also need to parse the ColorSpace for the JPEG image.

 - While some JPEG images may, for all intents and purposes, appear to be natively supported there's still cases where the browser may fail to decode some JPEG images. In order to support those cases, we've had to implement a fallback to the PDF.js JPEG decoder if there's any issues during the native decoding. This also means that it's no longer possible to simply send the JPEG image to the main-thread and continue parsing, but you now need to actually wait for the main-thread to indicate success/failure first.
   In practice this means that there's a code-path where the worker-thread is forced to wait for the main-thread, while the reverse should *always* be the case.

 - The native decoding, for anything except the *simplest* of JPEG images, result in increased peak memory usage because there's a handful of short-lived copies of the JPEG data (see PR 11707).
Furthermore this also leads to data being *parsed* on the main-thread, rather than the worker-thread, which you usually want to avoid for e.g. performance and UI-reponsiveness reasons.

 - Not all environments, e.g. Node.js, fully support native JPEG decoding. This has, historically, lead to some issues and support requests.

 - Different browsers may use different JPEG decoders, possibly leading to images being rendered slightly differently depending on the platform/browser where the PDF.js library is used.

Originally the implementation in `src/core/jpg.js` were unable to handle all of the JPEG images in the test-suite, but over the last couple of years I've fixed (hopefully) all of those issues.
At this point in time, there's two kinds of failure with this patch:

 - Changes which are basically imperceivable to the naked eye, where some pixels in the images are essentially off-by-one (in all components), which could probably be attributed to things such as different rounding behaviour in the browser/PDF.js JPEG decoder.
   This type of "failure" accounts for the *vast* majority of the total number of changes in the reference tests.

 - Changes where the JPEG images now looks *ever so slightly* blurrier than with the native browser decoder. For quite some time I've just assumed that this pointed to a general deficiency in the `src/core/jpg.js` implementation, however I've discovered when comparing two viewers side-by-side that the differences vanish at higher zoom levels (usually around 200% is enough).
   Basically if you disable [this downscaling in canvas.js](8fb82e939c/src/display/canvas.js (L2356-L2395)), which is what happens when zooming in, the differences simply vanish!
   Hence I'm pretty satisfied that there's no significant problems with the `src/core/jpg.js` implementation, and the problems are rather tied to the general quality of the downscaling algorithm used. It could even be seen as a positive that *all* images now share the same downscaling behaviour, since this actually fixes one old bug; see issue 7041.
2020-05-22 00:22:48 +02:00
Jonas Jenwald
ec0ab91a2b Reduce the usage of require statements in code-paths not protected by pre-processor and/or run-time checks
This replaces some additional `require`/`exports` usage with standard `import`/`export` statements instead.
Hence another, small, part in the effort to reduce the reliance on SystemJS-specific functionality in the development viewer.
2020-05-14 15:57:49 +02:00
Jonas Jenwald
e1f340a0c2 Use the ESLint no-restricted-syntax rule to ensure that assert is always called with two arguments
Having `assert` calls without a message string isn't very helpful when debugging, and it turns out that it's easy enough to make use of ESLint to enforce better `assert` call-sites.
In a couple of cases the `assert` calls were changed to "regular" throwing of errors instead, since that seemed more appropriate.

Please find additional details about the ESLint rule at https://eslint.org/docs/rules/no-restricted-syntax
2020-05-05 13:40:05 +02:00
Brendan Dahl
b1be33c96f Add more categories of unsupported features.
Fixes #11815
2020-05-04 11:02:16 -07:00
Jonas Jenwald
1cc3dbb694 Enable the dot-notation ESLint rule
*Please note:* These changes were done automatically, using the `gulp lint --fix` command.

This rule is already enabled in mozilla-central, see https://searchfox.org/mozilla-central/rev/567b68b8ff4b6d607ba34a6f1926873d21a7b4d7/tools/lint/eslint/eslint-plugin-mozilla/lib/configs/recommended.js#103-104

The main advantage, besides improved consistency, of this rule is that it reduces the size of the code (by 3 bytes for each case). In the PDF.js code-base there's close to 8000 instances being fixed by the `dot-notation` ESLint rule, which end up reducing the size of even the *built* files significantly; the total size of the `gulp mozcentral` build target changes from `3 247 456` to `3 224 278` bytes, which is a *reduction* of `23 178` bytes (or ~0.7%) for a completely mechanical change.

A large number of these changes affect the (large) lookup tables used on the worker-thread, but given that they are still initialized lazily I don't *think* that the new formatting this patch introduces should undo any of the improvements from PR 6915.

Please find additional details about the ESLint rule at https://eslint.org/docs/rules/dot-notation
2020-04-17 12:24:46 +02:00
Jonas Jenwald
426945b480 Update Prettier to version 2.0
Please note that these changes were done automatically, using `gulp lint --fix`.

Given that the major version number was increased, there's a fair number of (primarily whitespace) changes; please see https://prettier.io/blog/2020/03/21/2.0.0.html
In order to reduce the size of these changes somewhat, this patch maintains the old "arrowParens" style for now (once mozilla-central updates Prettier we can simply choose the same formatting, assuming it will differ here).
2020-04-14 12:28:14 +02:00
Jonas Jenwald
8770ca3014 Make the decryptAscii helper function, in src/core/type1_parser.js, slightly more efficient
By slicing the Uint8Array directly, rather than using the prototype and a `call` invocation, the runtime of `decryptAscii` is decreased slightly (~30% based on quick logging).
The `decryptAscii` function is still less efficient than `decrypt`, however ASCII encoded Type1 font programs are sufficiently rare that it probably doesn't matter much (we've only seen *two* examples, issue 4630 and 11740).
2020-04-06 11:21:02 +02:00
Jonas Jenwald
710704508c Fail early, in modern GENERIC builds, if certain required browser functionality is missing (issue 11762)
With two kind of builds now being produced, with/without translation/polyfills, it's unfortunately somewhat easy for users to accidentally pick the wrong one.

In the case where a user would attempt to use a modern build of PDF.js in an older browser, such as e.g. IE11, the failure would be immediate when the code is loaded (given the use of unsupported ECMAScript features).
However in some browsers/environments, in particular Node.js, a modern PDF.js build may load correctly and thus *appear* to function, only to fail for e.g. certain API calls. To hopefully lessen the support burden, and to try and improve things overall, this patch adds checks to ensure that a modern build of PDF.js cannot be used in browsers/environments which lack native support for critical functionality (such as e.g. `ReadableStream`). Hence we'll fail early, with an error message telling users to pick an ES5-compatible build instead.

To ensure that we actually test things better especially w.r.t. usage of the PDF.js library in Node.js environments, the `gulp npm-test` task as used by Node.js/Travis was changed (back) to test an ES5-compatible build.
(Since the bots still test the code as-is, without transpilation/polyfills, this shouldn't really be a problem as far as I can tell.)
As part of these changes there's now both `gulp lib` and `gulp lib-es5` build targets, similar to e.g. the generic builds, which thanks to some re-factoring only required adding a small amount of code.

*Please note:* While it's probably too early to tell if this will be a widespread issue, it's possible that this is the sort of patch that *may* warrant being `git cherry-pick`ed onto the current beta version (v2.4.456).
2020-04-01 19:42:48 +02:00
Jonas Jenwald
dcb16af968 Whitelist closure related cases to address the remaining no-shadow linting errors
Given the way that "classes" were previously implemented in PDF.js, using regular functions and closures, there's a fair number of false positives when the `no-shadow` ESLint rule was enabled.

Note that while *some* of these `eslint-disable` statements can be removed if/when the relevant code is converted to proper `class`es, we'll probably never be able to get rid of all of them given our naming/coding conventions (however I don't really see this being a problem).
2020-03-25 11:57:12 +01:00
Jonas Jenwald
1d2f787d6a Enable the ESLint no-shadow rule
This rule is *not* currently enabled in mozilla-central, but it appears commented out[1] in the ESLint definition file; see https://searchfox.org/mozilla-central/rev/c80fa7258c935223fe319c5345b58eae85d4c6ae/tools/lint/eslint/eslint-plugin-mozilla/lib/configs/recommended.js#238-239

Unfortunately this rule is, for fairly obvious reasons, impossible to `--fix` automatically (even partially) and each case thus required careful manual analysis.
Hence this ESLint rule is, by some margin, probably the most difficult one that we've enabled thus far. However, using this rule does seem like a good idea in general since allowing variable shadowing could lead to subtle (and difficult to find) bugs or at the very least confusing code.

Please find additional details about the ESLint rule at https://eslint.org/docs/rules/no-shadow

---
[1] Most likely, a very large number of lint errors have prevented this rule from being enabled thus far.
2020-03-25 11:56:05 +01:00
Jonas Jenwald
e4758beaaa Move IsLittleEndianCached and IsEvalSupportedCached to src/shared/util.js
Rather than duplicating the lookup and caching in multiple files, it seems easier to simply move all of this functionality into `src/shared/util.js` instead.
This will also help avoid a bunch of ESLint errors once the `no-shadow` rule is eventually enabled.
2020-03-12 11:36:26 +01:00
Tim van der Meij
f6ffc2bf37
Merge pull request #11598 from Snuffleupagus/polyfill-Map-Set-iteration
Add polyfills to support iteration of `Map` and `Set`
2020-02-14 23:24:20 +01:00
Jonas Jenwald
c97c778f8f [api-minor] Produce non-translated/non-polyfilled builds by default 2020-02-14 18:12:07 +01:00
Jonas Jenwald
4a76ab352c Add polyfills to support iteration of Map and Set
Without this, things such as e.g. `Metadata.getAll` is broken in IE11 (see PR 11596).

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Map#Browser_compatibility

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Set#Browser_compatibility
2020-02-14 15:53:02 +01:00
Jonas Jenwald
3f031f69c2 Move additional worker-thread only functions from src/shared/util.js and into a src/core/core_utils.js instead
This moves the `log2`, `readInt8`, `readUint16`, `readUint32`, and `isSpace` functions since they are only used in the worker-thread.
2020-01-25 00:33:52 +01:00
Jonas Jenwald
9e262ae7fa Enable the ESLint prefer-const rule globally (PR 11450 follow-up)
Please find additional details about the ESLint rule at https://eslint.org/docs/rules/prefer-const

With the recent introduction of Prettier this sort of mass enabling of ESLint rules becomes a lot easier, since the code will be automatically reformatted as necessary to account for e.g. changed line lengths.

Note that this patch is generated automatically, by using the ESLint `--fix` argument, and will thus require some additional clean-up (which is done separately).
2020-01-25 00:20:22 +01:00
Tim van der Meij
d2d9441373
Merge pull request #11489 from Snuffleupagus/rm-FIREFOX-define
Remove the `FIREFOX` build flag, since it's completely unused and simplify a couple of `PDFJSDev` checks
2020-01-24 23:59:13 +01:00
Tim van der Meij
668a29aa45
Merge pull request #11497 from Snuffleupagus/Promise-allSettled
Add support for `Promise.allSettled`
2020-01-22 23:06:54 +01:00
Jonas Jenwald
a39943554a Simplify, and tweak, a couple of PDFJSDev checks
This removes a couple of, thanks to preceeding code, unnecessary `typeof PDFJSDev` checks, and also fixes a couple of incorrectly implemented (my fault) checks intended for `TESTING` builds.
2020-01-21 00:06:15 +01:00
Takashi Tamura
00ce7898a2 Enable import/extensions of ESlint plugin to enforce all import have a .js file extension.
Related to #11465.

- https://github.com/benmosher/eslint-plugin-import/blob/master/docs/rules/extensions.md
2020-01-18 10:53:01 +09:00
Jonas Jenwald
2942233c9c Add support for Promise.allSettled
Please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/allSettled
2020-01-10 14:35:12 +01:00
Jonas Jenwald
36881e3770 Ensure that all import and require statements, in the entire code-base, have a .js file extension
In order to eventually get rid of SystemJS and start using native `import`s instead, we'll need to provide "complete" file identifiers since otherwise there'll be MIME type errors when attempting to use `import`.
2020-01-04 13:01:43 +01:00
Jonas Jenwald
a63f7ad486 Fix the linting errors, from the Prettier auto-formatting, that ESLint --fix couldn't handle
This patch makes the follow changes:
 - Remove no longer necessary inline `// eslint-disable-...` comments.
 - Fix `// eslint-disable-...` comments that Prettier moved down, thus causing new linting errors.
 - Concatenate strings which now fit on just one line.
 - Fix comments that are now too long.
 - Finally, and most importantly, adjust comments that Prettier moved down, since the new positions often is confusing or outright wrong.
2019-12-26 12:35:12 +01:00
Jonas Jenwald
de36b2aaba Enable auto-formatting of the entire code-base using Prettier (issue 11444)
Note that Prettier, purposely, has only limited [configuration options](https://prettier.io/docs/en/options.html). The configuration file is based on [the one in `mozilla central`](https://searchfox.org/mozilla-central/source/.prettierrc) with just a few additions (to avoid future breakage if the defaults ever changes).

Prettier is being used for a couple of reasons:

 - To be consistent with `mozilla-central`, where Prettier is already in use across the tree.

 - To ensure a *consistent* coding style everywhere, which is automatically enforced during linting (since Prettier is used as an ESLint plugin). This thus ends "all" formatting disussions once and for all, removing the need for review comments on most stylistic matters.

Many ESLint options are now redundant, and I've tried my best to remove all the now unnecessary options (but I may have missed some).
Note also that since Prettier considers the `printWidth` option as a guide, rather than a hard rule, this patch resorts to a small hack in the ESLint config to ensure that *comments* won't become too long.

*Please note:* This patch is generated automatically, by appending the `--fix` argument to the ESLint call used in the `gulp lint` task. It will thus require some additional clean-up, which will be done in a *separate* commit.

(On a more personal note, I'll readily admit that some of the changes Prettier makes are *extremely* ugly. However, in the name of consistency we'll probably have to live with that.)
2019-12-26 12:34:24 +01:00
Jonas Jenwald
8ec1dfde49 Add // prettier-ignore comments to prevent re-formatting of certain data structures
There's a fair number of (primarily) `Array`s/`TypedArray`s whose formatting we don't want disturb, since in many cases that would lead to the code becoming much more difficult to read and/or break existing inline comments.

*Please note:* It may be a good idea to look through these cases individually, and possibly re-write some of the them (especially the `String` ones) to reduce the need for all of these ignore commands.
2019-12-26 00:14:03 +01:00
Jonas Jenwald
e24050fa13 [api-minor] Move the ReadableStream polyfill to the global scope
Note that most (reasonably) modern browsers have supported this for a while now, see https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream#Browser_compatibility

By moving the polyfill into `src/shared/compatibility.js` we can thus get rid of the need to manually export/import `ReadableStream` and simply use it directly instead.

The only change here which *could* possibly lead to a difference in behavior is in the `isFetchSupported` function. Previously we attempted to check for the existence of a global `ReadableStream` implementation, which could now pass (assuming obviously that the preceding checks also succeeded).
However I'm not sure if that's a problem, since the previous check only confirmed the existence of a native `ReadableStream` implementation and not that it actually worked correctly. Finally it *could* just as well have been a globally registered polyfill from an application embedding the PDF.js library.
2019-12-11 19:02:37 +01:00
Jonas Jenwald
a8fc306b6e Replace globalScope with the standard globalThis property instead
Please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/globalThis and note that most (reasonably) modern browsers have supported this for a while now, see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/globalThis#Browser_compatibility

Since ESLint doesn't support this new global yet, it was added to the `globals` list in the top-level configuration file to prevent issues.

Finally, for older browsers a polyfill was added in `ssrc/shared/compatibility.js`.
2019-12-08 20:19:02 +01:00
Jonas Jenwald
878432784c [PDFHistory] Move the IE11 pushState/replaceState work-around to src/shared/compatibility.js (PR 10461 follow-up)
I've always disliked the solution in PR 10461, since it required changes to the `PDFHistory` code itself to deal with a bug in IE11.
Now that IE11 support is limited, it seems reasonable to remove these `pushState`/`replaceState` hacks from the main code-base and simply use polyfills instead.
2019-11-11 17:48:04 +01:00
Jonas Jenwald
74e00ed93c Change isNodeJS from a function to a constant
Given that this shouldn't change after the `pdf.js`/`pdf.worker.js` files have been loaded, it doesn't seems necessary to keep this as a function.
2019-11-10 16:44:29 +01:00
Jonas Jenwald
2817121bc1 Convert globalScope and isNodeJS to proper modules
Slightly unrelated to the rest of the patch, but this also removes an out-of-place `globals` definition from the `web/viewer.js` file.
2019-11-10 16:44:29 +01:00
Jonas Jenwald
80342e2fdc Support UTF-16 little-endian strings in the stringToPDFString helper function (bug 1593902)
The bug report seem to suggest that we don't support UTF-16 strings with a BOM (byte order mark), which we *actually* do as evident by both the code and a unit-test.
The issue at play here is rather that we previously only supported big-endian UTF-16 BOM, and the `Title` string in the PDF document is using a *little-endian* UTF-16 BOM instead.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1593902
2019-11-05 12:43:17 +01:00
Jonas Jenwald
0293222b96 [MessageHandler] Convert the code to a proper class 2019-10-30 23:22:59 +01:00
Jonas Jenwald
5d5733c0a7 [MessageHandler] Convert all instances of var to const in the code 2019-10-30 23:22:59 +01:00
Jonas Jenwald
f61fb3e0f9 [MessageHandler] Re-factor the _onComObjOnMessage function to use early returns
When `ReadableStream` support was added to the `MessageHandler`, the `_onComObjOnMessage` function became more complex than previously.
All of the nested `if`/`else if`/`else` branches are now, at least in my opinion, making some of this code a bit difficult to follow. Hence this patch, which attempts to help readability by making use of early `return`s and `Error`s.

The patch also changes a couple of `var`/`let` occurences to `const`.
2019-10-30 23:22:59 +01:00
Jonas Jenwald
62f28e11a3 [MessageHandler] Remove unnecessary usage of in from the code
Note that using `in` leads to unnecessary stringification of the properties, which seems completely unnecessary here. To avoid future problems from these changes the `MessageHandler.on` method will now assert, in non-`PRODUCTION`/`TESTING` builds, that it's always called with a function as expected.

This patch also renames `callbacksCapabilities` to `callbackCapabilities`, note the removed "s", since using a double plural format looks a bit strange.
2019-10-30 23:22:59 +01:00
Jonas Jenwald
3e46e800a0 [MessageHandler] Replace the internal isReply property, as sent when Promise callbacks are used, with enumeration values
Given that the `isReply` property is an internal implementation detail, changing its type shouldn't be a problem. Note that by directly indicating if either data or an Error is sent, it's no longer necessary to use `in` when handling the callback.
2019-10-30 23:22:59 +01:00
Tim van der Meij
ca3a58f93a
Consistently use @returns for returned data types in JSDoc comments
Sometimes we also used `@return`, but `@returns` is what the JSDoc
documentation recommends. Even though `@return` works as an alias, it's
good to use the recommended syntax and to be consistent within the
project.
2019-10-13 13:58:17 +02:00
Tim van der Meij
f4daafc077
Consistently use square brackets for optional parameters in JSDoc comments
Square brackets are recommended to indicate optional parameters. Using
them helps for automatically generating correct documentation.
2019-10-13 13:58:17 +02:00
Tim van der Meij
efd331daa1
Consistently use string for string data types in JSDoc comments
Sometimes we also used `String`, but `string` is the what the JSDoc
documentation recommends.
2019-10-13 13:58:17 +02:00
Jonas Jenwald
03387ebaa8 Update src/shared/compatibility.js to only run with SKIP_BABEL = false set
Rather than specifying certain build targets manually, it seems much more appropriate (and future-proof) to use the `SKIP_BABEL` build target instead.

Also, the patch adds a missing `/* eslint no-var: error */` line since I'm touch the file anyway and no code-changes were necessary for it.
2019-10-13 11:33:41 +02:00
Jonas Jenwald
eabedab38e [MessageHandler] Add a non-PRODUCTION/TESTING check to ensure that wrapReason is called with a valid reason
There shouldn't be any situation where `reason` isn't either an `Error`, or a cloned "Error" sent via `postMessage`.
2019-10-06 14:15:13 +02:00
Jonas Jenwald
9201c8dad4 [MessageHandler] Convert the deleteStreamController helper function to a "private" method instead 2019-10-06 14:15:02 +02:00
Jonas Jenwald
5d93fda4f2 Convert the various ...Exceptions to proper classes, to reduce code duplication
By utilizing a base "class", things become significantly simpler. Unfortunately the new `BaseException` cannot be a proper ES6 class and just extend `Error`, since the SystemJS dependency doesn't seem to play well with that.
Note also that we (generally) need to keep the `name` property on the actual `...Exception` object, rather than on its prototype, since the property will otherwise be dropped during the structured cloning used with `postMessage`.
2019-09-29 10:16:20 +02:00
Jonas Jenwald
0ee373f9cc Replace the bundled ReadableStream polyfill with the web-streams-polyfill npm package (issue 11157)
Compared to the recently replaced `URL` polyfill, the new `ReadableStream` polyfill isn't being exported globally for two reasons:
 - We're currently checking for the existence of a global `ReadableStream` implementation when determining if the Fetch API will be used; please see `isFetchSupported` in the src/display/display_utils.js file.
 - Given that it's much newer functionality (compared to `URL`) and that not all browsers may implement all parts of the specification yet, not exposing the `ReadableStream` globally seems safer for now.
2019-09-23 22:16:59 +02:00
Tim van der Meij
1f5ebfbf0c
Replace our URL polyfill with the one from core-js
`core-js` polyfills have proven to be of good quality and using them
prevents us from having to maintain them ourselves.
2019-09-19 14:09:51 +02:00
Tim van der Meij
c71a291317
Upgrade core-js to version 3.2.1
This only required changing the import paths. The `es` folder contains
all polyfills we need now. If we want to import everything, we need to
explicitly require the `index` file.
2019-09-19 13:58:36 +02:00
Jonas Jenwald
4bd79ec4b3 Inline the resolveOrReject helper function at its call-sites in MessageHandler, and rename an error key to reason
Given that there's only a couple of call-sites, and that the helper function is really simple, it doesn't seem entirely necessary to keep it around. While fewer function calls is always a good thing, in this case the performance impact is small enough to be unmeasurable.

With *one* single exception the code in `MessageHandler` is using `reason` when passing around various Errors, hence this patch also renames an `error` key for consistency.
2019-09-17 14:22:24 +02:00
Jonas Jenwald
0617984b59 Remove unnecessary data.streamId accesses in MessageHandler._processStreamMessage, and use a constant object shape in MessageHandler.sendWithStream
The `streamId` short-hand in `MessageHandler._processStreamMessage` was only used partially througout the method, which seemed kind of strange, hence that's fixed in this patch.
Furthermore, always giving the `streamController` object a constant shape in `MessageHandler.sendWithStream` cannot hurt either.
2019-09-17 14:18:57 +02:00
Jonas Jenwald
74f5a59f43 Ensure that the cancel/error methods on Streams are always called with valid reason arguments 2019-09-02 23:31:07 +02:00
Jonas Jenwald
02bdacef42 Ensure that Errors are handled correctly when using postMessage with Streams in MessageHandler
Having recently worked with this code, it struck me that most of the `postMessage` calls where `Error`s are involved have never been correctly implemented (i.e. missing `wrapReason` calls).
2019-09-02 23:31:07 +02:00
Jonas Jenwald
cd82b81bc7 Inline the resolveCall helper function at its call-sites in MessageHandler
There's only three call-sites and one of them doesn't even need the complete functionality of `resolveCall`, hence it seems reasonable to just inline this code.
An additional benefit of this is that the `Function.prototype.apply()` instance can also be converted into "normal" function calls, which should be a tiny bit more efficient.

The patch also replaces a number of unnecessary arrow functions, in relevant parts of the `MessageHandler` code, with "normal" functions instead.
Finally, all `Promise.resolve().then(...)` calls are replaced with `new Promise(...)` instead since the latter is a tiny bit more efficient. This also explains the test failures on the Linux bot, with a prior version of the patch, since the `Promise.resolve().then(...)` format essentially creates two Promises thus causing additional delay.
2019-09-01 13:40:19 +02:00
Jonas Jenwald
055f03938b Remove support for the scope parameter in the MessageHandler.on method
At this point in time it's easy to convert the `MessageHandler.on` call-sites to use arrow functions, and thus let the JavaScript engine handle scopes for us, rather than having to manually keep references to the relevant scopes in `MessageHandler`.[1]
An additional benefit of this is that a couple of `Function.prototype.call()` instances can now be converted into "normal" function calls, which should be a tiny bit more efficient.

All in all, I don't see any compelling reason why it'd be necessary to keep supporting custom `scope`s in the `MessageHandler` implementation.

---
[1] In the event that a custom scope is ever needed, simply using `bind` on the handler function when calling `MessageHandler.on` ought to work as well.
2019-09-01 09:24:15 +02:00
Jonas Jenwald
f71ea2de0e Remove the makeReasonSerializable helper function, and use wrapReason instead, in src/shared/message_handler.js
Since `wrapReason` and `makeReasonSerializable` are essentially functionally equivalent it doesn't seem necessary to keep both of them around, especially when `makeReasonSerializable` only has a *single* call-site.
2019-08-30 19:36:10 +02:00
Jonas Jenwald
4e6a9b54c7 Change the internal stream property, as sent when Streams are used, from a String to a Number
Given that the `stream` property is an internal implementation detail, changing its type shouldn't be a problem. By using Numbers instead, we can avoid unnecessary String allocations when creating/processing Streams.
2019-08-30 13:27:18 +02:00
Jonas Jenwald
252a3e35fb Reduce the amount of unnecessary function calls and object allocations, in MessageHandler, when using Streams
With PR 11069 we're now using Streams for OperatorList parsing (in addition to just TextContent parsing), which brings the nice benefit of being able to easily abort parsing on the worker-thread thus saving resources.

However, since we're now creating many more `ReadableStream` there appears to be a tiny bit more overhead because of it (giving ~1% slower runtime of `browsertest` on the bots). In this case we're just going to have to accept such a small regression, since the benefits of using Streams clearly outweighs it.

What we *can* do here, is to try and make the Streams part of the `MessageHandler` implementation slightly more efficient by e.g. removing unnecessary function calls (which has been helpful in other parts of the code-base). To that end, this patch makes the following changes:

 - Actually support `transfers` in `MessageHandler.sendWithStream`, since the parameter was being ignored.

 - Inline the `sendStreamRequest`/`sendStreamResponse` helper functions at their respective call-sites. Obviously this causes some amount of code duplication, however I still think this change seems reasonable since for each call-site:
   - It avoids making one unnecessary function call.
   - It avoids allocating one temporary object.
   - It avoids sending, and thus structure clone, various undefined object properties.

 - Inline objects in the `MessageHandler.{send, sendWithPromise}` methods.

 - Finally, directly call `comObj.postMessage` in various methods when `transfers` are *not* present, rather than calling `MessageHandler.postMessage`, to further reduce the amount of function calls.
2019-08-30 12:32:20 +02:00
Brendan Dahl
c8129b8787 Move polyfill for codePointAt to String prototype.
This method belongs on the prototype not the String object.
2019-08-16 14:32:43 -07:00
Jonas Jenwald
7f456b3e2e Replace of all usages of var with let/const in the src/shared/util.js file
Also removes a couple of unnecessary (temporary) variable assigments in `arraysToBytes` and uses template strings in a few spots.
2019-08-11 14:35:35 +02:00
Jonas Jenwald
f6c4a1f080 Convert Util to a class with static methods
Also replaces `var` with `const` in all the relevant code.
2019-08-11 14:35:35 +02:00
Jonas Jenwald
7ee370a394 Remove the skipEmpty parameter from Util.intersect (PR 11059 follow-up)
Looking at this again, it struck me that added functionality in `Util.intersect` is probably more confusing than helpful in general; sorry about the churn in this code!
Based on the parameter name you'd probably expect it to only match when the intersection is `[0, 0,  0, 0]` and not when only one component is zero, hence the `skipEmpty` parameter thus feels too tightly coupled to the `Page.view` getter.
2019-08-11 14:33:52 +02:00
Jonas Jenwald
d637b25e36 Fallback gracefully when encountering corrupt PDF files with empty /MediaBox and /CropBox entries
This is based on a real-world PDF file I encountered very recently[1], although I'm currently unable to recall where I saw it.
Note that different PDF viewers handle these sort of errors differently, with Adobe Reader outright failing to render the attached PDF file whereas PDFium mostly handles it "correctly".

The patch makes the following notable changes:
 - Refactor the `cropBox` and `mediaBox` getters, on the `Page`, to reduce unnecessary duplication. (This will also help in the future, if support for extracting additional page bounding boxes are added to the API.)
 - Ensure that the page bounding boxes, i.e. `cropBox` and `mediaBox`, are never empty to prevent issues/weirdness in the viewer.
 - Ensure that the `view` getter on the `Page` will never return an empty intersection of the `cropBox` and `mediaBox`.
 - Add an *optional* parameter to `Util.intersect`, to allow checking that the computed intersection isn't actually empty.
 - Change `Util.intersect` to have consistent return types, since Arrays are of type `Object` and falling back to returning a `Boolean` thus seem strange.

---

[1] In that case I believe that only the `cropBox` was empty, but it seemed like a good idea to attempt to fix a bunch of related cases all at once.
2019-08-09 10:18:13 +02:00
Brendan Dahl
31d71808e7 [api-minor] Update telemetry to use 'categorical' histograms.
Firefox telemetry supports using string labels now. Convert our integers
that we used for categories to just use strings.

The upstream work will happen in:
https://bugzilla.mozilla.org/show_bug.cgi?id=1566882
2019-08-01 09:51:02 -07:00
vlastimilmaca
fe49f0f766 Annotations - Implement parsing of IRT, RT, State and StateModel 2019-07-16 23:33:07 +02:00
Jonas Jenwald
b548bafef7 Simplify, and inline, the finalize function in the MessageHandler class
The `finalize` helper function has only a *single* call-site, and furthermore it's just a one-liner too. Furthermore it's only ever called with a `Promise` as its argument, meaning that it's unnecessarily convoluted as well (i.e. the `Promise.resolve()` part shouldn't be necessary).
Hence this code can be both simplified *and* inlined at its only call-site instead.
2019-07-13 17:54:32 +02:00
Jonas Jenwald
17116917f7 Remove useless wrapReason calls in the MessageHandler class
Currently `wrapReason` is manually called at *every* `resolveOrReject` call-site, despite it being completely unnecessary unless there's an actual error being handled. This is obviously inefficient, and it's easy enough to avoid by having `resolveOrReject` handle this only when actually needed.
2019-07-13 13:08:29 +02:00
Jonas Jenwald
173fbef05b Enable the consistent-return ESLint rule
This rule is already enabled in mozilla-central, and helps ensure more consistent functions/methods, see https://searchfox.org/mozilla-central/rev/b9da45f63cb567244933c77b2c7e827a057d3f9b/tools/lint/eslint/eslint-plugin-mozilla/lib/configs/recommended.js#119-120

Please see https://eslint.org/docs/rules/consistent-return for additional information.
2019-05-11 14:27:21 +02:00
wuhao.daraw
1472c10bab fix: electron enviroment detection 2019-03-26 20:52:49 +08:00
Jonas Jenwald
88f9e633dd Try to improve text-selection for Type3 fonts that utilize a non-default /FontMatrix (bug 1513120)
For Type3 fonts text-selection is often not that great, and there's a couple of heuristics used to try and improve things. This patch simple extends those heuristics a bit, and fixes a pre-existing "naive" array comparison, but this all feels a bit brittle to say the least.

The existing Type3 test-coverage isn't that great in general, and in particular Type3 `text` tests are few and far between, hence why this patch adds *two* different new `text` tests.
2019-03-12 10:32:08 +01:00
Jonas Jenwald
7caf769a66 Move the deprecated helper function to the src/display/display_utils.js file
Given that the function is (purposely) independent of the verbosity level and that its message is worded to only apply on the main-thread, there's no reason to duplicate this across the built `pdf.js`/`pdf.worker.js` files.
2019-03-02 20:23:56 +01:00
Jonas Jenwald
db5dc14158 Move worker-thread only functions from src/shared/util.js and into a new src/core/core_utils.js file
The `src/shared/util.js` file is being bundled into both the `pdf.js` and `pdf.worker.js` files, meaning that its code is by definition duplicated.
Some main-thread only utility functions have already been moved to a separate `src/display/display_utils.js` file, and this patch simply extends that concept to utility functions which are used *only* on the worker-thread.

Note in particular the `getInheritableProperty` function, which expects a `Dict` as input and thus *cannot* possibly ever be used on the main-thread.
2019-02-24 00:35:39 +01:00
Jonas Jenwald
b6d090cc14 Fallback to the built-in font renderer when font loading fails
After PR 9340 all glyphs are now re-mapped to a Private Use Area (PUA) which means that if a font fails to load, for whatever reason[1], all glyphs in the font will now render as Unicode glyph outlines.
This obviously doesn't look good, to say the least, and might be seen as a "regression" since previously many glyphs were left in their original positions which provided a slightly better fallback[2].

Hence this patch, which implements a *general* fallback to the PDF.js built-in font renderer for fonts that fail to load (i.e. are rejected by the sanitizer). One caveat here is that this only works for the Font Loading API, since it's easy to handle errors in that case[3].

The solution implemented in this patch does *not* in any way delay the loading of valid fonts, which was the problem with my previous attempt at a solution, and will only require a bit of extra work/waiting for those fonts that actually fail to load.

*Please note:* This patch doesn't fix any of the underlying PDF.js font conversion bugs that's responsible for creating corrupt font files, however it does *improve* rendering in a number of cases; refer to this possibly incomplete list:

[Bug 1524888](https://bugzilla.mozilla.org/show_bug.cgi?id=1524888)
Issue 10175
Issue 10232

---
[1] Usually because the PDF.js font conversion code wasn't able to parse the font file correctly.

[2] Glyphs fell back to some default font, which while not accurate was more useful than the current state.

[3] Furthermore I'm not sure how to implement this generally, assuming that's even possible, and don't really have time/interest to look into it either.
2019-02-11 10:27:08 +01:00
Jonas Jenwald
3bcf9187ec Add a polyfill for classList.{add, remove} with more than one parameter
Unsurprisingly IE11 doesn't support this, so a polyfill is needed since otherwise the sidebar can no longer be opened.

Also, simplifies the existing `classList.toggle` polyfill.
2019-02-08 13:35:01 +01:00
Jonas Jenwald
614e502227 [api-minor] Remove the document.currentScript polyfill
This polyfill is currently used in only *one* file, i.e. `src/display/api.js`, and only when trying to build a *fallback* `workerSrc` path.

Given that the global `workerSrc` should *always* be set[1] when using the PDF.js library[2], and that the fallback `workerSrc` should only be regarded as a best-effort solution anyway, there isn't a particularily strong reason to keep the compatibility code in my opinion.

---
[1] Other supported options include setting the global `workerPort`, or passing in a `PDFWorker` instance as part of the `getDocument` call.

[2] Which is clearly mentioned in the JSDocs in `src/display/worker_options.js`.
2019-02-03 14:09:24 +01:00
Jonas Jenwald
22468817e1 Add a settled property, tracking the fulfilled/rejected stated of the Promise, to createPromiseCapability
This allows cleaning-up code which is currently manually tracking the state of the Promise of a `createPromiseCapability` instance.
2019-02-02 15:18:56 +01:00
Jonas Jenwald
249b199ff1 Stop bundling the ReadableStream polyfill in MOZCENTRAL builds (PR 10470 follow-up)
Based on the discussion in https://bugzilla.mozilla.org/show_bug.cgi?id=1521413, this patch simply removes the `ReadableStream` polyfill completely from MOZCENTRAL builds.

With this patch, the size of the `gulp mozcentral` build target is thus further reduced (building on PR 10470):

|       | `build/mozcentral`
|-------|-------------------
|master |   3 339 666
|patch  |   3 209 572
2019-01-23 20:33:20 +01:00
Jonas Jenwald
01d624f6a0 Add an Array.from polyfill, using core-js, and remove some compatibility hacks from the src/display/content_disposition.js file 2019-01-20 08:49:20 +01:00
Tim van der Meij
66acc7397f
Merge pull request #10470 from Snuffleupagus/mozcentral-streams
Try to, completely, avoid loading the `ReadableStream` polyfill in MOZCENTRAL builds
2019-01-19 21:22:18 +01:00
Jonas Jenwald
480110625a Try to, completely, avoid loading the ReadableStream polyfill in MOZCENTRAL builds
With https://bugzilla.mozilla.org/show_bug.cgi?id=1505122 landing in Firefox 65, the native `ReadableStream` implementation is now enabled by default in Firefox.

Obviously it would be nice to simply stop bundling the polyfill in MOZCENTRAL builds altogether, however given that it's still possible to disable[1] `ReadableStream` this is probably not a good idea just yet.
Nonetheless, now that native support is available, it seems unnecessary (and wasteful) to keep bundling the polyfill twice[2] in MOZCENTRAL builds. Hence this patch, which contains a suggest approach for packing the polyfill in a *separate* file which is then *only* loaded if/when needed.

With this patch, the size of the `gulp mozcentral` build target is thus reduced accordingly:

|       | `build/mozcentral`
|-------|-------------------
|master |   3 461 089
|patch  |   3 340 268

Besides the PDF.js files taking up less space in Firefox this way, the additional benefit is that there's (by default) less code that needs to be loaded and parsed when the PDF Viewer is used which also cannot hurt.

---
[1] In `about:config`, by toggling the `javascript.options.streams` preference.

[2] Once in the `build/pdf.js` file, and once in the `build/pdf.worker.js` file.
2019-01-19 09:05:01 +01:00
Jonas Jenwald
24a688d6c6 Convert some usage of indexOf to startsWith/includes where applicable
In many cases in the code you don't actually care about the index itself, but rather just want to know if something exists in a String/Array or if a String starts in a particular way. With modern JavaScript functionality, it's thus possible to remove a number of existing `indexOf` cases.
2019-01-18 17:57:41 +01:00
Jonas Jenwald
68ad3e8e9d Tweak the DOMTokenList.toggle polyfill (issue 10460) 2019-01-16 20:15:44 +01:00
Jonas Jenwald
358cd0c096 Add a few more String polyfills (startsWith, endsWith, padStart, padEnd) 2019-01-06 20:10:55 +01:00
Jonas Jenwald
60bcce184e Check that the first page can be successfully loaded, to try and ascertain the validity of the XRef table (issue 7496, issue 10326)
For PDF documents with sufficiently broken XRef tables, it's usually quite obvious when you need to fallback to indexing the entire file. However, for certain kinds of corrupted PDF documents the XRef table will, for all intents and purposes, appear to be valid. It's not until you actually try to fetch various objects that things will start to break, which is the case in the referenced issues[1].

Since there's generally a real effort being in made PDF.js to load even corrupt PDF documents, this patch contains a suggested approach to attempt to do a bit more validation of the XRef table during the initial document loading phase.

Here the choice is made to attempt to load the *first* page, as a basic sanity check of the validity of the XRef table. Please note that attempting to load a more-or-less arbitrarily chosen object without any context of what it's supposed to be isn't a very useful, which is why this particular choice was made.
Obviously, just because the first page can be loaded successfully that doesn't guarantee that the *entire* XRef table is valid, however if even the first page fails to load you can be reasonably sure that the document is *not* valid[2].

Even though this patch won't cause any significant increase in the amount of parsing required during initial loading of the document[3], it will require loading of more data upfront which thus delays the initial `getDocument` call.
Whether or not this is a problem depends very much on what you actually measure, please consider the following examples:

```javascript
console.time('first');
getDocument(...).promise.then((pdfDocument) => {
  console.timeEnd('first');
});

console.time('second');
getDocument(...).promise.then((pdfDocument) => {
  pdfDocument.getPage(1).then((pdfPage) => { // Note: the API uses `pageNumber >= 1`, the Worker uses `pageIndex >= 0`.
    console.timeEnd('second');
  });
});
```

The first case is pretty much guaranteed to show a small regression, however the second case won't be affected at all since the Worker caches the result of `getPage` calls. Again, please remember that the second case is what matters for the standard PDF.js use-case which is why I'm hoping that this patch is deemed acceptable.

---
[1] In issue 7496, the problem is that the document is edited without the XRef table being correctly updated.
In issue 10326, the generator was sorting the XRef table according to the offsets rather than the objects.

[2] The idea of checking the first page in particular came from the "standard" use-case for the PDF.js library, i.e. the default viewer, where a failure to load the first page basically means that nothing will work; note how `{BaseViewer, PDFThumbnailViewer}.setDocument` depends completely on being able to fetch the *first* page.

[3] The only extra parsing is caused by, potentially, having to traverse *part* of the `Pages` tree to find the first page.
2018-12-29 12:47:25 +01:00
Romain Petit
13b0ca6b2a Don't detect nw.js as node.js
nw.js is chrome plus nodejs.
It will succeed everywhere chrome succeeds, but fail in many cases where nodejs succeeds (see issue 9071).
So it's safer to consider it as a browser context rather than a nodejs context.

Make travis happy again

CS

Readability + Explanation

The relevant portion of the NW.js documentation:
http://docs.nwjs.io/en/latest/For%20Users/Advanced/JavaScript%20Contexts%20in%20NW.js/#access-nodejs-and-nwjs-api-in-browser-context

Added full link to relevant doc.
2018-11-07 11:14:22 +01:00
Jonas Jenwald
f23dba1c10 Change canvasInRendering to a WeakSet instead of a WeakMap
Note how nowhere in the code `canvasInRendering.get()` is ever called, and that this structure is really only used to store references to `<canvas>` DOM elements.
The reason for this being a `WeakMap` is probably because at the time we weren't using `core-js` polyfills yet, and since there already existed a manually implemented `WeakMap` polyfill it was probably simpler to use that.
2018-10-31 18:15:23 +01:00
Jonas Jenwald
4cde844ffe Add a DOMTokenList.toggle polyfill for the second, optional, "force" parameter
This is based on the polyfill available at https://developer.mozilla.org/en-US/docs/Web/API/Element/classList#Polyfill
2018-10-12 15:41:09 +02:00
Jonas Jenwald
d6f4d2ff33 Add a Symbol polyfill, using core-js, to allow using for...of loops
https://github.com/zloirock/core-js#ecmascript-symbol
2018-09-29 16:05:00 +02:00
Tim van der Meij
bf13c8a50b
Use the const keyword for constants in src/shared/util.js
Moreover, move general constants to the top of the file, i.e., those
that are not closely tied to a function in the file.
2018-09-11 16:17:45 +02:00
Tim van der Meij
99de25d6cc
Implement unit tests for the isSameOrigin and createValidAbsoluteUrl utility functions
Moreover, mark the `isValidProtocol` function as private since it's only
used in the utilities file and is not (meant to be) exported.
2018-09-11 16:17:45 +02:00
Tim van der Meij
66422eb83e
Merge pull request #9340 from brendandahl/private-use
Map all glyphs to the private use area and duplicate the first glyph.
2018-09-08 17:51:04 +02:00