pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	3f031f69c2	Move additional worker-thread only functions from `src/shared/util.js` and into a `src/core/core_utils.js` instead This moves the `log2`, `readInt8`, `readUint16`, `readUint32`, and `isSpace` functions since they are only used in the worker-thread.	2020-01-25 00:33:52 +01:00
Jonas Jenwald	83bdb525a4	Fix remaining linting errors, from enabling the `prefer-const` ESLint rule globally This covers cases that the `--fix` command couldn't deal with, and in a few cases (notably `src/core/jbig2.js`) the code was changed to use block-scoped variables instead.	2020-01-25 00:20:23 +01:00
Jonas Jenwald	9e262ae7fa	Enable the ESLint `prefer-const` rule globally (PR 11450 follow-up) Please find additional details about the ESLint rule at https://eslint.org/docs/rules/prefer-const With the recent introduction of Prettier this sort of mass enabling of ESLint rules becomes a lot easier, since the code will be automatically reformatted as necessary to account for e.g. changed line lengths. Note that this patch is generated automatically, by using the ESLint `--fix` argument, and will thus require some additional clean-up (which is done separately).	2020-01-25 00:20:22 +01:00
Tim van der Meij	d2d9441373	Merge pull request #11489 from Snuffleupagus/rm-FIREFOX-define Remove the `FIREFOX` build flag, since it's completely unused and simplify a couple of `PDFJSDev` checks	2020-01-24 23:59:13 +01:00
Tim van der Meij	668a29aa45	Merge pull request #11497 from Snuffleupagus/Promise-allSettled Add support for `Promise.allSettled`	2020-01-22 23:06:54 +01:00
Tim van der Meij	a88dec197f	Merge pull request #11511 from Snuffleupagus/eslint-no-nested-ternary Enable the `no-nested-ternary` ESLint rule (PR 11488 follow-up)	2020-01-22 22:52:59 +01:00
Jonas Jenwald	3b78f4e8f8	Fix a couple of cases where Prettier broke existing formatting (PR 11446 follow-up) These two cases should have been whitelisted prior to re-formatting respectively had the comments fixed afterwards, however I unfortunately missed them because of the massive size of the diff.	2020-01-22 09:12:12 +01:00
Jani Pehkonen	809b96b40c	Hide .notdef glyphs in non-embedded Type1 fonts and don't ignore Widths Fixes #11403 The PDF uses the non-embedded Type1 font Helvetica. Character codes 194 and 160 (`Â` and `NBSP`) are encoded as `.notdef`. We shouldn't show those glyphs because it seems that Acrobat Reader doesn't draw glyphs that are named `.notdef` in fonts like this. In addition to testing `glyphName === ".notdef"`, we must also test `glyphName === ""` because the name `""` is used in `core/encodings.js` for undefined glyphs in encodings like `WinAnsiEncoding`. The solution above hides the `Â` characters but now the replacement character (space) appears to be too wide. I found out that PDF.js ignores the font's `Widths` array if the font has no `FontDescriptor` entry. That happens in #11403, so the default widths of Helvetica were used as specified in `core/metrics.js` and `.notdef` got a width of 333. The correct width is 0 as specified by the `Widths` array in the PDF. Thus we must never ignore `Widths`.	2020-01-21 21:35:25 +02:00
Jonas Jenwald	a39943554a	Simplify, and tweak, a couple of `PDFJSDev` checks This removes a couple of, thanks to preceding code, unnecessary `typeof PDFJSDev` checks, and also fixes a couple of incorrectly implemented (my fault) checks intended for `TESTING` builds.	2020-01-21 00:06:15 +01:00
Jonas Jenwald	7322a24ce4	Remove the `FIREFOX` build flag, since it's completely unused After PR 9566, which removed all of the old Firefox extension code, the `FIREFOX` build flag is no longer used for anything. It thus seems to me that it should be removed, for a couple of reasons: - It's simply dead code now, which only serves to add confusion when looking at the `PDFJSDev` calls. - It used to be that `MOZCENTRAL` and `FIREFOX` were almost always used together. However, ever since PR 9566 there's obviously been no effort put into keeping the `FIREFOX` build flags up to date. - In the event that a new, WebExtension based, Firefox addon is created in the future you'd still need to audit all `MOZCENTRAL` (and possibly `CHROME`) build flags to see what'd make sense for the addon.	2020-01-21 00:06:15 +01:00
Tim van der Meij	ccf327538b	Merge pull request #11519 from tamuratak/enable_eslint_import_extensions Enable import/extensions of ESLint plugin to enforce all `import` have a `.js` file extension.	2020-01-19 17:37:19 +01:00
Jonas Jenwald	ee87e898db	Update the `GlobalWorkerOptions.workerSrc` JSDoc comment This particular JSDoc comment is fairly old and it also contains some now unrelated/confusing information. The only way to guarantee that the PDF.js library works as expected is to correctly set the global `workerSrc`[1], hence giving the impression that the option isn't strictly necessary is thus incorrect. --- [1] Since advertising the fallbackWorkerSrc functionality definitely seems like the wrong thing to do.	2020-01-19 12:44:42 +01:00
Takashi Tamura	00ce7898a2	Enable import/extensions of ESLint plugin to enforce all `import` have a `.js` file extension. Related to #11465. - https://github.com/benmosher/eslint-plugin-import/blob/master/docs/rules/extensions.md	2020-01-18 10:53:01 +09:00
Jonas Jenwald	9ab7c280aa	Cache the fallback font dictionary on the `PartialEvaluator` (PR 11218 follow-up) This way we'll benefit from the existing font caching, and can thus avoid re-creating a fallback font over and over again during parsing. (These changes necessitated the previous patch, since otherwise breakage could occur e.g. with fake workers.)	2020-01-16 15:12:05 +01:00
Jonas Jenwald	090ff116d4	Ensure that full clean-up is always run when handling the "Terminate" message in `src/core/worker.js` This is beneficial in situations where the Worker is being re-used, for example with fake workers, since it ensures that things like font resources are actually released.	2020-01-16 15:11:56 +01:00
Jonas Jenwald	c591826f3b	Enable the `no-nested-ternary` ESLint rule (PR 11488 follow-up) This rule is already enabled in mozilla-central, and helps avoid some confusing formatting, see https://searchfox.org/mozilla-central/rev/9e45d74b956be046e5021a746b0c8912f1c27318/tools/lint/eslint/eslint-plugin-mozilla/lib/configs/recommended.js#209-210 With the recent introduction of Prettier some of the existing nested ternary statements became even more difficult to read, since any possibly helpful indentation was removed. This particular ESLint rule wasn't entirely straightforward to enable, and I do recognize that there's a certain amount of subjectivity in the changes being made. Generally, the changes in this patch fall into three categories: - Cases where a value is only clamped to a certain range (the easiest ones to update). - Cases where the values involved are "simple", such as Numbers and Strings, which are re-factored to initialize the variable with the default value and only update it when necessary by using `if`/`else if` statements. - Cases with more complex and/or larger values, such as TypedArrays, which are re-factored to let the variable be (implicitly) undefined and where all values are then set through `if`/`else if`/`else` statements. Please find additional details about the ESLint rule at https://eslint.org/docs/rules/no-nested-ternary	2020-01-14 17:49:39 +01:00
Jonas Jenwald	78917bab91	Update `src/display/{annotation_layer.js, svg.js}` to determine the `fontWeight` in the same way as with canvas (PR 6091 and 7839 follow-up)	2020-01-14 15:29:59 +01:00
Jonas Jenwald	6590cc32f2	Extract the subroutine bias computation into a helper function in `src/core/font_renderer.js`	2020-01-14 15:29:53 +01:00
Jonas Jenwald	2942233c9c	Add support for `Promise.allSettled` Please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/allSettled	2020-01-10 14:35:12 +01:00
Tim van der Meij	93aa613db7	Merge pull request #11465 from Snuffleupagus/import-file-extension Ensure that all `import` and `require` statements, in the entire code-base, have a `.js` file extension	2020-01-06 23:24:43 +01:00
Jonas Jenwald	94f084958a	Update the year in the `license_header` files	2020-01-05 12:14:03 +01:00
Jonas Jenwald	36881e3770	Ensure that all `import` and `require` statements, in the entire code-base, have a `.js` file extension In order to eventually get rid of SystemJS and start using native `import`s instead, we'll need to provide "complete" file identifiers since otherwise there'll be MIME type errors when attempting to use `import`.	2020-01-04 13:01:43 +01:00
Jonas Jenwald	f8ab8c4d3a	Move the SegoeUISymbol font to the `getNonStdFontMap` (PR 8698 follow-up) For reasons that I now cannot even begin to understand, the non-standard SegoeUISymbol font was placed in the `getStdFontMap`. That honestly makes no sense, hence this patch which does what I should have done from the start.	2019-12-28 11:02:49 +01:00
Jonas Jenwald	a63f7ad486	Fix the linting errors, from the Prettier auto-formatting, that ESLint `--fix` couldn't handle This patch makes the following changes: - Remove no longer necessary inline `// eslint-disable-...` comments. - Fix `// eslint-disable-...` comments that Prettier moved down, thus causing new linting errors. - Concatenate strings which now fit on just one line. - Fix comments that are now too long. - Finally, and most importantly, adjust comments that Prettier moved down, since the new positions are often confusing or outright wrong.	2019-12-26 12:35:12 +01:00
Jonas Jenwald	de36b2aaba	Enable auto-formatting of the entire code-base using Prettier (issue 11444) Note that Prettier, purposely, has only limited [configuration options](https://prettier.io/docs/en/options.html). The configuration file is based on [the one in `mozilla central`](https://searchfox.org/mozilla-central/source/.prettierrc) with just a few additions (to avoid future breakage if the defaults ever change). Prettier is being used for a couple of reasons: - To be consistent with `mozilla-central`, where Prettier is already in use across the tree. - To ensure a consistent coding style everywhere, which is automatically enforced during linting (since Prettier is used as an ESLint plugin). This thus ends "all" formatting discussions once and for all, removing the need for review comments on most stylistic matters. Many ESLint options are now redundant, and I've tried my best to remove all the now unnecessary options (but I may have missed some). Note also that since Prettier considers the `printWidth` option as a guide, rather than a hard rule, this patch resorts to a small hack in the ESLint config to ensure that comments won't become too long. Please note: This patch is generated automatically, by appending the `--fix` argument to the ESLint call used in the `gulp lint` task. It will thus require some additional clean-up, which will be done in a separate commit. (On a more personal note, I'll readily admit that some of the changes Prettier makes are extremely ugly. However, in the name of consistency we'll probably have to live with that.)	2019-12-26 12:34:24 +01:00
Jonas Jenwald	8ec1dfde49	Add `// prettier-ignore` comments to prevent re-formatting of certain data structures There's a fair number of (primarily) `Array`s/`TypedArray`s whose formatting we don't want to disturb, since in many cases that would lead to the code becoming much more difficult to read and/or break existing inline comments. Please note: It may be a good idea to look through these cases individually, and possibly re-write some of them (especially the `String` ones) to reduce the need for all of these ignore commands.	2019-12-26 00:14:03 +01:00
Jonas Jenwald	70e3345cb4	Support OpenAction dictionaries without `Type` entries when parsing `Print` actions (issue 11442) The PDF generator didn't bother including the `Type` entry in the OpenAction dictionary, hence we skipped parsing the `Print` action.	2019-12-24 10:41:33 +01:00
Wojciech Maj	d40d33682b	Extract & use createHeaders helper in src/display/fetch_stream.js	2019-12-23 08:08:17 +01:00
Jonas Jenwald	d370037618	[api-minor] Tweak the Node.js fake worker loader to prevent `Critical dependency: ...` warnings from Webpack Since bundlers, such as Webpack, cannot be told to leave `require` statements alone we are thus forced to jump through hoops in order to prevent these warnings in third-party deployments of the PDF.js library; please see [Webpack issue 8826](https://github.com/webpack/webpack) and libraries such as [require-fool-webpack](https://github.com/sindresorhus/require-fool-webpack). Please note: This is based on the assumption that code running in Node.js won't ever be affected by e.g. Content Security Policies that prevent use of `eval`. If that ever occurs, we should revert to a normal `require` statement and simply document the Webpack warnings instead.	2019-12-20 17:36:10 +01:00
Jonas Jenwald	8519f87efb	Re-factor the `setupFakeWorkerGlobal` function (in `src/display/api.js`), and the `loadFakeWorker` function (in `web/app.js`) This patch reduces some duplication, by moving all fake worker loader code into the `setupFakeWorkerGlobal` function. Furthermore, the functions are simplified further by using `async`/`await` where appropriate.	2019-12-20 17:36:10 +01:00
Jonas Jenwald	a5485e1ef7	[api-minor] Support loading the fake worker from `GlobalWorkerOptions.workerSrc` in Node.js There's no particularly good reason, as far as I can tell, to not support a custom worker path in Node.js environments (even if workers aren't supported). This patch thus makes the Node.js fake worker loader code-path consistent with the fallback code-path used with browser fake worker loader. Finally, this patch also deprecates[1] the `fallbackWorkerSrc` functionality, except in Node.js, since the user should always provide correct worker options since the fallback is nothing more than a best-effort solution. --- [1] Although it probably shouldn't be removed until the next major version.	2019-12-20 17:36:10 +01:00
Jonas Jenwald	591e754831	Move the fake worker loader code into the `PDFWorkerClosure` Given that this code isn't needed "globally" in the file, it seems reasonable to move it to where it's actually used instead.	2019-12-20 17:36:10 +01:00
Jonas Jenwald	aab0f91740	[api-minor] Simplify the fallback fake worker loader code in `src/display/api.js` For performance reasons, and to avoid hanging the browser UI, the PDF.js library should always be used with web workers enabled. At this point in time all of the supported browsers should have proper worker support, and Node.js is thus the only environment where workers aren't supported. Hence it no longer seems relevant/necessary to provide, by default, fake worker loaders for various JS builders/bundlers/frameworks in the PDF.js code itself.[1] In order to simplify things, the fake worker loader code is thus simplified to now only support Node.js usage respectively "normal" browser usage out-of-the-box.[2] Please note: The officially intended way of using the PDF.js library is with workers enabled, which can be done by setting `GlobalWorkerOptions.workerSrc`, `GlobalWorkerOptions.workerPort`, or manually providing a `PDFWorker` instance when calling `getDocument`. --- [1] Note that it's still possible to manually disable workers, simply by manually loading the built `pdf.worker.js` file into the (current) global scope, however this is mostly intended for testing/debugging purposes. [2] Unfortunately some bundlers such as Webpack, when used with third-party deployments of the PDF.js library, will start to print `Critical dependency: ...` warnings when run against the built `pdf.js` file from this patch. The reason is that despite the `require` calls being protected by runtime `isNodeJS` checks, it's not possible to simply tell Webpack to just ignore the `require`; please see [Webpack issue 8826](https://github.com/webpack/webpack) and libraries such as [require-fool-webpack](https://github.com/sindresorhus/require-fool-webpack).	2019-12-20 17:36:08 +01:00
Jonas Jenwald	dbb82f05fc	Re-factor the `find` helper function, in `src/core/document.js`, to search through the raw bytes rather than a string During initial parsing of every PDF document we're currently creating a few `1 kB` strings, in order to find certain commands needed for initialization. This seems inefficient, not to mention completely unnecessary, since we can just as well search through the raw bytes directly instead (similar to other parts of the code-base). One small complication here is the need to support backwards search, which does add some amount of "duplication" to this function. The main benefits here are: - No longer necessary to allocate temporary `1 kB` strings during initial parsing, thus saving some memory. - In practice, for well-formed PDF documents, the number of iterations required to find the commands are usually very low. (For the `tracemonkey.pdf` file, there's a total of only 30 loop iterations.)	2019-12-14 13:43:26 +01:00
Jonas Jenwald	e24050fa13	[api-minor] Move the `ReadableStream` polyfill to the global scope Note that most (reasonably) modern browsers have supported this for a while now, see https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream#Browser_compatibility By moving the polyfill into `src/shared/compatibility.js` we can thus get rid of the need to manually export/import `ReadableStream` and simply use it directly instead. The only change here which could possibly lead to a difference in behavior is in the `isFetchSupported` function. Previously we attempted to check for the existence of a global `ReadableStream` implementation, which could now pass (assuming obviously that the preceding checks also succeeded). However I'm not sure if that's a problem, since the previous check only confirmed the existence of a native `ReadableStream` implementation and not that it actually worked correctly. Finally it could just as well have been a globally registered polyfill from an application embedding the PDF.js library.	2019-12-11 19:02:37 +01:00
Jonas Jenwald	b00835f589	Attempt to improve the `PDFDocument` error message for empty files (issue 5887) Given that the error in question is surfaced on the API-side, this patch makes the following changes: - Updates the wording such that it'll hopefully be slightly easier for users to understand. - Changes the plain `Error` to an `InvalidPDFException` instead, since that should work better with the existing Error handling. - Adds a unit-test which loads an empty PDF document (and also improves a pre-existing `InvalidPDFException` message and its test-case).	2019-12-09 15:45:50 +01:00
Tim van der Meij	a6db045789	Merge pull request #11387 from Snuffleupagus/issue-11385 Handle corrupt ASCII85Decode inline images with truncated EOD markers (issue 11385)	2019-12-08 20:27:46 +01:00
Tim van der Meij	16778118f6	Merge pull request #11391 from Snuffleupagus/globalThis Replace `globalScope` with the standard `globalThis` property instead	2019-12-08 20:23:19 +01:00
Jonas Jenwald	71d61e4c6f	Re-factor `getMainThreadWorkerMessageHandler` to support arbitrary global scopes, rather than only `window`	2019-12-08 20:19:04 +01:00
Jonas Jenwald	a8fc306b6e	Replace `globalScope` with the standard `globalThis` property instead Please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/globalThis and note that most (reasonably) modern browsers have supported this for a while now, see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/globalThis#Browser_compatibility Since ESLint doesn't support this new global yet, it was added to the `globals` list in the top-level configuration file to prevent issues. Finally, for older browsers a polyfill was added in `src/shared/compatibility.js`.	2019-12-08 20:19:02 +01:00
Jonas Jenwald	a02122e984	Ensure that `PDFDocument.checkFirstPage` waits for cleanup to complete (PR 10392 follow-up) Given how this method is currently used there shouldn't be any fonts loaded at the point in time where it's called, but it does seem like a bad idea to assume that that's always going to be the case. Since `PDFDocument.checkFirstPage` is already asynchronous, it's easy enough to simply await `Catalog.cleanup` here. (The patch also makes a tiny simplification in a loop in `Catalog.cleanup`.)	2019-12-07 12:31:41 +01:00
Jonas Jenwald	5c0336872e	Handle corrupt ASCII85Decode inline images with truncated EOD markers (issue 11385) In the PDF document in question, there's an ASCII85Decode inline image where the '>' part of EOD (end-of-data) marker is missing; hence the PDF document is corrupt.	2019-12-05 15:53:18 +01:00
Jonas Jenwald	c3b1c8f857	Slightly simplify the XRef cache lookup in `XRef.fetch` Note that the XRef cache will only hold objects returned through `Parser.getObj`, and indirectly via `Lexer.getObj`. Since neither of those methods will ever return `undefined`, we can simply `assert` that when inserting objects into the cache and thus get rid of one function call when doing cache lookups. Obviously this won't have a huge effect on performance, however `XRef.fetch` is usually called a lot in larger documents and this patch thus cannot hurt.	2019-11-30 22:41:53 +01:00
Jonas Jenwald	168c6aecae	Stop caching Streams in `XRef.fetchCompressed` I'm slightly surprised that this hasn't actually caused any (known) bugs, but that may be more luck than anything else since it fortunately doesn't seem common for Streams to be defined inside of an 'ObjStm'.[1] Note that in the `XRef.fetchUncompressed` method we're not caching Streams, and that for very good reasons too. - Streams, especially the `DecodeStream` ones, can become very large once read. Hence caching them really isn't a good idea simply because of the (potential) memory impact of doing so. - Attempting to read from the same Stream more than once won't work, unless it's `reset` in between, since using any method such as e.g. `getBytes` always starts at the current data position. - Given that even the `src/core/` code is now fairly asynchronous, see e.g. the `PartialEvaluator`, it's generally impossible to assert that any one Stream isn't being accessed "concurrently" by e.g. different `getOperatorList` calls. Hence `reset`-ing a cached Streams isn't going to work in the general case. All in all, I cannot understand why it'd ever be correct to cache Streams in the `XRef.fetchCompressed` method. --- [1] One example where that happens is the `issue3115r.pdf` file in the test-suite, where the streams in question are not actually used for anything within the PDF.js code.	2019-11-30 10:21:08 +01:00
Jonas Jenwald	06412a557b	Slightly re-factor `XRef.fetchCompressed` - Change all occurrences of `var` to `let`/`const`. - Initialize the (temporary) Arrays with the correct sizes upfront. - Inline the `isCmd` check. Obviously this won't make a huge difference, but given that the check is only relevant for corrupt documents it cannot hurt.	2019-11-30 09:49:51 +01:00
Jonas Jenwald	725566cfea	Remove the `Number.isInteger` checks from `XRef.fetchUncompressed` (PR 8857 follow-up) Having run the entire test-suite locally with these `Number.isInteger` checks removed, there wasn't a single test failure anywhere; see also PR 8857. Hence everything points to this being completely unnecessary now, and by removing this code there are thus fewer function calls being made in `XRef.fetchUncompressed`.	2019-11-28 23:25:39 +01:00
Jonas Jenwald	cc76132c24	Remove outdated, and misleading, JSDoc comment from the `PDFDocument` class The contents of this comment haven't been correct for years, ever since the library was properly split into main/worker-threads, so it's probably high time for this to be updated.	2019-11-25 11:36:29 +01:00
Jonas Jenwald	a965662184	Enable the `getter-return`, `no-dupe-else-if`, and `no-setter-return` ESLint rules All of these rules can help catch errors during development. Please note that only `getter-return` required a few changes, which was limited to disabling the rule in a couple of spots; please find additional details about these rules at: - https://eslint.org/docs/rules/getter-return - https://eslint.org/docs/rules/no-dupe-else-if - https://eslint.org/docs/rules/no-setter-return	2019-11-23 11:40:30 +01:00
Tim van der Meij	be02e67972	Merge pull request #11335 from Snuffleupagus/issue-11330 Subtract `stream.start` when getting the `startXRef` property for documents with a Linearization dictionary (issue 11330)	2019-11-16 13:56:01 +01:00
Jonas Jenwald	9199b02a42	Subtract `stream.start` when getting the `startXRef` property for documents with a Linearization dictionary (issue 11330) For documents with a Linearization dictionary the computed `startXRef` position will be relative to the raw file, rather than the actual PDF document itself (which begins with `%PDF-`). Hence it's necessary to subtract `stream.start` in this case, since otherwise the `XRef.readXRef` method will increment the position too far resulting in parsing errors.	2019-11-16 09:29:10 +01:00
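
Commit dbb82f05fc above describes searching the raw bytes of a document, forwards or backwards, instead of first building temporary `1 kB` strings. The following is only a rough sketch of that idea and not the actual `find` helper from `src/core/document.js`: the name `findSignature`, its parameters, and the sample data are all hypothetical, chosen purely for illustration.

```javascript
// Hypothetical sketch; NOT the real `find` helper in src/core/document.js.
// Looks for the ASCII `signature` inside `bytes`, inspecting at most `limit`
// bytes, either forwards from the start or backwards from the end.
function findSignature(bytes, signature, limit = 1024, backwards = false) {
  const sigLength = signature.length;
  const scanLength = Math.min(limit, bytes.length);
  if (sigLength === 0 || sigLength > scanLength) {
    return -1;
  }
  // Convert the (ASCII) signature to bytes once, up front.
  const sig = Uint8Array.from(signature, ch => ch.charCodeAt(0));

  // Compare the signature against the bytes starting at `pos`.
  const matchesAt = pos => {
    for (let j = 0; j < sigLength; j++) {
      if (bytes[pos + j] !== sig[j]) {
        return false;
      }
    }
    return true;
  };

  if (backwards) {
    // Only the last `scanLength` bytes are considered.
    const lowest = bytes.length - scanLength;
    for (let pos = bytes.length - sigLength; pos >= lowest; pos--) {
      if (matchesAt(pos)) {
        return pos;
      }
    }
  } else {
    // Only the first `scanLength` bytes are considered.
    const highest = scanLength - sigLength;
    for (let pos = 0; pos <= highest; pos++) {
      if (matchesAt(pos)) {
        return pos;
      }
    }
  }
  return -1;
}

// Made-up sample data, loosely shaped like the start and end of a PDF file.
const data = new TextEncoder().encode(
  "%PDF-1.7\n...\nstartxref\n116\n%%EOF\n"
);
console.log(findSignature(data, "%PDF-"));
console.log(findSignature(data, "startxref", 1024, true));
```

Working on the `Uint8Array` directly means no intermediate string is allocated; the price is the duplicated loop for the backwards case, which is the "duplication" the commit message mentions.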


5884 Commits