3542 Commits

Author SHA1 Message Date
Jonas Jenwald
bf6d45f85a Convert CMap and IdentityCMap to ES6 classes
Also changes `var` to `let`/`const` in code already touched in the patch.
2018-07-09 21:12:01 +02:00
Jonas Jenwald
b773b356af Convert NameOrNumberTree, NameTree, and NumberTree to ES6 classes
Also changes `var` to `let`/`const` in code already touched in the patch.
2018-07-09 21:12:01 +02:00
Jonas Jenwald
ba1af46709 Convert CompiledFont, TrueTypeCompiled, and Type2Compiled to ES6 classes
Also changes `var` to `let`/`const` in code already touched in the patch.
2018-07-09 21:12:01 +02:00
Jonas Jenwald
775763a091 Ensure that CompiledFont.compileGlyph always returns an Array (PR 6141 follow-up)
PR 6141 changed `CompiledFont.compileGlyph` to, in the general case, return an Array. However, that PR apparenly forgot to update the no-glyph, empty-glyph, and endchar-glyph code-path and a String was still being (incorrectly) returned.

Given the way that `FontFaceObject.getPathGenerator` (on the API side) is implemented, this shouldn't have caused any bugs despite the Worker possible returning unexpected data.
2018-07-09 21:12:01 +02:00
Tim van der Meij
646d81cd09
Merge pull request #9837 from timvandermeij/unreachable
Replace `NotImplementedException` with `unreachable`
2018-07-09 21:10:36 +02:00
Tim van der Meij
907c7f190b
Convert src/code/pdf_manager.js to ES6 classes/syntax 2018-07-08 16:43:46 +02:00
Jonas Jenwald
a9ce4e8417 Stop exposing the URL polyfill in the global scope
This moves/exposes the `URL` polyfill similarily to the existing `ReadableStream` polyfill, rather than exposing it globally, to avoid interfering with any "outside" code.
Both the `URL` and `ReadableStream` polyfills are now exposed on the `pdfjsLib` object, such that they are accessible to the viewer components.
Furthermore, the `no-restricted-globals` ESLint rule is also enabled to prevent accidental usage of the native `URL`/`ReadableStream` implementations directly in the `src/` and `web/` folders; see also https://eslint.org/docs/rules/no-restricted-globals

Addresses the remaining TODO in https://github.com/mozilla/pdf.js/projects/6
2018-07-04 09:16:28 +02:00
Tim van der Meij
99f8f2c275
Merge pull request #9853 from Snuffleupagus/re-render-after-cancel
Fix re-rendering, using the same canvas, when rendering was previously cancelled (PR 8519 follow-up)
2018-06-29 23:25:43 +02:00
Tim van der Meij
6fa2c779b5
Merge pull request #9838 from Snuffleupagus/invalid-path-OPS
Error, rather than warn, once a number of invalid path operators are encountered in `EvaluatorPreprocessor.read` (bug 1443140)
2018-06-28 23:15:25 +02:00
Jonas Jenwald
bf0aca86d7 Fix re-rendering, using the same canvas, when rendering was previously cancelled (PR 8519 follow-up)
Currently if `RenderTask.cancel` is called *immediately* after rendering was started, then by the time that `InternalRenderTask.initializeGraphics` is called rendering will already have been cancelled.
However, we're still inserting the canvas into the `canvasInRendering` map, thus breaking any future attempts at re-rendering using the same canvas. Considering that `InternalRenderTask.cancel` always removes the canvas from the map, I cannot imagine that we'd ever want to re-add it *after* rendering was cancelled (it was likely just a simple oversight in PR 8519).

Fixes 9456.
2018-06-28 22:56:37 +02:00
Tim van der Meij
14b69a4c1c
Merge pull request #9729 from Snuffleupagus/gulp-image_decoders
Add a `gulp image_decoders` command to package the image decoders (i.e. jpg.js, jpx.js, jbig2.js) separately, and publish them in pdfjs-dist
2018-06-26 23:27:32 +02:00
Jonas Jenwald
74e9999044 Add unit-tests for PDFPageProxy.stats (PR 9245 follow-up)
This wasn't included in PR 9245, since all the API options were still global at that time.

Writing the unit-tests also uncovered an issue with `getOperatorList` not starting the "Page Request" timer.
2018-06-25 14:20:49 +02:00
Jonas Jenwald
7f21e38787 Error, rather than warn, once a number of invalid path operators are encountered in EvaluatorPreprocessor.read (bug 1443140)
Incomplete path operators, in particular, can result in fairly chaotic rendering artifacts, as can be observed on page four of the referenced PDF file.

The initial (naive) solution that was attempted, was to simply throw a `FormatError` as soon as any invalid (i.e. too short) operator was found and rely on the existing `ignoreErrors` code-paths. However, doing so would have caused regressions in some files; see the existing `issue2391-1` test-case, which was promoted to an `eq` test to help prevent future bugs.
Hence this patch, which adds special handling for invalid path operators since those may cause quite bad rendering artifacts.

You could, in all fairness, argue that the patch is a handwavy solution and I wouldn't object. However, given that this only concerns *corrupt* PDF files, the way that PDF viewers (PDF.js included) try to gracefully deal with those could probably be described as a best-effort solution anyway.

This patch also adjusts the existing `warn`/`info` messages to print the command name according to the PDF specification, rather than an internal PDF.js enumeration value. The former should be much more useful for debugging purposes.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1443140.
2018-06-24 16:05:08 +02:00
Tim van der Meij
2907827d31
Replace NotImplementedException with unreachable 2018-06-23 21:20:53 +02:00
Jonas Jenwald
275834ae66 Clean-up, and add JSDocs to, the PDFDocumentProxy.loadingParams method (PR 9830 follow-up) 2018-06-23 13:33:22 +02:00
Tim van der Meij
34594a5b02
Merge pull request #9830 from EugeneSqr/9824
Removed safari compatibility check (issue #9824)
2018-06-23 02:21:06 +02:00
Tim van der Meij
98ea39f9d0
Merge pull request #9827 from Snuffleupagus/misc-corrupt-pdf-fixes
Fix various corrupt PDF files (issue 9252, issue 9418)
2018-06-21 22:35:00 +02:00
eugenesqr
331ac8ae74 removed safari compatibility check 2018-06-21 12:57:56 +03:00
Brendan Dahl
a278c5a8dc
Merge pull request #9795 from timvandermeij/object-assign
Replace `Util.extendObj` by `Object.assign`
2018-06-20 10:50:40 -07:00
Jonas Jenwald
56e3648b65 Add basic validation of the 'trailer' dictionary candidates in XRef.indexObjects (issue 9418)
This patch avoids choosing a (possible) 'trailer' dictionary that `XRef.parse` and/or the `Catalog` constructor/methods will reject anyway.
Since `XRef.indexObjects` is already parsing the entire PDF file, the extra dictionary look-ups added here shouldn't matter much. Besides, this is a fallback code-path that only applies to corrupt PDF files anyway.
2018-06-20 13:41:22 +02:00
Jonas Jenwald
346810e02a Add basic validation of the 'Root' dictionary in XRef.parse and try to recover when possible
Note that the `Catalog` constructor, and some of its methods, are already enforcing that the 'Root' dictionary is valid/well-formed. However, by doing additional validation already in `XRef.parse` there's a slightly larger chance that corrupt PDF files could be successfully parsed/rendered.
2018-06-20 13:41:22 +02:00
Jonas Jenwald
e84813e7cc Prevent hard errors if fetching the Encrypt dictionary fails in XRef.parse 2018-06-20 13:41:22 +02:00
Jonas Jenwald
30ad62a86a Use the correct startPos when repeating the search for 'endobj' operators in XRef.indexObjects (PR 9288 follow-up) 2018-06-20 13:41:22 +02:00
Jonas Jenwald
6bbcafcd26 Let Lexer.getNumber treat a single decimal point as zero (issue 9252)
This is consistent with the behaviour in Adobe Reader.
2018-06-20 13:41:21 +02:00
Jonas Jenwald
df4799a12a Ensure that line-breaks are *only* skipped after operators in Lexer.getNumber (PR 8359 follow-up)
With the current code line-breaks are accepted not just after an operator, but after a decimal point as well. When looking at this again, the latter case seems prone to cause false positives and might also interfere with subsequent patches.

Hence this is code is adjusted to actually do what the original commit message says, and nothing more.
2018-06-20 13:41:15 +02:00
Jonas Jenwald
303537bcb1 Add a gulp image_decoders command to allow packaging/distributing the image decoders (i.e. jpg.js, jpx.js, jbig2.js) separately from the main PDF.js library
Please note that the standalone `pdf.image_decoders.js` file will be including the complete `src/shared/util.js` file, despite only using parts of it.[1] This was done *purposely*, to not negatively impact the readability/maintainability of the core PDF.js code.

Furthermore, to ensure that the compatibility is the same in the regular PDF.js library *and* in the the standalone image decoders, `src/shared/compatibility.js` was included as well.

To (hopefully) prevent future complaints about the size of the built `pdf.image_decoders.js` file, a few existing async-related polyfills are being skipped (since all of the image decoders are completely synchronous).
Obviously this required adding a couple of pre-processor statements, but given that these are all limited to "compatibility" code, I think this might be OK!?

---
[1] However, please note that previous commits moved `PageViewport` and `MessageHandler` out of `src/shared/util.js` which reduced its size.
2018-06-16 17:56:54 +02:00
Jonas Jenwald
bfc88ead66 Expose a Jbig2Image.parse method, by re-instating the parseJbig2 function
The purpose of this patch is to hopefully provide *slightly* better user ergonomics, if/when the PDF.js image decoders are used standalone.

This implementation is (basically) reverting the changes in PR 9386, in conjunction with code from the `parse` method found at https://github.com/notmasteryet/jpgjs/blob/master/src/pdfjs.js
2018-06-16 17:56:54 +02:00
Jonas Jenwald
682672db8e Change the signature of the JpegImage constructor, to allow passing in various options directly 2018-06-16 17:56:54 +02:00
Tim van der Meij
620da6f4df
Merge pull request #9802 from Snuffleupagus/ColorSpace-PDFImage-Uint8ClampedArray
Update `ColorSpace` and `PDFImage` to use `Uint8ClampedArray`s and remove manual clamping/rounding
2018-06-16 17:55:10 +02:00
Jonas Jenwald
0958006713 Send UnsupportedFeature notification when errors are ignored in FontFaceObject.getPathGenerator 2018-06-13 11:02:10 +02:00
Jonas Jenwald
bf0db0fb72 Pass the ignoreErrors API option to the FontFaceObject constructor, and utilize it in getPathGenerator to ignore missing glyphs
Obviously it's still not possible to render non-embedded fonts as paths, but in this way the rest of the page will at least be allowed to continue rendering.

*Please note:* Including the 14 standard fonts in PDF.js probably wouldn't be *that* difficult to implement. (I'm not a lawyer, but the fonts from PDFium could probably be used given their BSD license.)
However, the main blocker ought to be the total size of the necessary font data, since I cannot imagine people being OK with shipping ~5 MB of (additional) font data with Firefox. (Based on the reactions when the CMap files were added, and those are only ~1 MB in size.)
2018-06-13 11:02:06 +02:00
Jonas Jenwald
fe288bb872 Refactor the FontFaceObject.getPathGenerator method
- Reduce the overall indentation level, by making use of early returns.

 - Replace `var` with `let`.
2018-06-13 11:02:02 +02:00
Jonas Jenwald
778981ec89 Catch, and propagate, errors in the requestAnimationFrame branch of InternalRenderTask._scheduleNext
To support these changes, `InternalRenderTask._next` now returns a Promise.
2018-06-13 11:01:58 +02:00
Jonas Jenwald
d4ff541b78 Enforce the use, in non-production/test-only mode, of Uint8ClampedArray in all relevant methods in ColorSpace and PDFImage
Since `ColorSpace` now depends on the native clamping of `Uint8ClampedArray`, this patch adds non-production/test-only `assert`s to enforce that the expected TypedArray is used for the output.

These `assert`s are purposely *not* included in PRODUCTION builds since that would break rendering completely, as opposed to "only" displaying some weird colours, when a `Uint8Array` was used. Furthermore, these are mostly added to help catch explicit developer errors when working with the `ColorSpace` and `PDFImage` code.
2018-06-12 11:01:32 +02:00
Jonas Jenwald
4b69bb7fe9 Add a TESTING build option, to enable using non-production/test-only code-paths
Since the tests (currently) run with the `pdf.worker.js` file built, i.e. with `PRODUCTION = true` set, there's no simple way to add e.g. `assert` calls for both non-production *and* test-only builds without also affecting PRODUCTION builds.
2018-06-12 11:01:32 +02:00
Jonas Jenwald
f01e54eae1 Improve the warning messages printed by `PartialEvaluator.{getOperatorList, getTextContent} when errors are being ignored
Currently the actual errors aren't printed, which can make debugging harder than necessary.
2018-06-12 11:01:32 +02:00
Jonas Jenwald
731f2e6dfc Remove manual clamping/rounding from ColorSpace and PDFImage, by having their methods use Uint8ClampedArrays
The built-in image decoders are already using `Uint8ClampedArray` when returning data, and this patch simply extends that to the rest of the image/colorspace code.

As far as I can tell, the only reason for using manual clamping/rounding in the first place was because TypedArrays used to be polyfilled (using regular arrays). And trying to polyfill the native clamping/rounding would probably have been had too much overhead, but given that TypedArray support is required in PDF.js version `2.0` that's no longer a concern.

*Please note:* Because of different rounding behaviour, basically `Math.round` in `Uint8ClampedArray` respectively `Math.floor` in the old code, there will be very slight movement in quite a few existing test-cases. However, the changes should be imperceivable to the naked eye, given that the absolute difference is *at most* `1` for each RGB component when comparing `master` and this patch (see also the updated expectation values in the unit-tests).
2018-06-12 11:01:32 +02:00
Jonas Jenwald
55199aa281 Remove the unused bpc parameter from, and update the signature of, the resizeRgbImage function in src/core/colorspace.js 2018-06-12 11:01:32 +02:00
Jonas Jenwald
d1637056b3 Use shorthand method signatures in src/core/colorspace.js 2018-06-12 11:01:32 +02:00
Jonas Jenwald
32367c5968 Make the getBytes/peekBytes methods of Stream/DecodeStream/ChunkedStream able to return Uint8ClampedArrays
The built-in image decoders are already returning data as `Uint8ClampedArray`, and subsequently the JPEG/JBIG2/JPX streams are as well. However, for general streams we obviously don't want to force the use of `Uint8ClampedArray` unless an "Image" is actually being decoded.
Hence this patch, which adds a parameter that allows the caller of the `getBytes`/`peekBytes` methods to force a `Uint8ClampedArray` (rather than a `Uint8Array`) to be returned.
2018-06-12 11:01:32 +02:00
Brendan Dahl
3ac638fad3
Merge pull request #9689 from RafaPolit/master
Fixed critical unhandled promise that prevented error catching using API
2018-06-11 15:40:30 -06:00
Tim van der Meij
af8e88d00b
Replace Util.extendObj by Object.assign 2018-06-10 20:11:03 +02:00
Tim van der Meij
903bad1906
Remove Util.appendToArray and Util.prependToArray
The former may be replaced by regular JavaScript array concatenation and
the latter is unused. This avoids unnecessary function calls/imports.
2018-06-10 15:24:09 +02:00
Jonas Jenwald
07d610615c Move, and modernize, Util.loadScript from src/shared/util.js to src/display/dom_utils.js
Not only is the `Util.loadScript` helper function unused on the Worker side, even trying to use it there would throw an Error (since `document` isn't defined/available in Workers).
Hence this helper function is moved, and its code modernized slightly by having it return a Promise rather than needing a callback function.

Finally, to reduced code duplication, the "new" loadScript function is exported and used in the viewer.
2018-06-07 13:52:40 +02:00
Jonas Jenwald
547f119be6 Simplify the error handling slightly in the src/display/node_stream.js file
The various classes have `this._errored` and `this._reason` properties, where the first one is a boolean indicating if an error was encountered and the second one contains the actual `Error` (or `null` initially).

In practice this means that errors are basically tracked *twice*, rather than just once. This kind of double-bookkeeping is generally a bad idea, since it's quite easy for the properties to (accidentally) get into an inconsistent state whenever the relevant code is modified.

Rather than using a separate boolean, we can just as well check the "error" property directly (since `null` is falsy).

---

Somewhat unrelated to this patch, but `src/display/node_stream.js` is currently *not* handling errors in a consistent or even correct way; compared with `src/display/network.js` and `src/display/fetch_stream.js`.
Obviously using the `createResponseStatusError` utility function, from `src/display/network_utils.js`, might not make much sense in a Node.js environment. However at the *very* least it seems that `MissingPDFException`, or `UnknownErrorException` when one cannot tell that the PDF file is "missing", should be manually thrown.

As is, the API (i.e. `getDocument`) is not returning the *expected* errors when loading fails in Node.js environments (as evident from the `pending` API unit-test).
2018-06-06 09:05:45 +02:00
Jonas Jenwald
2fdaa3d54c Update the postMessageTransfers comment in createDocumentHandler in the src/core/worker.js file
Since the old comment mentions a now unsupported browser, let's update it such that someone won't accidentally conclude that the code in question can be removed.
2018-06-06 08:52:43 +02:00
Jonas Jenwald
871bf5c68b Remove the, now obsolete, handling of the CMapReaderFactory parameter in getDocument
This special handling was added in PR 8567, but was made redundant in PR 8721 which stopped sending everything but the kitchen sink to the Worker side.
2018-06-06 08:52:43 +02:00
Jonas Jenwald
c8e2163bbc Remove incorrect/unnecessary validation of the verbosity parameter in the PDFWorker constructor (PR 9480 follow-up) 2018-06-06 08:52:43 +02:00
Jonas Jenwald
b263b702e8 Rename PDFPageProxy.pageInfo to PDFPageProxy._pageInfo to indicate that the property should be considered "private"
Since `PDFPageProxy` already provide getters for all the data returned by `GetPage` (in the Worker), there isn't any compelling reason for accessing the `pageInfo` directly on `PDFPageProxy`.

The patch also changes the `GetPage` handler, in `src/core/worker.js`, to use modern JavaScript features.
2018-06-06 08:52:42 +02:00
Jonas Jenwald
4f4b50e01e Rename PDFDocumentProxy.pdfInfo to PDFDocumentProxy._pdfInfo to indicate that the property should be considered "private"
Since `PDFDocumentProxy` already provide getters for all the data returned by `GetDoc` (in the Worker), there isn't any compelling reason for accessing the `pdfInfo` directly on `PDFDocumentProxy`.
2018-06-06 08:52:42 +02:00