pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	4a76ab352c	Add polyfills to support iteration of `Map` and `Set` Without this, things such as `Metadata.getAll` are broken in IE11 (see PR 11596). https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Map#Browser_compatibility https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Set#Browser_compatibility	2020-02-14 15:53:02 +01:00
Tim van der Meij	cd3f2d49e6	Merge pull request #11596 from Snuffleupagus/metadata-map Re-factor how `Metadata` class instances store its data internally	2020-02-13 23:01:51 +01:00
Jonas Jenwald	5cdfff4a47	Re-factor how `Metadata` class instances store its data internally Please note that these changes do not affect the public interface of the `Metadata` class, but only touches internal structures.[1] These changes were prompted by looking at the `getAll` method, which simply returns the "private" metadata object to the consumer. This seems wrong conceptually, since it allows way too easy/accidental changes to the internal parsed metadata. As part of fixing this, the internal metadata was changed to use a `Map` rather than a plain Object. --- [1] Basically, we shouldn't need to worry about someone depending on internal implementation details.	2020-02-13 18:23:15 +01:00
Jonas Jenwald	3f1568b51a	A couple of small improvements of the `Metadata._repair` method - Remove the "capturing group" in the regular expression that removes leading "junk" from the raw metadata, since it's not necessary here (it's simply a case of too much copy-pasting in a prior patch). According to [MDN](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions/Cheatsheet#Groups_and_ranges) you want to, for performance reasons, avoid "capturing groups" unless actually needed. - Add inline comments to document a bunch of magic values in the code.	2020-02-13 17:20:52 +01:00
Jonas Jenwald	a5db4e985a	Remove `LoopbackPort.postMessage` special-case for polyfilled `TypedArray`s Given that all `TypedArray` polyfills were removed in PDF.js version `2.0`, since native support is now required, this branch has been dead code for a while.	2020-02-13 12:50:41 +01:00
Jonas Jenwald	7b0836ca75	[TextLayer] Immediately set the padding, rather than checking if it's empty, in `expandTextDivs` In practice it's extremely rare[1] for the padding to be zero in all components, hence it seems better to just set it directly rather than creating a temporary variable and checking for the "no padding"-case. --- [1] In the `tracemonkey.pdf` file that only happens with `0.08%` of all text elements.	2020-02-11 15:52:36 +01:00
Takashi Tamura	512dbe3060	Fix text spacing with vertical fonts. #7687 and #11526 . When the writing mode is vertical, we have to reverse the sign of spacing since we are subtracting it from current.y. We have to add it to current.y. See 9.4.4 Text Space Details, https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G8.1694762	2020-02-11 08:49:23 +09:00
Jonas Jenwald	ae5a34c520	[api-minor] Ensure that the `Array.prototype` doesn't contain any enumerable properties Over the years there's been a fair number of issues/PRs opened, where people have wanted to add `hasOwnProperty` checks in (hot) loops in the font parsing code. This has always been rejected, since we don't want to risk reducing performance in the Firefox PDF viewer simply because some users of the general PDF.js library are incorrectly extending the `Array.prototype` with enumerable properties. With this patch the general PDF.js library will now fail immediately with a hopefully useful Error message, rather than having (some) fonts fail to render, when the `Array.prototype` is incorrectly extended. Note that I did consider making this a warning, but ultimately decided against it since it's first of all possible to disable those (with the `verbosity` parameter). Secondly, even when printed, warnings can be easy to overlook and finally a warning may also seem OK to ignore (as opposed to an actual Error).	2020-02-10 14:17:27 +01:00
Tim van der Meij	dced0a3821	Merge pull request #11579 from Snuffleupagus/issue-11578 Ignore spaces when normalizing the font name in `Font.fallbackToSystemFont` (issue 11578)	2020-02-09 17:33:09 +01:00
Tim van der Meij	61056a9238	Merge pull request #11551 from Snuffleupagus/issue-11549 Allow skipping of errors when reading broken/corrupt ToUnicode data (issue 11549)	2020-02-09 17:32:35 +01:00
Tim van der Meij	2fb4076e05	Merge pull request #11568 from Snuffleupagus/PDF-header-validation Ensure that the PDF header contains an actual number (PR 11463 follow-up)	2020-02-09 17:16:25 +01:00
Tim van der Meij	102af0f915	Merge pull request #11547 from Snuffleupagus/convertCmykToRgb-scale Use fewer multiplications in `JpegImage._convertCmykToRgb`	2020-02-09 17:06:23 +01:00
Tim van der Meij	f178805412	Merge pull request #11557 from Snuffleupagus/_getLinearizedBlockData-xScaleBlockOffset Avoid re-calculating the `xScaleBlockOffset` when not necessary in `JpegImage._getLinearizedBlockData`	2020-02-09 16:54:28 +01:00
Tim van der Meij	7948faf675	Merge pull request #11573 from Snuffleupagus/api-cleanup-returns [api-minor] Change `PDFDocumentProxy.cleanup`/`PDFPageProxy.cleanup` to return data	2020-02-08 20:42:28 +01:00
Tim van der Meij	a73a38029c	Merge pull request #11569 from Snuffleupagus/rm-most-setAttribute Replace most remaining `Element.setAttribute("style", ...)` usage with `Element.style = ...` instead	2020-02-08 20:13:56 +01:00
Jonas Jenwald	7937165537	Ignore spaces when normalizing the font name in `Font.fallbackToSystemFont` (issue 11578)	2020-02-08 19:59:04 +01:00
Jonas Jenwald	7117ee03d6	[api-minor] Change `PDFDocumentProxy.cleanup`/`PDFPageProxy.cleanup` to return data This patch makes the following changes, to improve these API methods: - Let `PDFPageProxy.cleanup` return a boolean indicating if clean-up actually happened, since ongoing rendering will block clean-up. Besides being used in other parts of this patch, it seems that an API user may also be interested in the return value given that clean-up isn't guaranteed to happen. - Let `PDFDocumentProxy.cleanup` return the promise indicating when clean-up is finished. - Improve the JSDoc comment for `PDFDocumentProxy.cleanup` to mention that clean-up is triggered on both threads (without going into unnecessary specifics regarding what exactly said data actually is). Add a note in the JSDoc comment about not calling this method when rendering is ongoing. - Change `WorkerTransport.startCleanup` to throw an `Error` if it's called when rendering is ongoing, to prevent rendering from breaking. Please note that this won't stop worker-thread clean-up from happening (since there's no general "something is rendering"-flag), however I'm not sure if that's really a problem; but please don't quote me on that :-) All of the caches that are being cleared in `Catalog.cleanup`, on the worker-thread, should be re-filled automatically even if cleared during parsing/rendering, and the only thing that probably happens is that e.g. font data would have to be re-parsed. On the main-thread, on the other hand, clearing the caches is more-or-less guaranteed to cause rendering errors, since the rendering code in `src/display/canvas.js` isn't able to re-request any image/font data that's suddenly being pulled out from under it. - Last, but not least, add a couple of basic unit-tests for the clean-up functionality.	2020-02-07 17:00:29 +01:00
Jonas Jenwald	88c35d872f	Ensure that the PDF header contains an actual number (PR 11463 follow-up) While it would be nice to change the `PDFFormatVersion` property, as returned through `PDFDocumentProxy.getMetadata`, to a number (rather than a string) that would unfortunately be a breaking API change. However, it does seem like a good idea to at least validate the PDF header version on the worker-thread, rather than potentially returning an arbitrary string.	2020-02-07 12:25:07 +01:00
Tim van der Meij	e12e83702d	Merge pull request #11559 from bhasto/curveto2-fix Fix how curveTo2 (v operator) is translated to SVG	2020-02-06 23:10:41 +01:00
Brendan Dahl	09a6e17d22	Merge pull request #11528 from janpe2/type1-nonemb-notdef Hide .notdef glyphs in non-embedded Type1 fonts and don't ignore Widths	2020-02-06 13:30:07 -08:00
Jonas Jenwald	5cbd44b628	Replace most remaining `Element.setAttribute("style", ...)` usage with `Element.style = ...` instead This should hopefully be useful in environments where restrictive CSPs are in effect. In most cases the replacement is entirely straightforward, and there's only a couple of special cases: - For the `src/display/font_loader.js` and `web/pdf_outline_viewer.js` cases, since the elements aren't appended to the document yet, it shouldn't matter if the style properties are set one-by-one rather than all at once. - For the `web/debugger.js` case, there's really no need to set the `padding` inline at all and the definition was simply moved to `web/viewer.css` instead. Please note: There's still a single case left, in `web/toolbar.js` for setting the width of the zoom dropdown, which is left intact for now. The reasons are that this particular case shouldn't matter for users of the general PDF.js library, and that it'd make a lot more sense to just try and re-factor that very old code anyway (thus fixing the `setAttribute` usage in the process).	2020-02-05 22:26:47 +01:00
Branislav Hašto	393aed9978	Fix how curveTo2 (v operator) is translated to SVG Based on the PDF spec, with the `v` operator, the current point should be used as the first control point of the curve. Do not overwrite the current point before an SVG curve is built, so it can actually be used as the first control point.	2020-02-02 17:03:29 +01:00
Jonas Jenwald	a4440a1c6b	Avoid re-calculating the `xScaleBlockOffset` when not necessary in `JpegImage._getLinearizedBlockData` As can be seen in the code, the `xScaleBlockOffset` typed array doesn't depend on the actual image data but only on the width and x-scale. The width is obviously consistent for an image, and it turns out that in practice the `componentScaleX` is quite often identical between two (or more) adjacent image components. All-in-all it's thus not necessary to unconditionally re-compute the `xScaleBlockOffset` when getting the JPEG image data. While avoiding, in many cases, one or more loops can never be a bad thing, these changes are unfortunately completely dominated by the rest of the JpegImage code and consequently don't really show up in benchmark results. Hence I'd understand if this patch is ultimately deemed not necessary.	2020-02-01 11:58:50 +01:00
Jonas Jenwald	4c54395ff6	Allow skipping of errors when reading broken/corrupt ToUnicode data (issue 11549) This will allow font loading/parsing to continue, rather than immediately failing, when broken/corrupt CMap data is encountered.	2020-01-30 13:19:05 +01:00
Jonas Jenwald	ce4f41d06a	Use fewer multiplications in `JpegImage._convertCmykToRgb` Note: This is inspired by PR 5473, which made similar changes for another kind of JPEG data. Since the implementation in `src/core/jpg.js` only supports 8-bit data, as opposed to similar code in `src/core/colorspace.js`, the computations can be further simplified since the `scale` is always constant. By updating the coefficients, effectively inlining the `scale`, we'll thus avoid four multiplications for each loop iteration. Unfortunately I wasn't able, based on a quick look through the test-files, to find a sufficiently large CMYK JPEG image in order for these changes to really show up in benchmark results. However, when testing the `cmykjpeg.pdf` manually there's a total of `120 000` fewer multiplications with this patch.	2020-01-29 18:34:58 +01:00
Takashi Tamura	0b701e7950	Fix the indices of arguments for RadialAxial. It is related to #10646 .	2020-01-29 19:18:50 +09:00
Tim van der Meij	7ae504222f	Merge pull request #11544 from Snuffleupagus/decodeHuffman Make the `decodeHuffman` function, in `src/core/jpg.js`, slightly more efficient	2020-01-28 22:54:46 +01:00
Tim van der Meij	e9dc179673	Merge pull request #11537 from Snuffleupagus/setupFakeWorker-configure Send the `verbosity` level when setting up fake workers (issue 11536)	2020-01-28 22:50:30 +01:00
Jonas Jenwald	f5a617a334	Make the `decodeHuffman` function, in `src/core/jpg.js`, slightly more efficient Rather than repeating the `typeof node` check twice, we can use a `switch` statement instead. This patch was tested using the PDF file from issue 3809, i.e. https://web.archive.org/web/20140801150504/http://vs.twonky.dk/invitation.pdf, with the following manifest file: ``` [ { "id": "issue3809", "file": "../web/pdfs/issue3809.pdf", "md5": "", "rounds": 50, "type": "eq" } ] ``` which gave the following results when comparing this patch against the `master` branch: ``` -- Grouped By browser, stat -- browser \| stat \| Count \| Baseline(ms) \| Current(ms) \| +/- \| % \| Result(P<.05) ------- \| ------------ \| ----- \| ------------ \| ----------- \| --- \| ----- \| ------------- Firefox \| Overall \| 50 \| 12537 \| 12451 \| -86 \| -0.69 \| faster Firefox \| Page Request \| 50 \| 5 \| 5 \| 0 \| 0.77 \| Firefox \| Rendering \| 50 \| 12532 \| 12446 \| -86 \| -0.69 \| faster ```	2020-01-28 14:23:58 +01:00
Tim van der Meij	474fe1757e	Merge pull request #11508 from Snuffleupagus/jpg-default-marker Simplify the handling of unsupported/incorrect markers in `src/core/jpg.js`	2020-01-26 21:32:13 +01:00
Jonas Jenwald	62b2b984cc	Render Popup annotations last, once all other annotations have been rendered (issue 11362) In the current `AnnotationLayer` implementation, Popup annotations require that the parent annotation have already been rendered (otherwise they're simply ignored). Usually the annotations are ordered, in the `/Annots` array, in such a way that this isn't a problem, however there's obviously no guarantee that all PDF generators actually do so. Hence we simply ensure, when rendering the `AnnotationLayer`, that the Popup annotations are handled last.	2020-01-26 15:49:55 +01:00
Jonas Jenwald	427df2dfd7	Send the `verbosity` level when setting up fake workers (issue 11536) Interestingly the viewer already seems to work correctly as-is, with workers disabled and a non-standard `verbosity` level. Hence this is possibly Node.js specific, but given that the issue is lacking both the PDF file in question and a runnable test-case, this patch is essentially a best-effort guess at what the problem could be.	2020-01-26 12:37:45 +01:00
Jonas Jenwald	13930e5202	Simplify the handling of unsupported/incorrect markers in `src/core/jpg.js` - Re-factor the "incorrect encoding" check, since this can be easily achieved using the general `findNextFileMarker` helper function (with a suitable `startPos` argument). - Tweak a condition, to make it easier to see that the end of the data has been reached. - Add a reference test for issue 1877, since it's what prompted the "incorrect encoding" check.	2020-01-25 22:52:24 +01:00
Tim van der Meij	3775b711ed	Merge pull request #11482 from Snuffleupagus/more-core-utils Convert `src/core/jpg.js` to use the `readUint16` helper function in `src/core/core_utils.js`, rather than re-implementing it twice	2020-01-25 21:38:34 +01:00
Tim van der Meij	cbbda9d883	Merge pull request #11515 from Snuffleupagus/cache-fallback-font Cache the fallback font dictionary on the `PartialEvaluator` (PR 11218 follow-up)	2020-01-25 21:32:28 +01:00
Jonas Jenwald	188b320e18	Convert `src/core/jpg.js` to use the `readUint16` helper function in `src/core/core_utils.js`, rather than re-implementing it twice The other image decoders, i.e. the JBIG2 and JPEG 2000 ones, are using the common helper function `readUint16`. Most likely, the only reason that the JPEG decoder is doing it this way is because it originated outside of the PDF.js library. Hence we can simply re-factor `src/core/jpg.js` to use the common `readUint16` helper function, which is especially nice given that the functionality was essentially duplicated in the code.	2020-01-25 00:35:10 +01:00
Jonas Jenwald	3f031f69c2	Move additional worker-thread only functions from `src/shared/util.js` and into a `src/core/core_utils.js` instead This moves the `log2`, `readInt8`, `readUint16`, `readUint32`, and `isSpace` functions since they are only used in the worker-thread.	2020-01-25 00:33:52 +01:00
Jonas Jenwald	83bdb525a4	Fix remaining linting errors, from enabling the `prefer-const` ESLint rule globally This covers cases that the `--fix` command couldn't deal with, and in a few cases (notably `src/core/jbig2.js`) the code was changed to use block-scoped variables instead.	2020-01-25 00:20:23 +01:00
Jonas Jenwald	9e262ae7fa	Enable the ESLint `prefer-const` rule globally (PR 11450 follow-up) Please find additional details about the ESLint rule at https://eslint.org/docs/rules/prefer-const With the recent introduction of Prettier this sort of mass enabling of ESLint rules becomes a lot easier, since the code will be automatically reformatted as necessary to account for e.g. changed line lengths. Note that this patch is generated automatically, by using the ESLint `--fix` argument, and will thus require some additional clean-up (which is done separately).	2020-01-25 00:20:22 +01:00
Tim van der Meij	d2d9441373	Merge pull request #11489 from Snuffleupagus/rm-FIREFOX-define Remove the `FIREFOX` build flag, since it's completely unused and simplify a couple of `PDFJSDev` checks	2020-01-24 23:59:13 +01:00
Tim van der Meij	668a29aa45	Merge pull request #11497 from Snuffleupagus/Promise-allSettled Add support for `Promise.allSettled`	2020-01-22 23:06:54 +01:00
Tim van der Meij	a88dec197f	Merge pull request #11511 from Snuffleupagus/eslint-no-nested-ternary Enable the `no-nested-ternary` ESLint rule (PR 11488 follow-up)	2020-01-22 22:52:59 +01:00
Jonas Jenwald	3b78f4e8f8	Fix a couple of cases where Prettier broke existing formatting (PR 11446 follow-up) These two cases should have been whitelisted prior to re-formatting respectively had the comments fixed afterwards, however I unfortunately missed them because of the massive size of the diff.	2020-01-22 09:12:12 +01:00
Jani Pehkonen	809b96b40c	Hide .notdef glyphs in non-embedded Type1 fonts and don't ignore Widths Fixes #11403 The PDF uses the non-embedded Type1 font Helvetica. Character codes 194 and 160 (`Â` and `NBSP`) are encoded as `.notdef`. We shouldn't show those glyphs because it seems that Acrobat Reader doesn't draw glyphs that are named `.notdef` in fonts like this. In addition to testing `glyphName === ".notdef"`, we must also test `glyphName === ""` because the name `""` is used in `core/encodings.js` for undefined glyphs in encodings like `WinAnsiEncoding`. The solution above hides the `Â` characters but now the replacement character (space) appears to be too wide. I found out that PDF.js ignores a font's `Widths` array if the font has no `FontDescriptor` entry. That happens in #11403, so the default widths of Helvetica were used as specified in `core/metrics.js` and `.notdef` got a width of 333. The correct width is 0 as specified by the `Widths` array in the PDF. Thus we must never ignore `Widths`.	2020-01-21 21:35:25 +02:00
Jonas Jenwald	a39943554a	Simplify, and tweak, a couple of `PDFJSDev` checks This removes a couple of, thanks to preceding code, unnecessary `typeof PDFJSDev` checks, and also fixes a couple of incorrectly implemented (my fault) checks intended for `TESTING` builds.	2020-01-21 00:06:15 +01:00
Jonas Jenwald	7322a24ce4	Remove the `FIREFOX` build flag, since it's completely unused After PR 9566, which removed all of the old Firefox extension code, the `FIREFOX` build flag is no longer used for anything. It thus seems to me that it should be removed, for a couple of reasons: - It's simply dead code now, which only serves to add confusion when looking at the `PDFJSDev` calls. - It used to be that `MOZCENTRAL` and `FIREFOX` were almost always used together. However, ever since PR 9566 there's obviously been no effort put into keeping the `FIREFOX` build flags up to date. - In the event that a new, Webextension based, Firefox addon is created in the future you'd still need to audit all `MOZCENTRAL` (and possibly `CHROME`) build flags to see what'd make sense for the addon.	2020-01-21 00:06:15 +01:00
Tim van der Meij	ccf327538b	Merge pull request #11519 from tamuratak/enable_eslint_import_extensions Enable import/extensions of ESLint plugin to enforce that all `import` statements have a `.js` file extension.	2020-01-19 17:37:19 +01:00
Jonas Jenwald	ee87e898db	Update the `GlobalWorkerOptions.workerSrc` JSDoc comment This particular JSDoc comment is fairly old and it also contains some now unrelated/confusing information. The only way to guarantee that the PDF.js library works as expected is to correctly set the global `workerSrc`[1], hence giving the impression that the option isn't strictly necessary is thus incorrect. --- [1] Since advertising the fallbackWorkerSrc functionality definitely seems like the wrong thing to do.	2020-01-19 12:44:42 +01:00
Takashi Tamura	00ce7898a2	Enable import/extensions of ESLint plugin to enforce that all `import` statements have a `.js` file extension. Related to #11465. - https://github.com/benmosher/eslint-plugin-import/blob/master/docs/rules/extensions.md	2020-01-18 10:53:01 +09:00
Jonas Jenwald	9ab7c280aa	Cache the fallback font dictionary on the `PartialEvaluator` (PR 11218 follow-up) This way we'll benefit from the existing font caching, and can thus avoid re-creating a fallback font over and over again during parsing. (These changes necessitated the previous patch, since otherwise breakage could occur e.g. with fake workers.)	2020-01-16 15:12:05 +01:00
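
The reasoning in commit ae5a34c520 (fail immediately when `Array.prototype` has been extended with enumerable properties, rather than adding `hasOwnProperty` checks to hot loops) can be illustrated with a small sketch. This is not the actual PDF.js code: the function name `assertCleanArrayPrototype` and the error text are assumptions made for illustration only, and the check relies solely on the standard `Object.keys` behaviour of listing an object's own enumerable string keys.

```javascript
// Hypothetical helper, NOT the PDF.js implementation: throws as soon as
// `Array.prototype` carries any enumerable property.
function assertCleanArrayPrototype() {
  // Built-in methods such as `push` are non-enumerable, so in an unmodified
  // environment this list is empty.
  const keys = Object.keys(Array.prototype);
  if (keys.length > 0) {
    throw new Error(
      "Array.prototype contains enumerable properties (" +
        keys.join(", ") +
        "); define extensions as non-enumerable instead."
    );
  }
}
```

A library could run such a check once at start-up; a single up-front failure is cheaper than a per-iteration guard, and unlike a warning it cannot be silenced through a verbosity setting or overlooked.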

3920 Commits