pdf.js

Author	SHA1	Message	Date
terurou	fc0f844539	Implement linear-gradient, radial-gradient and dummy-pattern in SVGGraphics.	2019-03-16 13:56:29 +09:00
Jonas Jenwald	88d5750030	Remove the `src` attribute from `Image` objects used with natively supported JPEG images, when pages are cleaned-up/destroyed This will further help reduce the amount of image data that's currently being held alive, by explicitly removing the `src` attribute. Please note that this is mostly relevant for browsers which do not support `URL.createObjectURL`, or where `disableCreateObjectURL` was manually set by the user, since `blob:` URLs will be revoked (see the previous patch). However, using `about:memory` (in Firefox) it does seem that this may also be generally helpful, given that calling `URL.revokeObjectURL` won't invalidate the image data itself (as far as I can tell).	2019-03-15 15:25:48 +01:00
Jonas Jenwald	983b25f863	Ensure that `blob:` URLs will be revoked when pages are cleaned-up/destroyed Natively supported JPEG images are sent as-is, using a `blob:` or possibly a `data` URL, to the main-thread for loading/decoding. However there's currently no attempt at releasing these resources, which are held alive by `blob:` URLs, which seems unfortunate given that images can be arbitrarily large. As mentioned in https://developer.mozilla.org/en-US/docs/Web/API/URL/createObjectURL the lifetime of these URLs is tied to the document, hence they are not being removed when a page is cleaned-up/destroyed (e.g. when being removed from the `PDFPageViewBuffer` in the viewer). This is easy to test with the help of `about:memory` (in Firefox), which clearly shows the number of `blob:` URLs becoming arbitrarily large without this patch. With this patch however the `blob:` URLs are immediately released upon clean-up as expected, and the memory consumption should thus be considerably reduced for long documents with (simple) JPEG images.	2019-03-15 10:40:58 +01:00
Tim van der Meij	80135378ca	Merge pull request #10636 from Snuffleupagus/PDFDocumentProxy-destroy Small clean-up of the `PDFDocumentProxy.destroy` method and related code	2019-03-13 23:46:41 +01:00
Jonas Jenwald	24fc4f83ca	Small clean-up of the `PDFDocumentProxy.destroy` method and related code Note how `PDFDocumentProxy.destroy` is nothing more than an alias for `PDFDocumentLoadingTask.destroy`. While removing the latter method would be a breaking API change, there's still room for at least some clean-up here. The main changes in this patch are: - Stop providing a `PDFDocumentLoadingTask` instance separately when creating a `PDFDocumentProxy`, since the loadingTask is already available through the `WorkerTransport` instance. - Stop tracking the `PDFDocumentProxy` instance on the `WorkerTransport`, since that property is completely unused. - Simplify the 'Multiple `getDocument` instances' unit-tests by only destroying once, rather than twice, for each document.	2019-03-12 13:25:29 +01:00
Jonas Jenwald	88f9e633dd	Try to improve text-selection for Type3 fonts that utilize a non-default /FontMatrix (bug 1513120) For Type3 fonts text-selection is often not that great, and there's a couple of heuristics used to try and improve things. This patch simply extends those heuristics a bit, and fixes a pre-existing "naive" array comparison, but this all feels a bit brittle to say the least. The existing Type3 test-coverage isn't that great in general, and in particular Type3 `text` tests are few and far between, hence why this patch adds two different new `text` tests.	2019-03-12 10:32:08 +01:00
Tim van der Meij	8d4d7dbf58	Convert the `Lexer` class in `src/core/parser.js` to ES6 syntax	2019-03-10 19:04:36 +01:00
Tim van der Meij	7d0ecee771	Convert the `Parser` class in `src/core/parser.js` to ES6 syntax	2019-03-10 19:04:35 +01:00
Tim van der Meij	d587abbceb	Merge pull request #10633 from Snuffleupagus/murmurhash-class Convert `MurmurHash3_64` to an ES6 class	2019-03-09 21:07:12 +01:00
Jonas Jenwald	6b1ac44aea	Convert `MurmurHash3_64` to an ES6 class Notable changes: - Remove the `return this;` from the `MurmurHash3_64.update` method, since it's completely unused and doesn't make a lot of sense. - Remove the loop(s) from the `MurmurHash3_64.hexdigest` method, since creating a temporary array and then looping over it is wasteful given how simple this can be written with modern JavaScript.	2019-03-09 17:03:06 +01:00
Jonas Jenwald	2665502055	Move `NativeImageDecoder` into a separate file, and convert it to a `class` Given the size of the `src/core/evaluator.js` file, it cannot hurt to move some of its (image related) helper functionality into a separate file.	2019-03-09 15:59:04 +01:00
Tim van der Meij	e41c4aece4	Merge pull request #10621 from janpe2/svg-Tm-stroke Don't scale SVG stroke width by text matrix	2019-03-08 23:16:10 +01:00
Tim van der Meij	8b149b818e	Merge pull request #10615 from Snuffleupagus/corrupt-inline-ASCII85Decode Handle corrupt ASCII85Decode inline images with whitespace "inside" of the EOD marker (issue 10614)	2019-03-08 23:06:01 +01:00
Tim van der Meij	e1b01a601c	Merge pull request #10605 from timvandermeij/display-utils Convert `let` to `const` if possible in, and improve unit test coverage for, `src/display/display_utils.js`	2019-03-06 23:46:53 +01:00
Tim van der Meij	87a70f3359	Convert `let` to `const` if possible in `src/display/display_utils.js` Finally, `var` usage is removed.	2019-03-06 23:41:54 +01:00
Jani Pehkonen	d9e30b3452	Don't scale SVG stroke width by text matrix	2019-03-05 22:54:25 +02:00
Jonas Jenwald	3ce8fe7927	Handle corrupt ASCII85Decode inline images with whitespace "inside" of the EOD marker (issue 10614) There's a number of things wrong with the PDF document, since its inline images are first of all a lot larger than the 4 KB limit (as mandated by the specification, see https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G7.1852045). Furthermore the actual ASCII85Decode data is interspersed with a lot of needless whitespace, in particular also "inside" of the EOD (end-of-data) marker which thus completely breaks the detection. Note that according to the specification, see https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.1940130, this patch should be safe since it explicitly mentions that all whitespace should be ignored.	2019-03-04 23:41:36 +01:00
Jonas Jenwald	7caf769a66	Move the `deprecated` helper function to the `src/display/display_utils.js` file Given that the function is (purposely) independent of the verbosity level and that its message is worded to only apply on the main-thread, there's no reason to duplicate this across the built `pdf.js`/`pdf.worker.js` files.	2019-03-02 20:23:56 +01:00
Jonas Jenwald	4170c414fa	Reduce usage of `Date.now()` in `src/core/worker.js` Currently for every single parsed/rendered page there's no less than four `Date.now()` calls being made on the worker-side. This seems totally unnecessary, since the results of these calls are, by default, not used for anything unless the verbosity level is set to `INFO`.	2019-03-02 20:23:52 +01:00
Tim van der Meij	c43396c2b7	Merge pull request #10590 from janpe2/svg-missing-moveto Fix missing moveTos in SVG paths	2019-03-02 14:43:53 +01:00
Tim van der Meij	4f13eb00d0	Merge pull request #10604 from brendandahl/fix-type1-charset Put the string name of the glyph in the charset array.	2019-03-02 13:03:16 +01:00
Brendan Dahl	7d6ab081eb	Put the string name of the glyph in the charset array. Also, only warn once per font when missing a glyph name.	2019-03-01 18:03:51 -08:00
Jonas Jenwald	d7d1f23826	Zero the width/height of the temporary canvas used during `TextLayer` rendering The default size of these canvases seem to be `300 x 150` (two orders of magnitude larger than the ones in PR 10597), which probably is sufficient enough to matter since there's one such canvas for each textLayer that's rendered in the viewer. Also fixes the incorrect rejection reason, i.e. one using a string rather than an `Error`, in the `TextLayerRenderTask.cancel` method.	2019-03-01 04:05:37 +01:00
Brendan Dahl	34022d2fd1	Merge pull request #10591 from brendandahl/fix-charset Add unique glyph names for CFF fonts.	2019-02-28 17:22:29 -08:00
Tim van der Meij	9559d57636	Merge pull request #10595 from Snuffleupagus/JpegDecode-zero-tmpCanvas Zero the width/height of the temporary canvas used during `JpegDecode` (issue 10594)	2019-02-28 23:41:22 +01:00
Tim van der Meij	39fa26ea33	Merge pull request #10597 from Snuffleupagus/isFontSubpixelAAEnabled-canvas-cleanup Ensure that the temporary canvas created in `CanvasGraphics.isFontSubpixelAAEnabled` will be cleared	2019-02-28 23:37:24 +01:00
Tim van der Meij	af5597b7e5	Merge pull request #10573 from Snuffleupagus/type3-avoid-truncation Avoid truncating/breaking some Type3 glyphs in `compileType3Glyph` (bug 1245391, issue 10568)	2019-02-28 23:25:45 +01:00
Jonas Jenwald	b61b4d3229	Ensure that the temporary canvas created in `CanvasGraphics.isFontSubpixelAAEnabled` will be cleared While this particular canvas may be small, there can still be an arbitrarily large number of them (one per page rendered), which can/will eventually add up memory wise. This can be easily avoided by using the `cachedCanvases` abstraction instead, which will ensure that the `isFontSubpixelAAEnabled` canvas is removed together with other temporary canvases in `CanvasGraphics.endDrawing`.	2019-02-28 14:18:38 +01:00
Jonas Jenwald	4687cc85ac	Zero the width/height of the temporary canvas used during `JpegDecode` (issue 10594)	2019-02-28 12:23:34 +01:00
Brendan Dahl	8a596ef5d5	Add unique glyph names for CFF fonts. Printing on MacOS was broken with the previous approach of just mapping all the glyphs to notdef.	2019-02-27 15:00:29 -08:00
Jonas Jenwald	f664e074c9	Avoid using the Fetch API, in `GENERIC` builds, for unsupported protocols (issue 10587)	2019-02-27 13:04:20 +01:00
Jonas Jenwald	cbc07f985b	Load built-in CMap files using the Fetch API when possible	2019-02-27 13:04:19 +01:00
Jani Pehkonen	52e8e9b059	Fix missing moveTos in SVG paths	2019-02-26 20:00:35 +02:00
Jonas Jenwald	3a09a2f7a5	Update the year in the `license_header` files	2019-02-24 00:35:42 +01:00
Jonas Jenwald	db5dc14158	Move worker-thread only functions from `src/shared/util.js` and into a new `src/core/core_utils.js` file The `src/shared/util.js` file is being bundled into both the `pdf.js` and `pdf.worker.js` files, meaning that its code is by definition duplicated. Some main-thread only utility functions have already been moved to a separate `src/display/display_utils.js` file, and this patch simply extends that concept to utility functions which are used only on the worker-thread. Note in particular the `getInheritableProperty` function, which expects a `Dict` as input and thus cannot possibly ever be used on the main-thread.	2019-02-24 00:35:39 +01:00
Jonas Jenwald	a1f7517996	Rename the `src/display/dom_utils.js` file to `src/display/display_utils.js` This file (currently) contains not only DOM-specific helper functions/classes, but is used generally for various helper code relevant for main-thread functionality.	2019-02-23 16:30:16 +01:00
Jonas Jenwald	fb774a65b0	Avoid truncating/breaking some Type3 glyphs in `compileType3Glyph` (bug 1245391, issue 10568) Hopefully this patch makes sense, since I cannot claim to fully understand this function. With the changes made in PR 3354 some Type3 glyph outlines are no longer rendering correctly, since the final paths were being accidentally ignored. The fact that Type3 fonts are not very common in PDF documents, and that most Type3 glyphs are unaffected by this regression, probably explains why this has gone unnoticed since 2013.	2019-02-21 23:29:43 +01:00
Jonas Jenwald	60f6d49ff7	[api-minor] Expose the existence of a `Collection` dictionary via the `getMetadata` API method (issue 10555) Given the complexity of this functionality, and the fact that it doesn't seem widely used, I highly doubt that it'd ever make sense to support Collections; see also https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#M11.9.39646.2Heading.824.Collections	2019-02-15 15:40:31 +01:00
Jonas Jenwald	b6d090cc14	Fallback to the built-in font renderer when font loading fails After PR 9340 all glyphs are now re-mapped to a Private Use Area (PUA) which means that if a font fails to load, for whatever reason[1], all glyphs in the font will now render as Unicode glyph outlines. This obviously doesn't look good, to say the least, and might be seen as a "regression" since previously many glyphs were left in their original positions which provided a slightly better fallback[2]. Hence this patch, which implements a general fallback to the PDF.js built-in font renderer for fonts that fail to load (i.e. are rejected by the sanitizer). One caveat here is that this only works for the Font Loading API, since it's easy to handle errors in that case[3]. The solution implemented in this patch does not in any way delay the loading of valid fonts, which was the problem with my previous attempt at a solution, and will only require a bit of extra work/waiting for those fonts that actually fail to load. Please note: This patch doesn't fix any of the underlying PDF.js font conversion bugs that are responsible for creating corrupt font files, however it does improve rendering in a number of cases; refer to this possibly incomplete list: [Bug 1524888](https://bugzilla.mozilla.org/show_bug.cgi?id=1524888) Issue 10175 Issue 10232 --- [1] Usually because the PDF.js font conversion code wasn't able to parse the font file correctly. [2] Glyphs fell back to some default font, which while not accurate was more useful than the current state. [3] Furthermore I'm not sure how to implement this generally, assuming that's even possible, and don't really have time/interest to look into it either.	2019-02-11 10:27:08 +01:00
Jonas Jenwald	13230a1123	Remove the ability to pass in more than one font to `BaseFontLoader.bind` - The only existing call-site, of this method, is never passing more than one font at a time anyway. - As far as I can remember, this functionality has never actually been used (caveat: I didn't check the git history). - This allows simplification of the method, especially by making use of the fact that it's now asynchronous. - It should be just as easy to call `BaseFontLoader.bind` from within a loop, rather than having the loop in the method itself.	2019-02-10 21:09:57 +01:00
Jonas Jenwald	af3fcca88d	Convert `BaseFontLoader.bind` to be async, and only utilize `BaseFontLoader._queueLoadingCallback` when actually necessary Currently all fonts are using the `_queueLoadingCallback` method to determine when they have been loaded[1]. However in most cases this is just adding unnecessary overhead, especially with `BaseFontLoader.bind` now being asynchronous, given how fonts are loaded: - For fonts loaded using the Font Loading API, it's already possible to easily tell when a font has been loaded simply by checking the `loaded` promise on the FontFace object itself. - For browsers, e.g. Firefox, which support synchronous font loading it's already assumed that fonts are immediately available. Hence the `_queueLoadingCallback` method is moved into the `GenericFontLoader`, such that it's only utilized for fonts which are loaded using CSS. --- [1] In the "fonts loaded using CSS" case, this is already a hack anyway as outlined in the comments.	2019-02-10 21:09:57 +01:00
Tsukasa OI	96ba6afd47	Fix copying on supplementary plane characters pdf.js had a problem when copying characters on supplementary planes (0xPPXXXX where PP is nonzero). This is because certain methods of PartialEvaluator use classic String.fromCharCode instead of ES6's String.fromCodePoint. Despite the fact that readToUnicode method tried to parse out-of-UCS2 code points by parsing UTF-16BE, it was inadequate because String.fromCharCode only supports UCS-2 range of Unicode.	2019-02-10 18:14:53 +09:00
Jonas Jenwald	3bcf9187ec	Add a polyfill for `classList.{add, remove}` with more than one parameter Unsurprisingly IE11 doesn't support this, so a polyfill is needed since otherwise the sidebar can no longer be opened. Also, simplifies the existing `classList.toggle` polyfill.	2019-02-08 13:35:01 +01:00
Jonas Jenwald	614e502227	[api-minor] Remove the `document.currentScript` polyfill This polyfill is currently used in only one file, i.e. `src/display/api.js`, and only when trying to build a fallback `workerSrc` path. Given that the global `workerSrc` should always be set[1] when using the PDF.js library[2], and that the fallback `workerSrc` should only be regarded as a best-effort solution anyway, there isn't a particularly strong reason to keep the compatibility code in my opinion. --- [1] Other supported options include setting the global `workerPort`, or passing in a `PDFWorker` instance as part of the `getDocument` call. [2] Which is clearly mentioned in the JSDocs in `src/display/worker_options.js`.	2019-02-03 14:09:24 +01:00
Jonas Jenwald	22468817e1	Add a `settled` property, tracking the fulfilled/rejected state of the Promise, to `createPromiseCapability` This allows cleaning-up code which is currently manually tracking the state of the Promise of a `createPromiseCapability` instance.	2019-02-02 15:18:56 +01:00
Jonas Jenwald	6f94a05a29	Do the final text scaling correctly in `flushTextContentItem` (issue 8276) It's necessary to take into account whether or not the text is vertical, to avoid either the textContent `width` or `height` becoming incorrect.	2019-01-29 15:24:04 +01:00
Jonas Jenwald	5081063b9e	Attempt to clean-up/restore pending rendering operations when errors occur while a `RenderTask` runs (PR 10202 follow-up) This piggybacks off the existing `cancel` functionality, to ensure that any pending operations are closed and that any temporary canvases are actually being removed. Also simplifies `finishPaintTask` in `PDFPageView.draw` slightly, by converting it to an async function.	2019-01-26 16:02:51 +01:00
Jonas Jenwald	29f36d7a1b	Reduce unnecessary duplication of the `isDefaultDecode` methods on `ColorSpace` instances The recent PR 10482 made me realize that I missed an opportunity for simplification when doing the class conversion of this code in PR 10007.	2019-01-25 08:53:08 +01:00
Tim van der Meij	e2701d5422	Merge pull request #10482 from janpe2/indexed-decode Implement Decode entry in Indexed images	2019-01-24 23:46:55 +01:00
Jonas Jenwald	41fbc71ef9	Ensure that `XRef.indexObjects` can handle object numbers with zero-padding (issue 10491) All objects in the PDF document follow this pattern: ``` 0000000001 0 obj << % Some content here... >> endobj 0000000002 0 obj << % More content here... endobj ```	2019-01-24 22:37:18 +01:00
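
Note on commit 96ba6afd47 (supplementary plane characters): the difference between `String.fromCharCode` and `String.fromCodePoint` that the message describes can be illustrated with a minimal sketch. The helper name `utf16beToString` and the sample bytes are hypothetical, chosen only for illustration; this is not the `PartialEvaluator` code from the patch.

```javascript
// Hypothetical helper (not the PDF.js implementation): decode a UTF-16BE
// byte sequence into a JavaScript string, combining surrogate pairs.
function utf16beToString(bytes) {
  let result = "";
  for (let i = 0; i + 1 < bytes.length; i += 2) {
    let codePoint = (bytes[i] << 8) | bytes[i + 1];
    // A high surrogate followed by a low surrogate encodes one code point
    // on a supplementary plane.
    if (codePoint >= 0xd800 && codePoint <= 0xdbff && i + 3 < bytes.length) {
      const low = (bytes[i + 2] << 8) | bytes[i + 3];
      if (low >= 0xdc00 && low <= 0xdfff) {
        codePoint = ((codePoint - 0xd800) << 10) + (low - 0xdc00) + 0x10000;
        i += 2; // skip the low surrogate that was just consumed
      }
    }
    result += String.fromCodePoint(codePoint);
  }
  return result;
}

const supplementary = 0x1f600;
// fromCharCode keeps only the low 16 bits, yielding U+F600 instead.
const truncated = String.fromCharCode(supplementary);
// fromCodePoint emits the correct surrogate pair.
const correct = String.fromCodePoint(supplementary);
```

Here `truncated` is a single UTF-16 code unit that no longer represents the intended character, while `correct` is a two-unit string that round-trips through `codePointAt(0)`.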


3552 Commits
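
Note on commit 22468817e1 (`settled` property): the idea of a promise capability that tracks whether it has been fulfilled or rejected can be sketched roughly as follows. The function name `createCapabilitySketch` is hypothetical, and the body is an assumption about how such a property might be built, not the code from the patch.

```javascript
// Hypothetical sketch, not the PDF.js source: a promise capability whose
// read-only `settled` getter reports whether resolve/reject was called.
function createCapabilitySketch() {
  const capability = Object.create(null);
  let isSettled = false;

  Object.defineProperty(capability, "settled", {
    get() {
      return isSettled;
    },
  });
  capability.promise = new Promise(function (resolve, reject) {
    capability.resolve = function (data) {
      isSettled = true;
      resolve(data);
    };
    capability.reject = function (reason) {
      isSettled = true;
      reject(reason);
    };
  });
  return capability;
}

const fulfilledCapability = createCapabilitySketch();
const settledBefore = fulfilledCapability.settled;
fulfilledCapability.resolve("done");
const settledAfter = fulfilledCapability.settled;

const rejectedCapability = createCapabilitySketch();
// Attach a handler so the rejection below is not reported as unhandled.
rejectedCapability.promise.catch(function () {});
rejectedCapability.reject(new Error("failed"));
```

With a getter like this, calling code can check `capability.settled` instead of keeping its own boolean flag next to the promise.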