pdf.js

Author	SHA1	Message	Date
Tim van der Meij	b161050df4	Merge pull request #10709 from Snuffleupagus/pageLayout [api-minor] Add basic support for PageLayout in the API and the viewer	2019-04-05 23:07:32 +02:00
Tim van der Meij	8c8738ea47	Merge pull request #10678 from Snuffleupagus/rm-moz-chunked-arraybuffer Remove `moz-chunked-arraybuffer` support, and related code, from `src/display/network.js`	2019-04-05 22:52:28 +02:00
Jonas Jenwald	7a999d1d67	[api-minor] Add basic support for PageLayout in the API and the viewer Please see the specification, https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.2393749, and refer to the inline comments for additional details.	2019-04-05 11:32:01 +02:00
Tim van der Meij	57abddc9ca	Merge pull request #10713 from Snuffleupagus/rm-JSDoc-annotation Remove `src/core/annotation.js` from the `gulp jsdoc` build target	2019-04-04 23:15:02 +02:00
Tim van der Meij	072c5864fb	Merge pull request #10675 from Snuffleupagus/PDFDataTransportStream-disableRange [Firefox regression] Fix `disableRange=true` bug in `PDFDataTransportStream`	2019-04-04 23:07:45 +02:00
Jonas Jenwald	f666395c24	Remove `src/core/annotation.js` from the `gulp jsdoc` build target Note how at https://mozilla.github.io/pdf.js/api/ it's being described as API docs, however `src/core/annotation.js` is not part of the public API. Furthermore, given that the code residing in the `src/core/` folder is run in a worker-thread, it's not even accessible on the main-thread (since `postMessage` is being used to transfer the data). Hence the different API methods simply return a "proxy" to the underlying data, but not actually the same objects and data structures as in the worker-thread itself; thus it doesn't make a whole lot of sense to expose this in API docs as far as I'm concerned. Finally, the patch fixes a small JSDoc related typo in `src/display/api.js` when referring to the `TextStyle` typedef.	2019-04-04 18:03:08 +02:00
Jonas Jenwald	b40e6723be	Remove `moz-chunked-arraybuffer` support, and related code, from `src/display/network.js` The `moz-chunked-arraybuffer` responseType is a non-standard property, which has been subsumed by the Fetch API, and it's in the process of being removed from Firefox; please see https://bugzilla.mozilla.org/show_bug.cgi?id=1120171 and https://bugzilla.mozilla.org/show_bug.cgi?id=1411865 Please note: Rather than waiting for both `Fetch` and `ReadableStream` to be available in e.g. a Firefox ESR version (which is probably going to be 68 at the earliest), let's just decide that PDF.js release `2.1.266` will be the last one with `moz-chunked-arraybuffer` support and land this patch (since nothing should outright break without it anyway).	2019-04-01 20:48:51 +02:00
Jonas Jenwald	c6ddbd55e2	Add a `progressiveDataLength` fast-path to `ChunkedStream.ensureByte` This is similar to the existing check used in `ChunkedStream.ensureRange`.	2019-03-29 20:00:28 +01:00
Jonas Jenwald	49e8a270c4	Update `ChunkedStream.makeSubStream` to actually check if (some) data exists when the `length` parameter is undefined Note how `XRef.fetchUncompressed`, which is used a lot for most PDF documents, is calling the `makeSubStream` method without providing a `length` argument. In practice this results in the `makeSubStream` method, on the `ChunkedStream` instance, calling the `ensureRange` method with `NaN` as the end position, thus resulting in no data being requested despite it possibly being necessary. This may be quite bad, since in this particular case it will lead to a new `ChunkedStream` being created and also a new `Parser`/`Lexer` instance. Given that it's quite possible that even the very first `Parser.getObj` call could throw `MissingDataException`, this could thus lead to wasted time/resources (since re-parsing is necessary once the data finally arrives). You obviously need to be very careful to not have `ChunkedStream.makeSubStream` accidentally requesting the entire file, hence its `this.end` property is of no use here, but it should be possible to at least check that the `start` of the data is present before any potentially expensive parsing occurs.	2019-03-29 17:20:31 +01:00
Tim van der Meij	b4c3b94592	Merge pull request #6606 from Rob--W/pattern-scaling Improve performance and correctness of Tiling Patterns	2019-03-29 00:01:38 +01:00
Tim van der Meij	f9c58115fc	Merge pull request #10683 from janpe2/type0-noncid-cmap Use CMap in Type0 fonts when CFF is not a CID font	2019-03-28 00:07:08 +01:00
Rob Wu	5985d4069a	TilingPattern: Add comment to explain the implementation	2019-03-27 17:50:46 +01:00
Rob Wu	d3dc8f16b5	TilingPattern: Reverse transform after painting This transform resulted in an incorrectly positioned object when the bounding box's upper-left corner did not start at (0,0), because the translation was not reverted. This patch adds the missing transform. The test file (tiling-pattern-box.pdf) is based on the PDF from #2825. All but the first cube (including the PDF data) have been removed. To trigger the bug that is fixed by this commit, I changed the BBox of the first pattern from "[ 0 0 596 842]" to "[90 0 596 842]". Without this patch, the dashed vertical line that intersects the corners at A and E would disappear.	2019-03-27 17:50:35 +01:00
Rob Wu	a72a8e921f	Avoid extreme sizing / scaling in tiling pattern The new test file (tiling-pattern-large-steps.pdf) was manually created, to have the following characteristics: - Large xstep and ystep (90000) - Page width is 4000 (which is larger than MAX_PATTERN_SIZE) - Visually, the page consists of a red rectangle with a black border, surrounded by a 50 unit white padding. - Before patch: blurry; After patch: sharp Fixes #6496 Fixes #5698 Fixes #1434 Fixes #2825	2019-03-27 17:44:04 +01:00
Jonas Jenwald	9077abc263	Take the `FirstChar`/`LastChar` properties into account when computing the hash in `PartialEvaluator.preEvaluateFont` (issue 10665) Without this some fonts may incorrectly end up with matching `hash`es, thus breaking rendering since we'll not actually try to load/parse some of the fonts.	2019-03-27 16:27:10 +01:00
Jonas Jenwald	a2a824ed01	Don't accidentally use an empty `hash` value when comparing `preEvaluatedFonts` in `PartialEvaluator.loadFont` Note that `PartialEvaluator.preEvaluateFont` will return an empty string when no hash was computed. This will completely short-circuit the `fontAlias` comparison in `PartialEvaluator.loadFont`, since fonts which are totally different will then match if their `hash`es are empty.	2019-03-27 00:54:39 +01:00
Jani Pehkonen	49c6233fbc	Use CMap in Type0 fonts when CFF is not a CID font	2019-03-26 19:38:44 +02:00
Rob Wu	60d4685c10	Refactor TilingPattern - Deduplicate size/scale calculation, by introducing `getSizeAndScale`. - Eliminate unnecessary calculations / variables.	2019-03-26 17:35:23 +01:00
Jonas Jenwald	bb384dd5ed	[Firefox regression] Fix `disableRange=true` bug in `PDFDataTransportStream` Currently if trying to set `disableRange=true` in the built-in PDF Viewer in Firefox, either through `about:config` or via the URL hash, the PDF document will never load. It appears that this has been broken for a couple of years, without anyone noticing. Obviously it's not a good idea to set `disableRange=true`, however it seems that this bug affects the PDF Viewer in Firefox even with default settings: - In the case where `initialData` already contains the entire file, we're forced to dispatch a range request to re-fetch already available data just so that file loading may complete. - (In the case where the data arrives, via streaming, before being specifically requested through `requestDataRange`, we're also forced to re-fetch data unnecessarily.) This part was removed, to reduce the scope/risk of the patch somewhat. In the cases outlined above, we're having to re-fetch already available data thus potentially delaying loading/rendering of PDF files in Firefox (and wasting resources in the process).	2019-03-26 16:34:13 +01:00
wuhao.daraw	1472c10bab	fix: electron environment detection	2019-03-26 20:52:49 +08:00
Tim van der Meij	33bfbef6ba	Merge pull request #10635 from timvandermeij/lexer-parser Convert `src/core/parser.js` to ES6 syntax and write more unit tests for the lexer and the parser	2019-03-19 23:17:34 +01:00
Tim van der Meij	7d3cb19571	Convert the `Linearization` class in `src/core/parser.js` to ES6 syntax Moreover, disable `var` usage for this file.	2019-03-17 13:27:45 +01:00
Tim van der Meij	ee3cfb7986	Merge pull request #10646 from terurou/svg-fill Implement linear-gradient, radial-gradient and dummy-pattern in SVGGraphics.	2019-03-17 13:13:45 +01:00
terurou	9c70a3831c	Fix to use radicalGradient.	2019-03-17 10:57:16 +09:00
Tim van der Meij	7c9f1cc518	Merge pull request #10644 from Snuffleupagus/revokeObjectURL Ensure that `blob:` URLs will be revoked when pages are cleaned-up/destroyed (JPEG memory usage)	2019-03-16 19:29:23 +01:00
terurou	c970a4b6ae	Fix copy-paste mistake.	2019-03-16 23:21:56 +09:00
Jonas Jenwald	56eeeea1dc	Re-factor the `getTransfers` helper function into a "private" getter method on the `OperatorList` This function is currently called with the `OperatorList` instance as its argument, hence I cannot think of any good reason for not just moving it into the `OperatorList` properly. (This will also help with other planned changes regarding the `ImageCache` functionality.)	2019-03-16 13:06:51 +01:00
Jonas Jenwald	7273795eb6	Actually transfer eligible ImageMask data, rather than always copying it By transferring `ArrayBuffer`s you can avoid having two copies of the same data, i.e. one copy on each of the worker/main-thread, for data that's used only once on the worker-thread. Note how the code in `PDFImage.createMask` (80135378ca/src/core/image.js, L284-L285) goes to great lengths to actually enable transferring of the image data. However in `PartialEvaluator.buildPaintImageXObject` (80135378ca/src/core/evaluator.js, L336) the `cached` property is always set to `true`, which disqualifies the image data from being transferred; see `getTransfers` (80135378ca/src/core/operator_list.js, L552-L554). For most ImageMask data this patch won't matter, since images found in the `/Resources -> /XObject` dictionary will always be indexed by name. However for inline images which contain ImageMask data, where only "small" images are cached (in both `parser.js` and `evaluator.js`), the current code will result in some unnecessary memory usage.	2019-03-16 13:06:32 +01:00
terurou	fc0f844539	Implement linear-gradient, radial-gradient and dummy-pattern in SVGGraphics.	2019-03-16 13:56:29 +09:00
Jonas Jenwald	88d5750030	Remove the `src` attribute from `Image` objects used with natively supported JPEG images, when pages are cleaned-up/destroyed This will further help reduce the amount of image data that's currently being held alive, by explicitly removing the `src` attribute. Please note that this is mostly relevant for browsers which do not support `URL.createObjectURL`, or where `disableCreateObjectURL` was manually set by the user, since `blob:` URLs will be revoked (see the previous patch). However, using `about:memory` (in Firefox) it does seem that this may also be generally helpful, given that calling `URL.revokeObjectURL` won't invalidate the image data itself (as far as I can tell).	2019-03-15 15:25:48 +01:00
Jonas Jenwald	983b25f863	Ensure that `blob:` URLs will be revoked when pages are cleaned-up/destroyed Natively supported JPEG images are sent as-is, using a `blob:` or possibly a `data` URL, to the main-thread for loading/decoding. However there's currently no attempt at releasing these resources, which are held alive by `blob:` URLs, which seems unfortunate given that images can be arbitrarily large. As mentioned in https://developer.mozilla.org/en-US/docs/Web/API/URL/createObjectURL the lifetime of these URLs is tied to the document, hence they are not being removed when a page is cleaned-up/destroyed (e.g. when being removed from the `PDFPageViewBuffer` in the viewer). This is easy to test with the help of `about:memory` (in Firefox), which clearly shows the number of `blob:` URLs becoming arbitrarily large without this patch. With this patch however the `blob:` URLs are immediately released upon clean-up as expected, and the memory consumption should thus be considerably reduced for long documents with (simple) JPEG images.	2019-03-15 10:40:58 +01:00
Tim van der Meij	80135378ca	Merge pull request #10636 from Snuffleupagus/PDFDocumentProxy-destroy Small clean-up of the `PDFDocumentProxy.destroy` method and related code	2019-03-13 23:46:41 +01:00
Jonas Jenwald	24fc4f83ca	Small clean-up of the `PDFDocumentProxy.destroy` method and related code Note how `PDFDocumentProxy.destroy` is nothing more than an alias for `PDFDocumentLoadingTask.destroy`. While removing the latter method would be a breaking API change, there's still room for at least some clean-up here. The main changes in this patch are: - Stop providing a `PDFDocumentLoadingTask` instance separately when creating a `PDFDocumentProxy`, since the loadingTask is already available through the `WorkerTransport` instance. - Stop tracking the `PDFDocumentProxy` instance on the `WorkerTransport`, since that property is completely unused. - Simplify the 'Multiple `getDocument` instances' unit-tests by only destroying once, rather than twice, for each document.	2019-03-12 13:25:29 +01:00
Jonas Jenwald	88f9e633dd	Try to improve text-selection for Type3 fonts that utilize a non-default /FontMatrix (bug 1513120) For Type3 fonts text-selection is often not that great, and there's a couple of heuristics used to try and improve things. This patch simply extends those heuristics a bit, and fixes a pre-existing "naive" array comparison, but this all feels a bit brittle to say the least. The existing Type3 test-coverage isn't that great in general, and in particular Type3 `text` tests are few and far between, hence why this patch adds two different new `text` tests.	2019-03-12 10:32:08 +01:00
Tim van der Meij	8d4d7dbf58	Convert the `Lexer` class in `src/core/parser.js` to ES6 syntax	2019-03-10 19:04:36 +01:00
Tim van der Meij	7d0ecee771	Convert the `Parser` class in `src/core/parser.js` to ES6 syntax	2019-03-10 19:04:35 +01:00
Tim van der Meij	d587abbceb	Merge pull request #10633 from Snuffleupagus/murmurhash-class Convert `MurmurHash3_64` to an ES6 class	2019-03-09 21:07:12 +01:00
Jonas Jenwald	6b1ac44aea	Convert `MurmurHash3_64` to an ES6 class Notable changes: - Remove the `return this;` from the `MurmurHash3_64.update` method, since it's completely unused and doesn't make a lot of sense. - Remove the loop(s) from the `MurmurHash3_64.hexdigest` method, since creating a temporary array and then looping over it is wasteful given how simply this can be written with modern JavaScript.	2019-03-09 17:03:06 +01:00
Jonas Jenwald	2665502055	Move `NativeImageDecoder` into a separate file, and convert it to a `class` Given the size of the `src/core/evaluator.js` file, it cannot hurt to move some of its (image related) helper functionality into a separate file.	2019-03-09 15:59:04 +01:00
Tim van der Meij	e41c4aece4	Merge pull request #10621 from janpe2/svg-Tm-stroke Don't scale SVG stroke width by text matrix	2019-03-08 23:16:10 +01:00
Tim van der Meij	8b149b818e	Merge pull request #10615 from Snuffleupagus/corrupt-inline-ASCII85Decode Handle corrupt ASCII85Decode inline images with whitespace "inside" of the EOD marker (issue 10614)	2019-03-08 23:06:01 +01:00
Tim van der Meij	e1b01a601c	Merge pull request #10605 from timvandermeij/display-utils Convert `let` to `const` if possible in, and improve unit test coverage for, `src/display/display_utils.js`	2019-03-06 23:46:53 +01:00
Tim van der Meij	87a70f3359	Convert `let` to `const` if possible in `src/display/display_utils.js` Finally, `var` usage is removed.	2019-03-06 23:41:54 +01:00
Jani Pehkonen	d9e30b3452	Don't scale SVG stroke width by text matrix	2019-03-05 22:54:25 +02:00
Jonas Jenwald	3ce8fe7927	Handle corrupt ASCII85Decode inline images with whitespace "inside" of the EOD marker (issue 10614) There's a number of things wrong with the PDF document, since its inline images are, first of all, a lot larger than the 4 KB limit (as mandated by the specification, see https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G7.1852045). Furthermore the actual ASCII85Decode data is interspersed with a lot of needless whitespace, in particular also "inside" of the EOD (end-of-data) marker which thus completely breaks the detection. Note that according to the specification, see https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.1940130, this patch should be safe since it explicitly mentions that all whitespace should be ignored.	2019-03-04 23:41:36 +01:00
Jonas Jenwald	7caf769a66	Move the `deprecated` helper function to the `src/display/display_utils.js` file Given that the function is (purposely) independent of the verbosity level and that its message is worded to only apply on the main-thread, there's no reason to duplicate this across the built `pdf.js`/`pdf.worker.js` files.	2019-03-02 20:23:56 +01:00
Jonas Jenwald	4170c414fa	Reduce usage of `Date.now()` in `src/core/worker.js` Currently for every single parsed/rendered page there's no less than four `Date.now()` calls being made on the worker-side. This seems totally unnecessary, since the results of these calls are, by default, not used for anything unless the verbosity level is set to `INFO`.	2019-03-02 20:23:52 +01:00
Tim van der Meij	c43396c2b7	Merge pull request #10590 from janpe2/svg-missing-moveto Fix missing moveTos in SVG paths	2019-03-02 14:43:53 +01:00
Tim van der Meij	4f13eb00d0	Merge pull request #10604 from brendandahl/fix-type1-charset Put the string name of the glyph in the charset array.	2019-03-02 13:03:16 +01:00
Brendan Dahl	7d6ab081eb	Put the string name of the glyph in the charset array. Also, only warn once per font when missing a glyph name.	2019-03-01 18:03:51 -08:00


3580 Commits