pdf.js

Author	SHA1	Message	Date
Tim van der Meij	55d9b35d37	Merge pull request #10727 from Snuffleupagus/type3-image-resources Support (rare) Type3 fonts which contains image resources (issue 10717)	2019-04-18 23:07:26 +02:00
Tim van der Meij	ae2a4dc3dd	Implement free text annotations	2019-04-13 18:45:22 +02:00
Jonas Jenwald	be604bd195	Support (rare) Type3 fonts which contains image resources (issue 10717) The Type3 font type is not commonly used in PDF documents, as can be seen from telemetry data such as: https://telemetry.mozilla.org/new-pipeline/dist.html#!cumulative=0&end_date=2019-04-09&include_spill=0&keys=__none__!__none__!__none__&max_channel_version=nightly%252F68&measure=PDF_VIEWER_FONT_TYPES&min_channel_version=nightly%252F57&processType=&product=Firefox&sanitize=1&sort_by_value=0&sort_keys=submissions&start_date=2019-03-18&table=0&trim=1&use_submission_date=0 (see also https://github.com/mozilla/pdf.js/wiki/Enumeration-Assignments-for-the-Telemetry-Histograms#pdf_viewer_font_types). Type3 fonts containing image resources are *very* rare in practice, usually they only contain path rendering operators, but as the issue shows they unfortunately do exist. Currently these Type3-related image resources are not handled in any special way, and given that fonts are document rather than page specific, rendering breaks since the image resources are thus not available to the entire document. Fortunately fixing this isn't too difficult, but it does require adding a couple of Type3-specific code-paths to the `PartialEvaluator`. In order to keep the implementation simple, particularly on the main-thread, these Type3 image resources are completely decoded on the worker-thread to avoid adding too many special cases. This should not cause any issues, only marginally less efficient code, but given how rare this kind of Type3 font is adding premature optimizations didn't seem at all warranted at this point.	2019-04-13 18:27:50 +02:00
Mukul Mishra	02e46d22d2	Add fetch stream spec	2019-04-07 13:14:03 +02:00
Jonas Jenwald	7a999d1d67	[api-minor] Add basic support for PageLayout in the API and the viewer Please see the specification, https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.2393749, and refer to the inline comments for additional details.	2019-04-05 11:32:01 +02:00
Tim van der Meij	072c5864fb	Merge pull request #10675 from Snuffleupagus/PDFDataTransportStream-disableRange [Firefox regression] Fix `disableRange=true` bug in `PDFDataTransportStream`	2019-04-04 23:07:45 +02:00
Tim van der Meij	b4c3b94592	Merge pull request #6606 from Rob--W/pattern-scaling Improve performance and correctness of Tiling Patterns	2019-03-29 00:01:38 +01:00
Tim van der Meij	f9c58115fc	Merge pull request #10683 from janpe2/type0-noncid-cmap Use CMap in Type0 fonts when CFF is not a CID font	2019-03-28 00:07:08 +01:00
Rob Wu	d3dc8f16b5	TilingPattern: Reverse transform after painting This transform resulted in an incorrectly positioned object when the bounding box's upper-left corner did not start at (0,0), because the translation was not reverted. This patch adds the missing transform. The test file (tiling-pattern-box.pdf) is based on the PDF from #2825. All but the first cube (including the PDF data) have been removed. To trigger the bug that is fixed by this commit, I changed the BBox of the first pattern from "[ 0 0 596 842]" to "[90 0 596 842]". Without this patch, the dashed vertical line that intersects the corners at A and E would disappear.	2019-03-27 17:50:35 +01:00
Rob Wu	a72a8e921f	Avoid extreme sizing / scaling in tiling pattern The new test file (tiling-pattern-large-steps.pdf) was manually created, to have the following characteristics: - Large xstep and ystep (90000) - Page width is 4000 (which is larger than MAX_PATTERN_SIZE) - Visually, the page consists of a red rectangle with a black border, surrounded by a 50 unit white padding. - Before patch: blurry; After patch: sharp Fixes #6496 Fixes #5698 Fixes #1434 Fixes #2825	2019-03-27 17:44:04 +01:00
Jonas Jenwald	9077abc263	Take the `FirstChar`/`LastChar` properties into account when computing the hash in `PartialEvaluator.preEvaluateFont` (issue 10665) Without this some fonts may incorrectly end up with matching `hash`es, thus breaking rendering since we'll not actually try to load/parse some of the fonts.	2019-03-27 16:27:10 +01:00
Jani Pehkonen	49c6233fbc	Use CMap in Type0 fonts when CFF is not a CID font	2019-03-26 19:38:44 +02:00
Jonas Jenwald	bb384dd5ed	[Firefox regression] Fix `disableRange=true` bug in `PDFDataTransportStream` Currently if trying to set `disableRange=true` in the built-in PDF Viewer in Firefox, either through `about:config` or via the URL hash, the PDF document will never load. It appears that this has been broken for a couple of years, without anyone noticing. Obviously it's not a good idea to set `disableRange=true`, however it seems that this bug affects the PDF Viewer in Firefox even with default settings: - In the case where `initialData` already contains the entire file, we're forced to dispatch a range request to re-fetch already available data just so that file loading may complete. - (In the case where the data arrives, via streaming, before being specifically requested through `requestDataRange`, we're also forced to re-fetch data unnecessarily.) This part was removed, to reduce the scope/risk of the patch somewhat. In the cases outlined above, we're having to re-fetch already available data thus potentially delaying loading/rendering of PDF files in Firefox (and wasting resources in the process).	2019-03-26 16:34:13 +01:00
Jonas Jenwald	234c1d2b2a	Remove the Firefox-specific 'read with streaming' unit-test Support for the non-standard `moz-chunked-arraybuffer` response type is in the process of being removed from Firefox; see e.g. https://bugzilla.mozilla.org/show_bug.cgi?id=1411865 For the time being, you probably want to keep support for this in the general PDF.js library given that feature detection is used. However, removing the unit-test immediately seems reasonable, since it will otherwise start failing once the platform support for `moz-chunked-arraybuffer` is gone. Fixes 8851; please note that if unit-tests for the code in `fetch_stream.js` are wanted, which I'm assuming they are, those should live in their own file rather than being lumped into `network_spec.js` anyway.	2019-03-22 12:43:18 +01:00
Thomas den Hollander	b24a14738a	Update test case description	2019-03-20 12:52:32 +01:00
Tim van der Meij	33bfbef6ba	Merge pull request #10635 from timvandermeij/lexer-parser Convert `src/core/parser.js` to ES6 syntax and write more unit tests for the lexer and the parser	2019-03-19 23:17:34 +01:00
Tim van der Meij	4a4b197b9d	Write more unit tests for the lexer and the parser Moreover, group the lexer unit tests per method. This matches what we do for other classes and makes it more easily visible which methods we don't or insufficiently unit test. The parser itself is not unit tested yet, so this patch provides a start for doing so. The `inlineStreamSkipEI` method is used in other end marker detection methods, so it's important that its functionality is correct for proper parsing.	2019-03-17 13:36:23 +01:00
Tim van der Meij	2ee299a62b	Convert `test/unit/parser_spec.js` to ES6 syntax Moreover, disable `var` usage for this file.	2019-03-17 13:27:46 +01:00
Tim van der Meij	80135378ca	Merge pull request #10636 from Snuffleupagus/PDFDocumentProxy-destroy Small clean-up of the `PDFDocumentProxy.destroy` method and related code	2019-03-13 23:46:41 +01:00
Jonas Jenwald	24fc4f83ca	Small clean-up of the `PDFDocumentProxy.destroy` method and related code Note how `PDFDocumentProxy.destroy` is nothing more than an alias for `PDFDocumentLoadingTask.destroy`. While removing the latter method would be a breaking API change, there's still room for at least some clean-up here. The main changes in this patch are: - Stop providing a `PDFDocumentLoadingTask` instance separately when creating a `PDFDocumentProxy`, since the loadingTask is already available through the `WorkerTransport` instance. - Stop tracking the `PDFDocumentProxy` instance on the `WorkerTransport`, since that property is completely unused. - Simplify the 'Multiple `getDocument` instances' unit-tests by only destroying once, rather than twice, for each document.	2019-03-12 13:25:29 +01:00
Jonas Jenwald	88f9e633dd	Try to improve text-selection for Type3 fonts that utilize a non-default /FontMatrix (bug 1513120) For Type3 fonts text-selection is often not that great, and there's a couple of heuristics used to try and improve things. This patch simply extends those heuristics a bit, and fixes a pre-existing "naive" array comparison, but this all feels a bit brittle to say the least. The existing Type3 test-coverage isn't that great in general, and in particular Type3 `text` tests are few and far between, hence why this patch adds two different new `text` tests.	2019-03-12 10:32:08 +01:00
Tim van der Meij	8b149b818e	Merge pull request #10615 from Snuffleupagus/corrupt-inline-ASCII85Decode Handle corrupt ASCII85Decode inline images with whitespace "inside" of the EOD marker (issue 10614)	2019-03-08 23:06:01 +01:00
Tim van der Meij	b244622f7e	Improve unit test coverage for `src/display/display_utils.js` The `DOMCanvasFactory` class is now fully covered. Moreover, missing cases for the `getFilenameFromUrl` function have been included. Finally, `var` usage has been removed.	2019-03-06 23:41:54 +01:00
Jonas Jenwald	3ce8fe7927	Handle corrupt ASCII85Decode inline images with whitespace "inside" of the EOD marker (issue 10614) There's a number of things wrong with the PDF document, since its inline images are first of all a lot larger than the 4 KB limit (as mandated by the specification, see https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G7.1852045). Furthermore the actual ASCII85Decode data is interspersed with a lot of needless whitespace, in particular also "inside" of the EOD (end-of-data) marker which thus completely breaks the detection. Note that according to the specification, see https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.1940130, this patch should be safe since it explicitly mentions that all whitespace should be ignored.	2019-03-04 23:41:36 +01:00
Brendan Dahl	34022d2fd1	Merge pull request #10591 from brendandahl/fix-charset Add unique glyph names for CFF fonts.	2019-02-28 17:22:29 -08:00
Tim van der Meij	af5597b7e5	Merge pull request #10573 from Snuffleupagus/type3-avoid-truncation Avoid truncating/breaking some Type3 glyphs in `compileType3Glyph` (bug 1245391, issue 10568)	2019-02-28 23:25:45 +01:00
Brendan Dahl	8a596ef5d5	Add unique glyph names for CFF fonts. Printing on MacOS was broken with the previous approach of just mapping all the glyphs to notdef.	2019-02-27 15:00:29 -08:00
Jonas Jenwald	f664e074c9	Avoid using the Fetch API, in `GENERIC` builds, for unsupported protocols (issue 10587)	2019-02-27 13:04:20 +01:00
Jonas Jenwald	cbc07f985b	Load built-in CMap files using the Fetch API when possible	2019-02-27 13:04:19 +01:00
Jonas Jenwald	c5cf3ab808	Run the `custom_spec` unit-tests in Node.js/Travis (PR 10537 follow-up)	2019-02-26 22:40:55 +01:00
Jonas Jenwald	db5dc14158	Move worker-thread only functions from `src/shared/util.js` and into a new `src/core/core_utils.js` file The `src/shared/util.js` file is being bundled into both the `pdf.js` and `pdf.worker.js` files, meaning that its code is by definition duplicated. Some main-thread only utility functions have already been moved to a separate `src/display/display_utils.js` file, and this patch simply extends that concept to utility functions which are used only on the worker-thread. Note in particular the `getInheritableProperty` function, which expects a `Dict` as input and thus cannot possibly ever be used on the main-thread.	2019-02-24 00:35:39 +01:00
Jonas Jenwald	a1f7517996	Rename the `src/display/dom_utils.js` file to `src/display/display_utils.js` This file (currently) contains not only DOM-specific helper functions/classes, but is used generally for various helper code relevant for main-thread functionality.	2019-02-23 16:30:16 +01:00
Jonas Jenwald	fb774a65b0	Avoid truncating/breaking some Type3 glyphs in `compileType3Glyph` (bug 1245391, issue 10568) Hopefully this patch makes sense, since I cannot claim to fully understand this function. With the changes made in PR 3354 some Type3 glyph outlines are no longer rendering correctly, since the final paths were being accidentally ignored. The fact that Type3 fonts are not very common in PDF documents, and that most Type3 glyphs are unaffected by this regression, probably explains why this has gone unnoticed since 2013.	2019-02-21 23:29:43 +01:00
Jonas Jenwald	a0354494bd	Re-factor the `PDFDataRangeTransport` unit-tests and enable them in Node.js/Travis There doesn't appear to be any particular reason for only running these unit-tests in browsers, since the `PDFDataRangeTransport` functionality itself should be back-end agnostic.	2019-02-17 14:45:17 +01:00
Jonas Jenwald	507e0a4907	Add a new `DOMFileReaderFactory` helper to the unit-tests, and re-factor `NodeFileReaderFactory` to be asynchronous This allows simplification of the 'creates pdf doc from URL and aborts loading after worker initialized' API unit-test. Note that the `DOMFileReaderFactory` uses the Fetch API, for simplicity, since it should be available in all browsers where we're running tests.	2019-02-17 14:41:14 +01:00
Jonas Jenwald	60f6d49ff7	[api-minor] Expose the existence of a `Collection` dictionary via the `getMetadata` API method (issue 10555) Given the complexity of this functionality, and the fact that it doesn't seem widely used, I highly doubt that it'd ever make sense to support Collections; see also https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#M11.9.39646.2Heading.824.Collections	2019-02-15 15:40:31 +01:00
Tim van der Meij	1d90c76097	Merge pull request #10537 from timvandermeij/unittest Improve unit test coverage	2019-02-12 00:12:29 +01:00
Tim van der Meij	7c91e94b19	Implement the `NodeCanvasFactory` class to execute more unit tests in Node.js	2019-02-10 19:37:34 +01:00
Tim van der Meij	b6eddc40b5	Write unit tests for the `string32` and `toRomanNumerals` utility functions	2019-02-10 18:58:52 +01:00
Tsukasa OI	96ba6afd47	Fix copying on supplementary plane characters pdf.js had a problem when copying characters on supplementary planes (0xPPXXXX where PP is nonzero). This is because certain methods of PartialEvaluator use classic String.fromCharCode instead of ES6's String.fromCodePoint. Despite the fact that readToUnicode method tried to parse out-of-UCS2 code points by parsing UTF-16BE, it was inadequate because String.fromCharCode only supports UCS-2 range of Unicode.	2019-02-10 18:14:53 +09:00
Jonas Jenwald	22468817e1	Add a `settled` property, tracking the fulfilled/rejected state of the Promise, to `createPromiseCapability` This allows cleaning-up code which is currently manually tracking the state of the Promise of a `createPromiseCapability` instance.	2019-02-02 15:18:56 +01:00
Jonas Jenwald	2b0b6178f7	Clean-up after the `gets operatorList with JPEG image (issue 4888)` unit-test This unit-test wasn't destroying the `loadingTask` when complete, as it should have done.	2019-01-29 15:24:08 +01:00
Jonas Jenwald	6f94a05a29	Do the final text scaling correctly in `flushTextContentItem` (issue 8276) It's necessary to take into account whether or not the text is vertical, to avoid either the textContent `width` or `height` becoming incorrect.	2019-01-29 15:24:04 +01:00
Tim van der Meij	e2701d5422	Merge pull request #10482 from janpe2/indexed-decode Implement Decode entry in Indexed images	2019-01-24 23:46:55 +01:00
Jonas Jenwald	41fbc71ef9	Ensure that `XRef.indexObjects` can handle object numbers with zero-padding (issue 10491) All objects in the PDF document follow this pattern: ``` 0000000001 0 obj << % Some content here... >> endobj 0000000002 0 obj << % More content here... endobj ```	2019-01-24 22:37:18 +01:00
Jani Pehkonen	26121177ab	Implement Decode entry in Indexed images	2019-01-22 22:51:04 +02:00
Tim van der Meij	c4fe4087d3	Implement a unit test for metadata parsing to ensure that it's not vulnerable to the billion laughs attack	2019-01-19 19:54:08 +01:00
Jonas Jenwald	24a688d6c6	Convert some usage of `indexOf` to `startsWith`/`includes` where applicable In many cases in the code you don't actually care about the index itself, but rather just want to know if something exists in a String/Array or if a String starts in a particular way. With modern JavaScript functionality, it's thus possible to remove a number of existing `indexOf` cases.	2019-01-18 17:57:41 +01:00
Jonas Jenwald	9f45f8dfda	When parsing Metadata, attempt to remove "junk" before the first tag (PR 10398 follow-up) This will allow the Metadata to be successfully extracted from the PDF file in issue 10395. Furthermore, this patch also fixes a bug in `Metadata.get` which causes the method to return `null` rather than an empty string or zero (since either ought to be allowed).	2019-01-16 12:44:27 +01:00
Jonas Jenwald	5d90224409	Add a unit-test for issue 10395 (PR 10398 follow-up)	2019-01-16 11:30:36 +01:00
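
Note on commit 96ba6afd47 (supplementary plane characters): the difference between `String.fromCharCode` and `String.fromCodePoint` described in that message can be illustrated with a short sketch. The helper name `utf16beToString` is hypothetical and is not taken from the pdf.js source; it only assumes that the input is a sequence of UTF-16BE bytes, as the commit message says `readToUnicode` parses.

```javascript
// U+1F600 lies on a supplementary plane (above 0xFFFF).
const codePoint = 0x1f600;

// String.fromCharCode keeps only the low 16 bits, so this yields U+F600.
const truncated = String.fromCharCode(codePoint);

// String.fromCodePoint emits the surrogate pair for the full code point.
const correct = String.fromCodePoint(codePoint);

// Hypothetical helper (an assumption for illustration): decode UTF-16BE
// bytes into a string, combining surrogate pairs into one code point.
function utf16beToString(bytes) {
  let result = "";
  for (let i = 0; i + 1 < bytes.length; i += 2) {
    const unit = (bytes[i] << 8) | bytes[i + 1];
    const isHigh = unit >= 0xd800 && unit <= 0xdbff;
    if (isHigh && i + 3 < bytes.length) {
      const low = (bytes[i + 2] << 8) | bytes[i + 3];
      if (low >= 0xdc00 && low <= 0xdfff) {
        const combined = 0x10000 + ((unit - 0xd800) << 10) + (low - 0xdc00);
        result += String.fromCodePoint(combined);
        i += 2; // Skip the low surrogate that was just consumed.
        continue;
      }
    }
    result += String.fromCodePoint(unit);
  }
  return result;
}

console.log(truncated.length, correct.length);
```

Here `truncated` holds a single unrelated UTF-16 code unit, while `correct` holds the two code units that together represent the intended character.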


1849 Commits
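
Note on commit 22468817e1 (`settled` property): the idea of tracking the fulfilled/rejected state on a promise capability can be sketched as follows. This is a minimal illustration written for this log, assuming a capability object with `promise`, `resolve`, and `reject` members; it should not be read as the actual pdf.js implementation.

```javascript
// Sketch of a promise capability that records whether it has been settled.
// The body below is an assumption for illustration, not the pdf.js source.
function createPromiseCapability() {
  const capability = Object.create(null);
  let isSettled = false;

  // Expose the state as a read-only getter.
  Object.defineProperty(capability, "settled", {
    get() {
      return isSettled;
    },
  });

  capability.promise = new Promise(function (resolve, reject) {
    capability.resolve = function (data) {
      isSettled = true;
      resolve(data);
    };
    capability.reject = function (reason) {
      isSettled = true;
      reject(reason);
    };
  });
  return capability;
}

// Usage: callers no longer need their own "done" flag next to the capability.
const exampleCapability = createPromiseCapability();
const settledBefore = exampleCapability.settled;
exampleCapability.resolve("data");
const settledAfter = exampleCapability.settled;
console.log(settledBefore, settledAfter);
```

With a getter like this, code that previously kept a separate boolean alongside each capability can check `capability.settled` instead.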