pdf.js

Author	SHA1	Message	Date
Brendan Dahl	6b12612a52	Sanitize name index in compile phase of CFF. Fixes #8960	2017-10-23 17:13:49 -07:00
Brendan Dahl	fcc9943d04	Use charstring as plain text when lengthIV is -1. Fixes #7769	2017-10-18 14:19:59 -07:00
Tim van der Meij	509d3728f1	Merge pull request #8922 from Snuffleupagus/paintXObject-errors Allow `getOperatorList`/`getTextContent` to skip errors when parsing broken XObjects (issue 8702, issue 8704)	2017-10-07 15:46:26 +02:00
Tim van der Meij	f73c9b75d9	Transform Web Archive URLs to avoid downloading an HTML page instead of the PDF file Moreover, adjust one linked test case that did not conform to the standard Web Archive URL format and adjust one linked test case because the link was dead.	2017-09-30 19:50:31 +02:00
Jonas Jenwald	b1472cddbb	Allow `getOperatorList`/`getTextContent` to skip errors when parsing broken XObjects (issue 8702, issue 8704) This patch makes use of the existing `ignoreErrors` property in `src/core/evaluator.js`, see PRs 8240 and 8441, thus allowing us to attempt to recover as much as possible of a page even when it contains broken XObjects. Fixes 8702. Fixes 8704.	2017-09-29 17:14:21 +02:00
Brendan Dahl	18e2321845	Overwrite maxSizeOfInstructions in maxp with computed value. In issue #7507 the value is less than the actual max size of the glyph instructions, causing OTS to fail the font.	2017-09-25 17:53:26 -07:00
Jonas Jenwald	10727572a2	Merge pull request #8950 from timvandermeij/polygon-polyline-annotations Implement support for polyline and polygon annotations	2017-09-24 15:16:14 +02:00
Tim van der Meij	c69a7a83da	Merge pull request #8932 from janpe2/jbig2-sym-offset JBIG2 symbol offsets	2017-09-23 17:11:45 +02:00
Tim van der Meij	ed8c0ebfa7	Implement reference tests for polyline and polygon annotations	2017-09-23 17:01:19 +02:00
Jonas Jenwald	abc864fca9	Merge pull request #8938 from brendandahl/bug1392647 Use font's default width even when 0. (bug 1392647)	2017-09-20 22:38:39 +02:00
Brendan Dahl	10ba292b46	Use font's default width even when 0. Bug 1392647 has a PDF where the default width of the font is 0. It draws some charcodes that don't have glyphs, but we were wrongly using the 1000 default width for these charcodes causing some text to be overlapping.	2017-09-20 11:38:30 -07:00
Jani Pehkonen	5d1074c110	Fix JBIG2 symbol offsets in text regions	2017-09-19 23:43:23 +03:00
Jani Pehkonen	3d99b8d706	CCITTFaxStream problem when EndOfBlock is false	2017-09-19 22:19:40 +03:00
Tilman Hausherr	d75a497a6b	support tiff predictor for 16bit (for issue #6289) This does the same for 16 bit as the existing 8 bit tiff predictor code, an addition of the last word to this word. The last two "& 0xFF" may or may not be needed, I see this isn't done in the 8 bit code, but I'm not a JS developer.	2017-09-18 22:24:25 +02:00
Tim van der Meij	400e4aae0e	Implement support for stamp annotations	2017-09-16 16:37:50 +02:00
Jonas Jenwald	eece66fa3e	For /Filter entries containing `Name`s, ignore the /DecodeParms entry if it contains an Array (issue 8895)	2017-09-15 23:02:16 +02:00
Jonas Jenwald	f2618eb2e4	Merge pull request #8808 from janpe2/issue8741 Fix color of image masks inside uncolored patterns	2017-09-12 14:27:56 +02:00
Tim van der Meij	320779e6ed	Merge pull request #8691 from timvandermeij/square-circle-annotations Implement support for square and circle annotations	2017-09-09 22:56:54 +02:00
Tim van der Meij	c04f9d6098	Implement reference tests for square and circle annotations	2017-09-09 21:36:28 +02:00
Jonas Jenwald	7115e136e4	Hide unsupported `LinkAnnotation`s (issue 3897) Rather than displaying links that do nothing when clicked, it probably makes more sense to simply not render them instead. Especially since it turns out that, at least at this point in time, this is very easy to both implement and test. Fixes 3897.	2017-09-06 12:52:56 +02:00
Jani Pehkonen	86020396cb	Fix color of image masks inside uncolored patterns	2017-09-06 13:41:48 +03:00
Jonas Jenwald	49b8cd5a6a	Attempt to improve the `EI` detection heuristics, for inline images, in streams containing `NUL` bytes (issue 8823) Since this patch will now treat (some) `NUL` bytes as "ASCII", the number of `followingBytes` checked are thus increased to (hopefully) reduce the risk of introducing new false positives. Fixes 8823.	2017-08-27 12:48:28 +02:00
Tim van der Meij	798e46da97	Merge pull request #8821 from Snuffleupagus/issue-8798-reduced-test Replace the test-case for issue 8798 with a reduced one (PR 8800 follow-up)	2017-08-26 00:00:45 +02:00
Jonas Jenwald	4660cf8238	Prevent an infinite loop in `XRef.readXRef` by keeping track of already parsed tables (bug 1393476) With this patch, not only is the infinite loop prevented, but we're also able to actually render the file (which e.g. Adobe Reader isn't able to). Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1393476.	2017-08-24 19:18:08 +02:00
Jonas Jenwald	4891b9c7e0	Replace the test-case for issue 8798 with a reduced one (PR 8800 follow-up) Re: issue 8798 and PR 8800. Big thanks to @THausherr for providing the test-case.	2017-08-24 17:43:05 +02:00
Tim van der Meij	e9ba54940d	Merge pull request #8800 from Snuffleupagus/issue-8798 Try to recover if we reach the end of the stream when searching for the `EI` marker of an inline image (issue 8798)	2017-08-23 23:47:51 +02:00
Jonas Jenwald	cb55506b95	Try to recover if we reach the end of the stream when searching for the `EI` marker of an inline image (issue 8798)	2017-08-22 09:33:13 +02:00
Jani Pehkonen	9a581ee9ed	Implement JBIG2 halftone regions and pattern dictionaries	2017-08-08 15:38:29 +03:00
Brendan Dahl	0bef50d56d	Fix two cmap related issues. In issue #8707, there's a char code mapped to a non-existing glyph which shouldn't be drawn. However, we saw it was missing and then tried to use the post table and ended up mapping it incorrectly. This illuminated a problem with issue #5704 and bug 893730 where glyphs disappeared after the above fix. This was from the cmap returning the wrong glyph id, which in turn was caused by the font having multiple cmap tables of the same type and us choosing the last one. Now, we instead default to the first one. I'm unsure if we should instead be merging the multiple cmaps, but using only the first one works.	2017-08-03 22:19:36 -07:00
Jonas Jenwald	23ec6b16ca	Add a fallback for non-embedded SegoeUISymbol font (issue 8697) The PDF file uses a non-embedded SegoeUISymbol font, which is not a standard font (and is mainly used by Microsoft, see https://en.wikipedia.org/wiki/Segoe). Fixes 8697.	2017-07-25 12:45:11 +02:00
Jonas Jenwald	794b099385	Add a reduced test-case for issue 7696 Issue 7696 was one of the issues fixed by PR 8580. The other ones were all cases of missing glyphs, however in this particular one glyphs did render but every single one was incorrect. Hence it probably cannot hurt to have a small, reduced, reference test for that PDF file as well.	2017-07-24 09:55:16 +02:00
Rob Wu	01f03fe393	Optimize PNG compression in SVG backend on Node.js Use the environment's zlib implementation if available to get reasonably-sized SVG files when an XObject image is converted to PNG. The generated PNG is not optimal because we do not use a PNG predictor. Further, when our SVG backend is run in a browser, the generated PNG images will still be unnecessarily large (though the use of blob:-URLs when available should reduce the impact on memory usage). If we want to optimize PNG images in browsers too, we can either try to use a DEFLATE library such as pako, or re-use our XObject image painting logic in src/display/canvas.js. This potential improvement is not implemented by this commit. Tested with: - Node.js 8.1.3 (uses zlib) - Node.js 0.11.12 (uses zlib) - Node.js 0.10.48 (falls back to inferior existing implementation). - Chrome 59.0.3071.86 - Firefox 54.0 Tests: Unit test on Node.js: ``` $ gulp lib $ JASMINE_CONFIG_PATH=test/unit/clitests.json node ./node_modules/.bin/jasmine --filter=SVG ``` Unit test in browser: Run `gulp server` and open http://localhost:8888/test/unit/unit_test.html?spec=SVGGraphics To verify that the patch works as desired, ``` $ node examples/node/pdf2svg.js test/pdfs/xobject-image.pdf $ du -b svgdump/xobject-image-1.svg # ^ Calculates the file size. Confirm that the size is small # (784 instead of 80664 bytes). ```	2017-07-10 18:56:57 +02:00
Jonas Jenwald	ea71d23f74	Fix a stupid spelling error in the `ASCII85Decode` name in `Parser.makeInlineImage` (issue 8613) This is a trivial follow-up to PR 5383, and it's a bit strange that this has been wrong since late 2014 without anyone noticing (maybe because inline images aren't too common). So, apparently code works better if you actually spell correctly, who knew ;-) Fixes 8613.	2017-07-05 19:43:09 +02:00
Jonas Jenwald	eff257b820	Merge pull request #8580 from brendandahl/missing-glyf Fix how we detect and handle missing glyph data.	2017-07-04 12:16:07 +02:00
Brendan Dahl	efbbd8533f	Only mask char codes of (3, 0) cmap tables in the range of 0xF000 to 0xF0FF.	2017-07-03 13:13:46 -07:00
Brendan Dahl	6d4f748fb1	Fix how we detect and handle missing glyph data.	2017-07-03 13:06:06 -07:00
Brendan Dahl	a8a8909d2d	Fix missing notdef in expert encoding.	2017-06-29 12:12:39 -07:00
Brendan Dahl	f1f9d98519	Merge pull request #8507 from Snuffleupagus/issue-8480 Only special-case OpenType fonts with `CFF` data if it's both a composite (i.e. Type0) font and also has a non-default CID to GID map (issue 8480)	2017-06-23 13:36:58 -07:00
Rob Wu	fc6448d18c	Move svg:clipPath generation from clip to endPath In the PDF from issue 8527, the clip operator (W) shows up before a path is defined. The current SVG backend however expects a path to exist before generating a `<svg:clipPath>` element. In the example, the path was defined after the clip, followed by an endPath operator (n). So this commit fixes the bug by moving the path generation logic from clip to endPath. Our canvas backend appears to use similar logic: `CanvasGraphics_endPath` calls `consumePath`, which in turn draws the clip and resets the `pendingClip` state. The canvas backend calls `consumePath` from multiple other places, so we probably need to check whether doing so is also necessary for the SVG backend. I scanned our corpus of PDF files in test/pdfs, and found that in every instance (except for one), the "W" PDF operator (clip) is immediately followed by "n" (endPath). The new test from this commit (clippath.pdf) starts with "W", followed by a path definition and then "n". # Commands used to find some of the clipping commands: grep -ra '^W$' -C7 \| less -S grep -ra '^W ' -C7 \| less -S grep -ra ' W$' -C7 \| less -S test/pdfs/issue6413.pdf is the only file where "W" (at line 55) is not followed by "n". In fact, the "W" is the last operation of a series of XObject painting operations, and removing it does not have any effect on the rendered PDF (confirmed by looking at the output of PDF.js's canvas backend, and ImageMagick's convert command).	2017-06-22 01:08:17 +02:00
Jonas Jenwald	e589834f13	Ensure that `TilingPattern`s have valid (non-zero) /BBox arrays (issue 8330) Fixes 8330.	2017-06-09 21:41:48 +02:00
Jonas Jenwald	8b4a42e5b8	Only special-case OpenType fonts with `CFF` data if it's both a composite (i.e. Type0) font and also has a non-default CID to GID map (issue 8480) As mentioned the last time that I touched this particular part of the font code, I sincerely hope that this doesn't cause any regressions! However, the patch passes all tests added in PRs 5770, 6270, and 7904 (and obviously all other tests as well). Furthermore, I've manually checked all the issues/bugs referenced in those PRs without finding any issues. Fixes 8480.	2017-06-09 21:15:39 +02:00
Jonas Jenwald	4ce5e520fb	Add different code-paths to `{CMap, ToUnicodeMap}.charCodeOf` depending on length, since `Array.prototype.indexOf` can be extremely inefficient for very large arrays (issue 8372) Fixes 8372.	2017-05-24 19:47:04 +02:00
Jonas Jenwald	ac942ac657	Merge pull request #8437 from yurydelendik/default-ctx Resets canvas 2d context to the default state.	2017-05-23 23:31:57 +02:00
Yury Delendik	a67198895f	Resets canvas 2d context to the default state.	2017-05-23 15:10:30 -05:00
Jonas Jenwald	31c24ed631	Don't map glyphs to the HANGUL FILLER (0x3164) Unicode location (issue 8424) This patch follows a similar pattern as previous ones, by skipping certain problematic Unicode locations. According to http://searchfox.org/mozilla-central/rev/6c2dbacbba1d58b8679cee700fd0a54189e0cf1b/gfx/harfbuzz/src/hb-unicode-private.hh#136, it seems that the HANGUL FILLER (0x3164) location is "special". Fixes 8424.	2017-05-23 16:12:45 +02:00
Yury Delendik	5dc8dcdc0f	Merge pull request #8388 from Snuffleupagus/issue-8380 Cache JPEG images, just as we do for other image formats, in `evaluator.js` (issue 8380)	2017-05-17 17:25:51 -05:00
chris.greening	cfc2f36f5c	Adds additional parameter so background color of canvas can be set	2017-05-17 17:06:44 +01:00
Jonas Jenwald	0c2ebda31c	Cache JPEG images, just as we do for other image formats, in `evaluator.js` (issue 8380) For some reason, we're putting all kinds of images except JPEG into the `imageCache` in `evaluator.js`.[1] This means that in the PDF file in issue 8380, we'll keep sending the same two small images[2] to the main-thread and decoding them over and over. This is obviously hugely inefficient! As can be seen from the discussion in the issue, the performance becomes extremely bad if the user has the addon "Adblock Plus" installed. However, even in a clean Firefox profile, the performance isn't that great. This patch not only addresses the performance implications of the "Adblock Plus" addon together with that particular PDF file, but it also improves the rendering times considerably for all users. Locally, with a clean profile, the rendering times are reduced from `~2000 ms` to `~500 ms` for my setup! Obviously, the general structure of the PDF file and its operator sequence is still hugely inefficient, however I'd say that the performance with this patch is good enough to consider the issue (as it stands) resolved.[3] Fixes 8380. --- [1] Not technically true, since inline images are cached from `parser.js`, but whatever :-) [2] The two JPEG images have dimensions 1x2, respectively 4x2. [3] To make this even more efficient, a new state would have to be added to the `QueueOptimizer`. Given that PDF files this stupid fortunately aren't too common, I'm not convinced that it's worth doing.	2017-05-07 13:07:41 +02:00
Jonas Jenwald	40feca12c1	Ignore line-breaks between operator and digit in `Lexer.getNumber` This is consistent with the behaviour in Adobe Reader (and PDFium), and it fixes the display of page 30 in https://bug1354114.bmoattachments.org/attachment.cgi?id=8855457 (taken from https://bugzilla.mozilla.org/show_bug.cgi?id=1354114). The patch also makes the `error` message for invalid numbers slightly more useful, by including the charCode as well. (Having that information available would have reduced the time spent on debugging the PDF file above.)	2017-05-02 20:59:42 +02:00
Jani Pehkonen	64deb6c700	Subtract the X/Y offsets when decoding refinement regions of JBIG2 images (issue 7145, 7308, 7401, 7850, 8270) Please refer to the JBIG2 standard, see https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-T.88-200002-I!!PDF-E&type=items. In particular, section "6.3.5.3 Fixed templates and adaptive templates" mentions that the offsets should be subtracted; where the offsets are defined according to "Table 6" under section "6.3.2 Input parameters". Fixes 7145. Fixes 7308. Fixes 7401. Fixes 7850. Fixes 8270.	2017-04-26 16:06:15 +02:00
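
Commit d75a497a6b above describes the 16-bit TIFF predictor as "an addition of the last word to this word". A minimal sketch of that idea follows. It is not the pdf.js implementation: the function name `undoTiffPredictor16`, its parameters, and the assumption of big-endian samples in a single row are all hypothetical choices made for illustration.

```javascript
// Hypothetical helper, NOT taken from pdf.js: undoes horizontal differencing
// (TIFF predictor) for one row of 16-bit samples.
// Assumptions: `row` is a Uint8Array holding big-endian 16-bit samples,
// and `colors` is the number of components per pixel.
function undoTiffPredictor16(row, colors) {
  const out = new Uint8Array(row.length);
  const bytesPerPixel = colors * 2;
  for (let i = 0; i + 1 < row.length; i += 2) {
    // Read the stored 16-bit difference for this sample.
    let value = (row[i] << 8) | row[i + 1];
    if (i >= bytesPerPixel) {
      // Add the already-decoded sample of the same component in the
      // previous pixel, wrapping around at 16 bits.
      const prev = (out[i - bytesPerPixel] << 8) | out[i - bytesPerPixel + 1];
      value = (value + prev) & 0xffff;
    }
    // Write the decoded sample back as two big-endian bytes.
    out[i] = value >> 8;
    out[i + 1] = value & 0xff;
  }
  return out;
}

// Usage: three one-component pixels stored as 0x00FF, +1, +1.
const decoded = undoTiffPredictor16(
  new Uint8Array([0x00, 0xff, 0x00, 0x01, 0x00, 0x01]), 1);
console.log(Array.from(decoded));
```

Masking the sum with `0xffff` keeps each decoded sample within 16 bits, so the two bytes written back need no further masking beyond `& 0xff` on the low byte.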

877 Commits