pdf.js

Author	SHA1	Message	Date
Brendan Dahl	9b51cea724	Fix loca table when offsets aren't in ascending order.	2017-12-15 11:20:28 -06:00
Brendan Dahl	af1d80d45e	Merge pull request #9230 from Snuffleupagus/issue-9195 Add basic support for non-embedded Calibri fonts (issue 9195)	2017-12-08 10:15:43 -08:00
Jonas Jenwald	a5e3261b48	Merge pull request #9062 from mozilla/no_high Move char codes from high surrogate pair range into private use.	2017-12-08 12:31:22 +01:00
Brendan Dahl	306999c325	Move char codes from high surrogate pair range into private use. Fixes #2884	2017-12-07 10:35:50 -08:00
Jonas Jenwald	08de655177	Add basic support for non-embedded Calibri fonts (issue 9195) There are a number of issues with the fonts in the referenced PDF file. First of all, they contain broken `ToUnicode` data (`NUL` bytes all over the place). However, even if you skip those, the `ToUnicode` data appears to contain nothing but an `IdentityH` CMap, which won't help provide a proper glyph mapping. The real issue actually turns out to be that the PDF file uses the "Calibri" font[1], but doesn't include any font files. Since that one isn't a standard font, and uses a fairly different CID to GID map compared to the standard fonts, we're not able to render the file even remotely correctly. To work around this, I'm thus proposing that we include an (incomplete) glyph map for Calibri, and fall back to the standard Helvetica font. Obviously this isn't going to look perfect, but it's really the best that we can hope to achieve given that the PDF file is missing the necessary font data. Finally, please note that none of the PDF readers I've tried (Adobe Reader, PDFium in Chrome) were able to extract the text (which isn't very surprising, given the broken `ToUnicode` data). Fixes 9195. --- [1] According to Wikipedia, see https://en.wikipedia.org/wiki/Calibri, Calibri is (primarily) a Windows font.	2017-12-03 17:23:33 +01:00
Jonas Jenwald	f3c50fe2f9	Merge pull request #9192 from Snuffleupagus/issue-8229 Build a fallback `ToUnicode` map for simple fonts (issue 8229)	2017-11-30 10:27:32 +01:00
Tim van der Meij	e320243870	Merge pull request #9206 from janpe2/svg-inv-images Fix inverted 1-bit images in SVG backend	2017-11-28 22:46:43 +01:00
Jani Pehkonen	58b214eab3	Fix inverted 1-bit images in SVG backend	2017-11-28 21:24:27 +02:00
Jani Pehkonen	06d083b04b	Fix pattern-filled text	2017-11-28 19:40:22 +02:00
Jonas Jenwald	61e19bee43	Build a fallback `ToUnicode` map for simple fonts (issue 8229) In some fonts, the included `ToUnicode` data is incomplete, causing text-selection to not work properly. For simple fonts that contain encoding data, we can manually build a `ToUnicode` map to attempt to improve things. Please note that since we're currently using the `ToUnicode` data during glyph mapping, in an attempt to avoid rendering regressions, I purposely didn't want to amend the original `ToUnicode` data for this text-selection edge-case. Instead, I opted for the current solution, which will (hopefully) give slightly better text-extraction results in PDF files with incomplete `ToUnicode` data. According to the PDF specification, see [section 9.10.2](http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G8.1873172): > A conforming reader can use these methods, in the priority given, to map a character code to a Unicode value. > ... Reading that paragraph literally, it doesn't seem too unreasonable to use different methods for different charcodes. Fixes 8229.	2017-11-26 14:45:15 +01:00
Tim van der Meij	0fe80df2a7	Button widget annotations: implement support for pushbuttons	2017-11-26 14:09:48 +01:00
Jonas Jenwald	83e8398ff2	For non-embedded fonts, map softhyphen (0x00AD) to regular hyphen (0x002D) (issue 9084) In the PDF file, the `ToUnicode` data first maps the hyphen correctly, and then overwrites it to point to the softhyphen instead. That one cannot be rendered in browsers, and an empty space thus appears instead. Fixes 9084.	2017-10-31 13:26:04 +01:00
Jonas Jenwald	92fcfce685	Merge pull request #9082 from brendandahl/issue7562 Overwrite glyphs contour count if it's less than -1.	2017-10-30 20:44:01 +01:00
Brendan Dahl	17037b5e51	Overwrite glyphs contour count if it's less than -1. The test pdf has a contour count of -70, but OTS doesn't like values less than -1. Fixes issue #7562.	2017-10-30 09:16:51 -07:00
Jonas Jenwald	d71a576b30	Merge pull request #9045 from brendandahl/sani-name Sanitize name index in compile phase of CFF.	2017-10-24 11:48:03 +02:00
Brendan Dahl	6b12612a52	Sanitize name index in compile phase of CFF. Fixes #8960	2017-10-23 17:13:49 -07:00
Brendan Dahl	fcc9943d04	Use charstring as plain text when lengthIV is -1. Fixes #7769	2017-10-18 14:19:59 -07:00
Tim van der Meij	509d3728f1	Merge pull request #8922 from Snuffleupagus/paintXObject-errors Allow `getOperatorList`/`getTextContent` to skip errors when parsing broken XObjects (issue 8702, issue 8704)	2017-10-07 15:46:26 +02:00
Tim van der Meij	f73c9b75d9	Transform Web Archive URLs to avoid downloading an HTML page instead of the PDF file Moreover, adjust one linked test case that did not conform to the standard Web Archive URL format and adjust one linked test case because the link was dead.	2017-09-30 19:50:31 +02:00
Jonas Jenwald	b1472cddbb	Allow `getOperatorList`/`getTextContent` to skip errors when parsing broken XObjects (issue 8702, issue 8704) This patch makes use of the existing `ignoreErrors` property in `src/core/evaluator.js`, see PRs 8240 and 8441, thus allowing us to attempt to recover as much as possible of a page even when it contains broken XObjects. Fixes 8702. Fixes 8704.	2017-09-29 17:14:21 +02:00
Brendan Dahl	18e2321845	Overwrite maxSizeOfInstructions in maxp with computed value. In issue #7507 the value is less than the actual max size of the glyph instructions, causing OTS to fail the font.	2017-09-25 17:53:26 -07:00
Jonas Jenwald	10727572a2	Merge pull request #8950 from timvandermeij/polygon-polyline-annotations Implement support for polyline and polygon annotations	2017-09-24 15:16:14 +02:00
Tim van der Meij	c69a7a83da	Merge pull request #8932 from janpe2/jbig2-sym-offset JBIG2 symbol offsets	2017-09-23 17:11:45 +02:00
Tim van der Meij	ed8c0ebfa7	Implement reference tests for polyline and polygon annotations	2017-09-23 17:01:19 +02:00
Jonas Jenwald	abc864fca9	Merge pull request #8938 from brendandahl/bug1392647 Use font's default width even when 0. (bug 1392647)	2017-09-20 22:38:39 +02:00
Brendan Dahl	10ba292b46	Use font's default width even when 0. Bug 1392647 has a PDF where the default width of the font is 0. It draws some charcodes that don't have glyphs, but we were wrongly using the 1000 default width for these charcodes, causing some text to overlap.	2017-09-20 11:38:30 -07:00
Jani Pehkonen	5d1074c110	Fix JBIG2 symbol offsets in text regions	2017-09-19 23:43:23 +03:00
Jani Pehkonen	3d99b8d706	CCITTFaxStream problem when EndOfBlock is false	2017-09-19 22:19:40 +03:00
Tilman Hausherr	d75a497a6b	support tiff predictor for 16bit (for issue #6289) This does the same for 16 bit as the existing 8 bit tiff predictor code, an addition of the last word to this word. The last two "& 0xFF" may or may not be needed, I see this isn't done in the 8 bit code, but I'm not a JS developer.	2017-09-18 22:24:25 +02:00
Tim van der Meij	400e4aae0e	Implement support for stamp annotations	2017-09-16 16:37:50 +02:00
Jonas Jenwald	eece66fa3e	For /Filter entries containing `Name`s, ignore the /DecodeParms entry if it contains an Array (issue 8895)	2017-09-15 23:02:16 +02:00
Jonas Jenwald	f2618eb2e4	Merge pull request #8808 from janpe2/issue8741 Fix color of image masks inside uncolored patterns	2017-09-12 14:27:56 +02:00
Tim van der Meij	320779e6ed	Merge pull request #8691 from timvandermeij/square-circle-annotations Implement support for square and circle annotations	2017-09-09 22:56:54 +02:00
Tim van der Meij	c04f9d6098	Implement reference tests for square and circle annotations	2017-09-09 21:36:28 +02:00
Jonas Jenwald	7115e136e4	Hide unsupported `LinkAnnotation`s (issue 3897) Rather than displaying links that do nothing when clicked, it probably makes more sense to simply not render them instead. Especially since it turns out that, at least at this point in time, this is very easy to both implement and test. Fixes 3897.	2017-09-06 12:52:56 +02:00
Jani Pehkonen	86020396cb	Fix color of image masks inside uncolored patterns	2017-09-06 13:41:48 +03:00
Jonas Jenwald	49b8cd5a6a	Attempt to improve the `EI` detection heuristics, for inline images, in streams containing `NUL` bytes (issue 8823) Since this patch will now treat (some) `NUL` bytes as "ASCII", the number of `followingBytes` checked is thus increased to (hopefully) reduce the risk of introducing new false positives. Fixes 8823.	2017-08-27 12:48:28 +02:00
Tim van der Meij	798e46da97	Merge pull request #8821 from Snuffleupagus/issue-8798-reduced-test Replace the test-case for issue 8798 with a reduced one (PR 8800 follow-up)	2017-08-26 00:00:45 +02:00
Jonas Jenwald	4660cf8238	Prevent an infinite loop in `XRef.readXRef` by keeping track of already parsed tables (bug 1393476) With this patch, not only is the infinite loop prevented, but we're also able to actually render the file (which e.g. Adobe Reader isn't able to). Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1393476.	2017-08-24 19:18:08 +02:00
Jonas Jenwald	4891b9c7e0	Replace the test-case for issue 8798 with a reduced one (PR 8800 follow-up) Re: issue 8798 and PR 8800. Big thanks to @THausherr for providing the test-case.	2017-08-24 17:43:05 +02:00
Tim van der Meij	e9ba54940d	Merge pull request #8800 from Snuffleupagus/issue-8798 Try to recover if we reach the end of the stream when searching for the `EI` marker of an inline image (issue 8798)	2017-08-23 23:47:51 +02:00
Jonas Jenwald	cb55506b95	Try to recover if we reach the end of the stream when searching for the `EI` marker of an inline image (issue 8798)	2017-08-22 09:33:13 +02:00
Jani Pehkonen	9a581ee9ed	Implement JBIG2 halftone regions and pattern dictionaries	2017-08-08 15:38:29 +03:00
Brendan Dahl	0bef50d56d	Fix two cmap related issues. In issue #8707, there's a char code mapped to a non-existing glyph which shouldn't be drawn. However, we saw it was missing and tried to then use the post table and ended up mapping it incorrectly. This illuminated a problem with issue #5704 and bug 893730 where glyphs disappeared after the above fix. This was from the cmap returning the wrong glyph id, which in turn was caused by the font having multiple cmap tables of the same type and us choosing the last one. Now, we instead default to the first one. I'm unsure if we should instead be merging the multiple cmaps, but using only the first one works.	2017-08-03 22:19:36 -07:00
Jonas Jenwald	23ec6b16ca	Add a fallback for non-embedded SegoeUISymbol font (issue 8697) The PDF file uses a non-embedded SegoeUISymbol font, which is not a standard font (and is mainly used by Microsoft, see https://en.wikipedia.org/wiki/Segoe). Fixes 8697.	2017-07-25 12:45:11 +02:00
Jonas Jenwald	794b099385	Add a reduced test-case for issue 7696 Issue 7696 was one of the issues fixed by PR 8580. The other ones were all cases of missing glyphs, however in this particular one glyphs did render but every single one was incorrect. Hence it probably cannot hurt to have a small, reduced, reference test for that PDF file as well.	2017-07-24 09:55:16 +02:00
Rob Wu	01f03fe393	Optimize PNG compression in SVG backend on Node.js Use the environment's zlib implementation if available to get reasonably-sized SVG files when an XObject image is converted to PNG. The generated PNG is not optimal because we do not use a PNG predictor. Further, when our SVG backend is run in a browser, the generated PNG images will still be unnecessarily large (though the use of blob:-URLs when available should reduce the impact on memory usage). If we want to optimize PNG images in browsers too, we can either try to use a DEFLATE library such as pako, or re-use our XObject image painting logic in src/display/canvas.js. This potential improvement is not implemented by this commit. Tested with: - Node.js 8.1.3 (uses zlib) - Node.js 0.11.12 (uses zlib) - Node.js 0.10.48 (falls back to inferior existing implementation). - Chrome 59.0.3071.86 - Firefox 54.0 Tests: Unit test on Node.js: ``` $ gulp lib $ JASMINE_CONFIG_PATH=test/unit/clitests.json node ./node_modules/.bin/jasmine --filter=SVG ``` Unit test in browser: Run `gulp server` and open http://localhost:8888/test/unit/unit_test.html?spec=SVGGraphics To verify that the patch works as desired, ``` $ node examples/node/pdf2svg.js test/pdfs/xobject-image.pdf $ du -b svgdump/xobject-image-1.svg # ^ Calculates the file size. Confirm that the size is small # (784 instead of 80664 bytes). ```	2017-07-10 18:56:57 +02:00
Jonas Jenwald	ea71d23f74	Fix a stupid spelling error in the `ASCII85Decode` name in `Parser.makeInlineImage` (issue 8613) This is a trivial follow-up to PR 5383, and it's a bit strange that this has been wrong since late 2014 without anyone noticing (maybe because inline images aren't too common). So, apparently code works better if you actually spell correctly, who knew ;-) Fixes 8613.	2017-07-05 19:43:09 +02:00
Jonas Jenwald	eff257b820	Merge pull request #8580 from brendandahl/missing-glyf Fix how we detect and handle missing glyph data.	2017-07-04 12:16:07 +02:00
Brendan Dahl	efbbd8533f	Only mask char codes of (3, 0) cmap tables in the range of 0xF000 to 0xF0FF.	2017-07-03 13:13:46 -07:00
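
Commit d75a497a6b above describes the 16-bit TIFF predictor as "an addition of the last word to this word". The following is a minimal sketch of that idea for a single row, not the code from the commit: the function name `undoTiffPredictor16` and its parameters are hypothetical, and it assumes big-endian 16-bit samples with `colors` components per pixel.

```javascript
// Hypothetical helper for illustration only; not the pdf.js implementation.
// Undoes TIFF horizontal differencing (predictor 2) on one row of
// big-endian 16-bit samples. `colors` is the number of components per pixel.
function undoTiffPredictor16(row, colors) {
  const out = new Uint8Array(row.length);
  const bytesPerPixel = colors * 2;

  // The first pixel of a row is stored as-is.
  for (let i = 0; i < bytesPerPixel && i < row.length; i++) {
    out[i] = row[i];
  }

  // Every later sample is stored as a difference from the sample of the
  // same component in the previous pixel, so add the two 16-bit words.
  for (let i = bytesPerPixel; i + 1 < row.length; i += 2) {
    const prev = (out[i - bytesPerPixel] << 8) | out[i - bytesPerPixel + 1];
    const delta = (row[i] << 8) | row[i + 1];
    const sum = (prev + delta) & 0xffff; // wrap around at 16 bits
    out[i] = sum >> 8;
    out[i + 1] = sum & 0xff;
  }
  return out;
}

// One component: samples 0x00FF, +0x0001, +0xFFFF (i.e. -1 modulo 65536).
const decoded = undoTiffPredictor16(
  new Uint8Array([0x00, 0xff, 0x00, 0x01, 0xff, 0xff]), 1);
console.log(Array.from(decoded)); // [0, 255, 1, 0, 0, 255]
```

On the commit message's question about the trailing `& 0xFF`: when the destination is a `Uint8Array`, stored values are reduced modulo 256 automatically, so in this sketch the low-byte mask is there for readability rather than correctness. The `& 0xffff` wrap is likewise only needed if the 16-bit sum is used as a number before being split into bytes.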

792 Commits