pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	ae2cc9119b	Add a couple more, mostly `text`, reference tests for non-embedded symbolic fonts without included encoding information I've started to look into how we can fix issue 7580, but quickly became worried that fixing it could easily mean that we'd trade one fixed PDF file for a multitude of broken ones. Hence I started going through the history of the code that chooses the fallback encoding, and noticed that it has been changed a number of times over the years to deal with various cases of weirdness/errors in non-embedded fonts. To my relief it turned out that almost all the PRs, please see a possibly incomplete [list here], that changed this code actually included `eq` test-cases. However, in one case it appears that a PR failed to add a test-case. Furthermore, since the fallback encoding may also be the only source for creating a `toUnicode` map, changing the encoding could possibly regress only the text-selection despite a PDF file still rendering correctly. Therefore, this PR adds one new `eq` test, and also a number of additional `text` tests for PDF files already present in the test-suite. Note that it's obviously possible that there's a certain overlap between the added tests, but I'd be a whole lot more concerned with causing regressions.	2016-09-11 16:38:39 +02:00
Jonas Jenwald	0b75f63c03	Don't duplicate the first entry in the `charCodeToGlyphId` map for CIDFontType2 fonts with a `CIDToGIDMap` that already mapped the first entry to a non-zero `glyphId` (issue 7544) Fixes 7544.	2016-09-09 22:33:41 +02:00
Tim van der Meij	b112f9f9f4	Merge pull request #7600 from Snuffleupagus/issue-7598 Check that Type1C fonts do not actually contain OpenType font files (issue 7598)	2016-09-09 22:02:58 +02:00
Tim van der Meij	e281ce7c73	Enable regression testing for interactive forms	2016-09-07 16:50:44 +02:00
Jonas Jenwald	44b75c01a1	Check that Type1C fonts do not actually contain OpenType font files (issue 7598) This patch is yet another instalment in the (never ending) series of patches for PDF files that specify completely incorrect Type/Subtype for their fonts. In this case Type1/Type1C, when in fact OpenType would have been correct. Fixes 7598.	2016-09-06 10:13:11 +02:00
Tim van der Meij	d03651efff	Merge pull request #7407 from Snuffleupagus/issue-7406 Assign the `quantizationTables` after parsing the entire JPEG image, to prevent issues when the DQT (Define Quantization Tables) marker is encountered after SOF{n} (Start of Frame) markers (issue 7406)	2016-09-04 14:49:01 +02:00
Tim van der Meij	6bb95e3129	Merge pull request #7539 from jeremypress/fairexpand [api-minor] Expanding divs to improve selection	2016-09-01 17:43:31 +02:00
Jeremy Press	1ceeb4d17b	added text enhancement regression tests	2016-08-31 09:54:52 -07:00
Jonas Jenwald	3ac23200ba	Add a reduced test-case for issue 7406 The PDF file contains an image that we're allowed to use, since it's just the PDF.js logo. The logo image was simply inverted (so that it requires a /Decode entry in the image dictionary that triggers the use of `jpg.js` instead of the browser), converted to JPEG, and finally edited by hand to change the order of the DQT/SOF{n} markers.	2016-08-31 18:42:07 +02:00
Yury Delendik	ffa99397ad	Merge pull request #7387 from Snuffleupagus/issue-5808 Attempt to ignore multiple identical Tf (setFont) commands in `PartialEvaluator_getTextContent` (issue 5808)	2016-08-30 15:21:41 -05:00
Jonas Jenwald	544d29f5cb	Add a `recoveryMode` that suppresses errors from the `Parser`, and utilize it when searching for the main trailer in `XRef_indexObjects` (bug 1250079) Instead of having `Parser_getObj` fail unconditionally for the referenced PDF file, this patch attempts to let searching for the main trailer continue even if there are errors. Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1250079.	2016-08-17 12:37:35 +02:00
Jonas Jenwald	77c6ed5389	Attempt to ignore multiple identical Tf (setFont) commands in `PartialEvaluator_getTextContent` (issue 5808) This patch improves the performance of issue 5808, but I'm not sure if it's enough to call it fixed. On average, this patch reduces the number of textLayer divs by a factor of 3, and it also reduces the time spent in `getTextContent` by a factor of ~2. The PDF file is generated by `Scribus PDF`, which for reasons I cannot understand is placing redundant `Tf` commands before every showText command. Note how the PDF file also contains lots of (basically) identical fonts, but with slightly different names, which causes unnecessary font-switching. This causes some unnecessary breaking of textLayer divs, but this issue cannot be easily worked around.	2016-07-27 21:37:52 +02:00
Jonas Jenwald	558a22cd02	Prevent errors when parsing Annotations with missing (or invalid) /Subtype entries (issue 7446) Note that I used a separate warning message for this case, instead of utilizing the same one as in the unsupported subtype case, to more clearly indicate that the PDF file itself is to blame rather than PDF.js. Fixes 7446.	2016-07-25 13:59:26 +02:00
Brendan Dahl	5678486802	Merge pull request #7347 from Snuffleupagus/evaluator-more-Ref_toString Slightly refactor the `fontRef` handling in `PartialEvaluator_loadFont` (issue 7403 and issue 7402)	2016-07-22 17:21:47 -07:00
Brendan Dahl	50d6e4f147	Merge pull request #7447 from Snuffleupagus/buildToUnicode-notdef Ignore .notdef in the `differences` array when building a fallback `toUnicode` map in `PartialEvaluator_buildToUnicode` (issue 5256)	2016-07-22 14:33:32 -07:00
Jonas Jenwald	4fe891c5e7	Add a reduced test-case for issue 7403	2016-07-21 16:04:07 +02:00
Tim van der Meij	10f9f11ec4	Merge pull request #7490 from Snuffleupagus/issue-7426 Don't map glyphs to the Lepcha Unicode block (issue 7426)	2016-07-21 14:39:19 +02:00
Jonas Jenwald	90d19de935	Catch errors and continue parsing in `parseCMap` (issue 7492) After PR 7039, the PDF file in issue 7492 no longer renders at all, but note that text selection wasn't working correctly previously. The problem with the PDF file in issue 7492 is that the `cMap`, in the `toUnicode` entry in the font, contains an invalid name: ``` /CMapName /-usr-share-fonts-truetype-Panton-Panton Family-Fontfabric - Panton.otf,000-UTF16 def ``` When we parse that line, things obviously break because there are spaces present in the wrong places. To avoid that issue, the patch simply lets `parseCMap` continue when errors are encountered, to try and recover usable data. Note that by not aborting immediately when an error is encountered, we are also able to fix the text selection. Obviously, it could be argued that we should just immediately reject a corrupt `cMap`. But given that they usually are correct, it seems that trying to recover as much data as possible from a corrupt one can only be a good thing for both glyph mapping and text selection. Fixes 7492.	2016-07-18 16:39:56 +02:00
Jonas Jenwald	64783c8b6e	Don't map glyphs to the Lepcha Unicode block (issue 7426) In the PDF file in the issue, some of the glyphs end up being mapped to the Lepcha Unicode block; see https://en.wikipedia.org/wiki/Lepcha_(Unicode_block). This didn't use to matter, but after HarfBuzz updates that improved support for Lepcha fonts, in particular https://bugzilla.mozilla.org/show_bug.cgi?id=1249861, some glyphs are now moved horizontally. To avoid that, this patch adds the Lepcha block to the list of Unicode ranges that we skip when building the glyph mapping. Fixes 7426.	2016-07-17 16:53:36 +02:00
Brendan Dahl	1f3f4a8dd7	Merge pull request #7441 from Snuffleupagus/issue-7439 Fallback to attempt to recover standard glyph names when amending the `charCodeToGlyphId` with entries from the `differences` array in `type1FontGlyphMapping` (issue 7439)	2016-07-06 13:02:21 -07:00
Jonas Jenwald	bdd58ab1d2	Ignore .notdef in the `differences` array when building a fallback `toUnicode` map in `PartialEvaluator_buildToUnicode` (issue 5256) Fixes 5256.	2016-06-27 16:20:23 +02:00
Jonas Jenwald	7866109af9	Fallback to attempt to recover standard glyph names when amending the `charCodeToGlyphId` with entries from the `differences` array in `type1FontGlyphMapping` (issue 7439) Fixes 7439.	2016-06-25 14:54:34 +02:00
Jonas Jenwald	6a0b047bfa	Add upper-case `I` as a possible space replacement fallback in `Font.spaceWidth` to improve text-selection (issue 7180) In fonts with only upper-case glyphs, that are also missing a space glyph, `get spaceWidth` won't be able to return anything useful. By adding upper-case `I` as a fallback, we can thus improve text-selection in some PDF files. Note that locally, the patch causes slight movement in a few existing `text` tests, but in my opinion this actually looks like slight improvements. Fixes 7180.	2016-06-07 22:55:25 +02:00
Jonas Jenwald	6260fc09a3	Attempt to recover valid `format 3` FDSelect data from broken CFF fonts (bug 1146106) According to the CFF specification, see http://partners.adobe.com/public/developer/en/font/5176.CFF.pdf#G3.46884, for `format 3` FDSelect data: "The first range must have a ‘first’ GID of 0". Since the PDF file (attached in the bug) violates that part of the specification, this patch tries to recover valid FDSelect data to prevent OTS from rejecting the font. Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1146106.	2016-06-06 18:20:52 +02:00
Jonas Jenwald	b02d560ae0	Fix errors in `setGState` in `PartialEvaluator_getTextContent` that prevents text-selection from working properly Currently `setGState` is completely broken, and looking through the history of that code, it seems to me that this may never have worked correctly. This patch fixes the text-selection in `extgstate.pdf` in the test-suite, which is also added as a `text` test.	2016-06-01 22:58:49 +02:00
Jonas Jenwald	98fe094d18	Let non-viewable Popup Annotations inherit the parent's Annotation Flags if the parent is viewable Fixes http://www.pdf-archive.com/2013/09/30/file2/file2.pdf. Note how it's not possible to show the various Popup Annotations in the above document. To fix that, this patch lets the Popup inherit the flags of the parent, in the special case where the parent is `viewable` and the Popup is not. In general, I don't think that a Popup must have the same flags set as the parent. However, it seems very strange to have a `viewable` parent annotation, and then not be able to view the Popup. Annoyingly the PDF specification doesn't, as far as I can find, mention anything about how this case should be handled, but this patch seems consistent with the actual behaviour in Adobe Reader.	2016-05-25 23:00:26 +02:00
Brendan Dahl	b86610ffdb	Merge pull request #7300 from Snuffleupagus/bug-1068432 Prevent adding invalid values in `CFFDict_setByKey` (bug 1068432)	2016-05-24 12:12:38 -07:00
Jonas Jenwald	7ddb0bc718	Attempt to combine text runs positioned with `setTextMatrix`	2016-05-18 17:21:58 +02:00
Jonas Jenwald	182d33800a	Ignore 'endobj' commands inside of `ObjStm` streams (issue 5241, bug 898610, bug 1037816) According to an example in the PDF specification, see http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#page=56, an `ObjStm` stream should not contain 'endobj' commands. Fixes 5241. Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=898610. Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1037816.	2016-05-09 09:50:45 +02:00
Jonas Jenwald	c9b6de3b16	Prevent adding invalid values in `CFFDict_setByKey` (bug 1068432) In the font in question, there are a couple of `topDict` entries that have invalid values (`0xF 0xF`, i.e. just eof markers without any actual numbers). This causes the `parseFloatOperand` function, inside `CFFParser_parseDict`, to return `NaN`. Currently we pass this broken font onto the browser, which OTS unsurprisingly rejects. Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1068432.	2016-05-07 21:09:58 +02:00
Jonas Jenwald	293901d7e5	Add a (linked) test-case for issue 3248	2016-04-21 16:36:46 +02:00
Jonas Jenwald	e281ef15db	Adjust incorrect first obj number of "free" xref entry in `XRef_readXRefTable` (issue 7229) Fixes 7229.	2016-04-21 16:36:32 +02:00
Jonas Jenwald	079b563e2d	Ensure that the `params` parameter of the `PredictorStream` is a dictionary (issue 7200) Fixes 7200.	2016-04-15 16:30:18 +02:00
Yury Delendik	398e6acbc5	Stops bleeding of pattern edges for mesh.	2016-04-11 18:21:44 -05:00
Yury Delendik	d76db416f4	Adds more SMask tests.	2016-04-11 08:02:06 -05:00
Yury Delendik	ff3ce973b8	Merge pull request #7106 from Snuffleupagus/issue-7101 Keep track of the character to glyph mapping in font_renderer.js, to prevent errors when different characters point to the same glyph (issue 7101)	2016-04-01 08:09:21 -05:00
Jonas Jenwald	05cf709f8e	Parse Type1 font files to determine the various `Length{n}` properties, instead of trusting the PDF file (issue 5686, issue 3928) Fixes 5686. Fixes 3928.	2016-03-31 11:08:12 +02:00
Jonas Jenwald	17aaa125df	Keep track of the character to glyph mapping in font_renderer.js, to prevent errors when different characters point to the same glyph (issue 7101) Fixes 7101.	2016-03-30 11:33:04 +02:00
Jonas Jenwald	13d7a5070e	Prevent failures in the Annotation code if the `Rect` array contains indirect objects (issue 7115) Note that in the PDF files provided by the reporter, this issue was limited to `Rect` arrays in AcroForm entries (which we currently don't support). However, since a bad PDF generator could create this problem in any kind of annotation, the reduced test-case included here uses a simple LinkAnnotation instead. Fixes 7115.	2016-03-26 20:55:16 +01:00
Yury Delendik	a505aa8e90	Disables issue6961 test.	2016-03-25 12:48:11 -05:00
Jonas Jenwald	dfe9015a43	Convert `uniXXXX` glyph names to proper ones when building the `charCodeToGlyphId` map for TrueType fonts (bug 1132849, issue 6893, issue 6894) This patch adds a `getUnicodeForGlyph` helper function, which is used to recover Unicode values for non-standard glyph names. Some PDF generators, e.g. Scribus PDF, use improper `uniXXXX` glyph names which breaks the glyph mapping. We can avoid this by converting them to "standard" glyph names instead. Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1132849. Fixes 6893. Fixes 6894.	2016-03-09 19:37:15 +01:00
Preetham Mysore	be1e12dbcb	Fix for descent calculation while reading font hhea headers	2016-03-03 08:51:41 -05:00
Jonas Jenwald	8402c79171	Merge pull request #7050 from brendandahl/issue4402 For CIDFontType2 use CID as glyph ID when missing CID to GID map.	2016-03-02 10:11:42 +01:00
Brendan Dahl	a6acf74b54	Merge pull request #7023 from brendandahl/issue6721 Only draw glyphs on canvas if they are in the font or the font file is missing.	2016-03-01 18:03:37 -08:00
Brendan Dahl	6e1d131384	For CIDFontType2 use CID as glyph ID when missing CID to GID map.	2016-03-01 17:05:33 -08:00
Brendan Dahl	ff87f3fb86	Only draw glyphs on canvas if they are in the font or the font file is missing.	2016-03-01 13:24:58 -08:00
Jonas Jenwald	505f15f221	Avoid accidentally getting the entire font file in `readNameTable` (issue 7020) In the PDF file in question, some of the 'name' table entries have `record.length === 0`. This becomes problematic in the non-unicode case, since `font.getBytes(0)` will fetch the entire stream. Given that OTS rejects 'name' entries larger than `2^16`, this explains the sanitizer errors. Fixes 7020.	2016-03-01 21:59:49 +01:00
Tim van der Meij	ad31e52a26	Group popup creation code and apply it to more annotation types	2016-02-25 00:35:45 +01:00
Jonas Jenwald	41efb92d3a	Merge pull request #6988 from timvandermeij/fileattachment-annotation Implement support for FileAttachment annotations	2016-02-24 12:58:06 +01:00
Tim van der Meij	10902fd882	Implement unit and reference testing for FileAttachment annotations	2016-02-23 22:49:53 +01:00
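
The `uniXXXX` conversion described in commit dfe9015a43 can be illustrated with a small sketch. This is not the PDF.js code: the function name `unicodeFromUniName` is hypothetical, and the sketch assumes glyph names of the form `uni` followed by exactly four upper-case hexadecimal digits, returning `-1` for anything else.

```javascript
// Hypothetical helper (not the PDF.js `getUnicodeForGlyph` implementation).
// Recovers a Unicode code point from a `uniXXXX` glyph name, or returns -1.
function unicodeFromUniName(glyphName) {
  // Assumption: only the four-digit, upper-case hex form is accepted.
  const match = /^uni([0-9A-F]{4})$/.exec(glyphName);
  if (!match) {
    return -1;
  }
  const code = parseInt(match[1], 16);
  // Lone surrogates are not valid characters, so reject them.
  if (code >= 0xd800 && code <= 0xdfff) {
    return -1;
  }
  return code;
}

console.log(unicodeFromUniName("uni20AC")); // 8364 (U+20AC)
console.log(unicodeFromUniName("Euro")); // -1, not a `uniXXXX` name
```

A caller could then look up a standard glyph name for the returned code point; that lookup table is outside the scope of this sketch.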


757 Commits
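
The idea behind commit c9b6de3b16 (refusing to store invalid values such as `NaN` in a CFF dictionary) can be sketched as below. This is a simplified illustration under stated assumptions, not the actual `CFFDict_setByKey` code: the class `SketchDict`, its `values` map, and the boolean return value are all hypothetical.

```javascript
// Hypothetical stand-in for a CFF dictionary; all names are illustrative.
class SketchDict {
  constructor() {
    this.values = new Map();
  }

  // Stores `operands` under `key` and returns true, or returns false
  // without storing anything if the operands contain an invalid value.
  setByKey(key, operands) {
    if (operands.length === 0) {
      return false; // Nothing to store.
    }
    for (const value of operands) {
      // A broken font can yield NaN when its number data is malformed.
      if (typeof value !== "number" || Number.isNaN(value)) {
        return false;
      }
    }
    this.values.set(key, operands);
    return true;
  }
}

const dict = new SketchDict();
dict.setByKey(5, [0, 0, 1000, 1000]); // stored
dict.setByKey(6, [NaN]); // rejected, the dictionary is left unchanged
```

Rejecting the entry at insertion time means that later code serializing the dictionary never sees the invalid value.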