pdf.js

Author	SHA1	Message	Date
Brendan Dahl	0bef50d56d	Fix two cmap related issues. In issue #8707, there's a char code mapped to a non- existing glyph which shouldn't be drawn. However, we saw it was missing and tried to then use the post table and end up mapping it incorrectly. This illuminated a problem with issue #5704 and bug 893730 where glyphs disappeared after above fix. This was from the cmap returning the wrong glyph id. Which in turn was caused because the font had multiple of the same type of cmap table and we were choosing the last one. Now, we instead default to the first one. I'm unsure if we should instead be merging the multiple cmaps, but using only the first one works.	2017-08-03 22:19:36 -07:00
Yury Delendik	a1dfbec532	Properly cancel streams and guard at getTextContent.	2017-08-03 16:36:46 -05:00
Jonas Jenwald	e20d4a9c21	Merge pull request #8681 from brendandahl/glyph-ids Fix several issues with glyph id mappings (issue 8668, bug 1383504)	2017-08-03 14:25:34 +02:00
Brendan Dahl	5b7f712ca7	Merge pull request #8627 from yurydelendik/issue-8591 Fallback on font widths if CFF data is broken	2017-08-02 10:53:14 -07:00
Apoorv Mishra	a129de7bd1	Add unit-tests for colorspace.js Added unit-tests for DeviceGray, DeviceRGB and DeviceCMYK Added unit-tests for CalGray Added unit-tests for CalRGB Removed redundant code Added unit-tests for LabCS Added unit-tests for IndexedCS Update comment Change lookup to Uint8Array as mentioned in pdf specs(these tests will pass after PR #8666 is merged). Added unit-tests for AlternateCS Resolved code-style issues Fixed code-style issues Addressed issues pointed out in https://github.com/mozilla/pdf.js/pull/8611#pullrequestreview-52865469	2017-07-28 14:24:56 +05:30
Yury Delendik	343b4dc2b6	Merge pull request #8617 from mukulmishra18/network-streaming Adds Streams API support for networking task of PDF.js project.	2017-07-27 16:15:06 -05:00
Mukul Mishra	109106794d	Adds Streams API support for networking task of PDF.js project. network.js file moved to main thread and `PDFNetworkStream` implemented at worker thread, that is used to ask for data whenever worker needs.	2017-07-28 02:32:30 +05:30
Brendan Dahl	ac33358e1f	Fix several issues with glyph id mappings. The initial issue with #8255 was I added a missing glyphs check to adjustMapping, but this caused us to skip re-mapping a glyph if the fontCharCode was a missingGlyph which in turn caused us to overwrite a valid glyph id with an invalid one. While fixing this, I also added a warning if the private use area is full since this also accidentally happened when I made a different mistake. This brought to light a number of issues where we map missing glyphs to notdef, but often the notdef is actually defined and then ends up being drawn. Now the glyphs don't get mapped in toFontChar and so they are not drawn by the canvas. Fixing the above brought up another issue though in bug1050040.pdf. In this PDF, the font fails to load by the browser and before we were still drawing the glyphs because it looked like the font had them, but with the fixes above the glyphs showed up as missing so we didn't attempt draw them. To fix this, I now throw an error when the loca table is in really bad shape and we fall back to trying to use a system font. We now also use this fall back if there are any format errors during converting fonts.	2017-07-26 13:00:55 -07:00
Tim van der Meij	37ac8f8623	Merge pull request #8698 from Snuffleupagus/issue-8697 Add a fallback for non-embedded SegoeUISymbol font (issue 8697)	2017-07-25 22:35:52 +02:00
Tim van der Meij	44a5cec25e	Merge pull request #8666 from apoorv-mishra/fix-colorspace Fix TypeError that occurs in colorspace.js on accidentally passing an 'Array' instead of 'TypedArray'	2017-07-25 22:13:20 +02:00
Yury Delendik	c830021b07	Fixes CFF data glyph widths	2017-07-25 12:29:51 -05:00
Jonas Jenwald	23ec6b16ca	Add a fallback for non-embedded SegoeUISymbol font (issue 8697) The PDF file uses a non-embedded SegoeUISymbol font, which is not a standard font (and is mainly used by Microsoft, see https://en.wikipedia.org/wiki/Segoe). Fixes 8697.	2017-07-25 12:45:11 +02:00
Tim van der Meij	af71ea7a7d	Merge pull request #8673 from Snuffleupagus/api-pageMode [api-minor] Add support for PageMode in the API and viewer (issue 8657)	2017-07-23 13:17:07 +02:00
Tim van der Meij	e7cddcce28	Merge pull request #8684 from Snuffleupagus/rm-assert Remove most `assert()` calls (issue 8506)	2017-07-22 19:42:24 +02:00
Tim van der Meij	7ded895d0c	Merge pull request #8638 from Snuffleupagus/issue-4926-built-in-jpg In `src/core/jpg.js`, ensure that the Adobe JPEG marker always takes precedence, even when the color transform code is zero	2017-07-22 17:25:09 +02:00
Jonas Jenwald	814fa1dee3	Remove most `assert()` calls (issue 8506) This replaces `assert` calls with `throw new FormatError()`/`throw new Error()`. In a few places, throwing an `Error` (which is what `assert` meant) isn't correct since the enclosing function is supposed to return a `Promise`, hence some cases were changed to `Promise.reject(...)` and similarily for `createPromiseCapability` instances.	2017-07-21 18:51:02 +02:00
Apoorv Mishra	d14956d4b8	Fix TypeError that occurs in colorspace.js on accidentally passing an 'Array' instead of 'TypedArray' Fix TypeError that occurs in colorspace.js on accidentally passing an 'Array' instead of 'TypedArray' Changed getRgbItem(...) to getRgbBuffer(...) since this.lookup has values in range[0, 255] whereas getRgbItem(...) expects those to be in range [0, 1] Revert changes for IE9 compatibility	2017-07-21 01:15:05 +05:30
Jonas Jenwald	15f0963f51	Fix a typo, in the `Catalog.numPages` getter, than prevents shadowing from working correctly Looking at the blame, it seems that this typo was present even before PR 700 (almost six years ago). The result of using `'num'`, rather than the correct `'numPages'` string, is that the `Catalog.numPages` getter isn't actually being shadowed.	2017-07-20 12:35:09 +02:00
Jonas Jenwald	16c5d41c5b	[api-minor] Add support for PageMode in the API (issue 8657) Please refer to https://wwwimages2.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=82.	2017-07-19 16:40:03 +02:00
Jonas Jenwald	e2ea9b693c	In `src/core/jpg.js`, ensure that the Adobe JPEG marker always takes precedence, even when the color transform code is zero According to the PDF specification, please see http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.2394361, if an Adobe JPEG marker is present it should always take precedence. This even seem to be consistent with the existing comment that is present in the code. Hence it seems reasonable to interpret `transformCode === 0` as no color conversion being necessary. Fixes the rendering of page 1 in `issue-4926` (from the test-suite), when the built-in `src/core/jpg.js` image decoder is used.	2017-07-11 17:08:30 +02:00
Yury Delendik	d028c26210	Removes error()	2017-07-07 09:40:24 -05:00
Jonas Jenwald	ea71d23f74	Fix a stupid spelling error in the `ASCII85Decode` name in `Parser.makeInlineImage` (issue 8613) This is a trivial follow-up to PR 5383, and it's a bit strange that this has been wrong since late 2014 without anyone noticing (maybe because inline images aren't too common). So, apparently code works better if you actually spell correctly, who knew ;-) Fixes 8613.	2017-07-05 19:43:09 +02:00
Yury Delendik	b3bac5100c	Merge pull request #8596 from mukulmishra18/proper-read-result Fixes wrong structure of fullReader.read() result.	2017-07-05 09:03:57 -05:00
Jonas Jenwald	eff257b820	Merge pull request #8580 from brendandahl/missing-glyf Fix how we detect and handle missing glyph data.	2017-07-04 12:16:07 +02:00
Brendan Dahl	9f5c1550ed	Merge pull request #8592 from brendandahl/cmap-3-0 Only mask char codes of (3, 0) cmap tables in the range of 0xF000 to 0…	2017-07-03 17:58:28 -07:00
Brendan Dahl	efbbd8533f	Only mask char codes of (3, 0) cmap tables in the range of 0xF000 to 0xF0FF.	2017-07-03 13:13:46 -07:00
Brendan Dahl	6d4f748fb1	Fix how we detect and handle missing glyph data.	2017-07-03 13:06:06 -07:00
Mukul Mishra	308a83e5ca	Fixes wrong structure of fullReader.read() result.	2017-07-01 15:52:47 +05:30
Jonas Jenwald	de0e7a9a68	Check that the `MessageHandler` isn't already terminated in the `onFailure` handler in `src/core/worker.js` (issue 8584) All other code-paths already checks that the `MessageHandler` isn't terminated, but apparently `onFailure` was missing that check (compare e.g. with the `onSuccess` function). From what I can tell, this is only an issue if workers are disabled, hence why I didn't bother adding a unit-test. Fixes 8584.	2017-06-30 10:11:13 +02:00
Brendan Dahl	a8a8909d2d	Fix missing notdef in expert encoding.	2017-06-29 12:12:39 -07:00
Brendan Dahl	f1f9d98519	Merge pull request #8507 from Snuffleupagus/issue-8480 Only special-case OpenType fonts with `CFF` data if it's both a composite (i.e. Type0) font and also has a non-default CID to GID map (issue 8480)	2017-06-23 13:36:58 -07:00
Yury Delendik	e2ca894fec	Merge pull request #8488 from mukulmishra18/streams-getTextContent Streams get text content	2017-06-23 12:52:13 -05:00
Jonas Jenwald	73234577e1	Rename `map` to `_map` inside of `Dict`, to make it clearer that it should be regarded as a "private" property	2017-06-17 17:32:00 +02:00
Mukul Mishra	0c13d0ff46	Adds Streams API in getTextContent to stream data. This patch adds Streams API support in getTextContent so that we can stream data in chunks instead of fetching whole data from worker thread to main thread. This patch supports Streams API without changing the core functionality of getTextContent. Enqueue textContent directly at getTextContent in partialEvaluator. Adds desiredSize and ready property in streamSink.	2017-06-17 20:03:27 +05:30
Jonas Jenwald	3a20fd165f	Refactor `ObjectLoader` to use `Dict`s correctly, rather than abusing their internal properties The `ObjectLoader` currently takes an Object as input, despite actually working with `Dict`s internally. This means that at the (two) existing call-sites, we're passing in the "private" `Dict.map` property directly. Doing this seems like an anti-pattern, and we could (and even should) simply provide the actual `Dict` when creating an `ObjectLoader` instance. Accessing properties stored in the `Dict` is now done using the intended methods instead, in particular `getRaw` which (as the name suggests) doesn't do any de-referencing, thus maintaining the current functionality of the code. The only functional change in this patch is that `ObjectLoader.load` will now ignore empty nodes, such that `ObjectLoader._walk` only needs to deal with nodes that are known to contain data. (This lets us skip, among other checks, meaningless `addChildren` function calls.)	2017-06-16 22:59:32 +02:00
Jonas Jenwald	f2fc9ee281	Slightly refactor and ES6-ify the code in `ObjectLoader` This patch changes all `var` to `let`, and caches the array lengths in all loops. Also removes two unnecessary temporary variable assignments.	2017-06-16 22:59:32 +02:00
Jonas Jenwald	e589834f13	Ensure that `TilingPattern`s have valid (non-zero) /BBox arrays (issue 8330) Fixes 8330.	2017-06-09 21:41:48 +02:00
Jonas Jenwald	8b4a42e5b8	Only special-case OpenType fonts with `CFF` data if it's both a composite (i.e. Type0) font and also has a non-default CID to GID map (issue 8480) As mentioned the last time that I touched this particular part of the font code, I'm sincerely hope that this doesn't cause any regressions! However, the patch passes all tests added in PRs 5770, 6270, and 7904 (and obviously all other tests as well). Furthermore, I've manually checked all the issues/bugs referenced in those PRs without finding any issues. Fixes 8480.	2017-06-09 21:15:39 +02:00
Jonas Jenwald	999e30723d	Reduce the duplication slightly when detecting an OpenType font (in the `Font` constructor)	2017-06-09 18:26:57 +02:00
Jonas Jenwald	a8c87f8019	Fix inconsistent spacing and trailing commas in objects in `src/core/` files, so we can enable the `comma-dangle` and `object-curly-spacing` ESLint rules later on Unfortunately this patch is fairly big, even though it only covers the `src/core` folder, but splitting it even further seemed difficult. http://eslint.org/docs/rules/comma-dangle http://eslint.org/docs/rules/object-curly-spacing Given that we currently have quite inconsistent object formatting, fixing this in one big patch probably wouldn't be feasible (since I cannot imagine anyone wanting to review that); hence I've opted to try and do this piecewise instead. Please note: This patch was created automatically, using the ESLint --fix command line option. In a couple of places this caused lines to become too long, and I've fixed those manually; please refer to the interdiff below for the only hand-edits in this patch. ```diff diff --git a/src/core/evaluator.js b/src/core/evaluator.js index abab9027..dcd3594b 100644 --- a/src/core/evaluator.js +++ b/src/core/evaluator.js @@ -2785,7 +2785,8 @@ var EvaluatorPreprocessor = (function EvaluatorPreprocessorClosure() { t['Tz'] = { id: OPS.setHScale, numArgs: 1, variableArgs: false, }; t['TL'] = { id: OPS.setLeading, numArgs: 1, variableArgs: false, }; t['Tf'] = { id: OPS.setFont, numArgs: 2, variableArgs: false, }; - t['Tr'] = { id: OPS.setTextRenderingMode, numArgs: 1, variableArgs: false, }; + t['Tr'] = { id: OPS.setTextRenderingMode, numArgs: 1, + variableArgs: false, }; t['Ts'] = { id: OPS.setTextRise, numArgs: 1, variableArgs: false, }; t['Td'] = { id: OPS.moveText, numArgs: 2, variableArgs: false, }; t['TD'] = { id: OPS.setLeadingMoveText, numArgs: 2, variableArgs: false, }; diff --git a/src/core/jbig2.js b/src/core/jbig2.js index 5a17d482..71671541 100644 --- a/src/core/jbig2.js +++ b/src/core/jbig2.js @@ -123,19 +123,22 @@ var Jbig2Image = (function Jbig2ImageClosure() { { x: -1, y: -1, }, { x: 0, y: -1, }, { x: 1, y: -1, }, { x: -2, y: 0, }, { x: -1, y: 0, }], [{ x: -3, y: -1, }, { x: -2, y: -1, }, { x: -1, y: -1, }, { x: 0, y: -1, }, - { x: 1, y: -1, }, { x: -4, y: 0, }, { x: -3, y: 0, }, { x: -2, y: 0, }, { x: -1, y: 0, }] + { x: 1, y: -1, }, { x: -4, y: 0, }, { x: -3, y: 0, }, { x: -2, y: 0, }, + { x: -1, y: 0, }] ]; var RefinementTemplates = [ { coding: [{ x: 0, y: -1, }, { x: 1, y: -1, }, { x: -1, y: 0, }], - reference: [{ x: 0, y: -1, }, { x: 1, y: -1, }, { x: -1, y: 0, }, { x: 0, y: 0, }, - { x: 1, y: 0, }, { x: -1, y: 1, }, { x: 0, y: 1, }, { x: 1, y: 1, }], + reference: [{ x: 0, y: -1, }, { x: 1, y: -1, }, { x: -1, y: 0, }, + { x: 0, y: 0, }, { x: 1, y: 0, }, { x: -1, y: 1, }, + { x: 0, y: 1, }, { x: 1, y: 1, }], }, { - coding: [{ x: -1, y: -1, }, { x: 0, y: -1, }, { x: 1, y: -1, }, { x: -1, y: 0, }], - reference: [{ x: 0, y: -1, }, { x: -1, y: 0, }, { x: 0, y: 0, }, { x: 1, y: 0, }, - { x: 0, y: 1, }, { x: 1, y: 1, }], + coding: [{ x: -1, y: -1, }, { x: 0, y: -1, }, { x: 1, y: -1, }, + { x: -1, y: 0, }], + reference: [{ x: 0, y: -1, }, { x: -1, y: 0, }, { x: 0, y: 0, }, + { x: 1, y: 0, }, { x: 0, y: 1, }, { x: 1, y: 1, }], } ]; ```	2017-06-02 11:20:19 +02:00
Jonas Jenwald	982b6aa65b	Convert the files in the `/src/core` folder to ES6 modules Please note that the `glyphlist.js` and `unicode.js` files are converted to CommonJS modules instead, since Babel cannot handle files that large and they are thus excluded from transpilation.	2017-05-30 22:06:21 +02:00
Jonas Jenwald	4ce5e520fb	Add different code-paths to `{CMap, ToUnicodeMap}.charCodeOf` depending on length, since `Array.prototype.indexOf` can be extremely inefficient for very large arrays (issue 8372) Fixes 8372.	2017-05-24 19:47:04 +02:00
Jonas Jenwald	31c24ed631	Don't map glyphs to the HANGUL FILLER (0x3164) Unicode location (issue 8424) This patch follows a similar pattern as previous ones, by skipping certain problematic Unicode locations. According to http://searchfox.org/mozilla-central/rev/6c2dbacbba1d58b8679cee700fd0a54189e0cf1b/gfx/harfbuzz/src/hb-unicode-private.hh#136, it seems that the HANGUL FILLER (0x3164) location is "special". Fixes 8424.	2017-05-23 16:12:45 +02:00
Jonas Jenwald	0ddf52aca5	Remove the special handling for `nameddest`s that look like standard pageNumbers PR 7341 added special handling for `nameddest`s that look like pageNumbers, to prevent issues since we previously incorrectly supported specifying a pageNumber directly in the hash; i.e. `#10` versus the correct `#page=10` format. Since this behaviour wasn't correct, PR 7757 fixed and deprecated the old format, which means that we no longer need to maintain the `nameddest` hack in multiple files.	2017-05-20 11:29:29 +02:00
Yury Delendik	5dc8dcdc0f	Merge pull request #8388 from Snuffleupagus/issue-8380 Cache JPEG images, just as we do for other image formats, in `evaluator.js` (issue 8380)	2017-05-17 17:25:51 -05:00
巴里切罗	8d5d97264e	fix(svg) adjust strategy for decoding JPEG images	2017-05-08 11:32:44 +08:00
Jonas Jenwald	0c2ebda31c	Cache JPEG images, just as we do for other image formats, in `evaluator.js` (issue 8380) For some reason, we're putting all kind of images except JPEG into the `imageCache` in `evaluator.js`.[1] This means that in the PDF file in issue 8380, we'll keep sending the same two small images[2] to the main-thread and decoding them over and over. This is obviously hugely inefficient! As can be seen from the discussion in the issue, the performance becomes extremely bad if the user has the addon "Adblock Plus" installed. However, even in a clean Firefox profile, the performance isn't that great. This patch not only addresses the performance implications of the "Adblock Plus" addon together with that particular PDF file, but it also improves the rendering times considerably for all users. Locally, with a clean profile, the rendering times are reduced from `~2000 ms` to `~500 ms` for my setup! Obviously, the general structure of the PDF file and its operator sequence is still hugely inefficient, however I'd say that the performance with this patch is good enough to consider the issue (as it stands) resolved.[3] Fixes 8380. --- [1] Not technically true, since inline images are cached from `parser.js`, but whatever :-) [2] The two JPEG images have dimensions 1x2, respectively 4x2. [3] To make this even more efficient, a new state would have to be added to the `QueueOptimizer`. Given that PDF files this stupid fortunately aren't too common, I'm not convinced that it's worth doing.	2017-05-07 13:07:41 +02:00
Yury Delendik	3adda80f97	Merge pull request #8358 from Snuffleupagus/PartialEvaluator-method-signatures Change the signatures of the `PartialEvaluator` "constructor" and its `getOperatorList`/`getTextContent` methods to take parameter objects	2017-05-04 08:10:30 -05:00
Yury Delendik	74ba3033e8	Merge pull request #8359 from Snuffleupagus/Lexer-getNumber-ignore-line-breaks Ignore line-breaks between operator and digit in `Lexer.getNumber`	2017-05-03 09:43:59 -05:00
Jonas Jenwald	3e20d30afc	Change the signatures of the `PartialEvaluator` "constructor" and its `getOperatorList`/`getTextContent` methods to take parameter objects Currently these methods accept a large number of parameters, which creates quite unwieldy call-sites. When invoking them, you have to remember not only what arguments to supply, but also the correct order, to avoid runtime errors. Furthermore, since some of the parameters are optional, you also have to remember to pass e.g. `null` or `undefined` for those ones. Also, adding new parameters to these methods (which happens occasionally), often becomes unnecessarily tedious (based on personal experience). Please note that I do not think that we need/should convert every single method in `evaluator.js` (or elsewhere in `/core` files) to take parameter objects. However, in my opinion, once a method starts relying on approximately five parameter (or even more), passing them in individually becomes quite cumbersome. With these changes, I obviously needed to update the `evaluator_spec.js` unit-tests. The main change there, except the new method signatures[1], is that it's now re-using one `PartialEvalutor` instance, since I couldn't see any compelling reason for creating a new one in every single test. Note: If this patch is accepted, my intention is to (time permitting) see if it makes sense to convert additional methods in `evaluator.js` (and other `/core` files) in a similar fashion, but I figured that it'd be a good idea to limit the initial scope somewhat. --- [1] A fun fact here, note how the `PartialEvaluator` signature used in `evaluator_spec.js` wasn't even correct in the current `master`.	2017-05-03 12:10:20 +02:00

... 34 35 36 37 38 ...

2918 Commits