pdf.js

Author	SHA1	Message	Date
Tony Jin	ef667823dd	[api-minor] Add an optional param to DocumentInitParameters for specifying the range request chunk size to use. Defaults to 2^16 = 65536.	2015-10-26 17:22:11 -07:00
Jonas Jenwald	1c66d4a106	Add a `totalLength` getter to `OperatorList`, since the `length` is zero after flushing In the `RenderPageRequest` handler in `worker.js`, we attempt to print an `info` message containing the rendering time and the length of the operator list. The latter is currently broken (and has been for quite some time), since the `length` of an `OperatorList` is reset when flushing occurs. This patch attempts to rectify this, by adding a getter which keeps track of the total length.	2015-10-26 18:12:14 +01:00
Yury Delendik	58c3ea0820	Adds thread abort capabilities.	2015-10-23 09:06:32 -05:00
Yury Delendik	59c13b32aa	Adds destroy method to the document loading task. Also renames PDFPageProxy.destroy method to cleanup.	2015-10-23 08:57:14 -05:00
Jonas Jenwald	2e751199fb	Prevent getOperatorList from failing to correctly parse OPS.paintXObject for TilingPatterns that are missing some /Resources entries (issue 6541) Fixes 6541.	2015-10-21 21:30:56 +02:00
Rob Wu	50ff2d4c2a	Ignore operators that are known to be unsupported `operatorList.addOp` adds the arguments to the list which is then passed as-is by postMessage to the main thread. But since we don't parse these operations, they are raw PDF objects and may therefore cause a serialization error. This is a conservative patch, and only affects operators which are known to be unsupported. We should ignore all unknown operators, but I haven't really looked into the consequences of doing that. Fixes #6549	2015-10-21 15:39:25 +02:00
Brendan Dahl	e4f0e6f2a0	Merge pull request #6531 from covlllp/new_merge Fixes bluebeam password protection issue	2015-10-16 13:47:06 -07:00
Colin VanLang	6d8e883fe6	Fixes bluebeam password protection issue	2015-10-15 21:22:27 -04:00
Jonas Jenwald	49883439a5	Ensure that `Dict_getArray` doesn't fail if `xref` is undefined (PR 6485 follow-up) In PR 6485 I somehow failed to account for the case where `xref` is undefined. Since a dictionary can be initialized without providing a reference to an `xref` instance, `Dict_getArray` can thus fail without this added check.	2015-10-15 11:47:07 +02:00
Brendan Dahl	3eaeacfe19	Merge pull request #6476 from Snuffleupagus/PartialEvaluator_readToUnicode-cmap-length Right-size the `map` array in PartialEvaluator_readToUnicode	2015-10-09 10:31:28 -07:00
Jonas Jenwald	9b12c64be5	Cache the regular expression used for finding `obj`s in `XRef_indexObjects`, to avoid unnecessary allocations	2015-10-02 12:46:58 +02:00
Jonas Jenwald	192907e0d2	Make `XRef_indexObjects` even more robust against bad PDF files, by checking for the existence of 'trailer' if 'xref' is not found Fixes http://www.cyjack.com/cognition/Terence%20McKenna%20-%20Lectures%20on%20Alchemy.pdf.	2015-10-01 15:01:25 +02:00
Tim van der Meij	1bdfc47de8	Merge pull request #6411 from Snuffleupagus/remove-Parser_fetchIfRef Remove `Parser_fetchIfRef` since it's obsolete	2015-09-30 00:38:35 +02:00
Jonas Jenwald	1b8cb52555	Prevent `PartialEvaluator_buildFormXObject` from failing if the `Matrix` or `BBox` contains indirect objects This patch fixes yet another instance of bad PDF data, specifically a case where the `BBox` array contains indirect objects (i.e. `Ref`s). Fixes the missing image in http://www.int.washington.edu/talks/WorkShops/int_08_37W/People/Franz_M/Franz.pdf#page=24. Note: There are missing images on a number of the pages in that file.	2015-09-29 10:11:49 +02:00
Jonas Jenwald	75557d27d1	Add `getArray` method to `Dict` This method extends `get`, and will fetch all indirect objects (i.e. `Ref`s) when the result is an `Array`.	2015-09-29 10:11:47 +02:00
Jonas Jenwald	8d831449ab	Right-size the `map` array in PartialEvaluator_readToUnicode We can avoid a lot of intermediate resizings, by directly allocating the required number of elements for the `map` array.	2015-09-24 13:08:53 +02:00
Fabian Lange	2564827503	Fix text spacing with vertical fonts (#6387) According to the PDF spec 5.3.2, a positive value means, in horizontal mode, that the next glyph is further to the left (so narrower), and, in vertical mode, that it is further down (so wider). This change fixes the way PDF.js has interpreted the value.	2015-09-15 09:28:45 +02:00
Tim van der Meij	12b0b9744b	Merge pull request #6427 from Snuffleupagus/slightly-more-robust-get-fingerprint Make `get fingerprint` slightly more robust against corrupt PDF files	2015-09-10 22:07:44 +02:00
Jonas Jenwald	5853553455	Make `get fingerprint` slightly more robust against corrupt PDF files This patch adjusts `get fingerprint` to also check that the `/ID` entry contains (non-empty) strings, to prevent more possible failures when loading corrupt PDF files (follow-up to PR 5602). Note that I've not actually encountered such a PDF file in the wild. However given that `stringToBytes` will assert that the input is a string, and that we'll thus fail to load a document unless `get fingerprint` succeeds, making this more robust seems like a good idea to me.	2015-09-08 13:42:53 +02:00
Jonas Jenwald	29a1cdb6a6	Only choose a (3, 1) cmap table for TrueType fonts that have an encoding specified (issue 6410) For (1, 0) cmaps, we have two different codepaths depending on whether the font has/hasn't got an encoding. But with (3, 1) cmaps we don't have a good fallback when the encoding is missing, hence this patch changes `readCmapTable` to only choose a (3, 1) cmap table if the font is non-symbolic and an encoding exists. Without this, we'll not be able to successfully create a working glyph map for some TrueType fonts with (3, 1) cmap tables. Fixes 6410.	2015-09-07 16:56:05 +02:00
Jonas Jenwald	b1d148a4aa	Remove `Parser_fetchIfRef` since it's obsolete This code was added in PR 1214, but was made obsolete by PRs 1488/1493. Prior to the latter ones, `Dict_get` returned the raw objects. However, afterwards (and currently) `Dict_get` now resolves indirect objects, which makes `Parser_fetchIfRef` redundant. Potential risks with this patch: This patch passes all tests locally, but there's a small possibility that it could break some weird PDF files. In the current code, wrapping `Dict_get` inside `Parser_fetchIfRef` will potentially mean two back-to-back calls of `XRef_fetch`, if a reference points directly to another reference. I'm not sure if this can actually happen in practice, and I'd think that if that were the case we'd already have run into it elsewhere in the code-base, given that `Parser` is the only place where we try to "double" resolve references.	2015-09-02 23:11:00 +02:00
Jonas Jenwald	0fb31a4a9e	Fallback in `readCmapTable`, instead of using `error`, for TrueType fonts with unsupported cmap formats (bug 1200096) Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1200096. The problematic font has a `format 2` cmap, which we've never supported properly. Prior to PR 2606, we were able to fall back to a working state, despite not having proper support for that cmap format. Obviously the best/correct solution would be to implement actual support for more cmap formats[1]. However, I'm hoping that a simple patch will be OK for now, given that: - `format 2` cmaps seem to be quite rare in practice, since this has been broken for 2.5 years before anyone noticed. - Having a simple patch will make potential uplifts a lot easier. [1] See the specification at https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6cmap.html	2015-09-01 14:01:19 +02:00
Tim van der Meij	0020f33873	Merge pull request #6357 from Snuffleupagus/bidi-result Avoid more allocations for RTL text in bidi.js	2015-09-01 00:44:33 +02:00
Tim van der Meij	b42b894570	Merge pull request #6386 from Snuffleupagus/Parser_makeFilter-warn-on-empty-stream Add a warning when we encounter an empty stream in `Parser_makeFilter`	2015-08-30 23:14:22 +02:00
Rob Wu	582573b96b	Merge pull request #6358 from Snuffleupagus/Parser_tryShift-missingDataException Don't catch `MissingDataException` in `Parser_tryShift`	2015-08-27 14:46:24 +02:00
Jonas Jenwald	f814fdc215	Add a warning when we encounter an empty stream in `Parser_makeFilter` Having a warning here would have meant that issue 6360 could have been solved in approximately five minutes, instead of an hour. To avoid that happening again, this patch adds a warning whenever we treat a stream as empty.	2015-08-26 20:14:30 +02:00
Brendan Dahl	88e0326787	Merge pull request #6337 from Snuffleupagus/issue-6336 Adjust which TrueType (3, 1) glyphs we attempt to skip mapping of (issue 6336)	2015-08-25 09:49:46 -07:00
Jonas Jenwald	56a43a3181	Make `XRef_indexObjects` more robust against bad PDF files (issue 5752) This patch improves the detection of `xref` in files where it is followed by an arbitrary whitespace character (not just a line-breaking char). It also adds a check for missing whitespace, e.g. `1 0 obj<<`, to speed up `readToken` for the PDF file in the referenced issue. Finally, the patch also replaces a bunch of magic numbers with suitably named constants. Fixes 5752. Also improves 6243, but there are still issues.	2015-08-21 20:33:02 +02:00
Jonas Jenwald	5128603f64	Also check `maybeLength` when deciding if a stream is empty in `Parser_makeFilter` (issue 6360) The problem with the PDF files in the issue, besides the obviously broken XRef tables which we're able to recover from, is that many/most of the streams have Dictionaries where the `Length` entry is set to `0`. This causes us to return `NullStream`, instead of the appropriate one in `Parser_makeFilter`. Fixes 6360.	2015-08-20 23:04:18 +02:00
Yury Delendik	c56dc9a093	Merge pull request #6141 from skalnik/fix-font-csp-issues Provide a fallback for font rendering when not allowed to use `eval`	2015-08-18 18:50:11 -05:00
Jonas Jenwald	3fa5f6cc3b	Only take the fast-path in `PDFImage_createImageData` for un-masked JPEG images with "standard" colour spaces (issue 6364) Fixes 6364.	2015-08-18 22:25:37 +02:00
Jonas Jenwald	8c3b8238ac	Don't catch `MissingDataException` in `Parser_tryShift` I overlooked this while reviewing PR 6197, but I don't think that we should be catching that particular kind of exception here; hence this patch.	2015-08-16 11:35:54 +02:00
Jonas Jenwald	b1cf4d98ad	Avoid more allocations for RTL text in bidi.js Instead of building the resulting string char-by-char for RTL text, which is inefficient, we can just as well `join` the `chars` array.	2015-08-14 21:46:59 +02:00
Mike Skalnik	341c5e9d1f	[PATCH] Add fallback for font loading when eval disabled In some cases, such as in use with a CSP header, constructing a function with a string of javascript is not allowed. However, compiling the various commands that need to be done on the canvas element is faster than interpreting them. This patch changes the font renderer to instead emit commands that are compiled by the font loader. If, during compilation, we receive an EvalError, we instead interpret them.	2015-08-13 14:33:18 -07:00
Yury Delendik	20b46aaf88	Fixes supportsMozChunked for node.js	2015-08-12 18:48:59 -05:00
Jonas Jenwald	99d29487ab	Adjust which TrueType (3, 1) glyphs we attempt to skip mapping of (issue 6336) Fixes 6336.	2015-08-09 12:51:43 +02:00
Rob Wu	b0a8c0fa40	cmaps: Use cmap.forEach instead of Array.forEach CMaps may be sparse. Array.prototype.forEach is terribly slow in Chrome (and also in Firefox) when the sparse array contains a key with a high value. E.g. console.time('forEach sparse') var a = []; a[0xFFFFFF] = 1; a.forEach(function(){}); console.timeEnd('forEach sparse'); // Chrome: 2890ms // Firefox: 1345ms Switching to CMap.prototype.forEach, which is optimized for such scenarios fixes the problem.	2015-08-08 13:30:30 +02:00
Tilman Hausherr	6d1e0f7e8d	fix handling of flags 1-3 in tensor shading pi is an index in the stream and is explained on page 201 of the 32000-spec (however 1-based there), and ps is an index to something in PDF.js. I used the code from flag 0 (which works) to understand which is which. It is also important to understand that for flags 1,2 and 3, the stream is always assigned to the same coordinates and colors. What changes is which "old" coordinates and colors are assigned to what is "missing" in the stream. This is why for these flags, the code is identical except for the assignments in the first "row". (Same principle as in #6304). Note that this change will not improve the lamp_cairo.pdf file, only the two files mentioned in #6305.	2015-08-04 18:21:29 +02:00
Tilman Hausherr	c85fa00d62	fix handling of flags 1-3 in coons shading Short story: somebody got lost in two different indices. pi is an index in the stream and is explained on page 198 of the 32000-spec (however 1-based there), and ps is an index to something in PDF.js. I used the code from flag 0 (which works) to understand which is which. It is also important to understand that for flags 1,2 and 3, the stream is always assigned to the same coordinates and colors. What changes is which "old" coordinates and colors are assigned to what is "missing" in the stream. This is why for these flags, the code is identical except for the assignments in the first "row".	2015-08-03 21:15:26 +02:00
Brendan Dahl	977397ebfd	Merge pull request #6270 from Snuffleupagus/opentype-cff-2 Adjust the heuristics used to detect OpenType font file with CFF data (bug 1186827, bug 1182130, issue 6264)	2015-08-03 09:43:33 -07:00
Tim van der Meij	72ecbec49d	Merge pull request #6292 from Snuffleupagus/issue-6287 Fix various shading pattern regressions (issue 6287)	2015-07-31 22:26:01 +02:00
Jonas Jenwald	1d65daf5e5	Correctly access `colorSpace.numComps` in `MeshStreamReader` (issue 6287) This regressed in `f750e35224`.	2015-07-31 18:00:58 +02:00
Jonas Jenwald	7fe2442a18	Ensure that we don't use the same typed array for both `coords` and `colors` in Mesh `figures` (issue 6287) This regressed in `1e8d70af98`.	2015-07-31 18:00:23 +02:00
Jonas Jenwald	55bc98a8b0	Rename `PatternType` to `ShadingType` to avoid confusion The current name is somewhat confusing, since the specification calls it `ShadingType`, see http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G7.4044105 and http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G7.3882826. The real problem, however, is that there is actually another property called `PatternType`, which makes the current code very confusing, see http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G7.1850929. Since `ShadingType` is only relevant for shading patterns (i.e. `PatternType === 2`), and not for tiling patterns (i.e. `PatternType === 1`), this patch should help reduce confusion when reading the code.	2015-07-30 20:03:45 +02:00
Tim van der Meij	4f920ad100	Refactor annotation code to use a factory Currently, `src/core/core.js` uses the `fromRef` method on an `Annotation` object to obtain the right annotation type object (such as `LinkAnnotation` or `TextAnnotation`). That method in turn uses a method `getConstructor` to find out which annotation type object must be returned. Aside from the fact that there is currently a lot of code to achieve this, these methods should not be part of the base `Annotation` class at all. Creation of annotation objects should be done by a factory (as also recommended by @yurydelendik at https://github.com/mozilla/pdf.js/pull/5218#issuecomment-52779659) that handles finding out the correct annotation type object and returning it. This patch implements this separation of concerns. Doing this allows us to also simplify the code quite a bit and to make it more readable. Additionally, we are now able to get rid of the hardcoded array of supported annotation types. The factory takes care of checking the annotation types and falls back to returning the base annotation type (and issuing a warning, which the current code also does not do well) when an annotation type is unsupported. I have manually tested this commit with 20 test PDFs with different annotation types, such as /Link, /Text, /Widget, /FileAttachment and /FreeText. All render identically before and after the patch, and unsupported annotation types are now properly indicated with a warning in the console.	2015-07-29 00:31:51 +02:00
Tim van der Meij	d08895d659	Merge pull request #6236 from Rob--W/print-javascript-action Detect scripted auto-print requests	2015-07-25 19:42:31 +02:00
Jonas Jenwald	0a024b5051	Adjust the heuristics used to detect OpenType font file with CFF data (bug 1186827, bug 1182130, issue 6264) This is a tentative patch. Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1186827. Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1182130. Fixes 6264.	2015-07-25 12:26:36 +02:00
Jonas Jenwald	385e2e5aaf	Check if the `Decode` entry is non-default when deciding if JPEG images are natively supported/decodable (issue 6238) Tentatively fixes 6238.	2015-07-21 12:23:07 +02:00
Tim van der Meij	980aa10e04	Refactor annotation rectangle code and add unit tests This patch refactors the code responsible for setting the annotation's rectangle. Its goal is to: - Check that the input is actually an array, and if so, that it contains exactly four elements. - Only call `normalizeRect` if the input array is valid, i.e., we do not call it for the default rectangle anymore. Unit tests are provided just like with the other patches in this series.	2015-07-20 22:01:47 +02:00
Rob Wu	c676ecb5a0	Detect scripted auto-print requests Fixes #6106 To avoid future regressions, two new unit tests were added: 1. A new PDF based on the report from #6106, which contains an OpenAction of type JavaScript and a string "this.print({...}". 2. An existing PDF from https://bugzil.la/1001080 (from #4698). Although it does not matter, since we don't execute the JavaScript code, I have also changed "print(true)" to "print({})" since the print method takes an object (not a boolean). See "Printing PDF documents", page 62: http://adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/js_developer_guide.pdf	2015-07-20 18:25:02 +02:00
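
Commit 1c66d4a106 above adds a `totalLength` getter because the `length` of an operator list is reset when it is flushed. A minimal sketch of that idea follows. The class `SketchOperatorList`, its `flushedLength` field, and the `onFlush` callback are hypothetical names chosen for illustration; this is not the actual pdf.js `OperatorList` implementation.

```javascript
// Hypothetical, simplified stand-in for an operator list (not the pdf.js class).
class SketchOperatorList {
  constructor(onFlush) {
    this.fnArray = [];
    this.argsArray = [];
    this.onFlush = onFlush; // assumed callback that receives each flushed chunk
    this.flushedLength = 0; // operators already handed off by flush()
  }

  // Number of operators currently buffered; drops to zero after flush().
  get length() {
    return this.fnArray.length;
  }

  // Number of operators added over the list's lifetime, flushed or not.
  get totalLength() {
    return this.flushedLength + this.length;
  }

  addOp(fn, args) {
    this.fnArray.push(fn);
    this.argsArray.push(args);
  }

  flush() {
    // Record how many operators leave the buffer before resetting it.
    this.flushedLength += this.length;
    this.onFlush({ fnArray: this.fnArray, argsArray: this.argsArray });
    this.fnArray = [];
    this.argsArray = [];
  }
}

const chunks = [];
const list = new SketchOperatorList((chunk) => chunks.push(chunk));
list.addOp('moveTo', [0, 0]);
list.addOp('lineTo', [10, 10]);
list.flush();
list.addOp('stroke', []);
console.log(list.length, list.totalLength); // prints "1 3"
```

Keeping a separate running count means a log message written after rendering can report the full number of operators even though the buffer itself was emptied along the way.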

2852 Commits