Commit Graph

890 Commits

Author SHA1 Message Date
Jonas Jenwald
02a6b73492 Get rid of getAll usage in colorspace.js
For the `CalGray`/`CalRGB`/`Lab` colour spaces, we're currently using `getAll` to retrieve the parameters. However that's not really necessary, since we may just as well explicitly `get` the needed parameters instead.
2016-02-11 11:59:26 +01:00
Xiliang Chen
e42da0f5e9 move hasHtml to AnnotationElement 2016-02-11 13:58:17 +13:00
Jonas Jenwald
f7f60197ce Replace getAll with getKeys in loadType3Data
Not only is `getAll` less efficient, but given that we actually need the keys here, using `getKeys` seems much more suitable.
2016-02-10 20:19:14 +01:00
Jonas Jenwald
07e1ad40a2 Replace getAll with getKeys in PartialEvaluator_hasBlendModes to speed up loading of badly generated PDF files (issue 6961)
Some bad PDF generators, in particular "Scribus PDF", duplicates resources *a lot* at various levels of the PDF files. This can lead to `PartialEvaluator_hasBlendModes` taking an unreasonable amount of time to complete.
The reason is that the current code is using `Dict_getAll`, which recursively dereferences *all* indirect objects, which can be really slow. This patch instead uses `Dict_getKeys`, and then manually looks up only the necessary indirect objects.

I've added the PDF file as a `load` test. The most important thing here is probably to ensure that the file remains available in the repo, and the comment should help reduced the chance of regressions. (Note that locally, the `load` test times out without this patch, but we cannot really assume that that always happens.)

Fixes 6961.
2016-02-10 17:21:38 +01:00
Tim van der Meij
03f12a10b5 Merge pull request #6954 from Snuffleupagus/setGState-fixes
Various `setGState` improvements
2016-02-10 00:38:31 +01:00
Tim van der Meij
02b161d432 Merge pull request #6933 from brendandahl/faster-decrypt
Make type 1 font program decryption faster.
2016-02-09 23:41:22 +01:00
Jonas Jenwald
a1fe2cb443 Don't directly access the private map in setGState, and ensure that we avoid indirect objects
*This patch is based on something I noticed while debugging some of the PDF files in issue 6931.*

In a number of the cases in `setGState`, we're implicitly assuming that we're not dealing with indirect objects (i.e. `Ref`s). See e.g. the 'Font' case, or the various cases where we simply do `gStateObj.push([key, value]);` (since the code in `canvas.js` won't be able to deal with a `Ref` for those cases).

The reason that I didn't use `Dict_forEach` instead, is that it would re-introduce the unncessary closures that PR 5205 removed.
2016-02-03 17:13:42 +01:00
Jonas Jenwald
2d4a1aa0af Actually ignore no-op setGState (PR 5192 followup)
The intention of PR 5192 was to avoid adding empty `setGState` ops to the operatorList. But the patch accidentally used `>=`, which means that it's not actually working as intended, since empty arrays always have `length === 0`.
2016-02-03 17:13:02 +01:00
Jonas Jenwald
4770b516fe Correct the upper bound used when building the transferMap for SMasks (PR 6723 followup)
Even though the currently known test-cases render correctly without this patch, that seems more like a lucky coincidence, given that there's no guarantee that `transferMap[255] === 0` for every possible transfer function.
2016-02-03 13:41:10 +01:00
Jonas Jenwald
992472fd38 Ensure that we don't modify the Dict data when the Differences array of a font contains indirect objects
This patch fixes an issue that I inadvertently introduced in PR 5815, where we accidentally modify the `Differences` array in the encoding dictionary for indirect objects.

Instead of this change, we could also have used the now existing `Dict_getArray`. However in this case I don't think that would have been a good idea, since it would mean iterating through the array *twice*.
2016-01-30 13:31:24 +01:00
Brendan Dahl
02331f6e33 Make type 1 font program decryption faster.
Discard the values first so we don't have to slice the array.
2016-01-29 11:10:30 -08:00
Jeff Walden
4691a4a1e4 Adjust a comment discussing transferred ArrayBuffers to refer to those buffers being detached, not neutered. This change makes the comment consistent with terminology used in the ECMAScript specification. 2016-01-28 14:52:07 -08:00
Yury Delendik
825a2225ab Merge pull request #6915 from yurydelendik/lookuptables
Refactor lookup hash tables/objects
2016-01-28 15:01:06 -06:00
Yury Delendik
2edf2792dc Replaces literal {} created lookup tables with Object.create 2016-01-28 12:18:38 -06:00
Yury Delendik
d6adf84159 Lazify OP_MAP. 2016-01-28 12:18:37 -06:00
Yury Delendik
1de90454b7 Lazify Metrics 2016-01-28 12:11:46 -06:00
Yury Delendik
55a201d92d Lazify NormalizedUnicodes 2016-01-28 11:56:42 -06:00
Yury Delendik
d0738d7e24 Lazify stdFontMap, serifFonts, GlyphMapForStandardFonts 2016-01-28 11:51:54 -06:00
Yury Delendik
1a9a665adf Refactor Encodings 2016-01-28 11:32:59 -06:00
Yury Delendik
4ef20de429 Lazify GlyphsUnicode. 2016-01-28 11:32:59 -06:00
Jonas Jenwald
1140a34f5c [api-minor] Change getPageLabels to always return the pageLabels, even if they are identical to standard page numbering 2016-01-27 13:36:03 +01:00
Jonas Jenwald
15ce96a6eb Prevent failures in the "scanning for endstream" code, in Parser_makeStream, by handling the case where 'endstream' is split between contiguous chunks (issue 1536) 2016-01-26 09:03:51 +01:00
Tim van der Meij
58329f7f92 Merge pull request #6803 from Snuffleupagus/page-labels
[api-minor] Add support for PageLabels in the API
2016-01-20 22:05:48 +01:00
Yury Delendik
0aa373cdf3 Merge pull request #6891 from Snuffleupagus/issue-6889
Map missing glyphs to the `notdef` glyph for TrueType (3, 1) fonts regardless if the 'post' table is defined or not (issue 6889)
2016-01-20 13:14:47 -06:00
Jonas Jenwald
85cf90643f [api-minor] Add support for PageLabels in the API 2016-01-19 22:49:04 +01:00
Jonas Jenwald
8ad18959d7 Add support for NumberTree 2016-01-19 22:47:45 +01:00
Tim van der Meij
1eea0db897 Merge pull request #6822 from Snuffleupagus/urls-in-outline
[api-minor] Add support for URLs in the document outline
2016-01-19 22:21:40 +01:00
Jonas Jenwald
0030a82dc3 [api-minor] Add support for URLs in the document outline
Re: issue 5089.
(Note that since there are other outline features that we currently don't support, e.g. bold/italic text and custom colours, I thus think we can keep the referenced issue open.)
2016-01-19 21:36:27 +01:00
Jonas Jenwald
4855d4cc9f Map missing glyphs to the notdef glyph for TrueType (3, 1) fonts regardless if the 'post' table is defined or not (issue 6889) 2016-01-17 22:58:00 +01:00
Jonas Jenwald
d52495a9c8 [TrueType] Recover from a missing "glyf" table by replacing it with dummy data, utilizing the existing code in sanitizeGlyphLocations
It seems to be fairly common for OCR software to include incomplete TrueType fonts, notable missing the "glyf" table, in PDF files. Since we currently reject such fonts, the result is that text-selection/copying is broken.

This patch contains a suggested approach to try and use these kind of broken fonts, by using existing code in `sanitizeGlyphLocations` to replace a missing "glyf" table with dummy data.

Fixes 4684.
Fixes 6007.
Fixes 6829.
2016-01-15 21:44:59 +01:00
Brendan Dahl
3057b69e45 Merge pull request #6839 from Snuffleupagus/issue-6782
Check that CIDFontType0 fonts does not actually contain OpenType font files (issue 6782)
2016-01-11 08:56:48 -08:00
Daan Sprenkels
90ec2c9294 shading-pattern: Decreased Shadings.SMALL_NUMBER
and added a test case for #6298
2016-01-06 15:26:40 +01:00
Jonas Jenwald
896e390285 Check that CIDFontType0 fonts does not actually contain OpenType font files (issue 6782)
*This patch follows a similar idea as PR 5756.*

The patch is based on the nice debugging done by Brendan in the referenced issue 6782.
A better way to handle this, and similar issues, would probably be to completely ignore what the PDF file claims about font type/subtype, and just check the actual data. But until that kind of rewrite happens, this patch should help.

Fixes 6782.
2016-01-06 02:19:02 +01:00
Brendan Dahl
eb7c36beb6 Add validation for callsubr and callgsubr for type 2 charstrings. 2016-01-05 09:54:25 -08:00
Tim van der Meij
6ef7120a04 Implement support for Highlight annotations 2016-01-01 15:31:46 +01:00
Tim van der Meij
34918a6666 Implement support for Squiggly annotations 2015-12-30 19:37:04 +01:00
Jonas Jenwald
d956177482 Merge pull request #6819 from timvandermeij/strikeout-annotation
Implement support for StrikeOut annotations
2015-12-30 14:44:50 +01:00
Tim van der Meij
e8db825512 Merge pull request #6771 from yurydelendik/requirejs
Removes hardcoded module loading order
2015-12-30 00:37:32 +01:00
Yury Delendik
b8e7efaaa1 Merge pull request #6821 from yurydelendik/bug951051
Bug 951051 - Better crypto key length recovery.
2015-12-29 15:35:15 -06:00
Yury Delendik
c991480687 Better crypto key length recovery. 2015-12-29 15:10:38 -06:00
Yury Delendik
fc3282db56 Adds RequireJS to worker. 2015-12-29 09:20:52 -06:00
Tim van der Meij
c5f4b9750e Implement support for StrikeOut annotations 2015-12-29 15:09:28 +01:00
Tim van der Meij
cd28dd34fe Implement support for Underline annotations 2015-12-28 00:33:41 +01:00
Tim van der Meij
7d43971f54 Implement support for Popup annotations
Most code for Popup annotations is already present for Text annotations.
This patch extracts the popup creation logic from the Text annotation
code so it can be reused for Popup annotations.

Not only does this add support for Popup annotations, the Text
annotation code is also considerably easier. If a `Popup` entry is
available for a Text annotation, it will not be more than an image. The
popup will be handled by the Popup annotation. However, it is also
possible for Text annotations to not have a separate Popup annotation,
in which case the Text annotation handles the popup creation itself.
2015-12-25 13:17:21 +01:00
Yury Delendik
79c2f69c32 Adds/modifies examples for node.js and webpack. 2015-12-21 13:46:50 -06:00
Tim van der Meij
df81b832bb Remove unused variables 2015-12-16 23:52:16 +01:00
Yury Delendik
b084dc09ee Allows requirejs and node load fake worker files. 2015-12-15 13:24:39 -06:00
Yury Delendik
6b60c8f4db Adds UMD headers to core, display and shared files. 2015-12-15 13:24:39 -06:00
Tim van der Meij
8d36aad30a Implement constants for all annotation types
Now we have a full list of all possible annotation types and the
numbering corresponds to the order in the specification. Not only is
this more consistent and complete, it also prevents having to add these
constants when a new annotation type is implemented.

Additionally fix an issue where a regular Widget annotation would not
have `data.annotationType` set. It was only set for a
TextWidgetAnnotation, but instead move it to the base Widget annotation
class to add it for all Widget annotations (since TextWidgetAnnotation
inherits from WidgetAnnotation it will have it too).
2015-12-15 15:23:55 +01:00
Jonas Jenwald
ee0d522187 Use adjustWidths for TrueType fonts if we handle them as OpenType (issue 5027, issue 5084, issue 6556, bug 1204903)
In `Font_checkAndRepair` we can decide that a font isn't TrueType, and instead parse it as CFF. In that case it's quite possible that the `fontMatrix` will be changed, and without calling `adjustWidths` we're failing to update the glyph widths correctly.

Fixes 5027.
Fixes 5084.
Fixes 6556.
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1204903.
2015-12-08 00:49:22 +01:00
Jonas Jenwald
084bb8682f Merge pull request #6723 from yurydelendik/smask-transfer
Adds transfer function support for SMask.
2015-12-05 22:41:56 +01:00
Jonas Jenwald
4810b7b8fc Fix the charCodeOf method in IdentityToUnicodeMap in order to prevent text selection from breaking
After PR 6590, `font.spaceWidth` is now called in more cases than before (in `PartialEvaluator_getTextContent`), which exposed an underlying issue with `IdentityToUnicodeMap_charCodeOf` throwing an error.
This breaks text-selection in some PDF files found in the wild, hence this patch replaces the `error` with an actual function instead (modelled after `IdentityCMap_charCodeOf`).
2015-12-05 13:15:55 +01:00
Yury Delendik
15c9969abe Adds transfer function support for SMask. 2015-12-04 12:52:45 -06:00
Brendan Dahl
87762afec4 Remove glyph id's outside the range of valid glyphs.
OTS does not like invalid glyph ids in a camp table.
2015-12-03 11:53:06 -08:00
Brendan Dahl
376788f2b2 Merge pull request #6698 from yurydelendik/rm-UnsupportedManager
[api-minor] Replaces UnsupportedManager with callback.
2015-12-02 10:57:44 -08:00
Jonas Jenwald
5f56a20b34 Merge pull request #6701 from timvandermeij/pdf-manager-inherit
Make use of `Util.inherit` in `src/core/pdf_manager.js`
2015-12-01 19:53:43 +01:00
Yury Delendik
4a82f2f5fd Merge pull request #6695 from Snuffleupagus/issue-6692
Ensure that `Lexer_getName` does not fail if a `Name` contains in invalid usage of the NUMBER SIGN (#) (issue 6692)
2015-12-01 10:25:57 -06:00
Yury Delendik
c9cb6a3025 Replaces UnsupportedManager with callback. 2015-11-30 14:42:47 -06:00
Tim van der Meij
0c41866433 Make use of Util.inherit in src/core/pdf_manager.js
While we are here, fix some incorrect function names.
2015-11-29 00:58:19 +01:00
Tim van der Meij
8b79becad6 Improve code structure of the annotation code
This patch improves the code structure of the annotation code.

- Create the annotation border style object in the `setBorderStyle` method instead of in the constructor. The behavior is the same as the `setBorderStyle` method is always called, thus a border style object is still always available.
- Put all data object manipulation lines in one block in the constructor. This improves readability and maintainability as it is more visible which properties are exposed.
- Simplify `appendToOperatorList` by removing the promise capability and removing an unused parameter.
- Remove some unnecessary newlines/spaces.
2015-11-29 00:04:21 +01:00
Jonas Jenwald
995e1a45b8 Ensure that Lexer_getName does not fail if a Name contains in invalid usage of the NUMBER SIGN (#) (issue 6692)
*This is a regression from PR 3424.*

The PDF file in the referenced issue is using `Type3` fonts. In one of those, the `/CharProcs` dictionary contains an entry with the name `/#`. Before the changes to `Lexer_getName` in PR 3424, we were allowing certain invalid `Name` patterns containing the NUMBER SIGN (#).

It's unfortunate that this has been broken for close to two and a half years before the bug surfaced, but it should at least indicate that this is not a widespread issue.

Fixes 6692.
2015-11-28 11:59:09 +01:00
Yury Delendik
e4e69e2f05 Set error font for Type3 if its loading failed. 2015-11-27 13:05:51 -06:00
Yury Delendik
8dff301ce1 Worker shall wait for MessageHandler to be created at api side. 2015-11-25 18:21:23 -06:00
Jonas Jenwald
6dfe53b976 [api-minor] Add a parameter to PDFPageProxy_getTextContent that enables replacing of all whitespace with standard spaces in the textLayer (issue 6612)
This patch goes a bit further than issue 6612 requires, and replaces all kinds of whitespace with standard spaces.

When testing this locally, it actually seemed to slightly improve two existing test-cases (`tracemonkey-text` and `taro-text`).

Fixes 6612.
2015-11-25 17:28:40 +01:00
Yury Delendik
06c1904675 Refactors FontLoader to group fonts per document. 2015-11-24 13:27:22 -06:00
Yury Delendik
09772e1e15 Creates PDFWorker, separates fetchDocument from transport. 2015-11-24 13:27:22 -06:00
Yury Delendik
acdd49f480 Adds peer communication between MessageHandlers. 2015-11-24 12:16:58 -06:00
Yury Delendik
4b243cdd89 Merge pull request #6675 from Snuffleupagus/getAnnotations-intent
[api-minor] Let `getAnnotations` fetch all annotations by default, unless an intent is specified
2015-11-24 12:11:51 -06:00
Jonas Jenwald
a2a5d36d5b Restore the data.annotationFlags parameter for annotations (PR 6672 follow-up) 2015-11-23 10:17:11 +01:00
Jonas Jenwald
b05652ca97 [api-minor] Let getAnnotations fetch all annotations by default, unless an intent is specified
Currently `getAnnotations` will *only* fetch annotations that are either `viewable` or `printable`. This is "hidden" inside the `core.js` file, meaning that API consumers might be confused as to why they are not recieving *all* the annotations present for a page.

I thus think that the API should, by default, return *all* available annotations unless specifically told otherwise. In e.g. the default viewer, we obviously only want to display annotations that are `viewable`, hence this patch adds an `intent` parameter to `getAnnotations` that makes it possible to decide if only `viewable` or `printable` annotations should be fetched.
2015-11-22 15:51:37 +01:00
Yury Delendik
0029000c9f Merge pull request #6671 from Snuffleupagus/make-stripCommentHeaders-less-gready
Make `stripCommentHeaders` less greedy, to ensure that it doesn't eat 'use strict' directive at the top of files (PR 6627 follow-up)
2015-11-22 07:24:20 -06:00
Tim van der Meij
0991c06395 Refactor annotation flags code
This patch makes it possible to set and get all possible flags that the PDF specification defines. Even though we do not support all possible annotation types and not all possible annotation flags yet, this general framework makes it easy to access all flags for each annotation such that annotation type implementations can use this information.

We add constants for all possible annotation flags such that we do not need to hardcode the flags in the code anymore. The `isViewable()` and `isPrintable()` methods are now easier to read. Additionally, unit tests have been added to ensure correct behavior.

This is another part of #5218.
2015-11-22 01:06:37 +01:00
Jonas Jenwald
373da010ac Move the globals comments in bidi.js and metadata.js to after the Copyright comments 2015-11-21 18:43:08 +01:00
Yury Delendik
2f1a626d6a Merge pull request #6640 from dsprenkels/issue-6006-radial-gradient-size
Apply transformation matrix to RadialGradient radiuses
2015-11-17 11:40:13 -06:00
Daan Sprenkels
6ce83d3290 apply transformation matrix to RadialGradient radiuses,
not only to circle origin points
fix for #6006
2015-11-17 00:20:42 +01:00
Manas
a2ba1b8189 Uses editorconfig to maintain consistent coding styles
Removes the following as they unnecessary
/* -*- Mode: Java; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
/* vim: set shiftwidth=2 tabstop=2 autoindent cindent expandtab: */
2015-11-14 07:32:18 +05:30
Jonas Jenwald
50a70429ec Ignore the /Mask entry in images unless its /ImageMask entry is explicitly set to true (issue 6621)
Fixes 6621.
2015-11-12 22:49:26 +01:00
Yury Delendik
7381ff9523 Merge pull request #6599 from prometheansacrifice/generate-better-api-docs
Generate better API documentation
2015-11-12 14:26:18 -06:00
Manas
dbcb46c8de Uses @alias to fix missing comments on JSDocs pages 2015-11-13 01:24:15 +05:30
Yury Delendik
3c6df26704 Merge pull request #6608 from Rob--W/improved-error-message-local-file
Improve error message for non-existent local files
2015-11-09 15:40:41 -06:00
Rob Wu
c604cc22d1 Improve error message for non-existent local files
I received multiple reports about the following cryptic error in the
Chrome extension when the user tried to open a local file:

> PDF.js v1.1.527 (build: 2096a2a)
> Message: Cannot read property 'Symbol(Symbol.iterator)' of null

This error most likely originated from core/stream.js:

    function Stream(arrayBuffer, start, length, dict) {
      this.bytes = (arrayBuffer instanceof Uint8Array ?
                    arrayBuffer : new Uint8Array(arrayBuffer));
                                                 ^^^^^^^^^^^
`arrayBuffer` is `null`, and that in turn is caused by the fact that
for non-existing files, there is no data. I've applied two fixes:

1. Never call onDone with a void buffer, but call the error handler
   instead.
2. Show a sensible error message for local files with status = 0.
2015-11-08 18:03:28 +01:00
Jonas Jenwald
ff64ef0243 Prevent readCmapTable from failing if the cmap is missing in TrueType fonts
Fixes http://arrow.dit.ie/cgi/viewcontent.cgi?article=1000&context=aaschadpoth#page=3.
2015-11-08 16:48:37 +01:00
Yury Delendik
bb29e13307 Merge pull request #6601 from yurydelendik/ascent
Fixes incorrect PDF file font metrics.
2015-11-06 20:16:04 -06:00
Yury Delendik
cc5bc18728 Fixes incorrect PDF file font metrics. 2015-11-06 14:47:10 -06:00
Yury Delendik
fa423cfab0 Refactors fake space heuristics for speed. 2015-11-06 10:55:43 -06:00
Yury Delendik
376f8bde14 Combines standalone divs into text groups. 2015-11-06 10:20:49 -06:00
Yury Delendik
fa46b73c47 Better spacing in text layer. 2015-11-02 08:54:15 -06:00
Yury Delendik
d26ef21d52 Merge pull request #6568 from tonyjin/api-rangeChunkSize
[api-minor] Add an optional param to DocumentInitParameters for speci…
2015-10-28 16:52:52 -05:00
Tony Jin
ef667823dd [api-minor] Add an optional param to DocumentInitParameters for specifying the range request chunk size to use. Defaults to 2^16 = 65536. 2015-10-26 17:22:11 -07:00
Jonas Jenwald
1c66d4a106 Add a totalLength getter to OperatorList, since the length is zero after flushing
In the `RenderPageRequest` handler in `worker.js`, we attempt to print an `info` message containing the rendering time and the length of the operator list. The latter is currently broken (and has been for quite some time), since the `length` of an `OperatorList` is reset when flushing occurs.
This patch attempts to rectify this, by adding a getter which keeps track of the total length.
2015-10-26 18:12:14 +01:00
Yury Delendik
58c3ea0820 Adds thread abort capabilities. 2015-10-23 09:06:32 -05:00
Yury Delendik
59c13b32aa Adds destroy method to the document loading task.
Also renames PDFPageProxy.destroy method to cleanup.
2015-10-23 08:57:14 -05:00
Jonas Jenwald
2e751199fb Prevent getOperatorList from failing to correctly parse OPS.paintXObject for TilingPatterns that are missing some /Resources entries (issue 6541)
Fixes 6541.
2015-10-21 21:30:56 +02:00
Rob Wu
50ff2d4c2a Ignore operators that are known to be unsupported
`operatorList.addOp` adds the arguments to the list which is then
passed as-is by postMessage to the main thread. But since we don't
parse these operations, they are raw PDF objects and may therefore
cause a serialization error.

This is a conservative patch, and only affects operators which are
known to be unsupported. We should ignore all unknown operators,
but I haven't really looked into the consequences of doing that.

Fixes #6549
2015-10-21 15:39:25 +02:00
Brendan Dahl
e4f0e6f2a0 Merge pull request #6531 from covlllp/new_merge
Fixes bluebeam password protection issue
2015-10-16 13:47:06 -07:00
Colin VanLang
6d8e883fe6 Fixes bluebeam password protection issue 2015-10-15 21:22:27 -04:00
Jonas Jenwald
49883439a5 Ensure that Dict_getArray doesn't fail if xref in undefined (PR 6485 follow-up)
In PR 6485 I somehow missed to account for the case where `xref` is undefined. Since a dictonary can be initialized without providing a reference to an `xref` instance, `Dict_getArray` can thus fail without this added check.
2015-10-15 11:47:07 +02:00
Brendan Dahl
3eaeacfe19 Merge pull request #6476 from Snuffleupagus/PartialEvaluator_readToUnicode-cmap-length
Right-size the `map` array in PartialEvaluator_readToUnicode
2015-10-09 10:31:28 -07:00
Jonas Jenwald
9b12c64be5 Cache the regular expression used for finding objs in XRef_indexObjects, to avoid unnecessary allocations 2015-10-02 12:46:58 +02:00
Jonas Jenwald
192907e0d2 Make XRef_indexObjects even more robust against bad PDF files, by checking for the existence of 'trailer' if 'xref' is not found
Fixes http://www.cyjack.com/cognition/Terence%20McKenna%20-%20Lectures%20on%20Alchemy.pdf.
2015-10-01 15:01:25 +02:00