Commit Graph

1055 Commits

Author SHA1 Message Date
Jonas Jenwald
41efb92d3a Merge pull request #6988 from timvandermeij/fileattachment-annotation
Implement support for FileAttachment annotations
2016-02-24 12:58:06 +01:00
Tim van der Meij
6a33dfd13a Implement support for FileAttachment annotations 2016-02-23 22:49:53 +01:00
Tim van der Meij
c53581f4e5 Merge pull request #7009 from KamiHQ/annotations-fix
[api-minor] Always expose data.title and data.contents for TextAnnotation
2016-02-22 22:59:38 +01:00
Tim van der Meij
ebe6fb2560 Merge pull request #7012 from KamiHQ/fix-annotation-popup
don't render highlight/underline/squiggly/strikeout annotations that doesn't have popup
2016-02-22 21:54:08 +01:00
Xiliang Chen
6762ff2fd6 don't render highlight/underline/squiggly/strikeout annotations that doesn't have popup 2016-02-22 13:10:20 +13:00
Xiliang Chen
266cedd960 always expose data.title and data.content 2016-02-19 13:50:25 +13:00
Yury Delendik
0d591719d9 Makes PDF data reading Streams API friendly. 2016-02-18 13:17:53 -06:00
Jonas Jenwald
a494e33776 Update JpegImage.getData to support forceRGBoutput for images with numComponents === 1 (issue 6066)
*A more robust solution for issue 6066.*

As a temporary work-around for (the upstream) [bug 1164199](https://bugzilla.mozilla.org/show_bug.cgi?id=1164199), we parsed *all* images in the Firefox addon for a short time.
Doing so uncovered an issue with our image handling (see 6066), for JPEG images with a `DeviceGray` ColorSpace *and* `bpc !== 1` (bits per component).

As long as we let the browser handle image decoding in this case, this isn't going to be an issue, but I do think that we should proactively fix this to avoid future issues if we change where the images are decoded (in `jpg.js` vs in browser).
Also, we currently don't seem to have a test-case for that kind of image data.
2016-02-18 10:12:37 +01:00
Jonas Jenwald
7cf9de2c17 [api-minor] Change getOutline to actually return the RGB color of outline items
Currently the `C` entry in an outline item is returned as is, which is neither particularly useful nor what the API documentation claims.

This patch also adds unit-tests for both the color handling, and the `F` entry (bold/italic flags).
2016-02-15 13:41:22 +01:00
Jonas Jenwald
98db068079 Reduce the overall indentation level in Catalog_readDocumentOutline, by using early returns, in order to improve readability 2016-02-14 11:38:43 +01:00
Tim van der Meij
e9a1a47d28 Merge pull request #6982 from Snuffleupagus/evaluator-remove-getAll
Remove the only remaining `Dict_getAll` usage (in evaluator.js) and the method itself
2016-02-13 20:48:37 +01:00
Tim van der Meij
addc4a3ded Merge pull request #6856 from KamiHQ/remove-has-html
move hasHtml to AnnotationElement
2016-02-13 20:12:09 +01:00
Jonas Jenwald
1ee016b005 Remove Dict_getAll since it is now unused
`Dict_getAll` is problematic for a number of reasons. First of all, as issue 6961 shows, it can be really bad for performance, since it dereferences all indirect objects.
Second of all, all the dereferencing can lead to data being unnecessarily requested when ranged/chunked loading is used, thus unnecessarily delaying rendering.

Note: For cases where `Dict_getAll` was previously used, `Dict_getKeys` in combination with `Dict_get` can be used instead. This has the advantage that data isn't requested until it's actually needed.
2016-02-12 22:32:07 +01:00
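The replacement pattern noted in commit 1ee016b005 above (`Dict_getKeys` combined with `Dict_get`) can be sketched roughly as follows. The `Ref`, `XRef` and `Dict` definitions here are simplified stand-ins written for illustration, not the actual PDF.js classes; the point is only that enumerating keys does not dereference anything, so data is fetched for the entries that are actually read.

```javascript
// Simplified stand-ins for illustration only; not the actual PDF.js classes.
function Ref(num) {
  this.num = num;
}

function XRef(objects) {
  this.objects = objects;
  this.fetchCount = 0;
}
XRef.prototype.fetch = function (ref) {
  this.fetchCount++; // with ranged/chunked loading, a fetch may request data
  return this.objects[ref.num];
};

function Dict(map, xref) {
  this.map = map;
  this.xref = xref;
}
Dict.prototype.getKeys = function () {
  return Object.keys(this.map); // no dereferencing happens here
};
Dict.prototype.get = function (key) {
  var value = this.map[key];
  return (value instanceof Ref) ? this.xref.fetch(value) : value;
};

var xref = new XRef({ 1: 'font data', 2: 'image data' });
var dict = new Dict({ Font: new Ref(1), Image: new Ref(2), Type: 'Page' },
                    xref);

// Enumerate the keys first, then resolve only the entry that is needed.
var keys = dict.getKeys();
var font = keys.indexOf('Font') >= 0 ? dict.get('Font') : null;
```

In this sketch only one indirect object is fetched, whereas a `getAll`-style helper would have fetched both.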
Jonas Jenwald
93ea866f01 Remove getAll from EvaluatorPreprocessor_read
For the operators that we currently support, the arguments are not `Dict`s, which means that it's not really necessary to use `Dict_getAll` in `EvaluatorPreprocessor_read`.
Also, I do think that if/when we support operators that use `Dict`s as arguments, that should be dealt with in the corresponding `case` in `PartialEvaluator_getOperatorList` which handles the operator.

The only reason that I can find for using `Dict_getAll` like that, is that prior to PR 6550 we would just append certain (currently unsupported) operators without doing any further processing/checking. But as issue 6549 showed, that can lead to issues in practice, which is why it was changed.

In an effort to prevent possible issues with unsupported operators, this patch simply ignores operators with `Dict` arguments in `PartialEvaluator_getOperatorList`.
2016-02-12 22:31:50 +01:00
Tim van der Meij
1f49e7b194 Merge pull request #6975 from Snuffleupagus/ColorSpace-remove-getAll
Get rid of `getAll` usage in colorspace.js
2016-02-11 14:23:10 +01:00
Jonas Jenwald
02a6b73492 Get rid of getAll usage in colorspace.js
For the `CalGray`/`CalRGB`/`Lab` colour spaces, we're currently using `getAll` to retrieve the parameters. However that's not really necessary, since we may just as well explicitly `get` the needed parameters instead.
2016-02-11 11:59:26 +01:00
Xiliang Chen
e42da0f5e9 move hasHtml to AnnotationElement 2016-02-11 13:58:17 +13:00
Jonas Jenwald
f7f60197ce Replace getAll with getKeys in loadType3Data
Not only is `getAll` less efficient, but given that we actually need the keys here, using `getKeys` seems much more suitable.
2016-02-10 20:19:14 +01:00
Jonas Jenwald
07e1ad40a2 Replace getAll with getKeys in PartialEvaluator_hasBlendModes to speed up loading of badly generated PDF files (issue 6961)
Some bad PDF generators, in particular "Scribus PDF", duplicate resources *a lot* at various levels of the PDF files. This can lead to `PartialEvaluator_hasBlendModes` taking an unreasonable amount of time to complete.
The reason is that the current code is using `Dict_getAll`, which recursively dereferences *all* indirect objects, which can be really slow. This patch instead uses `Dict_getKeys`, and then manually looks up only the necessary indirect objects.

I've added the PDF file as a `load` test. The most important thing here is probably to ensure that the file remains available in the repo, and the comment should help reduce the chance of regressions. (Note that locally, the `load` test times out without this patch, but we cannot really assume that that always happens.)

Fixes 6961.
2016-02-10 17:21:38 +01:00
Tim van der Meij
03f12a10b5 Merge pull request #6954 from Snuffleupagus/setGState-fixes
Various `setGState` improvements
2016-02-10 00:38:31 +01:00
Tim van der Meij
02b161d432 Merge pull request #6933 from brendandahl/faster-decrypt
Make type 1 font program decryption faster.
2016-02-09 23:41:22 +01:00
Jonas Jenwald
a1fe2cb443 Don't directly access the private map in setGState, and ensure that we avoid indirect objects
*This patch is based on something I noticed while debugging some of the PDF files in issue 6931.*

In a number of the cases in `setGState`, we're implicitly assuming that we're not dealing with indirect objects (i.e. `Ref`s). See e.g. the 'Font' case, or the various cases where we simply do `gStateObj.push([key, value]);` (since the code in `canvas.js` won't be able to deal with a `Ref` for those cases).

The reason that I didn't use `Dict_forEach` instead, is that it would re-introduce the unnecessary closures that PR 5205 removed.
2016-02-03 17:13:42 +01:00
Jonas Jenwald
2d4a1aa0af Actually ignore no-op setGState (PR 5192 followup)
The intention of PR 5192 was to avoid adding empty `setGState` ops to the operatorList. But the patch accidentally used `>=`, which means that it's not actually working as intended, since empty arrays always have `length === 0`.
2016-02-03 17:13:02 +01:00
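The comparison bug described in commit 2d4a1aa0af above can be illustrated with a small sketch. The helper name `addSetGStateIfNeeded` and its `buggy` switch are hypothetical and exist only to contrast the two comparisons; the surrounding PDF.js code is not reproduced here.

```javascript
// Hypothetical helper, for illustration only.
function addSetGStateIfNeeded(operatorList, gStateObj, buggy) {
  // `length >= 0` is true for every array, including empty ones,
  // so the "buggy" variant never skips anything.
  var shouldAdd = buggy ? gStateObj.length >= 0 : gStateObj.length > 0;
  if (shouldAdd) {
    operatorList.push(['setGState', gStateObj]);
  }
}

var withBug = [];
addSetGStateIfNeeded(withBug, [], true);

var fixed = [];
addSetGStateIfNeeded(fixed, [], false);
addSetGStateIfNeeded(fixed, [['LW', 2]], false);
```

With the corrected comparison, only the non-empty state ends up in the list.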
Jonas Jenwald
4770b516fe Correct the upper bound used when building the transferMap for SMasks (PR 6723 followup)
Even though the currently known test-cases render correctly without this patch, that seems more like a lucky coincidence, given that there's no guarantee that `transferMap[255] === 0` for every possible transfer function.
2016-02-03 13:41:10 +01:00
Jonas Jenwald
992472fd38 Ensure that we don't modify the Dict data when the Differences array of a font contains indirect objects
This patch fixes an issue that I inadvertently introduced in PR 5815, where we accidentally modify the `Differences` array in the encoding dictionary for indirect objects.

Instead of this change, we could also have used the now existing `Dict_getArray`. However in this case I don't think that would have been a good idea, since it would mean iterating through the array *twice*.
2016-01-30 13:31:24 +01:00
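One way to read the fix in commit 992472fd38 above is that indirect entries are resolved into a local variable during a single pass, instead of being written back into the stored array. The following is a minimal sketch under that assumption; `Ref`, `readDifferences` and the `fetch` callback are hypothetical names, and glyph names are plain strings here rather than PDF.js `Name` objects.

```javascript
// Simplified stand-in for an indirect object reference.
function Ref(num) {
  this.num = num;
}

// Builds a code -> glyph name table from a Differences-style array,
// resolving references without touching the input array.
function readDifferences(diffEncoding, fetch) {
  var differences = [];
  var index = 0;
  for (var i = 0; i < diffEncoding.length; i++) {
    var data = diffEncoding[i];
    if (data instanceof Ref) {
      data = fetch(data); // resolved into a local; the array is not modified
    }
    if (typeof data === 'number') {
      index = data; // a number sets the next character code
    } else {
      differences[index++] = data; // a name is assigned to the current code
    }
  }
  return differences;
}

var stored = [65, new Ref(7), 'B'];
var result = readDifferences(stored, function (ref) {
  return ref.num === 7 ? 'A' : null;
});
```

After the call, `stored[1]` is still a `Ref`, so later readers of the dictionary see the original data.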
Brendan Dahl
02331f6e33 Make type 1 font program decryption faster.
Discard the values first so we don't have to slice the array.
2016-01-29 11:10:30 -08:00
Jeff Walden
4691a4a1e4 Adjust a comment discussing transferred ArrayBuffers to refer to those buffers being detached, not neutered. This change makes the comment consistent with terminology used in the ECMAScript specification. 2016-01-28 14:52:07 -08:00
Yury Delendik
825a2225ab Merge pull request #6915 from yurydelendik/lookuptables
Refactor lookup hash tables/objects
2016-01-28 15:01:06 -06:00
Yury Delendik
2edf2792dc Replaces literal {} created lookup tables with Object.create 2016-01-28 12:18:38 -06:00
Yury Delendik
d6adf84159 Lazify OP_MAP. 2016-01-28 12:18:37 -06:00
Yury Delendik
1de90454b7 Lazify Metrics 2016-01-28 12:11:46 -06:00
Yury Delendik
55a201d92d Lazify NormalizedUnicodes 2016-01-28 11:56:42 -06:00
Yury Delendik
d0738d7e24 Lazify stdFontMap, serifFonts, GlyphMapForStandardFonts 2016-01-28 11:51:54 -06:00
Yury Delendik
1a9a665adf Refactor Encodings 2016-01-28 11:32:59 -06:00
Yury Delendik
4ef20de429 Lazify GlyphsUnicode. 2016-01-28 11:32:59 -06:00
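The "Lazify" and `Object.create` commits above carry no description, so the following is only a guess at the general idea: build a large lookup table on first use, and create it without a prototype so that inherited keys such as `constructor` cannot be mistaken for entries. The name `makeLazyTable` and the sample data are hypothetical.

```javascript
// Hypothetical helper: returns a getter that builds the table on first call.
function makeLazyTable(initializer) {
  var table = null;
  return function getTable() {
    if (table === null) {
      table = Object.create(null); // no inherited keys from Object.prototype
      initializer(table);
    }
    return table;
  };
}

var initCount = 0;
var getGlyphs = makeLazyTable(function (t) {
  initCount++;
  t['A'] = 0x41;
  t['space'] = 0x20;
});
```

Nothing is allocated until `getGlyphs()` is first called, and repeated calls return the same table.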
Jonas Jenwald
1140a34f5c [api-minor] Change getPageLabels to always return the pageLabels, even if they are identical to standard page numbering 2016-01-27 13:36:03 +01:00
Jonas Jenwald
15ce96a6eb Prevent failures in the "scanning for endstream" code, in Parser_makeStream, by handling the case where 'endstream' is split between contiguous chunks (issue 1536) 2016-01-26 09:03:51 +01:00
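The situation named in commit 15ce96a6eb above, a marker split between contiguous chunks, is commonly handled by carrying the tail of each chunk over into the next search. The sketch below shows that generic technique on strings; it is not the code in `Parser_makeStream`, and `findAcrossChunks` is a hypothetical name.

```javascript
// Returns the absolute position of `signature` in the concatenated chunks,
// or -1 if it does not occur. Works even when the signature straddles a
// chunk boundary.
function findAcrossChunks(chunks, signature) {
  var carry = '';
  var offset = 0; // absolute position of the first character of `carry`
  for (var i = 0; i < chunks.length; i++) {
    var text = carry + chunks[i];
    var pos = text.indexOf(signature);
    if (pos >= 0) {
      return offset + pos;
    }
    // Keep just enough characters to complete a partially seen signature.
    var keep = Math.min(text.length, signature.length - 1);
    offset += text.length - keep;
    carry = text.slice(text.length - keep);
  }
  return -1;
}

var splitPos = findAcrossChunks(['abc ends', 'tream xyz'], 'endstream');
```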
Tim van der Meij
58329f7f92 Merge pull request #6803 from Snuffleupagus/page-labels
[api-minor] Add support for PageLabels in the API
2016-01-20 22:05:48 +01:00
Yury Delendik
0aa373cdf3 Merge pull request #6891 from Snuffleupagus/issue-6889
Map missing glyphs to the `notdef` glyph for TrueType (3, 1) fonts regardless if the 'post' table is defined or not (issue 6889)
2016-01-20 13:14:47 -06:00
Jonas Jenwald
85cf90643f [api-minor] Add support for PageLabels in the API 2016-01-19 22:49:04 +01:00
Jonas Jenwald
8ad18959d7 Add support for NumberTree 2016-01-19 22:47:45 +01:00
Tim van der Meij
1eea0db897 Merge pull request #6822 from Snuffleupagus/urls-in-outline
[api-minor] Add support for URLs in the document outline
2016-01-19 22:21:40 +01:00
Jonas Jenwald
0030a82dc3 [api-minor] Add support for URLs in the document outline
Re: issue 5089.
(Note that since there are other outline features that we currently don't support, e.g. bold/italic text and custom colours, I think we can keep the referenced issue open.)
2016-01-19 21:36:27 +01:00
Jonas Jenwald
4855d4cc9f Map missing glyphs to the notdef glyph for TrueType (3, 1) fonts regardless if the 'post' table is defined or not (issue 6889) 2016-01-17 22:58:00 +01:00
Jonas Jenwald
d52495a9c8 [TrueType] Recover from a missing "glyf" table by replacing it with dummy data, utilizing the existing code in sanitizeGlyphLocations
It seems to be fairly common for OCR software to include incomplete TrueType fonts, notably missing the "glyf" table, in PDF files. Since we currently reject such fonts, the result is that text-selection/copying is broken.

This patch contains a suggested approach to try and use these kinds of broken fonts, by using existing code in `sanitizeGlyphLocations` to replace a missing "glyf" table with dummy data.

Fixes 4684.
Fixes 6007.
Fixes 6829.
2016-01-15 21:44:59 +01:00
Brendan Dahl
3057b69e45 Merge pull request #6839 from Snuffleupagus/issue-6782
Check that CIDFontType0 fonts does not actually contain OpenType font files (issue 6782)
2016-01-11 08:56:48 -08:00
Daan Sprenkels
90ec2c9294 shading-pattern: Decreased Shadings.SMALL_NUMBER
and added a test case for #6298
2016-01-06 15:26:40 +01:00
Jonas Jenwald
896e390285 Check that CIDFontType0 fonts does not actually contain OpenType font files (issue 6782)
*This patch follows a similar idea as PR 5756.*

The patch is based on the nice debugging done by Brendan in the referenced issue 6782.
A better way to handle this, and similar issues, would probably be to completely ignore what the PDF file claims about font type/subtype, and just check the actual data. But until that kind of rewrite happens, this patch should help.

Fixes 6782.
2016-01-06 02:19:02 +01:00
Brendan Dahl
eb7c36beb6 Add validation for callsubr and callgsubr for type 2 charstrings. 2016-01-05 09:54:25 -08:00
Tim van der Meij
6ef7120a04 Implement support for Highlight annotations 2016-01-01 15:31:46 +01:00
Tim van der Meij
34918a6666 Implement support for Squiggly annotations 2015-12-30 19:37:04 +01:00
Jonas Jenwald
d956177482 Merge pull request #6819 from timvandermeij/strikeout-annotation
Implement support for StrikeOut annotations
2015-12-30 14:44:50 +01:00
Tim van der Meij
e8db825512 Merge pull request #6771 from yurydelendik/requirejs
Removes hardcoded module loading order
2015-12-30 00:37:32 +01:00
Yury Delendik
b8e7efaaa1 Merge pull request #6821 from yurydelendik/bug951051
Bug 951051 - Better crypto key length recovery.
2015-12-29 15:35:15 -06:00
Yury Delendik
c991480687 Better crypto key length recovery. 2015-12-29 15:10:38 -06:00
Yury Delendik
fc3282db56 Adds RequireJS to worker. 2015-12-29 09:20:52 -06:00
Tim van der Meij
c5f4b9750e Implement support for StrikeOut annotations 2015-12-29 15:09:28 +01:00
Tim van der Meij
cd28dd34fe Implement support for Underline annotations 2015-12-28 00:33:41 +01:00
Tim van der Meij
7d43971f54 Implement support for Popup annotations
Most code for Popup annotations is already present for Text annotations.
This patch extracts the popup creation logic from the Text annotation
code so it can be reused for Popup annotations.

Not only does this add support for Popup annotations, the Text
annotation code is also considerably simpler. If a `Popup` entry is
available for a Text annotation, it will not be more than an image. The
popup will be handled by the Popup annotation. However, it is also
possible for Text annotations to not have a separate Popup annotation,
in which case the Text annotation handles the popup creation itself.
2015-12-25 13:17:21 +01:00
Yury Delendik
79c2f69c32 Adds/modifies examples for node.js and webpack. 2015-12-21 13:46:50 -06:00
Tim van der Meij
df81b832bb Remove unused variables 2015-12-16 23:52:16 +01:00
Yury Delendik
b084dc09ee Allows requirejs and node load fake worker files. 2015-12-15 13:24:39 -06:00
Yury Delendik
6b60c8f4db Adds UMD headers to core, display and shared files. 2015-12-15 13:24:39 -06:00
Tim van der Meij
8d36aad30a Implement constants for all annotation types
Now we have a full list of all possible annotation types and the
numbering corresponds to the order in the specification. Not only is
this more consistent and complete, it also prevents having to add these
constants when a new annotation type is implemented.

Additionally fix an issue where a regular Widget annotation would not
have `data.annotationType` set. It was only set for a
TextWidgetAnnotation, but instead move it to the base Widget annotation
class to add it for all Widget annotations (since TextWidgetAnnotation
inherits from WidgetAnnotation it will have it too).
2015-12-15 15:23:55 +01:00
Jonas Jenwald
ee0d522187 Use adjustWidths for TrueType fonts if we handle them as OpenType (issue 5027, issue 5084, issue 6556, bug 1204903)
In `Font_checkAndRepair` we can decide that a font isn't TrueType, and instead parse it as CFF. In that case it's quite possible that the `fontMatrix` will be changed, and without calling `adjustWidths` we're failing to update the glyph widths correctly.

Fixes 5027.
Fixes 5084.
Fixes 6556.
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1204903.
2015-12-08 00:49:22 +01:00
Jonas Jenwald
084bb8682f Merge pull request #6723 from yurydelendik/smask-transfer
Adds transfer function support for SMask.
2015-12-05 22:41:56 +01:00
Jonas Jenwald
4810b7b8fc Fix the charCodeOf method in IdentityToUnicodeMap in order to prevent text selection from breaking
After PR 6590, `font.spaceWidth` is now called in more cases than before (in `PartialEvaluator_getTextContent`), which exposed an underlying issue with `IdentityToUnicodeMap_charCodeOf` throwing an error.
This breaks text-selection in some PDF files found in the wild, hence this patch replaces the `error` with an actual function instead (modelled after `IdentityCMap_charCodeOf`).
2015-12-05 13:15:55 +01:00
Yury Delendik
15c9969abe Adds transfer function support for SMask. 2015-12-04 12:52:45 -06:00
Brendan Dahl
87762afec4 Remove glyph id's outside the range of valid glyphs.
OTS does not like invalid glyph ids in a cmap table.
2015-12-03 11:53:06 -08:00
Brendan Dahl
376788f2b2 Merge pull request #6698 from yurydelendik/rm-UnsupportedManager
[api-minor] Replaces UnsupportedManager with callback.
2015-12-02 10:57:44 -08:00
Jonas Jenwald
5f56a20b34 Merge pull request #6701 from timvandermeij/pdf-manager-inherit
Make use of `Util.inherit` in `src/core/pdf_manager.js`
2015-12-01 19:53:43 +01:00
Yury Delendik
4a82f2f5fd Merge pull request #6695 from Snuffleupagus/issue-6692
Ensure that `Lexer_getName` does not fail if a `Name` contains in invalid usage of the NUMBER SIGN (#) (issue 6692)
2015-12-01 10:25:57 -06:00
Yury Delendik
c9cb6a3025 Replaces UnsupportedManager with callback. 2015-11-30 14:42:47 -06:00
Tim van der Meij
0c41866433 Make use of Util.inherit in src/core/pdf_manager.js
While we are here, fix some incorrect function names.
2015-11-29 00:58:19 +01:00
Tim van der Meij
8b79becad6 Improve code structure of the annotation code
This patch improves the code structure of the annotation code.

- Create the annotation border style object in the `setBorderStyle` method instead of in the constructor. The behavior is the same as the `setBorderStyle` method is always called, thus a border style object is still always available.
- Put all data object manipulation lines in one block in the constructor. This improves readability and maintainability as it is more visible which properties are exposed.
- Simplify `appendToOperatorList` by removing the promise capability and removing an unused parameter.
- Remove some unnecessary newlines/spaces.
2015-11-29 00:04:21 +01:00
Jonas Jenwald
995e1a45b8 Ensure that Lexer_getName does not fail if a Name contains in invalid usage of the NUMBER SIGN (#) (issue 6692)
*This is a regression from PR 3424.*

The PDF file in the referenced issue is using `Type3` fonts. In one of those, the `/CharProcs` dictionary contains an entry with the name `/#`. Before the changes to `Lexer_getName` in PR 3424, we were allowing certain invalid `Name` patterns containing the NUMBER SIGN (#).

It's unfortunate that this has been broken for close to two and a half years before the bug surfaced, but it should at least indicate that this is not a widespread issue.

Fixes 6692.
2015-11-28 11:59:09 +01:00
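The tolerant behaviour described in commit 995e1a45b8 above can be sketched as a small decoder. In PDF names, `#` followed by two hexadecimal digits encodes a character; the assumption made here is that an invalid `#` is simply kept as a literal character instead of causing a failure. `decodeName` is a hypothetical function, not `Lexer_getName` itself.

```javascript
// Decodes the part of a PDF name after the leading slash.
function decodeName(raw) {
  var out = '';
  for (var i = 0; i < raw.length; i++) {
    var ch = raw[i];
    if (ch === '#') {
      var hex = raw.slice(i + 1, i + 3);
      if (/^[0-9A-Fa-f]{2}$/.test(hex)) {
        out += String.fromCharCode(parseInt(hex, 16));
        i += 2; // skip the two hex digits
        continue;
      }
      // Invalid escape: fall through and keep the NUMBER SIGN literally.
    }
    out += ch;
  }
  return out;
}

var decodedSpace = decodeName('A#20B');
var loneHash = decodeName('#');
```

A name such as `/#`, as in the `/CharProcs` entry mentioned above, then decodes to a plain `#`.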
Yury Delendik
e4e69e2f05 Set error font for Type3 if its loading failed. 2015-11-27 13:05:51 -06:00
Yury Delendik
8dff301ce1 Worker shall wait for MessageHandler to be created at api side. 2015-11-25 18:21:23 -06:00
Jonas Jenwald
6dfe53b976 [api-minor] Add a parameter to PDFPageProxy_getTextContent that enables replacing of all whitespace with standard spaces in the textLayer (issue 6612)
This patch goes a bit further than issue 6612 requires, and replaces all kinds of whitespace with standard spaces.

When testing this locally, it actually seemed to slightly improve two existing test-cases (`tracemonkey-text` and `taro-text`).

Fixes 6612.
2015-11-25 17:28:40 +01:00
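The whitespace replacement described in commit 6dfe53b976 above amounts to something like the following sketch. The function name is hypothetical, and the regular expression is an assumption about what "all kinds of whitespace" covers; JavaScript's `\s` class includes tabs, line breaks and the no-break space.

```javascript
// Replaces every whitespace character with a standard space (U+0020).
function replaceWhitespace(str) {
  return str.replace(/\s/g, ' ');
}

var normalized = replaceWhitespace('a\tb\u00A0c\nd');
```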
Yury Delendik
06c1904675 Refactors FontLoader to group fonts per document. 2015-11-24 13:27:22 -06:00
Yury Delendik
09772e1e15 Creates PDFWorker, separates fetchDocument from transport. 2015-11-24 13:27:22 -06:00
Yury Delendik
acdd49f480 Adds peer communication between MessageHandlers. 2015-11-24 12:16:58 -06:00
Yury Delendik
4b243cdd89 Merge pull request #6675 from Snuffleupagus/getAnnotations-intent
[api-minor] Let `getAnnotations` fetch all annotations by default, unless an intent is specified
2015-11-24 12:11:51 -06:00
Jonas Jenwald
a2a5d36d5b Restore the data.annotationFlags parameter for annotations (PR 6672 follow-up) 2015-11-23 10:17:11 +01:00
Jonas Jenwald
b05652ca97 [api-minor] Let getAnnotations fetch all annotations by default, unless an intent is specified
Currently `getAnnotations` will *only* fetch annotations that are either `viewable` or `printable`. This is "hidden" inside the `core.js` file, meaning that API consumers might be confused as to why they are not receiving *all* the annotations present for a page.

I thus think that the API should, by default, return *all* available annotations unless specifically told otherwise. In e.g. the default viewer, we obviously only want to display annotations that are `viewable`, hence this patch adds an `intent` parameter to `getAnnotations` that makes it possible to decide if only `viewable` or `printable` annotations should be fetched.
2015-11-22 15:51:37 +01:00
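The behaviour proposed in commit b05652ca97 above can be sketched as a filter that only narrows the result when an intent is given. The intent values `'display'` and `'print'`, the function name, and the shape of the annotation objects are assumptions made for this example.

```javascript
// Hypothetical filter: returns all annotations unless an intent is given.
function filterAnnotations(annotations, intent) {
  return annotations.filter(function (annotation) {
    if (intent === 'display') {
      return annotation.viewable;
    }
    if (intent === 'print') {
      return annotation.printable;
    }
    return true; // no intent specified: return everything
  });
}

var sample = [
  { id: 1, viewable: true, printable: false },
  { id: 2, viewable: false, printable: true },
  { id: 3, viewable: false, printable: false }
];
var all = filterAnnotations(sample);
var forDisplay = filterAnnotations(sample, 'display');
```

A viewer would pass the display intent, while an API consumer inspecting a page could omit it.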
Yury Delendik
0029000c9f Merge pull request #6671 from Snuffleupagus/make-stripCommentHeaders-less-gready
Make `stripCommentHeaders` less greedy, to ensure that it doesn't eat 'use strict' directive at the top of files (PR 6627 follow-up)
2015-11-22 07:24:20 -06:00
Tim van der Meij
0991c06395 Refactor annotation flags code
This patch makes it possible to set and get all possible flags that the PDF specification defines. Even though we do not support all possible annotation types and not all possible annotation flags yet, this general framework makes it easy to access all flags for each annotation such that annotation type implementations can use this information.

We add constants for all possible annotation flags such that we do not need to hardcode the flags in the code anymore. The `isViewable()` and `isPrintable()` methods are now easier to read. Additionally, unit tests have been added to ensure correct behavior.

This is another part of #5218.
2015-11-22 01:06:37 +01:00
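A rough sketch of the flag handling described in commit 0991c06395 above follows. The bit positions are those the PDF specification assigns to annotation flags; the constant names and the exact `isViewable`/`isPrintable` rules are assumptions for illustration and may differ from the code in the patch.

```javascript
// Annotation flag bits as defined by the PDF specification.
var AnnotationFlag = {
  INVISIBLE: 0x01,
  HIDDEN: 0x02,
  PRINT: 0x04,
  NOZOOM: 0x08,
  NOROTATE: 0x10,
  NOVIEW: 0x20,
  READONLY: 0x40,
  LOCKED: 0x80,
  TOGGLENOVIEW: 0x100,
  LOCKEDCONTENTS: 0x200
};

function hasFlag(flags, flag) {
  return (flags & flag) !== 0;
}

// Assumed rule: viewable unless explicitly invisible, hidden or no-view.
function isViewable(flags) {
  return !hasFlag(flags, AnnotationFlag.INVISIBLE) &&
         !hasFlag(flags, AnnotationFlag.HIDDEN) &&
         !hasFlag(flags, AnnotationFlag.NOVIEW);
}

// Assumed rule: printable only if PRINT is set and it is not hidden.
function isPrintable(flags) {
  return hasFlag(flags, AnnotationFlag.PRINT) &&
         !hasFlag(flags, AnnotationFlag.INVISIBLE) &&
         !hasFlag(flags, AnnotationFlag.HIDDEN);
}
```

Named constants keep checks such as `isPrintable(AnnotationFlag.PRINT | AnnotationFlag.HIDDEN)` readable compared with hardcoded numbers.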
Jonas Jenwald
373da010ac Move the globals comments in bidi.js and metadata.js to after the Copyright comments 2015-11-21 18:43:08 +01:00
Yury Delendik
2f1a626d6a Merge pull request #6640 from dsprenkels/issue-6006-radial-gradient-size
Apply transformation matrix to RadialGradient radiuses
2015-11-17 11:40:13 -06:00
Daan Sprenkels
6ce83d3290 apply transformation matrix to RadialGradient radiuses,
not only to circle origin points
fix for #6006
2015-11-17 00:20:42 +01:00
Manas
a2ba1b8189 Uses editorconfig to maintain consistent coding styles
Removes the following as they are unnecessary
/* -*- Mode: Java; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
/* vim: set shiftwidth=2 tabstop=2 autoindent cindent expandtab: */
2015-11-14 07:32:18 +05:30
Jonas Jenwald
50a70429ec Ignore the /Mask entry in images unless its /ImageMask entry is explicitly set to true (issue 6621)
Fixes 6621.
2015-11-12 22:49:26 +01:00
Yury Delendik
7381ff9523 Merge pull request #6599 from prometheansacrifice/generate-better-api-docs
Generate better API documentation
2015-11-12 14:26:18 -06:00
Manas
dbcb46c8de Uses @alias to fix missing comments on JSDocs pages 2015-11-13 01:24:15 +05:30
Yury Delendik
3c6df26704 Merge pull request #6608 from Rob--W/improved-error-message-local-file
Improve error message for non-existent local files
2015-11-09 15:40:41 -06:00
Rob Wu
c604cc22d1 Improve error message for non-existent local files
I received multiple reports about the following cryptic error in the
Chrome extension when the user tried to open a local file:

> PDF.js v1.1.527 (build: 2096a2a)
> Message: Cannot read property 'Symbol(Symbol.iterator)' of null

This error most likely originated from core/stream.js:

    function Stream(arrayBuffer, start, length, dict) {
      this.bytes = (arrayBuffer instanceof Uint8Array ?
                    arrayBuffer : new Uint8Array(arrayBuffer));
                                                 ^^^^^^^^^^^
`arrayBuffer` is `null`, and that in turn is caused by the fact that
for non-existing files, there is no data. I've applied two fixes:

1. Never call onDone with a void buffer, but call the error handler
   instead.
2. Show a sensible error message for local files with status = 0.
2015-11-08 18:03:28 +01:00
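The two fixes listed in commit c604cc22d1 above can be sketched together as a single guard. Everything here is hypothetical (the function name, its parameters and the message texts); it only illustrates routing a missing buffer to the error handler instead of passing `null` on to `Stream`.

```javascript
// Hypothetical completion handler for a network or file request.
function handleLoadResult(status, buffer, isLocalFile, onDone, onError) {
  if (!buffer) {
    // Never call onDone without data; report an error instead.
    var message = (status === 0 && isLocalFile) ?
      'The local file could not be read (it may not exist).' :
      'No data was received (status ' + status + ').';
    onError(new Error(message));
    return;
  }
  onDone(buffer);
}

var calls = [];
handleLoadResult(0, null, true,
  function () { calls.push('done'); },
  function (e) { calls.push('error: ' + e.message); });
```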
Jonas Jenwald
ff64ef0243 Prevent readCmapTable from failing if the cmap is missing in TrueType fonts
Fixes http://arrow.dit.ie/cgi/viewcontent.cgi?article=1000&context=aaschadpoth#page=3.
2015-11-08 16:48:37 +01:00
Yury Delendik
bb29e13307 Merge pull request #6601 from yurydelendik/ascent
Fixes incorrect PDF file font metrics.
2015-11-06 20:16:04 -06:00
Yury Delendik
cc5bc18728 Fixes incorrect PDF file font metrics. 2015-11-06 14:47:10 -06:00
Yury Delendik
fa423cfab0 Refactors fake space heuristics for speed. 2015-11-06 10:55:43 -06:00
Yury Delendik
376f8bde14 Combines standalone divs into text groups. 2015-11-06 10:20:49 -06:00
Yury Delendik
fa46b73c47 Better spacing in text layer. 2015-11-02 08:54:15 -06:00
Yury Delendik
d26ef21d52 Merge pull request #6568 from tonyjin/api-rangeChunkSize
[api-minor] Add an optional param to DocumentInitParameters for speci…
2015-10-28 16:52:52 -05:00
Tony Jin
ef667823dd [api-minor] Add an optional param to DocumentInitParameters for specifying the range request chunk size to use. Defaults to 2^16 = 65536. 2015-10-26 17:22:11 -07:00
Jonas Jenwald
1c66d4a106 Add a totalLength getter to OperatorList, since the length is zero after flushing
In the `RenderPageRequest` handler in `worker.js`, we attempt to print an `info` message containing the rendering time and the length of the operator list. The latter is currently broken (and has been for quite some time), since the `length` of an `OperatorList` is reset when flushing occurs.
This patch attempts to rectify this, by adding a getter which keeps track of the total length.
2015-10-26 18:12:14 +01:00
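The getter described in commit 1c66d4a106 above could look roughly like the sketch below: a running total is updated whenever the list is flushed, so the overall count survives the reset of `length`. This `OperatorList` is a simplified stand-in with assumed field names, not the PDF.js class.

```javascript
// Simplified stand-in for an operator list that is flushed in batches.
function OperatorList() {
  this.fnArray = [];
  this.argsArray = [];
  this._totalLength = 0; // operators already flushed
}
OperatorList.prototype = {
  get length() {
    return this.argsArray.length;
  },
  get totalLength() {
    return this._totalLength + this.length;
  },
  addOp: function (fn, args) {
    this.fnArray.push(fn);
    this.argsArray.push(args);
  },
  flush: function () {
    this._totalLength += this.length;
    this.fnArray.length = 0;
    this.argsArray.length = 0;
  }
};

var opList = new OperatorList();
opList.addOp('save', null);
opList.addOp('moveTo', [0, 0]);
opList.addOp('restore', null);
opList.flush();
opList.addOp('save', null);
opList.addOp('restore', null);
```

After the flush, `opList.length` only counts the two pending operators, while `opList.totalLength` counts all five.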
Yury Delendik
58c3ea0820 Adds thread abort capabilities. 2015-10-23 09:06:32 -05:00
Yury Delendik
59c13b32aa Adds destroy method to the document loading task.
Also renames PDFPageProxy.destroy method to cleanup.
2015-10-23 08:57:14 -05:00
Jonas Jenwald
2e751199fb Prevent getOperatorList from failing to correctly parse OPS.paintXObject for TilingPatterns that are missing some /Resources entries (issue 6541)
Fixes 6541.
2015-10-21 21:30:56 +02:00
Rob Wu
50ff2d4c2a Ignore operators that are known to be unsupported
`operatorList.addOp` adds the arguments to the list which is then
passed as-is by postMessage to the main thread. But since we don't
parse these operations, they are raw PDF objects and may therefore
cause a serialization error.

This is a conservative patch, and only affects operators which are
known to be unsupported. We should ignore all unknown operators,
but I haven't really looked into the consequences of doing that.

Fixes #6549
2015-10-21 15:39:25 +02:00
Brendan Dahl
e4f0e6f2a0 Merge pull request #6531 from covlllp/new_merge
Fixes bluebeam password protection issue
2015-10-16 13:47:06 -07:00
Colin VanLang
6d8e883fe6 Fixes bluebeam password protection issue 2015-10-15 21:22:27 -04:00
Jonas Jenwald
49883439a5 Ensure that Dict_getArray doesn't fail if xref is undefined (PR 6485 follow-up)
In PR 6485 I somehow failed to account for the case where `xref` is undefined. Since a dictionary can be initialized without providing a reference to an `xref` instance, `Dict_getArray` can thus fail without this added check.
2015-10-15 11:47:07 +02:00
Brendan Dahl
3eaeacfe19 Merge pull request #6476 from Snuffleupagus/PartialEvaluator_readToUnicode-cmap-length
Right-size the `map` array in PartialEvaluator_readToUnicode
2015-10-09 10:31:28 -07:00
Jonas Jenwald
9b12c64be5 Cache the regular expression used for finding objs in XRef_indexObjects, to avoid unnecessary allocations 2015-10-02 12:46:58 +02:00
Jonas Jenwald
192907e0d2 Make XRef_indexObjects even more robust against bad PDF files, by checking for the existence of 'trailer' if 'xref' is not found
Fixes http://www.cyjack.com/cognition/Terence%20McKenna%20-%20Lectures%20on%20Alchemy.pdf.
2015-10-01 15:01:25 +02:00
Tim van der Meij
1bdfc47de8 Merge pull request #6411 from Snuffleupagus/remove-Parser_fetchIfRef
Remove `Parser_fetchIfRef` since it's obsolete
2015-09-30 00:38:35 +02:00
Jonas Jenwald
1b8cb52555 Prevent PartialEvaluator_buildFormXObject from failing if the Matrix or BBox contains indirect objects
This patch fixes yet another instance of bad PDF data, specifically a case where the `BBox` array contains indirect objects (i.e. `Ref`s).

Fixes the missing image in http://www.int.washington.edu/talks/WorkShops/int_08_37W/People/Franz_M/Franz.pdf#page=24. *Note:* There are missing images on a number of the pages in that file.
2015-09-29 10:11:49 +02:00
Jonas Jenwald
75557d27d1 Add getArray method to Dict
This method extends `get`, and will fetch all indirect objects (i.e. `Ref`s) when the result is an `Array`.
2015-09-29 10:11:47 +02:00
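A `getArray` method along the lines of commit 75557d27d1 above might be sketched as follows, including a guard for a dictionary created without an `xref`. The `Ref`, `XRef` and `Dict` definitions are simplified stand-ins, and copying the array before resolving its entries is a choice made for this example rather than something the commit message states.

```javascript
// Simplified stand-ins for illustration only; not the actual PDF.js classes.
function Ref(num) {
  this.num = num;
}
function XRef(objects) {
  this.objects = objects;
}
XRef.prototype.fetch = function (ref) {
  return this.objects[ref.num];
};

function Dict(map, xref) {
  this.map = map;
  this.xref = xref;
}
Dict.prototype.get = function (key) {
  var value = this.map[key];
  return (this.xref && value instanceof Ref) ? this.xref.fetch(value) : value;
};
// Like `get`, but also resolves references found inside an array result.
Dict.prototype.getArray = function (key) {
  var value = this.get(key);
  if (!Array.isArray(value) || !this.xref) {
    return value;
  }
  value = value.slice(); // work on a copy so the stored data is untouched
  for (var i = 0; i < value.length; i++) {
    if (value[i] instanceof Ref) {
      value[i] = this.xref.fetch(value[i]);
    }
  }
  return value;
};

var xref = new XRef({ 5: 612, 6: 792 });
var form = new Dict({ BBox: [0, 0, new Ref(5), new Ref(6)] }, xref);
var bbox = form.getArray('BBox');
```

Here `bbox` contains only numbers, which is the form a consumer of a `BBox` entry would expect.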
Jonas Jenwald
8d831449ab Right-size the map array in PartialEvaluator_readToUnicode
We can avoid a lot of intermediate resizings, by directly allocating the required number of elements for the `map` array.
2015-09-24 13:08:53 +02:00
Fabian Lange
2564827503 Fix text spacing with vertical fonts (#6387)
According to the PDF spec 5.3.2, a positive value means that the next
glyph is placed further to the left in horizontal mode (so narrower),
and further down in vertical mode (so wider).
This change fixes the way PDF.js has interpreted the value.
2015-09-15 09:28:45 +02:00
Tim van der Meij
12b0b9744b Merge pull request #6427 from Snuffleupagus/slightly-more-robust-get-fingerprint
Make `get fingerprint` slightly more robust against corrupt PDF files
2015-09-10 22:07:44 +02:00
Jonas Jenwald
5853553455 Make get fingerprint slightly more robust against corrupt PDF files
This patch adjusts `get fingerprint` to also check that the `/ID` entry contains (non-empty) strings, to prevent more possible failures when loading corrupt PDF files (follow-up to PR 5602).

Note that I've not actually encountered such a PDF file in the wild. However given that `stringToBytes` will assert that the input is a string, and that we'll thus fail to load a document unless `get fingerprint` succeeds, making this more robust seems like a good idea to me.
2015-09-08 13:42:53 +02:00
Jonas Jenwald
29a1cdb6a6 Only choose a (3, 1) cmap table for TrueType fonts that have an encoding specified (issue 6410)
For (1, 0) cmaps, we have two different codepaths depending on whether the font has/hasn't got an encoding. But with (3, 1) cmaps we don't have a good fallback when the encoding is missing, hence this patch changes `readCmapTable` to only choose a (3, 1) cmap table if the font is non-symbolic *and* an encoding exists. Without this, we'll not be able to successfully create a working glyph map for some TrueType fonts with (3, 1) cmap tables.

Fixes 6410.
2015-09-07 16:56:05 +02:00
Jonas Jenwald
b1d148a4aa Remove Parser_fetchIfRef since it's obsolete
This code was added in PR 1214, but was made obsolete by PRs 1488/1493. Prior to the latter ones, `Dict_get` returned the raw objects. However, afterwards (and currently) `Dict_get` now resolves indirect objects, which makes `Parser_fetchIfRef` redundant.

*Potential risks with this patch:*
This patch passes all tests locally, but there's a *small* possibility that it could break some weird PDF files.
In the current code, wrapping `Dict_get` inside `Parser_fetchIfRef` will potentially mean two back-to-back calls of `XRef_fetch`, if a reference points directly to another reference. I'm not sure if this can actually happen in practice, and I'd think that if that were the case we'd already have run into it elsewhere in the code-base, given that `Parser` is the only place where we try to "double" resolve references.
2015-09-02 23:11:00 +02:00
Jonas Jenwald
0fb31a4a9e Fallback in readCmapTable, instead of using error, for TrueType fonts with unsupported cmap formats (bug 1200096)
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1200096.

The problematic font has a `format 2` cmap, which we've never supported properly. Prior to PR 2606, we were able to fall back to a working state, despite not having proper support for that cmap format.

Obviously the best/correct solution would be to implement actual support for more cmap formats[1]. However, I'm hoping that a simple patch will be OK for now, given that:
 - `format 2` cmaps seem to be quite rare in practice, since this has been broken for 2.5 years before anyone noticed.
 - Having a simple patch will make potential uplifts a lot easier.

[1] See the specification at https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6cmap.html
2015-09-01 14:01:19 +02:00
Tim van der Meij
0020f33873 Merge pull request #6357 from Snuffleupagus/bidi-result
Avoid more allocations for RTL text in bidi.js
2015-09-01 00:44:33 +02:00
Tim van der Meij
b42b894570 Merge pull request #6386 from Snuffleupagus/Parser_makeFilter-warn-on-empty-stream
Add a warning when we encounter an empty stream in `Parser_makeFilter`
2015-08-30 23:14:22 +02:00
Rob Wu
582573b96b Merge pull request #6358 from Snuffleupagus/Parser_tryShift-missingDataException
Don't catch `MissingDataException` in `Parser_tryShift`
2015-08-27 14:46:24 +02:00
Jonas Jenwald
f814fdc215 Add a warning when we encounter an empty stream in Parser_makeFilter
Having a warning here would have meant that issue 6360 could have been solved in approximately five minutes, instead of an hour. To avoid that happening again, this patch adds a warning whenever we treat a stream as empty.
2015-08-26 20:14:30 +02:00
Brendan Dahl
88e0326787 Merge pull request #6337 from Snuffleupagus/issue-6336
Adjust which TrueType (3, 1) glyphs we attempt to skip mapping of (issue 6336)
2015-08-25 09:49:46 -07:00
Jonas Jenwald
56a43a3181 Make XRef_indexObjects more robust against bad PDF files (issue 5752)
This patch improves the detection of `xref` in files where it is followed by an arbitrary whitespace character (not just a line-breaking char).
It also adds a check for missing whitespace, e.g. `1 0 obj<<`, to speed up `readToken` for the PDF file in the referenced issue.
Finally, the patch also replaces a bunch of magic numbers with suitably named constants.

Fixes 5752.

Also improves 6243, but there are still issues.
2015-08-21 20:33:02 +02:00
Jonas Jenwald
5128603f64 Also check maybeLength when deciding if a stream is empty in Parser_makeFilter (issue 6360)
The problem with the PDF files in the issue, besides the obviously broken XRef tables which we're able to recover from, is that many/most of the streams have Dictionaries where the `Length` entry is set to `0`. This causes us to return `NullStream`, instead of the appropriate one in `Parser_makeFilter`.

Fixes 6360.
2015-08-20 23:04:18 +02:00
Yury Delendik
c56dc9a093 Merge pull request #6141 from skalnik/fix-font-csp-issues
Provide a fallback for font rendering when not allowed to use `eval`
2015-08-18 18:50:11 -05:00
Jonas Jenwald
3fa5f6cc3b Only take the fast-path in PDFImage_createImageData for un-masked JPEG images with "standard" colour spaces (issue 6364)
Fixes 6364.
2015-08-18 22:25:37 +02:00
Jonas Jenwald
8c3b8238ac Don't catch MissingDataException in Parser_tryShift
I overlooked this while reviewing PR 6197, but I don't think that we should be catching that particular kind of exception here; hence this patch.
2015-08-16 11:35:54 +02:00
Jonas Jenwald
b1cf4d98ad Avoid more allocations for RTL text in bidi.js
Instead of building the resulting string char-by-char for RTL text, which is inefficient, we can just as well `join` the `chars` array.
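
The two approaches can be compared with a short sketch. The function names are made up for this example, and `chars` is assumed to be an array of single-character strings.

```javascript
// Char-by-char concatenation: each += may create an intermediate string.
function concatChars(chars) {
  let result = "";
  for (let i = 0; i < chars.length; i++) {
    result += chars[i];
  }
  return result;
}

// Joining the array builds the final string in a single call.
function joinChars(chars) {
  return chars.join("");
}
```

Both functions return the same string, so only the allocation behaviour differs.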
2015-08-14 21:46:59 +02:00
Mike Skalnik
341c5e9d1f [PATCH] Add fallback for font loading when eval disabled
In some cases, such as in use with a CSP header, constructing a function with a
string of javascript is not allowed. However, compiling the various commands
that need to be done on the canvas element is faster than interpreting them.
This patch changes the font renderer to instead emit commands that are compiled
by the font loader. If, during compilation, we receive an EvalError, we instead
interpret them.
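
A rough sketch of the compile-or-interpret idea described above. The command format (`{ cmd, args }` with numeric arguments), the function names and the injectable `makeFunction` parameter are assumptions made for illustration; the real font renderer and loader are structured differently.

```javascript
// Hypothetical sketch; `cmds` is an assumed list of canvas commands.
function interpretCommands(cmds, c) {
  for (const { cmd, args } of cmds) {
    c[cmd](...args);
  }
}

function buildGlyphFunction(
  cmds,
  makeFunction = (body) => new Function("c", body)
) {
  try {
    // Fast path: generate JavaScript source and compile it once.
    const body = cmds
      .map(({ cmd, args }) => `c.${cmd}(${args.join(",")});`)
      .join("\n");
    return makeFunction(body);
  } catch (ex) {
    if (!(ex instanceof EvalError)) {
      throw ex;
    }
    // Slow path: code generation is blocked (e.g. by CSP), so interpret.
    return (c) => interpretCommands(cmds, c);
  }
}
```

Either path yields a function that takes a canvas-like context and replays the same commands, so callers do not need to know which one was chosen.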
2015-08-13 14:33:18 -07:00
Yury Delendik
20b46aaf88 Fixes supportsMozChunked for node.js 2015-08-12 18:48:59 -05:00
Jonas Jenwald
99d29487ab Adjust which TrueType (3, 1) glyphs we attempt to skip mapping of (issue 6336)
Fixes 6336.
2015-08-09 12:51:43 +02:00
Rob Wu
b0a8c0fa40 cmaps: Use cmap.forEach instead of Array.forEach
CMaps may be sparse. Array.prototype.forEach is terribly slow in Chrome
(and also in Firefox) when the sparse array contains a key with a high
value. E.g.

    console.time('forEach sparse')
    var a = [];
    a[0xFFFFFF] = 1;
    a.forEach(function(){});
    console.timeEnd('forEach sparse');

    // Chrome: 2890ms
    // Firefox: 1345ms

Switching to CMap.prototype.forEach, which is optimized for such
scenarios fixes the problem.
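
One way to iterate a sparse mapping without visiting every index is to enumerate only the keys that exist. The helper below is a hypothetical sketch of that idea and is not the actual `CMap.prototype.forEach`.

```javascript
// Hypothetical helper: `for...in` enumerates only the indices that were
// actually assigned, so the cost depends on the number of entries rather
// than on the highest index.
function forEachDefined(sparse, callback) {
  for (const key in sparse) {
    callback(Number(key), sparse[key]);
  }
}
```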
2015-08-08 13:30:30 +02:00
Tilman Hausherr
6d1e0f7e8d fix handling of flags 1-3 in tensor shading
pi is an index in the stream and is explained on page 201 of the 32000-spec (however 1-based there), and ps is an index to something in PDF.js. I used the code from flag 0 (which works) to understand which is which. It is also important to understand that for flags 1,2 and 3, the stream is always assigned to the same coordinates and colors. What changes is which "old" coordinates and colors are assigned to what is "missing" in the stream. This is why for these flags, the code is identical except for the assignments in the first "row". (Same principle as in #6304). Note that this change will not improve the lamp_cairo.pdf file, only the two files mentioned in #6305.
2015-08-04 18:21:29 +02:00
Tilman Hausherr
c85fa00d62 fix handling of flags 1-3 in coons shading
Short story: somebody got lost in two different indices. pi is an index in the stream and is explained on page 198 of the 32000-spec (however 1-based there), and ps is an index to something in PDF.js. I used the code from flag 0 (which works) to understand which is which. It is also important to understand that for flags 1,2 and 3, the stream is always assigned to the same coordinates and colors. What changes is which "old" coordinates and colors are assigned to what is "missing" in the stream. This is why for these flags, the code is identical except for the assignments in the first "row".
2015-08-03 21:15:26 +02:00
Brendan Dahl
977397ebfd Merge pull request #6270 from Snuffleupagus/opentype-cff-2
Adjust the heuristics used to detect OpenType font file with CFF data (bug 1186827, bug 1182130, issue 6264)
2015-08-03 09:43:33 -07:00
Tim van der Meij
72ecbec49d Merge pull request #6292 from Snuffleupagus/issue-6287
Fix various shading pattern regressions (issue 6287)
2015-07-31 22:26:01 +02:00
Jonas Jenwald
1d65daf5e5 Correctly access colorSpace.numComps in MeshStreamReader (issue 6287)
This regressed in f750e35224.
2015-07-31 18:00:58 +02:00
Jonas Jenwald
7fe2442a18 Ensure that we don't use the same typed array for both coords and colors in Mesh figures (issue 6287)
This regressed in 1e8d70af98.
2015-07-31 18:00:23 +02:00
Jonas Jenwald
55bc98a8b0 Rename PatternType to ShadingType to avoid confusion
The current name is somewhat confusing, since the specification calls it `ShadingType`, see http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G7.4044105 and http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G7.3882826.

The real problem, however, is that there is actually another property called `PatternType`, which makes the current code very confusing, see http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G7.1850929.

Since `ShadingType` is only relevant for shading patterns (i.e. `PatternType === 2`), and *not* for tiling patterns (i.e. `PatternType === 1`), this patch should help reduce confusion when reading the code.
2015-07-30 20:03:45 +02:00
Tim van der Meij
4f920ad100 Refactor annotation code to use a factory
Currently, `src/core/core.js` uses the `fromRef` method on an `Annotation` object to obtain the right annotation type object (such as `LinkAnnotation` or `TextAnnotation`). That method in turn uses a method `getConstructor` to find out which annotation type object must be returned.

Aside from the fact that there is currently a lot of code to achieve this, these methods should not be part of the base `Annotation` class at all. Creation of annotation object should be done by a factory (as also recommended by @yurydelendik at https://github.com/mozilla/pdf.js/pull/5218#issuecomment-52779659) that handles finding out the correct annotation type object and returning it. This patch implements this separation of concerns.

Doing this allows us to also simplify the code quite a bit and to make it more readable. Additionally, we are now able to get rid of the hardcoded array of supported annotation types. The factory takes care of checking the annotation types and falls back to returning the base annotation type (and issuing a warning, which the current code also does not do well) when an annotation type is unsupported.

I have manually tested this commit with 20 test PDFs with different annotation types, such as /Link, /Text, /Widget, /FileAttachment and /FreeText. All render identically before and after the patch, and unsupported annotation types are now properly indicated with a warning in the console.
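
The factory idea can be sketched as follows. The class bodies, the `params` shape and the injectable `warn` parameter are simplified assumptions for this example and do not reproduce the patch.

```javascript
// Simplified stand-ins for the annotation classes.
class Annotation {
  constructor(params) {
    this.data = { subtype: params.subtype };
  }
}
class LinkAnnotation extends Annotation {}
class TextAnnotation extends Annotation {}

class AnnotationFactory {
  // Picks the annotation class from the subtype; unsupported subtypes fall
  // back to the base class and emit a warning.
  static create(params, warn = console.warn) {
    switch (params.subtype) {
      case "Link":
        return new LinkAnnotation(params);
      case "Text":
        return new TextAnnotation(params);
      default:
        warn(`Unimplemented annotation type "${params.subtype}", ` +
             "falling back to base annotation");
        return new Annotation(params);
    }
  }
}
```

Keeping the subtype dispatch in one place means the base class no longer needs to know about its subclasses, and no separate list of supported types has to be maintained.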
2015-07-29 00:31:51 +02:00
Tim van der Meij
d08895d659 Merge pull request #6236 from Rob--W/print-javascript-action
Detect scripted auto-print requests
2015-07-25 19:42:31 +02:00
Jonas Jenwald
0a024b5051 Adjust the heuristics used to detect OpenType font file with CFF data (bug 1186827, bug 1182130, issue 6264)
*This is a tentative patch.*

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1186827.
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1182130.
Fixes 6264.
2015-07-25 12:26:36 +02:00
Jonas Jenwald
385e2e5aaf Check if the Decode entry is non-default when deciding if JPEG images are natively supported/decodable (issue 6238)
Tentatively fixes 6238.
2015-07-21 12:23:07 +02:00
Tim van der Meij
980aa10e04 Refactor annotation rectangle code and add unit tests
This patch refactors the code responsible for setting the annotation's rectangle. Its goal is to:

- Check that the input is actually an array, and if so, that it contains exactly four elements.
- Only call `normalizeRect` if the input array is valid, i.e., we do not call it for the default rectangle anymore.

Unit tests are provided just like with the other patches in this series.
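
The two goals listed above could look roughly like this. The function names and the `[0, 0, 0, 0]` default rectangle are assumptions made for this sketch.

```javascript
// Orders the coordinates so that (x1, y1) is the lower-left corner.
function normalizeRect(rect) {
  const r = rect.slice(0);
  if (rect[0] > rect[2]) {
    r[0] = rect[2];
    r[2] = rect[0];
  }
  if (rect[1] > rect[3]) {
    r[1] = rect[3];
    r[3] = rect[1];
  }
  return r;
}

// Validates the input first; the default rectangle is returned as-is,
// without going through normalizeRect.
function getRectangle(rectangle) {
  if (Array.isArray(rectangle) && rectangle.length === 4) {
    return normalizeRect(rectangle);
  }
  return [0, 0, 0, 0];
}
```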
2015-07-20 22:01:47 +02:00
Rob Wu
c676ecb5a0 Detect scripted auto-print requests
Fixes #6106

To avoid future regressions, two new unit tests were added:
1. A new PDF based on the report from #6106, which contains an
   OpenAction of type JavaScript and a string "this.print({...}".
2. An existing PDF from https://bugzil.la/1001080 (from #4698).

Although it does not matter, since we don't execute the JavaScript code,
I have also changed "print(true)" to "print({})" since the print method
takes an object (not a boolean). See "Printing PDF documents", page 62:
http://adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/js_developer_guide.pdf
2015-07-20 18:25:02 +02:00
Tim van der Meij
995c5ba205 Simplify annotation data passing 2015-07-19 14:02:49 +02:00
Tim van der Meij
465611a2ff More cleanup regarding annotation border styles 2015-07-17 21:51:24 +02:00
Jonas Jenwald
c718d1ab10 Ignore double negative in Lexer_getNumber (issue 6218)
Basic mathematics would suggest that a double negative should always become positive, but it appears that Adobe Reader simply ignores that case. Hence I think that it makes sense for us to do the same.

Fixes 6218.
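
A tiny sketch of one reading of "ignores that case": the second minus sign is skipped, so `--5` is read as `-5`. This interpretation, and the simplified parser below, are assumptions made for illustration rather than a copy of `Lexer_getNumber`.

```javascript
// Hypothetical, simplified number parser for PDF-style numeric tokens.
function parseNumber(text) {
  let pos = 0;
  let sign = 1;
  if (text[pos] === "-") {
    sign = -1;
    pos++;
    if (text[pos] === "-") {
      // Skip a second minus sign instead of flipping the sign back.
      pos++;
    }
  } else if (text[pos] === "+") {
    pos++;
  }
  const rest = text.slice(pos);
  if (!/^(\d+\.?\d*|\.\d+)$/.test(rest)) {
    throw new Error(`Invalid number: ${text}`);
  }
  return sign * parseFloat(rest);
}
```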
2015-07-16 12:11:49 +02:00
Tim van der Meij
a2e9845093 Refactor annotation color handling and add unit tests 2015-07-15 18:49:19 +02:00
Jonas Jenwald
28f40b1b58 Fetch all indirect objects (i.e. Refs) in NameTree_getAll and NameTree_get (issue 6204) 2015-07-14 10:56:56 +02:00
Brendan Dahl
367794f0c7 Merge pull request #4990 from fkaelberer/refactor_chunked_stream
Minor refactoring of chunked_stream.js
2015-07-13 16:51:35 -07:00
Tim van der Meij
1416a1b521 Merge pull request #6187 from Snuffleupagus/more-efficient-getDestination
A couple of improvements of `getDestination` (unit-test included)
2015-07-13 23:03:13 +02:00
Rob Wu
e211c25f06 Improve robustness of stream parser (invalid length)
When the parser finds a stream, it retrieves the Length from the stream
dictionary and advances the lexer to the offset as specified in Length.
If this Length is incorrect, the lexer could end up anywhere.

When the lexer gets in an invalid state, it could throw errors. For
example, in issue 6108, the lexer ends up inside the stream data. This
stream has the ASCIIHexDecode filter, so all characters are made up from
ASCII characters, and the lexer interprets it as a command token. Tokens
cannot be longer than 127 bytes, so eventually 128 bytes are consumed
and the lexer throws "Command token too long" error.

Another possible error is "Illegal character: 41" when the lexer happens
to end up at a ')' due to the length mismatch.

These problems are solved by catching lexer errors and recovering the
parser via the existing stream length detection branch.
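
The recovery strategy can be sketched as follows, using a string in place of the byte stream. `readToken` and `findStreamLength` are hypothetical helpers written for this example; only the 127-byte token limit and the error message come from the description above.

```javascript
// Minimal tokenizer: skips whitespace, then reads up to the next whitespace.
function readToken(data, pos) {
  while (pos < data.length && /\s/.test(data[pos])) {
    pos++;
  }
  let token = "";
  while (pos < data.length && !/\s/.test(data[pos])) {
    if (token.length === 127) {
      throw new Error("Command token too long");
    }
    token += data[pos++];
  }
  return token;
}

// Trusts the declared Length only if "endstream" follows it; on a mismatch
// or a lexer error, falls back to searching for "endstream".
function findStreamLength(data, start, declaredLength) {
  try {
    if (readToken(data, start + declaredLength) === "endstream") {
      return declaredLength;
    }
  } catch (ex) {
    // The lexer ended up inside the stream data; recover below.
  }
  let end = data.indexOf("endstream", start);
  if (end < 0) {
    throw new Error("Missing endstream");
  }
  // Do not count the end-of-line marker that precedes "endstream".
  if (end > start && data[end - 1] === "\n") {
    end--;
  }
  if (end > start && data[end - 1] === "\r") {
    end--;
  }
  return end - start;
}
```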
2015-07-11 20:12:49 +02:00
Tim van der Meij
7d4303b7c4 Merge pull request #6194 from Rob--W/recover-mode-start-offset
Subtract start offset for xrefs in recovery mode
2015-07-11 17:22:08 +02:00
Rob Wu
fd29bb0c57 Subtract start offset for xrefs in recovery mode
Xref offsets are relative to the start of the PDF data, not to the start
of the PDF file. This is clear if you look at the other code:

- In XRef's readXRefTable and processXRefTable methods, the offset of
  an xref entry is set to the bytes as given by a PDF file.
  These values are always relative to the start of the PDF file (%PDF-).

- The XRef's readXRef method adds the start offset of the stream to
  Xref entry's offset: "stream.pos = startXRef + stream.start".
  Clearly, this line assumes that the entry offset excludes the start
  offset.

However, when the PDF is parsed in recovery mode, the xref table is
filled with entries whose offset is relative to the start of the stream
rather than the PDF file. This is incorrect, and the fix is to subtract
the start offset of the stream from the entry's byte offset.

The manually created PDF file serves as a regression test. It is a valid
PDF, except:
- The integer to point to the start of the xref table and the %%EOF
  trailer are missing. This will activate recovery mode in PDF.js
- Some junk was added before the start of the PDF file. This exposes the
  bad offset bug.
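
The offset bookkeeping can be illustrated with a small sketch that works on a string instead of a byte stream. `indexObjects` and `seekToObject` are hypothetical helpers, and `start` plays the role of the stream's start offset (the position of `%PDF-`).

```javascript
// Recovery-mode style scan: positions found by scanning are absolute, so
// `start` is subtracted before an entry is stored.
function indexObjects(fileText, start) {
  const entries = new Map();
  const objRegExp = /(\d+) (\d+) obj\b/g;
  let match;
  while ((match = objRegExp.exec(fileText)) !== null) {
    if (match.index < start) {
      continue; // ignore junk before the PDF data
    }
    entries.set(Number(match[1]), {
      offset: match.index - start,
      gen: Number(match[2]),
    });
  }
  return entries;
}

// Reading an entry adds `start` back, mirroring
// "stream.pos = startXRef + stream.start".
function seekToObject(entries, num, start) {
  return entries.get(num).offset + start;
}
```

Without the subtraction, the read position would be shifted by the length of the junk prefix, which is the bug the regression test exposes.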
2015-07-10 23:33:10 +02:00
Tim van der Meij
6c1906fd53 Merge pull request #6193 from Rob--W/long-name-is-warning-not-error
Issue a warning instead of an error for long Names
2015-07-10 22:58:08 +02:00
Tim van der Meij
5af49f8bbb Merge pull request #6166 from Snuffleupagus/issue-5801-2
Add a supplemental glyph map for non-embedded ArialBlack fonts (issue 5801)
2015-07-10 22:29:50 +02:00
Rob Wu
456ad438d8 Issue a warning instead of an error for long Names
The PDF specification (cited below) specifies a maximum length of a name
in bytes as a minimal architectural limit. This means that PDF *writers*
should not create names that exceed 127 bytes.

It does not forbid PDF *readers* to accept such names though. These
names are only used internally to link PDF objects to other objects. For
these use cases, the lengths of the names do not really matter. Hence I
have changed the implementation to not treat long names as errors, but
warnings.

> (7.3.5) The length of a name shall be subject to an implementation
> limit; see Annex C.
>
> (Annex C.2) Table C.1 describes the minimum architectural limits that
> should be accommodated by conforming readers running on 32-bit
> machines. Because conforming readers may be subject to these limits,
> conforming writers producing PDF files should remain within them.
>
> (Table C.1) name 127 "Maximum length of a name, in bytes."

http://adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf
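
In code, the change in behaviour amounts to something like the sketch below. The function name, the message text and the injectable `warn` parameter are assumptions for this example, and the length is counted in characters for simplicity.

```javascript
const MAX_NAME_LENGTH = 127; // Table C.1 limit quoted above

// Accepts over-long names, but reports them instead of throwing.
function checkNameLength(name, warn = console.warn) {
  if (name.length > MAX_NAME_LENGTH) {
    warn(`Name token is longer than allowed by the spec: ${name.length}`);
  }
  return name;
}
```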
2015-07-10 16:10:24 +02:00
Jonas Jenwald
7df78f997e Slightly more efficient getDestination
For named destinations that are contained in a `Dict`, as opposed to a `NameTree`, we currently iterate through the *entire* dictionary just to fetch *one* destination.
This code appears to simply have been copy-pasted from the `get destinations` method, but in its current form it's quite unnecessary/inefficient since we can just get the required destination directly instead.
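
The difference can be sketched with a `Map` standing in for the destinations dictionary. Both function names are hypothetical, and returning `null` for a missing name is an assumption of this sketch.

```javascript
// Before: walk every entry just to pick out one destination.
function getDestinationByScan(dests, id) {
  let result = null;
  dests.forEach((value, key) => {
    if (key === id) {
      result = value;
    }
  });
  return result;
}

// After: look up the requested destination directly.
function getDestinationDirect(dests, id) {
  return dests.has(id) ? dests.get(id) : null;
}
```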
2015-07-08 18:31:51 +02:00
Jonas Jenwald
940bedf75f Add a unit-test that attempts to fetch a non-existent named destination
Doing this helped uncover an issue with the `getDestination` implementation.
Currently if a named destination doesn't exist, the method (in `obj.js`) may return `undefined` which leads to the promise being stuck in a pending state.
*Note:* returning `null` for this case is consistent with other methods, e.g. `getOutline` and `getAttachments`.
2015-07-07 22:05:08 +02:00
Jonas Jenwald
e5b7258586 Merge pull request #6180 from timvandermeij/border-regression
Do not draw a border if neither a Border nor a BS entry is present
2015-07-06 18:04:57 +02:00
Tim van der Meij
3a6eed6248 Do not draw a border if neither a Border nor a BS entry is present
Fixes #6179.
2015-07-06 14:48:59 +02:00
Tim van der Meij
5aa1d9fdfd Remove InteractiveAnnotation abstraction
This became obsolete in bdeca30fbf. All it does is call the Annotation constructor and add hasHtml. This patch lets the Link and Text annotations directly extend the Annotation class and add hasHtml themselves.

This patch also removes an unused global.
2015-07-06 11:58:26 +02:00
Yury Delendik
0787182e6f Adds more characters to the PUA range 2015-07-02 16:47:47 -05:00
Yury Delendik
9ad6af4a3f Merge pull request #5531 from Rob--W/moz-chunked-only-moz
Feature-test moz-chunked-arraybuffer before use
2015-07-02 09:08:08 -05:00
Jonas Jenwald
d0477302be Add a supplemental glyph map for non-embedded ArialBlack fonts (issue 5801)
This should, hopefully, finally fix 5801.
2015-07-01 22:16:52 +02:00
Brendan Dahl
d8e201446d Merge pull request #6135 from Snuffleupagus/issue-5677-v2
Skip mapping of CIDFontType2 glyphs when the font either has a |IdentityToUnicodeMap| or a |toUnicodeMap| with 65536 elements (issue 5677)
2015-07-01 11:15:55 -07:00
Brendan Dahl
98339f63a8 Merge pull request #5585 from timvandermeij/annotation-layer-borderstyle
Annotation border styles
2015-07-01 10:48:12 -07:00
Rob Wu
2e63dcdcf5 Feature-test moz-chunked-arraybuffer before use 2015-07-01 15:31:40 +02:00
Yury Delendik
f3c3b1fc2d Removes B2G preprocessing directives. 2015-06-30 16:53:32 -05:00
Tim van der Meij
18e1a14e65 Merge pull request #6138 from Snuffleupagus/issue-4558
Ignore paint form XObject when the name is missing (issue 4558)
2015-06-23 20:28:02 +02:00
Jonas Jenwald
46a8485db4 Ignore paint form XObject when the name is missing (issue 4558)
Fixes 4558 (since the font issues already appear to be fixed).
2015-06-22 22:10:26 +02:00
Jonas Jenwald
bc865b9e2d Ensure that we fetch all indirect objects (i.e. |Ref|s) in ColorSpace_parseToIR
Recently I've landed a number of patches which fixed issues with ColorSpaces. In most of these cases the cause of the failures was, either partially or entirely, related to the fact that we didn't resolve indirect objects (i.e. the code was missing `xref.fetchIfRef(...)`).

The purpose of this patch is to fix the few remaining cases where indirect objects *could* potentially cause failures.
Given that we have seen how this causes failures in practice, I thus think that it makes sense to try and avoid further issues, instead of waiting for users to file even more bugs for this part of the code-base.
2015-06-19 10:34:27 +02:00
Jonas Jenwald
aa3a64e975 Skip mapping of CIDFontType2 glyphs when the font either has a |IdentityToUnicodeMap| or a |toUnicodeMap| with 65536 elements (issue 5677)
This patch slightly extends the heuristics used when trying to skip mapping of missing glyphs.

Fixes 5677.
2015-06-18 21:53:15 +02:00
Tim van der Meij
9550c00184 Transform old implementation to new implementation of border styles 2015-06-17 22:28:06 +02:00
Tim van der Meij
9ba4f74370 Implement setBorderStyle for annotations 2015-06-17 22:28:05 +02:00
Tim van der Meij
88b2059ed9 Implement annotation border style class and constants 2015-06-17 22:26:47 +02:00
Jonas Jenwald
60fbb5ef69 Ensure that the result of |constructStichedFromIRResult| is a number (issue 6113)
Fixes 6113.
2015-06-14 23:29:38 +02:00
Jonas Jenwald
bc5e43b45c Use the Alternate entry, if it exists, in ICCBased Colour Space dictionaries (issue 5836, issue 5939, issue 6055)
Fixes 5836.
Fixes 5939.
Fixes 6055.
2015-06-14 12:10:22 +02:00
Jonas Jenwald
bf20334bea Merge pull request #6090 from Snuffleupagus/issue-6068
Map missing glyphs to the notdef glyph for TrueType (3, 1) fonts (issue 6068)
2015-06-13 00:29:08 +02:00
Jonas Jenwald
5eae3e29c5 Map missing glyphs to the notdef glyph for TrueType (3, 1) fonts (issue 6068)
Fixes 6068.

The most notable issue with the font in question is that the `differences` array contains lots of strange entries (of the type `uniXXXX`, instead of proper glyph names).
2015-06-06 18:28:16 +02:00
Jonas Jenwald
6f2f0700b7 Don't map glyphs to certain problematic Thai/Lao Unicode locations (issue 5994)
*This patch depends on PR 5990.*

According to https://dxr.mozilla.org/mozilla-central/source/gfx/harfbuzz/src/hb-ot-shape-fallback.cc#38, certain Thai/Lao characters are treated as special by the font shaping code in Firefox.
Further down in that file, https://dxr.mozilla.org/mozilla-central/source/gfx/harfbuzz/src/hb-ot-shape-fallback.cc#216, the vertical position of glyphs is modified, which should thus explain why some glyphs end up in the wrong position in the PDF file.

Fixes 5994.
2015-06-05 23:53:22 +02:00
Brendan Dahl
749a60a0b7 Merge pull request #5990 from Snuffleupagus/missing-glyphs-identityUnicode
Skip mapping of CIDFontType2 glyphs in fonts with a |IdentityToUnicodeMap|, unless |properties.widths| is defined for the glyph
2015-06-05 14:50:02 -07:00
Jonas Jenwald
64e1fb99fe Fetch parameters if they are |Ref|s in Pattern color spaces (issue 6081)
Fixes 6081.
2015-06-04 22:01:01 +02:00
Yury Delendik
82c5cf6617 Merge pull request #6062 from Snuffleupagus/revert-parse-all-jpegs
Revert PR 6024 "[Firefox] Parse all JPEG images in the addon", since it's fixed upstream
2015-06-01 07:43:18 -05:00
Jonas Jenwald
a28ed7c834 Always traverse the entire parent chain in Page_getInheritedPageProp (issue 5954)
This enables us to find resources placed on multiple levels of the tree.

Fixes 5954.
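
A sketch of what traversing the whole parent chain might look like, using plain objects for the page tree. The node shape (`dict`, `parent`) and the merge rule (values nearer to the page win) are assumptions for illustration, not the actual `Page_getInheritedPageProp` logic.

```javascript
// Hypothetical sketch: collect `key` from the page and every ancestor,
// then merge the results so entries from all levels are visible.
function getInheritedPageProp(node, key) {
  const found = [];
  for (let n = node; n; n = n.parent) {
    if (Object.prototype.hasOwnProperty.call(n.dict, key)) {
      found.push(n.dict[key]);
    }
  }
  if (found.length === 0) {
    return undefined;
  }
  // Apply the most distant ancestor first so nearer values override it.
  return Object.assign({}, ...found.reverse());
}
```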
2015-05-30 12:21:05 +02:00
Jonas Jenwald
a1743d9952 Revert PR 6024 "[Firefox] Parse all JPEG images in the addon", since it's fixed upstream 2015-05-29 12:58:17 +02:00
Mike Corbin
4c9b65f0e1 Extract correct PDF format version from the catalog
The 'Version' field of the most recent document catalog, if present, is
intended to supersede the value in the file prologue.

This is significant for incrementally-built PDF documents and generators that
emit a low version in the prologue and later apply a format version based on
PDF features used, such as Apple's CoreGraphics/Quartz PDF backend.

Fixes the internal version variable, as well as the PDFFormatVersion reported
by the API and consumed by viewers.
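
The precedence rule described above can be written as a one-line sketch. The function and parameter names are hypothetical, and no validation of the version string is attempted here.

```javascript
// The catalog's Version entry, when present, supersedes the header value.
function getPdfFormatVersion(headerVersion, catalogVersion) {
  return catalogVersion !== undefined && catalogVersion !== null
    ? catalogVersion
    : headerVersion;
}
```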
2015-05-26 01:56:09 +01:00
Yury Delendik
07af86cf70 Merge pull request #6016 from Snuffleupagus/issue-6010
Convert UTF8 encoded passwords to ISO-8859-1 for |R = 6| encryption (issue 6010)
2015-05-18 08:22:47 -05:00
Jonas Jenwald
dd4fc29cbc [Firefox] Parse all JPEG images in the addon
Workaround for:
 - https://bugzilla.mozilla.org/show_bug.cgi?id=1164199.
 - https://github.com/mozilla/pdf.js/issues/6017.
2015-05-15 21:40:34 +02:00
Tim van der Meij
7da9626d16 Merge pull request #5901 from Snuffleupagus/bug-1050040
Fall back to the |defaultEncoding| when no valid "post" table is found in TrueType fonts (bug 1050040)
2015-05-15 12:54:04 +02:00
Jonas Jenwald
6fbc5428bd Skip mapping of CIDFontType2 glyphs in fonts with a |IdentityToUnicodeMap|, unless |properties.widths| is defined for the glyph
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1142033.
Also fixes issue 5874.
2015-05-14 22:38:04 +02:00
Jonas Jenwald
44240798be Convert UTF8 encoded passwords to ISO-8859-1 for |R = 6| encryption (issue 6010)
For passwords where the encoding already is correct, the conversion is a no-op.
Also, since `encodeURIComponent` might throw, we need to make sure that we handle that case too.

Fixes 6010.
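
One common way to do such a conversion in JavaScript is shown below. Whether the patch uses exactly this helper is an assumption; the names and the injectable `warn` parameter are illustrative only. Note that `unescape` is a legacy function, used here because it pairs with `encodeURIComponent` to expose the UTF-8 bytes as one character per byte.

```javascript
// Turns a JavaScript string into a "byte string" holding its UTF-8 bytes.
// For plain ASCII input the result is identical to the input.
function utf8StringToString(str) {
  return unescape(encodeURIComponent(str));
}

// encodeURIComponent throws a URIError on lone surrogates, so keep the
// original password in that case instead of failing outright.
function preparePassword(password, warn = console.warn) {
  try {
    return utf8StringToString(password);
  } catch (ex) {
    warn("Unable to convert the password to a UTF-8 byte string.");
    return password;
  }
}
```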
2015-05-14 21:46:31 +02:00
Tim van der Meij
90982332bf Merge pull request #5995 from CodingFabian/tweak-char-spacing-text-selection
Apply char spacing only when there are chars.
2015-05-14 20:06:22 +02:00
Jonas Jenwald
0365baf5ab Fall back to the |defaultEncoding| when no valid "post" table is found in TrueType fonts (bug 1050040)
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1050040.

With this patch the file is completely readable, but given that the font is broken enough to be rejected by OTS the rendering differs slightly from Adobe Reader.

*Note:* the PDF file is sufficiently broken that even Adobe Reader complains about the font, *and* also about another more general issue.
2015-05-14 13:16:14 +02:00
Jonas Jenwald
70b839386a Ensure that the cmap position is within the bounds of the font file in |readCmapTable| 2015-05-14 13:16:09 +02:00
Tim van der Meij
67816bd085 Merge pull request #5999 from hellemar/handle-utf8-in-url
Bug 1122280 - Handle UTF-8 encoding in URI
2015-05-14 12:52:40 +02:00
Fabian Lange
c2013094e7 Apply char spacing only when there are chars. 2015-05-13 23:45:20 +02:00
Tim van der Meij
d484ebd492 Merge pull request #5910 from jordan-thoms/fix-concatenated-files
Fix error reading concatenated pdfs
2015-05-13 22:40:55 +02:00
Tim van der Meij
b34366d2fc Merge pull request #5898 from stri8ed/master
Extract more accurate glyph heights from type3 fonts
2015-05-13 21:07:17 +02:00
Martin Heller
a61a4b18cc URL annotations handled as UTF-8 to accommodate some bad PDFs. For proper 7-bit ASCII this makes no difference. Fixes Bug 1122280. 2015-05-11 00:46:59 +02:00
Jonas Jenwald
6d2d854f65 Merge pull request #5815 from Snuffleupagus/type1-diff-refs
Ensure that entries in the Differences array of Type1 fonts are either numbers or names
2015-05-07 22:33:23 +02:00
Brendan Dahl
cd53cbe7d4 Merge pull request #5964 from Snuffleupagus/bug-1157493
Handle the Encoding being a dictionary in PartialEvaluator_preEvaluateFont (bug 1157493)
2015-05-05 14:41:32 -07:00
Tim van der Meij
0c84899c0a Revert #5603 regarding Chrome range request bug 2015-04-30 22:37:52 +02:00
Jonas Jenwald
760222cf0b Handle the Encoding being a dictionary in PartialEvaluator_preEvaluateFont (bug 1157493)
*This is a regression from PR 4423.*

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1157493.
2015-04-25 16:48:14 +02:00
Jonas Jenwald
7c7d05e7a3 Attempt to infer if a CMap file actually contains just a standard Identity-H/Identity-V map 2015-04-25 11:28:33 +02:00
Tim van der Meij
48b2f6d023 Merge pull request #5756 from Snuffleupagus/issue-5751
Guess CIDFontType0 subtype based on font file contents (issue 5751)
2015-04-24 23:50:07 +02:00
Brendan Dahl
846eb967cc Merge pull request #5655 from Snuffleupagus/issue-5644
Avoid getting stuck in empty nodes in the Pages tree when calling |Catalog_getPageDict| (issue 5644)
2015-04-20 11:46:27 -07:00
Jordan Thoms
d0ea772fc6 Fix error reading concatenated pdfs 2015-04-18 20:56:07 +12:00
Jonas Jenwald
4c2ad3bc7b Ensure that entries in the Differences array of Type1 fonts are either numbers or names
This patch is yet another installment in the (never ending) series of bugs in PDF files with non-embedded fonts.

Fixes http://www.int.washington.edu/talks/WorkShops/int_08_37W/People/Franz_M/Franz.pdf.
2015-04-17 20:32:27 +02:00
Marco Castelluccio
1bd952f897 Use Int32Array instead of Uint32Array in FlateStream 2015-04-17 16:33:04 +02:00
Brendan Dahl
63aaf1b969 Merge pull request #5923 from Snuffleupagus/bug-911034
Don't map glyphs to certain problematic General Punctuation Unicode locations (bug 911034)
2015-04-15 14:31:54 -07:00
Thomas Leitner
3ebc85e55f Crypt filter EFF key should have StmF value as default, not StrF
This fixes the problem.
2015-04-13 21:27:32 +02:00
Jonas Jenwald
fda858ae33 Don't map glyphs to certain problematic General Punctuation Unicode locations (bug 911034)
Fixes the remaining missing characters in https://bugzilla.mozilla.org/show_bug.cgi?id=911034.

For reference, see http://www.unicode.org/charts/PDF/U2000.pdf (and also http://en.wikipedia.org/wiki/General_Punctuation_%28Unicode_block%29).
2015-04-09 17:27:03 +02:00
Jonas Jenwald
a54ec673c5 Move the checks for problematic Unicode locations from |adjustMapping| to a separate helper function 2015-04-09 12:56:29 +02:00
Levi Melamed
a5159a7942 extract more accurate glyph heights from type-3 fonts 2015-04-03 08:49:06 -05:00
Jonas Jenwald
2b1a13ba28 Don't map glyphs to Unicode position 0x0E33, i.e. Thai character SARA AM (bug1046314)
*A similar approach as in PR 5705.*

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1046314.

According to https://dxr.mozilla.org/mozilla-central/source/gfx/harfbuzz/src/hb-ot-shape-complex-thai.cc#270-365, `0x0E33` is treated as a special case (by the font shaping code in Firefox). Hence it seems reasonable to skip it when adjusting the font mapping.
2015-03-26 13:22:45 +01:00
Brendan Dahl
3a8d4a7d72 Merge pull request #5713 from Snuffleupagus/evaluator-IdentityToUnicodeMap
Create a IdentityToUnicodeMap in evaluator.js when toUnicode contains IdentityH/IdentityV
2015-03-25 10:33:29 -07:00
Brendan Dahl
519b6669f0 Merge pull request #5705 from Snuffleupagus/bug-1108301
Don't map glyphs to Unicode "Dotted circle" combining mark (bug 1108301)
2015-03-24 16:33:04 -07:00
Jonas Jenwald
e894a0a4c6 Guess CIDFontType0 subtype based on font file contents (issue 5751) 2015-03-15 13:35:48 +01:00
Jonas Jenwald
4a9ff471c4 Correctly detect the presence of the Adobe specific APP14 JPEG marker (bug 1140761)
According to the specification, http://partners.adobe.com/public/developer/en/ps/sdk/5116.DCT_Filter.pdf#G3.851943, the content of the marker segment should begin with `Adobe`, and not `Adobe\x00` as the code currently looks for.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1140761.
Fixes the colour conversion part of issues 4090 and 5623.
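
A sketch of the corrected check, operating on the bytes of the marker segment content. The function name and the `Uint8Array` input are assumptions for this example.

```javascript
// Returns true when the APP14 segment content begins with "Adobe",
// regardless of which byte follows the five-letter signature.
function isAdobeApp14(segment) {
  const signature = [0x41, 0x64, 0x6f, 0x62, 0x65]; // "Adobe"
  if (segment.length < signature.length) {
    return false;
  }
  return signature.every((byte, i) => segment[i] === byte);
}
```

A check for `Adobe\x00` only succeeds when the byte after the signature happens to be zero, which is why it can miss valid markers.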
2015-03-10 13:07:09 +01:00
Tim van der Meij
c69ad5885c Merge pull request #5770 from Snuffleupagus/opentype-cff
Correctly detect OpenType font files with CFF data
2015-03-06 22:58:43 +01:00
Tim van der Meij
5eedfff647 Merge pull request #5734 from Hengjie/lower-space-threshold
Lower space factor threshold
2015-03-06 21:00:01 +01:00
Jonas Jenwald
f81fc9091a Correctly detect OpenType font files with CFF data
Fixes 5334.
Fixes 215.
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1125614.

According to the specification, http://www.microsoft.com/typography/otspec/otff.htm, OpenType font files with CFF data should have `OTTO` in the header.
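
Checking for that header could look like the following sketch. The function name and the `Uint8Array` input are assumptions made for illustration.

```javascript
// Returns true when the first four bytes of the font file spell "OTTO".
function isOpenTypeCFF(data) {
  return (
    data.length >= 4 &&
    String.fromCharCode(data[0], data[1], data[2], data[3]) === "OTTO"
  );
}
```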
2015-02-28 13:43:53 +01:00
Jonas Jenwald
00ee6bd6b6 Merge pull request #5693 from collinanderson/whitespace
cleaned whitespace
2015-02-28 10:09:21 +01:00
Yury Delendik
2e14cc70cc Merge pull request #5731 from Snuffleupagus/issue-5331
Skip fill bytes (0xFF) when decoding JPEG images (issue 5331)
2015-02-27 06:36:30 -06:00
Yury Delendik
23916b2b14 Merge pull request #5748 from Snuffleupagus/issue-5747
Fetch parameters if they are a |Ref| in CalGray/CalRGB color spaces (issue 5747)
2015-02-26 17:44:46 -06:00
Jonas Jenwald
0a3341dadc Don't map glyphs to Unicode "Dotted circle" combining mark (bug 1108301)
It seems that `0x25CC` is another bad spot for charCodes.
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1108301.
2015-02-27 00:20:38 +01:00
Jonas Jenwald
888cbe0bde Avoid getting stuck in empty nodes in the Pages tree when calling |Catalog_getPageDict| (issue 5644) 2015-02-22 17:42:15 +01:00
Jonas Jenwald
7c8996558a Fetch parameters if they are a |Ref| in CalGray/CalRGB color spaces (issue 5747) 2015-02-20 12:53:02 +01:00
Jonas Jenwald
417800a1b5 Only skip the |!isSymbolicFont| check for TrueType (3, 1) cmap tables if no previous cmap table was found (PR 5703 followup)
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=894572.
2015-02-19 13:58:03 +01:00
Brendan Dahl
6bb0a483b1 Merge pull request #5703 from Snuffleupagus/issue-5701
Relax the |isSymbolicFont| check for TrueType (3, 1) cmap tables (issue 5701)
2015-02-18 14:50:19 -08:00
Collin Anderson
54e984c763 cleaned whitespace 2015-02-17 11:07:37 -05:00
Hengjie
109d67691c Lower threshold
Fixes text selection formatting with https://github.com/vortext/vortext/blob/master/resources/public/examples/TestDocument3.pdf
2015-02-13 22:27:49 -08:00
Jonas Jenwald
3651c9e1f7 Skip fill bytes (0xFF) when decoding JPEG images (issue 5331) 2015-02-14 00:08:43 +01:00
Tim van der Meij
27e3558a41 Fix CCITTStream regression by byte-aligning rows before checking EOL marker 2015-02-13 21:29:00 +01:00
Jonas Jenwald
592890a758 Relax the |isSymbolicFont| check for TrueType (3, 1) cmap tables (issue 5701) 2015-02-13 01:03:10 +01:00
Brendan Dahl
394b38b22f Merge pull request #5651 from Snuffleupagus/missing-glyphs
Try to skip mapping of missing TrueType and CIDFontType2 glyphs
2015-02-11 19:31:22 -08:00
Brendan Dahl
fb8200096b Merge pull request #5634 from Snuffleupagus/cmap-0,0
Add support for TrueType (0, 0) cmap tables (issue 5501, issue 5574, and bug 1037973)
2015-02-11 15:04:03 -08:00
Jonas Jenwald
f19a1db414 Create a IdentityToUnicodeMap in evaluator.js when toUnicode contains IdentityH/IdentityV
Currently, if a font contains a `toUnicode` entry, we always create a new `ToUnicodeMap` in evaluator.js. This is done even for `IdentityV/IdentityH`, despite the possibility of using the much more compact `IdentityToUnicodeMap` representation.
This patch refactors the `IdentityH/IdentityV` cases, to:
 - Avoid calling `IdentityCMap.getMap`, since doing so avoids allocating and iterating through an array with 65536 elements.

 - Ensure that the handling of `toUnicode` is actually correct in fonts.js.
We rely on `toUnicode instanceof IdentityToUnicodeMap` in a few places, and currently this does not work correctly for `IdentityH/IdentityV`.
2015-02-09 16:52:31 +01:00
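The space saving mentioned in f19a1db414 comes from storing only a character range instead of one entry per code. The following is a rough Python analogue under stated assumptions: the class name `IdentityMap` and its methods are hypothetical and do not reproduce the `IdentityToUnicodeMap` API in fonts.js.

```python
class IdentityMap:
    """Map every code in [first_char, last_char] to the same Unicode value.

    Only the two range bounds are stored, so no per-code table has to be
    allocated or iterated over.
    """

    def __init__(self, first_char: int, last_char: int):
        self.first_char = first_char
        self.last_char = last_char

    def __len__(self) -> int:
        return self.last_char - self.first_char + 1

    def get(self, code: int):
        # Codes inside the range map to themselves; others have no mapping.
        if self.first_char <= code <= self.last_char:
            return chr(code)
        return None


# Covers the same 65536 codes that an explicit table would need to hold.
identity = IdentityMap(0, 0xFFFF)
```

A type check such as `isinstance(to_unicode, IdentityMap)` then identifies the identity case directly, which is the kind of test the commit message says the code relies on.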
Jonas Jenwald
01e6565dd4 Try to skip mapping of missing TrueType glyphs
Also don't skip mapping of glyphs which are empty, if the corresponding charCode is included in toUnicode.
2015-02-07 12:19:38 +01:00
Jonas Jenwald
8174da61fb Don't skip mapping of glyphs for CIDFontType2 fonts with a CIDToGIDMap
Also don't skip mapping of glyphs which are empty, if the corresponding charCode is included in toUnicode.
2015-02-07 12:19:37 +01:00