Commit Graph

4573 Commits

Author SHA1 Message Date
Tim van der Meij
95f9075565
Optimize TextLayerRenderTask._layoutText to avoid intermediate string creation
This method creates quite a few intermediate strings on each call and
it's called often, even for smaller documents like the Tracemonkey
document. Scrolling from top to bottom in that document resulted in
12936 strings being created in this method. With this commit applied,
this is reduced to 3610 strings.
2018-12-30 14:39:08 +01:00
Tim van der Meij
d5e5d18430
Convert the PDFDocument class in src/core/document.js to ES6 syntax 2018-12-30 13:54:43 +01:00
Tim van der Meij
612fc9fcc2
Convert the Page class in src/core/document.js to ES6 syntax 2018-12-30 13:54:43 +01:00
Tim van der Meij
aad27ff9a0
Optimize the Ref class in src/core/primitives.js
The `toString` method always creates two string objects (for the 'R'
character and for the `num` concatenation) and in the worst case
creates three string objects (one more for the `gen` concatenation).
For the Tracemonkey paper alone, this resulted in 12000 string
objects when scrolling from the top to the bottom of the document.
Since this is a hot function, it's worth minimizing the number of string
objects, especially for large documents, to reduce peak memory usage.

This commit refactors the `toString` method to always create only one
string object.
2018-12-29 17:48:41 +01:00
Jonas Jenwald
60bcce184e Check that the first page can be successfully loaded, to try and ascertain the validity of the XRef table (issue 7496, issue 10326)
For PDF documents with sufficiently broken XRef tables, it's usually quite obvious when you need to fallback to indexing the entire file. However, for certain kinds of corrupted PDF documents the XRef table will, for all intents and purposes, appear to be valid. It's not until you actually try to fetch various objects that things will start to break, which is the case in the referenced issues[1].

Since there's generally a real effort being in made PDF.js to load even corrupt PDF documents, this patch contains a suggested approach to attempt to do a bit more validation of the XRef table during the initial document loading phase.

Here the choice is made to attempt to load the *first* page, as a basic sanity check of the validity of the XRef table. Please note that attempting to load a more-or-less arbitrarily chosen object without any context of what it's supposed to be isn't a very useful, which is why this particular choice was made.
Obviously, just because the first page can be loaded successfully that doesn't guarantee that the *entire* XRef table is valid, however if even the first page fails to load you can be reasonably sure that the document is *not* valid[2].

Even though this patch won't cause any significant increase in the amount of parsing required during initial loading of the document[3], it will require loading of more data upfront which thus delays the initial `getDocument` call.
Whether or not this is a problem depends very much on what you actually measure, please consider the following examples:

```javascript
console.time('first');
getDocument(...).promise.then((pdfDocument) => {
  console.timeEnd('first');
});

console.time('second');
getDocument(...).promise.then((pdfDocument) => {
  pdfDocument.getPage(1).then((pdfPage) => { // Note: the API uses `pageNumber >= 1`, the Worker uses `pageIndex >= 0`.
    console.timeEnd('second');
  });
});
```

The first case is pretty much guaranteed to show a small regression, however the second case won't be affected at all since the Worker caches the result of `getPage` calls. Again, please remember that the second case is what matters for the standard PDF.js use-case which is why I'm hoping that this patch is deemed acceptable.

---
[1] In issue 7496, the problem is that the document is edited without the XRef table being correctly updated.
In issue 10326, the generator was sorting the XRef table according to the offsets rather than the objects.

[2] The idea of checking the first page in particular came from the "standard" use-case for the PDF.js library, i.e. the default viewer, where a failure to load the first page basically means that nothing will work; note how `{BaseViewer, PDFThumbnailViewer}.setDocument` depends completely on being able to fetch the *first* page.

[3] The only extra parsing is caused by, potentially, having to traverse *part* of the `Pages` tree to find the first page.
2018-12-29 12:47:25 +01:00
Tim van der Meij
360c3d3813
Remove the unused url argument for the ChunkedStreamManager class 2018-12-24 13:14:42 +01:00
Tim van der Meij
47344197f4
Convert src/core/chunked_stream.js to ES6 syntax 2018-12-24 13:14:42 +01:00
Tim van der Meij
103f4616ac
Merge pull request #10334 from Snuffleupagus/OpenAction-dest
[api-minor] Add support for OpenAction destinations (issue 10332)
2018-12-23 20:49:50 +01:00
Jonas Jenwald
f0719ed565 [api-minor] Change the getViewport method, on PDFPageProxy, to take a parameter object rather than a bunch of (randomly) ordered parameters
If, as PR 10368 suggests, more parameters should be added to `getViewport` I think that it would be a mistake to not change the signature *first* to avoid needlessly unwieldy call-sites.

To not break any existing code and third-party use-cases, this is obviously implemented with a deprecation warning *and* with a working fallback[1] for the old method signature.

---
[1] This is limited to `GENERIC` builds, which should be sufficient.
2018-12-21 11:55:20 +01:00
Jonas Jenwald
b05f053287 [api-minor] Add support for OpenAction destinations (issue 10332)
Note that the OpenAction dictionary may contain other information besides just a destination array, e.g. instructions for auto-printing[1].
Given first of all that an arbitrary `Dict` cannot be sent from the Worker (since cloning would fail), and second of all that the data obviously needs to be validated, this patch purposely only adds support for fetching a destination from the OpenAction entry[2].

---
[1] This information is, currently in PDF.js, being included through the `getJavaScript` API method.

[2] This significantly reduces the complexity of the implementation, which seems fine for now. If there's ever need for other kinds of OpenAction to be fetched, additional API methods could/should be implemented as necessary (could e.g. follow the `getOpenActionWhatever` naming scheme).
2018-12-19 11:45:16 +01:00
Jonas Jenwald
ba2edeae18 [api-minor] Add support, in getMetadata, for custom information dictionary entries (issue 5970, issue 10344) (#10346)
The custom entries, provided that they exist *and* that their types are safe to include, are exposed through a new `Custom` infoDict entry to clearly separate them from the standard ones.

Fixes 5970.
Fixes 10344.
2018-12-18 23:26:02 +01:00
Jonas Jenwald
437fb8a8a7 Ignore the fieldValue for Signature annotations, since they're currently unsupported (issue 10374)
Given that Signature (Widget) annotations are currently not supported, since they cannot be validated, simply ignoring the `fieldValue` seems OK for now considering that attempting to blindly include unparsed/unvalidated data isn't very useful.

Fixes 10347.
2018-12-12 18:01:43 +01:00
Tim van der Meij
45c0197465
Merge pull request #10330 from janpe2/svg-line-width-zero
Handle line width of zero in SVG
2018-12-07 23:34:27 +01:00
Jani Pehkonen
ddabeb0645 Handle line width of zero in SVG 2018-12-04 16:05:32 +02:00
Jonas Jenwald
d0fec7c6fb Fix NameOrNumberTree.get to actually perform a binary search to find the requested key
The intent of the code, based on existing comments, is to perform a binary search. However, because of what appears to be a typo in the code responsible for computing the current search index, this code is always checking *every* entry (albeit only at the "final" node) starting from the last one.
2018-11-23 23:52:33 +01:00
Jonas Jenwald
fdad0a0b0b Fallback to an exhaustive search, in corrupt PDF files, for NameTrees/NumberTrees that are not correctly ordered (issue 10272)
According to the specification, see https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.2384179, the keys of NameTree/NumberTree should be ordered.
For corrupt PDF files, which violate this assumption, we thus need to fallback to an exhaustive search in order to e.g. find all destinations.

*Please note:* Given that this only implements a fallback for the "final" node of the Tree, there's obviously a risk that the patch isn't sufficient for dealing with all kinds of out-of-order corruption. However, this kind of problem should be rare in practice, and without a real-world test-case it's difficult to implement a completely general solution (and there's obviously a question if you'd even want to).
2018-11-20 17:50:47 +01:00
Jani Pehkonen
9e990f6f3e Repair CFF fonts if stem hints are in wrong order 2018-11-20 18:50:37 +02:00
Jonas Jenwald
ac6b94c9dd Replace the remaining occurences, in src/display/api.js, of var with let/const 2018-11-18 19:08:27 +01:00
Jonas Jenwald
061f7bd2f3 Convert PDFWorker, in src/display/api.js, to an ES6 class
Also changes all occurrences of `var` to `let`/`const` in this code.
2018-11-18 19:08:27 +01:00
Jonas Jenwald
02e77a39ec Convert InternalRenderTask, in src/display/api.js, to an ES6 class
This changes all occurrences of `var` to `let`/`const` in this code, and updates the signature of the constructor to use object destructuring for better readability (and self documentation).
Also, `useRequestAnimationFrame` is changed to a parameter and the `typeof window` check is now done *once* rather than at every `_scheduleNext` call.
2018-11-18 19:08:27 +01:00
Jonas Jenwald
5a0d64a6de Convert PDFPageProxy, in src/display/api.js, to an ES6 class
This changes all occurrences of `var` to `let`/`const` in this code, and updates the signatures of a couple of methods to use object destructuring.
Finally, when creating `InternalRenderTask` instances *only* the necessary parameter are now provided, since passing through the `RenderParameters` as-is seems completely unnecessary.
2018-11-18 19:08:25 +01:00
Jonas Jenwald
2c003a82d5 Convert RenderTask, in src/display/api.js, to an ES6 class
Also deprecates the `then` method, in favour of the `promise` getter.
2018-11-18 19:08:00 +01:00
Jonas Jenwald
ef8e5fd77c Convert PDFDocumentLoadingTask, in src/display/api.js, to an ES6 class
Also deprecates the `then` method, in favour of the `promise` getter.
2018-11-18 19:07:57 +01:00
PalmerAL
5f15dc2023 Use span instead of div in the text layer
This improves copy/pasting text content since it reduces the amount of unnecessary newlines.
2018-11-18 15:54:08 +01:00
Tim van der Meij
2194aef03e
Merge pull request #10238 from Snuffleupagus/interfaces
Move the `interface` definitions out of `src/core/worker.js` and into their own file
2018-11-08 23:28:46 +01:00
Jonas Jenwald
4829f567c1 Move the interface definitions out of src/core/worker.js and into their own file
These interfaces are already used in different files, in both the `src/core/` and `src/display/` folders, and having them reside in their own file seems a lot clearer and is also similar to the existing viewer interfaces.

As part of moving the `interface` definitions, they're also converted to ES6 classes.
2018-11-08 13:21:37 +01:00
Jonas Jenwald
60da2d882b [api-minor] Refactor/simplify the PDFObject class
First of all, note how there's currently *two* methods for checking if a certain object exists, which seems completely unwarranted.
Furthermore, the rarely used `getData` method was removed and its only callsite changed to use a combination of `PDFObjects.{has, get}` instead.
Finally, the methods were rearranged slightly, to bring the most important ones (for an API user) to the top of the class.
2018-11-08 10:13:39 +01:00
Jonas Jenwald
d32321d84f Convert PDFObjects, in src/display/api.js, to an ES6 class
Also changes all occurrences of `var` to `const`, and marks internal properties/methods as "private".
2018-11-08 10:11:40 +01:00
Tim van der Meij
3e342554d1
Merge pull request #10228 from morille/patch-2
Don't detect nw.js as node.js
2018-11-07 23:51:01 +01:00
Romain Petit
13b0ca6b2a Don't detect nw.js as node.js
nw.js is chrome plus nodejs.
It will succeed everywhere chrome succeeds, but fail in many cases where nodejs succeeds (see issue 9071).
So it's safer to consider it as a browser context rather than a nodejs context.

Make travis happy again

CS

Readability + Explanation

The relevant portion of the NW.js documentation:
http://docs.nwjs.io/en/latest/For%20Users/Advanced/JavaScript%20Contexts%20in%20NW.js/#access-nodejs-and-nwjs-api-in-browser-context

Added full link to relevant doc.
2018-11-07 11:14:22 +01:00
Jonas Jenwald
a963d139dc Convert src/core/ps_parser.js to use ES6 classes
Besides being a fairly small and self-contained file, this code also shows a possible way of defining static constants on classes.
2018-11-03 17:43:06 +01:00
Tim van der Meij
ec76aa531e
Merge pull request #10202 from Snuffleupagus/issue-10200
Attempt to clean-up/restore pending rendering operations on `RenderTask.cancel` (issue 10200)
2018-11-02 23:11:47 +01:00
Jonas Jenwald
f23dba1c10 Change canvasInRendering to a WeakSet instead of a WeakMap
Note how nowhere in the code `canvasInRendering.get()` is ever called, and that this structure is really only used to store references to `<canvas>` DOM elements.
The reason for this being a `WeakMap` is probably because at the time we weren't using `core-js` polyfills yet, and since there already existed a manually implemented `WeakMap` polyfill it was probably simpler to use that.
2018-10-31 18:15:23 +01:00
Jonas Jenwald
f77b463339 Attempt to clean-up/restore pending rendering operations on RenderTask.cancel (issue 10200)
Please note that, given the lack of a runnable example, I'm not totally sure if this first of all is enough to *completely* address the issue as filed and second of all if we actually want this new behaviour.
2018-10-31 16:22:17 +01:00
Tim van der Meij
ed4ac1bc67
Merge pull request #10162 from janpe2/svg-normalize-bbox
Normalize BBox of form XObjects in SVG back-end
2018-10-28 13:18:48 +01:00
Jani Pehkonen
9cd5f94f03 Normalize the BBox of form XObjects on the /core side 2018-10-22 14:17:05 +03:00
Jonas Jenwald
5bb7f4b615 Convert PDFDataRangeTransport to an ES6 class 2018-10-20 17:15:27 +02:00
Tim van der Meij
d21892933d
Merge pull request #10161 from Snuffleupagus/DataLoaded-onProgress
Ensure that `onProgress` is always called when the entire PDF file has been loaded, regardless of how it was fetched (issue 10160)
2018-10-20 15:22:05 +02:00
Jonas Jenwald
54f9883c51 Export CMapCompressionType and PermissionFlag on the pdfjsLib object (issue 10148, PR 10033 follow-up)
`CMapCompressionType` makes a lot of sense to export, for anyone attempting to implement a custom `CMapReaderFactory`; fixes 10148.

`PermissionFlag` likewise needs to be exported, since otherwise the result of the `getPermissions` API method becomes difficult to interpret; follow-up to 10033.
2018-10-20 11:38:00 +02:00
Jonas Jenwald
327f2eb588 Ensure that onProgress is always called when the entire PDF file has been loaded, regardless of how it was fetched (issue 10160)
*Please note:* I'm totally fine with this patch being rejected, and the issue closed as WONTFIX; however these changes should address the issue if that's desired.

From a conceptual point of view, reporting loading progress doesn't really make a lot of sense for PDF files opened by passing raw binary data directly to `getDocument` (since obviously *all* data was loaded).
This is compared to PDF files loaded via e.g. `XMLHttpRequest` or the Fetch API, where the entire PDF file isn't available from the start and knowing the loading progress makes total sense.

However I can certainly see why the current API could be considered inconsistent, which isn't great, since a registered `onProgress` callback will never be called for certain `getDocument` calls.
The simplest solution to this inconsistency thus seem to be to ensure that `onProgress` is always called when handling the `DataLoaded` message, since that will *always* be dispatched[1] from the worker-thread.

---
[1] Note that this isn't guaranteed to happen, since setting `disableAutoFetch = true` often prevents the *entire* file from ever loading. However, this isn't relevant for the issue at hand, and is a well-known consequence of using `disableAutoFetch = true`; note how the default viewer even has a specialized code-path for hiding the loadingBar.
2018-10-16 13:51:12 +02:00
Jonas Jenwald
4cde844ffe Add a DOMTokenList.toggle polyfill for the second, optional, "force" parameter
This is based on the polyfill available at https://developer.mozilla.org/en-US/docs/Web/API/Element/classList#Polyfill
2018-10-12 15:41:09 +02:00
Tim van der Meij
9e9426c354
Merge pull request #10143 from Snuffleupagus/getMainThreadWorkerMessageHandler-catch-errors
Ensure that `getMainThreadWorkerMessageHandler` won't accidentally break `getDocument` (PR 10139 follow-up)
2018-10-11 00:05:01 +02:00
Jonas Jenwald
0e2c6047e4 Ensure that getMainThreadWorkerMessageHandler won't accidentally break getDocument (PR 10139 follow-up)
*This should have been part of PR 10139.*

In the event that a user has attempted to manually load the worker file on the main-thread, but somehow failed to do that correctly, there's a possibility that `getMainThreadWorkerMessageHandler` could throw. Considering how/where that helper function is being called, an error could still prevent `PDFDocumentLoadingTask` from completing (regardless if it's being resolved/rejected).
2018-10-09 15:44:31 +02:00
Jonas Jenwald
21c8dd4842 Combine the pdfjsFilePath and fallback workerSrc handling in src/display/api.js
With the way that the `getWorkerSrc()` helper function is implemented now, there's no longer a particularly strong reason for keeping the global `pdfjsFilePath` variable around.
With this patch the fallback `workerSrc` will thus, assuming is wasn't already set, be set to the "pdfjsFilePath" which simplifies the `getWorkerSrc()` function and reduces the amount of global state.

Finally, the global `workerSrc` variable was renamed to prevent shadowing.
2018-10-09 13:47:48 +02:00
Tim van der Meij
f45e46d7ad
Merge pull request #10133 from kevinleedrum/fix-content-length
Set returnValues.suggestedLength to Content-Length if integer
2018-10-09 00:05:57 +02:00
Kevin Lee Drum
4cf10ac79d set returnValues.suggestedLength to Content-Length if integer 2018-10-07 13:26:29 -04:00
Jonas Jenwald
755c6edc5e Ensure that the PDFDocumentLoadingTask is rejected when "setting up fake worker" failed (issue 10135)
This should, hopefully, cover all the possible ways[1] in which "fake workers" are loaded. Given the different code-paths, adding unit-tests might not be that simple.
Note that in order to make this work, the various `fakeWorkerFilesLoader` functions were converted to return `Promises`.

---
[1] Unfortunately there's lots of them, for various build targets and configurations.
2018-10-06 13:18:51 +02:00
Simon Leblanc
b5806735d8 Add support of Ink annotation 2018-10-03 00:28:49 +02:00
Tim van der Meij
138324502c
Merge pull request #10119 from Snuffleupagus/rm-onFileAttachmentAnnotation
Attempt to simplify the `fileattachmentannotation` event dispatching
2018-10-02 23:25:22 +02:00
Jonas Jenwald
d60ce998f1 Attempt to simplify the fileattachmentannotation event dispatching
This attempts to reduced the level of indirection, and the amount of code, when dispatching `fileattachmentannotation` events, by removing the `PDFLinkService.onFileAttachmentAnnotation` method and just accessing `PDFLinkService.eventBus` directly in the `FileAttachmentAnnotationElement` constructor.
Given that other properties, such as `externalLinkTarget`/`externalLinkRel`, are already being accessed directly this pattern seems fine here as well.
2018-10-01 15:09:08 +02:00
Jonas Jenwald
d6f4d2ff33 Add a Symbol polyfill, using core-js, to allow using for...of loops
https://github.com/zloirock/core-js#ecmascript-symbol
2018-09-29 16:05:00 +02:00
Jonas Jenwald
435ec6a0d5 Use the Font Loading API in MOZCENTRAL builds, and GENERIC builds for Firefox version 63 and above (issue 9945) 2018-09-29 16:05:00 +02:00
Jonas Jenwald
05b021bcce Refactor the FontLoader into proper, build-specific, ES6 classes
Also changes `var` to `let`/`const` in code already touched in the patch, and makes use of template strings in a few spots.
2018-09-29 16:05:00 +02:00
Jonas Jenwald
45d6651976 Refactor unused Date.now() calls in FontLoader.queueLoadingCallback
The `started` timestamp is completely usused, and the `end` timestamp is currently[1] being used essentially like a boolean value.
Hence this code can be simplified to use an actual boolean value instead, which avoids potentially hundreds (or even thousands) of unnecessary `Date.now()` calls.

---
[1] Looking briefly at the history of this code, I cannot tell if the timestamps themselves were ever used for anything (except for tracking "boolean" state).
2018-09-29 15:57:04 +02:00
Jonas Jenwald
ad3e937816 Replace the Font.loading property with, the already existing, Font.missingFile property
The `Font.loading` property is only ever used *once* in the code, whereas `Font.missingFile` is more widely used. Furthermore the name `loading` feels, at least to me, slight less clear than `missingFile`. Finally, note that these two properties are the inverse of each other.
2018-09-29 15:57:04 +02:00
Jonas Jenwald
caf90ff6ee Convert FontFaceObject to an ES6 class
Also changes `var` to `let`/`const` in code already touched in the patch, and makes use of template strings in a few spots.
2018-09-29 15:57:04 +02:00
Jonas Jenwald
842e9206c0 Replace String.prototype.substr() occurrences with String.prototype.substring()
As outlined in https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/substr, which refers to the ECMA-262 specification, using the `substr` function is advised against.

Hence this PR, which replaces all remaining `substr` occurrences with `substring` instead. Please refer to https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/substr#Syntax respectively https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/substring#Syntax for the differences between the two functions.

Note that in most cases in the code-base there's only one argument passed to `substr`, and those require no other changes except replacing "substr" with "substring". For the other cases, the `substr(start, length)` calls are changed to `substring(start, start + length)` instead.
2018-09-28 11:41:07 +02:00
Brendan Dahl
ae7dcae27e Fix abbreviation. 2018-09-13 13:10:38 -07:00
Brendan Dahl
6adeabbb66 Add Glyph & Cog's XPDF copyright/license information. 2018-09-12 13:59:56 -07:00
Jonas Jenwald
5181172498 Slightly improve the isSourcePDF parameter handling in JpegImage (PR 10031 follow-up)
Currently there's only a single spot in the code-base where `JpegImage.getData` is called, however it nonetheless seem like a good idea to ensure during tests that the `isSourcePDF` parameter is correctly set. (Especially considering that the PDF use-cases will break without it.)

Additionally, in `JpegImage._getLinearizedBlockData`, the code can be made a tiny bit more efficient by checking the value of `isSourcePDF` *first* to avoid useless checks (for the default PDF use-cases).
2018-09-12 11:30:59 +02:00
Tim van der Meij
bf13c8a50b
Use the const keyword for constants in src/shared/util.js
Moreover, move general constants to the top of the file, i.e., those
that are not closely tied to a function in the file.
2018-09-11 16:17:45 +02:00
Tim van der Meij
99de25d6cc
Implement unit tests for the isSameOrigin and createValidAbsoluteUrl utility functions
Moreover, mark the `isValidProtocol` function as private since it's only
used in the utilities file and is not (meant to be) exported.
2018-09-11 16:17:45 +02:00
Tim van der Meij
9a115b41de
Merge pull request #10034 from timvandermeij/canvas-workaround
Remove `getSinglePixelWidth` workaround
2018-09-09 17:36:04 +02:00
Tim van der Meij
66422eb83e
Merge pull request #9340 from brendandahl/private-use
Map all glyphs to the private use area and duplicate the first glyph.
2018-09-08 17:51:04 +02:00
Romain Petit
8671081001
Fix font-string variable name typo
The font-string rebuild condition is always satisfied because the concerned variables are never set.
2018-09-07 09:55:45 +02:00
Brendan Dahl
b76cf665ec Map all glyphs to the private use area and duplicate the first glyph.
There have been lots of problems with trying to map glyphs to their unicode
values. It's more reliable to just use the private use areas so the browser's
font renderer doesn't mess with the glyphs.

Using the private use area for all glyphs did highlight other issues that this
patch also had to fix:

  * small private use area - Previously, only the BMP private use area was used
    which can't map many glyphs. Now, the (much bigger) PUP 16 area can also be
    used.

  * glyph zero not shown - Browsers will not use the glyph from a font if it is
    glyph id = 0. This issue was less prevalent when we mapped to unicode values
    since the fallback font would be used. However, when using the private use
    area, the glyph would not be drawn at all. This is illustrated in one of the
    current test cases (issue #8234) where there's an "ä" glyph at position
    zero. The PDF looked like it rendered correctly, but it was actually not
    using the glyph from the font. To properly show the first glyph it is always
    duplicated and appended to the glyphs and the maps are adjusted.

  * supplementary characters - The private use area PUP 16 is 4 bytes, so
    String.fromCodePoint must be used where we previously used
    String.fromCharCode. This is actually an issue that should have been fixed
    regardless of this patch.

  * charset - Freetype fails to load fonts when the charset size doesn't match
    number of glyphs in the font. We now write out a fake charset with the
    correct length. This also brought up the issue that glyphs with seac/endchar
    should only ever write a standard charset, but we now write a custom one.
    To get around this the seac analysis is permanently enabled so those glyphs
    are instead always drawn as two glyphs.
2018-09-05 14:04:54 -07:00
Jonas Jenwald
e5a6d892b4
Revert "Attempt to combine separate beginText/endText sequences in getTextContent (issue 9984)" 2018-09-05 18:01:33 +02:00
Tim van der Meij
959ed3705b
Implement a permissions API 2018-09-02 21:23:09 +02:00
Tim van der Meij
4874e9ace0
Convert the WorkerTransport class, in src/display/api.js, to ES6 syntax 2018-09-02 21:06:57 +02:00
Tim van der Meij
9c37599fd3
Convert the PDFDocumentProxy class, in src/display/api.js, to ES6 syntax
Moreover, indicate that a member are private and improve the comments to
be more consistent.
2018-09-02 21:06:57 +02:00
Tim van der Meij
1a3e842dc4
Remove getSinglePixelWidth workaround
It's no longer necessary since https://bugzilla.mozilla.org/show_bug.cgi?id=1305963 is fixed quite some time ago.

While we're here, mark the `cachedGetSinglePixelWidth` member as being
private and use ES6 syntax in the `getSinglePixelWidth` method.
2018-09-02 20:36:06 +02:00
Jonas Jenwald
663922f93f Add a new parameter to JpegImage.getData to indicate the source of the image data (issue 9513)
The purpose of this patch is to provide a better default behaviour when `JpegImage` is used to parse standalone JPEG images with CMYK colour spaces.
Since the issue that the patch concerns is somewhat of a special-case, the implementation utilizes the already existing decode support in an attempt to minimize the impact w.r.t. code size.

*Please note:* It's always possible for the user of `JpegImage` to control image inversion, and thus override the new behaviour, by simply passing a custom `decodeTransform` array upon initialization.
2018-09-02 14:15:22 +02:00
Jonas Jenwald
47bf12cbac Change JpegImage._isColorConversionNeeded into a getter, rather than a regular function
Given how `_isColorConversionNeeded` is used, and that it always returns a boolean value, having it be a getter seems more appropriate.
2018-09-02 13:06:28 +02:00
Tim van der Meij
c94df0fef3
Merge pull request #9986 from Snuffleupagus/issue-9984
Attempt to combine separate beginText/endText sequences in `getTextContent` (issue 9984)
2018-09-01 21:21:29 +02:00
Tim van der Meij
66bd088948
Merge pull request #10010 from Snuffleupagus/issue-10004
Attempt to find truncated endstream commands, in the fallback code-path, in `Parser.makeStream` (issue 10004)
2018-09-01 18:44:08 +02:00
Tim van der Meij
283f2dfcc3
Merge pull request #10022 from janpe2/svg-Tr
Implement text rendering modes in SVG backend
2018-08-29 23:51:07 +02:00
Tim van der Meij
27ebb41b8f
Merge pull request #10020 from Snuffleupagus/addon-prefs-no-eslint
Ensure that the built `PdfJsDefaultPreferences.jsm` file won't be affected/touched during tree-wide ESLint rule changes in `mozilla-central` (PR 9571 follow-up)
2018-08-29 22:40:56 +02:00
Jonas Jenwald
d8aaa2f978 Update to the current year, i.e. 2018, in the bundle license headers 2018-08-28 23:46:56 +02:00
Jani Pehkonen
c426ea376c Implement text rendering modes in SVG backend 2018-08-29 00:42:07 +03:00
cheryly279
29c0ea159d Adding chunkname to async loaded code
Better name
2018-08-27 17:17:32 -04:00
Jonas Jenwald
95e5bad4c4 Attempt to find truncated endstream commands, in the fallback code-path, in Parser.makeStream (issue 10004)
Apparently there's some PDF generators, in this case the culprit is "Nooog Pdf Library / Nooog PStoPDF v1.5", that manage to mess up PDF creation enough that endstream[1] commands actually become truncated.

*Please note:* The solution implemented here isn't perfect, since it won't be able to cope with PDF files that contains a *mixture* of correct and truncated endstream commands.
However, considering that this particular mode of corruption *fortunately* doesn't seem very common[2], a slightly less complex solution ought to suffice for now.

Fixes 10004.

---
[1] Scanning through the PDF data to find endstream commands becomes necessary, in order to determine the stream length in cases where the `Length` entry of the (stream) dictionary is missing/incorrect.

[2] I cannot recall having seen any (previous) issues/bugs with "Missing endstream" errors.
2018-08-26 11:51:11 +02:00
Jonas Jenwald
c81cbe113c Extract the "scanning for endstream command" part of Parser.makeStream into a helper method
With this code now living in a separate method, it can be simplified slightly (e.g. by using early returns).
2018-08-26 11:51:09 +02:00
Tim van der Meij
436d2efa8a
Merge pull request #10007 from Snuffleupagus/ColorSpace-class
Convert the code in `src/core/colorspace.js to use ES6 classes
2018-08-25 18:45:40 +02:00
Tim van der Meij
4a0d15aa0e
Slightly simplify the catalog code 2018-08-25 16:40:59 +02:00
Tim van der Meij
aec236f6d8
Convert the Catalog class, in src/core/obj.js, to ES6 syntax 2018-08-25 16:38:22 +02:00
Jonas Jenwald
a182907592 Replace all occurences of var with let/const in src/core/colorspace.js 2018-08-25 03:20:21 +02:00
Jonas Jenwald
ce9a38c536 Convert the code in `src/core/colorspace.js to use ES6 classes
Reduces the amount of boilerplate code when defining the the sub-classes.

Please note that a couple of the closures were kept, since it's not (yet) possible to include helper functions inside of `class`es.
2018-08-25 03:20:19 +02:00
Jonas Jenwald
45b7b861b8 Remove the unused defaultColor property on ColorSpace instances
This property is not only completely unused now, it never actually appears to have been used. Even though the memory savings, from not initializing these extra typed arrays, won't be significant in the grand scheme of things it still seems completely unnecessary to keep allocating this data.

As far as I can tell, the main reason for the existence of `defaultColor` seem to be for documentation purposes. Hence the code is changed into comments instead, to keep the information around (but without the unnecessary allocations).
2018-08-23 11:16:52 +02:00
Jonas Jenwald
099ed08852 Add support for async/await using Babel
For proof-of-concept, this patch converts a couple of `Promise` returning methods to use `async` instead.
Please note that the `generic` build, based on this patch, has been successfully testing in IE11 (i.e. the viewer loads and nothing is obviously broken).

Being able to use modern JavaScript features like `async`/`await` is a huge plus, but there's one (obvious) side-effect: The size of the built files will increase slightly (unless `SKIP_BABEL == true`). That's unavoidable, but seems like a small price to pay in the grand scheme of things.

Finally, note that the `chromium` build target was changed to no longer skip Babel translation, since the Chrome extension still supports version `49` of the browser (where native `async` support isn't available).
2018-08-19 16:54:11 +02:00
Tim van der Meij
4ea663aa8a
Merge pull request #9987 from Snuffleupagus/rm-createBlob
[api-minor] Remove the obsolete `createBlob` helper function
2018-08-19 16:43:36 +02:00
Jonas Jenwald
75923ea515 Remove the unused PDFDocument.mainXRefEntriesOffset method
Not only is this method completely unused *now*, looking through the history of the code it never appears to have been used for anything either.
Years ago `mainXRefEntriesOffset` was included when creating `XRef` instances, however it wasn't actually used for anything (the parameter was never checked, nor assigned to a property on `XRef`).

If this method ever becomes useful (again) it's easy enough to restore it thanks to version control, but including dead code in the builds just seems wasteful.
2018-08-19 14:08:39 +02:00
Jonas Jenwald
50a47be190 [api-minor] Remove the obsolete createBlob helper function
At this point in time, all supported browsers have native support for `Blob`; please see https://developer.mozilla.org/en-US/docs/Web/API/Blob/Blob#Browser_compatibility.
Furthermore, note how the helper function was throwing an error if `Blob` isn't available anyway.
2018-08-19 13:37:19 +02:00
Jonas Jenwald
497b765ede Attempt to combine separate beginText/endText sequences in getTextContent (issue 9984)
Please note that while this *improves* issue 9984 slightly (and likely others too), it's not a complete solution.
The remaining issues are related to the, more general, problems with the existing heuristics related to attempting to combine separate text items.
2018-08-18 13:45:32 +02:00
Jonas Jenwald
bc89edb8f0 Ensure that Uint8ClampedArray is used for image data transfered by getTransfers (PR 9802 follow-up)
One of the `QueueOptimizer` cases wasn't updated to use `Uint8ClampedArray`s, which leads to inconsistent image data on the API side (but no actual rendering bugs, as far as I can tell).
To prevent future errors, a non-production/test-only `assert` was added to ensure that the relevant image data only uses `Uint8ClampedArray`s.
2018-08-16 10:29:44 +02:00
Tim van der Meij
1268aea2b6
Merge pull request #9975 from Snuffleupagus/getDestination-refactor
Re-factor `destinations`/`getDestination` to reduce unnecessary duplication, and reject non-string inputs
2018-08-12 15:51:58 +02:00
Tim van der Meij
af19ed6ee9
Merge pull request #9822 from timvandermeij/annotations
[api-minor] Refactor the annotation code to be asynchronous
2018-08-11 20:39:50 +02:00
dmitryskey
3741becb9b
[api-minor] Refactor the annotation code to be asynchronous
This commit is the first step towards implementing parsing for the
appearance streams of annotations.

Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>
Co-authored-by: Tim van der Meij <timvandermeij@gmail.com>
2018-08-11 19:00:29 +02:00
Jonas Jenwald
1179584fd6 Reject getDestination, in the API, for non-string inputs
Note how e.g. the `getPage` method does basic validation of the input.
2018-08-11 16:06:35 +02:00
Jonas Jenwald
b74c813353 Re-factor destinations/getDestination, in the Catalog, to reduce unnecessary duplication
Currently, these two methods contain the same boilerplate code for getting the /Dests data.
2018-08-11 16:04:58 +02:00
Jonas Jenwald
06d1ff5af4 Tweak the MMType1 font detection in getFontFileType to improve font telemetry (PR 9961 follow-up)
Please note that this patch does *not* affect rendering in any way, however it's relevant for font telemetry[1].

According to the specification, see https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G8.1904956, Type1C is a valid subtype for *both* Type1 and MMType1 fonts.

---
[1] Refer to the font telemetry results in https://telemetry.mozilla.org/new-pipeline/dist.html#!cumulative=0&end_date=2018-06-25&keys=__none__!__none__!__none__&max_channel_version=nightly%252F62&measure=PDF_VIEWER_FONT_TYPES&min_channel_version=nightly%252F59&processType=*&product=Firefox&sanitize=1&sort_keys=submissions&start_date=2018-05-07&table=0&trim=1&use_submission_date=0

See also https://github.com/mozilla/pdf.js/wiki/Enumeration-Assignments-for-the-Telemetry-Histograms#pdf_viewer_font_types for help with interpreting the data.
2018-08-08 12:18:37 +02:00
Jonas Jenwald
f78efd883e Attempt to throw MissingPDFException when applicable in node_stream.js (issue 9791) 2018-08-06 10:00:03 +02:00
Tim van der Meij
4111871ac5
Merge pull request #9958 from brendandahl/always-fallback
Always fallback to system font on font failure.
2018-08-05 19:58:48 +02:00
Jonas Jenwald
3177f6aa55 Parse the font file to determine the correct type/subtype, rather than relying on the (often incorrect) data in the font dictionary
The current font type/subtype detection code is quite inconsistent/unwieldy. In some cases it will simply assume that the font dictionary is correct, in others it will somewhat "arbitrarily" check the actual font file (more of these cases have been added over the years to fix specific bugs).

As is evident from e.g. issue 9949, the font type/subtype detection code is continuing to cause issues. In an attempt to get rid of these hacks once and for all, this patch instead re-factors the type/subtype detection to *always* parse the font file.

Please note that, as far as I can tell, we still appear to need to rely on the composite font detection based on the font dictionary. However, even if the composite/non-composite detection would get it wrong, that shouldn't really matter too much given that there's basically only two different code-paths (for "TrueType-like" vs "Type1-like" fonts).
2018-08-05 11:13:16 +02:00
Jonas Jenwald
9bbca04579 Add a (basic) isCFFFile helper function to detect CFF font files
Compared to most other font formats, the CFF doesn't have a constant header which makes is slightly more difficult to detect such font files.

Please refer to the Compact Font Format specification: https://www.adobe.com/content/dam/acom/en/devnet/font/pdfs/5176.CFF.pdf#G3.32094
2018-08-05 11:13:14 +02:00
Jonas Jenwald
f4db38aadf Update the TrueType font file detection to also recognize the Mac specific header 'true'
Please refer to the TrueType specification: https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6.html#ScalerTypeNote
2018-08-05 10:33:56 +02:00
Brendan Dahl
5f67a6a237 Always fallback to system font on font failure.
The font in the PDF is marked as a CIDFontType0, but the font file is
actually a true type font. To fully address this issue we should really
peek into the font file and try to determine what it is. However, this
is the first case of this issue, so I think this solution is acceptable for
now.
2018-08-03 16:49:22 -07:00
Tim van der Meij
f19ee127a3
Merge pull request #9874 from boundlesshq/master
[api-minor] Include export value for checkboxes
2018-08-03 23:43:23 +02:00
Jonas Jenwald
a504befc76 Stop warning for non-Name /Filter entries in the PDFImage constructor (PR 9897 follow-up)
Fixes a stupid oversight on my part, since /Filter may (obviously) contain an Array, which resulted in unnecessary console warning spam in perfectly valid PDF files.
Note that it still makes sense to check that /Filter is actually a Name, before attempting to access its `name` property, but the warning should definitely be removed.
2018-08-03 10:23:08 +02:00
Brian
2a665ebad4 Removed Extraneous Matrix Check in CalRGB Conversion 2018-08-02 10:16:42 -07:00
Tim van der Meij
716acf63d4
Merge pull request #9938 from Snuffleupagus/issue-9915
Ensure that Type0, i.e. composite, OpenType fonts with `CFF ` tables are *not* treated as CFF fonts if their glyph mapping is non-default (issue 9915)
2018-08-02 00:11:18 +02:00
Jonas Jenwald
3ce420131f Prefer the Width/Height of the image data, rather than the image dictionary, for JPEG 2000 images (issue 9650)
According to the PDF specification, see https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#page=45
> When using the JPXDecode filter with image XObjects, the following changes to and constraints on some entries in the image dictionary shall apply (see 8.9.5, "Image Dictionaries" for details on these entries):
>
>  - Width and Height shall match the corresponding width and height values in the JPEG2000 data.
>
>  - . . .

Hence it seems reasonable to use the Width/Height of the image data *itself*, rather than the image dictionary when there's a mismatch. Given that JPEG 2000 images are already being parsed, in order to obtain basic parameters, the actual Width/Height is readily available in the `PDFImage` constructor.
2018-08-01 16:42:26 +02:00
Jonas Jenwald
17f65908ae Add more validation of the /Filter entry, in image dictionaries, to the PDFImage constructor
Given that the code is currently assuming that the /Filter entry is a `Name`, it cannot hurt to actually ensure that's the case.

Also fixes an error message, for JPEG 2000 images with unsupported ColorSpaces, since `this.numComps` hasn't been initialized when it's accessed during the `throw new Error()` invocation.
2018-08-01 16:41:15 +02:00
Jonas Jenwald
17eac2d48a Ensure that Type0, i.e. composite, OpenType fonts with CFF tables are *not* treated as CFF fonts if their glyph mapping is non-default (issue 9915)
This particular code-path has been the source of *numerous* regressions to date, so hopefully this patch won't cause any more of those.

Fixes 9915.
2018-07-29 23:06:15 +02:00
Jonas Jenwald
cfdb597e4a Ensure that the CIDSystemInfo strings, in Type0 fonts, are correctly decoded
This isn't directly related to the subsequent patch, but just something that I happened to notice while poking around in the font code.
2018-07-29 23:06:15 +02:00
Tim van der Meij
3521424576
Merge pull request #9920 from Snuffleupagus/getMetadata-linearization
[api-minor] Add an `IsLinearized` property to the `PDFDocument.documentInfo` getter, to allow accessing the linearization status through the API (via `PDFDocumentProxy.getMetadata`)
2018-07-29 20:23:22 +02:00
Tim van der Meij
f45450bd78
Merge pull request #9931 from Snuffleupagus/refactor-getPage
Refactor `getPage` (in the worker), and attempt to use the `Linearization` dictionary to lookup the first Page
2018-07-29 19:33:46 +02:00
Tim van der Meij
a2c317f12b
Merge pull request #9925 from Snuffleupagus/StreamsSequenceStream-maybeLength
Attempt to estimate the minimum required `buffer` length when initializing `StreamsSequenceStream` instances
2018-07-29 16:52:34 +02:00
Jonas Jenwald
ec3728b540 Use the Linearization dictionary, if it exists, when fetching the first Page
Since PDF.js already supports range requests and streaming, not to mention chunked rendering, attempting to use the `Linearization` dictionary in `PDFDocument.getPage` probably isn't going to improve performance in any noticeable way.
Nonetheless, when `Linearization` data is available, it will allow looking up the first Page *directly* without having to descend into the `Pages` tree to find the correct object.
2018-07-28 22:23:36 +02:00
Jonas Jenwald
fbb25ff4e2 Move getPage, on the worker side, from Catalog and into PDFDocument instead
Addresses an existing TODO, and avoids having to pass in a `pageFactory` when creating `Catalog` instances.
2018-07-28 22:23:36 +02:00
Jonas Jenwald
81b471c781 [Regression] Convert Catalog.builtInCMapCache into a Map, instead of an Object, to ensure that it's correctly reset (PR 8064 follow-up)
With the `builtInCMapCache` being a simple Object, it unfortunately means that the `Catalog.cleanup` method isn't resetting it as intended.
By just replacing the `builtInCMapCache` with an empty Object, existing references to it will not actually be updated. The result is that e.g. `Page` instances still keeps references to, what should have been removed, CMap data.

To fix these problems, the `builtInCMapCache` is converted into a `Map` instead (since it can be easily reset).
2018-07-28 22:20:43 +02:00
bion
c31ddf7edc [api-minor] Include export value for checkboxes 2018-07-28 00:30:41 -07:00
Jonas Jenwald
928b89382e [api-minor] Add an IsLinearized property to the PDFDocument.documentInfo getter, to allow accessing the linearization status through the API (via PDFDocumentProxy.getMetadata)
There was a (somewhat) recent question on IRC about accessing the linearization status of a PDF document, and this patch contains a simple way to expose that through already existing API methods.
Please note that during setup/parsing in `PDFDocument` the linearization data is already being fetched and parsed, provided of course that it exists. Hence this patch will *not* cause any additional data to be loaded.
2018-07-26 15:54:19 +02:00
Jonas Jenwald
8a4466139b Simplify the DocumentInfoValidators definition
With this file now being a proper (ES6) module, it's no longer (technically) necessary for this structure to be lazily initialized. Considering its size, and simplicity, I therefore cannot see the harm in letting `DocumentInfoValidators` just be simple Object instead.

While I'm not aware of any bugs caused by the current code, it cannot hurt to add an `isDict` check in `PDFDocument.documentInfo` (since the current code assumes that `infoDict` being defined implies it also being a Dictionary).

Finally, the patch also converts a couple of `var` to `let`/`const`.
2018-07-26 15:54:01 +02:00
Jonas Jenwald
2d51bce941 Remove unnecessary stream.length check from PDFDocument.linearization
Note first of all that `PDFDocument` will be initialized with either a `Stream` or a `ChunkedStream`, and that both of these have `length` getters. Secondly, the `PDFDocument` constructor will assert that the `stream` has a non-zero (and positive) length. Hence there's no point in checking `stream.length` in the `linearization` getter.
2018-07-26 15:54:01 +02:00
Jonas Jenwald
32bfa55d98 Attempt to estimate the minimum required buffer length when initializing StreamsSequenceStream instances
For most other `DecodeStream` based streams, we'll attempt to estimate the minimum `buffer` length based on the raw stream data. The purpose of this is to avoid having to unnecessarily re-size the `buffer`, thus reducing the number of *intermediate* allocations necessary when decoding the stream data.
However, currently no such optimization is attempted for `StreamsSequenceStream`, and given that they can often be quite large that seems unfortunate. To improve this, at least somewhat, this patch utilizes the raw sizes of the `StreamsSequenceStream` sub-streams to estimate the minimum required `buffer` length.

Most likely this patch won't have a huge effect on memory consumption, however for pathological cases it should help reduce peak memory usage slightly.
One example is the PDF file in issue 2813, where currently the `StreamsSequenceStream` instances would grow their `buffer`s as `2 MiB -> 4 MiB -> 8 MiB -> 16 MiB -> 32 MiB`. With this patch, the same stream `buffers`s grow as `8 MiB -> 16 MiB -> 32 MiB`, thus avoiding a total of `12 MiB` of *intermediate* allocations (since there's two `StreamsSequenceStream` used, for rendering/text-extraction).
2018-07-26 13:42:59 +02:00
Jonas Jenwald
36b683ca55 Provide custom messages for the no-restricted-globals ESLint rule, and refactor the .eslintrc files (PR 9868 follow-up)
Without providing useful (custom) error messages for the `no-restricted-globals` rule, see https://eslint.org/docs/rules/no-restricted-globals, it's quite likely that the rule will be incorrectly disabled rather than the required globals being imported as intended.

To reduced duplication of the `no-restricted-globals` rule in multiple `.eslintrc` files, it's instead moved to the top-level `.eslintrc` file and disabled as needed on a folder/file basis outside of `/src` and `/web`.
2018-07-23 14:10:13 +02:00
Jonas Jenwald
8ec99b200c Prevent Metadata/XML parsing from breaking PDFDocumentProxy.getMetadata when no XML root document is found (issue 8884)
With the new XML parser, see PR 9573, the referenced PDF file now causes `getMetadata` to fail when incomplete XML tags are encountered. This provides a simple, and hopefully generally useful, work-around that may also help prevent future bugs.

(Without being able to reproduce nor even understand the other (non XML) errors mentioned in issue 8884, I'd say that this patch is enough to close that one as fixed.)
2018-07-18 11:37:40 +02:00
Tim van der Meij
61db85ab64
Merge pull request #9886 from Snuffleupagus/bug-1473809
Prevent errors in `sanitizeTTProgram`, during parsing of CALL functions, when encountering invalid functions stack deltas (bug 1473809)
2018-07-15 17:23:52 +02:00
Jonas Jenwald
8e76d26e5b Move the toRoman helper function out of the Util scope
Compared to all the other (static) methods in `Util`, the `toRoman` one looks slightly out of place. Even more so considering that `Util` is being exposed through `pdfjsLib`, where access to a Roman numerals conversion method doesn't make much sense.
2018-07-10 10:45:25 +02:00
Jonas Jenwald
c1c49badff Remove the, now unused, Util.inherit helper function 2018-07-10 10:29:47 +02:00
Jonas Jenwald
2b25deb84c Prevent errors in sanitizeTTProgram, during parsing of CALL functions, when encountering invalid functions stack deltas (bug 1473809)
*I was feeling bored; so this is a very quick, and somewhat naive, attempt at fixing the bug.*

The breaking error, i.e. `Error during font loading: invalid array length`, was thrown when attempting to re-size the `stack` to a *negative* length when parsing the CALL functions.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1473809.
2018-07-10 09:45:55 +02:00
Jonas Jenwald
bf6d45f85a Convert CMap and IdentityCMap to ES6 classes
Also changes `var` to `let`/`const` in code already touched in the patch.
2018-07-09 21:12:01 +02:00
Jonas Jenwald
b773b356af Convert NameOrNumberTree, NameTree, and NumberTree to ES6 classes
Also changes `var` to `let`/`const` in code already touched in the patch.
2018-07-09 21:12:01 +02:00
Jonas Jenwald
ba1af46709 Convert CompiledFont, TrueTypeCompiled, and Type2Compiled to ES6 classes
Also changes `var` to `let`/`const` in code already touched in the patch.
2018-07-09 21:12:01 +02:00
Jonas Jenwald
775763a091 Ensure that CompiledFont.compileGlyph always returns an Array (PR 6141 follow-up)
PR 6141 changed `CompiledFont.compileGlyph` to, in the general case, return an Array. However, that PR apparenly forgot to update the no-glyph, empty-glyph, and endchar-glyph code-path and a String was still being (incorrectly) returned.

Given the way that `FontFaceObject.getPathGenerator` (on the API side) is implemented, this shouldn't have caused any bugs despite the Worker possible returning unexpected data.
2018-07-09 21:12:01 +02:00
Tim van der Meij
646d81cd09
Merge pull request #9837 from timvandermeij/unreachable
Replace `NotImplementedException` with `unreachable`
2018-07-09 21:10:36 +02:00
Tim van der Meij
907c7f190b
Convert src/code/pdf_manager.js to ES6 classes/syntax 2018-07-08 16:43:46 +02:00
Jonas Jenwald
a9ce4e8417 Stop exposing the URL polyfill in the global scope
This moves/exposes the `URL` polyfill similarily to the existing `ReadableStream` polyfill, rather than exposing it globally, to avoid interfering with any "outside" code.
Both the `URL` and `ReadableStream` polyfills are now exposed on the `pdfjsLib` object, such that they are accessible to the viewer components.
Furthermore, the `no-restricted-globals` ESLint rule is also enabled to prevent accidental usage of the native `URL`/`ReadableStream` implementations directly in the `src/` and `web/` folders; see also https://eslint.org/docs/rules/no-restricted-globals

Addresses the remaining TODO in https://github.com/mozilla/pdf.js/projects/6
2018-07-04 09:16:28 +02:00
Tim van der Meij
99f8f2c275
Merge pull request #9853 from Snuffleupagus/re-render-after-cancel
Fix re-rendering, using the same canvas, when rendering was previously cancelled (PR 8519 follow-up)
2018-06-29 23:25:43 +02:00
Tim van der Meij
6fa2c779b5
Merge pull request #9838 from Snuffleupagus/invalid-path-OPS
Error, rather than warn, once a number of invalid path operators are encountered in `EvaluatorPreprocessor.read` (bug 1443140)
2018-06-28 23:15:25 +02:00
Jonas Jenwald
bf0aca86d7 Fix re-rendering, using the same canvas, when rendering was previously cancelled (PR 8519 follow-up)
Currently if `RenderTask.cancel` is called *immediately* after rendering was started, then by the time that `InternalRenderTask.initializeGraphics` is called rendering will already have been cancelled.
However, we're still inserting the canvas into the `canvasInRendering` map, thus breaking any future attempts at re-rendering using the same canvas. Considering that `InternalRenderTask.cancel` always removes the canvas from the map, I cannot imagine that we'd ever want to re-add it *after* rendering was cancelled (it was likely just a simple oversight in PR 8519).

Fixes 9456.
2018-06-28 22:56:37 +02:00
Tim van der Meij
14b69a4c1c
Merge pull request #9729 from Snuffleupagus/gulp-image_decoders
Add a `gulp image_decoders` command to package the image decoders (i.e. jpg.js, jpx.js, jbig2.js) separately, and publish them in pdfjs-dist
2018-06-26 23:27:32 +02:00
Jonas Jenwald
74e9999044 Add unit-tests for PDFPageProxy.stats (PR 9245 follow-up)
This wasn't included in PR 9245, since all the API options were still global at that time.

Writing the unit-tests also uncovered an issue with `getOperatorList` not starting the "Page Request" timer.
2018-06-25 14:20:49 +02:00
Jonas Jenwald
7f21e38787 Error, rather than warn, once a number of invalid path operators are encountered in EvaluatorPreprocessor.read (bug 1443140)
Incomplete path operators, in particular, can result in fairly chaotic rendering artifacts, as can be observed on page four of the referenced PDF file.

The initial (naive) solution that was attempted, was to simply throw a `FormatError` as soon as any invalid (i.e. too short) operator was found and rely on the existing `ignoreErrors` code-paths. However, doing so would have caused regressions in some files; see the existing `issue2391-1` test-case, which was promoted to an `eq` test to help prevent future bugs.
Hence this patch, which adds special handling for invalid path operators since those may cause quite bad rendering artifacts.

You could, in all fairness, argue that the patch is a handwavy solution and I wouldn't object. However, given that this only concerns *corrupt* PDF files, the way that PDF viewers (PDF.js included) try to gracefully deal with those could probably be described as a best-effort solution anyway.

This patch also adjusts the existing `warn`/`info` messages to print the command name according to the PDF specification, rather than an internal PDF.js enumeration value. The former should be much more useful for debugging purposes.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1443140.
2018-06-24 16:05:08 +02:00
Tim van der Meij
2907827d31
Replace NotImplementedException with unreachable 2018-06-23 21:20:53 +02:00
Jonas Jenwald
275834ae66 Clean-up, and add JSDocs to, the PDFDocumentProxy.loadingParams method (PR 9830 follow-up) 2018-06-23 13:33:22 +02:00
Tim van der Meij
34594a5b02
Merge pull request #9830 from EugeneSqr/9824
Removed safari compatibility check (issue #9824)
2018-06-23 02:21:06 +02:00
Tim van der Meij
98ea39f9d0
Merge pull request #9827 from Snuffleupagus/misc-corrupt-pdf-fixes
Fix various corrupt PDF files (issue 9252, issue 9418)
2018-06-21 22:35:00 +02:00
eugenesqr
331ac8ae74 removed safari compatibility check 2018-06-21 12:57:56 +03:00
Brendan Dahl
a278c5a8dc
Merge pull request #9795 from timvandermeij/object-assign
Replace `Util.extendObj` by `Object.assign`
2018-06-20 10:50:40 -07:00
Jonas Jenwald
56e3648b65 Add basic validation of the 'trailer' dictionary candidates in XRef.indexObjects (issue 9418)
This patch avoids choosing a (possible) 'trailer' dictionary that `XRef.parse` and/or the `Catalog` constructor/methods will reject anyway.
Since `XRef.indexObjects` is already parsing the entire PDF file, the extra dictionary look-ups added here shouldn't matter much. Besides, this is a fallback code-path that only applies to corrupt PDF files anyway.
2018-06-20 13:41:22 +02:00
Jonas Jenwald
346810e02a Add basic validation of the 'Root' dictionary in XRef.parse and try to recover when possible
Note that the `Catalog` constructor, and some of its methods, are already enforcing that the 'Root' dictionary is valid/well-formed. However, by doing additional validation already in `XRef.parse` there's a slightly larger chance that corrupt PDF files could be successfully parsed/rendered.
2018-06-20 13:41:22 +02:00
Jonas Jenwald
e84813e7cc Prevent hard errors if fetching the Encrypt dictionary fails in XRef.parse 2018-06-20 13:41:22 +02:00
Jonas Jenwald
30ad62a86a Use the correct startPos when repeating the search for 'endobj' operators in XRef.indexObjects (PR 9288 follow-up) 2018-06-20 13:41:22 +02:00
Jonas Jenwald
6bbcafcd26 Let Lexer.getNumber treat a single decimal point as zero (issue 9252)
This is consistent with the behaviour in Adobe Reader.
2018-06-20 13:41:21 +02:00
Jonas Jenwald
df4799a12a Ensure that line-breaks are *only* skipped after operators in Lexer.getNumber (PR 8359 follow-up)
With the current code line-breaks are accepted not just after an operator, but after a decimal point as well. When looking at this again, the latter case seems prone to cause false positives and might also interfere with subsequent patches.

Hence this is code is adjusted to actually do what the original commit message says, and nothing more.
2018-06-20 13:41:15 +02:00
Jonas Jenwald
303537bcb1 Add a gulp image_decoders command to allow packaging/distributing the image decoders (i.e. jpg.js, jpx.js, jbig2.js) separately from the main PDF.js library
Please note that the standalone `pdf.image_decoders.js` file will be including the complete `src/shared/util.js` file, despite only using parts of it.[1] This was done *purposely*, to not negatively impact the readability/maintainability of the core PDF.js code.

Furthermore, to ensure that the compatibility is the same in the regular PDF.js library *and* in the the standalone image decoders, `src/shared/compatibility.js` was included as well.

To (hopefully) prevent future complaints about the size of the built `pdf.image_decoders.js` file, a few existing async-related polyfills are being skipped (since all of the image decoders are completely synchronous).
Obviously this required adding a couple of pre-processor statements, but given that these are all limited to "compatibility" code, I think this might be OK!?

---
[1] However, please note that previous commits moved `PageViewport` and `MessageHandler` out of `src/shared/util.js` which reduced its size.
2018-06-16 17:56:54 +02:00
Jonas Jenwald
bfc88ead66 Expose a Jbig2Image.parse method, by re-instating the parseJbig2 function
The purpose of this patch is to hopefully provide *slightly* better user ergonomics, if/when the PDF.js image decoders are used standalone.

This implementation is (basically) reverting the changes in PR 9386, in conjunction with code from the `parse` method found at https://github.com/notmasteryet/jpgjs/blob/master/src/pdfjs.js
2018-06-16 17:56:54 +02:00
Jonas Jenwald
682672db8e Change the signature of the JpegImage constructor, to allow passing in various options directly 2018-06-16 17:56:54 +02:00
Tim van der Meij
620da6f4df
Merge pull request #9802 from Snuffleupagus/ColorSpace-PDFImage-Uint8ClampedArray
Update `ColorSpace` and `PDFImage` to use `Uint8ClampedArray`s and remove manual clamping/rounding
2018-06-16 17:55:10 +02:00
Jonas Jenwald
0958006713 Send UnsupportedFeature notification when errors are ignored in FontFaceObject.getPathGenerator 2018-06-13 11:02:10 +02:00
Jonas Jenwald
bf0db0fb72 Pass the ignoreErrors API option to the FontFaceObject constructor, and utilize it in getPathGenerator to ignore missing glyphs
Obviously it's still not possible to render non-embedded fonts as paths, but in this way the rest of the page will at least be allowed to continue rendering.

*Please note:* Including the 14 standard fonts in PDF.js probably wouldn't be *that* difficult to implement. (I'm not a lawyer, but the fonts from PDFium could probably be used given their BSD license.)
However, the main blocker ought to be the total size of the necessary font data, since I cannot imagine people being OK with shipping ~5 MB of (additional) font data with Firefox. (Based on the reactions when the CMap files were added, and those are only ~1 MB in size.)
2018-06-13 11:02:06 +02:00
Jonas Jenwald
fe288bb872 Refactor the FontFaceObject.getPathGenerator method
- Reduce the overall indentation level, by making use of early returns.

 - Replace `var` with `let`.
2018-06-13 11:02:02 +02:00
Jonas Jenwald
778981ec89 Catch, and propagate, errors in the requestAnimationFrame branch of InternalRenderTask._scheduleNext
To support these changes, `InternalRenderTask._next` now returns a Promise.
2018-06-13 11:01:58 +02:00
Jonas Jenwald
d4ff541b78 Enforce the use, in non-production/test-only mode, of Uint8ClampedArray in all relevant methods in ColorSpace and PDFImage
Since `ColorSpace` now depends on the native clamping of `Uint8ClampedArray`, this patch adds non-production/test-only `assert`s to enforce that the expected TypedArray is used for the output.

These `assert`s are purposely *not* included in PRODUCTION builds since that would break rendering completely, as opposed to "only" displaying some weird colours, when a `Uint8Array` was used. Furthermore, these are mostly added to help catch explicit developer errors when working with the `ColorSpace` and `PDFImage` code.
2018-06-12 11:01:32 +02:00
Jonas Jenwald
4b69bb7fe9 Add a TESTING build option, to enable using non-production/test-only code-paths
Since the tests (currently) run with the `pdf.worker.js` file built, i.e. with `PRODUCTION = true` set, there's no simple way to add e.g. `assert` calls for both non-production *and* test-only builds without also affecting PRODUCTION builds.
2018-06-12 11:01:32 +02:00
Jonas Jenwald
f01e54eae1 Improve the warning messages printed by `PartialEvaluator.{getOperatorList, getTextContent} when errors are being ignored
Currently the actual errors aren't printed, which can make debugging harder than necessary.
2018-06-12 11:01:32 +02:00
Jonas Jenwald
731f2e6dfc Remove manual clamping/rounding from ColorSpace and PDFImage, by having their methods use Uint8ClampedArrays
The built-in image decoders are already using `Uint8ClampedArray` when returning data, and this patch simply extends that to the rest of the image/colorspace code.

As far as I can tell, the only reason for using manual clamping/rounding in the first place was because TypedArrays used to be polyfilled (using regular arrays). And trying to polyfill the native clamping/rounding would probably have been had too much overhead, but given that TypedArray support is required in PDF.js version `2.0` that's no longer a concern.

*Please note:* Because of different rounding behaviour, basically `Math.round` in `Uint8ClampedArray` respectively `Math.floor` in the old code, there will be very slight movement in quite a few existing test-cases. However, the changes should be imperceivable to the naked eye, given that the absolute difference is *at most* `1` for each RGB component when comparing `master` and this patch (see also the updated expectation values in the unit-tests).
2018-06-12 11:01:32 +02:00
Jonas Jenwald
55199aa281 Remove the unused bpc parameter from, and update the signature of, the resizeRgbImage function in src/core/colorspace.js 2018-06-12 11:01:32 +02:00
Jonas Jenwald
d1637056b3 Use shorthand method signatures in src/core/colorspace.js 2018-06-12 11:01:32 +02:00
Jonas Jenwald
32367c5968 Make the getBytes/peekBytes methods of Stream/DecodeStream/ChunkedStream able to return Uint8ClampedArrays
The built-in image decoders are already returning data as `Uint8ClampedArray`, and subsequently the JPEG/JBIG2/JPX streams are as well. However, for general streams we obviously don't want to force the use of `Uint8ClampedArray` unless an "Image" is actually being decoded.
Hence this patch, which adds a parameter that allows the caller of the `getBytes`/`peekBytes` methods to force a `Uint8ClampedArray` (rather than a `Uint8Array`) to be returned.
2018-06-12 11:01:32 +02:00
Brendan Dahl
3ac638fad3
Merge pull request #9689 from RafaPolit/master
Fixed critical unhandled promise that prevented error catching using API
2018-06-11 15:40:30 -06:00
Tim van der Meij
af8e88d00b
Replace Util.extendObj by Object.assign 2018-06-10 20:11:03 +02:00
Tim van der Meij
903bad1906
Remove Util.appendToArray and Util.prependToArray
The former may be replaced by regular JavaScript array concatenation and
the latter is unused. This avoids unnecessary function calls/imports.
2018-06-10 15:24:09 +02:00
Jonas Jenwald
07d610615c Move, and modernize, Util.loadScript from src/shared/util.js to src/display/dom_utils.js
Not only is the `Util.loadScript` helper function unused on the Worker side, even trying to use it there would throw an Error (since `document` isn't defined/available in Workers).
Hence this helper function is moved, and its code modernized slightly by having it return a Promise rather than needing a callback function.

Finally, to reduced code duplication, the "new" loadScript function is exported and used in the viewer.
2018-06-07 13:52:40 +02:00
Jonas Jenwald
547f119be6 Simplify the error handling slightly in the src/display/node_stream.js file
The various classes have `this._errored` and `this._reason` properties, where the first one is a boolean indicating if an error was encountered and the second one contains the actual `Error` (or `null` initially).

In practice this means that errors are basically tracked *twice*, rather than just once. This kind of double-bookkeeping is generally a bad idea, since it's quite easy for the properties to (accidentally) get into an inconsistent state whenever the relevant code is modified.

Rather than using a separate boolean, we can just as well check the "error" property directly (since `null` is falsy).

---

Somewhat unrelated to this patch, but `src/display/node_stream.js` is currently *not* handling errors in a consistent or even correct way; compared with `src/display/network.js` and `src/display/fetch_stream.js`.
Obviously using the `createResponseStatusError` utility function, from `src/display/network_utils.js`, might not make much sense in a Node.js environment. However at the *very* least it seems that `MissingPDFException`, or `UnknownErrorException` when one cannot tell that the PDF file is "missing", should be manually thrown.

As is, the API (i.e. `getDocument`) is not returning the *expected* errors when loading fails in Node.js environments (as evident from the `pending` API unit-test).
2018-06-06 09:05:45 +02:00
Jonas Jenwald
2fdaa3d54c Update the postMessageTransfers comment in createDocumentHandler in the src/core/worker.js file
Since the old comment mentions a now unsupported browser, let's update it such that someone won't accidentally conclude that the code in question can be removed.
2018-06-06 08:52:43 +02:00
Jonas Jenwald
871bf5c68b Remove the, now obsolete, handling of the CMapReaderFactory parameter in getDocument
This special handling was added in PR 8567, but was made redundant in PR 8721 which stopped sending everything but the kitchen sink to the Worker side.
2018-06-06 08:52:43 +02:00
Jonas Jenwald
c8e2163bbc Remove incorrect/unnecessary validation of the verbosity parameter in the PDFWorker constructor (PR 9480 follow-up) 2018-06-06 08:52:43 +02:00
Jonas Jenwald
b263b702e8 Rename PDFPageProxy.pageInfo to PDFPageProxy._pageInfo to indicate that the property should be considered "private"
Since `PDFPageProxy` already provide getters for all the data returned by `GetPage` (in the Worker), there isn't any compelling reason for accessing the `pageInfo` directly on `PDFPageProxy`.

The patch also changes the `GetPage` handler, in `src/core/worker.js`, to use modern JavaScript features.
2018-06-06 08:52:42 +02:00
Jonas Jenwald
4f4b50e01e Rename PDFDocumentProxy.pdfInfo to PDFDocumentProxy._pdfInfo to indicate that the property should be considered "private"
Since `PDFDocumentProxy` already provide getters for all the data returned by `GetDoc` (in the Worker), there isn't any compelling reason for accessing the `pdfInfo` directly on `PDFDocumentProxy`.
2018-06-06 08:52:42 +02:00
Jonas Jenwald
e89afa5899 Stop sending the PDFManagerReady message from the Worker, since it's unused in the API
After PR 8617 the `PDFManagerReady` message handler function, in `src/display/api.js`, is now a no-op. Hence it seems completely unnecessary to keep sending this message from `src/core/worker.js`.
2018-06-06 08:52:42 +02:00
Jonas Jenwald
eef53347fe Ensure that the correct data is sent, with the test message, from the worker if typed arrays aren't properly supported
With native typed array support now being mandatory in PDF.js, since version 2.0, this probably isn't a huge problem even though the current code seems wrong (it was changed in PR 6571).

Note how in the `!(data instanceof Uint8Array)` case we're currently attempting to send `handler.send('test', 'main', false);` to the main-thread, which doesn't really make any sense since the signature of the method reads `send(actionName, data, transfers) {`.
Hence the data that's *actually* being sent here is `'main'`, with `false` as the transferList, which just seems weird. On the main-thread, this means that we're in this case checking `data && data.supportTypedArray`, where `data` contains the string `'main'` rather than being falsy. Since a string doesn't have a `supportTypedArray` property, that check still fails as expected but it doesn't seem great nonetheless.
2018-06-06 08:52:42 +02:00
Jonas Jenwald
dc6e1b4176 Use Uint8ClampedArray for the image data returned by JpegDecode, in src/display/api.js
Since all the built-in PDF.js image decoders now return their data as `Uint8ClampedArray`, for consistency `JpegDecode` on the main-thread should be doing the same thing; follow-up to PR 8778.
2018-06-06 08:52:41 +02:00
Jonas Jenwald
47a9d38280 Add more validation in PDFWorker.fromPort
The signature of the `PDFWorker.fromPort` method, in addition to the `PDFWorker` constructor, was changed in PR 9480.
Hence it's probably a good idea to add a bit more validation to `PDFWorker.fromPort`, to ensure that it won't fail silently for an API consumer that updates to version 2.0 of the PDF.js library.
2018-06-06 08:52:41 +02:00
Jonas Jenwald
3c5c8d2a0b Remove the typed array check when calling LoopbackPort in PDFWorker._setupFakeWorker
With version 2.0, native support for typed arrays is now a requirement for using the PDF.js library; see PR 9094 where the old polyfills were removed.
Hence the `isTypedArraysPresent` check, when setting up fake workers, no longer serves any purpose here and can thus be removed.
2018-06-06 08:52:33 +02:00
Jonas Jenwald
89caaf4071 Use LoopbackPort in the "message_handler" unit-tests
There's no good reason, as far as I can tell, to duplicate the functionality of the `LoopbackPort` in the unit-tests. The only difference between the implementations is that `LoopbackPort` mimics the (native) structured cloning, however that shouldn't matter here since the tests are only sending "simple" data (strings respectively arrays with numbers).

Furthermore the patch also changes `LoopbackPort` to default to using "structured cloning" and deferred invocation of the listeners, since native typed array support is now a requirement for using the PDF.js library.
2018-06-04 12:53:08 +02:00
Jonas Jenwald
44d8afd46b Move MessageHandler into a separate src/shared/message_handler.js file
The `MessageHandler` itself, and its assorted helper functions, are currently the single largest[1] piece of code in the `src/shared/util.js` file. By moving this code into its own file, `src/shared/util.js` thus becomes smaller and more manageable.
2018-06-04 12:53:08 +02:00
Jonas Jenwald
69f2a77543 Update the signature of the PageViewport constructor, and improve the JSDoc comments for the class
This changes the constructor to take a parameter object, rather than a string of parameters.
2018-06-04 12:53:07 +02:00
Jonas Jenwald
51673dbc5a Convert the PageViewport to a proper ES6 class
Also converts all `var` to `let` for good measure.
2018-06-04 12:53:07 +02:00
Jonas Jenwald
5917b21702 Remove completely unused fontScale property from PageViewport
The `fontScale` property was added in PR 1531, see commit b312719d7e in particular, apparently for the sole purpose of supporting the "acroforms" example.

However, the `fontScale` property was never used anywhere else in the code-base, and after the modernization of the "acroforms" example in PR 8030 it's been completely unused.

Finally, note that there's also a (more suitably named) `scale` property on `PageViewport` instances, which contains the exact same information as the property being removed here.
2018-06-04 12:53:07 +02:00
Jonas Jenwald
08c8f8733d Move PageViewport from src/shared/util.js to src/display/dom_utils.js
Since the `PageViewport` is not used in the worker, duplicating this code on both the main and worker sides seems completely unnecessary.
2018-06-04 12:53:07 +02:00
Tim van der Meij
8ce24744f2
Merge pull request #9769 from Snuffleupagus/node-unittest-rm-console-errors
Reduce the amount of errors logged, primarily in Node.js/Travis, when running the unit-tests
2018-06-03 19:50:42 +02:00
Rob Wu
0e4e79169b Fall back to ISO-8859-1 in content_disposition.js
Updates content_disposition.js to include
9b789d9b3b
2018-06-03 16:17:28 +02:00
Rob Wu
e992480baa Fix multibyte decoding in content_disposition.js
I made some mistakes when trying to make the content_disposition.js
compatible with non-modern browsers (IE/Edge).
Notably, text decoding was usually skipped because of the inverted
logical check at the top of `textdecode`.

I verified that this new version works as expected, as follows:

1. Visit 55c71eb44e/test/
   and get  test-content-disposition.js
   also get test-content-disposition.node.js if using Node.js,
     or get test-content-disposition.html if you use a browser.
2. Modify `test-content-disposition.node.js` (or the HTML file) and
   change `../extension/content-disposition.js` to `PDFJS-content_disposition.js`
3. Copy the `getFilenameFromContentDispositionHeader` function from
   `content_disposition.js` (i.e. the file without the trailing exports)
   and save it as `PDFJS-content_disposition.js`.
4. Run the tests (`node test-content-disposition.node.js` or by opening
   `test-content-disposition.html` in a browser).
5. Confirm that there are no failures: "Finished all tests (0 failures)"

The code has a best-efforts fallback for Microsoft Edge, which lacks the
TextDecoder API. The fallback only supports the common UTF-8 encoding.
To simulate this in a test, modify `PDFJS-content_disposition.js` and
deliberately throw an error before `new TextDecoder`. There will be two
failures because we don't want to include too much code to support text
decoding for non-UTF-8 encodings in Edge

```
test-content-disposition.js:265 Assertion failed: Input: attachment; filename*=ISO-8859-1''%c3%a4
Expected: "ä"
Actual  : "ä"
test-content-disposition.js:268 Assertion failed: Input: attachment; filename*=ISO-8859-1''%e2%82%ac
Expected: "€"
Actual  : "€"
```
2018-06-03 15:28:22 +02:00
Jonas Jenwald
ef081a0531 Ensure that the WorkerTransport._passwordCapability is always rejected, even when errors are thrown in PDFDocumentLoadingTask.onPassword callback
Please note that while the current code works, both in the viewer and the unit-tests, it can leave the `WorkerTransport._passwordCapability` Promise in a pending state.
In the `PasswordRequest` handler, in src/display/api.js, we're returning the Promise from a `capability` object (rather than just a "plain" Promise). While an error thrown anywhere within this handler was fortunately enough to propagate it to the Worker side, it won't cause the Promise (in `WorkerTransport._passwordCapability`) to actually be rejected.
Finally note that while we're now catching errors in the `PasswordRequest` handler, those errors are still propagated to the Worker side via the (now) rejected Promise and the existing `return this._passwordCapability.promise;` line.

This prevents warnings about uncaught Promises, with messages such as "Error: Worker was destroyed during onPassword callback", when running the unit-tests both in browsers *and* in Node.js/Travis.
2018-06-03 00:28:40 +02:00
Jonas Jenwald
0ecc22cb04 Attempt to provide better default values for the disableFontFace/nativeImageDecoderSupport API options in Node.js
This should provide a better out-of-the-box experience when using PDF.js in a Node.js environment, since it's missing native support for both `@font-face` and `Image`.
Please note that this change *only* affects the default values, hence it's still possible for an API consumer to override those values when calling `getDocument`.

Also, prevents "ReferenceError: document is not defined" errors, when running the unit-tests in Node.js/Travis.
2018-06-03 00:28:37 +02:00
Tim van der Meij
36af85db92
Merge pull request #9740 from pedrotp/replace-get-getArray
Use Dict.getArray, instead of Dict.get, when getting the 'Size' in constructSampled in src/core/function.js
2018-06-02 19:50:09 +02:00
pedrotp
a190d21dd7 Use Dict.getArray, instead of Dict.get, when getting the 'Size' in constructSampled in src/core/function.js (PR 7295 follow-up) 2018-06-02 11:16:05 -04:00
Jonas Jenwald
83ff7d9de9 Simplify the DNL (Define Number of Lines) marker warning in JpegImage.parse 2018-05-30 22:40:11 +02:00
Jonas Jenwald
620f65488b Ignore the rest of the image when encountering an EOI (End of Image) marker while parsing Scan data (issue 9679) 2018-05-30 22:40:11 +02:00
Tim van der Meij
c5d5d29b03
Merge pull request #9756 from Snuffleupagus/Type1Parser-rm-makeSubStream
Remove usage of `makeSubStream` from `Type1Parser.extractFontProgram` in src/core/type1_parser.js (issue 9735)
2018-05-28 23:34:57 +02:00
Jonas Jenwald
f68f60099e Remove usage of makeSubStream from Type1Parser.extractFontProgram in src/core/type1_parser.js (issue 9735)
This avoids the initialization of, potentially thousands of, unnecessary `Stream` objects, by getting the required number of bytes directly instead.
Given the special behaviour, when `length === 0`, of the `getBytes`/`skip` methods, it's also necessary to handle that particular case to prevent errors when encountering empty CharStrings.
2018-05-28 14:32:20 +02:00
Mukul Mishra
949c3e9417 Add abort functionality in fetch stream 2018-05-22 12:46:59 +05:30
RafaPolit
d63b17dbe3 Fixed critical unhandled promise that prevented error catching using API 2018-04-24 13:10:00 -05:00
Jani Pehkonen
fe2cf2f73f SVG clip intersections and operators 2018-04-17 19:20:29 +03:00
Brendan Dahl
2dc4af525d
Merge pull request #9659 from yurydelendik/rm-createFromIR
Remove createFromIR from PDFFunctionFactory
2018-04-12 14:22:43 -07:00
Yury Delendik
20085aaa5e Remove createFromIR from PDFFunctionFactory; forgive invalid Dict values. 2018-04-10 18:49:31 -05:00
Jani Pehkonen
8ea505545a Use FDSelect and FDArray when converting CFF CID font to paths 2018-04-10 16:44:42 +03:00
Brendan Dahl
e8cf7fd512
Merge pull request #9624 from wojtekmaj/no-warning-on-dependency-operator
Prevent warning on unimplemented operator thrown for OPS.dependency
2018-04-03 10:55:29 -07:00
Wojciech Maj
acc0a0fe95 Prevent warning on unimplemented operator thrown for OPS.dependency 2018-04-02 14:29:34 +02:00
Wojciech Maj
ea2850e9a7 Fix typos 2018-04-01 23:20:41 +02:00
Tim van der Meij
8887a09e8f
Merge pull request #9588 from swftvsn/patch-1
Improve node.js support
2018-04-01 12:26:39 +02:00
Jonas Jenwald
8b09f7c34e Clean-up getMainThreadWorkerMessageHandler for non-PRODUCTION mode
*This is a final piece of clean-up of code that I recently wrote, after which I'm done :-)*

When the `getMainThreadWorkerMessageHandler` function was added, in PR 9385, it did so by basically introducing a `web/app.js` dependency in `src/display/api.js` through the `window.pdfjsNonProductionPdfWorker` property[1]. Even though this is limited to non-`PRODUCTION` mode, i.e. `gulp server`, it still seems unfortunate to have that sort of viewer dependency in the API code itself.

With the new, much nicer and shorter, names introduced in PR 9565 we can remove this non-`PRODUCTION` hack and just use `window.pdfjsWorker` in both the viewer and the API regardless of the build mode.

---

[1] It didn't seem correct to piggy-back on the `window.pdfjsDistBuildPdfWorker` property in non-`PRODUCTION` mode.
2018-03-29 11:03:47 +02:00
Tim van der Meij
5c1a16ba6e
Merge pull request #9586 from Snuffleupagus/pageSize-api-rotate
Ensure that `PDFPageProxy.pageSizeInches` handles non-default /Rotate entries correctly
2018-03-25 18:03:32 +02:00
Jonas Jenwald
d547936827 Ensure that PDFPageProxy.pageSizeInches handles non-default /Rotate entries correctly
Without this patch, the pageSize will be incorrectly reported for some PDF files.

---

Move pageSizeInches to ui_utils
2018-03-25 16:48:29 +02:00
Tim van der Meij
115fbc47fe
Merge pull request #9594 from Snuffleupagus/pageLabel-validation
Add stricter validation in `Catalog.readPageLabels`
2018-03-24 19:40:49 +01:00
Brendan Dahl
24f766b14d
Merge pull request #9573 from yurydelendik/xml_parser
New XML parser
2018-03-21 17:00:00 -07:00
Jonas Jenwald
374d074f6e Add stricter validation in Catalog.readPageLabels
The current PageLabel dictionary validation code won't catch some (unlikely) forms of corruption. For example: a `Type`/`S` entry being `null`/`0`/empty string, a `P`/`St` entry being `null`/`0`.

Please note: I'm not aware of any bugs caused by the old code, but I've had this patch sitting locally for some time and figured it couldn't hurt to submit it.
2018-03-21 14:36:05 +01:00
swftvsn
c20426efef Improve node.js support
This change fixes "Unhandled rejection ReferenceError: HTMLElement is not defined" issue that is discussed in more detail in #8489.
2018-03-21 13:43:53 +02:00
Brendan Dahl
63c7aee112
Merge pull request #9565 from brendandahl/new-name
Rename the globals to shorter names.
2018-03-20 13:49:04 -07:00
Yury Delendik
655c8d34d0 New XML parser 2018-03-19 20:51:41 -05:00
Jonas Jenwald
e0ae157582 [api-minor] Fix various issues related to the pageSize information
The `getPageSizeInches` method was implemented on `PDFDocumentProxy`, which seems conceptually wrong since the size property isn't global to the document but rather specific to each page. Hence the method is moved into `PDFPageProxy`, as `get pageSizeInches` instead to address this.

Despite the fact that new API functionality was implemented, no unit-tests were added. To prevent issues later on, we should *always* ensure that new functionality has at least some test-coverage; something that this patch also takes care of.

The new `PDFDocumentProperties._parsePageSize` method seemed unnecessary convoluted. Furthermore, in the "no data provided"-case it even returned incorrect data (an array, rather than the expected object).
Finally, the fallback strings didn't actually agree with the `en-US` locale. This inconsistency doesn't look too great, and it's thus addressed here as well.
2018-03-18 09:10:19 +01:00
Brendan Dahl
01bff1a81d Rename the globals to shorter names.
pdfjsDistBuildPdf=pdfjsLib
pdfjsDistWebPdfViewer=pdfjsViewer
pdfjsDistBuildPdfWorker=pdfjsWorker
2018-03-16 11:08:56 -07:00
Jonas Jenwald
d431ae069d Attempt to handle corrupt PDF documents that inline Page dictionaries in a Kids array (issue 9540)
According to the specification, see https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.1942297, the contents of a Kids array should be indirect objects.
2018-03-12 14:13:23 +01:00
Tim van der Meij
6662985a20
Merge pull request #9541 from Rob--W/crx-fetch-chrome61plus
[CRX] Disable fetch in Chrome 60-
2018-03-10 00:15:47 +01:00
Tim van der Meij
fded22c618
Merge pull request #9509 from timvandermeij/document
Implement a single `getInheritableProperty` utility function
2018-03-08 22:40:27 +01:00
Yury Delendik
e0fb18a339
Merge pull request #9508 from pal03377/file-info-page-size
Add paper size to document information/properties
2018-03-08 11:57:38 -06:00
Rob Wu
db428004f4 [CRX] Disable fetch in Chrome 60-
Chrome 60 and earlier does not include credentials (cookies) in requests
made with fetch, regardless of extension permissions. This was fixed in
61.0.3138.0 by
2e231cf052

This patch disables the fetch backend in all affected Chrome versions.
The browser detection is done by checking for a change that coincides
with the release of Chrome 61.

Test case:
1. Copy the `isChromeWithFetchCredentials` function from the patch.
2. Run it in the JS console of Chrome and verify the return value.

Verified results:
- 49.0.2623.75 - false (earliest supported version by us)
- 60.0.3112.90 - false (last major version affected by bug)
- 61.0.3163.100 - true (first major version without bug)
- 65.0.3325.146 - true (current stable)

Test case 2:
1. Build the extension (`gulp chromium`) and load it in Chrome.
2. Open the developer tools, and open any PDF file.
3. In the "Network tab" of the developer tools, look at "request type".
   In Chrome 60-: Should be "xhr"
   In Chrome 61+: Should be "fetch"
2018-03-08 18:27:30 +01:00
palsch
8558c5b1d9 Add page size to the document properties dialog 2018-03-08 18:23:47 +01:00
Tim van der Meij
f308d73d40
Implement a single getInheritableProperty utility function
This function combines the logic of two separate methods into one.
The loop limit is also a good thing to have for the calls in
`src/core/annotation.js`.

Moreover, since this is important functionality, a set of unit tests and
documentation is added.
2018-03-03 19:19:39 +01:00
Tim van der Meij
4e5eb59a33
Remove the getPageProp method in src/core/document.js
It's only used in two places in the class and those callsites can
directly get the information from the dictionary, which is more readable
and avoids an additional method call.
2018-03-03 14:57:42 +01:00
Jonas Jenwald
b8606abbc1 [api-major] Completely remove the global PDFJS object 2018-03-01 18:13:27 +01:00
Jonas Jenwald
4b4fcecf70 Ensure that we only pass in the necessary parameters when initializing PDFDataTransportStream/PDFNetworkStream in src/display/api.js
With options being moved from the global `PDFJS` object and into `getDocument`, a side-effect is that we're now passing in a fair number of useless parameters to the various transport/network streams.
Even though this doesn't *currently* cause any problems, it nonetheless seem like a good idea to explicitly provide the parameters that are actually necessary.
2018-03-01 18:11:17 +01:00
Jonas Jenwald
212553840f Move the pdfBug option from the global PDFJS object and into getDocument instead
Also removes the now unused `getDefaultSetting` helper function.
2018-03-01 18:11:17 +01:00
Jonas Jenwald
1d03ad0060 Move the disableCreateObjectURL option from the global PDFJS object and into getDocument instead 2018-03-01 18:11:17 +01:00
Jonas Jenwald
05c05bdef5 Move the disableStream option from the global PDFJS object and into getDocument instead 2018-03-01 18:11:16 +01:00
Jonas Jenwald
b69abf1111 Move the disableRange option from the global PDFJS object and into getDocument instead 2018-03-01 18:11:16 +01:00
Jonas Jenwald
69d7191034 Move the disableAutoFetch option from the global PDFJS object and into getDocument instead
One additional complication with removing this option from the global `PDFJS` object, is that the viewer currently needs to check `disableAutoFetch` in a couple of places. To address this I'm thus proposing adding a getter in `PDFDocumentProxy`, to allow checking the *actually* used values for a particular `getDocument` invocation.
2018-03-01 18:11:16 +01:00
Jonas Jenwald
c7c583583b Move the disableFontFace option from the global PDFJS object and into getDocument instead 2018-03-01 18:11:16 +01:00
Jonas Jenwald
f3900c4e57 Move the isEvalSupported option from the global PDFJS object and into getDocument instead 2018-03-01 18:11:16 +01:00
Jonas Jenwald
3c2fbdffe6 Move the cMapUrl and cMapPacked options from the global PDFJS object and into getDocument instead 2018-03-01 18:11:16 +01:00
Jonas Jenwald
b674409397 Move the maxImageSize option from the global PDFJS object and into getDocument instead 2018-03-01 18:11:16 +01:00
Jonas Jenwald
81c550903f Move various viewer components options from PDFJS/PDFViewerApplication.viewerPrefs and into AppOptions instead 2018-03-01 18:11:16 +01:00
Jonas Jenwald
5894bfa449 Move API specific compatibility options from src/shared/compatibility.js and into a separate file
Unfortunately, as far as I can tell, we still need the ability to adjust certain API options depending on the browser environment in PDF.js version `2.0`. However, we should be able to separate this from the general compatibility code in the `src/shared/compatibility.js` file.
2018-03-01 18:11:16 +01:00
Rob Wu
401f3a9d97
Merge pull request #9520 from Snuffleupagus/fix-includes-polyfills
Attempt to fix the `Array.prototype.includes` and `String.prototype.includes` polyfills (issue 9514, 9516)
2018-03-01 18:05:14 +01:00
Jonas Jenwald
cd12a177a9 Attempt to fix the Array.prototype.includes and String.prototype.includes polyfills (issue 9514, 9516)
I don't understand why the previous way importing the polyfills didn't work, and I don't have time to try and figure it out, however this patch seems to fix things.

Fixes 9514.
Fixes 9516.
2018-03-01 17:38:14 +01:00
Rob Wu
f893bcd41b
Merge pull request #9494 from pardeepshokeen/windows-drive-letter
Windows drive letter
2018-03-01 16:39:44 +01:00
pardeepshokeen
7ebbdd2579 helper function to parse url 2018-02-25 01:25:32 +05:30
Jonas Jenwald
a97901efb6 Move the verbosity option from the global PDFJS object and into getDocument/PDFWorker instead
Given the purpose of this option, it doesn't seem necessary to make it available through `GlobalWorkerOptions`.
2018-02-16 13:22:35 +01:00
Jonas Jenwald
fdd2376170 Move the postMessageTransfers option from the global PDFJS object and into getDocument/PDFWorker instead
Given the purpose of this option, it doesn't seem necessary to make it available through `GlobalWorkerOptions`.
2018-02-16 13:22:35 +01:00
Jonas Jenwald
83d52518da [api-major] Refactor PDFWorker to be initialized with a parameter object, rather than a bunch of regular parameters 2018-02-16 13:22:35 +01:00
Jonas Jenwald
c3c1fc511d Move the workerSrc option from the global PDFJS object and into GlobalWorkerOptions instead 2018-02-16 13:22:35 +01:00
Jonas Jenwald
45adf33187 Move the workerPort from the global PDFJS object and into GlobalWorkerOptions instead 2018-02-16 13:22:35 +01:00
Jonas Jenwald
003bd4044b Introduce a GlobalWorkerOptions object for (basic) Worker configuration
Compared to most other options currently/previously residing on the global `PDFJS` object, some of the Worker specific ones (e.g. `workerPort`/`workerSrc`) probably cannot be moved into options provided directly when initializing e.g. `PDFWorker`.
The reason is that in some cases, e.g. the Webpack examples, we try to provide Worker auto-configuration and I cannot see a good solution for that use-case if we completely remove the globally available Worker configuration.

However inline with previous patches for PDF.js version `2.0`, it does seem like a worthwhile goal to move away from storing options directly on the global `PDFJS` object, since that is a pattern we should avoid going forward. Especially since one of the (eventual) goals is to attempt to *completely* remove the global `PDFJS` object, and rely solely on exporting/importing the needed functionality.

By introducing the `GlobalWorkerOptions` we thus have larger flexibility in the future, if/when the global `PDFJS` object will be removed.
2018-02-16 13:22:35 +01:00
Rob Wu
a89071bdef
Merge pull request #9470 from Snuffleupagus/issue-4888
Ensure that `JpegImage.getData` returns the correct data length when `forceRGBoutput == true` (issue 4888)
2018-02-16 13:14:21 +01:00
Tim van der Meij
538dda1096
Merge pull request #9479 from Snuffleupagus/refactor-viewer-options
[api-major] Refactor viewer components initialization to reduce their dependency on the global `PDFJS` object
2018-02-14 22:47:33 +01:00
Jonas Jenwald
11ab3b5c00 Ensure that JpegImage.getData returns the correct data length when forceRGBoutput == true (issue 4888)
With PDF.js version `2.0` we'll only support browsers with built-in `TypedArray` functionality, hence there doesn't seem to be any good reason not to implement this now.

Fixes 4888.
2018-02-13 20:44:21 +01:00
Jonas Jenwald
e95c11a7f0 Remove the undocumented PDFJS.enableStats option
In order to simplify things, the undocumented `enableStats` option was removed and `pdfBug` is now instead used to enabled general debugging *and* page request/rendering stats.
Considering that in the default viewer the `stats` was only used when debugging was also enabled, this simplification (code wise) definitely seem worthwhile to me.
2018-02-13 16:56:57 +01:00
Jonas Jenwald
6bc3e1fb93 Remove version/build from the global PDFJS object
There's no need to expose these properties multiple times, since they're already exported by `src/display/api.js`.
2018-02-13 16:56:57 +01:00
Jonas Jenwald
77efed6626 Replace the PDFJS.disableWebGL option with a enableWebGL option passed, via BaseViewer/PDFPageView, to PDFPageProxy.render
Please note that the, pre-existing, viewer preference is already named `enableWebGL`; fixes 4919.
2018-02-13 16:56:56 +01:00
Jonas Jenwald
3a6f6d23d6 Move the externalLinkTarget and externalLinkRel options to PDFLinkService options
This removes the `PDFJS.externalLinkTarget`/`PDFJS.externalLinkRel` dependency from the viewer components, but please note that as a *temporary* solution the default viewer still uses it.
2018-02-13 14:28:40 +01:00
Jonas Jenwald
c45c394364 Move the imageResourcesPath option to a BaseViewer/PDFPageView/AnnotationLayerBuilder option
This removes the `PDFJS.imageResourcesPath` dependency from the viewer components and the test-suite, but please note that as a *temporary* solution the default viewer still uses it.
2018-02-13 14:28:38 +01:00
Jonas Jenwald
9e0a31f662 Move viewer specific compatibility options from src/shared/compatibility.js and into a separate file
Unfortunately, as far as I can tell, we still need the ability to adjust certain viewer options depending on the browser environment in PDF.js version `2.0`. However, we should be able to separate this from the general compatibility code in the `src/shared/compatibility.js` file.
2018-02-13 13:41:59 +01:00
Jonas Jenwald
f05e5c5460 Take the dictionary, and not just the image data, into account when caching inline images (issue 9398)
The reason for the bug is that we're only computing a checksum of the image data itself, but completely ignore the inline dictionary. The latter is important, since in practice it's not uncommon for inline images to be identical but use e.g. different ColourSpaces.

There's obviously a couple of different ways that we could compute a hash/checksum of the dictionary.
Initially I tried using `MurmurHash3_64` to compute a hash of the keys/values in the dictionary. Unfortunately this approach turned out to be *way* too slow in practice, especially for PDF files with a huge number of inline images; in particular issue 2618 would regresses quite badly with this solution.

The solution that is instead implemented in this patch, is to compute a checksum of the dictionary contents. While this is a much simpler, not to mention a lot more efficient, solution there's one drawback associated with it:
If the contents of inline image dictionaries are ordered differently, they will not be considered equal with this approach which could thus lead to failures to cache repeated inline images. In practice this doesn't seem to be a problem in any of the PDF files I've tested, and generally I'd rather err on the side of *not* caching given that too aggressive caching can easily lead to rendering bugs.

One small, but somewhat annoying, complication is that by the time `Parser.makeInlineImage` is called, we no longer know the *exact* stream position where the inline image dictionary starts. Having access to that information is crucial here, and the easiest solution I could come up with is to track this in the current `Lexer` instance.[1]

With the patch, we're thus able to fix the referenced issues without incurring large regressions in problematic cases such as issue 2618.

Fixes 9398; also improves/fixes the `issue8823` reference test.

---

[1] Obviously I'd have preferred if this patch could be limited to `Parser.makeInlineImage`, without the need for this "hack", but I'm not sure what that'd look like here.
2018-02-12 16:43:47 +01:00
Tim van der Meij
b3557483cb
Merge pull request #9465 from Snuffleupagus/streams-unify-disableRange-contentLength
Attempt to unify the `disableRange`/`contentLength` handling in the various network streams
2018-02-11 16:21:34 +01:00
Jonas Jenwald
1cf116ab88 Enable the mozilla/use-includes-instead-of-indexOf ESLint rule globally
This rule is available from https://www.npmjs.com/package/eslint-plugin-mozilla, and is enforced in mozilla-central. Note that we have the necessary `Array`/`String` polyfills and that most cases have already been fixed, see PRs 9032 and 9434.
2018-02-10 23:24:50 +01:00
Jonas Jenwald
2eb29409bc Enable the mozilla/avoid-removeChild ESLint rule globally
This rule is available from https://www.npmjs.com/package/eslint-plugin-mozilla, and is enforced in mozilla-central. Note that we have a polyfill for `ChildNode.remove()` and that most cases have already been fixed, see PRs 8056 and 8138.
2018-02-10 23:24:50 +01:00
Tim van der Meij
7bb066494f
Merge pull request #9427 from Snuffleupagus/native-JPEG-decoding-fallback
Fallback to the built-in JPEG decoder when browser decoding fails, and attempt to handle JPEG images with DNL (Define Number of Lines) markers (issue 8614)
2018-02-09 21:36:08 +01:00
Jonas Jenwald
ad06979cca Attempt to unify the disableRange/contentLength handling in the various network streams
First of all, note how in both `fetch_stream.js` and `node_stream.js` we always overwrite the `this._contentLength` property even when the response headers doesn't actually contain any (valid) length information. This could thus result in the `length` parameter, as passed to the network stream, being completely ignored despite having no better information available.
Secondly, in `node_stream.js` the `this._isRangeSupported` property wasn't always updated correctly based on the response headers.
2018-02-09 13:50:48 +01:00
Jonas Jenwald
25293628ff
Merge pull request #9459 from tonyjin/respect-worker-src
Respect workerSrc if set
2018-02-08 13:57:43 +01:00
Jonas Jenwald
a18c65ae9f Use the correct stream position when reading maxSizeOfInstructions from the maxp table (issue 9458)
Please refer to the `maxp` table specification, found at https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6maxp.html.

Fixes 9458.
2018-02-07 21:57:43 +01:00
Tony Jin
3c33f32dff Respect workerSrc if set
Respect user-defined workerSrc over internal overrides.
2018-02-07 11:31:18 -08:00
Jonas Jenwald
bf4166e6c9 Attempt to handle DNL (Define Number of Lines) markers when parsing JPEG images (issue 8614)
Please refer to the specification, found at https://www.w3.org/Graphics/JPEG/itu-t81.pdf#page=49

Given how the JPEG decoder is currently implemented, we need to know the value of the scanLines parameter (among others) *before* parsing of the SOS (Start of Scan) data begins.
Hence the best solution I could come up with here, is to re-parse the image in the *hopefully* rare case of JPEG images that include a DNL (Define Number of Lines) marker.

Fixes 8614.
2018-02-05 21:05:32 +01:00
Jonas Jenwald
80441346a3 Fallback to the built-in JPEG decoder if 'JpegStream', in src/display/api.js, fails to load the image
This works by making `PartialEvaluator.buildPaintImageXObject` wait for the success/failure of `loadJpegStream` on the API side *before* parsing continues.

Please note that in practice, it should be quite rare for the browser to fail loading/decoding of a JPEG image. In the general case, it should thus not be completely surprising if even `src/core/jpg.js` will fail to decode the image.
2018-02-05 21:05:31 +01:00
Jonas Jenwald
76afe1018b Fallback to built-in image decoding if the NativeImageDecoder fails
In particular this means that if 'JpegDecode', in `src/display/api.js`, fails we'll fallback to the built-in JPEG decoder.
2018-02-05 17:01:35 +01:00
Jonas Jenwald
2570717e77 Inline the code in loadJpegStream at the only call-site in src/display/api.js.js`
Since `loadJpegStream` is only used at a *single* spot in the code-base, and given that it's very heavily tailored to the calling code (since it relies on the data structure of `PDFObjects`), this patch simply inlines the code in `src/display/api.js` instead.
2018-02-05 17:01:35 +01:00
Jonas Jenwald
7f73fc9ace Re-factor PartialEvaluator.buildPaintImageXObject to make it asynchronous
This is necessary for upcoming changes, which will add fallback code-paths to allow graceful handling of native image decoding failures.
2018-02-05 17:01:35 +01:00
Jonas Jenwald
ec85d5c625 Change the signature of PartialEvaluator.buildPaintImageXObject to take a parameter object
This method currently requires a fair number of parameters, which creates quite	unwieldy call-sites. When invoking `buildPaintImageXObject`, you have to remember not only which arguments to supply, but also the correct order, to prevent run-time errors.
2018-02-05 17:01:35 +01:00
Rob Wu
2f19d9d906 Support spaces and semicolons in filename
Imports the following changes:
5b1afa7c29
7e2e35a38b
2018-02-04 16:19:40 +01:00
Jonas Jenwald
712090eff8 Upstream the changes from: Bug 1339461 - Convert foo.indexOf(...) == -1 to foo.includes() and implement an eslint rule to enforce this
Yet another case where PDF.js code was modified in `mozilla-central` without the changes happening in the GitHub repo first; *sigh*.
If we don't upstream at least the changes in `extensions/firefox/`, any future update of PDF.js in `mozilla-central` will be blocked.

Please see:
 - https://bugzilla.mozilla.org/show_bug.cgi?id=1339461
 - https://hg.mozilla.org/mozilla-central/rev/d5a5ad1dbbf2
2018-02-04 14:59:27 +01:00
Jonas Jenwald
9ac9ef8ef1 Polyfill String.prototype.includes using core-js
See https://github.com/zloirock/core-js#ecmascript-6-string.
2018-02-04 14:31:59 +01:00
Tim van der Meij
73436c0d12
Implement the AESBaseCipher class and let the AES128Cipher and AES256Cipher classes extend it 2018-02-03 20:16:33 +01:00
Tim van der Meij
9a959e4df7
Update the AES128Cipher and AES256Cipher implementations to be more similar
This commit is the first step for extracting a base class for the
`AES128Cipher` and the `AES256Cipher` classes. The objective here is to
make code changes (not altering the logic) to make the implementations
as similar as possible as found by creating a diff of both classes.

In particular, we extract the key size and cycles of repetitions
constants since they are different for AES-128 and AES-256. Moreover, we
rename functions to be similar.

In the `AES256Cipher` class, there was an additional assignment to
`this` in the decryption function. However, this was unnecessary because
the assignment would also be done when the loop was exited.
2018-02-03 20:16:29 +01:00
Jonas Jenwald
f4a95de694 Attempt to find the next valid marker when encountering invalid image data in JpegImage.parse (issue 9425)
In the JPEG images in the referenced PDF file, the DHT (Define Huffman Tables) segments contain more data than expected based on the length parameter.

Fixes 9425.
2018-02-03 16:01:19 +01:00
Jonas Jenwald
39c5e1ed1a Remove the (unnecessary) WorkerMessageHandler variable from the setupFakeWorkerGlobal() function in the src/display/api.js file 2018-01-31 12:52:10 +01:00
Jonas Jenwald
56a8c934dd [api-major] Remove the PDFJS.disableWorker option
Despite this patch removing the `disableWorker` option itself, please note that we'll still fallback to loading the worker file(s) on the main-thread when running in environments without proper Web Worker support.

Furthermore it's still possible, even with this patch, to force the use of fake workers by manually loading the necessary file using a `<script>` tag on the main-thread.[1]
That way, the functionality of the now removed `SINGLE_FILE` build target and the resulting `build/pdf.combined.js` file can still be achieved simply by adding e.g. `<script src="build/pdf.worker.js"></script>` to the HTML (obviously with the path adjusted as needed).

Finally note that the `disableWorker` option is a performance footgun, and unfortunately many existing third-party examples actually use it without providing any sort of warning/justification.

---

[1] This approach is used in the default viewer, since certain kind of debugging may be easier if the code is running directly on the main-thread.
2018-01-31 12:52:10 +01:00
Jonas Jenwald
a5aaf62754 [api-minor] Add a (static) PDFWorker.getWorkerSrc method that returns the current workerSrc
This method returns the currently used `workerSrc`, which thus allows obtaining the fallback `workerSrc` value (e.g. when the option wasn't set by the user).
2018-01-31 12:52:07 +01:00
Jonas Jenwald
c56f3f04dd [api-major] Remove the SINGLE_FILE build target
Please note that this build target, and the resulting `build/pdf.combined.js` file, is equivalent to setting the `PDFJS.disableWorker` option to `true` which is a performance footgun.
2018-01-29 14:44:44 +01:00
Rob Wu
5d1c541702 Enable some polyfills for compat with Chrome 49
Successfully tested with Chrome 49.
2018-01-26 12:31:41 +01:00
Rob Wu
44025a3ec1 Explicitly state intended support in compatibility.js
Add comments with supported browser versions where missing.

Method:
- Use MDN compat tables if available.
- Otherwise test in Chrome (31+) otherwise.
  (the Chrome Web Store does not update older versions of
   Chrome, so probably nobody is interested in even older
   versions, even though there is an existing comment for
   Chrome<29 at `document.currentScript`).
2018-01-26 12:31:41 +01:00
Jonas Jenwald
d33c763dd5 Re-factor resetting of StatTimer instances to fix completely broken benchmarking (PR 9245 follow-up)
It turns out that PR 9245 unfortunately broke benchmarking completely, sorry about that!

The bug is that we were attempting to reset the current instance of `StatTimer`, instead of creating a new one as was previously done. By resetting the current instance, the `StatTimer` data fetched in `test/driver.js` is now wiped out since it points to the *same* underlying object.
This re-use of a `StatTimer` instance was asked for during review, and unfortunately I didn't test this thoroughly enough before submitting the final version of the PR.[1]

---

[1] Note that while I did test the benchmarking scripts with that PR *before* initially submitting it, I did however forget to do that after addressing the review comments which might explain why this problem went unnoticed.
2018-01-25 19:46:03 +01:00
Jani Pehkonen
5593c970e0 Implement Huffman coding in JBIG2 2018-01-23 17:04:07 +02:00
Jonas Jenwald
f0216484bc
Merge pull request #9383 from Rob--W/better-content-disposition-parser
Better content disposition parser
2018-01-21 15:08:14 +01:00
Tim van der Meij
9746646511
Merge pull request #9386 from shikhar-scs/remove-parsejbig2-function
removed parseJbig2 function
2018-01-21 14:51:59 +01:00
Rob Wu
a4e907169e Improve correctness of Content-Disposition parser
Re-uses logic from 9f5fcae11c/extension/content-disposition.js
which is already covered by tests: 6f3bbb8bbf
2018-01-21 13:31:12 +01:00
Jonas Jenwald
fe5102a27f
Merge pull request #9363 from Rob--W/fetch-http/s-only
Limit PDFFetchStream to http(s) in the Chrome extension
2018-01-21 11:45:09 +01:00
Shikhar Agnihotri
43e003cf5c
removed parseJbig2 function 2018-01-20 19:49:06 +05:30
Jonas Jenwald
69a8336cf1 Address the final round of review comments for Content-Disposition filename extraction
This patch updates the `IPDFStreamReader` interface and ensures that the interface/implementation of `network.js`, `fetch_stream.js`, `node_stream.js`, and `transport_stream.js` all match properly.
The unit-tests are also adjusted, to more closely replicate the actual behaviour of the various actual `IPDFStreamReader` implementations.
Finally, this patch adjusts the use of the Content-Disposition filename when setting the title in the viewer, and adds `PDFDocumentProperties` support as well.
2018-01-18 17:39:22 +01:00
Juan Salvador Perez Garcia
eb1f6f4c24 Content disposition filename
File name is extracted from headers.
2018-01-18 17:38:44 +01:00
shikhar-scs
32080f1081 changed decodeURI to decodeURIComponent 2018-01-15 19:31:25 +05:30
Tim van der Meij
237bc2ef9d
Merge pull request #9323 from juncaixinchi/master
Get correct path in node_stream on windows platform( issue #9020)
2018-01-14 15:41:56 +01:00
Rob Wu
1c8cacd6b9 Limit PDFFetchStream to http(s) in the Chrome extension
The `fetch` API is only supported for http(s), even in Chrome extensions.
Because of this limitation, we should use the XMLHttpRequest API when the
requested URL is not a http(s) URL.

Fixes #9361
2018-01-14 00:34:46 +01:00
juncaixinchi
8e278d9a45 Get correct path in node_stream on windows platform 2018-01-13 21:13:24 +08:00
Jonas Jenwald
0e1b5589e7 Restore the btoa/atob polyfills for Node.js
These were removed in PR 9170, since they were unused in the browsers that we'll support in PDF.js version `2.0`.
However looking at the output of Travis, where a subset of the unit-tests are run using Node.js, there's warnings about `btoa` being undefined. This doesn't appear to cause any errors, which probably explains why we didn't notice this before (despite PR 9201).
2018-01-13 01:31:05 +01:00
Brendan Dahl
d77fc8882a
Merge pull request #9352 from Snuffleupagus/issue-9285
Attempt to actually resolve ColourSpace names in accordance with the specification (issue 9285)
2018-01-12 13:01:22 -08:00
Jonas Jenwald
d0c8992e8a Attempt to actually resolve ColourSpace names in accordance with the specification (issue 9285)
Please refer to the PDF specification, in particular http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G7.3801570

> A colour space shall be specified in one of two ways:
>  - Within a content stream, the CS or cs operator establishes the current colour space parameter in the graphics state. The operand shall always be name object, which either identifies one of the colour spaces that need no additional parameters (DeviceGray, DeviceRGB, DeviceCMYK, or some cases of Pattern) or shall be used as a key in the ColorSpace subdictionary of the current resource dictionary (see 7.8.3, "Resource Dictionaries"). In the latter case, the value of the dictionary entry in turn shall be a colour space array or name. A colour space array shall never be inline within a content stream.
>
> - Outside a content stream, certain objects, such as image XObjects, shall specify a colour space as an explicit parameter, often associated with the key ColorSpace. In this case, the colour space array or name shall always be defined directly as a PDF object, not by an entry in the ColorSpace resource subdictionary. This convention also applies when colour spaces are defined in terms of other colour spaces.
2018-01-10 20:20:43 +01:00
Jonas Jenwald
5a52ee0a79
Merge pull request #9350 from janpe2/svg-closeEOFillStroke
Implement `closeEOFillStroke` in SVG backend
2018-01-09 21:09:11 +01:00
Brendan Dahl
3925aab010
Merge pull request #9282 from Snuffleupagus/TrueType-Collection
Add support for TrueType Collection fonts (issue 9262)
2018-01-09 11:22:52 -08:00
Jani Pehkonen
d1e1dbfc14 Implement closeEOFillStroke in SVG backend 2018-01-09 19:42:12 +02:00
Jonas Jenwald
915e3f4c5f
Merge pull request #9099 from tiriana/allow-dontFlip-in-PDFPageProxy-getViewport
Allows 'dontFlip' as third arg in PDFPageProxy.getViewport
2018-01-09 18:27:26 +01:00
Radomir Wojtera
3dfc540d04 Allows 'dontFlip' as third argument in PDFPageProxy.getViewport 2018-01-09 13:08:24 +01:00
Jonas Jenwald
d6c028b946 Add support for TrueType Collection fonts (issue 9262)
The specification can be found at https://www.microsoft.com/typography/otspec/otff.htm, under the "Font Collections" heading.

Fixes 9262.
2018-01-08 22:31:08 +01:00
Tim van der Meij
6b2ed504b7
Merge pull request #9336 from Snuffleupagus/jpx-SIZ
Correctly extract component data from "Image and tile size" (SIZ) markers in JPEG 2000 images
2018-01-03 23:34:34 +01:00
Jonas Jenwald
873556865b Correctly extract component data from "Image and tile size" (SIZ) markers in JPEG 2000 images
This is something that I noticed while attempting to debug https://bugzilla.mozilla.org/show_bug.cgi?id=1374945.
Just looking at the code, the `YRsiz` parameter seemed immediately wrong and the fact that every component used the *same* data also looked strange.
Comparing with the specification, see https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-T.800-200208-S!!PDF-E&type=items#page=37, confirmed that this is indeed incorrect.

Note that I haven't got any example of a PDF file that is fixed by this patch, but that might be more luck than anything else. Manually checking a couple of files with included JPEG 2000 images, the `Csiz`/`XRsiz`/`YRsiz` parameters were `1` which could explain why this hasn't been an issue before.

Obviously we shouldn't generally make changes to `core` code without adding tests, but in this case I'm simply not sure how to obtain/create one. However, since the existing code doesn't make sense this patch could hopefully be deemed acceptable anyway.
2018-01-03 16:26:28 +01:00
Jonas Jenwald
2db75a2a3a Update the ESLint dependencies, and also tweak the no-multiple-empty-lines rules
Since multiple empty lines is virtually unused in the code-base, and the few cases that do exist look like "typos", let's enforce greater consistency here; please see https://eslint.org/docs/rules/no-multiple-empty-lines.
2018-01-03 13:32:57 +01:00
Tim van der Meij
d36c46b2c9
Remove the CustomStyle class
It is only used in a few places to handle prefixing style properties if
necessary. However, we used it only for `transform`, `transformOrigin`
and `borderRadius`, which according to Can I Use are supported natively
(unprefixed) in the browsers that PDF.js 2.0 supports. Therefore, we can
remove this class, which should help performance too since this avoids
extra function calls in parts of the code that are called often.
2017-12-31 14:22:11 +01:00
Jonas Jenwald
c5700211d6 Adjust decodeACSuccessive in src/core/jpg.js to improve the rendering quality of (progressive) JPEG images
I've been looking into the remaining point in 8637 about blurry images, to see if we could perhaps improve the rendering quality slightly there. After quite a bit of debugging, it seems that the issue is limited to certain progressive JPEG images.

As mentioned previously, I've got no detailed knowledge of the JPEG format, but this patch does seem to improve things quite a bit for the images in question.
Squinting at https://searchfox.org/mozilla-central/rev/6c33dde6ca02b389c52e8db3d22494df8b916f33/media/libjpeg/jdphuff.c#492-639, it seems reasonable that we should take the sign of the data into account. Furthermore, looking at the specification in https://www.w3.org/Graphics/JPEG/itu-t81.pdf#page=118, the "F.2.4.3 Decoding the binary decision sequence for non-zero DC differences and AC coefficients" section even contains a description of this (even though I cannot claim to really understand the details).
2017-12-30 15:24:09 +01:00
Jonas Jenwald
d6eed132e5 Correct the indentation in the switch statement in decodeACSuccessive in src/core/jpg.js 2017-12-30 15:22:30 +01:00
Tim van der Meij
18d82d9c54
Merge pull request #9287 from himanish-star/PDFjs-compatible-Librejs
PDFjs now compatible with Librejs
2017-12-30 14:00:25 +01:00
Jonas Jenwald
8c4b7d0439 Avoid truncating JPEG images with DeviceGray ColourSpaces when using the src/core/jpg.js built-in decoder
The bug that this patch fixes is limited to the built-in JPEG decoder, and was unearthed by PR 9260. The underlying issue has existed since PR 6984, where the contents of this patch ought to have been included (if it weren't for the fact that we had no *easy* way to test `src/core/jpg.js` back then).

*Please note:* The slight movement in the reference test is a result of using the `src/core/jpg.js` decoder, rather than the native browser one.
2017-12-29 18:44:07 +01:00
Tim van der Meij
25bbff4692
Merge pull request #9320 from Snuffleupagus/pr-9095-followup
Avoid rendering errors by passing in the `webGLContext` when creating a new `CanvasGraphics` in `getColorN_Pattern` (PR 9095 follow-up)
2017-12-28 23:17:30 +01:00
Jonas Jenwald
ec21bd9626
Merge pull request #9314 from timvandermeij/encodings
Implement unit tests for the encodings and fix missing items
2017-12-27 22:02:38 +01:00
Jonas Jenwald
06605abbc2 Avoid rendering errors by passing in the webGLContext when creating a new CanvasGraphics in getColorN_Pattern (PR 9095 follow-up)
This was an oversight in PR 9095, which unfortunately breaks rendering in some PDF files (e.g. the one from issue 6737).

It thus appears that we don't have any test-coverage for this code-path, and given the relative complexity of the PDF files affected by this bug I wasn't able to easily create a reduced test-case.
*Please note:* The linked test-case included in this patch is currently *not* rendered correctly (that'd be the PR 6606), but it at least gives us some test-coverage here.
2017-12-27 13:50:53 +01:00
Tim van der Meij
c7af2db2ec
Implement unit tests for the encodings and fix missing items
Initially I just implemented the unit tests, but quickly found that they
were failing my expectation of having a size of 256 items. Some of them
did contain 256 items and some did not. I looked up various resources
and figured that they indeed all need to have 256 items. One of the good
resources is https://github.com/davidben/poppler/blob/master/poppler/FontEncodingTables.cc

Aside from some missing `notdef` (empty string) entries at the end of
the arrays, which I assume causes issues since it may cause
out-of-bounds array access which in JavaScript gives `undefined`, there
was a `notdef` entry missing in the `MacExpertEncoding`, causing the
entries after that to be shifted. This fix for this is similar to the
one in #8589.

The unit tests verify that, for known encoding names, the return value
is not only an array, but that it is also of the right length and
contains only strings.
2017-12-24 18:14:40 +01:00
Jonas Jenwald
d4cd44fd16 Add a fallback for non-embedded LucidaSans-Demi fonts (issue 9291)
The PDF file in the issue uses a number of *embedded* versions of Lucida fonts, but for some reason does *not* embed the LucidaSans-Demi font. According to https://en.wikipedia.org/wiki/Lucida#Usages that one should be bold, so we can at least improve rendering here (even though it won't look perfect).

Fixes 9291.
2017-12-24 17:36:58 +01:00
Tim van der Meij
957e2d420d
Implement unit tests for the network utility code
This should provide 100% coverage for the file.
2017-12-23 19:24:11 +01:00
Jonas Jenwald
e58f2f513a [api-major] Remove the unused encrypted property from the pdfInfo object sent from the worker via the GetDoc message
I recall being confused as to the purpose of the `encrypted` property all the way back when working on PR 4750.

Looking at the history, this property was added in PR 1698 when password support was added to the API/viewer. However, its only purpose seem to have been to facilitate the addition of a `isEncrypted` function in the API. That function never, as far as I can tell, saw any use and was unceremoniously removed in PR 4144.

Since we want to avoid sending all non-essential data early during initial document loading (e.g. PR 4750), it seems correct to get rid of the `encrypted` property. Especially since it hasn't even been exposed in the API for over three years, with no complaints that I'm aware of.

Finally note that the `encrypt` property on the `XRef` instance isn't tied to the code that's being removed here. Given that we're calling `PDFDocument.parse` during `createDocumentHandler` in the worker which, via `PDFDocument.setup`, calls `XRef.parse` where the `Encrypt` data (if it exists) is always parsed.
2017-12-21 13:10:23 +01:00
Jonas Jenwald
9ff3c6f99d Remove the document.readyState polyfill
This is only relevant for browsers that we don't intend to support with PDF.js version `2.0`.
2017-12-19 15:05:19 +01:00
Jonas Jenwald
6af45052c5 Remove the input.type polyfill
This is only relevant for browsers that we don't intend to support with PDF.js version `2.0`.
2017-12-19 15:05:15 +01:00
Jonas Jenwald
cf88b7b212 Remove the ImageData.set polyfill
This is only relevant for browsers that we don't intend to support with PDF.js version `2.0`.
2017-12-19 15:05:14 +01:00
Jonas Jenwald
363e517acf Remove the HTMLElement.dataset polyfill
This is only relevant for browsers that we don't intend to support with PDF.js version `2.0`.
2017-12-19 14:50:18 +01:00
Jonas Jenwald
4880200cd4 Remove the XMLHttpRequest.response polyfill
This is only relevant for browsers that we don't intend to support with PDF.js version `2.0`.
2017-12-19 14:48:43 +01:00
Jonas Jenwald
8266cc18e7 Remove the webkitURL polyfill
This is only relevant for browsers that we don't intend to support with PDF.js version `2.0`.
2017-12-19 14:46:04 +01:00
Soumya Himanish Mohapatra
95ad956f68 PDFjs now compatible with Librejs 2017-12-19 15:13:50 +05:30
Jonas Jenwald
1dc54ddb40 Handle PDF files with missing 'endobj' operators, by searching for the "obj" string rather than "endobj" in XRef.indexObjects (issue 9105)
This patch refactors the searching for 'endobj', to try and find the next occurance of "obj" and then check if it was in fact an 'endobj' and continue searching otherwise.
This approach is used to avoid having to first find 'endobj', and then re-check the entire contents of the object and having to run (potentially expensive) regular expressions on arbitrary long strings.

Fixes 9105.
2017-12-18 13:17:45 +01:00
Tim van der Meij
6bbe91079b
Merge pull request #9272 from nveenjain/fix/8846
Replaced occurence of `throw new Error` with `unreachable`
2017-12-15 22:11:32 +01:00
Jonas Jenwald
6515b91118
Merge pull request #9276 from mozilla/loca-fix
Fix loca table when offsets aren't in ascending order.
2017-12-15 20:59:42 +01:00
Brendan Dahl
9b51cea724 Fix loca table when offsets aren't in ascending order. 2017-12-15 11:20:28 -06:00
Naveen Jain
1135674647 Replaced occurence of throw new Error with unreachable where applicable 2017-12-14 12:58:50 +05:30
Jonas Jenwald
ad5ed37059 Handle broken, Ghostscript generated, Metadata that contains HTML character names (bug 1424938)
Please note that while this could be considered a regression in user-facing behaviour, I'm not convinced that it's really a regression as such since prior to PR 8912 the Metadata would fail to parse (with an XML error) and thus be ignored when setting the viewer title.
With the refactored Metadata parsing we're now able to parse this, which uncovered issues with a subset of broken Ghostscript Metadata that uses HTML character names.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1424938
2017-12-13 14:32:47 +01:00
Tim van der Meij
095c63cc25
Merge pull request #9260 from Snuffleupagus/rm-JpegStream.getBytes
Attempt to remove the special `JpegStream.getBytes` method and utilize the regular `DecodeStream` one instead
2017-12-10 16:50:50 +01:00
Tim van der Meij
c35bbd11b0
Use native Math functions in the custom log2 function
It is quite confusing that the custom function is called `log2` while it
actually returns the ceiling value and handles zero and negative values
differently than the native function.

To resolve this, we add a comment that explains these differences and
make the function use the native `Math` functions internally instead of
using our own custom logic. To verify that the function does what we
expect, we add unit tests.

All browsers except for IE support `Math.log2` for quite a long time
already (see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/log2).
For IE, we use the core-js polyfill.

According to the microbenchmark at https://jsperf.com/log2-pdfjs/1,
using the native functions should also be faster, in my testing almost
six times as fast.
2017-12-10 16:35:17 +01:00
Jonas Jenwald
84de1e9a92 Attempt to remove the special JpegStream.getBytes method and utilize the regular DecodeStream one instead
Note that no other image stream implements a special `getBytes` method, which makes `JpegStream` look somewhat odd.

I'm actually not sure what purpose this methods serves, since I successfully ran all tests locally with it commented out. Furhermore, I also ran tests with an added `if (length && length !== this.bufferLength) { throw new Error('length mismatch'); }` check, and didn't get a single test failure in that case either.

Looking at the history, it seems that this code originated back in PR 4528, but as far as I can tell there's no mention in either commit messages nor PR comments of why it was necessary to add a "special" `getBytes` function for the `JpegStream`.
My assumption is that there's a good reason why this method was added, e.g. to address a *specific* regression in one of the reference tests. However, I did check out commit 58f697f977 locally and ran tests with this method commented out, and there didn't seem to be any image-related failures in that case either!?

Hence I'm suggesting that we attempt to simplify this code slightly be removing this special `getBytes` method. However, please note that there's perhaps a *small* risk of regressions in an edge-case where we currently have insufficient test-coverage.
2017-12-10 13:31:08 +01:00
Brendan Dahl
af1d80d45e
Merge pull request #9230 from Snuffleupagus/issue-9195
Add basic support for non-embedded Calibri fonts (issue 9195)
2017-12-08 10:15:43 -08:00
Jonas Jenwald
a5e3261b48
Merge pull request #9062 from mozilla/no_high
Move char codes from high surrogate pair range into private use.
2017-12-08 12:31:22 +01:00
Brendan Dahl
306999c325 Move char codes from high surrogate pair range into private use.
Fixes #2884
2017-12-07 10:35:50 -08:00
Jonas Jenwald
7c5ba9aad5 [api-major] Only create a StatTimer for pages when enableStats == true (issue 5215)
Unless the debugging tools (i.e. `PDFBug`) are enabled, or the `browsertest` is running, the `PDFPageProxy.stats` aren't actually used for anything.
Rather than initializing unnecessary `StatTimer` instances, we can simply re-use *one* dummy class (with static methods) for every page. Note that by using a dummy `StatTimer` in this way, rather than letting `PDFPageProxy.stats` be undefined, we don't need to guard *every* single stats collection callsite.

Since it wouldn't make much sense to attempt to use `PDFPageProxy.stats` when stat collection is disabled, it was instead changed to a "private" property (i.e. `PDFPageProxy._stats`) and a getter was added for accessing `PDFPageProxy.stats`. This getter will now return `null` when stat collection is disabled, making that case easy to handle.

For benchmarking purposes, the test-suite used to re-create the `StatTimer` after loading/rendering each page. However, modifying properties on various API code from the outside in this way seems very error-prone, and is an anti-pattern that we really should avoid at all cost. Hence the `PDFPageProxy.cleanup` method was modified to accept an optional parameter, which will take care of resetting `this.stats` when necessary, and `test/driver.js` was updated accordingly.

Finally, a tiny bit more validation was added on the viewer side, to ensure that all the code we're attempting to access is defined when handling `PDFPageProxy` stats.
2017-12-06 23:12:25 +01:00
Jonas Jenwald
50b72dec6e Convert StatTimer to an ES6 class 2017-12-06 13:59:03 +01:00
Jonas Jenwald
6b1eda3e12 Move StatTimer from src/shared/util.js to src/display/dom_utils.js
Since the `StatTimer` is not used in the worker, duplicating this code on both the main and worker sides seem completely unnecessary.
2017-12-06 13:51:04 +01:00
Jonas Jenwald
08de655177 Add basic support for non-embedded Calibri fonts (issue 9195)
There's a number of issues with the fonts in the referenced PDF file. First of all, they contain broken `ToUnicode` data (`NUL` bytes all over the place). However even if you skip those, the `ToUnicode` data appears to contain nothing but a `IdentityH` CMap which won't help provide a proper glyph mapping.

The real issue actually turns out to be that the PDF file uses the "Calibri" font[1], but doesn't include any font files. Since that one isn't a standard font, and uses a fairly different CID to GID map compared to the standard fonts, we're not able to render the file even remotely correct.
To work around this, I'm thus proposing that we include a (incomplete) glyph map for Calibri, and fallback to the standard Helvetica font. Obviously this isn't going to look perfect, but it's really the best that we can hope to achieve given that the PDF file is missing the necessary font data.

Finally, please note that none of the PDF readers I've tried (Adobe Reader, PDFium in Chrome) were able to extract the text (which isn't very surprising, given the broken `ToUnicode` data).

Fixes 9195.

---

[1] According to Wikipedia, see https://en.wikipedia.org/wiki/Calibri, Calibri is (primarily) a Windows font.
2017-12-03 17:23:33 +01:00
Jonas Jenwald
f3c50fe2f9
Merge pull request #9192 from Snuffleupagus/issue-8229
Build a fallback `ToUnicode` map for simple fonts (issue 8229)
2017-11-30 10:27:32 +01:00
Tim van der Meij
e320243870
Merge pull request #9206 from janpe2/svg-inv-images
Fix inverted 1-bit images in SVG backend
2017-11-28 22:46:43 +01:00
Jani Pehkonen
58b214eab3 Fix inverted 1-bit images in SVG backend 2017-11-28 21:24:27 +02:00
Jani Pehkonen
06d083b04b Fix pattern-filled text 2017-11-28 19:40:22 +02:00
Tim van der Meij
3e34eb31d9
Merge pull request #9191 from timvandermeij/pushbuttons
Button widget annotations: implement support for pushbuttons
2017-11-27 22:31:07 +01:00
Jonas Jenwald
61e19bee43 Build a fallback ToUnicode map for simple fonts (issue 8229)
In some fonts, the included `ToUnicode` data is incomplete causing text-selection to not work properly. For simple fonts that contain encoding data, we can manually build a `ToUnicode` map to attempt to improve things.

Please note that since we're currently using the `ToUnicode` data during glyph mapping, in an attempt to avoid rendering regressions, I purposely didn't want to amend to original `ToUnicode` data for this text-selection edge-case.
Instead, I opted for the current solution, which will (hopefully) give slightly better text-extraction results in PDF file with incomplete `ToUnicode` data.

According to the PDF specification, see [section 9.10.2](http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G8.1873172):

> A conforming reader can use these methods, in the priority given, to map a character code to a Unicode value.
> ...

Reading that paragraph literally, it doesn't seem too unreasonable to use *different* methods for different charcodes.

Fixes 8229.
2017-11-26 14:45:15 +01:00
Tim van der Meij
0fe80df2a7
Button widget annotations: implement support for pushbuttons 2017-11-26 14:09:48 +01:00
Jonas Jenwald
ffbfc3c2a7 Refactor the building of ToUnicode maps for simple fonts a helper method 2017-11-26 13:30:29 +01:00
Jonas Jenwald
ab1f76cc37 Remove the unused capability parameter from the WorkerTransport.getPage method
That parameter, originally named `promise`, has been unused for over five years; ever since commit f0687c4d50 in PR 1531.
2017-11-25 11:49:33 +01:00
Jonas Jenwald
59b5e14301 Split the existing WebGLUtils in two classes, a private WebGLUtils and a public WebGLContext, and utilize the latter in the API to allow various code to access the methods of WebGLUtils
This patch is one (small) step on the way to reduce the general dependency on a global `PDFJS` object, for PDF.js version `2.0`.
2017-11-24 21:54:47 +01:00
Jonas Jenwald
cc47ef56ec Remove the onclick polyfill for old versions of Opera
This was only relevant for no obsolete versions Opera, that use the Presto engine. According to https://en.wikipedia.org/wiki/History_of_the_Opera_web_browser#Opera_2013, the last version affected was released in 2013.
2017-11-21 11:02:14 +01:00
Jonas Jenwald
d18b2a8e73 Remove the classList polyfill
This is only relevant for browsers that we don't intend to support with PDF.js version `2.0`.
2017-11-21 11:01:52 +01:00
Jonas Jenwald
4b15e8566b Remove the Function.prototype.bind polyfill
This is only relevant for browsers that we don't intend to support with PDF.js version `2.0`.
2017-11-21 11:00:55 +01:00
Jonas Jenwald
d8cb74d3e3 Remove the btoa/atob polyfills
This is only relevant for browsers that we don't intend to support with PDF.js version `2.0`.
2017-11-21 11:00:55 +01:00
Jonas Jenwald
150ac0788f Remove IE9 specific XMLHttpRequest polyfills that utilize VBArray 2017-11-21 11:00:55 +01:00
Jonas Jenwald
935c5c587f Remove the Object.defineProperty polyfill
This is only relevant for browsers that we don't intend to support with PDF.js version `2.0`.
2017-11-21 11:00:55 +01:00
Tim van der Meij
25b07812b9
Sanitize the display value for choice widget annotations 2017-11-18 20:37:27 +01:00
Tim van der Meij
9e8cf448b0
Merge pull request #9140 from Snuffleupagus/rm-console-polyfill
Remove the `console` polyfills
2017-11-18 15:49:19 +01:00
Tim van der Meij
edaf4b3173
Merge pull request #9037 from Snuffleupagus/refactor-streams-params
Re-factor how parameters are passed to the network streams
2017-11-18 15:41:15 +01:00
Tim van der Meij
ae07adf143
Merge pull request #9073 from Snuffleupagus/image-streams-fixes
Fix the interface of `JpegStream`/`JpxStream`/`Jbig2Stream` to agree with the other `DecodeStream`s
2017-11-17 23:26:36 +01:00
Tim van der Meij
1d67d9dccd
Merge pull request #9131 from janpe2/svg-empty-paths
Filling and stroking empty paths in SVG backend
2017-11-16 22:43:24 +01:00
Jonas Jenwald
42099c564f Remove the console polyfills
All browsers that we intend to support with PDF.js version 2.0 already supports `console` natively.
2017-11-16 09:34:51 +01:00
Jonas Jenwald
d5174cd826 Remove the requestAnimationFrame polyfill
According to https://developer.mozilla.org/en-US/docs/Web/API/window/requestAnimationFrame#Browser_compatibility and https://caniuse.com/#feat=requestanimationframe, the browsers we intend to support with PDF.js version 2.0 should all have native `requestAnimationFrame` support.

Note that the reason for indiscriminately polyfilling `requestAnimationFrame` in iOS, see PR 4961, was apparently because of a bug in iOS 6.
However, according to [Wikipedia](https://en.wikipedia.org/wiki/IOS_version_history#iOS_8): "Support for iOS 8 ended in 2017.", hence the lowest version currently supported is iOS 9.
2017-11-15 16:08:48 +01:00
Jani Pehkonen
4e8f7070da Filling and stroking empty paths in SVG backend 2017-11-14 18:35:39 +02:00
Jonas Jenwald
745cb73c65 Remove PDFJS.disableRange/PDFJS.disableStream code for now unsupported browsers in src/shared/compatibility.js
We're currently disabling range requests and streaming for a number of configurations. A couple of those will no longer be supported (with PDF.js version 2.0), hence we ought to be able to clean up the compatibility code slightly.
2017-11-14 15:28:50 +01:00
Jonas Jenwald
eb3a1f24a3 Remove the PDFJS.disableHistory code from src/shared/compatibility.js
This compatibility code is only relevant for browsers that will no longer be supported (with PDF.js version 2.0), hence we ought to be able to remove it.
2017-11-14 15:28:50 +01:00
Tim van der Meij
9686f6652c
Merge pull request #9089 from yurydelendik/rm-chunks
Extracts OperatorList class and prepares for streaming
2017-11-13 23:35:40 +01:00
Jonas Jenwald
23699cef1c Re-factor how parameters are passed to the network streams
*This patch is the result of me starting to look into moving parameters from `PDFJS` into `getDocument` and other API methods.*

When familiarizing myself with the code, the signatures of the various network streams seemed to be unnecessarily cumbersome since `disableRange` is currently handled separately from other parameters.
I'm assuming that the explanation for this is probably "for historical reasons", as is often the case. Hence I'd like to clean this up *before* we start the larger, and more invasive, `PDFJS` parameter re-factoring.
2017-11-11 11:23:29 +01:00
Jonas Jenwald
de5297b9ea Fix the interface of JpegStream/JpxStream/Jbig2Stream to agree with the other DecodeStreams
The interface of all of the "image" streams look kind of weird, and I'm actually a bit surprised that there hasn't been any errors because of it.
For example: None of them actually implement `readBlock` methods, and it seems more luck that anything else that we're not calling `getBytes()` (without providing a length) for those streams, since that would trigger a code-path in `getBytes` that assumes `readBlock` to exist.

To address this long-standing issue, the `ensureBuffer` methods are thus renamed to `readBlock`. Furthermore, the new `ensureBuffer` methods are now no-ops.
Finally, this patch also replaces `var` with `let` in a number of places.
2017-11-11 11:22:16 +01:00
Jonas Jenwald
36593d6bbc Move JpegStream and JpxStream to their own files 2017-11-11 11:22:16 +01:00
Yury Delendik
5fa56f6a9d For backwards compatibility: use addOp amount instead of queue size. 2017-11-09 18:46:48 -06:00
Yury Delendik
877c2d7743 Changing QueueOptimizer to be more iterative. 2017-11-09 18:46:48 -06:00
Max Schaefer
3ae37d1b06 Remove a few useless assignments. 2017-11-03 11:36:48 +00:00
Max Schaefer
bc8f673522 Remove spurious arguments to NullStream constructor. 2017-11-03 10:14:32 +00:00
Max Schaefer
3ab1a9922a Rearrange a few declarations so that they precede their uses. 2017-11-03 10:14:32 +00:00
Jonas Jenwald
2dbd3f2603 [api-major] Remove the TypedArray polyfills 2017-11-01 10:31:28 +01:00
Brendan Dahl
b46443f0c1
Merge pull request #9077 from yurydelendik/v2
Version 2.0 merge
2017-10-31 14:24:20 -07:00
Jonas Jenwald
83e8398ff2 For non-embedded fonts, map softhyphen (0x00AD) to regular hyphen (0x002D) (issue 9084)
In the PDF file, the `ToUnicode` data first maps the hyphen correctly, and then *overwrites* it to point to the softhyphen instead. That one cannot be rendered in browsers, and an empty space thus appear instead.

Fixes 9084.
2017-10-31 13:26:04 +01:00
Jonas Jenwald
92fcfce685
Merge pull request #9082 from brendandahl/issue7562
Overwrite glyphs contour count if it's less than -1.
2017-10-30 20:44:01 +01:00
Yury Delendik
85f544f55a Moves OperatorList and QueueOptimizer into separate file. 2017-10-30 13:29:58 -05:00
Brendan Dahl
17037b5e51 Overwrite glyphs contour count if it's less than -1.
The test pdf has a contour count of -70, but OTS doesn't
like values less than -1.

Fixes issue #7562.
2017-10-30 09:16:51 -07:00
Yury Delendik
b4e25fb2e8 Merge remote-tracking branch 'mozilla/version-2.0' into v2 2017-10-27 14:01:45 -05:00
Jonas Jenwald
5e627810e4 Use stringToBytes in more places
Rather than having (basically) verbatim copies of `stringToBytes` in a few places, we can simply use the helper function directly instead.
2017-10-26 11:01:13 +02:00
Jonas Jenwald
e94a0fd4e7 Extract the actual decoding in CCITTFaxStream into a new CCITTFaxDecoder "class", which the new CCITTFaxStream depends on 2017-10-24 16:03:08 +02:00
Jonas Jenwald
bb35095083 Move CCITTFaxStream and Jbig2Stream, from src/core/stream.js, to separate files 2017-10-24 12:00:40 +02:00
Jonas Jenwald
d71a576b30 Merge pull request #9045 from brendandahl/sani-name
Sanitize name index in compile phase of CFF.
2017-10-24 11:48:03 +02:00
Brendan Dahl
6b12612a52 Sanitize name index in compile phase of CFF.
Fixes #8960
2017-10-23 17:13:49 -07:00
Tim van der Meij
a0ec980e63 Merge pull request #9055 from Snuffleupagus/core-js-Number
Replace `Number` polyfills with the ones from core-js
2017-10-21 16:15:11 +02:00
Jonas Jenwald
dbbb763eaf Replace Number polyfills with the ones from core-js
Since we're already using core-js elsewhere in `compatibility.js`, we can reduce the amount of code we need to maintain ourselves.

https://github.com/zloirock/core-js#ecmascript-6-number
2017-10-21 12:58:53 +02:00
Jonas Jenwald
8684aa0ee5 Replace our Promise polyfill with the one from core-js
Since we're already using core-js elsewhere in `compatibility.js`, we can reduce the amount of code we need to maintain ourselves.

https://github.com/zloirock/core-js#ecmascript-6-promise
2017-10-21 12:51:14 +02:00
Tim van der Meij
10fd590c09 Merge pull request #9032 from Snuffleupagus/nativeImageDecoderSupport-2.0
Simplify the check, and remove the warning, for the `nativeImageDecoderSupport` API parameter
2017-10-20 21:30:47 +02:00
Brendan Dahl
fcc9943d04 Use charstring as plain text when lengthIV is -1.
Fixes #7769
2017-10-18 14:19:59 -07:00
Tim van der Meij
17cc94db4e Merge pull request #9034 from Snuffleupagus/javascript-null
[api-major] Change `getJavaScript` to return `null`, rather than an empty Array, when no JavaScript exists
2017-10-17 21:58:45 +02:00
Tim van der Meij
7d7edd9cc6
[api-major] Remove the PDFJS_NEXT option
Nothing uses this option anymore, so setting it is a no-op now. We can
safely remove it.

Use `SKIP_BABEL` (instead of `PDFJS_NEXT`) now if you want to skip Babel
translation for a build.
2017-10-16 23:16:51 +02:00
Jonas Jenwald
b5794bb26b Simplify the check, and remove the warning, for the nativeImageDecoderSupport API parameter
As discussed in PR 8982.
2017-10-16 09:11:39 +02:00
Jonas Jenwald
bcb29063c1 Polyfill Object.values and Array.prototype.includes using core-js
See https://github.com/zloirock/core-js#stage-4-proposals.
2017-10-16 09:11:39 +02:00
Jonas Jenwald
1cd1582cb9 [api-major] Change getJavaScript to return null, rather than an empty Array, when no JavaScript exists
Other API methods already return `null`, rather than empty Arrays/Objects, hence it makes sense to change `getJavaScript` to be consistent.
2017-10-15 22:17:14 +02:00
Tim van der Meij
815bc53a16 Merge pull request #9029 from Snuffleupagus/eslint--report-unused-disable-directives
Enable the `--report-unused-disable-directives` ESLint command line option
2017-10-15 15:12:20 +02:00
Jonas Jenwald
04b46831c1 Enable the --report-unused-disable-directives ESLint command line option
This option was added in [version `4.8.0` of ESLint](https://github.com/eslint/eslint/releases/tag/v4.8.0), which is already listed as the minimum version in our `package.json` file; please refer to https://eslint.org/docs/user-guide/command-line-interface#--report-unused-disable-directives for additional details.

Despite the caveat listed in the link above, I still think that using this option makes sense since it will help ensure that no longer necessary disable statements are removed.
2017-10-15 13:45:12 +02:00
Jonas Jenwald
0f90d5130c [api-major] Remove all remaining deprecated functions/methods 2017-10-15 13:27:10 +02:00
Jonas Jenwald
2f32131601 Replace our WeakMap polyfill with the one from core-js
Since we're already using core-js elsewhere in `compatibility.js`, we can reduce the amount of code we need to maintain ourselves.

https://github.com/zloirock/core-js#weakmap
2017-10-15 10:22:06 +02:00
Diego Casorran
11b1daa72d Mispelled isEvalSupported property at FontFaceObject() creation. 2017-10-07 20:05:21 +02:00
Tim van der Meij
509d3728f1 Merge pull request #8922 from Snuffleupagus/paintXObject-errors
Allow `getOperatorList`/`getTextContent` to skip errors when parsing broken XObjects (issue 8702, issue 8704)
2017-10-07 15:46:26 +02:00
Yury Delendik
fab59e0f91 Revert "Closes all promises/streams when handler is destroyed." 2017-10-06 11:55:28 -05:00
Jonas Jenwald
3d79bd5e87 [api-major] When rendering is cancelled, always reject with RenderingCancelledException 2017-10-04 18:09:28 +02:00
Tim van der Meij
ee9b5a12da
Remove the deprecated parameters for getDocument of the API
This is deprecated since October 2015 with a visible message, so we can
safely remove this now.
2017-10-02 23:05:32 +02:00
Tim van der Meij
0817375d2f
Remove the deprecated disableNativeImageDecoder parameter of the API
This is deprecated since May 2017 with a visible message, so we can
safely remove this now.
2017-10-02 23:01:57 +02:00
Tim van der Meij
b651cfb440
Remove the deprecated UnsupportedManager class of the API
This is deprecated since November 2015 with a visible message, so we
can safely remove this now.
2017-10-02 22:54:22 +02:00
Tim van der Meij
918bd98a2f
Remove the deprecated destroy method of the API
This is deprecated since October 2015 with a visible message, so we can
safely remove this now.
2017-10-02 22:54:15 +02:00
Tim van der Meij
9b353ef407
Remove the deprecated parameter handling in the render method of the API
This is deprecated since January 2015 with a visible message, so we can
safely remove this now.
2017-10-02 22:54:01 +02:00
Jonas Jenwald
5b67c7594c Remove the unused Util.sign function from src/shared/util.js
This has been completely unused since commit 10bb6c9ec0, in PR 2505, more than four and a half years ago.
2017-10-01 15:58:03 +02:00
Jonas Jenwald
65352c730f Remove the unused isWorker property from src/display/global.js
This has been completely unused since commit 1d12aed5ca, in PR 7126, almost one and a half years ago.
2017-10-01 15:49:55 +02:00
Brendan Dahl
d03e127434 Merge pull request #8971 from yurydelendik/close-handler
Closes all promises/streams when handler is destroyed.
2017-09-29 14:28:53 -07:00
Jonas Jenwald
b1472cddbb Allow getOperatorList/getTextContent to skip errors when parsing broken XObjects (issue 8702, issue 8704)
This patch makes use of the existing `ignoreErrors` property in `src/core/evaluator.js`, see PRs 8240 and 8441, thus allowing us to attempt to recovery as much as possible of a page even when it contains broken XObjects.

Fixes 8702.
Fixes 8704.
2017-09-29 17:14:21 +02:00
Jonas Jenwald
b8ec518a1e Split the existing PDFFunction in two classes, a private PDFFunction and a public PDFFunctionFactory, and utilize the latter in PDFDocument to allow various code to access the methods of PDFFunction`
*Follow-up to PR 8909.*

This requires us to pass around `pdfFunctionFactory` to quite a lot of existing code, however I don't see another way of handling this while still guaranteeing that we can access `PDFFunction` as freely as in the old code.

Please note that the patch passes all tests locally (unit, font, reference), and I *very* much hope that we have sufficient test-coverage for the code in question to catch any typos/mistakes in the re-factoring.
2017-09-29 15:30:53 +02:00
Jonas Jenwald
5c961c76bb Remove the unused inline parameter from various methods/functions in PDFImage, and change a couple of methods to use Objects rather than plain parameters
The `inline` parameter is passed to a number of methods/functions in `PDFImage`, despite not actually being used. Its value is never checked, nor is it ever assigned to the current `PDFImage` instance (i.e. no `this.inline = inline` exists).
Looking briefly at the history of this code, I was also unable to find a point in time where `inline` was being used.

As far as I'm concerned, `inline` does nothing more than add clutter to already very unwieldy method/function signatures, hence why I'm proposing that we just remove it.
To further simplify call-sites using `PDFImage`/`NativeImageDecoder`, a number of methods/functions are changed to take Objects rather than a bunch of (somewhat) randomly ordered parameters.
2017-09-29 15:30:40 +02:00
Yury Delendik
71b0e4e818 Closes all promises/streams when handler is destroyed. 2017-09-28 16:45:04 -05:00
Jonas Jenwald
a159c4f357 Check that this.baseUrl is defined before attempting to fetch any data in DOMCMapReaderFactory/NodeCMapReaderFactory 2017-09-28 12:34:57 +02:00
Jonas Jenwald
7d3efe43a2 Ensure that the same exact version of PDF.js is used in both the API and the Worker
I don't have a good example at hand right know, but I recall seeing custom deployments of PDF.js that bundle a *specific* version of the `build/pdf.js` file and then set `PDFJS.workerSrc` to point to https://mozilla.github.io/pdf.js/build/pdf.worker.js.
That practice seems really bad since, besides (obviously) causing unnecessary server load, it will very quickly result in a version mismatch between the `pdf.js` and `pdf.worker.js` files in those PDF.js deployments.
Such a version mismatch could easily lead to either breaking errors, or even worse slightly inconsistent behaviour for an API call (if the API -> Worker interface changes, which does happen from time to time).

To avoid the problems described above, I'm thus proposing that we enforce that the versions of the `pdf.js` and `pdf.worker.js` files must always match.
2017-09-27 15:41:57 +02:00
Brendan Dahl
18e2321845 Overwrite maxSizeOfInstructions in maxp with computed value.
In issue #7507 the value is less than the actuall max size
of the glyph instructions causing OTS to fail the font.
2017-09-25 17:53:26 -07:00
Jonas Jenwald
10727572a2 Merge pull request #8950 from timvandermeij/polygon-polyline-annotations
Implement support for polyline and polygon annotations
2017-09-24 15:16:14 +02:00
Tim van der Meij
c69a7a83da Merge pull request #8932 from janpe2/jbig2-sym-offset
JBIG2 symbol offsets
2017-09-23 17:11:45 +02:00
Tim van der Meij
8ccad276b2
Implement support for polygon annotations 2017-09-23 16:52:47 +02:00
Tim van der Meij
99b17a494d
Implement support for polyline annotations 2017-09-23 16:37:23 +02:00
Tim van der Meij
40b89e9ba4 Merge pull request #8949 from Snuffleupagus/ColorSpace-rm-instanceof-AlternateCS
Remove the `instanceof AlternateCS` check in `ColorSpace.parse` since it's dead code
2017-09-23 16:09:17 +02:00
Tim van der Meij
2aac994171 Merge pull request #8928 from mukulmishra18/decode-file-path
Fix #8907: Decode URL to get correct path in node_stream.
2017-09-23 14:44:58 +02:00
Jonas Jenwald
8a084aff0f Remove the instanceof AlternateCS check in ColorSpace.parse since it's dead code
Looking at `ColorSpace.parseToIR`, it will do one of the following things when called:
 1. Return a String.
 2. Return an Array.
 3. Throw a `FormatError`.
 4. In one case, return the result of *another* `ColorSpace.parseToIR` call.

However, under no circumstances will it ever return an `AlternateCS` instance.

Since it's often useful to understand why code, which has become unused, existed in the first place, let's grab a hard hat and a shovel and start digging through the history of this code :-)

The current condition was introduced in commit c198ec4323, in PR 794, but it was actually already obsolete by that time.
The preceeding `instanceof SeparationCS` condition predates commit a7278b7fbc, in PR 700.
That condition was originally introduced all the way back in commit 4e3f87b60c, in PR 692. However, it was made obsolete by commit 9dcefe1efc, which is included in the very same PR!

Hence we're left with the conclusion that not only has this code be unused for *almost* six years, it was basically never used at all save for a few refactoring commits that're part of PR 692.
2017-09-23 14:36:10 +02:00
Tim van der Meij
d7b37ae745 Merge pull request #8912 from timvandermeij/xml-parser
[api-minor] Replace `DOMParser` with `SimpleXMLParser`
2017-09-20 23:45:00 +02:00
Jonas Jenwald
abc864fca9 Merge pull request #8938 from brendandahl/bug1392647
Use font's default width even when 0. (bug 1392647)
2017-09-20 22:38:39 +02:00
Brendan Dahl
10ba292b46 Use font's default width even when 0.
Bug 1392647 has a PDF where the default width of the font
is 0. It draws some charcodes that don't have glyphs, but
we were wrongly using the 1000 default width for these
charcodes causing some text to be overlapping.
2017-09-20 11:38:30 -07:00
Tim van der Meij
d4309614f9
Replace DOMParser with SimpleXMLParser
The `DOMParser` is most likely overkill and may be less secure.
Moreover, it is not supported in Node.js environments.

This patch replaces the `DOMParser` with a simple XML parser. This
should be faster and gives us Node.js support for free. The simple XML
parser is a port of the one that existed in the examples folder with a
small regex fix to make the parsing work correctly.

The unit tests are extended for increased test coverage of the metadata
code. The new method `getAll` is provided so the example does not have
to access internal properties of the object anymore.
2017-09-19 23:09:07 +02:00
Jani Pehkonen
5d1074c110 Fix JBIG2 symbol offsets in text regions 2017-09-19 23:43:23 +03:00
Tim van der Meij
bc9afdf3c4
Convert src/display/metadata.js to ES6 syntax 2017-09-19 22:13:59 +02:00
Jani Pehkonen
3d99b8d706 CCITTFaxStream problem when EndOfBlock is false 2017-09-19 22:19:40 +03:00
Mukul Mishra
e4c09c7cba Fix #8907: Decode URL to get correct path in node_stream. 2017-09-19 15:10:36 +05:30
Tilman Hausherr
d75a497a6b support tiff predictor for 16bit
(for issue #6289)
This does the same for 16 bit as the existing 8 bit tiff predictor code, an addition of the last word to this word.

The last two "& 0xFF" may or may not be needed, I see this isn't done in the 8 bit code, but I'm not a JS developer.
2017-09-18 22:24:25 +02:00
Tim van der Meij
400e4aae0e
Implement support for stamp annotations 2017-09-16 16:37:50 +02:00
Tim van der Meij
3be941d982 Merge pull request #8909 from Snuffleupagus/PDFFunction-isEvalSupported
Check `isEvalSupported`, and test that `eval` is actually supported, before attempting to use the `PostScriptCompiler` (issue 5573)
2017-09-16 16:11:03 +02:00
Jonas Jenwald
eece66fa3e For /Filter entries containing Names, ignore the /DecodeParms entry if it contains an Array (issue 8895) 2017-09-15 23:02:16 +02:00
Jonas Jenwald
dc926ffc0f Check isEvalSupported, and test that eval is actually supported, before attempting to use the PostScriptCompiler (issue 5573)
Currently `PDFFunction` is implemented (basically) like a class with only `static` methods. Since it's used directly in a number of different `src/core/` files, attempting to pass in `isEvalSupported` would result in code that's *very* messy, not to mention difficult to maintain (since *every* single `PDFFunction` method call would need to include a `isEvalSupported` argument).

Rather than having to wait for a possible re-factoring of `PDFFunction` that would avoid the above problems by design, it probably makes sense to at least set `isEvalSupported` globally for `PDFFunction`.

Please note that there's one caveat with this solution: If `PDFJS.getDocument` is used to open multiple files simultaneously, with *different* `PDFJS.isEvalSupported` values set before each call, then the last one will always win.
However, that seems like enough of an edge-case that we shouldn't have to worry about it. Besides, since we'll also test that `eval` is actually supported, it should be fine.

Fixes 5573.
2017-09-15 12:02:45 +02:00
Mukul Mishra
ef7038fe34 Fix #8888: Change behaviour of fetch to make it compatible with XHR. 2017-09-14 23:53:06 +05:30
Jonas Jenwald
f2618eb2e4 Merge pull request #8808 from janpe2/issue8741
Fix color of image masks inside uncolored patterns
2017-09-12 14:27:56 +02:00
Tim van der Meij
320779e6ed Merge pull request #8691 from timvandermeij/square-circle-annotations
Implement support for square and circle annotations
2017-09-09 22:56:54 +02:00
Tim van der Meij
44c116ac49
Implement support for circle annotations 2017-09-09 21:36:27 +02:00
Tim van der Meij
cace2e9047
Implement support for square annotations 2017-09-09 21:36:27 +02:00
Tim van der Meij
f7fd1db52f
Introduce DOMSVGFactory
This patch provides a new unit tested factory for creating SVG
containers and elements. This code is duplicated twice in the
codebase, but with upcoming changes this would need to be duplicated
even more. Moreover, consolidating this code in one factory allows
us to replace it easily for e.g., supporting Node.js. Therefore, move
this to a central place and update/ES6-ify the related code.

Finally, we replace `setAttributeNS` with `setAttribute` because no
namespace is provided.
2017-09-09 21:36:27 +02:00
Tim van der Meij
437e9cb056 Merge pull request #8865 from Snuffleupagus/hide-unsupported-LinkAnnotation
Hide unsupported `LinkAnnotation`s (issue 3897)
2017-09-09 19:07:43 +02:00
Jonas Jenwald
8686baede5 Replace value === (value | 0) checks with Number.isInteger(value) in the src/ folder
Rather than doing what (at first) may seem like a fairly obscure comparison, using `Number.isInteger` will clearly indicate the intent of the code.
2017-09-09 14:12:52 +02:00
Jonas Jenwald
7115e136e4 Hide unsupported LinkAnnotations (issue 3897)
Rather than displaying links that does *nothing* when clicked, it probably makes more sense to simply not render them instead. Especially since it turns out that, at least at this point in time, this is *very* easy to both implement and test.

Fixes 3897.
2017-09-06 12:52:56 +02:00
Jani Pehkonen
86020396cb Fix color of image masks inside uncolored patterns 2017-09-06 13:41:48 +03:00
Jonas Jenwald
41415ba0a2 Correctly validate the response status for non-HTTP fetch requests (PR 8768 follow-up)
It seems that the status check, for non-HTTP loads, causes the default viewer to *refuse* to open local PDF files.

***STR:***
 1. Make sure that fetch support is enabled in the browser. In Firefox Nightly, set `dom.streams.enabled = true` and `javascript.options.streams = true` in `about:config`.
 2. Open https://mozilla.github.io/pdf.js/web/viewer.html.
 3. Click on the "Open file" button, and open a new PDF file.

***ER:***
 A new PDF file should open in the viewer.

***AR:***
 The PDF file fails to open, with an error message of the following format:
`Message: Unexpected server response (200) while retrieving PDF "blob:https://mozilla.github.io/a4fc455f-bc05-45b5-b6aa-2ecff3cb45ce".`
2017-09-05 17:07:44 +02:00
Jonas Jenwald
cfb4955a92 Replace the isArray helper function with the native Array.isArray function
*Follow-up to PR 8813.*
2017-09-01 20:27:13 +02:00
Jonas Jenwald
11408da340 Replace the isInt helper function with the native Number.isInteger function
*Follow-up to PR 8643.*
2017-09-01 16:52:50 +02:00
Tim van der Meij
d332f62d60 Merge pull request #8857 from Snuffleupagus/fetchUncompressed-type-checks
Avoid some redundant type checks in `XRef.fetchUncompressed`
2017-08-31 23:33:02 +02:00
Tim van der Meij
51be27853f Merge pull request #8847 from Snuffleupagus/AnnotationElement-isRenderable-regression
Correct the default value for `isRenderable` in the `AnnotationElement` constructor, to fix breaking errors when rendering unsupported annotations
2017-08-31 22:08:10 +02:00
Jonas Jenwald
772a5412a4 Avoid some redundant type checks in XRef.fetchUncompressed
When looking briefly at using `Number.isInteger`/`Number.isNan` rather than `isInt`/`isNaN`, I noticed that there's a couple of not entirely straightforward cases to consider.

At first I really couldn't understand why `parseInt` is being used like it is in `XRef.fetchUncompressed`, since the `num` and `gen` properties of an object reference should *always* be integers.
However, doing a bit of code archaeology pointed to PR 4348, and it thus seem that this was a very deliberate change. Since I didn't want to inadvertently introduce any regressions, I've kept the `parseInt` calls intact but moved them to occur *only* when actually necessary.[1]

Secondly, I noticed that there's a redundant `isCmd` check for an edge-case of broken operators. Since we're throwing a `FormatError` if `obj3` isn't a command, we don't need to repeat that check.

In practice, this patch could perhaps be considered as a micro-optimization, but considering that `XRef.fetchUncompressed` can be called *many* thousand times when loading larger PDF documents these changes at least cannot hurt.

---
[1] I even ran all tests locally, with an added `assert(Number.isInteger(obj1) && Number.isInteger(obj2));` check, and everything passed with flying colours.
However, since it appears that this was in fact necessary at one point, one possible explanation is that the failing test-case(s) have now been replaced by reduced ones.
2017-08-31 16:49:04 +02:00
Jonas Jenwald
84fe442b35 Correctly set the credentials of a fetch request, when the withCredentials parameter was passed to getDocument
Skimming through https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API/Using_Fetch#Sending_a_request_with_credentials_included, it looks to me like the `credentials` option was accidentally inverted.
2017-08-31 09:20:05 +02:00
Jonas Jenwald
87fc9bafea Correct the default value for isRenderable in the AnnotationElement constructor, to fix breaking errors when rendering unsupported annotations
*This regressed in PR 8828.*

When attempting to open e.g. http://mirrors.ctan.org/macros/latex/contrib/pdfcomment/doc/example.pdf, the annotation layers are now missing since `Error: Abstract method `AnnotationElement.render` called` is thrown multiple times.
2017-08-31 08:47:36 +02:00
Tim van der Meij
a4cc85fc5f Merge pull request #8828 from timvandermeij/es6-annotations
Improve the annotation code by converting to ES6 syntax and removing duplicate code
2017-08-31 00:02:07 +02:00
Yury Delendik
cd95b426c7 Disables fetch when ReadableStream is not available. 2017-08-30 10:53:59 -05:00
Yury Delendik
3cff7da0e7 Fixes fetch and node behavior when disableAutoFetch adn disableStream is used. 2017-08-30 10:53:38 -05:00
Mukul Mishra
3516a59384 Adds fetch stream logic for networking part of PDF.js 2017-08-29 22:56:48 +05:30
Jonas Jenwald
49b8cd5a6a Attempt to improve the EI detection heuristics, for inline images, in streams containing NUL bytes (issue 8823)
Since this patch will now treat (some) `NUL` bytes as "ASCII", the number of `followingBytes` checked are thus increased to (hopefully) reduce the risk of introducing new false positives.

Fixes 8823.
2017-08-27 12:48:28 +02:00
Tim van der Meij
2512eccbf0
Implement getOperatorList method in the WidgetAnnotation class to
avoid duplication in subclasses
2017-08-27 01:02:41 +02:00
Tim van der Meij
4f02857394
Let the two annotation factories use static methods
This corresponds to how other factories are implemented.
2017-08-27 01:02:40 +02:00
Tim van der Meij
af10f8b586
Convert src/display/annotation_layer.js to ES6 syntax 2017-08-27 01:02:40 +02:00
Tim van der Meij
24d741d045
Convert src/core/annotation.js to ES6 syntax 2017-08-27 00:53:45 +02:00
Rob Wu
7cc7260634 Merge pull request #8796 from timvandermeij/svg-text-rise
Implement text rise for the SVG back-end
2017-08-26 19:02:51 +02:00
Jonas Jenwald
42f2d36d1f Account for broken outlines/annotations, where the destination dictionary contains an invalid /Dest entry
According to the specification, see http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#page=377, a `Dest` entry in an outline item should *not* contain a dictionary.
Unsurprisingly there's PDF generators that completely ignore this, treating is an `A` entry instead.

The patch also adds a little bit more validation code in `Catalog.parseDestDictionary`.
2017-08-26 17:38:15 +02:00
Tim van der Meij
d368a5baed Merge pull request #8817 from mukulmishra18/intermittent-error
Set this.isCancelled in close method of streamSink.
2017-08-25 00:31:44 +02:00
Jonas Jenwald
88167b5e38 Merge pull request #8824 from Snuffleupagus/bug-1393476
Prevent an infinite loop in `XRef.readXRef` by keeping track of already parsed tables (bug 1393476)
2017-08-24 22:13:48 +02:00
Jonas Jenwald
4660cf8238 Prevent an infinite loop in XRef.readXRef by keeping track of already parsed tables (bug 1393476)
With this patch, not only is the infinite loop prevented, but we're also able to actually render the file (which e.g. Adobe Reader isn't able to).

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1393476.
2017-08-24 19:18:08 +02:00
Yury Delendik
e82811adb4 Merge pull request #8712 from mukulmishra18/node_stream
Adds node.js logic for networking tasks for PDF.js
2017-08-24 11:35:29 -05:00
Mukul Mishra
efad0c7a40 Set this.isCancelled in close method of streamSink. 2017-08-24 13:51:27 +05:30
Mukul Mishra
d16709f5e4 Adds tests for node_stream 2017-08-24 12:46:44 +05:30
Tim van der Meij
e9ba54940d Merge pull request #8800 from Snuffleupagus/issue-8798
Try to recover if we reach the end of the stream when searching for the `EI` marker of an inline image (issue 8798)
2017-08-23 23:47:51 +02:00
Yury Delendik
438c0b28f2 Use Array.isArray in the LoopbackPort. 2017-08-23 09:25:25 -05:00
Mukul Mishra
18ede8c65d Adds http support to node_stream logic 2017-08-23 14:08:41 +05:30
Mukul Mishra
ed78b23ff2 Adds node.js logic for networking tasks for PDF.js 2017-08-23 14:06:43 +05:30
Jonas Jenwald
cb10c03d0a Merge pull request #8812 from yurydelendik/central-global
Moves global scope out of shared/util.
2017-08-23 09:36:13 +02:00
Yury Delendik
57bc3296f4 Moves global scope out of shared/util. 2017-08-22 18:20:52 -05:00
Tim van der Meij
cfc052a515
Implement text rise for the SVG back-end
The property and the setter for text rise were already present, but they
were never used or called. This patch completes the implementation by
calling the setter when the operator is encountered and by using the
text rise value when rendering text.
2017-08-23 00:34:39 +02:00
Jonas Jenwald
ca936ee0c7 Merge pull request #8491 from janpe2/jbig2Halftone-2
JBIG2 halftone regions and pattern dictionaries
2017-08-23 00:13:43 +02:00
Jonas Jenwald
cb55506b95 Try to recover if we reach the end of the stream when searching for the EI marker of an inline image (issue 8798) 2017-08-22 09:33:13 +02:00
Jonas Jenwald
2112999db7 Fix caching of small inline images in Parser.makeInlineImage (issue 8790)
*Follow-up to PR 5445.*

Using the PDF file from issue 2618, i.e. http://bugzilla-attachments.gnome.org/attachment.cgi?id=226471, with the following manifest file:
```json
[
    {  "id": "issue2618",
       "file": "../web/pdfs/issue2618.pdf",
       "md5": "",
       "rounds": 50,
       "type": "eq"
    }
]
```
I get the following results when comparing `master` against this patch:
```
browser | stat         | Count | Baseline(ms) | Current(ms) |  +/- |     %  | Result(P<.05)
------- | ------------ | ----- | ------------ | ----------- | ---- | ------ | -------------
firefox | Overall      |    50 |         4694 |        3974 | -721 | -15.35 |        faster
firefox | Page Request |    50 |            2 |           1 |    0 | -22.83 |
firefox | Rendering    |    50 |         4692 |        3972 | -720 | -15.35 |        faster
```

So, based on these results, it seems like a fairly clear win to fix this broken caching :-)
2017-08-18 23:08:55 +02:00
Tim van der Meij
50e10fdafc Merge pull request #8785 from Rob--W/svg-ignore-missing-glyph
SVG: Don't render missing glyphs
2017-08-17 23:40:57 +02:00
Rob Wu
f07ce2bbc2 SVG: Don't render missing glyphs
This bug is similar to the canvas bug of #6721.
I found this bug when I tried to run pdf2svg on a SVG file, and the generated
SVG could not be viewed in Chrome due to a SVG/XML parsing error:
"PCDATA invalid Char value 3"

Reduced test case:
- https://github.com/mozilla/pdf.js/files/1229507/pcdatainvalidchar.pdf
- expected: "hardware performance"
- Actual SVG source: "hardware\x03performance"
  (where "\x03" is a non-printable character, and invalid XML).

In terms of rendering, this bug is similar to #6721, where an unexpected glyph
appeared in the canvas renderer. This was fixed by #7023, which skips over
missing glyphs. This commit follows a similar logic.

The test case from #6721 can be used here too:
- https://github.com/mozilla/pdf.js/files/52205/issue6721_reduced.pdf
  expected: "Issue   6721"
  actual (before this patch): "Issue ààà6721"
2017-08-16 23:49:55 +02:00
Jonas Jenwald
563b68e74d Remove manual clamping code in src/core/jpx.js
Since we're now using `Uint8ClampedArray`, rather than `Uint8Array`, doing manual clamping shouldn't be necessary given that that is now handled natively.

This shouldn't have any measurable performance impact, but just to sanity check that I've done some quick benchmarking with the following manifest file:
```json
[
    {  "id": "S2-eq",
       "file": "pdfs/S2.pdf",
       "md5": "d0b6137846df6e0fe058f234a87fb588",
       "rounds": 100,
       "type": "eq"
    }
]
```
which gave the following results against the current `master` (repeated benchmark runs didn't result in any meaningful differences):
```
-- Grouped By browser, stat --
browser | stat         | Count | Baseline(ms) | Current(ms) | +/- |    %  | Result(P<.05)
------- | ------------ | ----- | ------------ | ----------- | --- | ----- | -------------
firefox | Overall      |   100 |          592 |         592 |   1 |  0.12 |
firefox | Page Request |   100 |            3 |           3 |   0 | -9.88 |
firefox | Rendering    |   100 |          588 |         589 |   1 |  0.18 |
```
2017-08-16 13:24:28 +02:00
Jonas Jenwald
f6636d6b19 Use Uint8ClampedArray when returning image data in src/core/jbig2.js and src/core/jpg.js 2017-08-16 13:24:28 +02:00
Jonas Jenwald
74ad90cb8f Update the mask data inversion in PDFImage.createMask to be compatible with both Uint8Array and Uint8ClampedArray 2017-08-16 13:24:21 +02:00
Jonas Jenwald
d6cd5355f0 Use Uint8ClampedArray, when returning data, and remove manual clamping in src/core/jpg.js (issue 4901)
This patch removes the `clamp0to255` helper function, as well as manual clamping code in `src/core/jpg.js`.
The adjusted constants in `_convertCmykToRgb` were taken from CMYK to RGB conversion code found in `src/core/colorspace.js`.

*Please note:* There will be some very slight movement in a number of existing test-cases, since `Uint8ClampedArray` appears to use `Math.round` (or equivalent) and the old code used (basically) `Math.floor`.
2017-08-14 16:19:57 +02:00
Jonas Jenwald
be36c60e0f Polyfill Uint8ClampedArray using core-js
https://github.com/zloirock/core-js
2017-08-14 16:19:55 +02:00
Tim van der Meij
903f372e3d Merge pull request #8762 from Snuffleupagus/evaluator-coded-isType3Font
Replace the `coded` property with `isType3Font` when building the font `properties` object in `PartialEvaluator.translateFont`
2017-08-08 22:02:28 +02:00
Tobias Schneider
da44d10af1 Fallback to plain object for globalScope. 2017-08-08 09:17:48 -07:00
Jani Pehkonen
9a581ee9ed Implement JBIG2 halftone regions and pattern dictionaries 2017-08-08 15:38:29 +03:00
Jonas Jenwald
093afd1212 Replace the coded property with isType3Font when building the font properties object in PartialEvaluator.translateFont
This appears to simply have been forgotten in the re-factoring in PR 4815, where the `coded` property was renamed to the *much* more descriptive `isType3Font` property.
2017-08-08 14:03:02 +02:00
Jonas Jenwald
4729e96fb7 Remove leftover args[0].code checks from the OPS.paintXObject cases in evaluator.js
From looking at blame, it seems that these checks became obsolete with PR 692 (which landed close to six years ago). Note how, after that PR, there's no longer anything being assigned to the `code` property of an Object.
2017-08-07 10:48:37 +02:00
Jonas Jenwald
ace9de6f7d Merge pull request #8747 from brendandahl/first-cmap
Fix two cmap related issues.
2017-08-04 14:11:12 +02:00
Jonas Jenwald
bbf5b4d706 Merge pull request #8745 from yurydelendik/cancel-stream
Properly cancel streams and guard at getTextContent.
2017-08-04 13:00:31 +02:00
Brendan Dahl
0bef50d56d Fix two cmap related issues.
In issue #8707, there's a char code mapped to a non-
existing glyph which shouldn't be drawn. However, we
saw it was missing and tried to then use the post table and
end up mapping it incorrectly.

This illuminated a problem with issue #5704 and bug
893730 where glyphs disappeared after above fix.  This was
from the cmap returning the wrong glyph id. Which in turn was
caused because the font had multiple of the same type of cmap
table and we were choosing the last one. Now, we instead
default to the first one. I'm unsure if we should instead be
merging the multiple cmaps, but using only the first one works.
2017-08-03 22:19:36 -07:00
Yury Delendik
a1dfbec532 Properly cancel streams and guard at getTextContent. 2017-08-03 16:36:46 -05:00
Yury Delendik
5b5781b45d Merge pull request #8738 from ChenMachluf/remove_workerPort_after_PDFWorker_destroy
Delete workerPort to PDFWorker cache after PDFWorker destroy
2017-08-03 15:49:40 -05:00
Yury Delendik
6beb925f0b Checks Edge support for streams. 2017-08-03 08:48:51 -05:00
Jonas Jenwald
e20d4a9c21 Merge pull request #8681 from brendandahl/glyph-ids
Fix several issues with glyph id mappings (issue 8668, bug 1383504)
2017-08-03 14:25:34 +02:00
Chen Machluf
9b1b160d4f remove PDFWorker from cache after detsroy 2017-08-02 23:48:42 +03:00
Brendan Dahl
5b7f712ca7 Merge pull request #8627 from yurydelendik/issue-8591
Fallback on font widths if CFF data is broken
2017-08-02 10:53:14 -07:00
Mukul Mishra
00e026ebcd Reduces the amount of data send via GetDocRequest. 2017-07-30 00:00:03 +05:30
Tim van der Meij
783d42ec2b Merge pull request #8611 from apoorv-mishra/colorspace-tests
Add unit-tests for colorspace.js
2017-07-29 00:14:03 +02:00
Yury Delendik
01b47d9012 Use streams-lib as polyfill 2017-07-28 11:54:33 -05:00
Apoorv Mishra
a129de7bd1 Add unit-tests for colorspace.js
Added unit-tests for DeviceGray, DeviceRGB and DeviceCMYK

Added unit-tests for CalGray

Added unit-tests for CalRGB

Removed redundant code

Added unit-tests for LabCS

Added unit-tests for IndexedCS

Update comment

Change lookup to Uint8Array as mentioned in pdf specs(these tests will pass after PR #8666 is merged).

Added unit-tests for AlternateCS

Resolved code-style issues

Fixed code-style issues

Addressed issues pointed out in https://github.com/mozilla/pdf.js/pull/8611#pullrequestreview-52865469
2017-07-28 14:24:56 +05:30
Yury Delendik
343b4dc2b6 Merge pull request #8617 from mukulmishra18/network-streaming
Adds Streams API support for networking task of PDF.js project.
2017-07-27 16:15:06 -05:00
Mukul Mishra
109106794d Adds Streams API support for networking task of PDF.js project.
network.js file moved to main thread and `PDFNetworkStream` implemented
at worker thread, that is used to ask for data whenever worker needs.
2017-07-28 02:32:30 +05:30
Brendan Dahl
ac33358e1f Fix several issues with glyph id mappings.
The initial issue with #8255 was I added a missing glyphs
check to adjustMapping, but this caused us to skip re-mapping
a glyph if the fontCharCode was a missingGlyph which in turn
caused us to overwrite a valid glyph id with an invalid one. While
fixing this, I also added a warning if the private use area is full since
this also accidentally happened when I made a different mistake.

This brought to light a number of issues where we map
missing glyphs to notdef, but often the notdef is actually defined
and then ends up being drawn. Now the glyphs don't get
mapped in toFontChar and so they are not drawn by the canvas.

Fixing the above brought up another issue though in bug1050040.pdf.
In this PDF, the font fails to load by the browser and before we were still
drawing the glyphs because it looked like the font had them, but with the fixes
above the glyphs showed up as missing so we didn't attempt draw them. To
fix this, I now throw an error when the loca table is in really bad shape and
we fall back to trying to use a system font. We now also use this fall back if
there are any format errors during converting fonts.
2017-07-26 13:00:55 -07:00
Tim van der Meij
37ac8f8623 Merge pull request #8698 from Snuffleupagus/issue-8697
Add a fallback for non-embedded SegoeUISymbol font (issue 8697)
2017-07-25 22:35:52 +02:00
Tim van der Meij
44a5cec25e Merge pull request #8666 from apoorv-mishra/fix-colorspace
Fix TypeError that occurs in colorspace.js on accidentally passing an 'Array' instead of 'TypedArray'
2017-07-25 22:13:20 +02:00
Yury Delendik
c830021b07 Fixes CFF data glyph widths 2017-07-25 12:29:51 -05:00
Jonas Jenwald
23ec6b16ca Add a fallback for non-embedded SegoeUISymbol font (issue 8697)
The PDF file uses a non-embedded SegoeUISymbol font, which is *not* a standard font (and is mainly used by Microsoft, see https://en.wikipedia.org/wiki/Segoe).

Fixes 8697.
2017-07-25 12:45:11 +02:00
Mukul Mishra
568b0b6a42 Adds ready capability rejection logic for stream sink. 2017-07-25 02:07:38 +05:30
Tim van der Meij
af71ea7a7d Merge pull request #8673 from Snuffleupagus/api-pageMode
[api-minor] Add support for PageMode in the API and viewer (issue 8657)
2017-07-23 13:17:07 +02:00
Tim van der Meij
e7cddcce28 Merge pull request #8684 from Snuffleupagus/rm-assert
Remove most `assert()` calls (issue 8506)
2017-07-22 19:42:24 +02:00
Tim van der Meij
7ded895d0c Merge pull request #8638 from Snuffleupagus/issue-4926-built-in-jpg
In `src/core/jpg.js`, ensure that the Adobe JPEG marker always takes precedence, even when the color transform code is zero
2017-07-22 17:25:09 +02:00
Jonas Jenwald
814fa1dee3 Remove most assert() calls (issue 8506)
This replaces `assert` calls with `throw new FormatError()`/`throw new Error()`.
In a few places, throwing an `Error` (which is what `assert` meant) isn't correct since the enclosing function is supposed to return a `Promise`, hence some cases were changed to `Promise.reject(...)` and similarily for `createPromiseCapability` instances.
2017-07-21 18:51:02 +02:00
Apoorv Mishra
d14956d4b8 Fix TypeError that occurs in colorspace.js on accidentally passing an 'Array' instead of 'TypedArray'
Fix TypeError that occurs in colorspace.js on accidentally passing an 'Array' instead of 'TypedArray'

Changed getRgbItem(...) to getRgbBuffer(...) since this.lookup has values in range[0, 255] whereas getRgbItem(...) expects those to be in range [0, 1]

Revert changes for IE9 compatibility
2017-07-21 01:15:05 +05:30
Jonas Jenwald
15f0963f51 Fix a typo, in the Catalog.numPages getter, than prevents shadowing from working correctly
Looking at the blame, it seems that this typo was present even before PR 700 (almost six years ago).
The result of using `'num'`, rather than the *correct* `'numPages'` string, is that the `Catalog.numPages` getter isn't actually being shadowed.
2017-07-20 12:35:09 +02:00
Jonas Jenwald
16c5d41c5b [api-minor] Add support for PageMode in the API (issue 8657)
Please refer to https://wwwimages2.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=82.
2017-07-19 16:40:03 +02:00
Yury Delendik
ca3c08f12b Merge pull request #8620 from Rob--W/issue-8560-improve-png-compression
Improve compression of PNG images embedded in generated SVG files
2017-07-14 07:23:38 -05:00
Tim van der Meij
26be1df5f7 Merge pull request #8641 from Snuffleupagus/eslint-version-4-upgrade
Update ESLint (and eslint-plugin-mozilla) to the latest version
2017-07-14 14:17:42 +02:00
Jonas Jenwald
f2270252c7 Add Number.isNaN and Number.isInteger polyfills in compatibility.js, since the Streams polyfill relies on them
Without this, the Streams polyfill will fail in Internet Explorer when the code-paths containing these methods are used.
2017-07-13 12:02:14 +02:00
Jonas Jenwald
6f3565e638 Update ESLint (and eslint-plugin-mozilla) to the latest version 2017-07-12 13:14:25 +02:00
Jonas Jenwald
e2ea9b693c In src/core/jpg.js, ensure that the Adobe JPEG marker always takes precedence, even when the color transform code is zero
According to the PDF specification, please see http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.2394361, if an Adobe JPEG marker is present it should always take precedence. This even seem to be consistent with the existing comment that is present in the code.
Hence it seems reasonable to interpret `transformCode === 0` as no color conversion being necessary.

Fixes the rendering of page 1 in `issue-4926` (from the test-suite), when the built-in `src/core/jpg.js` image decoder is used.
2017-07-11 17:08:30 +02:00
Rob Wu
01f03fe393 Optimize PNG compression in SVG backend on Node.js
Use the environment's zlib implementation if available to get
reasonably-sized SVG files when an XObject image is converted to PNG.
The generated PNG is not optimal because we do not use a PNG predictor.
Futher, when our SVG backend is run in a browser, the generated PNG
images will still be unnecessarily large (though the use of blob:-URLs
when available should reduce the impact on memory usage). If we want to
optimize PNG images in browsers too, we can either try to use a DEFLATE
library such as pako, or re-use our XObject image painting logic in
src/display/canvas.js. This potential improvement is not implemented by
this commit

Tested with:

- Node.js 8.1.3 (uses zlib)
- Node.js 0.11.12 (uses zlib)
- Node.js 0.10.48 (falls back to inferior existing implementation).
- Chrome 59.0.3071.86
- Firefox 54.0

Tests:

Unit test on Node.js:

```
$ gulp lib
$ JASMINE_CONFIG_PATH=test/unit/clitests.json node ./node_modules/.bin/jasmine --filter=SVG
```

Unit test in browser: Run `gulp server` and open
http://localhost:8888/test/unit/unit_test.html?spec=SVGGraphics

To verify that the patch works as desired,

```
$ node examples/node/pdf2svg.js test/pdfs/xobject-image.pdf
$ du -b svgdump/xobject-image-1.svg
 # ^ Calculates the file size. Confirm that the size is small
 #   (784 instead of 80664 bytes).
```
2017-07-10 18:56:57 +02:00
Rob Wu
94f1dde07d Move DEFLATE logic in convertImgDataToPng
Move the DEFLATE logic in convertImgDataToPng to a separate function.
A later commit will introduce a more efficient deflate algorithm,
and fall back to the existing, naive algorithm if needed.
2017-07-10 18:45:47 +02:00
Rob Wu
742ed3d1c9 Remove __pdfjsdev_webpack__, use webpack options
`__pdfjsdev_webpack__` was used to skip evaluating part of an AST,
in order to not mangle some `require` symbols.
This commit removes `__pdfjsdev_webpack__`, and:

- Uses `__non_webpack_require__` when one wants the output to
  contain `require` instead of `__webpack_require__`.
- Adds options to the webpack config to prevent "polyfills" for
  some Node.js-specific APIs to be added.
- Use `// eslint-disable-next-line no-undef` instead of `/* globals ... */`
  for variables that are not meant to be used globally.
2017-07-09 16:35:48 +02:00
Yury Delendik
d028c26210 Removes error() 2017-07-07 09:40:24 -05:00
Jonas Jenwald
ea71d23f74 Fix a stupid spelling error in the ASCII85Decode name in Parser.makeInlineImage (issue 8613)
This is a trivial follow-up to PR 5383, and it's a bit strange that this has been wrong since late 2014 without anyone noticing (maybe because inline images aren't too common).
So, apparently code works better if you actually spell correctly, who knew ;-)

Fixes 8613.
2017-07-05 19:43:09 +02:00
Yury Delendik
b3bac5100c Merge pull request #8596 from mukulmishra18/proper-read-result
Fixes wrong structure of fullReader.read() result.
2017-07-05 09:03:57 -05:00
Jonas Jenwald
eff257b820 Merge pull request #8580 from brendandahl/missing-glyf
Fix how we detect and handle missing glyph data.
2017-07-04 12:16:07 +02:00
Brendan Dahl
9f5c1550ed Merge pull request #8592 from brendandahl/cmap-3-0
Only mask char codes of (3, 0) cmap tables in the range of 0xF000 to 0…
2017-07-03 17:58:28 -07:00
Brendan Dahl
efbbd8533f Only mask char codes of (3, 0) cmap tables in the range of 0xF000 to 0xF0FF. 2017-07-03 13:13:46 -07:00
Brendan Dahl
6d4f748fb1 Fix how we detect and handle missing glyph data. 2017-07-03 13:06:06 -07:00
Mukul Mishra
308a83e5ca Fixes wrong structure of fullReader.read() result. 2017-07-01 15:52:47 +05:30
Jonas Jenwald
de0e7a9a68 Check that the MessageHandler isn't already terminated in the onFailure handler in src/core/worker.js (issue 8584)
All other code-paths already checks that the `MessageHandler` isn't terminated, but apparently `onFailure` was missing that check (compare e.g. with the `onSuccess` function).
From what I can tell, this is only an issue if workers are *disabled*, hence why I didn't bother adding a unit-test.

Fixes 8584.
2017-06-30 10:11:13 +02:00
Brendan Dahl
a8a8909d2d Fix missing notdef in expert encoding. 2017-06-29 12:12:39 -07:00
Tim van der Meij
f1a87bab10
SVG: move setting the stroke properties to the stroke method
In general, we may not know the stroke properties when path construction
happens. Since we must know the properties when we apply the stroke, we
should set the properties at that point. Note that we already do that
for the color and opacity, but not yet for the other properties.
2017-06-25 22:28:46 +02:00
Jonas Jenwald
859e3d4dce Merge pull request #8564 from timvandermeij/svg-opacity
SVG: implement fill and stroke opacity
2017-06-24 22:42:18 +02:00
Brendan Dahl
f1f9d98519 Merge pull request #8507 from Snuffleupagus/issue-8480
Only special-case OpenType fonts with `CFF` data if it's both a composite (i.e. Type0) font and also has a non-default CID to GID map (issue 8480)
2017-06-23 13:36:58 -07:00
Yury Delendik
e2ca894fec Merge pull request #8488 from mukulmishra18/streams-getTextContent
Streams get text content
2017-06-23 12:52:13 -05:00
ghetolay
7f79e12383 Fix error when using custom CMapReaderFactory and worker 2017-06-23 15:59:43 +02:00
Tim van der Meij
f9eafefa09
SVG: implement stroke opacity 2017-06-23 00:37:27 +02:00
Tim van der Meij
440914e49b
SVG: implement fill opacity
This makes the `eoFill` method similar to the `eoStroke` method and the
ones in `src/display/canvas.js`.
2017-06-23 00:37:27 +02:00
Tim van der Meij
c6ee05f7e5 Merge pull request #8542 from Rob--W/svg-clipping
Move svg:clipPath generation from clip to endPath
2017-06-22 23:48:06 +02:00
Rob Wu
fc6448d18c Move svg:clipPath generation from clip to endPath
In the PDF from issue 8527, the clip operator (W) shows up before a path
is defined. The current SVG backend however expects a path to exist
before generating a `<svg:clipPath>` element.
In the example, the path was defined after the clip, followed by a
endPath operator (n).
So this commit fixes the bug by moving the path generation logic from
clip to endPath.

Our canvas backend appears to use similar logic:
`CanvasGraphics_endPath` calls `consumePath`, which in turn draws the
clip and resets the `pendingClip` state. The canvas backend calls
`consumePath` from multiple other places, so we probably need to check
whether doing so is also necessary for the SVG backend.

I scanned our corpus of PDF files in test/pdfs, and found that in every
instance (except for one), the "W" PDF operator (clip) is immediately
followed by "n" (endPath). The new test from this commit (clippath.pdf)
starts with "W", followed by a path definition and then "n".

    # Commands used to find some of the clipping commands:
    grep -ra '^W$' -C7 | less -S
    grep -ra '^W ' -C7 | less -S
    grep -ra ' W$' -C7 | less -S

test/pdfs/issue6413.pdf is the only file where "W" (a tline 55) is not
followed by "n". In fact, the "W" is the last operation of a series of
XObject painting operations, and removing it does not have any effect
on the rendered PDF (confirmed by looking at the output of PDF.js's
canvas backend, and ImageMagick's convert command).
2017-06-22 01:08:17 +02:00
Yury Delendik
679ffc84f6 Merge pull request #8544 from Rob--W/compatiblity-safari-strict-error
compatibility.js: Rename parameters in JURL
2017-06-19 10:34:33 -05:00
Rob Wu
f912f89e69 compatibility.js: Rename parameters in JURL 2017-06-19 17:03:41 +02:00
Jonas Jenwald
73234577e1 Rename map to _map inside of Dict, to make it clearer that it should be regarded as a "private" property 2017-06-17 17:32:00 +02:00
Mukul Mishra
0c13d0ff46 Adds Streams API in getTextContent to stream data.
This patch adds Streams API support in getTextContent
so that we can stream data in chunks instead of fetching
whole data from worker thread to main thread. This patch
supports Streams API without changing the core functionality
of getTextContent.

Enqueue textContent directly at getTextContent in partialEvaluator.

Adds desiredSize and ready property in streamSink.
2017-06-17 20:03:27 +05:30
Jonas Jenwald
3a20fd165f Refactor ObjectLoader to use Dicts correctly, rather than abusing their internal properties
The `ObjectLoader` currently takes an Object as input, despite actually working with `Dict`s internally. This means that at the (two) existing call-sites, we're passing in the "private" `Dict.map` property directly.

Doing this seems like an anti-pattern, and we could (and even should) simply provide the actual `Dict` when creating an `ObjectLoader` instance.
Accessing properties stored in the `Dict` is now done using the intended methods instead, in particular `getRaw` which (as the name suggests) doesn't do any de-referencing, thus maintaining the current functionality of the code.

The only functional change in this patch is that `ObjectLoader.load` will now ignore empty nodes, such that `ObjectLoader._walk` only needs to deal with nodes that are known to contain data. (This lets us skip, among other checks, meaningless `addChildren` function calls.)
2017-06-16 22:59:32 +02:00
Jonas Jenwald
f2fc9ee281 Slightly refactor and ES6-ify the code in ObjectLoader
This patch changes all `var` to `let`, and caches the array lengths in all loops. Also removes two unnecessary temporary variable assignments.
2017-06-16 22:59:32 +02:00
Yury Delendik
0c93dee0de Merge pull request #8515 from yurydelendik/bloborigin
Adds special case for origin of blob to the compatibility URL.
2017-06-16 11:21:45 -05:00
Yury Delendik
631e6bebff Fixes WeakMap polyfill (and improves PDFWorker port check). 2017-06-13 09:36:58 -05:00
Yury Delendik
b44848b918 Adds special case for origin of blob to the compatibility URL. 2017-06-13 08:19:46 -05:00
Yury Delendik
24f14d44cb Preventing from using the same canvas for multiple render() 2017-06-12 16:33:49 -05:00
Yury Delendik
db7a770542 Additional check in globalScope detections 2017-06-12 10:14:46 -05:00
Jonas Jenwald
1766fe8184 Merge pull request #8508 from yurydelendik/issue8246
Fixes duplicate creation of PDFWorker for the same port.
2017-06-10 14:56:05 +02:00
Yury Delendik
69c804a0f4 Fixes duplicate creation of PDFWorker for the same port. 2017-06-10 07:02:29 -05:00
Jonas Jenwald
e589834f13 Ensure that TilingPatterns have valid (non-zero) /BBox arrays (issue 8330)
Fixes 8330.
2017-06-09 21:41:48 +02:00
Jonas Jenwald
8b4a42e5b8 Only special-case OpenType fonts with CFF data if it's both a composite (i.e. Type0) font and also has a non-default CID to GID map (issue 8480)
*As mentioned the last time that I touched this particular part of the font code, I'm sincerely hope that this doesn't cause any regressions!*

However, the patch passes all tests added in PRs 5770, 6270, and 7904 (and obviously all other tests as well). Furthermore, I've manually checked all the issues/bugs referenced in those PRs without finding any issues.

Fixes 8480.
2017-06-09 21:15:39 +02:00
Jonas Jenwald
999e30723d Reduce the duplication slightly when detecting an OpenType font (in the Font constructor) 2017-06-09 18:26:57 +02:00
Mukul Mishra
bbd9968f76 Added sendWithStream method in MessageHandler.
Adds functionality to accept Queueing Strategy in
sendWithStream method. Using Queueing Strategy we
can control the data that is enqueued into the sink,
and hence regulated the flow of chunks from worker
to main thread.

Adds capability in pull and cancel methods.
Adds ready and desiredSize property in streamSink.

Adds unit test for ReadableStream and sendWithStream.
2017-06-07 21:05:27 +05:30
Tim van der Meij
a3fae906a6 Merge pull request #8474 from Snuffleupagus/ESLint-object-styles-src-core
Fix inconsistent spacing and trailing commas in objects in `src/core/` files, so we can enable the `comma-dangle` and `object-curly-spacing` ESLint rules later on
2017-06-03 22:22:00 +02:00
Jonas Jenwald
f20d2cd2ae Fix inconsistent spacing and trailing commas in objects in remaining src/ files, so we can enable the comma-dangle and object-curly-spacing ESLint rules later on
http://eslint.org/docs/rules/comma-dangle
http://eslint.org/docs/rules/object-curly-spacing

Given that we currently have quite inconsistent object formatting, fixing this in *one* big patch probably wouldn't be feasible (since I cannot imagine anyone wanting to review that); hence I've opted to try and do this piecewise instead.

Please note: This patch was created automatically, using the ESLint `--fix` command line option. In a couple of places this caused lines to become too long, and I've fixed those manually; please refer to the interdiff below for the only hand-edits in this patch.

```diff
diff --git a/src/display/canvas.js b/src/display/canvas.js
index 5739f6f2..4216b2d2 100644
--- a/src/display/canvas.js
+++ b/src/display/canvas.js
@@ -2071,7 +2071,7 @@ var CanvasGraphics = (function CanvasGraphicsClosure() {
       var map = [];
       for (var i = 0, ii = positions.length; i < ii; i += 2) {
         map.push({ transform: [scaleX, 0, 0, scaleY, positions[i],
-                 positions[i + 1]], x: 0, y: 0, w: width, h: height, });
+                   positions[i + 1]], x: 0, y: 0, w: width, h: height, });
       }
       this.paintInlineImageXObjectGroup(imgData, map);
     },
diff --git a/src/display/svg.js b/src/display/svg.js
index 9eb05dfa..2aa21482 100644
--- a/src/display/svg.js
+++ b/src/display/svg.js
@@ -458,7 +458,11 @@ SVGGraphics = (function SVGGraphicsClosure() {

       for (var x = 0; x < fnArrayLen; x++) {
         var fnId = fnArray[x];
-        opList.push({ 'fnId': fnId, 'fn': REVOPS[fnId], 'args': argsArray[x], });
+        opList.push({
+          'fnId': fnId,
+          'fn': REVOPS[fnId],
+          'args': argsArray[x],
+        });
       }
       return opListToTree(opList);
     },
```
2017-06-02 12:32:18 +02:00
Jonas Jenwald
a8c87f8019 Fix inconsistent spacing and trailing commas in objects in src/core/ files, so we can enable the comma-dangle and object-curly-spacing ESLint rules later on
*Unfortunately this patch is fairly big, even though it only covers the `src/core` folder, but splitting it even further seemed difficult.*

http://eslint.org/docs/rules/comma-dangle
http://eslint.org/docs/rules/object-curly-spacing

Given that we currently have quite inconsistent object formatting, fixing this in *one* big patch probably wouldn't be feasible (since I cannot imagine anyone wanting to review that); hence I've opted to try and do this piecewise instead.

Please note: This patch was created automatically, using the ESLint --fix command line option. In a couple of places this caused lines to become too long, and I've fixed those manually; please refer to the interdiff below for the only hand-edits in this patch.

```diff
diff --git a/src/core/evaluator.js b/src/core/evaluator.js
index abab9027..dcd3594b 100644
--- a/src/core/evaluator.js
+++ b/src/core/evaluator.js
@@ -2785,7 +2785,8 @@ var EvaluatorPreprocessor = (function EvaluatorPreprocessorClosure() {
     t['Tz'] = { id: OPS.setHScale, numArgs: 1, variableArgs: false, };
     t['TL'] = { id: OPS.setLeading, numArgs: 1, variableArgs: false, };
     t['Tf'] = { id: OPS.setFont, numArgs: 2, variableArgs: false, };
-    t['Tr'] = { id: OPS.setTextRenderingMode, numArgs: 1, variableArgs: false, };
+    t['Tr'] = { id: OPS.setTextRenderingMode, numArgs: 1,
+                variableArgs: false, };
     t['Ts'] = { id: OPS.setTextRise, numArgs: 1, variableArgs: false, };
     t['Td'] = { id: OPS.moveText, numArgs: 2, variableArgs: false, };
     t['TD'] = { id: OPS.setLeadingMoveText, numArgs: 2, variableArgs: false, };
diff --git a/src/core/jbig2.js b/src/core/jbig2.js
index 5a17d482..71671541 100644
--- a/src/core/jbig2.js
+++ b/src/core/jbig2.js
@@ -123,19 +123,22 @@ var Jbig2Image = (function Jbig2ImageClosure() {
      { x: -1, y: -1, }, { x: 0, y: -1, }, { x: 1, y: -1, }, { x: -2, y: 0, },
      { x: -1, y: 0, }],
     [{ x: -3, y: -1, }, { x: -2, y: -1, }, { x: -1, y: -1, }, { x: 0, y: -1, },
-     { x: 1, y: -1, }, { x: -4, y: 0, }, { x: -3, y: 0, }, { x: -2, y: 0, }, { x: -1, y: 0, }]
+     { x: 1, y: -1, }, { x: -4, y: 0, }, { x: -3, y: 0, }, { x: -2, y: 0, },
+     { x: -1, y: 0, }]
   ];

   var RefinementTemplates = [
     {
       coding: [{ x: 0, y: -1, }, { x: 1, y: -1, }, { x: -1, y: 0, }],
-      reference: [{ x: 0, y: -1, }, { x: 1, y: -1, }, { x: -1, y: 0, }, { x: 0, y: 0, },
-                  { x: 1, y: 0, }, { x: -1, y: 1, }, { x: 0, y: 1, }, { x: 1, y: 1, }],
+      reference: [{ x: 0, y: -1, }, { x: 1, y: -1, }, { x: -1, y: 0, },
+                  { x: 0, y: 0, }, { x: 1, y: 0, }, { x: -1, y: 1, },
+                  { x: 0, y: 1, }, { x: 1, y: 1, }],
     },
     {
-      coding: [{ x: -1, y: -1, }, { x: 0, y: -1, }, { x: 1, y: -1, }, { x: -1, y: 0, }],
-      reference: [{ x: 0, y: -1, }, { x: -1, y: 0, }, { x: 0, y: 0, }, { x: 1, y: 0, },
-                  { x: 0, y: 1, }, { x: 1, y: 1, }],
+      coding: [{ x: -1, y: -1, }, { x: 0, y: -1, }, { x: 1, y: -1, },
+               { x: -1, y: 0, }],
+      reference: [{ x: 0, y: -1, }, { x: -1, y: 0, }, { x: 0, y: 0, },
+                  { x: 1, y: 0, }, { x: 0, y: 1, }, { x: 1, y: 1, }],
     }
   ];
```
2017-06-02 11:20:19 +02:00
Yury Delendik
bd288df909 Merge pull request #8396 from mukulmishra18/streams-lib
Adds streams-lib polyfill and exports ReadableStream from shared/util.
2017-05-31 08:42:48 -05:00
Yury Delendik
66c8893815 Removes last UMDs from the modules. 2017-05-31 07:14:17 -05:00
Jonas Jenwald
982b6aa65b Convert the files in the /src/core folder to ES6 modules
Please note that the `glyphlist.js` and `unicode.js` files are converted to CommonJS modules instead, since Babel cannot handle files that large and they are thus excluded from transpilation.
2017-05-30 22:06:21 +02:00
Yury Delendik
b66b705ed7 Using pre-built code for testing. 2017-05-30 22:06:21 +02:00
Jonas Jenwald
4ce5e520fb Add different code-paths to {CMap, ToUnicodeMap}.charCodeOf depending on length, since Array.prototype.indexOf can be extremely inefficient for very large arrays (issue 8372)
Fixes 8372.
2017-05-24 19:47:04 +02:00
Jonas Jenwald
ac942ac657 Merge pull request #8437 from yurydelendik/default-ctx
Resets canvas 2d context to the default state.
2017-05-23 23:31:57 +02:00
Yury Delendik
a67198895f Resets canvas 2d context to the default state. 2017-05-23 15:10:30 -05:00
Jonas Jenwald
31c24ed631 Don't map glyphs to the HANGUL FILLER (0x3164) Unicode location (issue 8424)
*This patch follows a similar pattern as previous ones, by skipping certain problematic Unicode locations.*

According to http://searchfox.org/mozilla-central/rev/6c2dbacbba1d58b8679cee700fd0a54189e0cf1b/gfx/harfbuzz/src/hb-unicode-private.hh#136, it seems that the HANGUL FILLER (0x3164) location is "special".

Fixes 8424.
2017-05-23 16:12:45 +02:00
Jonas Jenwald
0ddf52aca5 Remove the special handling for nameddests that look like standard pageNumbers
PR 7341 added special handling for `nameddest`s that look like pageNumbers, to prevent issues since we previously *incorrectly* supported specifying a pageNumber directly in the hash; i.e. `#10` versus the correct `#page=10` format.

Since this behaviour wasn't correct, PR 7757 fixed and deprecated the old format, which means that we no longer need to maintain the `nameddest` hack in multiple files.
2017-05-20 11:29:29 +02:00
Mukul Mishra
c9f44f30e5 Adds streams-lib polyfill and exports ReadableStream from shared/util.
Added test for ReadableStream.

Adds ref-implementation license-header in streams-lib
and change gulp task to copy external/streams/ in build/
external/streams/ and build/dist/external/streams folder.

Adds README.md and LICENSE.md
2017-05-20 00:26:34 +05:30
Yury Delendik
5dc8dcdc0f Merge pull request #8388 from Snuffleupagus/issue-8380
Cache JPEG images, just as we do for other image formats, in `evaluator.js` (issue 8380)
2017-05-17 17:25:51 -05:00
chris.greening
cfc2f36f5c Adds additional parameter so background color of canvas can be set 2017-05-17 17:06:44 +01:00
Jonas Jenwald
bbe8c3d8ed Enable running a subset of the API unit-tests on Travis
Notably, this patch skips all canvas rendering tests in Node.js.
2017-05-12 11:48:27 +02:00
Jonas Jenwald
c5f73edcd2 Convert the DOMCanvasFactory to an ES6 class
For consistency, also updates the `pdf2png.js` example to use the slightly less verbose `canvasAndContext` parameter name.
2017-05-11 20:15:22 +02:00
Jonas Jenwald
32baa6af7a Convert the DOMCMapReaderFactory to an ES6 class
Given that we only create *one* instance of this class per `getDocument` call, this shouldn't matter performance wise.
2017-05-11 20:15:19 +02:00
巴里切罗
8d5d97264e fix(svg) adjust strategy for decoding JPEG images 2017-05-08 11:32:44 +08:00
Jonas Jenwald
0c2ebda31c Cache JPEG images, just as we do for other image formats, in evaluator.js (issue 8380)
For some reason, we're putting all kind of images *except* JPEG into the `imageCache` in `evaluator.js`.[1]
This means that in the PDF file in issue 8380, we'll keep sending the *same* two small images[2] to the main-thread and decoding them over and over. This is obviously hugely inefficient!

As can be seen from the discussion in the issue, the performance becomes *extremely* bad if the user has the addon "Adblock Plus" installed. However, even in a clean Firefox profile, the performance isn't that great.

This patch not only addresses the performance implications of the "Adblock Plus" addon together with that particular PDF file, but it *also* improves the rendering times considerably for *all* users.
Locally, with a clean profile, the rendering times are reduced from `~2000 ms` to `~500 ms` for my setup!

Obviously, the general structure of the PDF file and its operator sequence is still hugely inefficient, however I'd say that the performance with this patch is good enough to consider the issue (as it stands) resolved.[3]

Fixes 8380.

---
[1] Not technically true, since inline images are cached from `parser.js`, but whatever :-)

[2] The two JPEG images have dimensions 1x2, respectively 4x2.

[3] To make this even more efficient, a new state would have to be added to the `QueueOptimizer`. Given that PDF files this stupid fortunately aren't too common, I'm not convinced that it's worth doing.
2017-05-07 13:07:41 +02:00
Jonas Jenwald
50d026fbda Merge pull request #8385 from tobytailor/master
Fix typo in LoopbackPort export
2017-05-06 10:35:49 +02:00
Tobias Schneider
e1a3e46cba Fix typo 2017-05-05 19:10:00 -07:00
Yury Delendik
c3cfcbe72f Merge pull request #8340 from ydfzgyj/fix-svg-spacing
Fix char spacing bug in SVG mode
2017-05-05 07:41:25 -05:00
巴里切罗
d58040aa29 fix(svg) char spacing bug 2017-05-05 12:11:20 +08:00
Jonas Jenwald
60b14f526e Convert the files in the /src/shared folder to ES6 modules 2017-05-04 21:07:59 +02:00
Yury Delendik
3adda80f97 Merge pull request #8358 from Snuffleupagus/PartialEvaluator-method-signatures
Change the signatures of the `PartialEvaluator` "constructor" and its `getOperatorList`/`getTextContent` methods to take parameter objects
2017-05-04 08:10:30 -05:00
Jonas Jenwald
6c81b8e6dd Replace unnecessary bind(this) and var self = this statements with arrow functions in remaining src/ files 2017-05-03 23:12:35 +02:00
Yury Delendik
74ba3033e8 Merge pull request #8359 from Snuffleupagus/Lexer-getNumber-ignore-line-breaks
Ignore line-breaks between operator and digit in `Lexer.getNumber`
2017-05-03 09:43:59 -05:00
Jonas Jenwald
3e20d30afc Change the signatures of the PartialEvaluator "constructor" and its getOperatorList/getTextContent methods to take parameter objects
Currently these methods accept a large number of parameters, which creates quite unwieldy call-sites. When invoking them, you have to remember not only what arguments to supply, but also the correct order, to avoid runtime errors.
Furthermore, since some of the parameters are optional, you also have to remember to pass e.g. `null` or `undefined` for those ones.
Also, adding new parameters to these methods (which happens occasionally), often becomes unnecessarily tedious (based on personal experience).

Please note that I do *not* think that we need/should convert *every* single method in `evaluator.js` (or elsewhere in `/core` files) to take parameter objects. However, in my opinion, once a method starts relying on approximately five parameter (or even more), passing them in individually becomes quite cumbersome.

With these changes, I obviously needed to update the `evaluator_spec.js` unit-tests. The main change there, except the new method signatures[1], is that it's now re-using *one* `PartialEvalutor` instance, since I couldn't see any compelling reason for creating a new one in every single test.

*Note:* If this patch is accepted, my intention is to (time permitting) see if it makes sense to convert additional methods in `evaluator.js` (and other `/core` files) in a similar fashion, but I figured that it'd be a good idea to limit the initial scope somewhat.

---

[1] A fun fact here, note how the `PartialEvaluator` signature used in `evaluator_spec.js` wasn't even correct in the current `master`.
2017-05-03 12:10:20 +02:00
Yury Delendik
84f174bb2f Merge pull request #8363 from yurydelendik/worker-fromport
Adds initializeFromPort to the WorkerMessageHandler.
2017-05-02 19:22:20 -05:00
Yury Delendik
008aa56ac6 Adds initializeFromPort to the WorkerMessageHandler. 2017-05-02 16:11:54 -05:00
Jonas Jenwald
40feca12c1 Ignore line-breaks between operator and digit in Lexer.getNumber
This is consistent with the behaviour in Adobe Reader (and PDFium), and it fixes the display of page 30 in https://bug1354114.bmoattachments.org/attachment.cgi?id=8855457 (taken from https://bugzilla.mozilla.org/show_bug.cgi?id=1354114).

The patch also makes the `error` message for invalid numbers slightly more useful, by including the charCode as well. (Having that information available would have reduced the time spent on debugging the PDF file above.)
2017-05-02 20:59:42 +02:00
Tobias Schneider
a80c405941 Rename FakeWorkerPort to LoopbackPort and export it 2017-05-02 11:33:19 -07:00
Jonas Jenwald
ebaa22478c Replace unnecessary bind(this) and var self = this statements with arrow functions in remaining src/core/ files 2017-05-02 15:47:43 +02:00
Jonas Jenwald
95bbc8101c Replace unnecessary bind(this) and var self = this statements with arrow functions in src/core/evaluator.js
Note that by using `let` instead of `var` in `PartialEvaluator.setGState` and `TranslatedFont.loadType3Data`, we can get rid of further `bind` usages since `let` is block-scoped.
Also, the fact that `bind` wasn't used in the `Font` case inside of `setGState` is actually a bug which has been present ever since PR 5205, where a closure was replaced by a standard loop.[1]

---
[1] I'm not aware of any bugs caused by this, but that is probably more a happy accident than anything else, since e.g. just removing the `bind` from the `SMask` case without using block-scoped variables causes test failures.
2017-05-01 20:29:44 +02:00
Tim van der Meij
06c93d8fbd Merge pull request #8342 from Snuffleupagus/eslint_object-shorthand-src-core
Enable the `object-shorthand` ESLint rule in `src/core`
2017-04-29 23:59:20 +02:00
Jonas Jenwald
165294a05f Merge pull request #8335 from Snuffleupagus/jbig2-decodeRefinement-subtract-offsets
Subtract the X/Y offsets when decoding refinement regions of JBIG2 images (issue 7145, 7308, 7401, 7850, 8270)
2017-04-28 23:13:07 +02:00
Yury Delendik
71bbcfad8a Merge pull request #8309 from vologab/feature/allow_query_string_for_pdfjs
Allow use versions for pdf.js script (i.e. - pdf.js?2412313)
2017-04-27 19:00:43 -05:00
Jonas Jenwald
ee09336f32 Restore the URL.createObjectURL check to the createObjectURL utility function (issue 8344)
This is a regression from commit 3888a993b1.

It turns out the even though we have a `URL` polyfill, it's still dependent on the existence of native `URL.{createObjectURL, revokeObjectURL}` functions.
Since no such thing exists in Node.js, our `createObjectURL` utility function breaks there.
2017-04-27 20:57:15 +02:00
Jonas Jenwald
afc74b0178 Enable the object-shorthand ESLint rule in src/shared
Please see http://eslint.org/docs/rules/object-shorthand.

For the most part, these changes are of the search-and-replace kind, and the previously enabled `no-undef` rule should complement the tests in helping ensure that no stupid errors crept into to the patch.
2017-04-27 17:29:40 +02:00
Jani Pehkonen
64deb6c700 Subtract the X/Y offsets when decoding refinement regions of JBIG2 images (issue 7145, 7308, 7401, 7850, 8270)
Please refer to the JBIG2 standard, see https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-T.88-200002-I!!PDF-E&type=items.
In particular, section "6.3.5.3 Fixed templates and adaptive templates" mentions that the offsets should be *subtracted*; where the offsets are defined according to "Table 6" under section "6.3.2 Input parameters".

Fixes 7145.
Fixes 7308.
Fixes 7401.
Fixes 7850.
Fixes 8270.
2017-04-26 16:06:15 +02:00
Jonas Jenwald
07b5574006 Enable the object-shorthand ESLint rule in src/display
Please see http://eslint.org/docs/rules/object-shorthand.

For the most part, these changes are of the search-and-replace kind, and the previously enabled `no-undef` rule should complement the tests in helping ensure that no stupid errors crept into to the patch.
2017-04-25 16:17:18 +02:00
Jonas Jenwald
7bee0c2aa3 Enable the object-shorthand ESLint rule in src/shared
Please see http://eslint.org/docs/rules/object-shorthand.

For the most part, these changes are of the search-and-replace kind, and the previously enabled `no-undef` rule should complement the tests in helping ensure that no stupid errors crept into to the patch.
2017-04-25 16:07:59 +02:00
Jonas Jenwald
3888a993b1 Remove the URL checks in the createObjectURL utility function, since the URL polyfill have made them redundant
Also, this changes `createBlob` to throw when `Blob` isn't supported.
2017-04-20 10:16:06 +02:00
Vladimir Bloshchitsyn
319ea0f350 Allow use versions for pdf.js script (i.e. - pdf.js?2412313) 2017-04-18 15:07:09 +03:00
Jonas Jenwald
52e3de3c0a Convert the files in the /src/display folder to ES6 modules
Also disables ES2015 transpilation in development mode.
2017-04-16 12:19:10 +02:00
Jonas Jenwald
fd51a7cb8c Merge pull request #8287 from yurydelendik/babel-es2015-preset
Allow to convert (some of) ES6 code to ES5.
2017-04-14 21:47:45 +02:00
Yury Delendik
5855c0a8be Allow to convert (some of) ES6 code to ES5. 2017-04-14 14:39:25 -05:00
Yury Delendik
30bee9fe0c Moves Uint32ArrayView and hasCanvasTypedArrays into compatibility.js. 2017-04-14 10:04:52 -05:00
Yury Delendik
c4c44c1bbe Merge pull request #8240 from Snuffleupagus/api-stopAtErrors
[api-minor] Always allow e.g. rendering to continue even if there are errors, and add a `stopAtErrors` parameter to `getDocument` to opt-out of this behaviour (issue 6342, issue 3795, bug 1130815)
2017-04-13 10:58:49 -05:00
Yury Delendik
46646a9dd1 Merge pull request #8131 from timvandermeij/remove-umd-validation
ES6 modules: remove UMD header validation
2017-04-13 10:49:41 -05:00
Yury Delendik
ad1023ff55 Merge pull request #8262 from tcorral/master
Fix worker resolution on using minified version
2017-04-13 10:41:36 -05:00
Tim van der Meij
32e01cda96 Merge pull request #8228 from timvandermeij/line-annotations
Implement support for line annotations
2017-04-13 00:18:31 +02:00
Tim van der Meij
e15a2ec523
Annotations: implement support for line annotations
This patch implements support for line annotations. Other viewers only
show the popup annotation when hovering over the line, which may have
any orientation. To make this possible, we render an invisible line (SVG
element) over the line on the canvas that acts as the trigger for the
popup annotation. This invisible line has the same starting coordinates,
ending coordinates and width of the line on the canvas.
2017-04-12 23:05:25 +02:00
Jonas Jenwald
3a302fdb53 Correctly detect if requestAnimationFrame is supported in compatibility.js (issue 8272)
This is a regression from PR 8222.

Fixes 8272.
2017-04-11 17:01:35 +02:00
Jonas Jenwald
fbe7b2eee7 Always ignore Type3 glyphs if their OperatorLists contain errors, regardless of the value of the stopAtErrors option
Compared to the parsing of e.g. an entire page, it doesn't really make sense to only be able to render a Type3 glyph partially.
2017-04-11 08:59:22 +02:00
Jonas Jenwald
a39d636eb8 [api-minor] Always allow e.g. rendering to continue even if there are errors, and add a stopAtErrors parameter to getDocument to opt-out of this behaviour (issue 6342, issue 3795, bug 1130815)
Other PDF readers, e.g. Adobe Reader and PDFium (in Chrome), will attempt to render as much of a page as possible even if there are errors present.
Currently we just bail as soon the first error is hit, which means that we'll usually not render anything in these cases and just display a blank page instead.

NOTE: This patch changes the default behaviour of the PDF.js API to always attempt to recover as much data as possible, even when encountering errors during e.g. `getOperatorList`/`getTextContent`, which thus improve our handling of corrupt PDF files and allow the default viewer to handle errors slightly more gracefully.
In the event that an API consumer wishes to use the old behaviour, where we stop parsing as soon as an error is encountered, the `stopAtErrors` parameter can be set at `getDocument`.

Fixes, inasmuch it's possible since the PDF files are corrupt, e.g. issue 6342, issue 3795, and [bug 1130815](https://bugzilla.mozilla.org/show_bug.cgi?id=1130815) (and probably others too).
2017-04-11 08:59:22 +02:00
Jonas Jenwald
10e5f766a2 Merge pull request #8266 from brendandahl/issue6652
Normalize blend mode names.
2017-04-11 08:54:42 +02:00
Brendan Dahl
4969b2ad97 Normalize blend mode names. 2017-04-10 16:18:08 -07:00
TCASAS
010d38a8c0 Fix worker resolution on using minified version
- When the minified version is used the resolver of the worker can not find it properly and throws 404 error.
- The problem was that:
  - It was getting the current name of the file.
  - It was replacing **.js** by **.worker.js**
- When it was loading the unminified version it was working fine because:
  - *pdf.js - .js + .worker.js*  = **pdf.worker.js**
- When it was loading the minified version it didtn't work because:
  - *pdf.min.js - .js + .worker.js* = **pdf.min.worker.js**
  - **pdf.min.worker.js** doesn't exist the real file name is **pdf.worker.min.js**
2017-04-10 13:41:50 +02:00
Tim van der Meij
30d63b0c50
Annotations: move container border removal to the display layer
The display layer is responsible for creating the HTML elements for the
annotations from the core layer. If we need to ignore border styling for
the containers of certain elements, the display layer should do so and
not the core layer. I noticed this during the implementation of line
annotations, for which we actually need the original border width in the
display layer, even though we ignore it for the container. If we set the
border style to zero in the core layer, this becomes impossible.

To prevent this, this patch moves the container border removal code from
the core layer to the display layer. This makes the core layer output
the unchanged annotation data and lets the display layer remove any
border styling if necessary.
2017-04-09 19:01:38 +02:00
Jonas Jenwald
f41d80bdd3 Enable the prefer-promise-reject-errors ESLint rule
See http://eslint.org/docs/rules/prefer-promise-reject-errors, note that this is similar to the already used  `no-throw-literal` rule.
2017-04-08 11:47:22 +02:00
Brendan Dahl
cdc79a4721 Don’t skip glyph 0 in cmap. 2017-04-05 15:17:38 -07:00
Yury Delendik
b665b0319a Merge pull request #8222 from tjgrathwell/ios-fake-cancel-animation-frame
ios: Patch cancelAnimationFrame whenever fakeRequestAnimationFrame is used
2017-04-04 10:29:04 -05:00
Yury Delendik
31f8875614 Merge pull request #8157 from Snuffleupagus/api-RenderTask-cancel-Error
[api-minor] Reject the `RenderTask` with an actual `Error`, instead of just a `string`, when rendering is cancelled
2017-04-04 09:38:47 -05:00
Travis Grathwell
bd70a73d43 ios: Patch cancelAnimationFrame whenever fakeRequestAnimationFrame is used
The existing implementation of fakeRequestAnimationFrame
did not return a timer ID, so the frame could not be cancelled
if you wanted to cancel it. But if you do want to cancel it,
it needs to be cancelled through clearTimeout instead of
cancelAnimationFrame, because the timer IDs are different.

Signed-off-by: Jonathan Barnes <jbarnes@pivotal.io>
2017-03-31 15:31:04 -07:00
Tim van der Meij
8cee63df5d Merge pull request #8205 from Snuffleupagus/built-in-CMap-errors
Improve the error handling when loading of built-in CMap files fail (PR 8064 follow-up)
2017-03-30 23:01:13 +02:00
Jonas Jenwald
437104969d Improve the error handling when loading of built-in CMap files fail (PR 8064 follow-up)
I happened to notice that the error handling wasn't that great, which I missed previously since there were no unit-tests for failure to load built-in CMap files.
Hence this patch, which improves the error handling *and* adds tests.
2017-03-29 22:38:29 +02:00
Jonas Jenwald
61ee0de29f Use a simple RefSetCache to significantly improve the performance of Catalog.getPageDict for certain long documents (PR 8105 follow-up)
I found that PR 8105 unfortunately causes a *very serious* performance regression in long PDF documents where the `Pages` tree only has one level; my apologies for this!

Obviously we cannot revert that PR, since that would cause more issues than it solves. Hence it seems to me that the only viable solution here, is to add a simple `RefSetCache` to reduce the amount of redundant lookups.
Previously in PR 8105 caching was thought to be unnecessary, but as it turns out I don't think that we really have a choice in the matter any more.
2017-03-28 21:39:55 +02:00
Jonas Jenwald
62eee8c782 Try harder to find the next valid JPEG marker when decoding Scan data (issue 8182, issue 8189)
Tentatively fixes 8182 and fixes 8189.
2017-03-27 15:55:21 +02:00
Jonas Jenwald
e229c21ce1 Remove unnecessary xref parameters from various method signatures in PartialEvaluator, since this.xref is already available in the relevant scope
For reasons I don't pretend to understand, we're passing around `xref` arguments to a bunch of methods despite `this.xref` being available in `PartialEvaluator`.

This patch is a small first small step towards cleaning up the, often unwieldy, signatures of methods in `PartialEvaluator`.
2017-03-26 14:12:53 +02:00
Jonas Jenwald
e40fd63bd3 In src/core/evaluator.js, convert a couple of if (!someVariable) { error(...); } instances to assert(someVariable); instead
Rather than, in a number of places, basically duplicating the logic of `assert` we can simply utilize the function directly instead.
2017-03-26 13:53:13 +02:00
Jonas Jenwald
5c0c122a7d Ensure that the XMLHttpRequest is opened before attempting to set the responseType in the DOMCMapReaderFactory, since IE fails otherwise (issue 8193)
I really cannot understand why this change is necessary, since modern browsers such as Firefox and Chrome work just fine with the old code.
Hence this is patch is yet another "hack" that's needed just because IE apparently cannot just work like you'd expect.

For consistency, the Node factory used in the CMap unit-tests is changed as well.

Fixes 8193.
2017-03-25 17:44:48 +01:00
Jonas Jenwald
3705e5e459 Use a proper MessageHandler for PartialEvaluator.getTextContent to avoid errors for fonts relying on built-in CMap files (PR 8064 follow-up)
*My apologies for inadvertently breaking this in PR 8064; apparently we don't have any tests that cover this use-case :(*

Without this patch `getTextContent` will fail if called before `getOperatorList`, since loading of fonts during text-extraction may require fetching of built-in CMap files.

*Please note:* The `text` test added here, which uses an already existing PDF file, fails without this patch.
2017-03-24 17:39:33 +01:00
Rob Wu
49af56f730 Rethrow MissingDataException when needed
In core/document.js: `PDFDocument.prototype.parse` accesses a dictionary
property, which could throw if the underlying data is not yet available.

In core/obj.js: `get Catalog.prototype.metadata` calls
`stream.getBytes`, which can throw MissingDataException too when the
stream is a ChunkedStream.
2017-03-22 14:55:59 +01:00
Jonas Jenwald
8527d27eae Ensure that PDFDocument.documentInfo doesn't fail during document load, when the entire XRef table hasn't been fetched yet (issue 8180)
Similar to other `try-catch` statements in `/core` code, we must re-throw `MissingDataException` to prevent issues with missing data during document loading.
Note that I'm not sure if/how we can test this, which is why the patch doesn't include any test(s).

Fixes 8180.
2017-03-22 14:14:38 +01:00
Jonas Jenwald
e2e13df4a5 Merge pull request #8164 from Snuffleupagus/issue-7828
Don't read past the EOI marker for JPEG images with non-default restart interval (issue 7828)
2017-03-20 22:17:28 +01:00
Jonas Jenwald
d6d0f778aa Don't read past the EOI marker for JPEG images with non-default restart interval (issue 7828)
*After browsing through (a version of) the JPEG specification, see https://www.w3.org/Graphics/JPEG/itu-t81.pdf, I hope that this patch makes sense.*

Note that while issue 7828 became a problem after PR 7661, it isn't really a regression from than PR. The explanation is rather that we're now relying on `core/jpg.js` instead of the Native Image decoder in more situations than before, which thus exposed an *existing* issue in our JPEG decoder.
Another factor also seems to be that in many JPEG images, the DRI (Define Restart Interval) marker isn't present, in which case this bug won't manifest either.

According to https://www.w3.org/Graphics/JPEG/itu-t81.pdf#page=89 (at the bottom of the page):
"NOTE – The final restart interval may be smaller than the size specified by the DRI marker segment, as it includes only the number of MCUs remaining in the scan."
Furthermore, according to https://www.w3.org/Graphics/JPEG/itu-t81.pdf#page=39 (in the middle of the page):
"[...] If restart is enabled and the restart interval is defined to be Ri, each entropy-coded segment except the last one shall contain Ri MCUs. The last one shall contain whatever number of MCUs completes the scan."

Based on the above, it thus seem to me that we should simply ensure that we're not attempting to continue to parse Scan data once we've found all MCUs (Minimum Coded Unit) of the image.

Fixes 7828.
2017-03-20 17:16:33 +01:00
Jonas Jenwald
be1a6f294f Try to recover when encountering JPEG markers with too short marker lengths (issue 8169)
The issue with the JPEG image in question, is that the COM (Comment) marker has an incorrect length entry.

Fixes 8169.
2017-03-20 17:05:51 +01:00
Jonas Jenwald
a7c19d9cbb Adjust the yoda ESLint rule to apply to inequalities as well
I happened to notice that some inequalities had the wrong order, and was surprised since I thought that the `yoda` rule should have caught that.
However, reading http://eslint.org/docs/rules/yoda#options a bit more closely than previously, it's quite obvious that the `onlyEquality` option does *exactly* what its name suggests. Hence I think that it makes sense to adjust the options such that only ranges are allowed instead.
2017-03-19 13:27:14 +01:00
Jonas Jenwald
098a56270d Normalize the BBox entry in Tiling Pattern dictionaries (issue 8117)
According to the PDF specification, see http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G7.3982967, the `BBox` entry should have the form `[left, bottom, right, top]`.
Since some PDF generators apparently violates the specification, we normalize the `BBox` to ensure that the pattern is (correctly) rendered.

Fixes 8117.
2017-03-16 21:43:11 +01:00
Jonas Jenwald
d37d271afa [api-minor] Reject the RenderTask with an actual Error, instead of just a string, when rendering is cancelled
This patch gets rid of the only case in the code-base where we're throwing a plain `string`, rather than an `Error`, which besides better/more consistent error handling also allows us to enable the [`no-throw-literal`](http://eslint.org/docs/rules/no-throw-literal) ESLint rule.
2017-03-13 18:58:21 +01:00
Jonas Jenwald
6d672c4ba6 [api-minor] Add a pdfjsNext parameter, and PDFJS_NEXT build flag, to allow backwards incompatible API changes 2017-03-13 18:43:43 +01:00
Jonas Jenwald
224613a511 Merge pull request #8135 from jasonjensen/issue8097
Handle cff fonts with erroneous stackSize (issue 8097)
2017-03-11 09:55:00 +01:00
Tim van der Meij
fc5810c97a Merge pull request #8144 from timvandermeij/issue-8143
Widget annotations: do not crash if `Parent` is not a dictionary during field name construction (issue 8143)
2017-03-10 00:40:13 +01:00
Tim van der Meij
936d3c0698
Widget annotations: do not crash if Parent is not a dictionary
during field name construction (issue 8143)
2017-03-09 23:51:52 +01:00
Jason O. Jensen
d230784ac3 Handle cff fonts with erroneous stackSize 2017-03-06 19:28:46 -05:00
Tim van der Meij
5eb090f288
ES6 modules: remove UMD header validation
This patch is another step towards enabling Babel. Since we're moving
towards ES6 modules, we will not be using UMD headers anymore, so we can
remove the validation.
2017-03-04 21:43:25 +01:00
Tim van der Meij
4e3e97be8e Merge pull request #8129 from Snuffleupagus/getInheritedPageProp-undefined
Return `undefined` instead of `Dict.empty` from `Page.getInheritedPageProp` for non-existent properties to prevent possible future bugs
2017-03-04 16:18:24 +01:00
Yury Delendik
c290561488 Merge pull request #8120 from yurydelendik/lib
Publishes processed sources into pdfjs-dist/lib
2017-03-04 08:48:36 -06:00
Jonas Jenwald
9bed87f5dc Return undefined instead of Dict.empty from Page.getInheritedPageProp for non-existent properties to prevent possible future bugs
*This is something that I noticed while working on PR 8126, which is (more) fallout from PR 6065.*

In general, it's actually *not* correct to return `Dict.empty` as the default value for non-existent properties. Please note that a prior PR, see https://github.com/mozilla/pdf.js/pull/5957#issuecomment-103112698, asked for that behaviour but I don't think that's right.

Obviously for properties that are (or should) be `Dict`s it makes sense, however certain properties can be e.g. Strings or Arrays instead. In the latter case, returning `Dict.empty` is just plain wrong, and it's quite fascinating that this hasn't caused any errors in practice. (The existing validation in the various getters has actually saved us here.)

Also, when looking at this code again, it seemed unnecessary to duplicate the `MAX_LOOP_COUNT` check since we could just return immediately instead.
2017-03-04 13:08:39 +01:00
Tim van der Meij
1eb96d7ca9 Merge pull request #8128 from timvandermeij/csp-headers
Network: use the current location to prevent errors when using CSP headers
2017-03-04 00:01:50 +01:00
Yury Delendik
e7cc07cc11 Moves checkProblematicCharRanges to font_spec.js 2017-03-03 16:33:35 -06:00
Job van der Weiden
a05115d2ec
Network: use the current location to prevent errors when using CSP headers
When using content security headers to restrict connections to the same origin,
you may not make connections to `example.com`. This feature detection also
works with a request to the current location.
2017-03-03 23:18:51 +01:00
Jonas Jenwald
4a0ff5dbf7 Ensure that we don't ignore 0 values in Page.getInheritedPageProp (issue 8125)
It appears that I accidentally broke this in PR 6065, sorry about that!

The issue in this particular PDF file is that there's `/Rotate` entries on different levels of the `/Pages` tree. We're supposed to use the `/Rotate` entry in the `/Page` dict (which is `0`), but because of an incorrect condition we instead ended up with the one from the `/Pages` dict (which is `180`).

Fixes 8125.
2017-03-03 12:27:40 +01:00
Jonas Jenwald
9163a6fba4 Merge pull request #8112 from Snuffleupagus/JS-action-newWindow
Support the `newWindow` flag in white-listed `app.launchURL` JavaScript actions (PR 7794 follow-up)
2017-03-01 21:24:34 +01:00
Tim van der Meij
25f772a255 Merge pull request #8050 from yurydelendik/systemjs
Replaces RequireJS to SystemJS.
2017-02-27 23:31:41 +01:00
Tim van der Meij
4e201d3787 Merge pull request #8072 from timvandermeij/annotation-append-operator-list
Annotations: move operator list addition logic to `src/core/document.js`
2017-02-27 22:50:57 +01:00
Tim van der Meij
0739f90707
Annotations: move operator list addition logic to src/core/document.js
Ideally, the `Annotation` class should not have anything to do with the
page's operator list. How annotations are added to the page's operator
list is logic that belongs in `src/core/document.js` instead where the
operator list is constructed.

Moreover, some comments have been added to clarify the intent of the
code.
2017-02-27 22:17:49 +01:00
Tim van der Meij
9db4240b85 Merge pull request #8110 from timvandermeij/interactive-forms-choice-inherit-options
Interactive forms: make choice widget options inheritable (issue 8094)
2017-02-27 22:14:25 +01:00
Jonas Jenwald
2a7e5b8a54 Support the newWindow flag in white-listed app.launchURL JavaScript actions (PR 7794 follow-up)
A simple follow-up to PR 7794, which let's us add support for the `newWindow` parameter; refer to https://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/js_api_reference.pdf#G5.1507380.

The patch also fixes an embarrassing oversight regarding the placement of the case-insensitive flag, and also allows arbitrary white-space at the beginning of JS actions.
2017-02-27 15:58:28 +01:00
Yury Delendik
5b50e0d414 Replaces RequireJS to SystemJS. 2017-02-27 08:32:39 -06:00
Tim van der Meij
8990de8614
Interactive forms: make choice widget options inheritable (issue 8094)
Even though the PDF specification does not state that `Opt` fields are
inheritable, in practice there are PDF generators that let annotations
inherit the options from a parent.
2017-02-25 23:34:26 +01:00
Jonas Jenwald
14cc6acb90 Ensure that Dicts found in Object Streams are assigned an objId in XRef.fetch
This fixes something that I noticed while working with the code in `Catalog.getPageDict` when debugging issue 8088.

Note that while I don't have an example where this patch really matters, given that e.g. `PartialEvaluator.hasBlendModes` depends on the `objId` to avoid cyclic references this patch could potentially help for some PDF files.
2017-02-25 10:20:19 +01:00
Tim van der Meij
752510ffa0 Merge pull request #8107 from yurydelendik/init-via-port
Init PDFWorker via MesssagePort.
2017-02-25 00:13:33 +01:00
Tim van der Meij
59392fd544 Merge pull request #8102 from yurydelendik/mv-compatibilty
Move compatibility code to the shared/compatibility.js.
2017-02-24 22:47:49 +01:00
Yury Delendik
51767d63fe Init PDFWorker via MesssagePort. 2017-02-24 13:33:18 -06:00
Jonas Jenwald
1ce295541c Always check all Kids nodes, in Catalog.getPageDict, to avoid getting stuck in an empty node further down in the Pages tree (issue 8088)
As discussed on IRC, we need to check all nodes at the *bottom* of the tree to ensure that we find the correct `Page` dict.
Furthermore, this patch also gets rid of the caching present in a previous version, since it's not clear if that really helps.

Note that this patch purposely adds an `eq` test, using a reduced test-case, so that we can be sure that the algorithm actually finds the correct `Page` dict for each `pageIndex`.

Fixes 8088.
2017-02-24 12:09:46 +01:00
Yury Delendik
facefb0c79 Move compatibility code to the shared/compatibility.js. 2017-02-23 19:18:44 -06:00
Jonas Jenwald
9082f08e37 Enable running the cmap unit-tests on Travis by utilizing a NodeCMapReaderFactory 2017-02-17 23:15:36 +01:00
Yury Delendik
cfaa621a05 Merge pull request #8064 from Snuffleupagus/fetchBuiltInCMap
[api-minor] Refactor fetching of built-in CMaps to utilize a factory on the `display` side instead, to allow users of the API to provide a custom CMap loading factory (e.g. for use with Node.js)
2017-02-17 15:30:31 -06:00
Brendan Dahl
425ad30912 Merge pull request #8071 from Snuffleupagus/bug-1337429
Always choose a (3, 1) cmap table for TrueType fonts that have an encoding specified, regardless of the Symbolic font flag (bug 1337429)
2017-02-16 15:13:46 -08:00
Jonas Jenwald
111419a64a Cache built-in binary CMap files in the worker (issue 4794) 2017-02-16 10:55:39 +01:00
Jonas Jenwald
769c1450b7 [api-minor] Refactor fetching of built-in CMaps to utilize a factory on the display side instead, to allow users of the API to provide a custom CMap loading factory (e.g. for use with Node.js)
Currently the built-in CMap files are loaded in `src/core/cmap.js` using `XMLHttpRequest` directly. For some environments that might be a problem, hence this patch refactors that to instead use a factory to load built-in CMaps on the main thread and message the data to the worker thread.

This is inspired by other recent work, e.g. the addition of the `CanvasFactory`, and to a large extent on the IRC discussion starting at http://logs.glob.uno/?c=mozilla%23pdfjs&s=12+Oct+2016&e=12+Oct+2016#c53010.
2017-02-16 10:55:35 +01:00
Tim van der Meij
8aad33e8a3 Merge pull request #8065 from timvandermeij/annotation-appearances
Annotations: refactor setting the normal appearance stream
2017-02-15 23:27:40 +01:00
Tim van der Meij
26fc79d51d
Annotations: refactor setting the normal appearance stream
Previously, we had a function called `getDefaultAppearance`. This name,
however, is misleading as the method gets the normal appearance (in the
`N` entry) and not the default appearance (in the `DA` entry). Moreover,
it was not entirely clear how it works just from reading the code. It
primarily lacks comments and explicit error case handling.

This patch improves the situation by fixing the issues mentioned above
and making this function a proper method of the `Annotation` class, just
like e.g., `setColor` and `setBorderStyle`.
2017-02-15 22:42:17 +01:00
Jonas Jenwald
ce072022c1 Always choose a (3, 1) cmap table for TrueType fonts that have an encoding specified, regardless of the Symbolic font flag (bug 1337429)
This patch basically reverts one aspect of TrueType (3, 1) cmap parsing to the state prior to PR 4259. After that PR, a number of regressions occurred in this particular code-path, which necessitated a number of follow-ups such as PRs 5703, 5743, and 6425.
The empirical data suggests, at least to me, that we should always prefer a (3, 1) cmap for TrueType fonts when they have an encoding, regardless of the Symbolic font flag.

Obviously this patch passes all unit/font/reference tests locally, and I made sure that all the PRs mentioned above landed with test-cases included.
However, in my opinion, there's still a very real possibility that this patch could potentially cause new regressions.

Given that the PDF file in bug 1337429 has been broken for almost *three* years before anyone noticed, and considering that the code-path in question has been the source of numerous regressions, I do *not* intend to request uplift of this patch to previous Firefox versions (assuming that it's even accepted).

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1337429.
2017-02-15 17:38:08 +01:00
Yury Delendik
fa0e559fe2 New node.js check to protect from webpack. 2017-02-14 15:00:52 -06:00
Jonas Jenwald
23c62cc321 Consume the current character when encountering illegal characters in Lexer.getObject, in order to prevent infinite loops during reading of streams (issue 8061)
*Please note:* The rendering of the PDF file in issue 8061 first regressed in PR 7039, and then PR 7493 exacerbated the problem even further by causing an infinite loop.

In this particular case, when errors were encountered inside of the `Lexer.getObject` method *itself*, we didn't advance the stream position. This thus caused an inifinite loop in `parseCMap`, since the exact same character was then parsed over and over again.

Fixes 8061.
2017-02-11 19:32:48 +01:00
pmysore1
af8292058f Font ascent descent calculation fix 2017-02-11 01:25:05 -05:00
Tim van der Meij
466760efca Merge pull request #8056 from Snuffleupagus/ChildNode.remove
Use `ChildNode.remove` instead of `ChildNode.ParentNode.removeChild` in a couple of places (bug 1334831, issue 8008)
2017-02-10 23:17:17 +01:00
Yury Delendik
7d9941d870 Fixes pdf.combined.js for webpack. 2017-02-10 11:24:35 -06:00
Jonas Jenwald
63f13773e7 Use ChildNode.remove instead of ChildNode.ParentNode.removeChild in a couple of places (bug 1334831, issue 8008)
Re: [bug 1334831](https://bugzilla.mozilla.org/show_bug.cgi?id=1334831) and issue 8008.

Note that according to the specification, see https://dom.spec.whatwg.org/#interface-childnode, the `remove` method shouldn't throw.
This is also consistent with e.g. the Firefox implementation, see http://searchfox.org/mozilla-central/rev/d3307f19d5dac31d7d36fc206b00b686de82eee4/dom/base/nsINode.cpp#1852.

Obviously this isn't supported in IE (because that would be too easy), however we can easily polyfill it to avoid having to WONTFIX the bug/issue.
2017-02-10 14:39:50 +01:00
Yury Delendik
32856f0adb Merge pull request #8046 from yurydelendik/webpack
Replacing custom bundling with webpack 2
2017-02-09 16:04:54 -06:00
Yury Delendik
a048519fa1 Replace copyright headers; changes UMD to CommonJS. 2017-02-08 16:35:58 -06:00
Yury Delendik
eb4c88cd44 Replacing custom bundling with webpack2. 2017-02-08 16:32:15 -06:00
vkuryakov
4e181e59ef Interactive forms: values for radio buttons (issue #6995) 2017-02-07 23:42:40 +01:00
Yury Delendik
9b0e0954fb Merge pull request #8036 from mukulmishra18/node-canvas
[api-minor] Fixes behaviour of DOMCanvasFactory to return {canvas, context}.
2017-02-07 07:39:12 -06:00
Tim van der Meij
d3ae5b38ce Merge pull request #8035 from Snuffleupagus/api-disableNativeImageDecoder
[api-minor] Add a `getDocument` parameter that allows disabling of the `NativeImageDecoder` (e.g. for use with Node.js)
2017-02-06 23:37:02 +01:00
Mukul Mishra
41d092d04b Fixes behaviour of DOMCanvasFactory to return {canvas, context}. 2017-02-07 03:47:13 +05:30
Jonas Jenwald
9c34d0aa8c [api-minor] Add a getDocument parameter that allows disabling of the NativeImageDecoder (e.g. for use with Node.js)
Note that I initially tried to add this as a parameter to the `PDFPageProxy.render` method, such that it could be passed to `PartialEvaluator.getOperatorList`.
However, given all the different code-paths that call `getOperatorList` (there's a bunch only in `annotation.js`), this seemed to very quickly become unwieldy and thus difficult to maintain compared to simply using the existing `evaluatorOptions`.
2017-02-06 22:21:34 +01:00
Tim van der Meij
ec26a7e565 Merge pull request #8028 from Snuffleupagus/tests-prevent-console-errors
Prevent browser console errors during testing
2017-02-06 22:04:54 +01:00
Yury Delendik
d842c9c6b0 Merge pull request #8002 from mukulmishra18/refactor-canvas
[api-minor] Fix #7798: Refactor scratch canvas usage.
2017-02-06 07:45:41 -06:00
Mukul Mishra
32817633c9 Fix #7798: Refactor scratch canvas usage.
Fixes extra canvas create calls.
Fixes unnecessary call of `new DOMCanvasFactory`.
Fixes undefined error of DOMCanvasFactory.
Fixes failures in some of the tests.
Fixes expected behaviour.
Remove unused vars.
2017-02-05 20:19:47 +05:30
Jonas Jenwald
e416032b38 Prevent browser console errors during testing
The `Driver._cleanup` method is removing all stylesheets between test runs, which causes "TypeError: styleElement.parentNode is null" console errors in `FontLoader.clear`.

As can also be seen during various tests, some of the changes I made in PR 7972 unfortunately causes console errors.
It seems that I didn't test this properly, since it *should* have been obvious to me that while tests are triggered using Node.js, the files in question are run within the *browser*.
My apologies for not testing this thoroughly, and for causing unnecessary churn in the code!
2017-02-05 13:23:42 +01:00
Jonas Jenwald
bc736fdc7d Adjust the brace-style ESLint rule to disallow single lines (and also enable no-iterator)
See http://eslint.org/docs/rules/brace-style.
Having the opening/closing braces on the same line can often make the code slightly more difficult to read, in particular for `if`/`else if` statements, compared to using new lines.

This patch also, for consistency with `mozilla-central`, enables the [`no-iterator`](http://eslint.org/docs/rules/no-iterator) rule. Note that this rule didn't require a single code change.
2017-02-04 15:53:08 +01:00
Tim van der Meij
6f0cf8c4cb Merge pull request #7972 from Snuffleupagus/eslint_no-unused-vars
Enable the `no-unused-vars` ESLint rule
2017-02-01 23:50:23 +01:00
Tim van der Meij
fe3b64d4ab Merge pull request #8016 from Snuffleupagus/remove-isStream-property
Remove the unused `isStream` property on various `Stream`s
2017-02-01 23:07:40 +01:00
Jonas Jenwald
f7d99ccc26 Remove the unused isStream property on various Streams
This property was added all the way back in PR 542, but hasn't actually been relied upon ever since PR 692.
Note that there's a `isStream()` utility function which replaced the property years ago, hence the `isStream` property is now dead code.
2017-02-01 11:38:11 +01:00
Jonas Jenwald
c102232275 Append the contents of FileAttachment annotations to the attachments view of the sidebar, for easier access to the embedded files
Other PDF viewers, e.g. Adobe Reader, seem to append `FileAttachment`s to their attachments views.
One obvious difference in PDF.js is that we cannot append all the annotations on document load, since that would require parsing *every* page. Despite that, it still seems like a good idea to add `FileAttachment`s, since it's thus possible to access all the various types of attachments from a single place.

*Note:* With the previous patch we display a notification when a `FileAttachment` is added to the sidebar, which thus makes appending the contents of these annotations to the sidebar slightly more visible/useful.
2017-01-31 22:26:16 +01:00
Tim van der Meij
95732279b6 Remove usage of mozFillRule
The non-standard `mozFillRule` has been removed in Firefox 51 [1, 2].
Instead, a parameter of the standard methods should be used. Note that
this is supported in all major browsers for a long time now, so there
should be no need keeping this Firefox-specific code around.

[1] https://developer.mozilla.org/en-US/Firefox/Releases/51
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=826619
2017-01-29 23:24:44 +01:00
Jonas Jenwald
52e0f51917 Enable the no-unused-vars ESLint rule
Please see http://eslint.org/docs/rules/no-unused-vars; note that this patch purposely uses the same rule options as in `mozilla-central`, such that it fixes part of issue 7957.

It wasn't, in my opinion, entirely straightforward to enable this rule compared to the already existing rules. In many cases a `var descriptiveName = ...` format was used (more or less) to document the code, and I choose to place the old variable name in a trailing comment to not lose that information.

I welcome feedback on these changes, since it wasn't always entirely easy to know what changes made the most sense in every situation.
2017-01-29 23:23:17 +01:00
Jonas Jenwald
50c2856097 Move EOF/isEOF from core/parser.js to core/primitives.js
Given the nature of `EOF` and `isEOF`, it seems to me that they really ought to be placed in `core/primitives.js` instead.

In general, it doesn't seem great to have to depend on the entire `core/parser.js` file for such simple primitives/helper functions.
In particular, while `core/ps_parser.js` is completely separate from `core/parser.js` with regards to its function, it still depends on the latter for just *one* primitive.

Note that compared to e.g. PR 7389, this will not reduce the number of dependencies for `core/ps_parser`, however the new dependency IMHO makes more sense.
2017-01-27 13:37:48 +01:00
Jonas Jenwald
af42c266e7 Merge pull request #7987 from yurydelendik/noopener
[api-minor] Adds noopener and nofollow to rel attribute of hyperlinks.
2017-01-26 20:54:09 +01:00
Jonas Jenwald
f000417ce0 [Firefox addon] Stop bundling src/core/network.js into the FIREFOX/MOZCENTRAL builds (PR 7322 follow-up)
PR 7322 added the `PdfJsNetwork.jsm` file, instead of the general `src/core/network.js` file for the Firefox addon. However, `make.js` wasn't updated to actually stop including the now obsolete network file.
2017-01-23 22:23:17 +01:00
Jonas Jenwald
f77c52291e Enable the no-empty-pattern/no-floating-decimal/no-self-compare/no-delete-var/no-new-object ESLint rules
The following rules required no code changes:
http://eslint.org/docs/rules/no-empty-pattern
http://eslint.org/docs/rules/no-floating-decimal
http://eslint.org/docs/rules/no-delete-var
http://eslint.org/docs/rules/no-new-object

There was just one change needed in order to enable:
http://eslint.org/docs/rules/no-self-compare; which I think helps readability a lot, since that comparison makes no sense until you realize that we push `NaN` onto the `stack` in some cases *and* furthermore that `NaN !== NaN`.
2017-01-23 20:30:50 +01:00
Yury Delendik
fc2d8c15e8 Adds noopener and nofollow to rel attribute of hyperlinks. 2017-01-23 10:34:27 -06:00
Tim van der Meij
1948a53ebb Merge pull request #7973 from Snuffleupagus/eslint_spaced-comment
Enable the `spaced-comment` ESLint rule
2017-01-22 21:58:42 +01:00
Tim van der Meij
17dd2e6b20 Merge pull request #7979 from Snuffleupagus/eslint-more-rules
Enable the `no-unsafe-finally`/`no-octal`/`no-useless-call` ESLint rules
2017-01-22 20:53:26 +01:00
Mukul Mishra
4e38200030 Fix #7978: Fixes ESLint yoda rule for the URL polyfill. 2017-01-21 22:47:28 +05:30
Jonas Jenwald
82ea7e6e6e Enable the no-unsafe-finally/no-octal/no-useless-call ESLint rules
http://eslint.org/docs/rules/no-unsafe-finally, there's just one violation which in this case can actually be ignored since there's nothing `return`ed there.
http://eslint.org/docs/rules/no-octal, there're no violations in the code-base.
http://eslint.org/docs/rules/no-useless-call, there's just one violation that needs to be fixed.
2017-01-21 17:15:57 +01:00
Jonas Jenwald
4626fc8342 Enable the spaced-comment ESLint rule
Please see http://eslint.org/docs/rules/spaced-comment.

Note that the exceptions added for `line` comments are intended to still allow use of the old preprocessor without linting errors.
Also, I took the opportunity to improve the grammar slightly (w.r.t. capitalization and punctuation) for comments touched in the patch.
2017-01-19 16:41:59 +01:00
Tim van der Meij
1fda987a4c Merge pull request #7904 from Snuffleupagus/issue-7901
Further adjust the heuristics used to detect OpenType font files with CFF data, to ensure that all Type0 fonts are handled the same way regardless of font Subtype (issue 7901)
2017-01-12 21:55:57 +01:00
Syed Abdullah
857a5da8f1 Fix inverted calculation of RTL text percentage in bidi. 2017-01-12 23:54:06 +08:00
Yury Delendik
393740e2ae Merge pull request #7869 from PedroPachecoInf/master
Fixes issue #6071 - TIFF with 1 bit-depth
2017-01-10 12:37:26 -06:00
jazzchipc
493853031b Fixes issue #6071.
Corrects readBlockTiff() case for 1-bit depth and 1 color TIFF images incorporated in the PDF.

Adds reference test for PDF used to fix this issue.
2017-01-10 16:42:43 +00:00
Yury Delendik
77b7b84d1e Removes rest of 'no-else-return' comments. 2017-01-09 19:13:36 -06:00
Jonas Jenwald
642d8621ef Replace direct lookup of uniquePrefix/idCounters, in Page instances, with an idFactory containing an createObjId method instead
We're currently making use of `uniquePrefix`/`idCounters` in multiple files, to create unique object id's, and adding a new occurrence of them requires some care to ensure that an object id isn't accidentally reused.
Furthermore, having to pass around multiple parameters as we currently do seem like something you want to avoid.

Instead, this patch adds a factory which means that there's only *one* thing that needs to be passed around. And since it's now only necessary to call a method in order to obtain a unique object id, the details are thus abstracted away at the call-sites which avoids accidental reuse of object id's.

To test that this works as expected a very simple `Page` unit-test is added, and the existing `Annotation layer` tests are also adjusted slightly.
2017-01-09 23:16:25 +01:00
Jonas Jenwald
4046d67fde Enable the no-else-return ESLint rule
Using `else` after `return` is not necessary, and can often lead to unnecessarily cluttered code. By using the `no-else-return` rule in ESLint we can avoid this pattern, see http://eslint.org/docs/rules/no-else-return.
2017-01-09 20:27:39 +01:00
Jonas Jenwald
14b8523314 Refactor the password handling so that it's stored in the PdfManagers, instead of in the XRef
We're already passing in a, currently unused, `PdfManager` instance when initializing the `XRef`. To avoid having to pass a single `password` parameter around, we could thus simply get the `password` through the `PdfManager` instance instead.
2017-01-03 20:29:52 +01:00
Jonas Jenwald
27513cd23b [api-minor] Ensure that the getDocument Promise is rejected if the loadingTask is destroyed, or an Error is thrown, inside of the onPassword callback (issue 7806)
This patch also removes the `UpdatePassword` message, in favour of using the `sendWithPromise` method of `MessageHandler`.
Furthermore, the patch also refactors the `BasePdfManager_updatePassword`/`BasePdfManager_passwordChanged` methods (in pdf_manager.js), and the `pdfManagerReady` function (in worker.js).
2017-01-03 20:29:46 +01:00
Jonas Jenwald
ddea9a6b04 Improve the handling of Encoding dictionary, with Differences array, in PartialEvaluator_preEvaluateFont
I recently happened to look at the code I wrote for PR 5964, which fixed [bug 1157493](https://bugzilla.mozilla.org/show_bug.cgi?id=1157493), and I quickly realized that the solution is way too simplistic.
The fact that only using the `length` of a `Differences` array worked seems more like a happy accident for a particular set of font data, but could just as easily be incorrect for other PDF files.

Note that in practice, the case where the `Encoding` entry is a regular `Dict` (and not a `Ref` or `Name`) is very rare, hence I don't think that we really need to worry about having to reparse this data.
Also, the performance of this code-block is quite a bit better by updating the `hash` with the data from the *entire* `Differences` array, instead of at every loop iteration.
2016-12-28 21:32:54 +01:00
Jonas Jenwald
e963971244 Further adjust the heuristics used to detect OpenType font files with CFF data, to ensure that all Type0 fonts are handled the same way regardless of font Subtype (issue 7901)
Changing this particular code makes me somewhat nervous about regressions, since PR 5770 necessitated the follow-up PR 6270.
However, the patch passes all tests added in those PRs (and obviously all other tests). Furthermore, I've manually checked all the issues/bugs referenced in PRs 5770 and 6270 without finding any issues.

**Please note:** This patch fixes *only* the font bug, not the SVG conversion, present on pages two and three of the PDF file in issue 7901.
2016-12-20 17:03:51 +01:00
Yury Delendik
3b3a179486 Merge pull request #7879 from rossj/highlight-fix
Make use of textAdvanceScale consistent during combineTextItems. Fix for #7878.
2016-12-19 09:18:13 -06:00
Tim van der Meij
a428899b3c Button widget annotations: improve unit tests, simplify code and remove labels
Modern browsers support styling radio buttons and checkboxes with CSS.
This makes the implementation much easier, and the fallback for older
browsers is still decent.
2016-12-17 20:38:48 +01:00
Tim van der Meij
77148c7880 Button widget annotations: implement radio button value fetching according to the specification 2016-12-17 20:34:32 +01:00
Tim van der Meij
0c9a06c020 Button widget annotations: implement reference testing
Moreover, ensure that the read-only state is respected and improve CSS
names.
2016-12-17 20:33:35 +01:00
benweet
ba012c7a68 Button widget annotations: implement checkboxes and radio buttons 2016-12-17 20:31:30 +01:00
Jonas Jenwald
bd91f34513 Ensure that we handle indirect objects in all types of Opt entries in ChoiceWidget annotation dictionaries
I haven't got an example where the current code breaks, but given all the previous cases we've seen where PDF generators use indirect objects in Arrays it makes sense to fix this pro-actively.
I've modified the relevant unit-tests slightly, and they would *not* pass without the code changes in this patch.

*Note:* `Dict_getArray` only dereferences Array elements on the "top-level", to avoid recursion issues. Furthermore if you have to loop through the Array at the call-site anyway, then using `Dict_get` in combination with `XRef_fetchIfRef` is a tiny bit more efficient.
2016-12-17 13:44:20 +01:00
Jonas Jenwald
c850968fa7 Remove globals that are now unnecessary thanks to the use of various ESLint environments (e.g. Node, ShellJS, Jasmine) 2016-12-16 21:09:55 +01:00
Jonas Jenwald
2f3805efbc Switch to using ESLint, instead of JSHint, for linting
*Please note that most of the necessary code adjustments were made in PR 7890.*

ESLint has a number of advantageous properties, compared to JSHint. Among those are:
 - The ability to find subtle bugs, thanks to more rules (e.g. PR 7881).
 - Much more customizable in general, and many rules allow fine-tuned behaviour rather than the just the on/off rules in JSHint.
 - Many more rules that can help developers avoid bugs, and a lot of rules that can be used to enforce a consistent coding style. The latter should be particularily useful for new contributors (and reduce the amount of stylistic review comments necessary).
 - The ability to easily specify exactly what rules to use/not to use, as opposed to JSHint which has a default set. *Note:* in future JSHint version some of the rules we depend on will be removed, according to warnings in http://jshint.com/docs/options/, so we wouldn't be able to update without losing lint coverage.
 - More easily disable one, or more, rules temporarily. In JSHint this requires using a numeric code, which isn't very user friendly, whereas in ESLint the rule name is simply used instead.

By default there's no rules enabled in ESLint, but there are some default rule sets available. However, to prevent linting failures if we update ESLint in the future, it seemed easier to just explicitly specify what rules we want.
Obviously this makes the ESLint config file somewhat bigger than the old JSHint config file, but given how rarely that one has been updated over the years I don't think that matters too much.

I've tried, to the best of my ability, to ensure that we enable the same rules for ESLint that we had for JSHint. Furthermore, I've also enabled a number of rules that seemed to make sense, both to catch possible errors *and* various style guide violations.

Despite the ESLint README claiming that it's slower that JSHint, https://github.com/eslint/eslint#how-does-eslint-performance-compare-to-jshint, locally this patch actually reduces the runtime for `gulp` lint (by approximately 20-25%).

A couple of stylistic rules that would have been nice to enable, but where our code currently differs to much to make it feasible:
 - `comma-dangle`, controls trailing commas in Objects and Arrays (among others).
 - `object-curly-spacing`, controls spacing inside of Objects.
 - `spaced-comment`, used to enforce spaces after `//` and `/*. (This is made difficult by the fact that there's still some usage of the old preprocessor left.)

Rules that I indend to look into possibly enabling in follow-ups, if it seems to make sense: `no-else-return`, `no-lonely-if`, `brace-style` with the `allowSingleLine` parameter removed.

Useful links:
 - http://eslint.org/docs/user-guide/configuring
 - http://eslint.org/docs/rules/
2016-12-16 21:06:36 +01:00
Ross Johnson
4537590033 Consitently apply textAdvanceScale during building of textContentItems for improved highlighting. Fixes #7878. 2016-12-14 21:02:19 -06:00
Jonas Jenwald
28e50cfa21 Fix errors reported by the space-infix-ops ESLint rule
http://eslint.org/docs/rules/space-infix-ops
2016-12-12 20:36:00 +01:00
Jonas Jenwald
68bf47d55d Fix errors reported by the space-before-function-paren ESLint rule
http://eslint.org/docs/rules/space-before-function-paren
2016-12-12 20:35:59 +01:00
Jonas Jenwald
551eb263e3 Fix errors reported by the semi-spacing ESLint rule
http://eslint.org/docs/rules/semi-spacing
2016-12-12 20:35:58 +01:00
Jonas Jenwald
efbb1e9b1c Fix errors reported by the new-cap ESLint rule
http://eslint.org/docs/rules/new-cap
2016-12-12 20:35:57 +01:00
Jonas Jenwald
c36468cbce Fix errors reported by the keyword-spacing ESLint rule
http://eslint.org/docs/rules/keyword-spacing
2016-12-12 20:35:56 +01:00
Jonas Jenwald
86ba634c97 Fix errors reported by the key-spacing ESLint rule
http://eslint.org/docs/rules/key-spacing
2016-12-12 20:35:55 +01:00
Jonas Jenwald
ad915f8af1 Fix errors reported by the comma-spacing ESLint rule
http://eslint.org/docs/rules/comma-spacing
2016-12-12 20:35:53 +01:00
Jonas Jenwald
66d2637b3f Fix errors reported by the yoda ESLint rule
http://eslint.org/docs/rules/yoda
2016-12-12 20:35:52 +01:00
Jonas Jenwald
3820946301 Fix (most) errors reported by the no-multi-spaces ESLint rule
http://eslint.org/docs/rules/no-multi-spaces
2016-12-12 20:35:51 +01:00
Jonas Jenwald
25bf5db47e Fix errors reported by the no-extra-boolean-cast ESLint rule
http://eslint.org/docs/rules/no-extra-boolean-cast
2016-12-12 20:26:18 +01:00
Jonas Jenwald
fb5e756683 Fix errors reported by the no-cond-assign ESLint rule
http://eslint.org/docs/rules/no-cond-assign
2016-12-12 20:26:06 +01:00
Tim van der Meij
00a006e466 Merge pull request #7705 from Snuffleupagus/issue-2594
Move symbolic font glyphs to private use area if they don't have unicode mappings (issue 2594, bug 789074, bug 865644)
2016-12-10 21:30:28 +01:00
Tim van der Meij
47f03b619f Merge pull request #7873 from timvandermeij/mediabox-cropbox-indirect
Document: handle indirect objects in `MediaBox` and `CropBox` entries
2016-12-08 23:59:45 +01:00
Tim van der Meij
3800b5e463 Document: extract CropBox fetching and validation into a getter
This patch refactors the `CropBox` code to combine fetching and
validation code in a getter, like we already did for the `MediaBox`
property. Combined with variable name changes, this improves readability
of the code and makes the `view` getter simpler as well.
2016-12-08 22:44:53 +01:00
Jonas Jenwald
9be3aee9c9 Add a parameter to Page_getInheritedPageProp to make it possible to fetch (and dereference) Arrays, and use that for the MediaBox/CropBox getters (issue 7872) 2016-12-08 22:03:42 +01:00
Jonas Jenwald
b4ac6bd2f6 Ensure that we resolve indirect objects in Filter and DecodeParms arrays in parser.js
I've not actually, thus far, come across a PDF file that this patch fixes. However, given the string of recent patches that has fixed issues with indirect objects in arrays, I think that it makes sense to proactively avoid any issues in this code.
2016-12-08 11:55:08 +01:00
Jonas Jenwald
77bcc9232e Remove a misplaced false from a condition in fixMetadata, in metadata.js, since it currently short circuits the entire condition
This looks to me like a simple oversight, which has existed ever since PR 1598 all the way back in 2012.
2016-12-07 22:51:46 +01:00
Jonas Jenwald
94ddd8f61d Merge pull request #7863 from timvandermeij/colorspace
Colorspace: refactoring to prevent unnecessary creation of intermediate arrays
2016-12-06 11:18:53 +01:00
Tim van der Meij
90d94815ad Colorspace: miscellaneous improvements
- Remove an unnecessary check and assignment.
- Clean up code regarding mode setting (no need for a member variable).
- Indent two methods correctly.
2016-12-02 16:47:39 +01:00
Tim van der Meij
c5c0a00dca Colorspace: reduce duplication in AlternateCS.getRgbBuffer 2016-12-02 16:42:22 +01:00
Tim van der Meij
ef653d952b Colorspace: optimize default color initialization
This patch avoids the creation of extra arrays when initializing an
array with default (zero) values. Doing this additionally makes the code
more readable by allocating enough space for the number of color
components.
2016-12-02 16:42:22 +01:00
Jonas Jenwald
c5b06cb40d Ensure that PartialEvaluator_extractWidths is able to handle indirect objects in all kinds of "width" data (issue 7855)
Fixes 7855.
2016-11-29 20:49:07 +01:00
Jonas Jenwald
451956c0b1 Merge pull request #7628 from Snuffleupagus/issue-7580
Fallback to the `StandardEncoding` for Nonsymbolic fonts without `/Encoding` entry (issue 7580)
2016-11-29 12:37:36 +01:00
Jonas Jenwald
013f69e65f Merge pull request #7700 from Snuffleupagus/non-embedded-NuptialScript
Improve rendering of non-embedded NuptialScript font
2016-11-29 11:00:21 +01:00
Jonas Jenwald
c6008b4d7c Fix the JSDoc comment for Catalog.parseDestDictionary 2016-11-27 11:18:18 +01:00
Tim van der Meij
424fc2df4f Merge pull request #7846 from timvandermeij/bidi-types
Bidi: import Unicode types from the specification
2016-11-24 22:59:31 +01:00
Tim van der Meij
995be19378 Bidi: skip invalid Unicode character to make indexing work
For Arabic characters, the Unicode character codes are mapped to Unicode
character types using the character codes for indexing. However, the
character code 0x061D is undefined (and therefore invalid) in the
Unicode standard. The imported list does not contain this entry, but not
having it in the list breaks indexing for items after it. Therefore, put
an empty string on its position to make indexing work properly and issue
a warning in the unlikely event that we encounter this character.
2016-11-24 22:13:12 +01:00
Tim van der Meij
11839f018f Bidi: import Unicode types from the specification
Mention the specification in the comments for future reference. These
types have been imported from the CSV source.
2016-11-24 21:08:31 +01:00
Tim van der Meij
9ff19985c0 Merge pull request #7832 from seanburke-wf/expose-userunit-on-page
Expose the optional UserUnit entry as a page property
2016-11-22 21:18:57 +01:00
Jonas Jenwald
3170a4c40a Improve rendering of non-embedded NuptialScript font
*This patch fixes something that I noticed while debugging https://bugzilla.mozilla.org/show_bug.cgi?id=1308536.*

The PDF file contains a font called "NuptialScript", which unfortunately is not embedded. Since that is a non-standard font we will not be able to render it entirely correct. However, by adding "NuptialScript" to the `getNonStdFontMap`, we can at least improve the rendering slightly by using an italic (serif) fallback font.
2016-11-22 17:56:17 +01:00
Sean Burke
f76cd2ce43 Expose the optional UserUnit entry as a page property 2016-11-22 09:18:19 -07:00
Jonas Jenwald
d3043167de Correctly detect more cases of non-embedded Arial Black fonts (issue 7835)
This patch adds support for non-embedded Arial Black fonts, that use a `Arial-Black...` format for the font names.
Also, this patch changes `canvas.js` such that we always render Arial Black fonts with the maximum weight, which actually improves a number of existing test-cases. This should thus explain the test "failures", which are clear improvements compared with e.g. Adobe Reader.

Fixes 7835.
2016-11-22 13:56:21 +01:00
Yury Delendik
f7d6f3a739 Adds SVG rendering capabilities to the PDFViewer. 2016-11-18 13:03:49 -06:00
Jonas Jenwald
a930f9af15 For commands with with too few arguments, clear out args if it's an Array instead of replacing it with null in EvaluatorPreprocessor_read (issue 7804)
For `PartialEvaluator_getTextContent`, the same `args` Array should be re-used for every `EvaluatorPreprocessor_read` call. Hence we want to ensure that it's not accidentally replaced with `null` in `EvaluatorPreprocessor_read`, since otherwise corrupt PDF files (with too few arguments for certain commands) will cause errors in `PartialEvaluator_getTextContent`.

Perhaps a micro-optimization, but this patch also changes two `!args` comparisons to `args === null`, since that should be a tiny bit more efficient.
2016-11-16 10:20:29 +01:00
Mukul Mishra
6ce2be98b7 Fix #7701: additional check for http/https protocols to fix unsafe header request.
add missing ! and removed trailing whitespaces.
2016-11-14 11:39:10 +05:30
Jonas Jenwald
6d8a404a9c [api-minor] Add support for a couple of white-listed JavaScript actions that contains valid URLs (issue 3897, bug 843699)
By only allowing very specific type of `JavaScript` actions, and also utilizing the existing `URL` validation, this patch shouldn't pose too much risk.

Fixes one of the points in issue 3897 (with the PDF file taken from issue 3438).
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=843699 (probably, since that bug doesn't contain a test-case).
2016-11-08 16:48:27 +01:00
Jonas Jenwald
b4100ba651 Merge pull request #7698 from Snuffleupagus/bug-1308536
Ignore reserved commands when parsing operands in `CFFParser_parseDict`, instead of just rejecting the entire font (bug 1308536)
2016-11-03 23:53:14 +01:00
Jonas Jenwald
0844a72b4d Add a bit more validation to Catalog_readPageLabels, to ensure that the Page Labels are well formed 2016-11-03 20:08:06 +01:00
Jonas Jenwald
2d8d8b5e53 Use stringToPDFString to sanitizing bad "Prefix" entries in Page Label dictionaries
It seems that certain bad PDF generators can create badly encoded "Prefix" entries for Page Labels, one example being http://ukjewishfilm.org/wp-content/uploads/2015/09/Jewish-Film-Festival-Programme-ONLINE.pdf.

Unfortunately I didn't come across such a PDF file while adding the API support for Page Labels, but with them now being used in the viewer I just found this issue. With this patch, we now display the Page Labels in the same way as Adobe Reader.
2016-11-03 19:48:08 +01:00
Jonas Jenwald
9dc6463933 Ignore reserved commands when parsing operands in CFFParser_parseDict, instead of just rejecting the entire font (bug 1308536)
According to the CFF specification, see http://partners.adobe.com/public/developer/en/font/5176.CFF.pdf#page=11, certain commands are currently reserved.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1308536.
2016-11-03 12:50:40 +01:00
Tim van der Meij
9f8d67475e Merge pull request #7775 from timvandermeij/widget-annotation-name
Widget annotation: implement field name according to the specification
2016-11-02 22:43:17 +01:00
Tim van der Meij
1d96854019 Widget annotation: implement field name according to the specification
The original code is difficult to read and, more importantly, performs
actions that are not described in the specification. It replaces empty
names with a backtick and an index, but this behavior is not described
in the specification. While the specification is not entirely clear
about what should happen in this case, it does specify that the `T`
field is optional and that multiple field dictionaries may have the same
fully qualified name, so to achieve this it makes the most sense to
ignore missing `T` fields during construction of the field name. This is
the most specification-compliant solution and, judging by opened issue #6623, also the required and expected behavior.
2016-11-02 21:44:44 +01:00
Tim van der Meij
6e22b32372 Merge pull request #7745 from Snuffleupagus/Launch-actions
[api-minor] Add basic support for `Launch` actions (issue 1778, issue 3897, issue 6616)
2016-11-01 21:12:08 +01:00
Tim van der Meij
5194e68134 Lint: correct code style violations
Manual observations and working with other linting tools found these.
2016-11-01 15:04:21 +01:00
Jonas Jenwald
81b9d553cf Add TeX-specific glyph names to glyphlist.js to improve both glyph mapping and text selection for mathematic fonts (issue 2594) 2016-10-26 16:39:58 +02:00
Brendan Dahl
8d036faf40 Move symbolic font glyphs to private use area if they don't have unicode mappings. 2016-10-26 16:39:21 +02:00
Jonas Jenwald
d5e3b2fbf0 Update PDFOutlineViewer_bindLink to look more like LinkAnnotationElement_bindLink 2016-10-23 16:08:26 +02:00
Jonas Jenwald
2b79782377 [api-minor] Add basic support for Launch actions (issue 1778, issue 3897, issue 6616)
In general we neither want, nor can, support arbitrary `Launch` actions. But in practice, all the cases we've seen so far just contains relative URLs to other PDF files. Building on PR 7689, we can thus at least support basic `Launch` actions.
2016-10-21 13:40:32 +02:00
Jonas Jenwald
d284cfd5eb [api-minor] Add support for relative URLs, in both annotations and the outline, by adding a docBaseUrl parameter to PDFJS.getDocument (bug 766086)
Note that in `FIREFOX/MOZCENTRAL/CHROME` builds of the standard viewer the `docBaseUrl` parameter will be set by default, since in that case it makes sense to use the current URL as a base.
For the `GENERIC` viewer, or the API itself, it doesn't make sense to try and set the `docBaseUrl` by default. However, custom deployments/implementations may still find the parameter useful.
2016-10-19 22:20:24 +02:00
Jonas Jenwald
71a781ee5c Deprecate the isValidUrl utility function and replace it with createValidAbsoluteUrl/isValidProtocal functions instead, since the main URL validation is now done using the new URL constructor 2016-10-19 22:11:22 +02:00
Jonas Jenwald
42f07c6262 [api-minor] Use the new URL constructor when validating URLs in annotations and the outline, as a complement to only checking the protocol, and add a bit more validation to Catalog_parseDestDictionary
Note that this will automatically reject any relative URL.
To make the API more useful to consumers, URLs that are rejected will be available via the `unsafeUrl` property in the data object returned by `PDFPageProxy_getAnnotations`.

The patch also adds a bit more validation of the data for `Named` actions.
2016-10-19 22:11:17 +02:00
Jonas Jenwald
e64bc1fd13 Move parsing of destination dictionaries to a helper function
This not only reduces code duplication, but it also allow us to easily support the same kind of URLs we currently do for Link annotations in the Outline as well.
2016-10-18 16:14:07 +02:00
Yury Delendik
1236b27993 Removes SVG this.cgrp usages. 2016-10-17 16:09:24 -05:00
Yury Delendik
273d2de6ec Merge pull request #7715 from timvandermeij/svg-groups
SVG: optimize and refactor group creation code
2016-10-17 10:10:47 -05:00
Jonas Jenwald
2ce9da9b7a Fix a couple of JSDoc @typedefs to use @property (instead of @param) to fix some missing documentation when running gulp jsdoc 2016-10-17 13:04:55 +02:00
Tim van der Meij
426fc454de SVG: factor out initialization code into a private method
Each well-formed SVG image has the following structure:

SVG element
- Definitions element
- Root group
  - Other group 1
  - ...
  - Other group n

This patch factors out initialization code into a private method in such
as way that the creation of this structure is clear from the code. The
root group is the replacement for the parent group from before. We need
this group as we cannot apply the viewport transform on the SVG element
itself (this caused issues in Chrome). If other code appends groups to
the SVG image, in reality it is appending those groups to the root
group, but this detail is abstracted away by this patch.
2016-10-15 21:45:44 +02:00
Tim van der Meij
fa90573c4b SVG: optimize transform group creation
This patch ensures that we only create transformation groups when it is
actually required and that we re-use transform groups as much as possible.
It reduces the number of transform groups for the Tracemonkey paper from
2790 to 1271, thereby making the DOM much lighter and rendering/scrolling
smoother. Moreover, it simplifies the code and prevents duplication.

Finally, we issue a warning when an unimplemented graphic state is
encountered. Before, this was ignored silently, making debugging harder.
2016-10-15 21:43:12 +02:00
Tim van der Meij
2e20000b71 Merge pull request #7727 from Snuffleupagus/parser-stream-decodeParms
Let `Parser_makeFilter` pass in the `DecodeParms` data to various image `Stream`s, instead of re-fetching it in various `[...]Stream.prototype.ensureBuffer` methods
2016-10-15 20:04:17 +02:00
Yury Delendik
ea5949f1fd Merge pull request #7668 from Snuffleupagus/issue-7665
Prevent an infinite loop in `XRef_fetchUncompressed` for encrypted PDF files with indirect objects in the /Encrypt dictionary (issue 7665)
2016-10-15 10:52:08 -05:00
Jonas Jenwald
c8f83d6487 Let Parser_makeFilter pass in the DecodeParms data to various image Streams, instead of re-fetching it in various [...]Stream.prototype.ensureBuffer methods
In `Parser_filter` the `DecodeParms` data is fetched and passed to `Parser_makeFilter`, where we also make sure that a `Ref` is resolved to a direct object.
We can thus pass this along to the various image `Stream` constructors, to avoid the current situation where we lookup/resolve data that is already available.
Note also that we currently do *not* handle the case where `DecodeParms` is an Array entirely correct in the various image `Stream`s, and this patch fixes that for free.
2016-10-15 12:09:51 +02:00
Jonas Jenwald
1da59bec9b Remove a remaining old-style preprocessor from src/core/fonts.js (PR 7322 follow-up)
Note that this code was added *after* PR 7322 was opened, which thus explains why it was missed during rebasing.
2016-10-15 11:33:09 +02:00
Yury Delendik
0576c9c6c6 Replaces all preprocessor directives with PDFJSDev calls. 2016-10-14 10:57:53 -05:00
Chas Emerick
85c52f1fd6 Fix getTextContent evaluation to only apply TJ horizontal offsets using numeric items/args
While the array argument to TJ should only contain strings and numbers, other
unfortunate items are found in PDFs in the wild, e.g.:

[(Grandes) 0.0 Tc
-250.0 (Client\350les,) 0.0 Tc
-250.0 (Financements) 0.0 Tc
-250.0 (et) 0.0 Tc
-250.0 (March\351s) ] TJ

getOperatorList already properly ignores any non-string, non-numeric values in
TJ arrays; without this patch to getTextContent, returned text items can have
NaN widths due to calculations being applied to those non-numeric values.
2016-10-13 08:08:31 -04:00
Yury Delendik
e336604ef1 Disables Font Loading API for Firefox. 2016-10-06 09:30:18 -05:00
Tim van der Meij
9b3a91f365 Merge pull request #7671 from timvandermeij/interactive-forms-choice-fields
Interactive forms: render choice widget annotations
2016-10-05 23:27:45 +02:00
Tim van der Meij
d5d9f362aa Choice widget annotations: core and display layer implementation 2016-10-05 21:25:29 +02:00
Yury Delendik
7b2a9ee4e0 Merge pull request #7670 from Snuffleupagus/Parser_makeFilter-maybeLength
Only skip parsing a stream in `Parser_makeFilter` when we know for sure that it is empty (PR 6372 follow-up)
2016-10-05 10:38:12 -05:00
Jonas Jenwald
54ee83eb12 Attempt to skip zero bytes at the end of Scan blocks when decoding JPEG images (issue 4090) 2016-09-28 16:31:02 +02:00
Jonas Jenwald
116ba19dd9 Respect the 'ColorTransform' entry in the image dictionary when decoding JPEG images (bug 956965, issue 6574)
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=956965.
Fixes 6574.
2016-09-26 21:55:43 +02:00
Jonas Jenwald
a22f0ae820 Only skip parsing a stream in Parser_makeFilter when we know for sure that it is empty (PR 6372 follow-up)
For PDF files with multiple `/Filter`s, where the `/Length` entry is zero, we fail to render the file correctly. The reason is that `maybeLength` is `null` for the every filter except the first, and `!maybeLength` is thus truthy.
Hence it seems that we should completely ignore the `/Length` entry and also explicitly check `maybeLength === 0`.

Note that I've not (yet) come across a PDF file with this issue in the wild, but given all the stupid things PDF generators do I wouldn't be surprised if such a file actually exists. In order to prevent a possible future bug, I'm submitting this patch which includes a hand-edited PDF file that we currently cannot render correctly (but e.g. Adobe Reader can).
2016-09-25 12:40:15 +02:00
Jonas Jenwald
3e77cf6b32 Prevent an infinite loop in XRef_fetchUncompressed for encrypted PDF files with indirect objects in the /Encrypt dictionary (issue 7665) 2016-09-25 00:18:47 +02:00
Jonas Jenwald
6c263c1994 Merge pull request #7649 from timvandermeij/interactive-forms-tx-comb
Text widget annotations: implement comb support
2016-09-22 11:36:30 +02:00
Tim van der Meij
375229d6b9 Widget annotations: simplify field flag handling
Directly use the hexadecimal representation, just like the
`AnnotationFlags`, to avoid calculations and to improve readability.
This allows us to simplify the unit tests for text widget annotations as
well.
2016-09-21 21:11:37 +02:00
Jonas Jenwald
5f16cbd2c0 When rendering forms, don't use element.value since it prevents the AnnotationLayer rasterizer (in test/driver.js) from parsing the elements correctly
Without this, the reference test images will have empty fields despite the viewer working as intended.
2016-09-21 12:33:04 +02:00
Jonas Jenwald
ded01356c7 Pass in the renderInteractiveForms parameter to Annotation_appendToOperatorList, in Page_getOperatorList, instead of to the Annotation constructor (PR 7633 follow-up)
When debugging issue 7643, I noticed that the `forms` tests currently doesn't look like the rendering in the viewer (with `renderInteractiveForms = true` set).
After scratching my head for a little while, I realized that PR 7633 make the implicit assumption that `Page_getOperatorList` (in `core/document.js`) is called *before* fetching the annotation with `PDFPageProxy_getAnnotations` (in `display/api.js`).

Hence this patch, that changes it so that we instead pass in the `renderInteractiveForms` parameter to `Annotation_appendToOperatorList` to ensure that it's always correctly set.
2016-09-21 12:21:20 +02:00
Tim van der Meij
6100ab4b18 Text widget annotations: implement comb support 2016-09-20 22:31:10 +02:00
Brendan Dahl
15e1ae4e3f Merge pull request #7639 from Snuffleupagus/bug-1252420
Replace empty CharStrings with '.notdef' in `Type1Font_wrap` to prevent OTS from rejecting the font (bug 1252420)
2016-09-20 11:56:47 -07:00
Tim van der Meij
ab1b4cec5d Merge pull request #7640 from timvandermeij/interactive-forms-rm-global
Interactive forms: remove global PDFJS usage
2016-09-19 01:02:44 +02:00
Tim van der Meij
2da2c45889 Interactive forms: remove global PDFJS usage 2016-09-19 00:12:42 +02:00
Jonas Jenwald
170871ab3d Prevent rendering TextWidgetAnnotations in both the core/display layer (issue 7643) 2016-09-18 15:42:22 +02:00
Tim van der Meij
f062695d62 Merge pull request #7633 from timvandermeij/interactive-forms-tx-flags
Text widget annotations: support read-only/multiline fields and improve testing
2016-09-17 17:19:47 +02:00
Tim van der Meij
dbea302a6e Text widget annotations: do not render on canvas as well
If interactive forms are enabled, then the display layer takes care of
rendering the form elements. There is no need to draw them on the canvas
as well. This also leads to issues when values are prefilled, because
the text fields are transparent, so the contents that have been rendered
onto the canvas will be visible too.

We address this issue by passing the `renderInteractiveForms` parameter
to the render task and handling it when the page is rendered (i.e., when
the canvas is rendered).
2016-09-17 15:24:48 +02:00
Tim van der Meij
adf0972ca5 Text widget annotations: improve unit and reference tests
This patch improves the unit tests by testing the support for read-only
and multiline fields. Moreover, we add a reference test to ensure that
the text widgets are not only rendered, but also that their contents are
styled properly.

Finally, we perform minor improvements in `src/core/annotation.js`, for
example adding missing comments.
2016-09-17 15:24:48 +02:00
Tim van der Meij
f6965fadc0 Text widget annotations: support multiline and read-only fields
Moreover, this patch provides us with a framework for handling field
flags in general for all types of widget annotations.
2016-09-17 15:24:47 +02:00
Jonas Jenwald
aadcbe98c8 Replace empty CharStrings with '.notdef' in Type1Font_wrap to prevent OTS from rejecting the font (bug 1252420)
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1252420.
2016-09-17 14:39:10 +02:00
Jonas Jenwald
4acd31f51e Merge pull request #7550 from Snuffleupagus/Type1-toUnicode-builtInEncoding-fallback
For embedded Type1 fonts without included `ToUnicode`/`Encoding` data, attempt to improve text selection by using the `builtInEncoding` to amend the `toUnicode` map (issue 6901, issue 7182, issue 7217, bug 917796, bug 1242142)
2016-09-16 17:51:55 +02:00
Tim van der Meij
26da2d57ce Merge pull request #7632 from Snuffleupagus/more-efficient-expandTextDivs
[EnhanceTextSelection] Make `expandTextDivs` more efficient by updating all styles at once instead of piecewise
2016-09-15 16:01:07 +02:00
Jonas Jenwald
8eaa2cbce3 Remove the deprecated mozDash/mozDashOffset canvas 2D context methods
According to [MDN](https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/setLineDash#Browser_compatibility) the standard versions of these methods have been supported since Firefox 27, which was released over two and a half years ago. (See the dates in https://wiki.mozilla.org/RapidRelease/Calendar#Past_branch_dates)

Furthermore the non-standard properties are now in the process of being removed, please see https://groups.google.com/forum/#!topic/mozilla.dev.platform/UIudMABegcY.
Hence I don't think that we need to keep the old `moz` prefixed ones as fallback any more.
2016-09-15 10:05:40 +02:00
Jonas Jenwald
cb5f9df0c8 [EnhanceTextSelection] Make expandTextDivs more efficient by updating all styles at once instead of piecewise
I intended to provide proper benchmarking results here, as outlined in https://github.com/mozilla/pdf.js/wiki/Benchmarking-your-changes, but after wasting a couple of hours over the weekend getting weird results I gave up.
It appears that there's a lot of, i.e. way too much, variance between subsequent runs of `text` tests for the results to be meaningful.
(Previously I've only benchmarked `eq` tests, so I don't know if the `text` tests has never worked well or if it's a newer problem. For reference, please see the results of back-to-back benchmark runs on the current `master` with a *very* simple manifest file: [link here].)

Instead I used `console.time/timeEnd` in `appendText` and `expandTextDivs` to be able to compare the performance with/without the patch. The entire viewer was (skip-cache) reloaded between measurements, and the result are available here: [link here].
Given the troubles I've had with benchmarking, I've not yet computed any statistics on the results (e.g. mean, variance, confidence intervals, and so on).
However, just by looking at the data I think it's safe to say that this patch first of all doesn't seem to regress the current performance. Secondly it certainly looks *very* likely that this patch actually improves the performance, especially for the one-glyph-per-text-div case (cf. issue 7224).

Re: issue 7584.
2016-09-14 21:19:28 +02:00
Tim van der Meij
323e86c442 Text widget annotations: implement unit testing and sanitize data values 2016-09-13 14:57:11 +02:00
Jonas Jenwald
356b321f6d Fallback to the StandardEncoding for Nonsymbolic fonts without /Encoding entry (issue 7580)
Even though this patch passes all tests (unit/font/reference) locally, including the new ones that I added in PR 7621, I'm still a bit nervous about modifying the code that choose the fallback encoding for fonts without an `/Encoding` entry.
Note that over the years this code has been changed on a number of occasions, see a possibly incomplete [list here], to deal with various cases of incorrect font data.

According to the PDF specification, see http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G8.1904184, it seems that we should fallback to the `StandardEncoding` for Nonsymbolic fonts.
There's obviously a risk that fixing this particular issue *could* break other PDF files for which we don't have tests. However I've tried to change the logic as little as possible in this patch, to hopefully reduce possible breakage.

Based on debugging numerous font issue, it seems that a lot of fonts actually set the Symbolic flag, even when they are in fact *not* Symbolic. Fonts actually marked as Nonsymbolic seem to be somewhat less common, which I hope should reduce the risk of the patch somewhat.

Fixes 7580.
2016-09-13 14:07:16 +02:00
Yash Srivastav
4e428c7675
Fix lint warnings in URL polyfill 2016-09-12 20:34:51 +05:30
Tim van der Meij
03588ccbf7 Merge pull request #7623 from Snuffleupagus/jpx-error
Change `src/core/jpx.js` to use the `error` utility function instead of using `throw new Error`
2016-09-12 15:34:05 +02:00
Yury Delendik
160b176109 Adding "proper" message port for fake worker. 2016-09-12 11:17:10 +02:00
Jonas Jenwald
f620f61887 Change src/core/jpx.js to use the error utility function instead of using throw new Error
Note that in `parseCodestream` I purposly left the `throw new Error` instances inside of the `try` block, since we don't want to throw any `Errors` while in recovery mode.
Finally somewhat unrelated to the rest of the patch, but I moved the `doNotRecover` variable declaration outside of the `try` block to avoid variable hoisting given that it's accessed inside the `catch` block.
2016-09-12 11:05:43 +02:00
Jonas Jenwald
325f7afcca For embedded Type1 fonts without included ToUnicode/Encoding data, attempt to improve text selection by using the builtInEncoding to amend the toUnicode map (issue 6901, issue 7182, issue 7217, bug 917796, bug 1242142)
Note that in order to prevent any possible issues, this patch does *not* try to amend the `toUnicode` data for Type1 fonts that contain either `ToUnicode` or `Encoding` entries in the font dictionary.

Fixes, or at least improves, issues/bugs such as e.g. 6658, 6901, 7182, 7217, bug 917796, bug 1242142.
2016-09-11 20:54:10 +02:00
Tim van der Meij
be485f59ab Text widget annotations: implement maximum length and text alignment
Moreover, we refactor the code a bit to extract code that is shared
between the two branches and we only apply text alignment (and create
the array) when it is actually defined, since it's optional and left is
already the default.
2016-09-11 20:49:00 +02:00
Jonas Jenwald
0b75f63c03 Don't duplicate the first entry in the charCodeToGlyphId map for CIDFontType2 fonts with a CIDToGIDMap that already mapped the first entry to a non-zero glyphId (issue 7544)
Fixes 7544.
2016-09-09 22:33:41 +02:00
Tim van der Meij
b112f9f9f4 Merge pull request #7600 from Snuffleupagus/issue-7598
Check that Type1C fonts does not actually contain OpenType font files (issue 7598)
2016-09-09 22:02:58 +02:00
Tim van der Meij
e686db250c Render interactive form (AcroForm) text widget annotations
This patch is the first step towards implementing support for
interactive forms (AcroForms). It makes it possible to render text
widget annotations exactly like Adobe Reader/Acrobat.

Everything we implement for AcroForms is disabled by default using a
preference, mainly because it is not ready to use yet, but has to
implemented in many steps to avoid complexity. The preference allows us
to work with the code while not exposing the behavior by default. Mainly
storing entered values and printing them is still absent, which would be
minimal requirements for enabling this by default.
2016-09-07 15:37:28 +02:00
Jonas Jenwald
8dbb5a7c4a Merge pull request #7596 from timvandermeij/widget-annotation-cleanup
Improve the structure for widget annotations
2016-09-06 13:46:31 +02:00
Jonas Jenwald
44b75c01a1 Check that Type1C fonts does not actually contain OpenType font files (issue 7598)
This patch is yet another instalment in the (never ending) series of patches for PDF files that specify completely incorrect Type/Subtype for its fonts. In this case Type1/Type1C, when in fact OpenType would have been correct.

Fixes 7598.
2016-09-06 10:13:11 +02:00
Tim van der Meij
576f742047 Improve the structure for widget annotations
Currently, we only support text widget annotations (field type 'Tx')
partially. However, the current code does not make this entirely clear
and does not provide a warning when an unsupported field type is
encountered, making it harder to determine why rendering fails.

Moreover, in the display layer we make no distinction between the
various types of widget annotations, causing the code for text widget
annotations to also be executed for other types of widget annotations in
a fallback situation.

This patch improves the structure of the widget annotation code. In the
core layer, we use the same structure we use for non-widget annotations
in the factory and provide a clear warning when an unsupported type is
encountered. In the display layer, we do the same and split the
`WidgetAnnotationElement` class into two classes, namely
`TextWidgetAnnotationElement` for text widget annotations and
`WidgetAnnotationElement` for other unsupported annotations as a
fallback. From this it clear that we only support text widget
annotations and nothing else.
2016-09-06 00:26:05 +02:00
Jonas Jenwald
37998076c9 In display/api.js ensure that we always reject with an Error in JpegDecode, and adjust a couple of other rejection sites as well
In the case where the document was destroyed, we were rejecting the `Promise` in `JpegDecode` with a string instead of an `Error`. The patch also brings the wording more inline with other such rejections.

Use the `isInt` utility function when validating the `pageNumber` parameter in `WorkerTransport_getPage`, to make it more obvious what's actually happening. There's also a couple more unit-tests added, to ensure that we always fail in the expected way.

Finally, we can simplify the rejection handling in `WorkerTransport_getPageIndexByRef` somewhat. (Note that the only reason for using `catch` here is that since the promise is rejected on the worker side, the `reason` becomes a string instead of an `Error` which is why we "re-reject" on the display side.)
2016-09-05 16:35:32 +02:00
Jonas Jenwald
38c85039d1 Merge pull request #7588 from timvandermeij/text-layer-weakmap
Use a `WeakMap` in `src/display/text_layer.js`
2016-09-04 21:25:48 +02:00
Tim van der Meij
96593571eb Optimize scale calculation in text_layer.js
This patch avoids having to calculate the scale twice by saving it in
the properties object.

Moreover, we remove a temporary variable and place parentheses around a
calculation inside a string concatenation.
2016-09-04 20:19:31 +02:00
Jonas Jenwald
a35773ec8c Change src/core/jpg.js to use the error utility function instead of throwing
This allows us to remove the `try/catch` statements used in `src/core/stream.js` when parsing JPEG images.
As far as I can tell, the only reason for the current usage of plain `throw` is that `jpg.js` originally was external code. Given that this code now lives in our repo, this patch brings the JPEG code more in line with e.g. `src/core/jpx.js` and `src/core/jbig2.js`.
2016-09-04 16:28:23 +02:00
Tim van der Meij
d03651efff Merge pull request #7407 from Snuffleupagus/issue-7406
Assign the `quantizationTables` after parsing the entire JPEG image, to prevent issues when the DQT (Define Quantization Tables) marker is encountered after SOF{n} (Start of Frame) markers (issue 7406)
2016-09-04 14:49:01 +02:00
Tim van der Meij
b3818d5c36 Replace div.dataset with a WeakMap in text_layer.js
This patch improves performance by avoiding unnecessary type
conversions, which also help the JIT for optimizations.

Moreover, this patch fixes issues with the div expansion code where
`textScale` would be undefined in a division. Because of the `dataset`
usage, other comparisons evaluated to `true` while `false` would have
been correct. This makes the expansion mode now work correctly for cases
with, for example, each glyph in one div.

The polyfill for `WeakMap` has been provided by @yurydelendik.
2016-09-03 20:06:42 +02:00
Tim van der Meij
b10add14f3 Refactor text_layer.js to pass the task as a parameter
We pass many parameters to `appendText` while we might as well pass the
`task` object that contains them. This saves a few lines of code and
makes the signature of `appendText` more clear. We do the same for
`expand`, which is useful for the next commit in which we replace
`div.dataset` with a `WeakMap`.

Furthermore, this patch adds a missing parameter to a comment block to
make it clear which parameters remain.
2016-09-02 20:46:36 +02:00
Tim van der Meij
7c961b6b7a Minor code style improvements after #7539 2016-09-01 18:07:12 +02:00
Tim van der Meij
6bb95e3129 Merge pull request #7539 from jeremypress/fairexpand
[api-minor] Expanding divs to improve selection
2016-09-01 17:43:31 +02:00
Jeremy Press
6faa84abdb Continuing fairexpand #6663
1. Expanding divs to improve text selection. (Yury)
2. Adding enhanceTextSelection as an option.
3. Moving feature functionality from text_layer_builder.js to text_layer.js.
4. Added expandTextDivs method to only load expanded divs on first click, and only show on subsequent clicks
2016-08-31 09:54:52 -07:00
Jonas Jenwald
1bbc694ac3 Assign the quantizationTables after parsing the entire JPEG image, to prevent issues when the DQT (Define Quantization Tables) marker is encountered after SOF{n} (Start of Frame) markers (issue 7406)
This is a tentative patch that fixes 7406.
2016-08-31 18:42:05 +02:00
Yury Delendik
ffa99397ad Merge pull request #7387 from Snuffleupagus/issue-5808
Attempt to ignore multiple identical Tf (setFont) commands in `PartialEvaluator_getTextContent` (issue 5808)
2016-08-30 15:21:41 -05:00
Tim van der Meij
f520616e00 Merge pull request #7570 from Snuffleupagus/issue-7569
Create a fallback annotation `id` for entries in `Annots` dictionaries that are not indirect objects (issue 7569)
2016-08-28 00:23:59 +02:00
Jonas Jenwald
088ce6c009 Add a unit-test to check that ProblematicCharRanges contains valid entries
When adding new entries to `ProblematicCharRanges`, you have to be careful to not make any mistakes since that could cause glyph mapping issues.
Currently the existing reference tests should probably help catch any errors, but based on experience I think that having a unit-test which specifically checks `ProblematicCharRanges` would be both helpful and timesaving when modifying/reviewing changes to this code.

Hence this patch which adds a function (and unit-test) that is used to validate the entries in `ProblematicCharRanges`, and also checks that we don't accidentally add more character ranges than the Private Use Area can actually contain.
The way that the validation code, and thus the unit-test, is implemented also means that we have an easy way to tell how much of the Private Use Area is potentially utilized by re-mapped characters.
2016-08-27 11:56:00 +02:00
Jonas Jenwald
78889646c8 Create a fallback annotation id for entries in Annots dictionaries that are not indirect objects (issue 7569)
According to the PDF specification, see http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#page=86, entries in `Annots` dictionaries should be indirect objects, but obviously there're PDF generators that ignore this.

Fixes 7569.
2016-08-27 10:56:16 +02:00
Jonas Jenwald
5379749d4b Try to prevent CanvasGraphics_getSinglePixelWidth from intermittently returning incorrect values in Firefox (issue 7188)
Fixes 7188.
2016-08-22 20:00:24 +02:00
Tim van der Meij
b4c8814fc9 Merge pull request #7534 from Snuffleupagus/isName-name-check
Add a parameter to the `isName` function that enables checking not just that something is a `Name`, but also that the actual `name` properties matches
2016-08-17 15:48:42 +02:00
Jonas Jenwald
544d29f5cb Add a recoveryMode that suppresses errors from the Parser, and utilize it when searching for the main trailer in XRef_indexObjects (bug 1250079)
Instead of having `Parser_getObj` fail unconditionally for the referenced PDF file, this patch attempts to let searching for the main trailer continue even if there are errors.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1250079.
2016-08-17 12:37:35 +02:00
Jonas Jenwald
83ce6f0b6d Adjust the (applicable) existing isName callsites to use the new isName(v, name) version of the function 2016-08-10 11:15:08 +02:00
Jonas Jenwald
af636aae96 Add a parameter to the isName function that enables checking not just that something is a Name, but also that the actual name properties matches
This is similar to the existing `isCmd` and `isDict` functions, which already support similar kind of checks.
With the updated `isName` function, we'll be able to simplify many callsites from: `isName(someVariable) && someVariable.name === 'someName'` to: `isName(someVariable, 'someName')`.
2016-08-10 11:15:03 +02:00
Jonas Jenwald
77c6ed5389 Attempt to ignore multiple identical Tf (setFont) commands in PartialEvaluator_getTextContent (issue 5808)
This patch improves the performance of issue 5808, but I'm not sure if it's enough to call it fixed. On average, this patch reduces the number of textLayer div's by a factor of 3, and it also reduces the time spend in `getTextContent` by a factor of ~2.

The PDF file is generated by `Scribus PDF`, which for reasons I cannot understand is placing redundant `Tf` commands before *every* showText command.
Note how the PDF file also contains lots of (basically) identical fonts, but with slightly different names, which causes unnecessary font-switching. This causes some unnecessary breaking of textLayer div's, but this issue cannot be easily worked around.
2016-07-27 21:37:52 +02:00
Yury Delendik
a02e2686b9 Merge pull request #7475 from Snuffleupagus/api-getTextContent-combineTextItems
[api-minor] Add a parameter to `PDFPageProxy_getTextContent` that controls whether `PartialEvaluator_getTextContent` will attempt to combine same line text items
2016-07-27 08:34:24 -05:00
Jonas Jenwald
558a22cd02 Prevent errors when parsing Annotations with missing (or invalid) /Subtype entries (issue 7446)
Note that I used a separate warning message for this case, instead of utilizing the same one as in the unsupported subtype case, to more clearly indicate that the PDF file itself is to blame rather than PDF.js.

Fixes 7446.
2016-07-25 13:59:26 +02:00
Brendan Dahl
5678486802 Merge pull request #7347 from Snuffleupagus/evaluator-more-Ref_toString
Slightly refactor the `fontRef` handling in `PartialEvaluator_loadFont` (issue 7403 and issue 7402)
2016-07-22 17:21:47 -07:00
Brendan Dahl
50d6e4f147 Merge pull request #7447 from Snuffleupagus/buildToUnicode-notdef
Ignore .notdef in the `differences` array when building a fallback `toUnicode` map in `PartialEvaluator_buildToUnicode` (issue 5256)
2016-07-22 14:33:32 -07:00
Jonas Jenwald
390c02a3e9 Attempt to cache fonts that are direct objects (i.e. Dicts), as opposed to Refs, to prevent re-rendering after cleanup from breaking (issue 7403 and issue 7402)
Fonts that are not referenced by `Ref`s are very uncommon in practice, but it can unfortunately happen. In this case, we're currently not caching them in the usual way, i.e. by `Ref`, which leads to failures when a page is rendered after `cleanup` has run.
The simplest solution would have been to remove the `font.translated` workaround, but since this would have meant loading these kind of fonts over and over, the patch attempts to be a bit clever about this situation.

Note that if we instead loaded fonts per *page*, instead of per document, this issue wouldn't have existed.
2016-07-21 16:04:07 +02:00
Jonas Jenwald
2e9cd3ea64 Slightly refactor the fontRef handling in PartialEvaluator_loadFont (issue 7403 and issue 7402)
Originally, I was just going to change this code to use `Ref_toString` in a couple more places. When I started reading the code, I figured that it wouldn't hurt to clean up a couple of comments. While doing this, I noticed that the logic for the (rare) `isDict(fontRef)` case could do with a few improvements.

There should be no functional changes with this patch, but given the added reference checks, we will now avoid bogus `Ref`s when resolving font aliases. In practice, as issue 7403 shows, the current code can break certain PDF files even if it's very rare.

Note that the only thing that this patch will change, is the `font.loadedName` in the case where a `fontRef` is a reference *and* the font doesn't have a descriptor. Previously for `fontRef = Ref(4, 0)` we'd get `font.loadedName = 'g_d0_f4_0'`, and with this patch `font.loadedName = g_d0_f4R`, which is actually one character shorted in most cases. (Given that `Ref_toString` contains an optimization for the `gen === 0` case, which is by far the most common `gen` value.)

In the already existing fallback case, where the `fontName` is used to when creating the `font.loadedName`, we allow any alphanumeric character. Hence I don't see how (as mentioned above) e.g. `font.loadedName = g_d0_f4R` would be an issue here.
2016-07-21 16:03:33 +02:00
Tim van der Meij
10f9f11ec4 Merge pull request #7490 from Snuffleupagus/issue-7426
Don't map glyphs to the Lepcha Unicode block (issue 7426)
2016-07-21 14:39:19 +02:00
Jonas Jenwald
f297e4d17c [api-minor] Add a parameter to PDFPageProxy_getTextContent that controls whether PartialEvaluator_getTextContent will attempt to combine same line text items
From the discussion in issue 7445, it seems that there may be cases where an API consumer would want to get the text content as is, without combined text items.
2016-07-19 13:38:57 +02:00
Jonas Jenwald
90d19de935 Catch errors and continue parsing in parseCMap (issue 7492)
After PR 7039, the PDF file in issue 7492 no longer renders at all, but note that text selection wasn't working correctly previously.

The problem with the PDF file in issue 7492 is that the `cMap`, in the `toUnicode` entry in the font, contains an invalid name:
```
/CMapName /-usr-share-fonts-truetype-Panton-Panton Family-Fontfabric - Panton.otf,000-UTF16 def
```
When we parse that line, things obviously break because there are spaces present in the wrong places.
To avoid that issue, the patch simply lets `parseCMap` continue when errors are encountered, to try and recover usable data. Note that by not aborting immediatly when an error is encountered, we are also able to fix the text selection.

Obviously, it could be argued that we should just immediatly reject a corrupt `cMap`. But given that they usually are correct, it seems that trying to recover as much data as possible from corrupt one can only be a good thing for both glyph mapping and text selection.

Fixes 7492.
2016-07-18 16:39:56 +02:00
Jonas Jenwald
64783c8b6e Don't map glyphs to the Lepcha Unicode block (issue 7426)
In the PDF file in the issue, some of the glyphs end up being mapped to the Lepcha Unicode block; see https://en.wikipedia.org/wiki/Lepcha_(Unicode_block).
This didn't use to matter, but after HarfBuzz updates that improved support for Lepcha fonts, in particular https://bugzilla.mozilla.org/show_bug.cgi?id=1249861, some glyphs are now moved horizontally.
To avoid that, this patch adds the Lepcha block to the list of Unicode ranges that we skip when building the glyph mapping.

Fixes 7426.
2016-07-17 16:53:36 +02:00
klemens
6f03f62327 trivial spelling fixes 2016-07-17 14:33:41 +02:00
Jonas Jenwald
8f4ec669d0 Remove the obsolete MozBlobBuilder fallback from the createBlob utility function
`MozBlobBuilder` has been obsolete since Firefox 14, so there's no reason to keep this code around anymore.
2016-07-09 16:37:05 +02:00
Jonas Jenwald
51e46fa1a7 Change the warn to info in recoverGlyphName to reduce the console spam
After PR 7441, where `recoverGlyphName` is used a lot more than before, many PDF files will generate a lot of warnings the console. For normal usage, compared to debugging/development, this is probably more annoying than helpful.
2016-07-09 12:08:41 +02:00
Tim van der Meij
b6826a46a8 Merge pull request #7453 from simoncpu/master
Expose the text widget's maximum length.
2016-07-07 01:03:40 +02:00
Brendan Dahl
1f3f4a8dd7 Merge pull request #7441 from Snuffleupagus/issue-7439
Fallback to attempt to recover standard glyph names when amending the `charCodeToGlyphId` with entries from the `differences` array in `type1FontGlyphMapping` (issue 7439)
2016-07-06 13:02:21 -07:00
Brendan Dahl
e2e657e44f Merge pull request #7390 from Snuffleupagus/issue-7180
Add upper-case `I` as a possible space replacement fallback in `Font.spaceWidth` to improve text-selection (issue 7180)
2016-06-29 15:11:19 -07:00
Simon Cornelius P. Umacob
d872fc90b9 Expose the text widget's maximum length. 2016-06-29 17:04:33 +08:00
Jonas Jenwald
bdd58ab1d2 Ignore .notdef in the differences array when building a fallback toUnicode map in PartialEvaluator_buildToUnicode (issue 5256)
Fixes 5256.
2016-06-27 16:20:23 +02:00
Jonas Jenwald
7866109af9 Fallback to attempt to recover standard glyph names when amending the charCodeToGlyphId with entries from the differences array in type1FontGlyphMapping (issue 7439)
Fixes 7439.
2016-06-25 14:54:34 +02:00
Jonas Jenwald
c1ca268ef3 Skip mapping of glyphs to Unicode "Ideographic space" (issue 7416)
Fixes 7416, which is an IE specific issue.
2016-06-22 08:58:00 +02:00
Tim van der Meij
f97d52182a Merge pull request #7341 from Snuffleupagus/getDestinationHash-Array
[api-minor] Improve handling of links that are using explicit destination arrays
2016-06-09 00:29:10 +02:00
Tim van der Meij
70b3eea4a3 Merge pull request #7389 from Snuffleupagus/move-isSpace
Move the `isSpace` utility function from core/parser.js to shared/util.js
2016-06-08 22:57:45 +02:00
Jonas Jenwald
6a0b047bfa Add upper-case I as a possible space replacement fallback in Font.spaceWidth to improve text-selection (issue 7180)
In fonts with only upper-case glyphs, that are also missing a space glyph, `get spaceWidth` won't be able to return anything useful.
By adding upper-case `I` as a fallback, we can thus improve text-selection in some PDF files.
Note that locally, the patch causes slight movement in a few existing `text` tests, but in my opinion this actually looks like slight improvements.

Fixes 7180.
2016-06-07 22:55:25 +02:00
Jonas Jenwald
6260fc09a3 Attempt to recover valid format 3 FDSelect data from broken CFF fonts (bug 1146106)
According to the CFF specification, see http://partners.adobe.com/public/developer/en/font/5176.CFF.pdf#G3.46884, for `format 3` FDSelect data: "The first range must have a ‘first’ GID of 0".
Since the PDF file (attached in the bug) violates that part of the specification, this patch tries to recover valid FDSelect data to prevent OTS from rejecting the font.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1146106.
2016-06-06 18:20:52 +02:00
Jonas Jenwald
a36a946976 Move the isSpace utility function from core/parser.js to shared/util.js
Currently the `isSpace` utility function is a member of `Lexer`, which seems suboptimal, given that it's placed in `core/parser.js`. In practice, this means that in a number of `core/*.js` files we thus have an *otherwise* completely unnecessary dependency on `core/parser.js` for a one-line function.

Instead, this patch moves `isSpace` into `shared/util.js` which seems more appropriate for this kind of utility function. Not to mention that since all the affected `core/*.js` files already depends on `shared/util.js`, this doesn't incur any more file dependencies.
2016-06-06 09:11:33 +02:00
Jonas Jenwald
b02d560ae0 Fix errors in setGState in PartialEvaluator_getTextContent that prevents text-selection from working properly
Currently `setGState` is completely broken, and looking through the history of that code, it seems to me that this may never have worked correctly.
This patch fixes the text-selection in `extgstate.pdf` in the test-suite, which is also added as a `text` test.
2016-06-01 22:58:49 +02:00
Jonas Jenwald
98fe094d18 Let non-viewable Popup Annotations inherit the parent's Annotation Flags if the parent is viewable
Fixes http://www.pdf-archive.com/2013/09/30/file2/file2.pdf.

Note how it's not possible to show the various Popup Annotations in the above document.
To fix that, this patch lets the Popup inherit the flags of the parent, in the special case where the parent is `viewable` *and* the Popup is not.
In general, I don't think that a Popup must have the same flags set as the parent. However, it seems very strange to have a `viewable` parent annotation, and then not being able to view the Popup.

Annoyingly the PDF specification doesn't, as far as I can find, mention anything about how this case should be handled, but this patch seem consistent with the actual behaviour in Adobe Reader.
2016-05-25 23:00:26 +02:00
Brendan Dahl
b86610ffdb Merge pull request #7300 from Snuffleupagus/bug-1068432
Prevent adding invalid values in `CFFDict_setByKey` (bug 1068432)
2016-05-24 12:12:38 -07:00
Jonas Jenwald
b354682dd6 [api-minor] Let LinkAnnotation/PDFLinkService_getDestinationHash return a stringified version of the destination array for explicit destinations
Currently for explicit destinations, compared to named destinations, we manually try to build a hash that often times is a quite poor representation of the *actual* destination. (Currently this only, kind of, works for `\XYZ` destinations.)
For PDF files using explicit destinations, this can make it difficult/impossible to obtain a link to a specific section of the document through the URL.

Note that in practice most PDF files, especially newer ones, use named destinations and these are thus unnaffected by this patch.
This patch also fixes an existing issue in `PDFLinkService_getDestinationHash`, where a named destination consisting of only a number would not be handled correctly.

With the added, and already existing, type checks in place for destinations, I really don't think that this patch exposes any "sensitive" internal destination code not already accessible through normal hash parameters.

*Please note:* Just trying to improve the algorithm that generates the hash is unfortunately not possible in general, since there are a number of cases where it will simply never work well.

 - First of all, note that `getDestinationHash` currently relies on the `_pagesRefCache`, hence it's possible that the hash returned is empty during e.g. ranged/streamed loading of a PDF file.

 - Second of all, the currently computed hash is actually dependent on the document rotation. With named destinations, the fetched internal destination array is rotational invariant (as it should be), but this will not hold in general for the hash. We can easily avoid this issue by using a stringified destination array.

 - Third of all, note that according to the PDF specification[1], `GoToR` destinations may actually contain explicit destination arrays. Since we cannot really construct a hash in `annotation.js`, we currently have no good way to support those. Even though this case seems *very* rare in practice (I've not actually seen such a PDF file), it's in the specification, and this patch allows us to support that for "free".

---
[1] http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G11.1951685
2016-05-21 14:14:07 +02:00
Jonas Jenwald
01ab15a6f1 [api-minor] Let Catalog_getPageIndex check that the Ref actually points to a /Page dictionary
Currently the `getPageIndex` method will happily return `0`, even if the `Ref` parameter doesn't actually point to a proper /Page dictionary.
Having the API trust that the consumer is doing the right thing seems error-prone, hence this patch which adds a check for this case.

Given that the `Catalog_getPageIndex` method isn't used in any hot part of the codebase, this extra check shouldn't be a problem.
(Note: in the standard viewer, it is only ever used from `PDFLinkService_navigateTo` if a destination needs to be resolved during document loading, which isn't common enough to be an issue IMHO.)
2016-05-21 14:13:41 +02:00
Tim van der Meij
db46829ef7 Merge pull request #7316 from timvandermeij/remove-unused
Remove unused variables
2016-05-21 14:07:33 +02:00
Jonas Jenwald
7ddb0bc718 Attempt to combine text runs positioned with setTextMatrix 2016-05-18 17:21:58 +02:00
Tim van der Meij
6a7012aaca Remove unused variables
These have been found using `gulp lint` in combination with the `unused:
true` parameter for JSHint. Unfortunately there are too many false
positives to enable this feature, but now that most globals have been
removed because of the conversion to UMD the results are much more
useful than before.
2016-05-11 16:11:13 +02:00
Tim van der Meij
c1c199d702 Merge pull request #7295 from Snuffleupagus/core-getArray
Use `Dict_getArray` in more places in `src/core/` to avoid issues when Arrays contain indirect objects
2016-05-10 23:21:54 +02:00
Jonas Jenwald
182d33800a Ignore 'endobj' commands inside of ObjStm streams (issue 5241, bug 898610, bug 1037816)
According to an example in the PDF specification, see http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#page=56, an `ObjStm` stream should not contain 'endobj' commands.

Fixes 5241.
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=898610.
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1037816.
2016-05-09 09:50:45 +02:00
Jonas Jenwald
c9b6de3b16 Prevent adding invalid values in CFFDict_setByKey (bug 1068432)
In the font in question, there are a couple of `topDict` entries that have invalid values (`0xF 0xF`, i.e. just eof markers without any actual numbers).
This causes the `parseFloatOperand` function, inside `CFFParser_parseDict`, to return `NaN`. Currently we pass this broken font onto the browser, which OTS unsurprisingly rejects.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1068432.
2016-05-07 21:09:58 +02:00
Jonas Jenwald
6111c17c8a Use Dict_getArray in more places in src/core/ to avoid issues when Arrays contain indirect objects
As evident from e.g. PRs 6485 and 7118, some bad PDF generators unfortunately create Arrays where *some* elements are indirect objects (i.e. `Ref`s). This seems to mostly affect Arrays that contain numbers, such as e.g. `Matrix/FontMatrix/BBox/FontBBox/Rect/Color/...`, and has manifested itself in PDF files that fail to render correctly (some elements are missing).

The problem in both the cases above, besides broken rendering, was that there were *no* errors/warnings that indicated what the problem was, making it difficult to pinpoint the issue.
Hence this patch, where I've audited all usages of `Dict_get` in `src/core/` files, and replaced it with `Dict_getArray` where appropriate to try and prevent unnecessary future bugs.
2016-05-05 19:42:57 +02:00
Tim van der Meij
9c95d089de Merge pull request #7281 from yurydelendik/static-warnings
Fixes some static analysis warnings and recommendations
2016-05-03 01:26:17 +02:00
Yury Delendik
32ce369d88 Fixes some static analysis warnings and recommendations
* Useless conditional
* Superfluous trailing arguments
* Useless assignment to local variable
* Misspelled identifier
* JSDoc tag for non-existent parameter
2016-05-02 17:34:58 -05:00
Yury Delendik
4cde9984f9 Fixes unneed conversion to array in CFF encodeInteger. 2016-05-02 15:24:16 -05:00
Yury Delendik
4016f9fd43 Fixes weird loop in the bidi.js.
Closes #7231.
2016-04-27 16:14:33 -05:00
Yury Delendik
3d49879211 Merge pull request #7130 from nschloe/patch-1
Add element to text layer even if width === 0
2016-04-22 16:10:07 -05:00
Jonas Jenwald
9ceeb21741 Prevent accidentally overriding the error function in the commonobj messageHandler in api.js (issue 7232)
This naming issue has been present since PR 3529, but at least I cannot find any issues/bugs that seem to have been caused by it, which is good.

The patch also removes an unnecessary `else` branch, since an already existing `break` means that it's redundant.

Fixes 7232.
2016-04-22 10:12:12 +02:00
Jonas Jenwald
e281ef15db Adjust incorrect first obj number of "free" xref entry in XRef_readXRefTable (issue 7229)
Fixes 7229.
2016-04-21 16:36:32 +02:00
Jonas Jenwald
19e0599f74 Split the two paths in PDFImage.resize into separate helper functions, placed in colorspace.js and image.js
Re: issue 6777.
2016-04-17 10:24:36 +02:00
Jonas Jenwald
f3f825cc71 Various improvements for GoToR actions
- Add support for the 'NewWindow' property.

 - Ensure that destinations are applied to the *remote* document, instead of the current one.

 - Handle the `F` entry being a standard string, instead of a dictionary.
2016-04-15 22:55:05 +02:00
Jonas Jenwald
b63ef7a8b6 Refactor LinkAnnotation slightly to add data.url/data.dest at the end
This patch also makes sure that all URLs are converted to the correct encoding.
2016-04-15 22:55:05 +02:00
Tim van der Meij
4a601ffc28 Merge pull request #7197 from prakashpalanisamy/remove-combineurl-test
Remove `combineUrl` and replace it with `new URL`. Issue #7183, for reference.
2016-04-15 22:44:07 +02:00
Prakash Palanisamy
a25c29d98d Remove combineUrl and replace it with new URL. 2016-04-15 21:33:10 +05:30
Jonas Jenwald
079b563e2d Ensure that the params parameter of the PredictorStream is a dictionary (issue 7200)
Fixes 7200.
2016-04-15 16:30:18 +02:00
Yury Delendik
6282ec24d1 Merge pull request #7172 from yurydelendik/umd-web
Introduces UMD headers to the web/ folder.
2016-04-13 10:23:23 -05:00
Yury Delendik
006e8fb59d Introduces UMD headers to the web/ folder. 2016-04-13 10:09:48 -05:00
Yury Delendik
fa2f80d0fd Merge pull request #7189 from yurydelendik/webpack-browserify-love
Removing "entry-loader" dependency from webpack.
2016-04-13 08:42:11 -05:00
Yury Delendik
ae415f9e80 Removing "entry-loader" dependency from webpack. 2016-04-13 08:24:25 -05:00
Yury Delendik
b834b6899c Merge pull request #7185 from iloire/issue-7177-support-almondjs
Support almond.js #7177
2016-04-12 17:21:03 -05:00
Ivan Loire
1dfc49152a Support almond.js #7177 2016-04-12 09:32:07 +10:00
Yury Delendik
398e6acbc5 Stops bleeding of pattern edges for mesh. 2016-04-11 18:21:44 -05:00
Jonas Jenwald
be6754a1a0 Merge pull request #7176 from yurydelendik/smask-resume
Allow SMask be resumed after restore() and better transform after SMask
2016-04-11 15:57:40 +02:00
Yury Delendik
63f62a0e53 Finishing SMask at the end of operators list. 2016-04-11 08:02:06 -05:00
Yury Delendik
1485c1d1da Suspending/resuming SMask operation during setGState/restore. 2016-04-11 08:02:06 -05:00
Jonas Jenwald
f59c3a0644 Remove the remaining usages of new {Name,Cmd} in favor of {Name,Cmd}.get
Using `new {Name,Cmd}` should be avoided, since it creates a new object on *every* call, whereas `{Name,Cmd}.get` uses caches to only create *one* object regardless of how many times they are called.

Most of these are found in the unit-tests, where increased memory usage probably doesn't matter very much. But it still seems good to get rid of those cases, since no part of the codebase ought to advertise that usage.

Given the small size of the patch, I'm also tweaking a few comments and class names.
2016-04-08 12:14:05 +02:00
Yury Delendik
1e4886a15a Remove global window and navigator usages from the core code. 2016-04-07 13:46:07 -05:00
Yury Delendik
1e3e14e6b2 Exposes all functional members via lib exports and use them in viewer. 2016-04-07 13:46:07 -05:00
Yury Delendik
1d12aed5ca Move all PDFJS.xxx settings into display/global. 2016-04-07 13:46:07 -05:00
Yury Delendik
118b71925c Forces UMD header to have relative path and extension for CommonJS. 2016-04-02 11:10:36 -05:00
Yury Delendik
34aa915441 Merge pull request #7146 from Snuffleupagus/extract-CFFParser
Extract CFFParser and Type1Parser from fonts.js
2016-04-02 10:50:38 -05:00
Yury Delendik
055d642bf2 Merge pull request #7107 from Rob--W/worker-loading
Detect premature worker load error
2016-04-02 10:40:26 -05:00
Rob Wu
c8996f654f Detect and handle premature worker load error
Fall back to a fake worker if the worker fails to load or initialize,
e.g. due to a network error, a security error or simply a script error.
2016-04-02 11:06:15 +02:00
Jonas Jenwald
ef551e8266 Extract Type1Parser from fonts.js 2016-04-01 23:38:53 +02:00
Jonas Jenwald
b961e1d21b Extract CFFParser from fonts.js (issue 6777) 2016-04-01 22:32:39 +02:00
Yury Delendik
a250c150ab Merge pull request #7134 from yurydelendik/circ-stream-colorspace
Refactors to remove stream.js dependency on colorspace.js
2016-04-01 08:23:24 -05:00
Yury Delendik
ff3ce973b8 Merge pull request #7106 from Snuffleupagus/issue-7101
Keep track of the character to glyph mapping in font_renderer.js, to prevent errors when different characters point to the same glyph (issue 7101)
2016-04-01 08:09:21 -05:00
Yury Delendik
35cbf74b12 Refactors to remove stream.js dependency on colorspace.js 2016-04-01 07:36:16 -05:00
Brendan Dahl
13d440df61 Merge pull request #7078 from Snuffleupagus/refactor-toFontChar-without-file
Refactor the building of `toFontChar` for non-embedded fonts
2016-03-31 10:43:11 -07:00
Jonas Jenwald
05cf709f8e Parse Type1 font files to determine the various Length{n} properties, instead of trusting the PDF file (issue 5686, issue 3928)
Fixes 5686.
Fixes 3928.
2016-03-31 11:08:12 +02:00
Jonas Jenwald
c40df8a393 Make Type1Font more class-like, by adding closure
*Note:* Ignoring whitespace should simplify reviewing a great deal.
2016-03-31 11:00:27 +02:00
Jonas Jenwald
17aaa125df Keep track of the character to glyph mapping in font_renderer.js, to prevent errors when different characters point to the same glyph (issue 7101)
Fixes 7101.
2016-03-30 11:33:04 +02:00
Nico Schlömer
7cb055307d Add element to text layer even if width === 0
Some browsers render certain special characters with width 0, others with strictly positive width. (For example, the Greek Delta, Δ, has width 0 in Google Chrome, and a positive width in Firefox.) The `if` clause in operation so far results in different text layer DOM trees for different browsers.

This commit fixes that by adding the elements independently of their width.
2016-03-29 19:32:51 +02:00
Brendan Dahl
4e2f70440f Merge pull request #6711 from yurydelendik/errors
Better errors capturing at the core and stop rendering on error.
2016-03-29 09:19:28 -07:00
Jonas Jenwald
13d7a5070e Prevent failures in the Annotation code if the Rect array contains indirect objects (issue 7115)
Note that in the PDF files provided by the reporter, this issue was limited to `Rect` arrays in AcroForm entries (which we currently don't support).
However, since a bad PDF generator could create this problem in *any* kind of annotation, the reduced test-case included here uses a simple LinkAnnotation instead.

Fixes 7115.
2016-03-26 20:55:16 +01:00
Brendan Dahl
df7afcf004 Merge pull request #7053 from yurydelendik/rm-pdfjs-core
Removes global PDFJS usage from the src/core/.
2016-03-25 13:19:43 -07:00
Yury Delendik
2fa4dd6f40 Proxy global PDFJS.verbosity to properly configure shared/util. 2016-03-23 19:24:37 -05:00
Yury Delendik
a8e5912cb1 Moves shared/global to display/global 2016-03-23 19:24:37 -05:00
Yury Delendik
e372f3608b Makes WorkerMessageHandler non-global. 2016-03-23 19:24:37 -05:00
Yury Delendik
bda5e6235e Removes global PDFJS usage from the src/core/. 2016-03-23 19:24:37 -05:00
Yury Delendik
54ee15d866 Merge pull request #7100 from yurydelendik/stream-wo-parser
Removes core/stream circular dependency on core/parser.
2016-03-22 15:08:12 -05:00
Yury Delendik
6038c236b2 Removes core/stream circular dependency on core/parser. 2016-03-22 14:06:01 -05:00
Jonas Jenwald
d78fae0181 Ensure that TrueType font tables have uint32 checksums
According to "The table directory" under https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6.html#Directory, TrueType font tables should have `uint32` checksums.

This is something that I noticed, and was initially confused about, while debugging a TrueType issue.
As far as I can tell, the current (`int32`) checksums we use doesn't cause any issues in practice. However, I do think that this should be addressed to agree with the specification, and to reduce possible confusion when reading the font code.
2016-03-22 13:40:50 +01:00
Yury Delendik
21ed8ff71d Merge pull request #7039 from prometheansacrifice/async-cmap-factory
Refactors CMapFactory.create to make it async
2016-03-21 13:57:36 -05:00
Manas
f6d28ca323 Refactors CMapFactory.create to make it async 2016-03-21 23:08:19 +05:30
Jonas Jenwald
91756f6e86 Pass the PDFJS.postMessageTransfer parameter to the worker, so that the MessageHandler can be setup correctly in createDocumentHandler (issue 6957)
This regressed in commit acdd49f480, i.e. PR 6571.

Fixes 6957.
2016-03-16 18:34:26 +01:00
Yury Delendik
c6d2b7f9d9 Merge pull request #6906 from KamiHQ/fix-printing
avoid apply transform twice for composite context
2016-03-11 08:26:59 -06:00
Yury Delendik
8ba413e761 Better errors capturing at the core and stop rendering on error. 2016-03-11 07:59:09 -06:00
Jonas Jenwald
cd2bd057ab Refactor the building of toFontChar for non-embedded fonts
Currently there's a lot of duplicate code for non-embedded `toFontChar`, which this patch simplifies by extracting the code into a helper function instead.
2016-03-10 21:25:39 +01:00
Jonas Jenwald
dfe9015a43 Convert uniXXXX glyph names to proper ones when building the charCodeToGlyphId map for TrueType fonts (bug 1132849, issue 6893, issue 6894)
This patch adds a `getUnicodeForGlyph` helper function, which is used to recover Unicode values for non-standard glyph names.

Some PDF generators, e.g. Scribus PDF, use improper `uniXXXX` glyph names which breaks the glyph mapping. We can avoid this by converting them to "standard" glyph names instead.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1132849.
Fixes 6893.
Fixes 6894.
2016-03-09 19:37:15 +01:00
Preetham Mysore
be1e12dbcb Fix for descent calculation while reading font hhea headers 2016-03-03 08:51:41 -05:00
Yury Delendik
a022f6f069 Reverts back un-need change made at #6879. 2016-03-02 09:57:33 -06:00
Jonas Jenwald
8402c79171 Merge pull request #7050 from brendandahl/issue4402
For CIDFontType2 use CID as glyph ID when missing CID to GID map.
2016-03-02 10:11:42 +01:00
Brendan Dahl
a6acf74b54 Merge pull request #7023 from brendandahl/issue6721
Only draw glyphs on canvas if they are in the font or the font file is missing.
2016-03-01 18:03:37 -08:00
Brendan Dahl
6e1d131384 For CIDFontType2 use CID as glyph ID when missing CID to GID map. 2016-03-01 17:05:33 -08:00
Brendan Dahl
ff87f3fb86 Only draw glyphs on canvas if they are in the font or the font file is missing. 2016-03-01 13:24:58 -08:00
Jonas Jenwald
505f15f221 Avoid accidentally getting the entire font file in readNameTable (issue 7020)
In the PDF file in question, some of the 'name' table entries have `record.length === 0`. This becomes problematic in the non-unicode case, since `font.getBytes(0)` will fetch the *entire* stream.
Given that OTS rejects 'name' entries larger than `2^16`, this thus explain the sanitizer errors.

Fixes 7020.
2016-03-01 21:59:49 +01:00
Yury Delendik
22341c0761 Merge pull request #6879 from yurydelendik/streams
Makes PDF data reading Streams API friendly.
2016-03-01 09:10:52 -06:00
Tim van der Meij
ad31e52a26 Group popup creation code and apply it to more annotation types 2016-02-25 00:35:45 +01:00
Jonas Jenwald
41efb92d3a Merge pull request #6988 from timvandermeij/fileattachment-annotation
Implement support for FileAttachment annotations
2016-02-24 12:58:06 +01:00
Tim van der Meij
0351c7eff4 Move the getFileName helper function to the core
This is required to be able to use it in the annotation display code,
where we now apply it to sanitize the filename of the FileAttachment
annotation. The PDF file from https://bugzilla.mozilla.org/show_bug.cgi?id=1230933 has shown that some PDF generators include the path of the file rather than the filename, which causes filenames with weird initial characters. PDF viewers handle this differently (for example Foxit Reader just replaces forward slashes with spaces), but we think it's better to only show the filename as intended.

Additionally we add unit tests for the `getFilenameFromUrl` helper
function.
2016-02-23 22:49:53 +01:00
Tim van der Meij
6a33dfd13a Implement support for FileAttachment annotations 2016-02-23 22:49:53 +01:00
Tim van der Meij
c53581f4e5 Merge pull request #7009 from KamiHQ/annotations-fix
[api-minor] Always expose data.title and data.contents for TextAnnotation
2016-02-22 22:59:38 +01:00
Tim van der Meij
ebe6fb2560 Merge pull request #7012 from KamiHQ/fix-annotation-popup
don't render highlight/underline/squiggly/strikeout annotations that doesn't have popup
2016-02-22 21:54:08 +01:00
Xiliang Chen
6762ff2fd6 don't render highlight/underline/squiggly/strikeout annotations that doesn't have popup 2016-02-22 13:10:20 +13:00
Xiliang Chen
266cedd960 always expose data.title and data.content 2016-02-19 13:50:25 +13:00
Yury Delendik
0d591719d9 Makes PDF data reading Streams API friendly. 2016-02-18 13:17:53 -06:00
Jonas Jenwald
a494e33776 Update JpegImage.getData to support forceRGBoutput for images with numComponents === 1 (issue 6066)
*A more robust solution for issue 6066.*

As a temporary work-around for (the upstream) [bug 1164199](https://bugzilla.mozilla.org/show_bug.cgi?id=1164199), we parsed *all* images in the Firefox addon during a short time.
Doing so uncovered an issue with our image handling (see 6066), for JPEG images with a `DeviceGray` ColorSpace *and* `bpc !== 1` (bits per component).

As long as we let the browser handle image decoding in this case, this isn't going to be an issue, but I do think that we should proactively fix this to avoid future issues if we change where the images are decoded (in `jpg.js` vs in browser).
Also, we currently don't seem to have a test-case for that kind of image data.
2016-02-18 10:12:37 +01:00
Jonas Jenwald
7cf9de2c17 [api-minor] Change getOutline to actually return the RGB color of outline items
Currently the `C` entry in an outline item is returned as is, which is neither particularly useful nor what the API documentation claims.

This patch also adds unit-tests for both the color handling, and the `F` entry (bold/italic flags).
2016-02-15 13:41:22 +01:00
Jonas Jenwald
98db068079 Reduce the overall indentation level in Catalog_readDocumentOutline, by using early returns, in order to improve readability 2016-02-14 11:38:43 +01:00
Tim van der Meij
e9a1a47d28 Merge pull request #6982 from Snuffleupagus/evaluator-remove-getAll
Remove the only remaining `Dict_getAll` usage (in evaluator.js) and the method itself
2016-02-13 20:48:37 +01:00
Tim van der Meij
addc4a3ded Merge pull request #6856 from KamiHQ/remove-has-html
move hasHtml to AnnotationElement
2016-02-13 20:12:09 +01:00
Jonas Jenwald
1ee016b005 Remove Dict_getAll since it is now unused
`Dict_getAll` is problematic for a number of reasons. First of all, as issue 6961 shows, it can be really bad for performance, since it dereferences all indirect objects.
Second of all, all the derefencing can lead to data being unncessarily requested when ranged/chunked loading is used, thus unnecessarily delaying rendering.

Note: For cases where `Dict_getAll` was previously used, `Dict_getKeys` in combination with `Dict_get` can be used instead. This has the advantage that data isn't requested until it's actually needed.
2016-02-12 22:32:07 +01:00
Jonas Jenwald
93ea866f01 Remove getAll from EvaluatorPreprocessor_read
For the operators that we currently support, the arguments are not `Dict`s, which means that it's not really necessary to use `Dict_getAll` in `EvaluatorPreprocessor_read`.
Also, I do think that if/when we support operators that use `Dict`s as arguments, that should be dealt with in the corresponding `case` in `PartialEvaluator_getOperatorList` which handles the operator.

The only reason that I can find for using `Dict_getAll` like that, is that prior to PR 6550 we would just append certain (currently unsupported) operators without doing any further processing/checking. But as issue 6549 showed, that can lead to issues in practice, which is why it was changed.

In an effort to prevent possible issue with unsupported operators, this patch simply ignores operators with `Dict` arguments in `PartialEvaluator_getOperatorList`.
2016-02-12 22:31:50 +01:00
Tim van der Meij
1f49e7b194 Merge pull request #6975 from Snuffleupagus/ColorSpace-remove-getAll
Get rid of `getAll` usage in colorspace.js
2016-02-11 14:23:10 +01:00
Tim van der Meij
6133a993a9 Merge pull request #6972 from Snuffleupagus/loadType3Data-getKeys
Replace `getAll` with `getKeys` in `loadType3Data`
2016-02-11 14:15:31 +01:00
Jonas Jenwald
02a6b73492 Get rid of getAll usage in colorspace.js
For the `CalGray`/`CalRGB`/`Lab` colour spaces, we're currently using `getAll` to retrieve the parameters. However that's not really necessary, since we may just as well explicitly `get` the needed parameters instead.
2016-02-11 11:59:26 +01:00
Prayag Verma
049beac346 Fix a typo in api.js
`fulfills` misspelt as `fullfills`
2016-02-11 07:18:56 +05:30
Xiliang Chen
e42da0f5e9 move hasHtml to AnnotationElement 2016-02-11 13:58:17 +13:00
Jonas Jenwald
f7f60197ce Replace getAll with getKeys in loadType3Data
Not only is `getAll` less efficient, but given that we actually need the keys here, using `getKeys` seems much more suitable.
2016-02-10 20:19:14 +01:00
Jonas Jenwald
07e1ad40a2 Replace getAll with getKeys in PartialEvaluator_hasBlendModes to speed up loading of badly generated PDF files (issue 6961)
Some bad PDF generators, in particular "Scribus PDF", duplicates resources *a lot* at various levels of the PDF files. This can lead to `PartialEvaluator_hasBlendModes` taking an unreasonable amount of time to complete.
The reason is that the current code is using `Dict_getAll`, which recursively dereferences *all* indirect objects, which can be really slow. This patch instead uses `Dict_getKeys`, and then manually looks up only the necessary indirect objects.

I've added the PDF file as a `load` test. The most important thing here is probably to ensure that the file remains available in the repo, and the comment should help reduced the chance of regressions. (Note that locally, the `load` test times out without this patch, but we cannot really assume that that always happens.)

Fixes 6961.
2016-02-10 17:21:38 +01:00
Tim van der Meij
03f12a10b5 Merge pull request #6954 from Snuffleupagus/setGState-fixes
Various `setGState` improvements
2016-02-10 00:38:31 +01:00
Tim van der Meij
02b161d432 Merge pull request #6933 from brendandahl/faster-decrypt
Make type 1 font program decryption faster.
2016-02-09 23:41:22 +01:00
Brendan Dahl
d7e5935f91 Merge pull request #6938 from yurydelendik/fix-ff-disableworker
Fix 'Ready' message sequence for Firefox ext disabled worker.
2016-02-09 10:50:12 -08:00
Tim van der Meij
a0aa781c39 Merge pull request #6913 from Rob--W/importScripts-work-around
Improve work-around for importScripts bug.
2016-02-04 19:51:26 +01:00
Rob Wu
097e273ca4 Improve work-around for importScripts bug.
Reverts "Hack to avoid intermidiate Chrome failures during tests."
(2b2c521213).

require.js uses importScript asynchronously, which activates the worker
GC bug in WebKit. This patch works around a bug in a way that is similar
in the upcoming (but not yet released) require.js 2.1.23

The advantage of the new work-around is that it allows the runtime to
garbage-collect idle Workers.

References:
- https://crbug.com/572225
- https://webkit.org/b/153317
2016-02-03 23:30:27 +01:00
Yury Delendik
4c59712606 Merge pull request #6904 from timvandermeij/cleanup-workers
Destroy workers when they are no longer needed in the unit tests
2016-02-03 14:46:22 -06:00
Jonas Jenwald
a1fe2cb443 Don't directly access the private map in setGState, and ensure that we avoid indirect objects
*This patch is based on something I noticed while debugging some of the PDF files in issue 6931.*

In a number of the cases in `setGState`, we're implicitly assuming that we're not dealing with indirect objects (i.e. `Ref`s). See e.g. the 'Font' case, or the various cases where we simply do `gStateObj.push([key, value]);` (since the code in `canvas.js` won't be able to deal with a `Ref` for those cases).

The reason that I didn't use `Dict_forEach` instead, is that it would re-introduce the unncessary closures that PR 5205 removed.
2016-02-03 17:13:42 +01:00
Jonas Jenwald
2d4a1aa0af Actually ignore no-op setGState (PR 5192 followup)
The intention of PR 5192 was to avoid adding empty `setGState` ops to the operatorList. But the patch accidentally used `>=`, which means that it's not actually working as intended, since empty arrays always have `length === 0`.
2016-02-03 17:13:02 +01:00
Jonas Jenwald
4770b516fe Correct the upper bound used when building the transferMap for SMasks (PR 6723 followup)
Even though the currently known test-cases render correctly without this patch, that seems more like a lucky coincidence, given that there's no guarantee that `transferMap[255] === 0` for every possible transfer function.
2016-02-03 13:41:10 +01:00
Jonas Jenwald
992472fd38 Ensure that we don't modify the Dict data when the Differences array of a font contains indirect objects
This patch fixes an issue that I inadvertently introduced in PR 5815, where we accidentally modify the `Differences` array in the encoding dictionary for indirect objects.

Instead of this change, we could also have used the now existing `Dict_getArray`. However in this case I don't think that would have been a good idea, since it would mean iterating through the array *twice*.
2016-01-30 13:31:24 +01:00
Brendan Dahl
02331f6e33 Make type 1 font program decryption faster.
Discard the values first so we don't have to slice the array.
2016-01-29 11:10:30 -08:00
Yury Delendik
e36b04e1ff Fix 'Ready' message sequence for Firefox ext disabled worker. 2016-01-29 11:43:19 -06:00
Jonas Jenwald
e401804f13 Remove the 'oplist' rendering intent when getOperatorList has returned the complete OperatorList, and prevent errors in PDFPageProxy_destroy for the 'oplist' rendering intent 2016-01-29 12:34:26 +01:00
Tim van der Meij
5d797e1a85 Merge pull request #6921 from Snuffleupagus/mandatory-deprecated
Always display `deprecated` messages, regardless of the verbosity setting
2016-01-29 12:12:08 +01:00
Jeff Walden
4691a4a1e4 Adjust a comment discussing transferred ArrayBuffers to refer to those buffers being detached, not neutered. This change makes the comment consistent with terminology used in the ECMAScript specification. 2016-01-28 14:52:07 -08:00
Yury Delendik
825a2225ab Merge pull request #6915 from yurydelendik/lookuptables
Refactor lookup hash tables/objects
2016-01-28 15:01:06 -06:00
Yury Delendik
2edf2792dc Replaces literal {} created lookup tables with Object.create 2016-01-28 12:18:38 -06:00
Yury Delendik
d6adf84159 Lazify OP_MAP. 2016-01-28 12:18:37 -06:00
Yury Delendik
1de90454b7 Lazify Metrics 2016-01-28 12:11:46 -06:00
Yury Delendik
55a201d92d Lazify NormalizedUnicodes 2016-01-28 11:56:42 -06:00
Yury Delendik
d0738d7e24 Lazify stdFontMap, serifFonts, GlyphMapForStandardFonts 2016-01-28 11:51:54 -06:00
Yury Delendik
1a9a665adf Refactor Encodings 2016-01-28 11:32:59 -06:00
Yury Delendik
4ef20de429 Lazify GlyphsUnicode. 2016-01-28 11:32:59 -06:00
Brendan Dahl
252b9d5910 Merge pull request #6753 from yurydelendik/cdn-worker
Wraps worker script if its cross-origin location is detected.
2016-01-27 13:21:10 -08:00
Jonas Jenwald
1140a34f5c [api-minor] Change getPageLabels to always return the pageLabels, even if they are identical to standard page numbering 2016-01-27 13:36:03 +01:00
Yury Delendik
bc30c42758 Fixes URL polyfill check for MS Edge. 2016-01-26 16:49:44 -06:00
Jonas Jenwald
15ce96a6eb Prevent failures in the "scanning for endstream" code, in Parser_makeStream, by handling the case where 'endstream' is split between contiguous chunks (issue 1536) 2016-01-26 09:03:51 +01:00
Jonas Jenwald
472e793a27 Always display deprecated messages, regardless of the verbosity setting
Currently the `deprecated` message is using `warn`, meaning that it's possible to disable warnings about deprecated API usage through the `PDFJS.verbosity` setting.

I don't think that it should be possible to opt out of deprecation messages,[1] since it might mean that in a custom deployment of PDF.js these messages could be overlooked, leading to PDF.js being broken (seemingly without any warning) when updating to a future version.
Obviously this could be considered the responsibility of the people doing custom PDF.js implementations, but in order to reduce the support burden later on, it seems better to "annoy" people upfront.

Compared to various `info`/`warn`/`error` messages, `deprecated` messages should be very simple to get rid of -- just update the API usage and the message goes away!

---
[1] In e.g. Firefox it doesn't seem possible to prevent deprecation warnings from being displayed (in the Browser Console).
2016-01-25 12:38:34 +01:00
Xiliang Chen
069f4b9bdf avoid apply transform twice for composite context 2016-01-21 14:25:39 +13:00
Tim van der Meij
58329f7f92 Merge pull request #6803 from Snuffleupagus/page-labels
[api-minor] Add support for PageLabels in the API
2016-01-20 22:05:48 +01:00
Yury Delendik
0aa373cdf3 Merge pull request #6891 from Snuffleupagus/issue-6889
Map missing glyphs to the `notdef` glyph for TrueType (3, 1) fonts regardless if the 'post' table is defined or not (issue 6889)
2016-01-20 13:14:47 -06:00
Jonas Jenwald
85cf90643f [api-minor] Add support for PageLabels in the API 2016-01-19 22:49:04 +01:00
Jonas Jenwald
8ad18959d7 Add support for NumberTree 2016-01-19 22:47:45 +01:00
Tim van der Meij
1eea0db897 Merge pull request #6822 from Snuffleupagus/urls-in-outline
[api-minor] Add support for URLs in the document outline
2016-01-19 22:21:40 +01:00
Jonas Jenwald
0030a82dc3 [api-minor] Add support for URLs in the document outline
Re: issue 5089.
(Note that since there are other outline features that we currently don't support, e.g. bold/italic text and custom colours, I thus think we can keep the referenced issue open.)
2016-01-19 21:36:27 +01:00
Jonas Jenwald
4855d4cc9f Map missing glyphs to the notdef glyph for TrueType (3, 1) fonts regardless if the 'post' table is defined or not (issue 6889) 2016-01-17 22:58:00 +01:00
Yury Delendik
1e45f2d4e1 Wraps worker script if its cross-origin location is detected. 2016-01-15 15:05:46 -06:00
Jonas Jenwald
d52495a9c8 [TrueType] Recover from a missing "glyf" table by replacing it with dummy data, utilizing the existing code in sanitizeGlyphLocations
It seems to be fairly common for OCR software to include incomplete TrueType fonts, notable missing the "glyf" table, in PDF files. Since we currently reject such fonts, the result is that text-selection/copying is broken.

This patch contains a suggested approach to try and use these kind of broken fonts, by using existing code in `sanitizeGlyphLocations` to replace a missing "glyf" table with dummy data.

Fixes 4684.
Fixes 6007.
Fixes 6829.
2016-01-15 21:44:59 +01:00
Jonas Jenwald
cca265352f Add an extra set of // to the comment for the URL polyfill, since the preprocessor eats one set, thus breaking the world (PR 6846 followup)
The Firefox addon currently fails with:
```
SyntaxError: missing ; before statement pdf.js:1692:12
TypeError: PDFJS.shadow is not a function viewer.js:6228:12
```
2016-01-14 22:43:34 +01:00
Brendan Dahl
794e1a3178 Merge pull request #6846 from brendandahl/url
Use URL constructor for combineURL.
2016-01-14 13:29:16 -08:00
Brendan Dahl
e362c3b8fc Use URL constructor for combineURL. 2016-01-14 11:36:36 -08:00
Brendan Dahl
3057b69e45 Merge pull request #6839 from Snuffleupagus/issue-6782
Check that CIDFontType0 fonts does not actually contain OpenType font files (issue 6782)
2016-01-11 08:56:48 -08:00
Tim van der Meij
30b8f41003 Merge pull request #6820 from Snuffleupagus/showText-shadingPattern
Apply Patterns, if necessary, when rendering text
2016-01-08 14:02:56 +01:00
Brendan Dahl
4a215f0892 Merge pull request #6825 from yurydelendik/pdfjsumd
Adds UMD header to pdf.js and pdf.worker.js files.
2016-01-07 15:07:58 -08:00
Daan Sprenkels
90ec2c9294 shading-pattern: Decreased Shadings.SMALL_NUMBER
and added a test case for #6298
2016-01-06 15:26:40 +01:00
Jonas Jenwald
896e390285 Check that CIDFontType0 fonts does not actually contain OpenType font files (issue 6782)
*This patch follows a similar idea as PR 5756.*

The patch is based on the nice debugging done by Brendan in the referenced issue 6782.
A better way to handle this, and similar issues, would probably be to completely ignore what the PDF file claims about font type/subtype, and just check the actual data. But until that kind of rewrite happens, this patch should help.

Fixes 6782.
2016-01-06 02:19:02 +01:00
Tim van der Meij
4399d01169 Merge pull request #6834 from Snuffleupagus/issue-6832
Strip `null` (\x00) characters from the URLs in LinkAnnotations (issue 6832)
2016-01-05 23:59:25 +01:00
Brendan Dahl
eb7c36beb6 Add validation for callsubr and callgsubr for type 2 charstrings. 2016-01-05 09:54:25 -08:00
Jonas Jenwald
97c10e9c08 Strip null (\x00) characters from the URLs in LinkAnnotations (issue 6832)
Apparently some PDF files can have annotations with `URI` entries ending with `null` characters, thus breaking the links.
To handle this edge-case of bad PDFs, this patch moves the already existing utility function from `ui_utils.js` into `util.js`, in order to fix those URLs.

Fixes 6832.
2016-01-04 21:55:20 +01:00
Tim van der Meij
6ef7120a04 Implement support for Highlight annotations 2016-01-01 15:31:46 +01:00
Yury Delendik
f340dd5cd5 Adds pdfjs/main_loader module to better mirror pdfjs-dist/build/pdf. 2015-12-30 13:28:57 -06:00
Tim van der Meij
34918a6666 Implement support for Squiggly annotations 2015-12-30 19:37:04 +01:00
Jonas Jenwald
d956177482 Merge pull request #6819 from timvandermeij/strikeout-annotation
Implement support for StrikeOut annotations
2015-12-30 14:44:50 +01:00
Yury Delendik
cbbb9bb82d Adds UMD header to pdf.js and pdf.worker.js files. 2015-12-29 18:15:14 -06:00
Tim van der Meij
e8db825512 Merge pull request #6771 from yurydelendik/requirejs
Removes hardcoded module loading order
2015-12-30 00:37:32 +01:00
Yury Delendik
b8e7efaaa1 Merge pull request #6821 from yurydelendik/bug951051
Bug 951051 - Better crypto key length recovery.
2015-12-29 15:35:15 -06:00
Yury Delendik
c991480687 Better crypto key length recovery. 2015-12-29 15:10:38 -06:00
Jonas Jenwald
1d1f175826 Apply Patterns, if necessary, when rendering text
Currently we're not applying Patterns for text, but only for graphics.

This patch is unfortunately not a complete solution, but rather a step on the way, since there are still some PDF files where the Patterns look more like a solid colour, rather than the intended gradient.
I've been unable to fix these issues completely, and I've not managed to determine if the remaining issues are caused either by the pattern code, the canvas code, or perhaps both.

However, given that even this simple patch improves the current situation quite a bit, I figured that it couldn't hurt to submit it as-is.

 - Fixes 5804.
 - Fixes 6130.
 - Improves 3988 a lot, since the text is now visible. However, it looks like the text is *one* solid colour, instead of the correct gradient.
 - Improves 5432, since the text is no longer gray. (This file also suffers from the same problem as the previous one.)
2015-12-29 20:02:40 +01:00
Yury Delendik
2b2c521213 Hack to avoid intermidiate Chrome failures during tests.
Remove when https://code.google.com/p/chromium/issues/detail?id=572225 is fixed.
2015-12-29 09:20:53 -06:00
Yury Delendik
fc3282db56 Adds RequireJS to worker. 2015-12-29 09:20:52 -06:00
Yury Delendik
85e95d34ed Use RequireJS in the viewer, examples and tests. 2015-12-29 09:20:52 -06:00
Tim van der Meij
c5f4b9750e Implement support for StrikeOut annotations 2015-12-29 15:09:28 +01:00
Jonas Jenwald
b32cdf5836 Merge pull request #6813 from timvandermeij/underline-annotation
Implement support for Underline annotations
2015-12-28 23:48:31 +01:00
Jonas Jenwald
85589483d6 Merge pull request #6807 from timvandermeij/popup-annotation-hidden
Ensure that hidden popups do not use any space
2015-12-28 22:58:41 +01:00
Tim van der Meij
ae329afc03 Ensure that hidden popups do not use any space 2015-12-28 18:54:10 +01:00
Tim van der Meij
edf8ccc1d8 Merge pull request #6814 from Snuffleupagus/beginAnnotations-baseTransform
Ensure that the `baseTransform` is applied when rendering annotations
2015-12-28 18:32:50 +01:00
Jonas Jenwald
2f2ea6160b Ensure that the baseTransform is applied when rendering annotations
Fixes 3350.
Fixes 5946.
Fixes 6334.
Fixes 6722.
Probably fixes 3826 (since the PDF files are no longer available, I cannot confirm it).
2015-12-28 16:02:38 +01:00
Tim van der Meij
cd28dd34fe Implement support for Underline annotations 2015-12-28 00:33:41 +01:00
Tim van der Meij
26379ddae2 Rename and reorder link annotation CSS 2015-12-27 15:51:58 +01:00
Jonas Jenwald
3c7088dc44 Do not modify data.rect in AnnotationElement_createContainer, since that will corrupt the annotation position on subsequent calls
Fixes 6804; this regressed in PR 6714.
2015-12-27 12:46:20 +01:00
Tim van der Meij
7d43971f54 Implement support for Popup annotations
Most code for Popup annotations is already present for Text annotations.
This patch extracts the popup creation logic from the Text annotation
code so it can be reused for Popup annotations.

Not only does this add support for Popup annotations, the Text
annotation code is also considerably easier. If a `Popup` entry is
available for a Text annotation, it will not be more than an image. The
popup will be handled by the Popup annotation. However, it is also
possible for Text annotations to not have a separate Popup annotation,
in which case the Text annotation handles the popup creation itself.
2015-12-25 13:17:21 +01:00
Tim van der Meij
05b9d3730a Merge pull request #6785 from yurydelendik/frameworks
Adds/modifies examples for node.js and webpack.
2015-12-21 22:44:58 +01:00
Yury Delendik
79c2f69c32 Adds/modifies examples for node.js and webpack. 2015-12-21 13:46:50 -06:00
Tim van der Meij
5b66ad626d Refactor annotation display layer code to use classes 2015-12-19 19:31:37 +01:00
Tony Jin
f9e2091c5a Fixing externalLinkTarget. isExternalLinkTargetSet was set to
wrong sharedUtil method.
2015-12-18 11:27:21 -08:00
Yury Delendik
42beb0c27b Merge pull request #6767 from tonyjin/strip-link-referrer
Strip referrer from link annotation.
2015-12-18 12:27:15 -06:00
Tony Jin
11f3deac56 Allow link rel to be customized. Defaults to 'noreferrer' 2015-12-17 10:36:53 -08:00
Tim van der Meij
df81b832bb Remove unused variables 2015-12-16 23:52:16 +01:00
Jonas Jenwald
d4c026980e Only export Uint32ArrayView when it's actually defined, to prevent breaking e.g. the Firefox addon/built-in version
*This is a follow-up to PR 6683.*
2015-12-16 11:15:36 +01:00
Brendan Dahl
f1c64b6a2a Merge pull request #6683 from yurydelendik/module-core
Adds UMD headers to core, display and shared files.
2015-12-15 17:10:21 -08:00
Xiliang Chen
8bf17f5df8 Fix incorrect position of text widget 2015-12-16 11:21:01 +13:00
Yury Delendik
b084dc09ee Allows requirejs and node load fake worker files. 2015-12-15 13:24:39 -06:00
Yury Delendik
6b60c8f4db Adds UMD headers to core, display and shared files. 2015-12-15 13:24:39 -06:00
Tim van der Meij
1b5940edd2 Merge pull request #6714 from timvandermeij/annotation-web-to-src
[api-minor] Move annotation DOM manipulation logic to src/display/annotation_layer.js
2015-12-15 19:06:48 +01:00
Tim van der Meij
edce8daeac Implement annotation layer rendering and updating in src/display/annotation_layer.js 2015-12-15 17:00:01 +01:00
Tim van der Meij
b1937e7670 Remove superfluous comments in the annotation layer code 2015-12-15 16:27:35 +01:00
Tim van der Meij
8d36aad30a Implement constants for all annotation types
Now we have a full list of all possible annotation types and the
numbering corresponds to the order in the specification. Not only is
this more consistent and complete, it also prevents having to add these
constants when a new annotation type is implemented.

Additionally fix an issue where a regular Widget annotation would not
have `data.annotationType` set. It was only set for a
TextWidgetAnnotation, but instead move it to the base Widget annotation
class to add it for all Widget annotations (since TextWidgetAnnotation
inherits from WidgetAnnotation it will have it too).
2015-12-15 15:23:55 +01:00
Tim van der Meij
f93a220736 Merge pull request #6684 from dsprenkels/issue-6296-radial-shading-size
shading-pattern: While drawing patterns, use transform to baseTransform first
2015-12-14 20:39:08 +01:00
Yury Delendik
f7ec866c39 Merge pull request #6747 from Snuffleupagus/issue-6742
Remove the superfluous `PDFJS.disableFontFace = false;` statement at the top of font_loader.js (issue 6742)
2015-12-12 12:53:56 -05:00
Jonas Jenwald
12068490ed Remove the superfluous PDFJS.disableFontFace = false; statement at the top of font_loader.js (issue 6742)
Fixes 6742.
2015-12-12 12:39:56 +01:00
Jonas Jenwald
f0e4795302 Reset the styleElement when clearing out loaded fonts (bug 1232017)
*This is a follow-up to PR 6571.*

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1232071.
2015-12-12 11:33:43 +01:00
Brendan Dahl
91b27aae46 Merge pull request #6651 from yurydelendik/fix-chars-scaling
Fix chars scaling for standard fonts. (redo of #4908)
2015-12-10 14:18:48 -05:00
Tim van der Meij
bce3214105 Move link creation logic to src/display/annotation_layer.js
Additionally simplify the div creation logic (it needs to happen only
once, so it should not be in the for-loop) and remove/rename variables
for shorter code.
2015-12-08 23:56:17 +01:00
Jonas Jenwald
ee0d522187 Use adjustWidths for TrueType fonts if we handle them as OpenType (issue 5027, issue 5084, issue 6556, bug 1204903)
In `Font_checkAndRepair` we can decide that a font isn't TrueType, and instead parse it as CFF. In that case it's quite possible that the `fontMatrix` will be changed, and without calling `adjustWidths` we're failing to update the glyph widths correctly.

Fixes 5027.
Fixes 5084.
Fixes 6556.
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1204903.
2015-12-08 00:49:22 +01:00
Jonas Jenwald
084bb8682f Merge pull request #6723 from yurydelendik/smask-transfer
Adds transfer function support for SMask.
2015-12-05 22:41:56 +01:00
Jonas Jenwald
4810b7b8fc Fix the charCodeOf method in IdentityToUnicodeMap in order to prevent text selection from breaking
After PR 6590, `font.spaceWidth` is now called in more cases than before (in `PartialEvaluator_getTextContent`), which exposed an underlying issue with `IdentityToUnicodeMap_charCodeOf` throwing an error.
This breaks text-selection in some PDF files found in the wild, hence this patch replaces the `error` with an actual function instead (modelled after `IdentityCMap_charCodeOf`).
2015-12-05 13:15:55 +01:00
Yury Delendik
15c9969abe Adds transfer function support for SMask. 2015-12-04 12:52:45 -06:00
Jonas Jenwald
e2aca385c6 Merge pull request #6720 from yurydelendik/smask-state
Fixes canvas state after smask group ends.
2015-12-03 22:19:37 +01:00
Yury Delendik
d4843ebf6d Fixes canvas state after smask group ends. 2015-12-03 14:34:12 -06:00
Brendan Dahl
87762afec4 Remove glyph id's outside the range of valid glyphs.
OTS does not like invalid glyph ids in a camp table.
2015-12-03 11:53:06 -08:00
Tim van der Meij
2dc3ee38c0 Move most rendering logic to src/display/annotation_layer.js 2015-12-03 00:20:18 +01:00
Tim van der Meij
38567ac3a3 Rename initContainer(item) to getContainer(data)
Naming it this way makes more sense when compared to the core
annoatation code. That code also uses `data` for the annotation
properties.
2015-12-02 23:35:21 +01:00
Tim van der Meij
91274d6d2d Rename annotation_helper.js to annotation_layer.js 2015-12-02 23:30:28 +01:00
Daan Sprenkels
a9081653fc shading-pattern: While drawing patterns, transform to the baseTransform first 2015-12-02 21:49:38 +01:00
Brendan Dahl
376788f2b2 Merge pull request #6698 from yurydelendik/rm-UnsupportedManager
[api-minor] Replaces UnsupportedManager with callback.
2015-12-02 10:57:44 -08:00
Jonas Jenwald
5f56a20b34 Merge pull request #6701 from timvandermeij/pdf-manager-inherit
Make use of `Util.inherit` in `src/core/pdf_manager.js`
2015-12-01 19:53:43 +01:00
Yury Delendik
4a82f2f5fd Merge pull request #6695 from Snuffleupagus/issue-6692
Ensure that `Lexer_getName` does not fail if a `Name` contains in invalid usage of the NUMBER SIGN (#) (issue 6692)
2015-12-01 10:25:57 -06:00
Jonas Jenwald
361fa83f3b Merge pull request #6699 from timvandermeij/annotation-refactoring
Improve code structure of the annotation code
2015-12-01 13:31:18 +01:00
Yury Delendik
c9cb6a3025 Replaces UnsupportedManager with callback. 2015-11-30 14:42:47 -06:00
Tim van der Meij
0c41866433 Make use of Util.inherit in src/core/pdf_manager.js
While we are here, fix some incorrect function names.
2015-11-29 00:58:19 +01:00
Tim van der Meij
8b79becad6 Improve code structure of the annotation code
This patch improves the code structure of the annotation code.

- Create the annotation border style object in the `setBorderStyle` method instead of in the constructor. The behavior is the same as the `setBorderStyle` method is always called, thus a border style object is still always available.
- Put all data object manipulation lines in one block in the constructor. This improves readability and maintainability as it is more visible which properties are exposed.
- Simplify `appendToOperatorList` by removing the promise capability and removing an unused parameter.
- Remove some unnecessary newlines/spaces.
2015-11-29 00:04:21 +01:00
Jonas Jenwald
e8e79029b1 Replace font.bindDOM() with font.createFontFaceRule() in the FontLoader for MOZCENTRAL specific code (PR 6571 follow-up)
This is a follow-up to PR 6571, and fixes font loading in the `MOZCENTRAL` version.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1228603#c2.
2015-11-28 21:33:07 +01:00
Jonas Jenwald
995e1a45b8 Ensure that Lexer_getName does not fail if a Name contains in invalid usage of the NUMBER SIGN (#) (issue 6692)
*This is a regression from PR 3424.*

The PDF file in the referenced issue is using `Type3` fonts. In one of those, the `/CharProcs` dictionary contains an entry with the name `/#`. Before the changes to `Lexer_getName` in PR 3424, we were allowing certain invalid `Name` patterns containing the NUMBER SIGN (#).

It's unfortunate that this has been broken for close to two and a half years before the bug surfaced, but it should at least indicate that this is not a widespread issue.

Fixes 6692.
2015-11-28 11:59:09 +01:00
Yury Delendik
e4e69e2f05 Set error font for Type3 if its loading failed. 2015-11-27 13:05:51 -06:00
Yury Delendik
8dff301ce1 Worker shall wait for MessageHandler to be created at api side. 2015-11-25 18:21:23 -06:00
Jonas Jenwald
6dfe53b976 [api-minor] Add a parameter to PDFPageProxy_getTextContent that enables replacing of all whitespace with standard spaces in the textLayer (issue 6612)
This patch goes a bit further than issue 6612 requires, and replaces all kinds of whitespace with standard spaces.

When testing this locally, it actually seemed to slightly improve two existing test-cases (`tracemonkey-text` and `taro-text`).

Fixes 6612.
2015-11-25 17:28:40 +01:00
Tim van der Meij
c2dfe9e9a9 Merge pull request #6571 from yurydelendik/worker
[api-minor] Allows a worker to handle multiple documents.
2015-11-24 22:42:44 +01:00
Tim van der Meij
c280fb516d Merge pull request #6679 from Snuffleupagus/better-openExternalLinksInNewWindow-deprecation-comments
Improve the comment and deprecation warning for `PDFJS.openExternalLinksInNewWindow`
2015-11-24 22:38:43 +01:00
Yury Delendik
06c1904675 Refactors FontLoader to group fonts per document. 2015-11-24 13:27:22 -06:00
Yury Delendik
09772e1e15 Creates PDFWorker, separates fetchDocument from transport. 2015-11-24 13:27:22 -06:00
Yury Delendik
acdd49f480 Adds peer communication between MessageHandlers. 2015-11-24 12:16:58 -06:00
Yury Delendik
4b243cdd89 Merge pull request #6675 from Snuffleupagus/getAnnotations-intent
[api-minor] Let `getAnnotations` fetch all annotations by default, unless an intent is specified
2015-11-24 12:11:51 -06:00
Jonas Jenwald
27c8e1e22f Improve the comment and deprecation warning for PDFJS.openExternalLinksInNewWindow
This patch:
 - Updates the JSDoc comment in `api.js`, to more clearly point out that `PDFJS.openExternalLinksInNewWindow` is deprecated, and explains what to use instead.

 - Changes the `warn`, in `isExternalLinkTargetSet()`, to use the new `deprecated` function instead. Also updates the message with more detailed information about what to use instead.

 - Changes the pre-processor tag to ensure the deprecation warning is seen in all build types where it could possibly matter (in case people are using `PDFJS.openExternalLinksInNewWindow` in e.g. custom-built extensions).

These changes were prompted by seeing http://stackoverflow.com/questions/33813373/pdf-js-how-to-open-hyperlinks-in-a-new-tab-window, since it seems to me that the current comments/warnings might not be worded well enough.
2015-11-23 13:12:23 +01:00
Jonas Jenwald
a2a5d36d5b Restore the data.annotationFlags parameter for annotations (PR 6672 follow-up) 2015-11-23 10:17:11 +01:00
Jonas Jenwald
b05652ca97 [api-minor] Let getAnnotations fetch all annotations by default, unless an intent is specified
Currently `getAnnotations` will *only* fetch annotations that are either `viewable` or `printable`. This is "hidden" inside the `core.js` file, meaning that API consumers might be confused as to why they are not recieving *all* the annotations present for a page.

I thus think that the API should, by default, return *all* available annotations unless specifically told otherwise. In e.g. the default viewer, we obviously only want to display annotations that are `viewable`, hence this patch adds an `intent` parameter to `getAnnotations` that makes it possible to decide if only `viewable` or `printable` annotations should be fetched.
2015-11-22 15:51:37 +01:00
Yury Delendik
0029000c9f Merge pull request #6671 from Snuffleupagus/make-stripCommentHeaders-less-gready
Make `stripCommentHeaders` less greedy, to ensure that it doesn't eat 'use strict' directive at the top of files (PR 6627 follow-up)
2015-11-22 07:24:20 -06:00
Tim van der Meij
0991c06395 Refactor annotation flags code
This patch makes it possible to set and get all possible flags that the PDF specification defines. Even though we do not support all possible annotation types and not all possible annotation flags yet, this general framework makes it easy to access all flags for each annotation such that annotation type implementations can use this information.

We add constants for all possible annotation flags such that we do not need to hardcode the flags in the code anymore. The `isViewable()` and `isPrintable()` methods are now easier to read. Additionally, unit tests have been added to ensure correct behavior.

This is another part of #5218.
2015-11-22 01:06:37 +01:00
Jonas Jenwald
373da010ac Move the globals comments in bidi.js and metadata.js to after the Copyright comments 2015-11-21 18:43:08 +01:00
Yury Delendik
56ccaea99b Move text layer building logic into src/display/text_layer.js 2015-11-19 10:50:27 -06:00
Yury Delendik
2f34fd46cb Move CustomStyle. 2015-11-19 10:47:17 -06:00
Yury Delendik
194994a289 Merge pull request #6551 from yurydelendik/subaa
[api-minor] Enables subpixel anti-aliasing for most of the content.
2015-11-17 19:45:32 -06:00
Yury Delendik
8200f099a3 Fix chars scaling for standard fonts. 2015-11-17 13:21:27 -06:00
Yury Delendik
2f1a626d6a Merge pull request #6640 from dsprenkels/issue-6006-radial-gradient-size
Apply transformation matrix to RadialGradient radiuses
2015-11-17 11:40:13 -06:00
Daan Sprenkels
6ce83d3290 apply transformation matrix to RadialGradient radiuses,
not only to circle origin points
fix for #6006
2015-11-17 00:20:42 +01:00
Yury Delendik
1d8800370a Allow subpixel anti-aliasing for most of the content. 2015-11-16 10:50:02 -06:00
Manas
a2ba1b8189 Uses editorconfig to maintain consistent coding styles
Removes the following as they unnecessary
/* -*- Mode: Java; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
/* vim: set shiftwidth=2 tabstop=2 autoindent cindent expandtab: */
2015-11-14 07:32:18 +05:30
Jonas Jenwald
50a70429ec Ignore the /Mask entry in images unless its /ImageMask entry is explicitly set to true (issue 6621)
Fixes 6621.
2015-11-12 22:49:26 +01:00
Yury Delendik
7381ff9523 Merge pull request #6599 from prometheansacrifice/generate-better-api-docs
Generate better API documentation
2015-11-12 14:26:18 -06:00
Manas
dbcb46c8de Uses @alias to fix missing comments on JSDocs pages 2015-11-13 01:24:15 +05:30
Yury Delendik
3c6df26704 Merge pull request #6608 from Rob--W/improved-error-message-local-file
Improve error message for non-existent local files
2015-11-09 15:40:41 -06:00
Rob Wu
c604cc22d1 Improve error message for non-existent local files
I received multiple reports about the following cryptic error in the
Chrome extension when the user tried to open a local file:

> PDF.js v1.1.527 (build: 2096a2a)
> Message: Cannot read property 'Symbol(Symbol.iterator)' of null

This error most likely originated from core/stream.js:

    function Stream(arrayBuffer, start, length, dict) {
      this.bytes = (arrayBuffer instanceof Uint8Array ?
                    arrayBuffer : new Uint8Array(arrayBuffer));
                                                 ^^^^^^^^^^^
`arrayBuffer` is `null`, and that in turn is caused by the fact that
for non-existing files, there is no data. I've applied two fixes:

1. Never call onDone with a void buffer, but call the error handler
   instead.
2. Show a sensible error message for local files with status = 0.
2015-11-08 18:03:28 +01:00
Jonas Jenwald
ff64ef0243 Prevent readCmapTable from failing if the cmap is missing in TrueType fonts
Fixes http://arrow.dit.ie/cgi/viewcontent.cgi?article=1000&context=aaschadpoth#page=3.
2015-11-08 16:48:37 +01:00
Yury Delendik
bb29e13307 Merge pull request #6601 from yurydelendik/ascent
Fixes incorrect PDF file font metrics.
2015-11-06 20:16:04 -06:00
Brendan Dahl
9a830a7b62 Merge pull request #6590 from yurydelendik/combinechars
Combines standalone chars into text groups.
2015-11-06 15:06:41 -08:00
Yury Delendik
cc5bc18728 Fixes incorrect PDF file font metrics. 2015-11-06 14:47:10 -06:00
Yury Delendik
fa423cfab0 Refactors fake space heuristics for speed. 2015-11-06 10:55:43 -06:00
Yury Delendik
376f8bde14 Combines standalone divs into text groups. 2015-11-06 10:20:49 -06:00
Yury Delendik
601d29b14e Fixes all examples to require workerSrc to be set. 2015-11-06 07:50:21 -06:00
Yury Delendik
28d340679a Uses document.currentScript for pdf.worker.js path. 2015-11-06 07:50:21 -06:00
Yury Delendik
fa46b73c47 Better spacing in text layer. 2015-11-02 08:54:15 -06:00
Brendan Dahl
b56b41514c Merge pull request #6578 from yurydelendik/issue6577
Ignore any pending data when worker is terminated.
2015-10-29 11:49:40 -07:00
Yury Delendik
8d15ecb14b Ignore any pending data when worker is terminated. 2015-10-29 13:06:22 -05:00
Yury Delendik
d26ef21d52 Merge pull request #6568 from tonyjin/api-rangeChunkSize
[api-minor] Add an optional param to DocumentInitParameters for speci…
2015-10-28 16:52:52 -05:00
Tony Jin
ef667823dd [api-minor] Add an optional param to DocumentInitParameters for specifying the range request chunk size to use. Defaults to 2^16 = 65536. 2015-10-26 17:22:11 -07:00
Jonas Jenwald
1c66d4a106 Add a totalLength getter to OperatorList, since the length is zero after flushing
In the `RenderPageRequest` handler in `worker.js`, we attempt to print an `info` message containing the rendering time and the length of the operator list. The latter is currently broken (and has been for quite some time), since the `length` of an `OperatorList` is reset when flushing occurs.
This patch attempts to rectify this, by adding a getter which keeps track of the total length.
2015-10-26 18:12:14 +01:00
Jonas Jenwald
5bd95df427 Prevent TypeError: page is undefined when the document has been destroyed (PR 6546 follow-up)
*Follow-up to PR 6546.*

If rendering has already started when the document is destroyed, then `this.pageCache[data.pageIndex]` may already have been cleared when the `StartRenderingPage`/`RenderPageChunk` messages are recieved in `api.js`, which results in `TypeError`s being thrown.
2015-10-23 22:16:34 +02:00
Yury Delendik
5135aa9bec Adds deprecation warning for the API calls. 2015-10-23 09:06:32 -05:00
Yury Delendik
58c3ea0820 Adds thread abort capabilities. 2015-10-23 09:06:32 -05:00
Yury Delendik
59c13b32aa Adds destroy method to the document loading task.
Also renames PDFPageProxy.destroy method to cleanup.
2015-10-23 08:57:14 -05:00
Jonas Jenwald
487ba9065a Fail gracefully, and with a notification, if paintXObject is encountered in canvas.js
We should never actually try to execute `paintXObject` in canvas.js, but in some cases where we fail to parse the PDF file correctly it can happen. Currently this will potentially cause an entire page to fail to render, which seems suboptimal.
With this patch, we will instead continue rendering with a warning that things might not work correctly.
2015-10-21 21:30:59 +02:00
Jonas Jenwald
2e751199fb Prevent getOperatorList from failing to correctly parse OPS.paintXObject for TilingPatterns that are missing some /Resources entries (issue 6541)
Fixes 6541.
2015-10-21 21:30:56 +02:00
Rob Wu
50ff2d4c2a Ignore operators that are known to be unsupported
`operatorList.addOp` adds the arguments to the list which is then
passed as-is by postMessage to the main thread. But since we don't
parse these operations, they are raw PDF objects and may therefore
cause a serialization error.

This is a conservative patch, and only affects operators which are
known to be unsupported. We should ignore all unknown operators,
but I haven't really looked into the consequences of doing that.

Fixes #6549
2015-10-21 15:39:25 +02:00
Brendan Dahl
e4f0e6f2a0 Merge pull request #6531 from covlllp/new_merge
Fixes bluebeam password protection issue
2015-10-16 13:47:06 -07:00
Colin VanLang
6d8e883fe6 Fixes bluebeam password protection issue 2015-10-15 21:22:27 -04:00
Jonas Jenwald
49883439a5 Ensure that Dict_getArray doesn't fail if xref in undefined (PR 6485 follow-up)
In PR 6485 I somehow missed to account for the case where `xref` is undefined. Since a dictonary can be initialized without providing a reference to an `xref` instance, `Dict_getArray` can thus fail without this added check.
2015-10-15 11:47:07 +02:00
Jonas Jenwald
9ab896e307 [api-minor] Add an option to PDFJS for specifying the |target| attribute of external links
Replaces `PDFJS.openExternalLinksInNewWindow` with a more generic configuration option.
*Note:* `PDFJS.openExternalLinksInNewWindow = true;` is equal to `PDFJS.externalLinkTarget = PDFJS.LinkTarget.BLANK;`.
2015-10-13 21:52:00 +02:00
Brendan Dahl
3eaeacfe19 Merge pull request #6476 from Snuffleupagus/PartialEvaluator_readToUnicode-cmap-length
Right-size the `map` array in PartialEvaluator_readToUnicode
2015-10-09 10:31:28 -07:00
Tim van der Meij
5e4910f7b6 Merge pull request #6491 from Snuffleupagus/check-trailer-if-xref-missing
Make `XRef_indexObjects` even more robust against bad PDF files, by checking for the existence of 'trailer' if 'xref' is not found
2015-10-04 16:00:00 +02:00
Tim van der Meij
dd9d0b8770 Merge pull request #5480 from CodingFabian/issue-5458
Remove TryCatch in canvas for EvenOdd winding rule.
2015-10-04 15:31:34 +02:00
Jonas Jenwald
9b12c64be5 Cache the regular expression used for finding objs in XRef_indexObjects, to avoid unnecessary allocations 2015-10-02 12:46:58 +02:00
Jonas Jenwald
192907e0d2 Make XRef_indexObjects even more robust against bad PDF files, by checking for the existence of 'trailer' if 'xref' is not found
Fixes http://www.cyjack.com/cognition/Terence%20McKenna%20-%20Lectures%20on%20Alchemy.pdf.
2015-10-01 15:01:25 +02:00
Tim van der Meij
1bdfc47de8 Merge pull request #6411 from Snuffleupagus/remove-Parser_fetchIfRef
Remove `Parser_fetchIfRef` since it's obsolete
2015-09-30 00:38:35 +02:00
Jonas Jenwald
1b8cb52555 Prevent PartialEvaluator_buildFormXObject from failing if the Matrix or BBox contains indirect objects
This patch fixes yet another instance of bad PDF data, specifically a case where the `BBox` array contains indirect objects (i.e. `Ref`s).

Fixes the missing image in http://www.int.washington.edu/talks/WorkShops/int_08_37W/People/Franz_M/Franz.pdf#page=24. *Note:* There are missing images on a number of the pages in that file.
2015-09-29 10:11:49 +02:00
Jonas Jenwald
75557d27d1 Add getArray method to Dict
This method extend `get`, and will fetch all indirect objects (i.e. `Ref`s) when the result is an `Array`.
2015-09-29 10:11:47 +02:00
Jonas Jenwald
9eab463b6d Ensure that the baseTransform is always defined for TilingPatterns
Fixes http://www2.emersonprocess.com/siteadmincenter/PM%20Micro%20Motion%20Documents/High-Pressure-Measurement-WP-001287.pdf#page=3.
2015-09-27 22:49:34 +02:00
Jonas Jenwald
8d831449ab Right-size the map array in PartialEvaluator_readToUnicode
We can avoid a lot of intermediate resizings, by directly allocating the required number of elements for the `map` array.
2015-09-24 13:08:53 +02:00
Fabian Lange
2564827503 Fix text spacing with vertical fonts (#6387)
According to the PDF spec 5.3.2, a positive value means in horizontal,
that the next glyph is further to the left (so narrower), and in
vertical that it is further down (so wider).
This change fixes the way PDF.js has interpreted the value.
2015-09-15 09:28:45 +02:00
Tim van der Meij
12b0b9744b Merge pull request #6427 from Snuffleupagus/slightly-more-robust-get-fingerprint
Make `get fingerprint` slightly more robust against corrupt PDF files
2015-09-10 22:07:44 +02:00
Jonas Jenwald
5853553455 Make get fingerprint slightly more robust against corrupt PDF files
This patch adjusts `get fingerprint` to also check that the `/ID` entry contains (non-empty) strings, to prevent more possible failures when loading corrupt PDF files (follow-up to PR 5602).

Note that I've not actually encountered such a PDF file in the wild. However given that `stringToBytes` will assert that the input is a string, and that we'll thus fail to load a document unless `get fingerprint` succeeds, making this more robust seems like a good idea to me.
2015-09-08 13:42:53 +02:00
Jonas Jenwald
29a1cdb6a6 Only choose a (3, 1) cmap table for TrueType fonts that have an encoding specified (issue 6410)
For (1, 0) cmaps, we have two different codepaths depending on whether the font has/hasn't got an encoding. But with (3, 1) cmaps we don't have a good fallback when the encoding is missing, hence this patch changes `readCmapTable` to only choose a (3, 1) cmap table if the font is non-symbolic *and* an encoding exists. Without this, we'll not be able to successfully create a working glyph map for some TrueType fonts with (3, 1) cmap tables.

Fixes 6410.
2015-09-07 16:56:05 +02:00
Fabian Lange
063ca95f5f Remove TryCatch in canvas fill
As verified by @Rob--W, the evenodd fill rule works correctly in all supported browsers. This now allows optimization by JS engines.

This fixes #5458
2015-09-05 11:10:51 +02:00
Brendan Dahl
238e16feeb Merge pull request #6407 from Snuffleupagus/bug-1200096
Fallback in `readCmapTable`, instead of using `error`, for TrueType fonts with unsupported cmap formats (bug 1200096)
2015-09-04 18:10:34 -07:00
Jonas Jenwald
cfd5a64df5 Ensure that the clipping path is reset when the state is restored (issue 6413)
According to the specification, see `NOTE 2` in http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G7.3882161, it appears that we should ensure that the clipping path is reset when the restore (`Q`) operator is encountered.

Fixes 6413.
2015-09-03 17:35:32 +02:00
Jonas Jenwald
b1d148a4aa Remove Parser_fetchIfRef since it's obsolete
This code was added in PR 1214, but was made obsolete by PRs 1488/1493. Prior to the latter ones, `Dict_get` retured the raw objects. However, afterwards (and currently) `Dict_get` now resolves indirect objects, which makes `Parser_fetchIfRef` redundant.

*Potential risks with this patch:*
This patch passes all tests locally, but there's a *small* possibility that it could break some weird PDF files.
In the current code, wrapping `Dict_get` inside `Parser_fetchIfRef` will potentially mean two back-to-back call of `XRef_fetch`, if a reference points directly to another reference. I'm not sure if this can actually happen in practice, and I'd think that if that were the case we'd already have run into it elsewhere in the code-base, given that `Parser` is the only place where we try to "double" resolve references.
2015-09-02 23:11:00 +02:00
Jonas Jenwald
0fb31a4a9e Fallback in readCmapTable, instead of using error, for TrueType fonts with unsupported cmap formats (bug 1200096)
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1200096.

The problematic font has a `format 2` cmap, which we've never supported properly. Prior to PR 2606, we were able to fallback to a working state, despite not having proper support for that cmap format.

Obviously the best/correct solution would be to implement actual support for more cmap formats[1]. However, I'm hoping that a simple patch will be OK for now, given that:
 - `format 2` cmaps seem to be quite rare in practice, since this has been broken for 2.5 years before anyone noticed.
 - Having a simple patch will make potential uplifts a lot easier.

[1] See the specification at https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6cmap.html
2015-09-01 14:01:19 +02:00
Tim van der Meij
0020f33873 Merge pull request #6357 from Snuffleupagus/bidi-result
Avoid more allocations for RTL text in bidi.js
2015-09-01 00:44:33 +02:00
Tim van der Meij
b42b894570 Merge pull request #6386 from Snuffleupagus/Parser_makeFilter-warn-on-empty-stream
Add a warning when we encounter an empty stream in `Parser_makeFilter`
2015-08-30 23:14:22 +02:00
Rob Wu
582573b96b Merge pull request #6358 from Snuffleupagus/Parser_tryShift-missingDataException
Don't catch `MissingDataException` in `Parser_tryShift`
2015-08-27 14:46:24 +02:00
Jonas Jenwald
f814fdc215 Add a warning when we encounter an empty stream in Parser_makeFilter
Having a warning here would have meant that issue 6360 could have been solved in approximately five minutes, instead of an hour. To avoid that happening again, this patch adds a warning whenever we treat a stream as empty.
2015-08-26 20:14:30 +02:00
Brendan Dahl
88e0326787 Merge pull request #6337 from Snuffleupagus/issue-6336
Adjust which TrueType (3, 1) glyphs we attempt to skip mapping of (issue 6336)
2015-08-25 09:49:46 -07:00
Jonas Jenwald
56a43a3181 Make XRef_indexObjects more robust against bad PDF files (issue 5752)
This patch improves the detection of `xref` in files where it is followed by an arbitrary whitespace character (not just a line-breaking char).
It also adds a check for missing whitespace, e.g. `1 0 obj<<`, to speed up `readToken` for the PDF file in the referenced issue.
Finally, the patch also replaces a bunch of magic numbers with suitably named constants.

Fixes 5752.

Also improves 6243, but there are still issues.
2015-08-21 20:33:02 +02:00
Yury Delendik
23cb01c8af Merge pull request #6372 from Snuffleupagus/issue-6360
Also check `maybeLength` when deciding if a stream is empty in `Parser_makeFilter` (issue 6360, bug 1191694)
2015-08-20 17:43:11 -05:00
Jonas Jenwald
5128603f64 Also check maybeLength when deciding if a stream is empty in Parser_makeFilter (issue 6360)
The problem with the PDF files in the issue, besides the obviously broken XRef tables which we're able to recover from, is that many/most of the streams have Dictionaries where the `Length` entry is set to `0`. This causes us to return `NullStream`, instead of the appropriate one in `Parser_makeFilter`.

Fixes 6360.
2015-08-20 23:04:18 +02:00
Yury Delendik
b11bc727c2 Merge pull request #6370 from castevinz/fix-getdocument-pdfBytes-check
api/getDocument: handle ArrayBuffer check for PDF binary data (byteLength)
2015-08-20 07:01:53 -05:00
Vincent Castelain
0cd4cc4e80 api/getDocument : handle ArrayBuffer check for PDF binary data (byteLength) 2015-08-20 08:56:05 +02:00
Jonas Jenwald
ede5235d3d Merge pull request #6332 from Rob--W/postMessage-error
Serialize errors before invoking postMessage
2015-08-19 12:00:02 +02:00
Yury Delendik
c56dc9a093 Merge pull request #6141 from skalnik/fix-font-csp-issues
Provide a fallback for font rendering when not allowed to use `eval`
2015-08-18 18:50:11 -05:00
Jonas Jenwald
3fa5f6cc3b Only take the fast-path in PDFImage_createImageData for un-masked JPEG images with "standard" colour spaces (issue 6364)
Fixes 6364.
2015-08-18 22:25:37 +02:00