Commit Graph

187 Commits

Author SHA1 Message Date
Jonas Jenwald
2b79782377 [api-minor] Add basic support for Launch actions (issue 1778, issue 3897, issue 6616)
In general we neither want, nor can, support arbitrary `Launch` actions. But in practice, all the cases we've seen so far just contains relative URLs to other PDF files. Building on PR 7689, we can thus at least support basic `Launch` actions.
2016-10-21 13:40:32 +02:00
Jonas Jenwald
d284cfd5eb [api-minor] Add support for relative URLs, in both annotations and the outline, by adding a docBaseUrl parameter to PDFJS.getDocument (bug 766086)
Note that in `FIREFOX/MOZCENTRAL/CHROME` builds of the standard viewer the `docBaseUrl` parameter will be set by default, since in that case it makes sense to use the current URL as a base.
For the `GENERIC` viewer, or the API itself, it doesn't make sense to try and set the `docBaseUrl` by default. However, custom deployments/implementations may still find the parameter useful.
2016-10-19 22:20:24 +02:00
Jonas Jenwald
71a781ee5c Deprecate the isValidUrl utility function and replace it with createValidAbsoluteUrl/isValidProtocal functions instead, since the main URL validation is now done using the new URL constructor 2016-10-19 22:11:22 +02:00
Jonas Jenwald
42f07c6262 [api-minor] Use the new URL constructor when validating URLs in annotations and the outline, as a complement to only checking the protocol, and add a bit more validation to Catalog_parseDestDictionary
Note that this will automatically reject any relative URL.
To make the API more useful to consumers, URLs that are rejected will be available via the `unsafeUrl` property in the data object returned by `PDFPageProxy_getAnnotations`.

The patch also adds a bit more validation of the data for `Named` actions.
2016-10-19 22:11:17 +02:00
Jonas Jenwald
e64bc1fd13 Move parsing of destination dictionaries to a helper function
This not only reduces code duplication, but it also allow us to easily support the same kind of URLs we currently do for Link annotations in the Outline as well.
2016-10-18 16:14:07 +02:00
Jonas Jenwald
3e77cf6b32 Prevent an infinite loop in XRef_fetchUncompressed for encrypted PDF files with indirect objects in the /Encrypt dictionary (issue 7665) 2016-09-25 00:18:47 +02:00
Tim van der Meij
b4c8814fc9 Merge pull request #7534 from Snuffleupagus/isName-name-check
Add a parameter to the `isName` function that enables checking not just that something is a `Name`, but also that the actual `name` properties matches
2016-08-17 15:48:42 +02:00
Jonas Jenwald
544d29f5cb Add a recoveryMode that suppresses errors from the Parser, and utilize it when searching for the main trailer in XRef_indexObjects (bug 1250079)
Instead of having `Parser_getObj` fail unconditionally for the referenced PDF file, this patch attempts to let searching for the main trailer continue even if there are errors.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1250079.
2016-08-17 12:37:35 +02:00
Jonas Jenwald
83ce6f0b6d Adjust the (applicable) existing isName callsites to use the new isName(v, name) version of the function 2016-08-10 11:15:08 +02:00
Jonas Jenwald
01ab15a6f1 [api-minor] Let Catalog_getPageIndex check that the Ref actually points to a /Page dictionary
Currently the `getPageIndex` method will happily return `0`, even if the `Ref` parameter doesn't actually point to a proper /Page dictionary.
Having the API trust that the consumer is doing the right thing seems error-prone, hence this patch which adds a check for this case.

Given that the `Catalog_getPageIndex` method isn't used in any hot part of the codebase, this extra check shouldn't be a problem.
(Note: in the standard viewer, it is only ever used from `PDFLinkService_navigateTo` if a destination needs to be resolved during document loading, which isn't common enough to be an issue IMHO.)
2016-05-21 14:13:41 +02:00
Tim van der Meij
c1c199d702 Merge pull request #7295 from Snuffleupagus/core-getArray
Use `Dict_getArray` in more places in `src/core/` to avoid issues when Arrays contain indirect objects
2016-05-10 23:21:54 +02:00
Jonas Jenwald
182d33800a Ignore 'endobj' commands inside of ObjStm streams (issue 5241, bug 898610, bug 1037816)
According to an example in the PDF specification, see http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#page=56, an `ObjStm` stream should not contain 'endobj' commands.

Fixes 5241.
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=898610.
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1037816.
2016-05-09 09:50:45 +02:00
Jonas Jenwald
6111c17c8a Use Dict_getArray in more places in src/core/ to avoid issues when Arrays contain indirect objects
As evident from e.g. PRs 6485 and 7118, some bad PDF generators unfortunately create Arrays where *some* elements are indirect objects (i.e. `Ref`s). This seems to mostly affect Arrays that contain numbers, such as e.g. `Matrix/FontMatrix/BBox/FontBBox/Rect/Color/...`, and has manifested itself in PDF files that fail to render correctly (some elements are missing).

The problem in both the cases above, besides broken rendering, was that there were *no* errors/warnings that indicated what the problem was, making it difficult to pinpoint the issue.
Hence this patch, where I've audited all usages of `Dict_get` in `src/core/` files, and replaced it with `Dict_getArray` where appropriate to try and prevent unnecessary future bugs.
2016-05-05 19:42:57 +02:00
Jonas Jenwald
e281ef15db Adjust incorrect first obj number of "free" xref entry in XRef_readXRefTable (issue 7229)
Fixes 7229.
2016-04-21 16:36:32 +02:00
Jonas Jenwald
41efb92d3a Merge pull request #6988 from timvandermeij/fileattachment-annotation
Implement support for FileAttachment annotations
2016-02-24 12:58:06 +01:00
Tim van der Meij
6a33dfd13a Implement support for FileAttachment annotations 2016-02-23 22:49:53 +01:00
Jonas Jenwald
7cf9de2c17 [api-minor] Change getOutline to actually return the RGB color of outline items
Currently the `C` entry in an outline item is returned as is, which is neither particularly useful nor what the API documentation claims.

This patch also adds unit-tests for both the color handling, and the `F` entry (bold/italic flags).
2016-02-15 13:41:22 +01:00
Jonas Jenwald
98db068079 Reduce the overall indentation level in Catalog_readDocumentOutline, by using early returns, in order to improve readability 2016-02-14 11:38:43 +01:00
Yury Delendik
825a2225ab Merge pull request #6915 from yurydelendik/lookuptables
Refactor lookup hash tables/objects
2016-01-28 15:01:06 -06:00
Yury Delendik
2edf2792dc Replaces literal {} created lookup tables with Object.create 2016-01-28 12:18:38 -06:00
Jonas Jenwald
1140a34f5c [api-minor] Change getPageLabels to always return the pageLabels, even if they are identical to standard page numbering 2016-01-27 13:36:03 +01:00
Jonas Jenwald
85cf90643f [api-minor] Add support for PageLabels in the API 2016-01-19 22:49:04 +01:00
Jonas Jenwald
8ad18959d7 Add support for NumberTree 2016-01-19 22:47:45 +01:00
Jonas Jenwald
0030a82dc3 [api-minor] Add support for URLs in the document outline
Re: issue 5089.
(Note that since there are other outline features that we currently don't support, e.g. bold/italic text and custom colours, I thus think we can keep the referenced issue open.)
2016-01-19 21:36:27 +01:00
Yury Delendik
6b60c8f4db Adds UMD headers to core, display and shared files. 2015-12-15 13:24:39 -06:00
Manas
a2ba1b8189 Uses editorconfig to maintain consistent coding styles
Removes the following as they unnecessary
/* -*- Mode: Java; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
/* vim: set shiftwidth=2 tabstop=2 autoindent cindent expandtab: */
2015-11-14 07:32:18 +05:30
Yury Delendik
59c13b32aa Adds destroy method to the document loading task.
Also renames PDFPageProxy.destroy method to cleanup.
2015-10-23 08:57:14 -05:00
Jonas Jenwald
49883439a5 Ensure that Dict_getArray doesn't fail if xref in undefined (PR 6485 follow-up)
In PR 6485 I somehow missed to account for the case where `xref` is undefined. Since a dictonary can be initialized without providing a reference to an `xref` instance, `Dict_getArray` can thus fail without this added check.
2015-10-15 11:47:07 +02:00
Jonas Jenwald
9b12c64be5 Cache the regular expression used for finding objs in XRef_indexObjects, to avoid unnecessary allocations 2015-10-02 12:46:58 +02:00
Jonas Jenwald
192907e0d2 Make XRef_indexObjects even more robust against bad PDF files, by checking for the existence of 'trailer' if 'xref' is not found
Fixes http://www.cyjack.com/cognition/Terence%20McKenna%20-%20Lectures%20on%20Alchemy.pdf.
2015-10-01 15:01:25 +02:00
Jonas Jenwald
75557d27d1 Add getArray method to Dict
This method extend `get`, and will fetch all indirect objects (i.e. `Ref`s) when the result is an `Array`.
2015-09-29 10:11:47 +02:00
Jonas Jenwald
56a43a3181 Make XRef_indexObjects more robust against bad PDF files (issue 5752)
This patch improves the detection of `xref` in files where it is followed by an arbitrary whitespace character (not just a line-breaking char).
It also adds a check for missing whitespace, e.g. `1 0 obj<<`, to speed up `readToken` for the PDF file in the referenced issue.
Finally, the patch also replaces a bunch of magic numbers with suitably named constants.

Fixes 5752.

Also improves 6243, but there are still issues.
2015-08-21 20:33:02 +02:00
Rob Wu
c676ecb5a0 Detect scripted auto-print requests
Fixes #6106

To avoid future regressions, two new unit tests were added:
1. A new PDF based on the report from #6106, which contains an
   OpenAction of type JavaScript and a string "this.print({...}".
2. An existing PDF from https://bugzil.la/1001080 (from #4698).

Although it does not matter, since we don't execute the JavaScript code,
I have also changed "print(true)" to "print({})" since the print method
takes an object (not a boolean). See "Printing PDF documents", page 62:
http://adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/js_developer_guide.pdf
2015-07-20 18:25:02 +02:00
Jonas Jenwald
28f40b1b58 Fetch all indirect objects (i.e. Refs) in NameTree_getAll and NameTree_get (issue 6204) 2015-07-14 10:56:56 +02:00
Tim van der Meij
1416a1b521 Merge pull request #6187 from Snuffleupagus/more-efficient-getDestination
A couple of improvements of `getDestination` (unit-test included)
2015-07-13 23:03:13 +02:00
Rob Wu
fd29bb0c57 Subtract start offset for xrefs in recovery mode
Xref offsets are relative to the start of the PDF data, not to the start
of the PDF file. This is clear if you look at the other code:

- In the XRef's readXRefTable and processXRefTable methods of XRef, the
  offset of a xref entry is set to the bytes as given by a PDF file.
  These values are always relative to the start of the PDF file (%PDF-).

- The XRef's readXRef method adds the start offset of the stream to
  Xref entry's offset: "stream.pos = startXRef + stream.start".
  Clearly, this line assumes that the entry offset excludes the start
  offset.

However, when the PDF is parsed in recovery mode, the xref table is
filled with entries whose offset is relative to the start of the stream
rather than the PDF file. This is incorrect, and the fix is to subtract
the start offset of the stream from the entry's byte offset.

The manually created PDF file serves as a regression test. It is a valid
PDF, except:
- The integer to point to the start of the xref table and the %%EOF
  trailer are missing. This will activate recovery mode in PDF.js
- Some junk was added before the start of the PDF file. This exposes the
  bad offset bug.
2015-07-10 23:33:10 +02:00
Jonas Jenwald
7df78f997e Slightly more efficient getDestination
For named destinations that are contained in a `Dict`, as opposed to a `NameTree`, we currently iterate through the *entire* dictionary just to fetch *one* destination.
This code appears to simply have been copy-pasted from the `get destinations` method, but in its current form it's quite unnecessary/inefficient since can just get the required destination directly instead.
2015-07-08 18:31:51 +02:00
Jonas Jenwald
940bedf75f Add a unit-test that attempts to fetch a non-existent named destination
Doing this helped uncover an issue with the `getDestination` implementation.
Currently if a named destination doesn't exist, the method (in `obj.js`) may return `undefined` which leads to the promise being stuck in a pending state.
*Note:* returning `null` for this case is consistent with other methods, e.g. `getOutline` and `getAttachments`.
2015-07-07 22:05:08 +02:00
Jonas Jenwald
a28ed7c834 Always traverse the entire parent chain in Page_getInheritedPageProp (issue 5954)
This enables us to find resources placed on multiple levels of the tree.

Fixes 5954.
2015-05-30 12:21:05 +02:00
Tim van der Meij
d484ebd492 Merge pull request #5910 from jordan-thoms/fix-concatenated-files
Fix error reading concatenated pdfs
2015-05-13 22:40:55 +02:00
Jonas Jenwald
760222cf0b Handle the Encoding being a dictionary in PartialEvaluator_preEvaluateFont (bug 1157493)
*This is a regression from PR 4423.*

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1157493.
2015-04-25 16:48:14 +02:00
Brendan Dahl
846eb967cc Merge pull request #5655 from Snuffleupagus/issue-5644
Avoid getting stuck in empty nodes in the Pages tree when calling |Catalog_getPageDict| (issue 5644)
2015-04-20 11:46:27 -07:00
Jordan Thoms
d0ea772fc6 Fix error reading concatenated pdfs 2015-04-18 20:56:07 +12:00
Jonas Jenwald
888cbe0bde Avoid getting stuck in empty nodes in the Pages tree when calling |Catalog_getPageDict| (issue 5644) 2015-02-22 17:42:15 +01:00
Collin Anderson
54e984c763 cleaned whitespace 2015-02-17 11:07:37 -05:00
Tim van der Meij
aaa1f2cb11 Implemented NameTree.get() using binary search 2014-10-07 00:02:15 +02:00
Tim van der Meij
b215af30d3 Require destinations when they are needed and do not fetch all of them in advance 2014-10-06 22:26:18 +02:00
Jonas Jenwald
06b5d97bc6 Remove two instances of leftover console.log debug statements
The `console.log` statement in evaluator_spec.js is obviously not needed. In obj.js it could have been replaced by `info`, but that seemed unnecessary given the already existing `error`.
2014-08-13 14:29:46 +02:00
Yury Delendik
cc180d7e2b Removes some bind() calls from fetchAsync 2014-08-05 21:22:12 -05:00
Jonas Jenwald
ee0c0dd8a9 Add strict equalities in src/core/obj.js 2014-08-01 21:56:04 +02:00
Nicholas Nethercote
856e1c600b Optimize Ref_toString().
I have a large PDF where this function is called 1.6 million times
during loading. Minimizing the string concatenations reduces the
cumulative allocations done by Firefox within this function from 113 MB
to 48 MB.
2014-07-24 06:49:56 -07:00
Yury Delendik
84157e039d Merge pull request #4973 from nnethercote/better-ref-keys
Factor out repeated Ref key string generation code.
2014-06-19 21:00:09 -05:00
Nicholas Nethercote
1ad3ffbc7b Factor out repeated Ref key string generation code.
In src/core/obj.js, we convert a Ref to a string to index into a table like
this: 'R1.0'.  This conversion is repeated numerous times.

This patch factors out the conversion into a new function.
Ref.prototype.toString().
2014-06-19 18:22:39 -07:00
Yury Delendik
623fa29300 Removes error catch from fetchUncompressed() 2014-06-18 18:30:27 -05:00
Yury Delendik
0cd28ebfa3 Telemetry for used stream and font types 2014-06-16 16:41:04 -05:00
Fabian Lange
532d7246ea add object id to streams to prevent infinite loops.
fixes http://bugzil.la/1020858
2014-06-10 11:29:25 +02:00
Yury Delendik
b20b404061 Fixes typo in getAsync 2014-06-04 11:30:53 -05:00
koderok
81d3f4a89b merged with earlier commits 2014-05-24 05:37:25 +05:30
Yury Delendik
f4baea900e Fixes regression in the cleanup 2014-05-20 21:57:04 -05:00
Yury Delendik
e5a0d89da9 Refactors loadFont for translateFont be async; fixes type3 dup data 2014-05-19 16:27:54 -05:00
Jonas Jenwald
7079992d89 Merge pull request #4770 from yurydelendik/promise-operationlist
Adds Promises to getOperatorList
2014-05-19 23:22:40 +02:00
Yury Delendik
d8eb8b1de1 Adds Promise to the getOperatorList 2014-05-19 16:19:54 -05:00
Jonas Jenwald
a984fe5b55 Add more unit tests for the API 2014-05-18 23:35:29 +02:00
Jonas Jenwald
c68ffcf978 Check if the Names dictionary actually contains a Dests dictionary before attempting to get the destinations 2014-05-14 12:43:20 +02:00
Tim van der Meij
6b9aeb34f1 Fixes rendering of PDFs with nested trailer dictionary 2014-05-02 21:01:34 +02:00
Pramodh KP
8616b2ccf3 Remove LegacyPromise from src/core/obj.js 2014-05-01 19:22:47 +05:30
Samuel Chantaraud
25ee0e8572 Preliminary attachments support
Added a partial Filespec support
Added getAttachments in API
Added a new attachments view in UI (with a new icon by @shorlander)
2014-04-18 12:11:00 -04:00
Rob Wu
2e97c0d085 Remove some unused variables from src/
Only obviously useless, local variables have been removed.
2014-04-15 17:10:23 +02:00
Joshua T Kalis
5828b2c687 Refactor - remove redundant function and all references
The function `assertWllFormed` was doing nothing different than `assert` which is
available in the same namespace. Removing it will lighten the filesize - albeit
very slightly - and reduce complexity.
2014-04-13 16:18:07 -04:00
Tim van der Meij
dd3df20a88 Makes PDF files load when xrefEntry is undefined 2014-04-12 12:05:12 +02:00
Tim van der Meij
f463f96f35 Resolving new lint issues 2014-04-11 00:41:18 +02:00
Tim van der Meij
df91acf239 Fixes lint warning W004 in src/core 2014-04-11 00:41:08 +02:00
Christian Krebs
79f34b183c Treat fonts with the same font descriptor, encoding and unicode map as aliases
Different fonts can point to the same font descriptor
(see https://github.com/mozilla/pdf.js/issues/4339 for details). With this
commit such fonts are treated as aliases if they have also the same encoding
and the same toUnicode map. The according info is stored on the font descriptor.
This change must also ensure that aliases use always the same font name
because translated fonts can get cleared depending on the CLEANUP_TIMEOUT setting.
2014-04-08 20:45:21 +02:00
Yury Delendik
31f081ae17 Doesn't traverse cyclic references in Dict.getAll; reduces empty-Dict garbage 2014-03-26 09:07:38 -05:00
Tim van der Meij
284288f1d0 Making src/core/{image,obj,parser}.js adhere to the style guide 2014-03-20 20:28:22 +01:00
Tim van der Meij
026c45e5d1 Start counting from actual beginning of PDF file 2014-03-04 22:16:54 +01:00
Yury Delendik
ad4eb9a21d Merge pull request #4354 from nnethercote/Name-cache
Use a cache to minimize the number of Name objects.
2014-03-02 18:44:29 -06:00
Thorben Bochenek
f2f713bf15 Refactor XRef in obj.js
- remove unused "if (e instanceof Stream)" in XRef_fetch
- split XRef_fetch into XRef_fetchUncompressed and XRef_fetchCompressed
This explains the actual code flow better
- add line breaks after functions
- change the document to conform to current StyleGuide
2014-02-28 14:00:54 +01:00
Nicholas Nethercote
fdb7c218da Use a cache to minimize the number of Name objects. 2014-02-27 20:41:03 -08:00
Yury Delendik
754e000907 Fixes and refactors log functionality 2014-01-15 15:28:31 -06:00
Yury Delendik
5bf3e44e30 Introduces LegacyPromise; polyfills DOM Promise 2014-01-03 18:17:05 -06:00
Yury Delendik
98ebf57144 Index objects if Prev xref was not found 2013-11-22 08:49:36 -06:00
Yury Delendik
e712c4136a Cleaning up fonts when viewer is idle for some time 2013-11-18 13:01:54 -06:00
Brendan Dahl
c2d65fc4ab Don't traverse all pages to get a single page. 2013-11-13 15:27:46 -08:00
Manas
de179e3d9b Trying to fix #3611 2013-09-24 22:54:27 +05:30
Yury Delendik
1f232ded90 Stops objects indexing at the end 2013-08-23 13:03:30 -05:00
Brendan Dahl
5ecce4996b Split files into worker and main thread pieces. 2013-08-12 10:48:06 -07:00