Sakurai/pdf.js - pdf.js - Gitea on kemo


Author	SHA1	Message	Date
Jonas Jenwald	42f07c6262	[api-minor] Use the `new URL` constructor when validating URLs in annotations and the outline, as a complement to only checking the protocol, and add a bit more validation to `Catalog_parseDestDictionary` Note that this will automatically reject any relative URL. To make the API more useful to consumers, URLs that are rejected will be available via the `unsafeUrl` property in the data object returned by `PDFPageProxy_getAnnotations`. The patch also adds a bit more validation of the data for `Named` actions.	2016-10-19 22:11:17 +02:00
Jonas Jenwald	e64bc1fd13	Move parsing of destination dictionaries to a helper function This not only reduces code duplication, but it also allows us to easily support the same kind of URLs we currently do for Link annotations in the Outline as well.	2016-10-18 16:14:07 +02:00
Jonas Jenwald	3e77cf6b32	Prevent an infinite loop in `XRef_fetchUncompressed` for encrypted PDF files with indirect objects in the /Encrypt dictionary (issue 7665)	2016-09-25 00:18:47 +02:00
Tim van der Meij	b4c8814fc9	Merge pull request #7534 from Snuffleupagus/isName-name-check Add a parameter to the `isName` function that enables checking not just that something is a `Name`, but also that the actual `name` property matches	2016-08-17 15:48:42 +02:00
Jonas Jenwald	544d29f5cb	Add a `recoveryMode` that suppresses errors from the `Parser`, and utilize it when searching for the main trailer in `XRef_indexObjects` (bug 1250079) Instead of having `Parser_getObj` fail unconditionally for the referenced PDF file, this patch attempts to let searching for the main trailer continue even if there are errors. Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1250079.	2016-08-17 12:37:35 +02:00
Jonas Jenwald	83ce6f0b6d	Adjust the (applicable) existing `isName` callsites to use the new `isName(v, name)` version of the function	2016-08-10 11:15:08 +02:00
Jonas Jenwald	01ab15a6f1	[api-minor] Let `Catalog_getPageIndex` check that the `Ref` actually points to a /Page dictionary Currently the `getPageIndex` method will happily return `0`, even if the `Ref` parameter doesn't actually point to a proper /Page dictionary. Having the API trust that the consumer is doing the right thing seems error-prone, hence this patch which adds a check for this case. Given that the `Catalog_getPageIndex` method isn't used in any hot part of the codebase, this extra check shouldn't be a problem. (Note: in the standard viewer, it is only ever used from `PDFLinkService_navigateTo` if a destination needs to be resolved during document loading, which isn't common enough to be an issue IMHO.)	2016-05-21 14:13:41 +02:00
Tim van der Meij	c1c199d702	Merge pull request #7295 from Snuffleupagus/core-getArray Use `Dict_getArray` in more places in `src/core/` to avoid issues when Arrays contain indirect objects	2016-05-10 23:21:54 +02:00
Jonas Jenwald	182d33800a	Ignore 'endobj' commands inside of `ObjStm` streams (issue 5241, bug 898610, bug 1037816) According to an example in the PDF specification, see http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#page=56, an `ObjStm` stream should not contain 'endobj' commands. Fixes 5241. Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=898610. Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1037816.	2016-05-09 09:50:45 +02:00
Jonas Jenwald	6111c17c8a	Use `Dict_getArray` in more places in `src/core/` to avoid issues when Arrays contain indirect objects As evident from e.g. PRs 6485 and 7118, some bad PDF generators unfortunately create Arrays where some elements are indirect objects (i.e. `Ref`s). This seems to mostly affect Arrays that contain numbers, such as e.g. `Matrix/FontMatrix/BBox/FontBBox/Rect/Color/...`, and has manifested itself in PDF files that fail to render correctly (some elements are missing). The problem in both the cases above, besides broken rendering, was that there were no errors/warnings that indicated what the problem was, making it difficult to pinpoint the issue. Hence this patch, where I've audited all usages of `Dict_get` in `src/core/` files, and replaced it with `Dict_getArray` where appropriate to try and prevent unnecessary future bugs.	2016-05-05 19:42:57 +02:00
Jonas Jenwald	e281ef15db	Adjust incorrect first obj number of "free" xref entry in `XRef_readXRefTable` (issue 7229) Fixes 7229.	2016-04-21 16:36:32 +02:00
Jonas Jenwald	41efb92d3a	Merge pull request #6988 from timvandermeij/fileattachment-annotation Implement support for FileAttachment annotations	2016-02-24 12:58:06 +01:00
Tim van der Meij	6a33dfd13a	Implement support for FileAttachment annotations	2016-02-23 22:49:53 +01:00
Jonas Jenwald	7cf9de2c17	[api-minor] Change `getOutline` to actually return the RGB color of outline items Currently the `C` entry in an outline item is returned as is, which is neither particularly useful nor what the API documentation claims. This patch also adds unit-tests for both the color handling, and the `F` entry (bold/italic flags).	2016-02-15 13:41:22 +01:00
Jonas Jenwald	98db068079	Reduce the overall indentation level in `Catalog_readDocumentOutline`, by using early returns, in order to improve readability	2016-02-14 11:38:43 +01:00
Yury Delendik	825a2225ab	Merge pull request #6915 from yurydelendik/lookuptables Refactor lookup hash tables/objects	2016-01-28 15:01:06 -06:00
Yury Delendik	2edf2792dc	Replaces literal {} created lookup tables with Object.create	2016-01-28 12:18:38 -06:00
Jonas Jenwald	1140a34f5c	[api-minor] Change `getPageLabels` to always return the pageLabels, even if they are identical to standard page numbering	2016-01-27 13:36:03 +01:00
Jonas Jenwald	85cf90643f	[api-minor] Add support for PageLabels in the API	2016-01-19 22:49:04 +01:00
Jonas Jenwald	8ad18959d7	Add support for NumberTree	2016-01-19 22:47:45 +01:00
Jonas Jenwald	0030a82dc3	[api-minor] Add support for URLs in the document outline Re: issue 5089. (Note that since there are other outline features that we currently don't support, e.g. bold/italic text and custom colours, I thus think we can keep the referenced issue open.)	2016-01-19 21:36:27 +01:00
Yury Delendik	6b60c8f4db	Adds UMD headers to core, display and shared files.	2015-12-15 13:24:39 -06:00
Manas	a2ba1b8189	Uses editorconfig to maintain consistent coding styles Removes the following as they are unnecessary /* -*- Mode: Java; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*- */ /* vim: set shiftwidth=2 tabstop=2 autoindent cindent expandtab: */	2015-11-14 07:32:18 +05:30
Yury Delendik	59c13b32aa	Adds destroy method to the document loading task. Also renames PDFPageProxy.destroy method to cleanup.	2015-10-23 08:57:14 -05:00
Jonas Jenwald	49883439a5	Ensure that `Dict_getArray` doesn't fail if `xref` is undefined (PR 6485 follow-up) In PR 6485 I somehow failed to account for the case where `xref` is undefined. Since a dictionary can be initialized without providing a reference to an `xref` instance, `Dict_getArray` can thus fail without this added check.	2015-10-15 11:47:07 +02:00
Jonas Jenwald	9b12c64be5	Cache the regular expression used for finding `obj`s in `XRef_indexObjects`, to avoid unnecessary allocations	2015-10-02 12:46:58 +02:00
Jonas Jenwald	192907e0d2	Make `XRef_indexObjects` even more robust against bad PDF files, by checking for the existence of 'trailer' if 'xref' is not found Fixes http://www.cyjack.com/cognition/Terence%20McKenna%20-%20Lectures%20on%20Alchemy.pdf.	2015-10-01 15:01:25 +02:00
Jonas Jenwald	75557d27d1	Add `getArray` method to `Dict` This method extends `get`, and will fetch all indirect objects (i.e. `Ref`s) when the result is an `Array`.	2015-09-29 10:11:47 +02:00
Jonas Jenwald	56a43a3181	Make `XRef_indexObjects` more robust against bad PDF files (issue 5752) This patch improves the detection of `xref` in files where it is followed by an arbitrary whitespace character (not just a line-breaking char). It also adds a check for missing whitespace, e.g. `1 0 obj<<`, to speed up `readToken` for the PDF file in the referenced issue. Finally, the patch also replaces a bunch of magic numbers with suitably named constants. Fixes 5752. Also improves 6243, but there are still issues.	2015-08-21 20:33:02 +02:00
Rob Wu	c676ecb5a0	Detect scripted auto-print requests Fixes #6106 To avoid future regressions, two new unit tests were added: 1. A new PDF based on the report from #6106, which contains an OpenAction of type JavaScript and a string "this.print({...}". 2. An existing PDF from https://bugzil.la/1001080 (from #4698). Although it does not matter, since we don't execute the JavaScript code, I have also changed "print(true)" to "print({})" since the print method takes an object (not a boolean). See "Printing PDF documents", page 62: http://adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/js_developer_guide.pdf	2015-07-20 18:25:02 +02:00
Jonas Jenwald	28f40b1b58	Fetch all indirect objects (i.e. `Ref`s) in `NameTree_getAll` and `NameTree_get` (issue 6204)	2015-07-14 10:56:56 +02:00
Tim van der Meij	1416a1b521	Merge pull request #6187 from Snuffleupagus/more-efficient-getDestination A couple of improvements of `getDestination` (unit-test included)	2015-07-13 23:03:13 +02:00
Rob Wu	fd29bb0c57	Subtract start offset for xrefs in recovery mode Xref offsets are relative to the start of the PDF data, not to the start of the PDF file. This is clear if you look at the other code: - In the readXRefTable and processXRefTable methods of XRef, the offset of an xref entry is set to the bytes as given by a PDF file. These values are always relative to the start of the PDF file (%PDF-). - The XRef's readXRef method adds the start offset of the stream to Xref entry's offset: "stream.pos = startXRef + stream.start". Clearly, this line assumes that the entry offset excludes the start offset. However, when the PDF is parsed in recovery mode, the xref table is filled with entries whose offset is relative to the start of the stream rather than the PDF file. This is incorrect, and the fix is to subtract the start offset of the stream from the entry's byte offset. The manually created PDF file serves as a regression test. It is a valid PDF, except: - The integer to point to the start of the xref table and the %%EOF trailer are missing. This will activate recovery mode in PDF.js - Some junk was added before the start of the PDF file. This exposes the bad offset bug.	2015-07-10 23:33:10 +02:00
Jonas Jenwald	7df78f997e	Slightly more efficient `getDestination` For named destinations that are contained in a `Dict`, as opposed to a `NameTree`, we currently iterate through the entire dictionary just to fetch one destination. This code appears to simply have been copy-pasted from the `get destinations` method, but in its current form it's quite unnecessary/inefficient since we can just get the required destination directly instead.	2015-07-08 18:31:51 +02:00
Jonas Jenwald	940bedf75f	Add a unit-test that attempts to fetch a non-existent named destination Doing this helped uncover an issue with the `getDestination` implementation. Currently if a named destination doesn't exist, the method (in `obj.js`) may return `undefined` which leads to the promise being stuck in a pending state. Note: returning `null` for this case is consistent with other methods, e.g. `getOutline` and `getAttachments`.	2015-07-07 22:05:08 +02:00
Jonas Jenwald	a28ed7c834	Always traverse the entire parent chain in Page_getInheritedPageProp (issue 5954) This enables us to find resources placed on multiple levels of the tree. Fixes 5954.	2015-05-30 12:21:05 +02:00
Tim van der Meij	d484ebd492	Merge pull request #5910 from jordan-thoms/fix-concatenated-files Fix error reading concatenated pdfs	2015-05-13 22:40:55 +02:00
Jonas Jenwald	760222cf0b	Handle the Encoding being a dictionary in PartialEvaluator_preEvaluateFont (bug 1157493) This is a regression from PR 4423. Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1157493.	2015-04-25 16:48:14 +02:00
Brendan Dahl	846eb967cc	Merge pull request #5655 from Snuffleupagus/issue-5644 Avoid getting stuck in empty nodes in the Pages tree when calling |Catalog_getPageDict| (issue 5644)	2015-04-20 11:46:27 -07:00
Jordan Thoms	d0ea772fc6	Fix error reading concatenated pdfs	2015-04-18 20:56:07 +12:00
Jonas Jenwald	888cbe0bde	Avoid getting stuck in empty nodes in the Pages tree when calling |Catalog_getPageDict| (issue 5644)	2015-02-22 17:42:15 +01:00
Collin Anderson	54e984c763	cleaned whitespace	2015-02-17 11:07:37 -05:00
Tim van der Meij	aaa1f2cb11	Implemented NameTree.get() using binary search	2014-10-07 00:02:15 +02:00
Tim van der Meij	b215af30d3	Require destinations when they are needed and do not fetch all of them in advance	2014-10-06 22:26:18 +02:00
Jonas Jenwald	06b5d97bc6	Remove two instances of leftover console.log debug statements The `console.log` statement in evaluator_spec.js is obviously not needed. In obj.js it could have been replaced by `info`, but that seemed unnecessary given the already existing `error`.	2014-08-13 14:29:46 +02:00
Yury Delendik	cc180d7e2b	Removes some bind() calls from fetchAsync	2014-08-05 21:22:12 -05:00
Jonas Jenwald	ee0c0dd8a9	Add strict equalities in src/core/obj.js	2014-08-01 21:56:04 +02:00
Nicholas Nethercote	856e1c600b	Optimize Ref_toString(). I have a large PDF where this function is called 1.6 million times during loading. Minimizing the string concatenations reduces the cumulative allocations done by Firefox within this function from 113 MB to 48 MB.	2014-07-24 06:49:56 -07:00
Yury Delendik	84157e039d	Merge pull request #4973 from nnethercote/better-ref-keys Factor out repeated Ref key string generation code.	2014-06-19 21:00:09 -05:00
Nicholas Nethercote	1ad3ffbc7b	Factor out repeated Ref key string generation code. In src/core/obj.js, we convert a Ref to a string to index into a table like this: 'R1.0'. This conversion is repeated numerous times. This patch factors out the conversion into a new function. Ref.prototype.toString().	2014-06-19 18:22:39 -07:00
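
The idea in commit 1ad3ffbc7b (build the `'R1.0'`-style table key in one place instead of repeating the concatenation at every call site) can be sketched roughly as follows. This is a minimal illustration, not the code from `src/core/obj.js`: the `Ref` constructor shape, the `cache` table, and the stored value are assumptions made for the example; only the `'R1.0'` key format and the `Ref.prototype.toString()` name come from the commit message.

```javascript
'use strict';

// Hypothetical, simplified Ref (object number + generation number).
// The real pdf.js class may differ; this only illustrates the refactoring.
function Ref(num, gen) {
  this.num = num;
  this.gen = gen;
}

// Single place where the table key is built, in the 'R<num>.<gen>' form
// mentioned in the commit message (e.g. 'R1.0').
Ref.prototype.toString = function Ref_toString() {
  return 'R' + this.num + '.' + this.gen;
};

// Assumed lookup table keyed by the string form of a Ref.
// Object.create(null) avoids collisions with inherited property names.
var cache = Object.create(null);

var ref = new Ref(1, 0);
cache[ref.toString()] = 'cached object';

// A second Ref with the same numbers maps to the same key.
var sameRef = new Ref(1, 0);
var hit = cache[sameRef.toString()];
```

Call sites then index the table with `ref.toString()` rather than rebuilding the string themselves, which is also what makes a later optimization of that one function (as in commit 856e1c600b) possible without touching the callers.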
