Sakurai/pdf.js

Author	SHA1	Message	Date
Rob Wu	49af56f730	Rethrow MissingDataException when needed In core/document.js: `PDFDocument.prototype.parse` accesses a dictionary property, which could throw if the underlying data is not yet available. In core/obj.js: `get Catalog.prototype.metadata` calls `stream.getBytes`, which can throw MissingDataException too when the stream is a ChunkedStream.	2017-03-22 14:55:59 +01:00
Jonas Jenwald	9163a6fba4	Merge pull request #8112 from Snuffleupagus/JS-action-newWindow Support the `newWindow` flag in white-listed `app.launchURL` JavaScript actions (PR 7794 follow-up)	2017-03-01 21:24:34 +01:00
Jonas Jenwald	2a7e5b8a54	Support the `newWindow` flag in white-listed `app.launchURL` JavaScript actions (PR 7794 follow-up) A simple follow-up to PR 7794, which lets us add support for the `newWindow` parameter; refer to https://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/js_api_reference.pdf#G5.1507380. The patch also fixes an embarrassing oversight regarding the placement of the case-insensitive flag, and also allows arbitrary white-space at the beginning of JS actions.	2017-02-27 15:58:28 +01:00
Jonas Jenwald	14cc6acb90	Ensure that `Dict`s found in Object Streams are assigned an `objId` in `XRef.fetch` This fixes something that I noticed while working with the code in `Catalog.getPageDict` when debugging issue 8088. Note that while I don't have an example where this patch really matters, given that e.g. `PartialEvaluator.hasBlendModes` depends on the `objId` to avoid cyclic references this patch could potentially help for some PDF files.	2017-02-25 10:20:19 +01:00
Jonas Jenwald	1ce295541c	Always check all Kids nodes, in `Catalog.getPageDict`, to avoid getting stuck in an empty node further down in the Pages tree (issue 8088) As discussed on IRC, we need to check all nodes at the bottom of the tree to ensure that we find the correct `Page` dict. Furthermore, this patch also gets rid of the caching present in a previous version, since it's not clear if that really helps. Note that this patch purposely adds an `eq` test, using a reduced test-case, so that we can be sure that the algorithm actually finds the correct `Page` dict for each `pageIndex`. Fixes 8088.	2017-02-24 12:09:46 +01:00
Jonas Jenwald	111419a64a	Cache built-in binary CMap files in the worker (issue 4794)	2017-02-16 10:55:39 +01:00
Jonas Jenwald	4046d67fde	Enable the `no-else-return` ESLint rule Using `else` after `return` is not necessary, and can often lead to unnecessarily cluttered code. By using the `no-else-return` rule in ESLint we can avoid this pattern, see http://eslint.org/docs/rules/no-else-return.	2017-01-09 20:27:39 +01:00
Jonas Jenwald	14b8523314	Refactor the `password` handling so that it's stored in the `PdfManager`s, instead of in the `XRef` We're already passing in a, currently unused, `PdfManager` instance when initializing the `XRef`. To avoid having to pass a single `password` parameter around, we could thus simply get the `password` through the `PdfManager` instance instead.	2017-01-03 20:29:52 +01:00
Jonas Jenwald	c6008b4d7c	Fix the JSDoc comment for `Catalog.parseDestDictionary`	2016-11-27 11:18:18 +01:00
Jonas Jenwald	6d8a404a9c	[api-minor] Add support for a couple of white-listed `JavaScript` actions that contain valid URLs (issue 3897, bug 843699) By only allowing very specific types of `JavaScript` actions, and also utilizing the existing `URL` validation, this patch shouldn't pose too much risk. Fixes one of the points in issue 3897 (with the PDF file taken from issue 3438). Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=843699 (probably, since that bug doesn't contain a test-case).	2016-11-08 16:48:27 +01:00
Jonas Jenwald	0844a72b4d	Add a bit more validation to `Catalog_readPageLabels`, to ensure that the Page Labels are well formed	2016-11-03 20:08:06 +01:00
Jonas Jenwald	2d8d8b5e53	Use `stringToPDFString` to sanitize bad "Prefix" entries in Page Label dictionaries It seems that certain bad PDF generators can create badly encoded "Prefix" entries for Page Labels, one example being http://ukjewishfilm.org/wp-content/uploads/2015/09/Jewish-Film-Festival-Programme-ONLINE.pdf. Unfortunately I didn't come across such a PDF file while adding the API support for Page Labels, but with them now being used in the viewer I just found this issue. With this patch, we now display the Page Labels in the same way as Adobe Reader.	2016-11-03 19:48:08 +01:00
Tim van der Meij	6e22b32372	Merge pull request #7745 from Snuffleupagus/Launch-actions [api-minor] Add basic support for `Launch` actions (issue 1778, issue 3897, issue 6616)	2016-11-01 21:12:08 +01:00
Tim van der Meij	5194e68134	Lint: correct code style violations Manual observations and working with other linting tools found these.	2016-11-01 15:04:21 +01:00
Jonas Jenwald	2b79782377	[api-minor] Add basic support for `Launch` actions (issue 1778, issue 3897, issue 6616) In general we neither want, nor can, support arbitrary `Launch` actions. But in practice, all the cases we've seen so far just contain relative URLs to other PDF files. Building on PR 7689, we can thus at least support basic `Launch` actions.	2016-10-21 13:40:32 +02:00
Jonas Jenwald	d284cfd5eb	[api-minor] Add support for relative URLs, in both annotations and the outline, by adding a `docBaseUrl` parameter to `PDFJS.getDocument` (bug 766086) Note that in `FIREFOX/MOZCENTRAL/CHROME` builds of the standard viewer the `docBaseUrl` parameter will be set by default, since in that case it makes sense to use the current URL as a base. For the `GENERIC` viewer, or the API itself, it doesn't make sense to try and set the `docBaseUrl` by default. However, custom deployments/implementations may still find the parameter useful.	2016-10-19 22:20:24 +02:00
Jonas Jenwald	71a781ee5c	Deprecate the `isValidUrl` utility function and replace it with `createValidAbsoluteUrl`/`isValidProtocal` functions instead, since the main URL validation is now done using the `new URL` constructor	2016-10-19 22:11:22 +02:00
Jonas Jenwald	42f07c6262	[api-minor] Use the `new URL` constructor when validating URLs in annotations and the outline, as a complement to only checking the protocol, and add a bit more validation to `Catalog_parseDestDictionary` Note that this will automatically reject any relative URL. To make the API more useful to consumers, URLs that are rejected will be available via the `unsafeUrl` property in the data object returned by `PDFPageProxy_getAnnotations`. The patch also adds a bit more validation of the data for `Named` actions.	2016-10-19 22:11:17 +02:00
Jonas Jenwald	e64bc1fd13	Move parsing of destination dictionaries to a helper function This not only reduces code duplication, but it also allows us to easily support the same kind of URLs we currently do for Link annotations in the Outline as well.	2016-10-18 16:14:07 +02:00
Jonas Jenwald	3e77cf6b32	Prevent an infinite loop in `XRef_fetchUncompressed` for encrypted PDF files with indirect objects in the /Encrypt dictionary (issue 7665)	2016-09-25 00:18:47 +02:00
Tim van der Meij	b4c8814fc9	Merge pull request #7534 from Snuffleupagus/isName-name-check Add a parameter to the `isName` function that enables checking not just that something is a `Name`, but also that the actual `name` property matches	2016-08-17 15:48:42 +02:00
Jonas Jenwald	544d29f5cb	Add a `recoveryMode` that suppresses errors from the `Parser`, and utilize it when searching for the main trailer in `XRef_indexObjects` (bug 1250079) Instead of having `Parser_getObj` fail unconditionally for the referenced PDF file, this patch attempts to let searching for the main trailer continue even if there are errors. Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1250079.	2016-08-17 12:37:35 +02:00
Jonas Jenwald	83ce6f0b6d	Adjust the (applicable) existing `isName` callsites to use the new `isName(v, name)` version of the function	2016-08-10 11:15:08 +02:00
Jonas Jenwald	01ab15a6f1	[api-minor] Let `Catalog_getPageIndex` check that the `Ref` actually points to a /Page dictionary Currently the `getPageIndex` method will happily return `0`, even if the `Ref` parameter doesn't actually point to a proper /Page dictionary. Having the API trust that the consumer is doing the right thing seems error-prone, hence this patch which adds a check for this case. Given that the `Catalog_getPageIndex` method isn't used in any hot part of the codebase, this extra check shouldn't be a problem. (Note: in the standard viewer, it is only ever used from `PDFLinkService_navigateTo` if a destination needs to be resolved during document loading, which isn't common enough to be an issue IMHO.)	2016-05-21 14:13:41 +02:00
Tim van der Meij	c1c199d702	Merge pull request #7295 from Snuffleupagus/core-getArray Use `Dict_getArray` in more places in `src/core/` to avoid issues when Arrays contain indirect objects	2016-05-10 23:21:54 +02:00
Jonas Jenwald	182d33800a	Ignore 'endobj' commands inside of `ObjStm` streams (issue 5241, bug 898610, bug 1037816) According to an example in the PDF specification, see http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#page=56, an `ObjStm` stream should not contain 'endobj' commands. Fixes 5241. Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=898610. Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1037816.	2016-05-09 09:50:45 +02:00
Jonas Jenwald	6111c17c8a	Use `Dict_getArray` in more places in `src/core/` to avoid issues when Arrays contain indirect objects As evident from e.g. PRs 6485 and 7118, some bad PDF generators unfortunately create Arrays where some elements are indirect objects (i.e. `Ref`s). This seems to mostly affect Arrays that contain numbers, such as e.g. `Matrix/FontMatrix/BBox/FontBBox/Rect/Color/...`, and has manifested itself in PDF files that fail to render correctly (some elements are missing). The problem in both the cases above, besides broken rendering, was that there were no errors/warnings that indicated what the problem was, making it difficult to pinpoint the issue. Hence this patch, where I've audited all usages of `Dict_get` in `src/core/` files, and replaced it with `Dict_getArray` where appropriate to try and prevent unnecessary future bugs.	2016-05-05 19:42:57 +02:00
Jonas Jenwald	e281ef15db	Adjust incorrect first obj number of "free" xref entry in `XRef_readXRefTable` (issue 7229) Fixes 7229.	2016-04-21 16:36:32 +02:00
Jonas Jenwald	41efb92d3a	Merge pull request #6988 from timvandermeij/fileattachment-annotation Implement support for FileAttachment annotations	2016-02-24 12:58:06 +01:00
Tim van der Meij	6a33dfd13a	Implement support for FileAttachment annotations	2016-02-23 22:49:53 +01:00
Jonas Jenwald	7cf9de2c17	[api-minor] Change `getOutline` to actually return the RGB color of outline items Currently the `C` entry in an outline item is returned as is, which is neither particularly useful nor what the API documentation claims. This patch also adds unit-tests for both the color handling, and the `F` entry (bold/italic flags).	2016-02-15 13:41:22 +01:00
Jonas Jenwald	98db068079	Reduce the overall indentation level in `Catalog_readDocumentOutline`, by using early returns, in order to improve readability	2016-02-14 11:38:43 +01:00
Yury Delendik	825a2225ab	Merge pull request #6915 from yurydelendik/lookuptables Refactor lookup hash tables/objects	2016-01-28 15:01:06 -06:00
Yury Delendik	2edf2792dc	Replaces literal {} created lookup tables with Object.create	2016-01-28 12:18:38 -06:00
Jonas Jenwald	1140a34f5c	[api-minor] Change `getPageLabels` to always return the pageLabels, even if they are identical to standard page numbering	2016-01-27 13:36:03 +01:00
Jonas Jenwald	85cf90643f	[api-minor] Add support for PageLabels in the API	2016-01-19 22:49:04 +01:00
Jonas Jenwald	8ad18959d7	Add support for NumberTree	2016-01-19 22:47:45 +01:00
Jonas Jenwald	0030a82dc3	[api-minor] Add support for URLs in the document outline Re: issue 5089. (Note that since there are other outline features that we currently don't support, e.g. bold/italic text and custom colours, I thus think we can keep the referenced issue open.)	2016-01-19 21:36:27 +01:00
Yury Delendik	6b60c8f4db	Adds UMD headers to core, display and shared files.	2015-12-15 13:24:39 -06:00
Manas	a2ba1b8189	Uses editorconfig to maintain consistent coding styles Removes the following as they are unnecessary: /* -*- Mode: Java; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*- */ /* vim: set shiftwidth=2 tabstop=2 autoindent cindent expandtab: */	2015-11-14 07:32:18 +05:30
Yury Delendik	59c13b32aa	Adds destroy method to the document loading task. Also renames PDFPageProxy.destroy method to cleanup.	2015-10-23 08:57:14 -05:00
Jonas Jenwald	49883439a5	Ensure that `Dict_getArray` doesn't fail if `xref` is undefined (PR 6485 follow-up) In PR 6485 I somehow failed to account for the case where `xref` is undefined. Since a dictionary can be initialized without providing a reference to an `xref` instance, `Dict_getArray` can thus fail without this added check.	2015-10-15 11:47:07 +02:00
Jonas Jenwald	9b12c64be5	Cache the regular expression used for finding `obj`s in `XRef_indexObjects`, to avoid unnecessary allocations	2015-10-02 12:46:58 +02:00
Jonas Jenwald	192907e0d2	Make `XRef_indexObjects` even more robust against bad PDF files, by checking for the existence of 'trailer' if 'xref' is not found Fixes http://www.cyjack.com/cognition/Terence%20McKenna%20-%20Lectures%20on%20Alchemy.pdf.	2015-10-01 15:01:25 +02:00
Jonas Jenwald	75557d27d1	Add `getArray` method to `Dict` This method extend `get`, and will fetch all indirect objects (i.e. `Ref`s) when the result is an `Array`.	2015-09-29 10:11:47 +02:00
Jonas Jenwald	56a43a3181	Make `XRef_indexObjects` more robust against bad PDF files (issue 5752) This patch improves the detection of `xref` in files where it is followed by an arbitrary whitespace character (not just a line-breaking char). It also adds a check for missing whitespace, e.g. `1 0 obj<<`, to speed up `readToken` for the PDF file in the referenced issue. Finally, the patch also replaces a bunch of magic numbers with suitably named constants. Fixes 5752. Also improves 6243, but there are still issues.	2015-08-21 20:33:02 +02:00
Rob Wu	c676ecb5a0	Detect scripted auto-print requests Fixes #6106 To avoid future regressions, two new unit tests were added: 1. A new PDF based on the report from #6106, which contains an OpenAction of type JavaScript and a string "this.print({...}". 2. An existing PDF from https://bugzil.la/1001080 (from #4698). Although it does not matter, since we don't execute the JavaScript code, I have also changed "print(true)" to "print({})" since the print method takes an object (not a boolean). See "Printing PDF documents", page 62: http://adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/js_developer_guide.pdf	2015-07-20 18:25:02 +02:00
Jonas Jenwald	28f40b1b58	Fetch all indirect objects (i.e. `Ref`s) in `NameTree_getAll` and `NameTree_get` (issue 6204)	2015-07-14 10:56:56 +02:00
Tim van der Meij	1416a1b521	Merge pull request #6187 from Snuffleupagus/more-efficient-getDestination A couple of improvements of `getDestination` (unit-test included)	2015-07-13 23:03:13 +02:00
Rob Wu	fd29bb0c57	Subtract start offset for xrefs in recovery mode Xref offsets are relative to the start of the PDF data, not to the start of the PDF file. This is clear if you look at the other code: - In the XRef's readXRefTable and processXRefTable methods of XRef, the offset of a xref entry is set to the bytes as given by a PDF file. These values are always relative to the start of the PDF file (%PDF-). - The XRef's readXRef method adds the start offset of the stream to Xref entry's offset: "stream.pos = startXRef + stream.start". Clearly, this line assumes that the entry offset excludes the start offset. However, when the PDF is parsed in recovery mode, the xref table is filled with entries whose offset is relative to the start of the stream rather than the PDF file. This is incorrect, and the fix is to subtract the start offset of the stream from the entry's byte offset. The manually created PDF file serves as a regression test. It is a valid PDF, except: - The integer to point to the start of the xref table and the %%EOF trailer are missing. This will activate recovery mode in PDF.js - Some junk was added before the start of the PDF file. This exposes the bad offset bug.	2015-07-10 23:33:10 +02:00
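
Several entries above (75557d27d1, 49883439a5, 6111c17c8a) concern `Dict.getArray`, which extends `get` by also fetching indirect objects (`Ref`s) found inside an Array, and which must not fail when no `xref` was provided. The following is only a minimal sketch of that idea, not the pdf.js implementation: the `Ref`, `XRef`, and `Dict` classes here are simplified, hypothetical stand-ins, and the `MediaBox` data is made up for illustration.

```javascript
// Hypothetical, simplified stand-ins; NOT the actual pdf.js classes.
class Ref {
  constructor(num, gen) {
    this.num = num;
    this.gen = gen;
  }
}

class XRef {
  // `entries` is a Map keyed by "num gen" (an assumption of this sketch).
  constructor(entries) {
    this.entries = entries;
  }

  // Resolve `obj` if it is a Ref; otherwise return it unchanged.
  fetchIfRef(obj) {
    return obj instanceof Ref ? this.entries.get(`${obj.num} ${obj.gen}`) : obj;
  }
}

class Dict {
  // `xref` is optional: a dictionary may be created without one.
  constructor(xref) {
    this.xref = xref;
    this.map = Object.create(null);
  }

  set(key, value) {
    this.map[key] = value;
  }

  // Resolves only the top-level value.
  get(key) {
    const value = this.map[key];
    return this.xref ? this.xref.fetchIfRef(value) : value;
  }

  // Like `get`, but also resolves Refs that are elements of an Array result.
  getArray(key) {
    const value = this.get(key);
    if (!Array.isArray(value) || !this.xref) {
      return value; // Nothing to resolve, or no xref to resolve with.
    }
    return value.map((item) => this.xref.fetchIfRef(item));
  }
}

// An Array in which two of the numbers are stored as indirect objects.
const xref = new XRef(new Map([["5 0", 612], ["6 0", 792]]));
const dict = new Dict(xref);
dict.set("MediaBox", [0, 0, new Ref(5, 0), new Ref(6, 0)]);
const resolved = dict.getArray("MediaBox");
```

With plain `get`, the last two elements would still be `Ref` objects, which is the kind of silent breakage the audit in 6111c17c8a set out to prevent.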
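
Entries 42f07c6262 and d284cfd5eb describe validating URLs with the `new URL` constructor (which rejects relative URLs on its own) and accepting relative URLs only when a base such as `docBaseUrl` is supplied. A rough sketch of that approach follows. The helper name `toValidAbsoluteUrl`, the protocol list, and the example URLs are assumptions made for illustration; this is not the pdf.js code.

```javascript
// Protocols accepted by this sketch; the exact list is an assumption.
const ALLOWED_PROTOCOLS = new Set(["http:", "https:", "ftp:", "mailto:", "tel:"]);

// Hypothetical helper: returns a URL object for an acceptable URL,
// or null if the input is empty, unparsable, relative without a base,
// or uses a protocol outside the allowed set.
function toValidAbsoluteUrl(url, baseUrl) {
  if (!url) {
    return null;
  }
  try {
    const absoluteUrl = baseUrl ? new URL(url, baseUrl) : new URL(url);
    return ALLOWED_PROTOCOLS.has(absoluteUrl.protocol) ? absoluteUrl : null;
  } catch (ex) {
    // `new URL` throws a TypeError when the input cannot be parsed,
    // which includes a relative URL given without a base.
    return null;
  }
}

// A relative link is rejected on its own, but resolves against a base URL.
const withoutBase = toValidAbsoluteUrl("other.pdf");
const withBase = toValidAbsoluteUrl("other.pdf", "http://example.com/docs/main.pdf");
// A parsable URL with a disallowed protocol is rejected as well.
const scripted = toValidAbsoluteUrl("javascript:alert(1)");
```

Returning `null` rather than throwing lets a caller keep the rejected string around separately, in the spirit of the `unsafeUrl` property mentioned in 42f07c6262.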
