Sakurai/pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	66ee8f5acd	Remove variable shadowing from the JavaScript files in the `test/unit/` folder This is part of a series of patches that will try to split PR 11566 into smaller chunks, to make reviewing more feasible. Once all the code has been fixed, we'll be able to eventually enable the ESLint no-shadow rule; see https://eslint.org/docs/rules/no-shadow	2020-03-24 10:44:17 +01:00
Jonas Jenwald	ae2900e510	[api-minor] Change the pageIndex, on `PDFPageProxy` instances, to a private property This property has never been documented and/or intentionally exposed through the API, instead the `PDFPageProxy.pageNumber` property is the documented/intended API to use here. Hence pageIndex is changed to a "private" property on `PDFPageProxy` instances, and internal API functionality is also updated to consistently use `this._pageIndex` rather than a mix of formats.	2020-03-19 15:47:11 +01:00
Jonas Jenwald	01fb309a2a	[api-minor] Add more general OpenAction support (PR 10334 follow-up, issue 11642) This patch deprecates the existing `getOpenActionDestination` API method, in favor of a better and more general `getOpenAction` method instead. (For now JavaScript actions, related to printing, are still handled as before.) By clearly separating "regular" Print actions from the JavaScript handling, it's thus possible to get rid of the somewhat annoying and strictly incorrect warning when the viewer loads.	2020-03-06 13:03:00 +01:00
Jonas Jenwald	3c7b7be100	Prevent circular references in the /Pages tree	2020-02-19 01:49:39 +01:00
Tim van der Meij	2fb4076e05	Merge pull request #11568 from Snuffleupagus/PDF-header-validation Ensure that the PDF header contains an actual number (PR 11463 follow-up)	2020-02-09 17:16:25 +01:00
Jonas Jenwald	7117ee03d6	[api-minor] Change `PDFDocumentProxy.cleanup`/`PDFPageProxy.cleanup` to return data This patch makes the following changes, to improve these API methods: - Let `PDFPageProxy.cleanup` return a boolean indicating if clean-up actually happened, since ongoing rendering will block clean-up. Besides being used in other parts of this patch, it seems that an API user may also be interested in the return value given that clean-up isn't guaranteed to happen. - Let `PDFDocumentProxy.cleanup` return the promise indicating when clean-up is finished. - Improve the JSDoc comment for `PDFDocumentProxy.cleanup` to mention that clean-up is triggered on both threads (without going into unnecessary specifics regarding what exactly said data actually is). Add a note in the JSDoc comment about not calling this method when rendering is ongoing. - Change `WorkerTransport.startCleanup` to throw an `Error` if it's called when rendering is ongoing, to prevent rendering from breaking. Please note that this won't stop worker-thread clean-up from happening (since there's no general "something is rendering"-flag), however I'm not sure if that's really a problem; but please don't quote me on that :-) All of the caches that are being cleared in `Catalog.cleanup`, on the worker-thread, should be re-filled automatically even if cleared during parsing/rendering, and the only thing that probably happens is that e.g. font data would have to be re-parsed. On the main-thread, on the other hand, clearing the caches is more-or-less guaranteed to cause rendering errors, since the rendering code in `src/display/canvas.js` isn't able to re-request any image/font data that's suddenly being pulled out from under it. - Last, but not least, add a couple of basic unit-tests for the clean-up functionality.	2020-02-07 17:00:29 +01:00
Jonas Jenwald	88c35d872f	Ensure that the PDF header contains an actual number (PR 11463 follow-up) While it would be nice to change the `PDFFormatVersion` property, as returned through `PDFDocumentProxy.getMetadata`, to a number (rather than a string) that would unfortunately be a breaking API change. However, it does seem like a good idea to at least validate the PDF header version on the worker-thread, rather than potentially returning an arbitrary string.	2020-02-07 12:25:07 +01:00
Jonas Jenwald	9e262ae7fa	Enable the ESLint `prefer-const` rule globally (PR 11450 follow-up) Please find additional details about the ESLint rule at https://eslint.org/docs/rules/prefer-const With the recent introduction of Prettier this sort of mass enabling of ESLint rules becomes a lot easier, since the code will be automatically reformatted as necessary to account for e.g. changed line lengths. Note that this patch is generated automatically, by using the ESLint `--fix` argument, and will thus require some additional clean-up (which is done separately).	2020-01-25 00:20:22 +01:00
Jonas Jenwald	36881e3770	Ensure that all `import` and `require` statements, in the entire code-base, have a `.js` file extension In order to eventually get rid of SystemJS and start using native `import`s instead, we'll need to provide "complete" file identifiers since otherwise there'll be MIME type errors when attempting to use `import`.	2020-01-04 13:01:43 +01:00
Jonas Jenwald	d9d856020f	Move the regular expression, used with auto printing in the viewer, to `web/ui_utils.js` and also use it in the API unit-tests Rather than having a copy of this regular expression in the `test/unit/api_spec.js` file, with a comment about keeping it up-to-date with the code in the viewer (note the incorrect file reference as well), we can just import it instead to simplify all of this.	2019-12-27 00:38:28 +01:00
Tim van der Meij	dfe42a5ca4	Include a unit test for OpenAction dictionaries without `Type` entries (PR 11443 follow-up) The original issue did not contain a (reduced) test case that we could include and linked test cases are not ideal for unit tests, so the original PR could only be verified manually. I found this a bit unfortunate considering that the print data is exposed through the API, so I thought about how we could have an automated test and managed to create a reduced test case with the OpenAction dictionary from the file in the original issue. Therefore, this commit includes a unit test for parsing OpenAction dictionaries without `Type` entries. I verified that this PDF file behaves the same as the original one, i.e., no print dialog is shown for older viewers and the print dialog is shown for the most recent viewer.	2019-12-27 00:05:51 +01:00
Jonas Jenwald	de36b2aaba	Enable auto-formatting of the entire code-base using Prettier (issue 11444) Note that Prettier, purposely, has only limited [configuration options](https://prettier.io/docs/en/options.html). The configuration file is based on [the one in `mozilla central`](https://searchfox.org/mozilla-central/source/.prettierrc) with just a few additions (to avoid future breakage if the defaults ever change). Prettier is being used for a couple of reasons: - To be consistent with `mozilla-central`, where Prettier is already in use across the tree. - To ensure a consistent coding style everywhere, which is automatically enforced during linting (since Prettier is used as an ESLint plugin). This thus ends "all" formatting discussions once and for all, removing the need for review comments on most stylistic matters. Many ESLint options are now redundant, and I've tried my best to remove all the now unnecessary options (but I may have missed some). Note also that since Prettier considers the `printWidth` option as a guide, rather than a hard rule, this patch resorts to a small hack in the ESLint config to ensure that comments won't become too long. Please note: This patch is generated automatically, by appending the `--fix` argument to the ESLint call used in the `gulp lint` task. It will thus require some additional clean-up, which will be done in a separate commit. (On a more personal note, I'll readily admit that some of the changes Prettier makes are extremely ugly. However, in the name of consistency we'll probably have to live with that.)	2019-12-26 12:34:24 +01:00
Jonas Jenwald	b4d95f3763	Tweak the "gets page stats after rendering page, with `pdfBug` set" unit-test to remove an intermittent failure on Travis I recently noticed a couple of intermittent failures on Travis, hence this patch which changes the expectation to be identical to the 'Page Request' check in the preceding test-case.	2019-12-23 23:07:02 +01:00
Jonas Jenwald	b00835f589	Attempt to improve the `PDFDocument` error message for empty files (issue 5887) Given that the error in question is surfaced on the API-side, this patch makes the following changes: - Updates the wording such that it'll hopefully be slightly easier for users to understand. - Changes the plain `Error` to an `InvalidPDFException` instead, since that should work better with the existing Error handling. - Adds a unit-test which loads an empty PDF document (and also improves a pre-existing `InvalidPDFException` message and its test-case).	2019-12-09 15:45:50 +01:00
Jonas Jenwald	74e00ed93c	Change `isNodeJS` from a function to a constant Given that this shouldn't change after the `pdf.js`/`pdf.worker.js` files have been loaded, it doesn't seem necessary to keep this as a function.	2019-11-10 16:44:29 +01:00
Jonas Jenwald	2817121bc1	Convert `globalScope` and `isNodeJS` to proper modules Slightly unrelated to the rest of the patch, but this also removes an out-of-place `globals` definition from the `web/viewer.js` file.	2019-11-10 16:44:29 +01:00
Jonas Jenwald	681bc9d70e	[api-minor] Support custom `offsetX`/`offsetY` values in `PDFPageProxy.getViewport` and `PageViewport.clone` There's no good reason, as far as I can tell, to not also support `offsetX`/`offsetY` in addition to e.g. `dontFlip`.	2019-10-23 20:48:14 +02:00
Jonas Jenwald	2046adcc49	Change the 'gets viewport respecting "dontFlip" argument' unit-test to use a valid rotation angle As can be seen in `PageViewport` only multiples of 90 degrees are really supported by the code, hence the unit-test doesn't really make sense. (Possibly this should be enforced in the API, to avoid surprises, but given that this problem has always existed I'm passing on that for now.)	2019-10-23 20:30:25 +02:00
Tim van der Meij	09df1ee0ce	Include a reduced, non-linked PDF file for the attachments API unit test	2019-08-25 15:14:57 +02:00
Jonas Jenwald	d637b25e36	Fallback gracefully when encountering corrupt PDF files with empty /MediaBox and /CropBox entries This is based on a real-world PDF file I encountered very recently[1], although I'm currently unable to recall where I saw it. Note that different PDF viewers handle these sorts of errors differently, with Adobe Reader outright failing to render the attached PDF file whereas PDFium mostly handles it "correctly". The patch makes the following notable changes: - Refactor the `cropBox` and `mediaBox` getters, on the `Page`, to reduce unnecessary duplication. (This will also help in the future, if support for extracting additional page bounding boxes is added to the API.) - Ensure that the page bounding boxes, i.e. `cropBox` and `mediaBox`, are never empty to prevent issues/weirdness in the viewer. - Ensure that the `view` getter on the `Page` will never return an empty intersection of the `cropBox` and `mediaBox`. - Add an optional parameter to `Util.intersect`, to allow checking that the computed intersection isn't actually empty. - Change `Util.intersect` to have consistent return types, since Arrays are of type `Object` and falling back to returning a `Boolean` thus seems strange. --- [1] In that case I believe that only the `cropBox` was empty, but it seemed like a good idea to attempt to fix a bunch of related cases all at once.	2019-08-09 10:18:13 +02:00
Jonas Jenwald	0276385e6e	[api-minor] Fix completely broken `getStats` method by returning stats in Objects, rather than in Arrays (PR 11029 follow-up) With the changes to the `StreamType`/`FontType` "enums" in PR 11029, one unfortunate result is that `getStats` now always returns empty Arrays. Something that everyone, myself included, apparently missed is that you obviously cannot index an Array with Strings :-) I wrongly assumed that the unit-tests would catch any bugs, but they apparently suffered from the same issue as the code in `src/core/`. Another possible option could perhaps be to use `Set`s, rather than objects, but that will require larger changes since `LoopbackPort` (in `src/display/api.js`) doesn't support them.	2019-08-02 14:09:24 +02:00
Jonas Jenwald	ff90aa4323	Inline the `isCmd` check in the `Parser.shift` method For very large and complex PDF files this will help performance slightly, since `Parser.shift` is called a lot during parsing. This patch was tested using the PDF file from issue 2618, i.e. http://bugzilla-attachments.gnome.org/attachment.cgi?id=226471 (with well over four million `Parser.shift` calls for just the one page), using the following manifest file: ``` [ { "id": "issue2618", "file": "../web/pdfs/issue2618.pdf", "md5": "", "rounds": 100, "type": "eq" } ] ``` This gave the following results when comparing this patch against the `master` branch: ``` -- Grouped By browser, stat -- browser \| stat \| Count \| Baseline(ms) \| Current(ms) \| +/- \| % \| Result(P<.05) ------- \| ------------ \| ----- \| ------------ \| ----------- \| --- \| ----- \| ------------- Firefox \| Overall \| 100 \| 3386 \| 3322 \| -65 \| -1.92 \| faster Firefox \| Page Request \| 100 \| 1 \| 1 \| 0 \| -8.08 \| Firefox \| Rendering \| 100 \| 3385 \| 3321 \| -65 \| -1.92 \| faster ```	2019-07-22 12:07:36 +02:00
Jonas Jenwald	c7de6dbe41	Update the `fingerprint` API unit-tests to explicitly check for the expected result The current tests won't catch inadvertent changes to the logic used to obtain/compute the document `fingerprint`.	2019-07-15 11:19:17 +02:00
Jonas Jenwald	c7fb7116d6	Add an API unit-test for the `stopAtErrors` option (PRs 8240 and 8922 follow-up) Also fixes an inconsistency in the 'PageError' handler, for `getOperatorList`, in the API.	2019-07-13 16:06:05 +02:00
Jonas Jenwald	311bac3ebb	[api-minor] Add support for ViewerPreferences in the API (issue 10736) Please see the specification, https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#M11.9.12864.1Heading.71.Viewer.Preferences Furthermore, note that this patch only adds API support and unit-tests but does not attempt to integrate e.g. the `ViewerPreferences -> Direction` property into the viewer (which would be necessary to address issue 10736). The reason for this is that it's not entirely clear to me exactly if/how that could be implemented; e.g. would it be as simple as setting the `dir` attribute on the `viewerContainer` DOM element, or will it be more complicated? There's also the question of how the `ViewerPreferences -> Direction` value interacts with the `PageMode`, and this will generally require a fair bit of manual testing. Since the direction of the entire viewer depends on the browser locale, there's also a somewhat open question regarding what default value to use for different locales. Finally, if the viewer supports `ViewerPreferences -> Direction` then I'm assuming that it will be necessary to allow users to override the default value, which will require (most likely) new `SecondaryToolbar` buttons and icons for those etc. Hence this patch only lays the necessary foundation for eventually addressing issue 10736, but defers the actual implementation until later. (Time permitting, I'll try to look into the viewer part later.)	2019-04-14 14:20:52 +02:00
Jonas Jenwald	7a999d1d67	[api-minor] Add basic support for PageLayout in the API and the viewer Please see the specification, https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.2393749, and refer to the inline comments for additional details.	2019-04-05 11:32:01 +02:00
Jonas Jenwald	bb384dd5ed	[Firefox regression] Fix `disableRange=true` bug in `PDFDataTransportStream` Currently if trying to set `disableRange=true` in the built-in PDF Viewer in Firefox, either through `about:config` or via the URL hash, the PDF document will never load. It appears that this has been broken for a couple of years, without anyone noticing. Obviously it's not a good idea to set `disableRange=true`, however it seems that this bug affects the PDF Viewer in Firefox even with default settings: - In the case where `initialData` already contains the entire file, we're forced to dispatch a range request to re-fetch already available data just so that file loading may complete. - (In the case where the data arrives, via streaming, before being specifically requested through `requestDataRange`, we're also forced to re-fetch data unnecessarily.) This part was removed, to reduce the scope/risk of the patch somewhat. In the cases outlined above, we're having to re-fetch already available data thus potentially delaying loading/rendering of PDF files in Firefox (and wasting resources in the process).	2019-03-26 16:34:13 +01:00
Jonas Jenwald	24fc4f83ca	Small clean-up of the `PDFDocumentProxy.destroy` method and related code Note how `PDFDocumentProxy.destroy` is nothing more than an alias for `PDFDocumentLoadingTask.destroy`. While removing the latter method would be a breaking API change, there's still room for at least some clean-up here. The main changes in this patch are: - Stop providing a `PDFDocumentLoadingTask` instance separately when creating a `PDFDocumentProxy`, since the loadingTask is already available through the `WorkerTransport` instance. - Stop tracking the `PDFDocumentProxy` instance on the `WorkerTransport`, since that property is completely unused. - Simplify the 'Multiple `getDocument` instances' unit-tests by only destroying once, rather than twice, for each document.	2019-03-12 13:25:29 +01:00
Jonas Jenwald	a1f7517996	Rename the `src/display/dom_utils.js` file to `src/display/display_utils.js` This file (currently) contains not only DOM-specific helper functions/classes, but is used generally for various helper code relevant for main-thread functionality.	2019-02-23 16:30:16 +01:00
Jonas Jenwald	a0354494bd	Re-factor the `PDFDataRangeTransport` unit-tests and enable them in Node.js/Travis There doesn't appear to be any particular reason for only running these unit-tests in browsers, since the `PDFDataRangeTransport` functionality itself should be back-end agnostic.	2019-02-17 14:45:17 +01:00
Jonas Jenwald	507e0a4907	Add a new `DOMFileReaderFactory` helper to the unit-tests, and re-factor `NodeFileReaderFactory` to be asynchronous This allows simplification of the 'creates pdf doc from URL and aborts loading after worker initialized' API unit-test. Note that the `DOMFileReaderFactory` uses the Fetch API, for simplicity, since it should be available in all browsers where we're running tests.	2019-02-17 14:41:14 +01:00
Jonas Jenwald	60f6d49ff7	[api-minor] Expose the existence of a `Collection` dictionary via the `getMetadata` API method (issue 10555) Given the complexity of this functionality, and the fact that it doesn't seem widely used, I highly doubt that it'd ever make sense to support Collections; see also https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#M11.9.39646.2Heading.824.Collections	2019-02-15 15:40:31 +01:00
Tim van der Meij	7c91e94b19	Implement the `NodeCanvasFactory` class to execute more unit tests in Node.js	2019-02-10 19:37:34 +01:00
Jonas Jenwald	22468817e1	Add a `settled` property, tracking the fulfilled/rejected state of the Promise, to `createPromiseCapability` This allows cleaning-up code which is currently manually tracking the state of the Promise of a `createPromiseCapability` instance.	2019-02-02 15:18:56 +01:00
Jonas Jenwald	2b0b6178f7	Clean-up after the `gets operatorList with JPEG image (issue 4888)` unit-test This unit-test wasn't destroying the `loadingTask` when complete, as it should have done.	2019-01-29 15:24:08 +01:00
Jonas Jenwald	6f94a05a29	Do the final text scaling correctly in `flushTextContentItem` (issue 8276) It's necessary to take into account whether or not the text is vertical, to avoid either the textContent `width` or `height` becoming incorrect.	2019-01-29 15:24:04 +01:00
Tim van der Meij	103f4616ac	Merge pull request #10334 from Snuffleupagus/OpenAction-dest [api-minor] Add support for OpenAction destinations (issue 10332)	2018-12-23 20:49:50 +01:00
Jonas Jenwald	f0719ed565	[api-minor] Change the `getViewport` method, on `PDFPageProxy`, to take a parameter object rather than a bunch of (randomly) ordered parameters If, as PR 10368 suggests, more parameters should be added to `getViewport` I think that it would be a mistake to not change the signature first to avoid needlessly unwieldy call-sites. To not break any existing code and third-party use-cases, this is obviously implemented with a deprecation warning and with a working fallback[1] for the old method signature. --- [1] This is limited to `GENERIC` builds, which should be sufficient.	2018-12-21 11:55:20 +01:00
Jonas Jenwald	b05f053287	[api-minor] Add support for OpenAction destinations (issue 10332) Note that the OpenAction dictionary may contain other information besides just a destination array, e.g. instructions for auto-printing[1]. Given first of all that an arbitrary `Dict` cannot be sent from the Worker (since cloning would fail), and second of all that the data obviously needs to be validated, this patch purposely only adds support for fetching a destination from the OpenAction entry[2]. --- [1] This information is, currently in PDF.js, being included through the `getJavaScript` API method. [2] This significantly reduces the complexity of the implementation, which seems fine for now. If there's ever need for other kinds of OpenAction to be fetched, additional API methods could/should be implemented as necessary (could e.g. follow the `getOpenActionWhatever` naming scheme).	2018-12-19 11:45:16 +01:00
Jonas Jenwald	ba2edeae18	[api-minor] Add support, in `getMetadata`, for custom information dictionary entries (issue 5970, issue 10344) (#10346 ) The custom entries, provided that they exist and that their types are safe to include, are exposed through a new `Custom` infoDict entry to clearly separate them from the standard ones. Fixes 5970. Fixes 10344.	2018-12-18 23:26:02 +01:00
Jonas Jenwald	2c003a82d5	Convert `RenderTask`, in `src/display/api.js`, to an ES6 class Also deprecates the `then` method, in favour of the `promise` getter.	2018-11-18 19:08:00 +01:00
Jonas Jenwald	327f2eb588	Ensure that `onProgress` is always called when the entire PDF file has been loaded, regardless of how it was fetched (issue 10160) Please note: I'm totally fine with this patch being rejected, and the issue closed as WONTFIX; however these changes should address the issue if that's desired. From a conceptual point of view, reporting loading progress doesn't really make a lot of sense for PDF files opened by passing raw binary data directly to `getDocument` (since obviously all data was loaded). This is compared to PDF files loaded via e.g. `XMLHttpRequest` or the Fetch API, where the entire PDF file isn't available from the start and knowing the loading progress makes total sense. However I can certainly see why the current API could be considered inconsistent, which isn't great, since a registered `onProgress` callback will never be called for certain `getDocument` calls. The simplest solution to this inconsistency thus seems to be to ensure that `onProgress` is always called when handling the `DataLoaded` message, since that will always be dispatched[1] from the worker-thread. --- [1] Note that this isn't guaranteed to happen, since setting `disableAutoFetch = true` often prevents the entire file from ever loading. However, this isn't relevant for the issue at hand, and is a well-known consequence of using `disableAutoFetch = true`; note how the default viewer even has a specialized code-path for hiding the loadingBar.	2018-10-16 13:51:12 +02:00
Tim van der Meij	e812c6e7ac	Use shorter code for failing a test in `test/unit/api_spec.js`	2018-09-02 21:23:09 +02:00
Tim van der Meij	959ed3705b	Implement a permissions API	2018-09-02 21:23:09 +02:00
Jonas Jenwald	1179584fd6	Reject `getDestination`, in the API, for non-string inputs Note how e.g. the `getPage` method does basic validation of the input.	2018-08-11 16:06:35 +02:00
Jonas Jenwald	f78efd883e	Attempt to throw `MissingPDFException` when applicable in `node_stream.js` (issue 9791)	2018-08-06 10:00:03 +02:00
Brendan Dahl	d762567bcf	Allow loaded progress of 0 in unit tests.	2018-08-03 10:31:46 -07:00
Jonas Jenwald	928b89382e	[api-minor] Add an `IsLinearized` property to the `PDFDocument.documentInfo` getter, to allow accessing the linearization status through the API (via `PDFDocumentProxy.getMetadata`) There was a (somewhat) recent question on IRC about accessing the linearization status of a PDF document, and this patch contains a simple way to expose that through already existing API methods. Please note that during setup/parsing in `PDFDocument` the linearization data is already being fetched and parsed, provided of course that it exists. Hence this patch will not cause any additional data to be loaded.	2018-07-26 15:54:19 +02:00
Jonas Jenwald	61186698c3	Replace the remaining occurrences of `instanceof Array` with `Array.isArray()` Follow-up to PRs 8864 and 8813. As explained in https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/isArray, `instanceof Array` can have inconsistent behavior. To ensure that only `Array.isArray` is used, an ESLint plugin/rule is added to enforce this.	2018-07-09 13:17:41 +02:00
Jonas Jenwald	bf0aca86d7	Fix re-rendering, using the same canvas, when rendering was previously cancelled (PR 8519 follow-up) Currently if `RenderTask.cancel` is called immediately after rendering was started, then by the time that `InternalRenderTask.initializeGraphics` is called rendering will already have been cancelled. However, we're still inserting the canvas into the `canvasInRendering` map, thus breaking any future attempts at re-rendering using the same canvas. Considering that `InternalRenderTask.cancel` always removes the canvas from the map, I cannot imagine that we'd ever want to re-add it after rendering was cancelled (it was likely just a simple oversight in PR 8519). Fixes 9456.	2018-06-28 22:56:37 +02:00
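
Commit 0276385e6e above notes that an Array cannot usefully be indexed with Strings, which is why `getStats` ended up returning empty Arrays. The following is a minimal sketch of that general JavaScript behavior, not the pdf.js code itself; the variable names and the `"TrueType"` key are assumptions chosen for illustration.

```javascript
// Illustration only: names and keys here are hypothetical, not taken from pdf.js.

// Assigning with a string key sets a plain property on the Array object;
// it does not add an element, so the Array still looks empty.
const statsAsArray = [];
statsAsArray["TrueType"] = true;

// A plain object keeps string keys as ordinary, serializable entries.
const statsAsObject = Object.create(null);
statsAsObject["TrueType"] = true;

console.log(statsAsArray.length); // 0
console.log(JSON.stringify(statsAsArray)); // []
console.log(JSON.stringify(statsAsObject)); // {"TrueType":true}
```

Any consumer that checks `length` or serializes the value therefore sees an empty Array, whereas the object form keeps the entries.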

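Commit 61186698c3 above replaces `instanceof Array` with `Array.isArray()` because the former can behave inconsistently. One common cause, sketched here under the assumption of a Node.js (CommonJS) environment, is an array created in a different global context: it has a different `Array` constructor, so `instanceof` fails while `Array.isArray` still succeeds. The `foreignArray` name is hypothetical.

```javascript
// Illustration only, assuming Node.js with CommonJS `require`.
const vm = require("vm");

// Create an array inside a separate global context (a different "realm"),
// similar to receiving an array from another frame in a browser.
const foreignArray = vm.runInNewContext("[1, 2, 3]");

// The foreign array's prototype chain uses the other context's Array,
// so the check against this context's Array constructor fails.
console.log(foreignArray instanceof Array); // false

// Array.isArray inspects the value itself and works across contexts.
console.log(Array.isArray(foreignArray)); // true
```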