Sakurai/pdf.js - pdf.js - Gitea on kemo

Author	SHA1	Message	Date
Jonas Jenwald	27add0f1f3	Re-factor the `source` parsing, in `getDocument`, to use `switch` rather than `if...else` Given the number of parameters that we now need to parse here, this code is no longer as readable as one would like. Hence this re-factoring, which will improve overall readability and also help with the next patch.	2021-03-31 16:21:37 +02:00
Jonas Jenwald	9c6770748c	Move the `PDFDocumentStats` typedef closer to its usage Currently this typedef appears slightly out-of-place, in the middle of the arguably much more important `getDocument` JSDocs.	2021-03-31 16:21:22 +02:00
Tim van der Meij	8269ddbd16	Merge pull request #13105 from Snuffleupagus/BasePdfManager-parseDocBaseUrl Improve memory usage around the `BasePdfManager.docBaseUrl` parameter (PR 7689 follow-up)	2021-03-19 23:03:20 +01:00
Jonas Jenwald	57e7557235	Actually reset the `PDFPageProxy._xfaPromise` property as intended (PR 13069 follow-up) (#13119) Similar to the existing `annotationsPromise` and `_jsActionsPromise` properties, the new `_xfaPromise` should obviously also be reset, since otherwise you might end up holding onto a lot of data for pages that are no longer active. (That caching wasn't present in the original version of PR 13069, which is why I didn't spot it until now.)	2021-03-19 11:31:54 +01:00
calixteman	24e598a895	XFA - Add a layer to display XFA forms (#13069) - add an option to enable XFA rendering if any; - for now, keep the canvas layer: it could be useful to implement XFAF forms (embedded pdf in xml stream for the background and xfa form for the foreground); - ui elements in template DOM are pretty close to their html counterpart so we generate a fake html DOM from template one: - it makes it easier to translate template properties to html ones; - it makes the creation of the html element in the main thread faster.	2021-03-19 10:11:40 +01:00
Jonas Jenwald	c4c7216171	Improve memory usage around the `BasePdfManager.docBaseUrl` parameter (PR 7689 follow-up) While there is nothing outright wrong with the existing implementation, it can however lead to increased memory usage in one particular case (that I completely overlooked when implementing this): For "data:"-URLs, which by definition contain the entire PDF document and can thus be arbitrarily large, we obviously want to avoid sending, storing, and/or logging the "raw" docBaseUrl in that case. To address this, this patch makes the following changes: - Ignore any non-string in the `docBaseUrl` option passed to `getDocument`, since those are unsupported anyway, already on the main-thread. - Ignore "data:"-URLs in the `docBaseUrl` option passed to `getDocument`, to avoid having to send what could potentially be a very long string to the worker-thread. - Parse the `docBaseUrl` option directly in the `BasePdfManager`-constructors, on the worker-thread, to avoid having to store the "raw" docBaseUrl in the first place.	2021-03-17 15:48:24 +01:00
Jonas Jenwald	50681d71c8	Ensure that `getDocument` handles Node.js `Buffer`s more gracefully (issue 13075) While the JSDocs have never advertised `getDocument` as supporting Node.js `Buffer`s, that apparently doesn't stop users from passing such data structures to `getDocument`. In theory the existing `instanceof Uint8Array` check ought to have caught Node.js `Buffer`s, however for reasons that I don't even pretend to understand that check actually passes. Hence this patch which, only in Node.js environments, will special-case `Buffer`s to hopefully provide a slightly better out-of-the-box behaviour in Node.js environments[1]. --- [1] Although I'm not sure that we necessarily want to advertise this in the JSDocs, given the specialized use-case.	2021-03-13 10:52:38 +01:00
Jonas Jenwald	6fd899dc44	[api-minor] Support the Content-Disposition filename in the Firefox PDF Viewer (bug 1694556, PR 9379 follow-up) As can be seen [in the mozilla-central code](https://searchfox.org/mozilla-central/rev/a6db3bd67367aa9ddd9505690cab09b47e65a762/toolkit/components/pdfjs/content/PdfStreamConverter.jsm#1222-1225), we're already getting the Content-Disposition filename. However, that data isn't passed through to the viewer nor to the `PDFDataTransportStream`-implementation, which explains why it's currently being ignored. Please note: This will also require a small mozilla-central patch, see https://bugzilla.mozilla.org/show_bug.cgi?id=1694556, to forward the necessary data to the viewer.	2021-02-26 10:50:29 +01:00
Jonas Jenwald	d69cf702f3	Add a `this`-bound method for `InternalRenderTask.cancel` This is similar to the other methods, and the only reason for this not having been done originally is that the `cancel` functionality is a later addition.	2021-02-20 14:47:57 +01:00
Jonas Jenwald	e9038cc3d1	Send the `AnnotationStorage`-data to the worker-thread as a `Map` Rather than converting the `AnnotationStorage`-data to an Object, before sending it to the worker-thread, we should be able to simply send the internal `Map` directly. The "structured clone algorithm" doesn't have a problem with `Map`s, however the `LoopbackPort` used when workers are disabled (e.g. in Node.js environments) didn't use to support them. With PR 12997 having lifted that restriction, we should now be able to simply send the `AnnotationStorage`-data as-is rather than having to iterate through it to first create an Object. Please note: The changes in `src/core/annotation.js` could have been a lot more compact if we were able to use optional chaining in the `src/core` folder. Unfortunately that's still not possible, since SystemJS is being used in the development viewer (i.e. `gulp server`) and fixing that is still blocked by [bug 1247687](https://bugzilla.mozilla.org/show_bug.cgi?id=1247687).	2021-02-18 17:13:43 +01:00
Tim van der Meij	4619b1b568	Merge pull request #12997 from Snuffleupagus/metadata-worker Move the Metadata parsing to the worker-thread	2021-02-17 20:57:46 +01:00
Jonas Jenwald	3398070e26	[api-minor] Remove support for synchronous event dispatching in `LoopbackPort` Please note: The `defer` parameter has been enabled by default ever since PR 9777 (in 2018), which first shipped in PDF.js release `2.0.943`. With workers disabled, e.g. in Node.js environments, this has been used ever since without any problems reported[1]. The impetus for this change was that I happened to notice that if the `LoopbackPort` was used with synchronous event dispatching, we'd simply send that data as-is to the listeners. This created an inconsistency in the data returned from the `pdf.worker.js` file, since `postMessage` used with actual workers (or the `LoopbackPort` with `defer = true`) will ignore/throw when encountering unclonable data. Originally my intention was simply to just call `cloneValue` regardless of the event dispatching used in `LoopbackPort`, however looking at the use-cases (or lack thereof) of the `LoopbackPort` it seemed reasonable to simply remove the `defer` parameter instead. This patch is tagged "[api-minor]" since the `LoopbackPort` is still exposed in the API, although I really hope that no third-party is using this (since disabling workers leads to bad performance). Finally, this patch changes a `forEach` loop to `for...of` and makes use of optional chaining in existing code. --- [1] As evidenced by the `npm test` command run by GitHub Actions, and previously by Travis.	2021-02-17 16:12:29 +01:00
Jonas Jenwald	73bf45e64b	Support `Map` and `Set`, with `postMessage`, when workers are disabled The `LoopbackPort` currently doesn't support `Map` and `Set`, which it should since the "structured clone algorithm" used in browsers does support both of them; please see https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm#supported_types	2021-02-17 13:11:59 +01:00
Jonas Jenwald	298ee5cfbb	Replace some ternary operators with optional chaining, and nullish coalescing, in the `src/display/`-folder This way, we can further reduce unnecessary code-repetition in some cases.	2021-01-19 17:20:02 +01:00
Jonas Jenwald	13742eb82d	Include the JS `actions` for the page when dispatching the "pageopen"-event in the `BaseViewer` Note first of all how the `PDFDocumentProxy.getJSActions` method in the API caches the result, which makes repeated lookups cheap enough to not really be an issue. Secondly, with the previous patch, we're now only dispatching "pageopen"/"pageclose"-events when there's actually a sandbox that listens for them. All-in-all, with these changes we can thus simplify the default-viewer "pageopen"-event handler a fair bit.	2021-01-12 20:28:50 +01:00
Jonas Jenwald	81525fd446	Use ESLint to ensure that `export`s are sorted alphabetically There's a built-in ESLint rule, see `sort-imports`, to ensure that all `import`-statements are sorted alphabetically, since that often helps with readability. Unfortunately there's no corresponding rule to sort `export`-statements alphabetically, however there's an ESLint plugin which does this; please see https://www.npmjs.com/package/eslint-plugin-sort-exports The only downside here is that it's not automatically fixable, but the re-ordering is a one-time "cost" and the plugin will help maintain a consistent ordering of `export`-statements in the future. Note: To reduce the possibility of introducing any errors here, the re-ordering was done by simply selecting the relevant lines and then using the built-in sort-functionality of my editor.	2021-01-09 20:37:51 +01:00
Jonas Jenwald	941b65f683	Remove unnecessary `CanvasFactory`/`CMapReaderFactory`/`FileReaderFactory` duplication in unit-tests Given that the API will now, after PR 12039, automatically pick the correct factories to use depending on the environment (browser vs. Node.js), we can utilize that in the unit-tests as well. This way we don't have to manually repeat the same initialization code in multiple unit-tests. Note: The official PDF.js API is defined in `src/pdf.js`, hence the new exports in `src/display/api.js` will not affect that. Also, updates the unit-test `FileReaderFactory` helpers similarly. Drive-by change: Fix the `CMapReaderFactory` usage in the annotation unit-tests, since the cache should only contain raw data and not a Promise. While this obviously works as-is, having unit-tests that "abuse" the intended data format can easily lead to unnecessary failures if changes are made to the relevant `src/core/` code.	2021-01-08 17:33:59 +01:00
Calixte Denizet	6523f8880b	JS -- Plug PageOpen and PageClose actions	2021-01-06 13:31:15 +01:00
Jonas Jenwald	f9530e56da	Run `AnnotationStorage.resetModified` when destroying the `PDFDocumentLoadingTask`/`PDFDocumentProxy` This will, in a very simple way using the existing events, thus allow the viewer to remove the "beforeunload" `window` event listener when the document is closed. Generally speaking we want to avoid having global event listeners for the PDF document instance, which is why the `EventBus` exists, and instead reserve global events for the viewer itself. However, the `AnnotationStorage` "beforeunload" event unfortunately needs to be document-specific and we should thus ensure that it's correctly removed when the document is destroyed.	2020-12-19 14:05:31 +01:00
Calixte Denizet	1e2173f038	JS - Collect and execute actions at doc and pages level * the goal is to execute actions like Open or OpenAction * can be tested with issue6106.pdf (auto-print) * once #12701 is merged, we can add page actions	2020-12-18 20:03:59 +01:00
Jonas Jenwald	01d12b465c	[api-minor] Add "contentLength" to the information returned by the `getMetadata` method Given that we already include the "Content-Disposition"-header filename, when it exists, it shouldn't hurt to also include the information from the "Content-Length"-header. For PDF documents opened via a URL, which should be a very common way for the PDF.js library to be used, this will[1] thus provide a way of getting the PDF filesize without having to wait for the `getDownloadInfo`-promise to resolve[2]. With these API improvements, we can also simplify the filesize handling in the `PDFDocumentProperties` class. --- [1] Assuming that the server is correctly configured, of course. [2] Since that's not guaranteed to happen in general, with e.g. `disableAutoFetch = true` set.	2020-11-20 15:30:36 +01:00
Jonas Jenwald	de628cec59	Some `hasJSActions`, and general annotation-code, related cleanup in the viewer and API - Add support for logical assignment operators, i.e. `&&=`, `||=`, and `??=`, with a Babel-plugin. Given that these required incrementing the ECMAScript version in the ESLint and Acorn configurations, and that platform/browser support is still fairly limited, always transpiling them seems appropriate for now. - Cache the `hasJSActions` promise in the API, similar to the existing `getAnnotations` caching. With this implemented, the lookup should now be cheap enough that it can be called unconditionally in the viewer. - Slightly improve cleanup of resources when destroying the `WorkerTransport`. - Remove the `annotationStorage`-property from the `PDFPageView` constructor, since it's not necessary and also brings it more in line with the `BaseViewer`. - Update the `BaseViewer.createAnnotationLayerBuilder` method to actually agree with the `IPDFAnnotationLayerFactory` interface.[1] - Slightly tweak a couple of JSDoc comments. --- [1] We probably ought to re-factor both the `IPDFTextLayerFactory` and `IPDFAnnotationLayerFactory` interfaces to take parameter objects instead, since especially the `IPDFAnnotationLayerFactory` one is becoming quite unwieldy. Given that that would likely be a breaking change for any custom viewer-components implementation, this probably requires careful deprecation.	2020-11-14 13:58:35 +01:00
Calixte Denizet	a5279897a7	JS -- Add listener for sandbox events only if there are some actions * When no actions then set it to null instead of empty object * Even if a field has no actions, it needs to listen to events from the sandbox in order to be updated if an action changes something in it.	2020-11-09 18:37:59 +01:00
Jonas Jenwald	1dad255784	Convert files in the `src/display/`-folder to use optional chaining where possible By using optional chaining, see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Optional_chaining, it's possible to reduce unnecessary code-repetition in many cases. Note that these changes also reduce the size of the built `pdf.js` file, when `SKIP_BABEL == true` is set, and for the `MOZCENTRAL` build-target that results in a `0.1%` filesize reduction from a simple and mostly mechanical code change.	2020-11-07 13:22:06 +01:00
Tim van der Meij	e341e6e542	Merge pull request #12525 from brendandahl/mark-info [api-minor] Implement API to get MarkInfo from the catalog.	2020-10-31 00:05:19 +01:00
Brendan Dahl	f5c821e9c3	[api-minor] Implement API to get MarkInfo from the catalog.	2020-10-30 10:59:45 -07:00
Jonas Jenwald	c293fc2b8f	Add (some) optional chaining usage in `src/display/api.js` Since we no longer use SystemJS to load the unit-tests, there's now nothing that prevents us from using optional chaining and nullish coalescing in the `src/display/` directory.	2020-10-26 11:11:48 +01:00
Jonas Jenwald	d9084c0be2	Load the fake worker, in non-`PRODUCTION` mode, with native async `import` This removes the last SystemJS usage from both the API and the default viewer.	2020-10-26 11:11:48 +01:00
Calixte Denizet	c30a3a94f0	JS - Add a function in api to get the fields ids in AcroForm::CO	2020-10-17 12:56:40 +02:00
Jonas Jenwald	3351d3476d	Don't store complex data in `PDFDocument.formInfo`, and replace the `fields` object with a `hasFields` boolean instead This patch is based on a couple of smaller things that I noticed when working on PR 12479. - Don't store the /Fields on the `formInfo` getter, since that feels like overloading it with unintended (and too complex) data, and utilize a `hasFields` boolean instead. This functionality was originally added in PR 12271, to help determine what kind of form data a PDF document contains, and I think that we should ensure that the return value of `formInfo` only consists of "simple" data. With these changes the `fieldObjects` getter instead has to look-up the /Fields manually, however that shouldn't be a problem since the access is guarded by a `formInfo.hasFields` check which ensures that the data both exists and is valid. Furthermore, most documents don't even have any /AcroForm data anyway. - Determine the `hasFields` property first, to ensure that it's always correct even if there are errors when checking e.g. the /XFA or /SigFlags entries, since the `fieldObjects` getter depends on it. - Simplify a loop in `fieldObjects`, since the object being accessed is a `Map` and those have built-in iteration support. - Use a higher logging level for errors in the `formInfo` getter, and include the actual error message, since that'd have helped with fixing PR 12479 a lot quicker. - Update the JSDoc comment in `src/display/api.js` to list the return values correctly, and also slightly extend/improve the description.	2020-10-16 12:47:27 +02:00
Calixte Denizet	71ecc3129b	Add the possibility to collect Javascript actions	2020-10-14 10:44:16 +02:00
Jonas Jenwald	2a8983d76b	Enable the ESLint `no-var` rule in the `src/display/` folder Previously this rule has been enabled in the `web/` folder, and in select files in the `src/` sub-folders. Note that a number of the files in the `src/display/` folder were already enforcing the `no-var` rule, and thanks to Prettier the necessary re-writing will be (mostly) handled automatically. Please find additional details about the ESLint rule at https://eslint.org/docs/rules/no-var	2020-10-02 16:16:23 +02:00
Jonas Jenwald	2393443e73	Include the `/Order` array, if available, when parsing the Optional Content configuration The `/Order` array is used to improve the display of Optional Content groups in PDF viewers, and it allows a PDF document to e.g. specify that Optional Content groups should be displayed as a (collapsable) tree-structure rather than as just a list. Note that not all available Optional Content groups must be present in the `/Order` array, and PDF viewers will often (by default) hide those toggles in the UI. To allow us to improve the UX around toggling of Optional Content groups, in the default viewer, these hidden-by-default groups are thus appended to the parsed `/Order` array under a custom nesting level (with `name == null`). Finally, the patch also slightly tweaks an `OptionalContentConfig` related JSDoc-comment in the API.	2020-08-30 16:28:40 +02:00
Jonas Jenwald	1f5021d76a	Prevent errors if `PDFDocumentProxy.saveDocument` is called without the `annotationStorage` parameter (PR 12241 follow-up) Obviously it doesn't make sense to call that method without providing an `AnnotationStorage`-instance, however we should ensure that doing so won't cause errors. Hence we need to check that `annotationStorage` is actually defined, before attempting to call its `resetModified` method.	2020-08-22 18:09:17 +02:00
Brendan Dahl	8023175103	Support file save triggered from the Firefox integrated version. Related to https://bugzilla.mozilla.org/show_bug.cgi?id=1659753 This allows Firefox to trigger a "save" event from ctrl/cmd+s or the "Save Page As" context menu, which in turn lets pdf.js generate a new PDF if there is form data to save. I also now use `sourceEventType` on downloads so Firefox can determine if it should launch the "open with" dialog or "save as" dialog.	2020-08-20 18:05:08 -07:00
Aki Sasaki	83365a3756	confirm if leaving a modified form without saving	2020-08-20 17:23:06 -07:00
Jonas Jenwald	b26d736809	Ensure that the "DocException" message handler, in the API, will always either error or warn (depending on the build) if a valid `Error` isn't found Having this present would have made debugging issues 11941 and 12209 so much quicker and easier.	2020-08-13 13:17:30 +02:00
Calixte Denizet	1a6816ba98	Add support for saving forms	2020-08-12 10:32:59 +02:00
Jonas Jenwald	4d351eab93	A couple of (small) tweaks of the `AnnotationStorage` (PR 12173 follow-up) - Initialize the `AnnotationStorage`-instance, on `PDFDocumentProxy`, lazily. - Change the `AnnotationStorage` to use a `Map` internally, rather than a regular Object (simplifies the following points). - Let `AnnotationStorage.getAll` return `null` when there's no data stored, to avoid unnecessary parsing on the worker-thread. This ought to "just work", since the worker-thread code should already handle the `!annotationStorage` case everywhere. - Add a new `AnnotationStorage.size` getter, to be able to easily tell if there's any data stored.	2020-08-10 17:07:24 +02:00
Jonathan Grimes	ac723a1760	Allow loading pdf fonts into another document.	2020-08-08 02:52:32 +00:00
Takashi Tamura	4ac62d8787	Fix the type of PDFDocumentLoadingTask.destroy.	2020-08-07 16:10:19 +09:00
Jonas Jenwald	5e44b241b2	[api-minor] Fix the `annotationStorage` parameter in `PDFPageProxy.render` While the parameter name (clearly) suggests that an `AnnotationStorage`-instance is expected, looking at the only call-sites that include the parameter (i.e. the `PDFPrintServiceFactory` instances) it actually contains just a normal Object. Hence it seems much more reasonable to actually pass a valid `AnnotationStorage`-instance, as the name suggests, and simply have `PDFPageProxy.render` do the `annotationStorage.getAll()` call. (Since we cannot send an `AnnotationStorage`-instance as-is to the worker-thread, given the "structured clone algorithm".)	2020-08-05 23:02:30 +02:00
Takashi Tamura	a0f0ab78f3	Fix the type definition of TypedArray.	2020-08-05 17:01:08 +09:00
Tim van der Meij	56ca027c08	Improve consistency for the API documentation comments Over time we used multiple different formats for JSDoc comments. This commit standardizes those formats to the one we used most often. Moreover, this removes the example in the outline endpoint documentation since it now has a proper type definition and it didn't render correctly in JSDoc.	2020-08-04 23:27:22 +02:00
Tim van der Meij	ba4a07ce07	Fix incorrect types in the API documentation	2020-08-04 23:19:59 +02:00
Tim van der Meij	3116216e1d	Improve the API documentation for `PDFDocumentLoadingTask` This commit: - formats the documentation block according to the standards; - replaces the callback definitions with the `function` type (we have that for other definitions already and the callback type was not rendered correctly by JSDoc); - synchronizes the type documentation and the class documentation; - fixes the documentation by making it easier to read and making sure that all optional properties are indicated as such; - uses the `@link` tag to indicate links to other code. The `typestest` still passes and JSDoc now renders this class correctly.	2020-08-04 23:17:24 +02:00
Brendan Dahl	ac494a2278	Add support for optional marked content. Add a new method to the API to get the optional content configuration. Add a new render task param that accepts the above configuration. For now, the optional content is not controllable by the user in the viewer, but renders with the default configuration in the PDF. All of the test files added exhibit different uses of optional content. Fixes #269. Fix test to work with optional content. - Change the stopAtErrors test to ensure the operator list has something, instead of asserting the exact number of operators.	2020-08-04 09:26:55 -07:00
Tim van der Meij	00a8b42e67	Merge pull request #12102 from ineiti/add_types_annotations Add types annotations	2020-08-02 16:45:37 +02:00
Jonas Jenwald	05baa4c89f	Revert "[api-minor] Allow loading pdf fonts into another document."	2020-08-01 12:52:39 +02:00
Tim van der Meij	173b92a873	Merge pull request #12131 from jsg2021/issue-8271 [api-minor] Allow loading pdf fonts into another document.	2020-08-01 01:13:41 +02:00
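
Several of the commits above (73bf45e64b, 3398070e26, e9038cc3d1) concern how data is cloned and dispatched when workers are disabled. The following is a minimal sketch of that idea, not the actual pdf.js `LoopbackPort` code: the class name `SketchLoopbackPort` and the helper `cloneMessage` are hypothetical, and the sketch assumes a runtime that provides the global `structuredClone` function (for example Node.js 17 or later) rather than the hand-written `cloneValue` mentioned in the log.

```javascript
// Hypothetical illustration only; names are not taken from pdf.js.

// Clone a message the way the "structured clone algorithm" would.
// `Map` and `Set` are supported; functions are not, and cause a throw.
function cloneMessage(data) {
  return structuredClone(data);
}

class SketchLoopbackPort {
  constructor() {
    this._listeners = [];
  }

  // Always clone first, then deliver asynchronously, so that listeners
  // see the same kind of data a real worker `postMessage` would deliver.
  postMessage(data) {
    const event = { data: cloneMessage(data) };
    Promise.resolve().then(() => {
      for (const listener of this._listeners) {
        listener.call(this, event);
      }
    });
  }

  addEventListener(name, listener) {
    this._listeners.push(listener);
  }

  removeEventListener(name, listener) {
    const index = this._listeners.indexOf(listener);
    if (index >= 0) {
      this._listeners.splice(index, 1);
    }
  }
}
```

With a port like this, an annotation-storage style `Map` can be posted as-is, while a payload containing a function is rejected at `postMessage` time instead of being silently passed through to the listeners.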
