This was deprecated in PR 15943, which has now been included in two official PDF.js releases.
Given that `PDFDataRangeTransport` is somewhat unlikely to be used outside of the *built-in* Firefox PDF Viewer, it doesn't seem necessary to wait longer before removing this.
Also, this removes the GENERIC-build specific error message, to avoid unnecessarily "advertising" that non-objects can be passed when calling the `getDocument`-function.
*Please note:* This patch is written using the GitHub UI, since I'm currently without a dev machine, so hopefully it works correctly.
We introduced the use of `OffscreenCanvas` in #14754, and this patch aims
to use it for all kinds of images.
It should slightly improve performance (and maybe slightly decrease memory use).
An image can be rendered using transfer maps, but with `OffscreenCanvas` we no
longer have access to the underlying pixels array, hence the transfer-map
handling is re-implemented using the SVG `feComponentTransfer` filter.
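To illustrate the idea, here's a minimal sketch (not the actual PDF.js implementation; the filter id, table values, and the `canvas`/`imageBitmap` variables are made up for illustration) of how an SVG `feComponentTransfer` filter can remap component values while drawing, without ever touching the underlying pixel array:

```js
// Register a small SVG filter in the document; "table" maps input component
// values to output values, much like a PDF transfer function does.
document.body.insertAdjacentHTML(
  "beforeend",
  `<svg xmlns="http://www.w3.org/2000/svg" width="0" height="0">
     <filter id="transferMapFilter">
       <feComponentTransfer>
         <feFuncR type="table" tableValues="0 0.25 0.5 1"/>
         <feFuncG type="table" tableValues="0 0.25 0.5 1"/>
         <feFuncB type="table" tableValues="0 0.25 0.5 1"/>
       </feComponentTransfer>
     </filter>
   </svg>`
);

const ctx = canvas.getContext("2d");
ctx.filter = "url(#transferMapFilter)"; // reference the SVG filter by id
ctx.drawImage(imageBitmap, 0, 0);       // component values are remapped while drawing
ctx.filter = "none";
```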
Rather than repeatedly initializing a `canvasFactory`-instance for every page, move it to the document-level instead.
*Please note:* This patch is written using the GitHub UI, since I'm currently without a dev machine, so hopefully it works correctly.
Currently some `getCtx` calls are made with `isOffscreenCanvasSupported === undefined`, meaning that `OffscreenCanvas` isn't being used as intended, since no `TextLayerRenderTask._isOffscreenCanvasSupported` property exists.
*Please note:* This patch is written using the GitHub UI, since I'm currently without a dev machine, so hopefully it works correctly.
I noticed several 'Path not found' errors because of a field called #subform[2].
From the XFA specs, the hash is used for a class of elements in the template tree.
When we're looking for a node in the datasets tree, it doesn't make sense to search
for a class, hence path elements starting with a hash are simply skipped.
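A rough sketch of that idea (the path format and the `searchNode` helper are simplified placeholders, not the real SOM-expression parsing code):

```js
function resolveInDatasets(root, path) {
  let node = root;
  for (const component of path.split(".")) {
    if (component.startsWith("#")) {
      // e.g. "#subform[2]" names a class in the template tree, which has no
      // meaning in the datasets tree, so it's simply skipped.
      continue;
    }
    node = searchNode(node, component);
    if (!node) {
      return null; // a genuine "Path not found"
    }
  }
  return node;
}
```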
With upcoming changes we'll potentially start to cache `ImageBitmap` data at the document-level, in addition to just at the page-level.
Hence we need to ensure that such data is actually released on clean-up, and rather than duplicating the existing *manual* handling this code is instead moved into the `PDFObjects.clear` method. (In my opinion, this is an overall improvement even without globally cached `ImageBitmap` data.)
*Please note:* This patch is written using the GitHub UI, since I'm currently without a dev machine, so hopefully it's correct and makes sense.
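A simplified sketch of what the shared clean-up could look like (the exact object shape is assumed here, not copied from the real code):

```js
class PDFObjects {
  #objs = Object.create(null);

  clear() {
    for (const objId in this.#objs) {
      const { data } = this.#objs[objId];
      // Release ImageBitmap data explicitly, rather than waiting for GC, so
      // that clean-up behaves the same for page- and document-level caches.
      data?.bitmap?.close();
      delete this.#objs[objId];
    }
  }
}
```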
The `Buffer`-object is Node.js specific functionality[1], thus (obviously) not found in browsers. Please note that the PDF.js library has never officially supported/documented that binary data can be passed as a `Buffer`, and that *internally* in the `src/core`-code we only work with standard `Uint8Array`s.
This means that if, in Node.js environments, a `Buffer` is passed to the API we need to wrap it into a `Uint8Array`, which essentially means creating a copy of the data and thus increasing memory usage.
---
[1] Refer to https://nodejs.org/api/buffer.html#buffer
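For illustration (file name and variable names are just examples), here's the difference between handing the API a `Buffer`, which forces a copy, and constructing a plain `Uint8Array` view over the same bytes up-front:

```js
import { readFileSync } from "fs";

const buf = readFileSync("file.pdf"); // a Node.js Buffer

// What the API effectively has to do when given a Buffer: copy the data.
const copied = new Uint8Array(buf); // allocates and copies buf.length bytes

// What a caller can do instead: create a zero-copy view over the same bytes.
// (Buffers may be views into a shared pool, hence byteOffset/byteLength matter.)
const view = new Uint8Array(buf.buffer, buf.byteOffset, buf.byteLength);
```

Passing such a plain `Uint8Array` to `getDocument` avoids the extra wrapping copy described above.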
Currently we duplicate the same code more than once in the `test/driver.js` file, which we can avoid by adding a new `AnnotationStorage` helper method instead.
By default we're using worker-thread fetching (in browsers) of this data nowadays, however in Node.js environments, or if the user provides custom factories, we still fall back to main-thread fetching.
Hence it makes sense, as far as I'm concerned, to move this initialization into the `getDocument` function to ensure that the factories can actually be initialized *before* attempting to load the document.
Also, this further reduces the number of `getDocument` parameters that we need to pass into the `WorkerTransport` class.
Currently we're passing all available parameters to both this function and this class, despite that not actually being necessary.
By splitting the parameters we not only improve the structure, and basically "document" the code a little bit, but we can also simplify the `_fetchDocument` function considerably.
This is very old code, where we loop through the user-provided options and build an internal parameter object. To prevent errors we also need to ensure that the parameters are correct/valid, which is especially important for the ones that are sent to the worker-thread such that structured cloning won't fail.[1]
Over the years this has led to more and more code being added in `getDocument` to validate the user-provided options, and at this point *most* of them have at least basic validation. However the way that this is implemented feels slightly backwards, since we first build the internal parameter object and only *afterwards* validate those parameters.[2]
Hence this patch changes the `getDocument` function to instead check/validate the supported options upfront, and then *explicitly* build the internal parameter object with only the needed properties.
---
[1] Note the supported types at https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm#supported_types
[2] The internal parameter object may also, because of the loop, end up with lots of unnecessary properties since anything that the user provides is being copied.
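To sketch the intended shape change (only a couple of options are shown, and the names/defaults are merely illustrative, not the real list):

```js
const DEFAULT_RANGE_CHUNK_SIZE = 65536; // example default

function getDocument(src) {
  // Check/validate the supported options upfront...
  const url =
    typeof src.url === "string" ? src.url
    : src.url instanceof URL ? src.url.href
    : null;
  const rangeChunkSize =
    Number.isInteger(src.rangeChunkSize) && src.rangeChunkSize > 0
      ? src.rangeChunkSize
      : DEFAULT_RANGE_CHUNK_SIZE;
  const verbosity = Number.isInteger(src.verbosity) ? src.verbosity : null;

  // ...and then *explicitly* build the internal parameter object, so that only
  // validated, structured-clone friendly values are ever sent to the worker.
  const params = { url, rangeChunkSize, verbosity };
  // ...
}
```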
A number of methods have their Promises cached, to avoid repeated worker round-trips, since they're expected to be called more than once from the default viewer. The way that the caching is currently implemented means that we need to remember to manually clear these Promises on document cleanup/destruction, and it'd be nice to avoid that.
With this patch the relevant Promises are now instead placed in just one `Map`, which is easy to clear, and a new helper method is also introduced to reduce duplication for *simple* `WorkerTransport` methods.
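A hedged sketch of the pattern (the message name and the helper's name are illustrative, not necessarily the final API):

```js
class WorkerTransport {
  #methodPromises = new Map();

  #cacheSimpleMethod(name, data = null) {
    let promise = this.#methodPromises.get(name);
    if (!promise) {
      promise = this.messageHandler.sendWithPromise(name, data);
      this.#methodPromises.set(name, promise);
    }
    return promise;
  }

  getPageLayout() {
    return this.#cacheSimpleMethod("GetPageLayout");
  }

  destroy() {
    // One container to clear, instead of a number of individual properties.
    this.#methodPromises.clear();
    // ...
  }
}
```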
The only parameter that we actually need here is the `PDFDataRangeTransport`-instance, since the others are not necessary.
- The `url` parameter, as passed to the `getDocument` function in the API, is simply being ignored; see 2d87a2eb1c/src/display/api.js (L447-L458)
- The `length` parameter, as passed to the `getDocument` function in the API, is always being overwritten; see 2d87a2eb1c/src/display/api.js (L519-L525)
Until PR 12563 is deemed safe to land, I'd still like to be able to use worker-modules in the viewer during local development.
Hence this patch which *temporarily* adds a new `workerModules` hash-parameter, only available in non-PRODUCTION mode, that allows using worker-modules in the development viewer.
To enable this functionality, simply use http://localhost:8888/web/viewer.html#workerModules=true
The initial CMap support was added in PR 4259 using the "raw" Adobe files, however they were quickly deemed to be unnecessarily large. As a result PR 4470 introduced the more compact "binary" CMap format, with both of those PRs being included in the very same release (version `0.8.1334`).
Please note that we've thus never shipped anything *except* the "binary" CMap files with the PDF library, and furthermore note that we've not even once updated the CMap files since they were originally added almost nine years ago.
Requiring users to remember that `cMapPacked = true` is necessary, in addition to setting the `cMapUrl` parameter, in order for CMap loading to work feels like a less than ideal API.
Hence this patch, which suggests that we simply let `cMapPacked` default to `true` now.
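In practice that means CMap loading only requires the URL option (assuming the usual `pdfjsLib` global; the paths below are just examples):

```js
const loadingTask = pdfjsLib.getDocument({
  url: "example.pdf",
  cMapUrl: "../node_modules/pdfjs-dist/cmaps/",
  // cMapPacked: true, // no longer necessary, since it now defaults to `true`
});
```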
Until just recently the only existing `Path2D` polyfill didn't have support for Node.js and/or the `node-canvas` package. Given that this was just fixed, in the latest version, we can now finally remove our inline-checks at the relevant call-sites; please also see https://github.com/nilzona/path2d-polyfill#usage-with-node-canvas
There's really no need for these "complicated" default value assignments, since `GlobalWorkerOptions` is a local variable at this point, and this is rather a case of too much copy-and-paste.
Note that years ago, when all options were set using a global `PDFJS` object, it was possible that options had been set (from the outside) *before* the object had been properly initialized; see e.g. a89071bdef/src/display/global.js
In general it's always recommended to pass a *parameter object* when calling the `getDocument`-function in the API, since that's the only way to provide additional options, and the fact that it also accepts a URL or TypedArray directly is now mostly for backwards compatibility reasons.
Unfortunately we cannot really remove this, since that code has existed since "forever", however we can limit it to only the GENERIC build to avoid completely unnecessary checks in e.g. the Firefox PDF Viewer.
Finally, note that the default-viewer always provides a *parameter object* when calling the `getDocument`-function and it's thus completely unaffected by these changes.
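For completeness, a small illustration of the two call styles (the URL is just an example, and `pdfjsLib` is assumed to be the usual global):

```js
// Recommended: a parameter object, which also allows providing additional options.
const loadingTask = pdfjsLib.getDocument({ url: "example.pdf" });

// Accepted for backwards compatibility only, and now limited to GENERIC builds:
const legacyTask = pdfjsLib.getDocument("example.pdf");
```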
- Use a `URL`-instance directly, since it's by definition an absolute URL.
- Actually limit the "raw" url-string handling to Node.js environments, as intended.
- Skip the warning, since we're already throwing an Error if the `url`-parameter is invalid.
In general it's recommended to pass a *parameter object* when calling the `getDocument`-function in the API, since that's the only way to provide additional options, and the fact that it also accepts a URL or TypedArray directly is now mostly for backwards compatibility reasons.
However, the `getDocument`-function also accepts a direct `PDFDataRangeTransport`-instance which just seems unnecessary.
*Please note:* The `PDFDataRangeTransport`-implementation was added specifically for the *built-in* Firefox PDF Viewer, however it's most likely not commonly used by any third-party (given that it requires manual PDF-data loading).
Furthermore, the default-viewer always provides a *parameter object* when calling the `getDocument`-function and it's thus completely unaffected by these changes.
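After these changes a `PDFDataRangeTransport`-instance is thus provided via the parameter object instead, roughly along these lines (`pdfLength` and `initialData` come from the application's own data loading, and `pdfjsLib` is assumed to be the usual global):

```js
const rangeTransport = new pdfjsLib.PDFDataRangeTransport(pdfLength, initialData);
const loadingTask = pdfjsLib.getDocument({ range: rangeTransport });
```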
This patch removes the recently introduced `transferPdfData` API-option, and simply enables transferring of TypedArray data *by default* instead of copying it. This will help reduce main-thread memory usage, however it will take ownership of the TypedArrays. Currently this only applies to the following cases:
- TypedArrays passed to the `getDocument`-function in the API, in order to open PDF documents from binary data.
- TypedArrays passed to a `PDFDataRangeTransport`-instance, used to support custom PDF document fetching/loading (see e.g. the Firefox PDF Viewer).
*PLEASE NOTE:* To avoid being affected by this, please simply *copy* any TypedArray data before passing it to either of the functions/methods mentioned above.
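Concretely, a caller that still needs the data afterwards can do something like this (variable names are illustrative):

```js
// `pdfData` is the application's own Uint8Array; pass a copy so that the
// original isn't detached when its underlying ArrayBuffer gets transferred.
const loadingTask = pdfjsLib.getDocument({ data: pdfData.slice() });
```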
Now that we transfer TypedArray data that we previously only copied, we need to be more careful with input validation. Given how the `{IPDFStreamReader, IPDFStreamRangeReader}.read` methods will always return ArrayBuffer data, which is then transferred to the worker-thread[1], the actual TypedArray data passed to the API thus needs to have the exact same size as its underlying ArrayBuffer to prevent issues.
Hence we'll check for this and only allow transferring of *safe* TypedArray data, and fallback to simply copying the data just as before. This obviously shouldn't be an issue in the Firefox PDF Viewer, but for the general PDF.js library we need to be more careful here.
---
[1] See e09ad99973/src/display/api.js (L2492-L2506) and e09ad99973/src/display/api.js (L2578-L2590), respectively
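A rough sketch of such a "safe to transfer" check (not the exact implementation):

```js
function isDataTransferable(data) {
  // Only transfer when the TypedArray covers its entire underlying ArrayBuffer,
  // otherwise transferring the buffer would also detach unrelated data.
  return (
    data instanceof Uint8Array &&
    data.byteOffset === 0 &&
    data.byteLength === data.buffer.byteLength
  );
}
```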
Note how in the API we're transferring the PDF data that's fetched over the network[1]:
- f28bf23a31/src/display/api.js (L2467-L2480)
- f28bf23a31/src/display/api.js (L2553-L2564)
To support that functionality we have the `PDFDataTransportStream`, `PDFFetchStream`, `PDFNetworkStream`, and `PDFNodeStream` implementations. Here these stream-implementations vary slightly in how they handle `ArrayBuffer`s internally, w.r.t. transferring or copying the data:
- In `PDFDataTransportStream` we optionally, after PR 15908, allow transferring of the PDF data as provided externally (used e.g. in the Firefox PDF Viewer).
- In `PDFFetchStream` we're currently always copying the PDF data returned by the Fetch API, which seems unnecessary. As discussed in PR 15908, it'd seem very weird if this sort of browser API didn't allow transferring of the returned data.
- In `PDFNetworkStream` we've already, for many years, been transferring the PDF data returned by the `XMLHttpRequest` functionality. Note how the `getArrayBuffer` helper function simply returns an `ArrayBuffer` response as-is.
- In `PDFNodeStream` we're currently copying the PDF data, however this is unfortunately necessary since Node.js returns data as a `Buffer` object[2].
Given that the `PDFNetworkStream` has been, indirectly, supporting transferring of PDF data for years it would seem really strange if this didn't also apply to the `PDFFetchStream`-implementation.
Hence this patch simply enables transferring of PDF data, when accessed using the Fetch API, unconditionally to help reduce main-thread memory usage, since the `PDFFetchStream`-implementation is used *by default* in browsers (for the GENERIC build).
---
[1] As opposed to PDF data being provided as e.g. a TypedArray when calling `getDocument` in the API.
[2] This is a "special" Node.js object, see https://nodejs.org/api/buffer.html#buffer, which doesn't exist in browsers.
Also, removes the `initialData`-parameter JSDocs for the `getDocument`-function given that this parameter has been completely unused since PR 8982 (over five years ago). Note that the `initialData`-parameter is, and always was, intended to be provided when initializing a `PDFDataRangeTransport`-instance.
Given that this is internal functionality, not exposed in the official API, it's not entirely clear (at least to me) why we can't just initialize this directly in `src/display/api.js` instead.
When testing both the development viewer and all the ways in which we run tests, everything still appears to work just fine with this patch.
This was deprecated in PR 15758 but it's unfortunately quite difficult to tell if third-party users are depending on this, e.g. to implement custom error reporting, and if so to what extent.
However, thanks to the pre-processor we can limit *most* of this code to GENERIC builds, which still seems like a worthwhile change.
These changes reduce the bundle size of the Firefox PDF Viewer by 3.8 kB in total.
This was deprecated in PR 15758 and given that it's quite unlikely that any third-party users are relying on this functionality, since it was only ever added to support telemetry reporting in the Firefox PDF Viewer, it should hopefully be fine to remove this fairly quickly.
These changes reduce the bundle size of the Firefox PDF Viewer by 4.5 kB in total.
Given that the Fetch API only supports the http/https protocols, worker-thread fetching of CMaps and Standard-fonts may thus fail in certain cases. To improve the default behaviour we'll now also check that the `cMapUrl` and `standardFontDataUrl` options are appropriate, except in Firefox where this should always work.
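A hedged sketch of the kind of check involved (names are illustrative, and this isn't necessarily the exact helper used):

```js
function isValidFetchUrl(url, baseUrl) {
  try {
    const { protocol } = baseUrl ? new URL(url, baseUrl) : new URL(url);
    // The Fetch API only supports the http/https protocols.
    return protocol === "http:" || protocol === "https:";
  } catch {
    return false; // `new URL` throws on invalid input
  }
}

const useWorkerFetch =
  isValidFetchUrl(params.cMapUrl, document.baseURI) &&
  isValidFetchUrl(params.standardFontDataUrl, document.baseURI);
```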
Note how, in the scripting initialization in the viewer, we only ever invoke `PDFPageProxy.getJSActions` *once* per page in order to improve overall performance; see a575aa13b9/web/pdf_scripting_manager.js (L372-L375)
Hence it really shouldn't be necessary to cache its result in the API, especially when that is done *manually* rather than using something like `shadow`.
When we're destroying a `PDFPageProxy`-instance, during full document destruction, we'll force-abort any worker-thread parsing of operatorLists. Hence we should make sure that any pending cancel timeout is always aborted, since a later `PDFPageProxy._abortOperatorList` call should always "replace" a previous one.
*Please note:* Technically this was always wrong, but with the changes in PR 15825 it became *ever so slightly* easier to trigger this thanks to the potentially longer timeout.
By moving this code the "pageviewer"-component example will become slightly more usable on its own, it may simplify a future addition of XFA Foreground document support, and finally also serves as preparation for the following patches.
This is done to support upcoming viewer-changes, and in order to prevent third-party users from outright breaking things we'll simply ignore too large values.