pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	d6be5141e9	Fallback to using the `name` table to infer the encoding for TrueType fonts missing such data (issue 15910) The relevant TrueType font is missing both /ToUnicode and /Encoding entries, either of which would have prevented the (current) broken textLayer rendering. My first idea was that we could use the `post` table in the TrueType font, see https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6post.html, to get the actual glyphNames and amend the fallback ToUnicode-map that way. Unfortunately that didn't work, since the `post` table only contained ".notdef" and "" (i.e. empty string) entries. Instead we try to use the `name` table in the TrueType font, see https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6name.html, to determine if the platform is Windows and thus fall back to generating a ToUnicode-map from the `WinAnsiEncoding`.	2023-01-17 16:04:51 +01:00
Jonas Jenwald	d8d5545e03	Merge pull request #15926 from Snuffleupagus/annotation-appearance-stream Ensure that Annotation `appearance`-entries are actually Streams	2023-01-16 15:00:12 +01:00
Jonas Jenwald	cefaecc2e8	Ensure that Annotation `appearance`-entries are actually Streams Note how all over the `src/core/annotation.js`-code we're assuming that if an `appearance`-entry exists it's also a Stream. However, we're not actually checking that thoroughly enough which causes issues in some badly generated PDF documents.	2023-01-16 13:02:53 +01:00
Jonas Jenwald	397f943ca3	[api-minor] Enable transferring of TypedArray PDF data by default (PR 15908 follow-up) This patch removes the recently introduced `transferPdfData` API-option, and simply enables transferring of TypedArray data by default instead of copying it. This will help reduce main-thread memory usage, however it will take ownership of the TypedArrays. Currently this only applies to the following cases: - TypedArrays passed to the `getDocument`-function in the API, in order to open PDF documents from binary data. - TypedArrays passed to a `PDFDataRangeTransport`-instance, used to support custom PDF document fetching/loading (see e.g. the Firefox PDF Viewer). PLEASE NOTE: To avoid being affected by this, please simply copy any TypedArray data before passing it to either of the functions/methods mentioned above. Now that we transfer TypedArray data that we previously only copied, we need to be more careful with input validation. Given how the `{IPDFStreamReader, IPDFStreamRangeReader}.read` methods will always return ArrayBuffer data, which is then transferred to the worker-thread[1], the actual TypedArray data passed to the API thus needs to have the same exact size as its underlying ArrayBuffer to prevent issues. Hence we'll check for this and only allow transferring of safe TypedArray data, and fall back to simply copying the data just as before. This obviously shouldn't be an issue in the Firefox PDF Viewer, but for the general PDF.js library we need to be more careful here. --- [1] See `e09ad99973/src/display/api.js (L2492-L2506)` respectively `e09ad99973/src/display/api.js (L2578-L2590)`	2023-01-14 10:39:36 +01:00
Jonas Jenwald	99cfab18c1	Combine the array-like and ArrayBuffer branches, when handling binary data, in `getDocument`	2023-01-13 13:28:44 +01:00
Jonas Jenwald	e09ad99973	Merge pull request #15916 from Snuffleupagus/fetch-transfer [api-minor] Enabling transferring of data fetched with the `PDFFetchStream` implementation	2023-01-13 13:28:12 +01:00
Jonas Jenwald	1362cd91d0	Improve input validation in `PDFDataTransportStream._onReceiveData` (PR 15908 follow-up) The mozilla-central [method `PdfDataListener.readData`](https://searchfox.org/mozilla-central/rev/893a8f062ec6144c84403fbfb0a57234418b89cf/toolkit/components/pdfjs/content/PdfStreamConverter.jsm#207-210) can return `null`, hence it seems like a very good idea to update `PDFDataTransportStream._onReceiveData` to handle that gracefully since the current code will throw in that case. Also, improves the JSDocs for the `PDFDataRangeTransport` class in the API.	2023-01-12 15:24:59 +01:00
Jonas Jenwald	cee97fcd15	[api-minor] Enabling transferring of data fetched with the `PDFFetchStream` implementation Note how in the API we're transferring the PDF data that's fetched over the network[1]: - `f28bf23a31/src/display/api.js (L2467-L2480)` - `f28bf23a31/src/display/api.js (L2553-L2564)` To support that functionality we have the `PDFDataTransportStream`, `PDFFetchStream`, `PDFNetworkStream`, and `PDFNodeStream` implementations. Here these stream-implementations vary slightly in how they handle `ArrayBuffer`s internally, w.r.t. transferring or copying the data: - In `PDFDataTransportStream` we optionally, after PR 15908, allow transferring of the PDF data as provided externally (used e.g. in the Firefox PDF Viewer). - In `PDFFetchStream` we're currently always copying the PDF data returned by the Fetch API, which seems unnecessary. As discussed in PR 15908, it'd seem very weird if this sort of browser API didn't allow transferring of the returned data. - In `PDFNetworkStream` we're already, since many years, transferring the PDF data returned by the `XMLHttpRequest` functionality. Note how the `getArrayBuffer` helper function simply returns an `ArrayBuffer` response as-is. - In `PDFNodeStream` we're currently copying the PDF data, however this is unfortunately necessary since Node.js returns data as a `Buffer` object[2]. Given that the `PDFNetworkStream` has been, indirectly, supporting transferring of PDF data for years it would seem really strange if this didn't also apply to the `PDFFetchStream`-implementation. Hence this patch simply enables transferring of PDF data, when accessed using the Fetch API, unconditionally to help reduce main-thread memory usage since the `PDFFetchStream`-implementation is used by default in browsers (for the GENERIC build). --- [1] As opposed to PDF data being provided as e.g. a TypedArray when calling `getDocument` in the API.
[2] This is a "special" Node.js object, see https://nodejs.org/api/buffer.html#buffer, which doesn't exist in browsers.	2023-01-12 13:59:21 +01:00
Jonas Jenwald	bbe629018d	[api-minor] Add a new `transferPdfData` option to allow transferring more data to the worker-thread (bug 1809164) Also, removes the `initialData`-parameter JSDocs for the `getDocument`-function given that this parameter has been completely unused since PR 8982 (over five years ago). Note that the `initialData`-parameter is, and always was, intended to be provided when initializing a `PDFDataRangeTransport`-instance.	2023-01-10 21:03:44 +01:00
calixteman	fcaeb5db88	Merge pull request #15901 from calixteman/15289_followup Avoid null ExpansionFactor in type1 fonts (follow-up of #15289)	2023-01-07 18:20:31 +01:00
Jonas Jenwald	74e4b515c5	Merge pull request #15897 from Snuffleupagus/issue-15893 Support parsing encrypted documents in `XRef.indexObjects` (issue 15893)	2023-01-07 16:55:41 +01:00
Calixte Denizet	c170245fc0	Avoid null ExpansionFactor in type1 fonts (follow-up of #15289 )	2023-01-07 16:25:24 +01:00
Calixte Denizet	e565e455e2	Set ExpansionFactor to 0.06 when it's equal to 0 in the private dict of CFF fonts	2023-01-07 14:53:13 +01:00
Tim van der Meij	69113f08f2	Merge pull request #15887 from Snuffleupagus/rm-setPDFNetworkStreamFactory Inline the `setPDFNetworkStreamFactory` functionality in `src/display/api.js`	2023-01-07 13:16:23 +01:00
Tim van der Meij	b428824269	Merge pull request #15879 from Snuffleupagus/useWorkerFetch-defaults [api-minor] Improve the `useWorkerFetch` default value checks	2023-01-07 13:13:25 +01:00
Jonas Jenwald	1d5de9f4f4	Inline the `setPDFNetworkStreamFactory` functionality in `src/display/api.js` Given that this is internal functionality, not exposed in the official API, it's not entirely clear (at least to me) why we can't just initialize this directly in `src/display/api.js` instead. When testing both the development viewer and all the ways in which we run tests, everything still appears to work just fine with this patch.	2023-01-06 13:23:07 +01:00
Jonas Jenwald	7d94fdeb48	Support parsing encrypted documents in `XRef.indexObjects` (issue 15893) Please note: The reduced test-case is not a perfect reproduction of the original PDF document, since this one fails to open in e.g. Adobe Reader, but I do believe that it captures the most important points here. For corrupt and encrypted PDF documents, it's possible that only some trailer dictionaries actually contain an /Encrypt-entry. Previously we could easily miss that, since we generally pick the first not obviously corrupt trailer dictionary, and the solution implemented here is to simply pre-parse all trailer dictionaries to see if there's any /Encrypt-entries.	2023-01-06 13:09:37 +01:00
Calixte Denizet	dea2471e96	[JS] UserActivation must be enabled before running document actions else auto-print is broken (it's a regression from patch #15822).	2023-01-04 21:26:36 +01:00
Jonas Jenwald	6bdbb5c5ca	Update the `type`/`subtype` at the end of font parsing This fixes a warning reported by CodeQL, and should also make general sense given that we parse the font-data to determine the actual `type`/`subtype` rather than trusting the PDF document.	2023-01-02 16:21:48 +01:00
Jonas Jenwald	1a69d537c1	[api-minor] Limit the `PDFDocumentLoadingTask.onUnsupportedFeature` functionality to GENERIC builds (PR 15758 follow-up) This was deprecated in PR 15758 but it's unfortunately quite difficult to tell if third-party users are depending on this, e.g. to implement custom error reporting, and if so to what extent. However, thanks to the pre-processor we can limit most of this code to GENERIC builds which still seems like a worthwhile change. These changes reduce the bundle size of the Firefox PDF Viewer by 3.8 kB in total.	2023-01-01 17:53:12 +01:00
Jonas Jenwald	0c1fb4e740	[api-minor] Remove the `PDFDocumentProxy.stats` getter (PR 15758 follow-up) This was deprecated in PR 15758 and given that it's quite unlikely that any third-party users are relying on this functionality, since it was only ever added to support telemetry reporting in the Firefox PDF Viewer, it should hopefully be fine to remove this fairly quickly. These changes reduce the bundle size of the Firefox PDF Viewer by 4.5 kB in total.	2023-01-01 17:06:47 +01:00
Jonas Jenwald	2c57a4232c	[api-minor] Improve the `useWorkerFetch` default value checks Given that the Fetch API only supports the http/https protocols, worker-thread fetching of CMaps and Standard-fonts may thus fail in certain cases. To improve the default behaviour we'll now also check that the `cMapUrl` and `standardFontDataUrl` options are appropriate, except in Firefox where this should always work.	2023-01-01 14:48:28 +01:00
Jonas Jenwald	3110d1f29a	Merge pull request #15869 from Snuffleupagus/_abortOperatorList-clearTimeout Always abort a pending `streamReader` cancel timeout in `PDFPageProxy._abortOperatorList` (PR 15825 follow-up)	2022-12-27 13:26:43 +01:00
Jonas Jenwald	841abb53e6	Remove `PDFPageProxy.getJSActions` caching, since it's unused, in the API Note how, in the scripting initialization in the viewer, we only ever invoke `PDFPageProxy.getJSActions` once per page in order to improve overall performance; see `a575aa13b9/web/pdf_scripting_manager.js (L372-L375)` Hence it really shouldn't be necessary to cache its result in the API, especially when that is done manually rather than using something like `shadow`.	2022-12-27 10:39:33 +01:00
Jonas Jenwald	ae24dbd064	Always abort a pending `streamReader` cancel timeout in `PDFPageProxy._abortOperatorList` (PR 15825 follow-up) When we're destroying a `PDFPageProxy`-instance, during full document destruction, we'll force-abort any worker-thread parsing of operatorLists. Hence we should make sure that any pending cancel timeout is always aborted, since a later `PDFPageProxy._abortOperatorList` call should always "replace" a previous one. Please note: Technically this was always wrong, but with the changes in PR 15825 it became ever so slightly easier to trigger this thanks to the potentially longer timeout.	2022-12-27 10:19:39 +01:00
Jonas Jenwald	2fcf8bb5be	Re-factor searching for incomplete objects in `XRef.indexObjects` (issue 15803) When trying to find incomplete objects, i.e. those missing the "endobj"-string at the end, there's unfortunately a number of possible operators that we need to check for. Otherwise we could miss e.g. the "trailer" at the end of a corrupt PDF document, which is why the referenced document didn't work. Currently we do all searching on the "raw" bytes of the PDF document, for efficiency, however this doesn't really work when we need to check for multiple potential command-strings. To keep the complexity manageable we'll instead use regular expressions here, but we can at least avoid creating lots of substrings thanks to the `RegExp.lastIndex` property; which is well supported across browsers according to https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/lastIndex#browser_compatibility Note that this repeated regular expression usage could perhaps be slightly less efficient than the old code, however this method is only invoked for corrupt PDF documents.	2022-12-19 23:01:09 +01:00
Jonas Jenwald	ded02941f2	[api-minor] Move, most of, the `isPureXfa`-handling from `PDFViewer` and into `PDFPageView` By moving this code the "pageviewer"-component example will become slightly more usable on its own, it may simplify a future addition of XFA Foreground document support, and finally also serves as preparation for the following patches.	2022-12-18 13:10:23 +01:00
Calixte Denizet	a84d14b382	[Editor] Avoid scrolling when an annotation is committed (fixes issue #15744 )	2022-12-17 13:48:19 +01:00
calixteman	cb212b24fd	Merge pull request #15841 from calixteman/15784 Strip out a reserved operator (9) from CFF char strings (fixes issue #15784)	2022-12-16 15:55:02 +01:00
Calixte Denizet	f80880ccaa	Strip out a reserved operator (9) from CFF char strings (fixes issue #15784 )	2022-12-16 15:17:46 +01:00
Jonas Jenwald	0c83bebf03	Merge pull request #15832 from Snuffleupagus/issue-15828 Attempt to expose `OnProgressParameters` in the TypeScript definitions (issue 15828)	2022-12-16 12:44:29 +01:00
Jonas Jenwald	26135b0313	Always parse the entire `startXRefQueue` in `XRef.readXRef` (issue 15833) Previously we'd abort all parsing if an Error was encountered, despite the fact that multiple `startXRefQueue`-entries may be available and that continued parsing could thus eventually be able to find usable data. Note that in the referenced PDF document the `startxref`-operator, at the end of the file, points to a position in the middle of an arbitrary `stream` which is why things break.	2022-12-15 13:46:28 +01:00
Jonas Jenwald	0ef72044e2	Attempt to expose `OnProgressParameters` in the TypeScript definitions (issue 15828) Hopefully this works, since as usual I don't really know anything about TypeScript...	2022-12-14 21:36:31 +01:00
Jonas Jenwald	506bbb7283	Merge pull request #15825 from Snuffleupagus/cancel-extraDelay [api-minor] Allow specifying an extra-delay, in `RenderTask.cancel`, for worker-thread aborting of operatorList parsing	2022-12-14 19:26:39 +01:00
Jonas Jenwald	91524d1a60	[api-minor] Allow specifying an extra-delay, in `RenderTask.cancel`, for worker-thread aborting of operatorList parsing This is done to support upcoming viewer-changes, and in order to prevent third-party users from outright breaking things we'll simply ignore too large values.	2022-12-14 12:34:16 +01:00
Jonas Jenwald	dcf9ff2182	Handle possibly undefined parameters once per `AnnotationLayer.render` invocation There's no reason to repeat this for every single annotation. Also, adds a couple of missing JSDoc-parameters.	2022-12-14 12:23:24 +01:00
Calixte Denizet	2ebf8745a2	[JS] Run the named actions before running the format when the file is open (issue #15818 ) It's a follow-up of #14950: some format actions are run when the document is open but we must be sure we've everything ready for that, hence we have to run some named actions before running the global format. In playing with the form, I discovered that the blur event wasn't triggered when JS called `setFocus` (because in such a case the mouse was never down). So I removed the mouseState thing to just use the correct commitKey when blur is triggered by a TAB key.	2022-12-13 21:12:32 +01:00
Calixte Denizet	0c1ec946aa	[JS] Handle correctly choice widgets where the display and the export values are different (issue #15815 )	2022-12-13 19:08:26 +01:00
Calixte Denizet	1a397681fe	The annotation layer dimensions must be set before adding some elements (follow-up of #15770 ) In order to move the annotations in the DOM to have something which corresponds to the visual order, we need to have their dimensions/positions which means that the parent must have some dimensions.	2022-12-13 14:54:45 +01:00
Jonas Jenwald	cafdc48147	[api-minor] Add a new `PageViewport`-getter to access the original, un-scaled, viewport dimensions While reviewing recent patches, I couldn't help but notice that we now have a lot of call-sites that manually access the `PageViewport.viewBox`-property. Rather than repeating that verbatim all over the code-base, this patch adds a lazily computed and cached getter for this data instead.	2022-12-11 18:37:35 +01:00
Jonas Jenwald	9b6d0d994d	Remove the API-caching of annotation-data This was essentially done only to compensate for the viewer calling `PDFPageProxy.getAnnotations` unconditionally on every annotationLayer-rendering invocation. With the previous patch that's no longer happening, and this API-caching should thus no longer be necessary.	2022-12-11 18:12:10 +01:00
Calixte Denizet	a989b5a879	Set the dimensions of the various layers at their creation - Use a unique helper function in display/display_utils.js; - Move those dimensions in css' side.	2022-12-10 14:35:06 +01:00
Calixte Denizet	4f0bfabe7a	Take all the viewBox into account when computing the coordinates of an annotation in the page (fixes #15789 )	2022-12-08 15:02:20 +01:00
Calixte Denizet	b93bf9f654	[Editor] Don't use the editor parent which can be null. An annotation editor layer can be destroyed when it's invisible, hence some annotations can have a null parent but when printing/saving or when changing font size, color, ... of all added annotations (when selected with ctrl+a) we still need to have some parent properties especially the page dimensions, global scale factor and global rotation angle. This patch aims to remove all the references to the parent in the editor instances except in some cases where an editor should obviously have one. It fixes #15780.	2022-12-08 14:06:06 +01:00
Calixte Denizet	9af89381cd	[Editor] Add a very basic and incomplete workaround for issue #15780 The main issue is due to the fact that an editor's parent can be null when we want to serialize it and that leads to an exception which breaks the whole saving/printing process. So this incomplete patch fixes only the saving/printing issue but not the underlying problem (i.e. having a null parent) and doesn't bring that much complexity, so it should help to uplift it to the next Firefox release.	2022-12-06 16:22:24 +01:00
Jonas Jenwald	cdd39ec69e	Merge pull request #15778 from Snuffleupagus/keep-structTree Don't re-create the `structTreeLayer` on zooming and rotation	2022-12-06 10:02:20 +01:00
Jonas Jenwald	0274245e90	Remove the unused `TextLayerRenderTask._renderingDone` property (PR 15259 follow-up) This is yet another property that I forgot to remove in PR 15259.	2022-12-05 11:49:14 +01:00
Jonas Jenwald	fe8fded23b	[api-minor] Combine the `textContent`/`textContentStream` parameters Rather than handling these parameters separately, which is a left-over from back when streaming of textContent was originally added, we can simply pass either data directly to the `TextLayer` and let it handle things accordingly. Also, improves a few JSDoc comments and `typedef`-imports.	2022-12-04 21:22:14 +01:00
Jonas Jenwald	da0e6bc590	Don't re-create the `structTreeLayer` on zooming and rotation Compared to the recent PR 15722 for the `textLayer` this one should be a (comparatively) much smaller win overall, since most documents don't have any structTree-data and the required parsing should be cheaper. However, it seems to me that it cannot hurt to improve this nonetheless. Note that by moving the `structTreeLayer` initialization we remove the need for the "textlayerrendered" event listener, which thus simplifies the code a little bit. Also, removes the API-caching of the structTree-data since this was basically done to offset the lack of caching in the viewer.	2022-12-04 10:18:58 +01:00
Tim van der Meij	67e1c37e0f	Merge pull request #15773 from Snuffleupagus/view-worker-normalize [api-minor] Normalize the `view`-getter on the worker-thread	2022-12-02 19:52:44 +01:00
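
Commit 397f943ca3 above describes only transferring TypedArray data whose size exactly matches its underlying ArrayBuffer, and copying it otherwise. The following is a minimal sketch of that kind of check, not the actual PDF.js code: the helper name `prepareForTransfer` and its return shape are hypothetical and chosen purely for illustration.

```javascript
// Hypothetical helper (NOT taken from pdf.js): decide whether binary input
// can be handed to a worker as-is, or whether it has to be copied first.
function prepareForTransfer(data) {
  if (data instanceof ArrayBuffer) {
    // A plain ArrayBuffer can be transferred directly.
    return { buffer: data, copied: false };
  }
  if (ArrayBuffer.isView(data)) {
    // Safe case: the view spans its entire underlying buffer, so
    // transferring the buffer moves exactly the bytes the caller passed in.
    if (data.byteLength === data.buffer.byteLength) {
      return { buffer: data.buffer, copied: false };
    }
    // Unsafe case: the view covers only part of a larger buffer.
    // Copy just the viewed bytes and leave the original buffer untouched.
    const copy = data.buffer.slice(
      data.byteOffset,
      data.byteOffset + data.byteLength
    );
    return { buffer: copy, copied: true };
  }
  throw new TypeError("Expected an ArrayBuffer or a TypedArray.");
}
```

The returned `buffer` could then be passed in the transfer list of a standard `postMessage(message, [buffer])` call. As the commit message notes, a caller who wants to keep using the original data should copy it before handing it over, since a transferred buffer becomes detached on the sending side.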


6121 Commits