pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	1d6d476cab	Rename the `src/core/obj.js` file to `src/core/catalog.js` Now that only the `Catalog` remains in this file, after the previous patches, it makes sense to rename the file to reduce confusion.	2021-04-13 21:00:30 +02:00
Jonas Jenwald	088a55f80d	Enable the `no-var` rule in the `src/core/xref.js` file	2021-04-13 21:00:30 +02:00
Jonas Jenwald	bc828cd41f	Convert the `XRef` to a "normal" class	2021-04-13 21:00:30 +02:00
Jonas Jenwald	e8750cfe95	Move the `XRef` from `src/core/obj.js` and into its own file The size of the `src/core/obj.js` file has increased slowly over the years, and it also contains a fair amount of distinct functionality. In order to improve readability and make it easier to navigate through the code, this patch moves the `XRef` into its own file.	2021-04-13 21:00:30 +02:00
Jonas Jenwald	24e5ecdf76	Move `NameTree`/`NumberTree` from `src/core/obj.js` and into its own file The size of the `src/core/obj.js` file has increased slowly over the years, and it also contains a fair amount of distinct functionality. In order to improve readability and make it easier to navigate through the code, this patch moves `NameTree`/`NumberTree` into its own file.	2021-04-13 21:00:30 +02:00
Jonas Jenwald	92141e0468	Enable the `no-var` rule in the `src/core/file_spec.js` file	2021-04-13 21:00:30 +02:00
Jonas Jenwald	22a066e657	Convert the `FileSpec` to a "normal" class	2021-04-13 21:00:30 +02:00
Jonas Jenwald	e02d17da93	Move the `FileSpec` from `src/core/obj.js` and into its own file The size of the `src/core/obj.js` file has increased slowly over the years, and it also contains a fair amount of distinct functionality. In order to improve readability and make it easier to navigate through the code, this patch moves the `FileSpec` into its own file.	2021-04-13 21:00:30 +02:00
Jonas Jenwald	6a935682fd	Convert the `ObjectLoader` to a "normal" class	2021-04-13 21:00:30 +02:00
Jonas Jenwald	604cd6d600	Move the `ObjectLoader` from `src/core/obj.js` and into its own file The size of the `src/core/obj.js` file has increased slowly over the years, and it also contains a fair amount of distinct functionality. In order to improve readability and make it easier to navigate through the code, this patch moves the `ObjectLoader` into its own file.	2021-04-13 21:00:30 +02:00
Tim van der Meij	ebeb3f7999	Merge pull request #13234 from Snuffleupagus/hasJSActions-MissingDataException [api-minor] Ensure that `PDFDocumentProxy.hasJSActions` won't fail if `MissingDataException`s are thrown during the associated worker-thread parsing	2021-04-13 20:44:58 +02:00
Cetin Sert	d498897ab5	Fix annotation input focus trap regression in Safari (#13232) `setSelectionRange(0, 0)` added in `44b24fcc29` for #12359, required only by Firefox ([bug](https://bugzilla.mozilla.org/show_bug.cgi?id=860329)), causes issues mozilla#13191, mozilla#12592 in Safari. `scrollLeft = 0` is a fix that breaks the focus trap in Safari while keeping Firefox behavior the same for #12359.	2021-04-13 20:40:52 +02:00
Tim van der Meij	3d2d8002b0	Merge pull request #13223 from Snuffleupagus/worker-xfa-structTree-tweaks Remove the unused "GetIsPureXfa" message handler; and avoid unnecessary parsing when no structTree is available (PR 13069 follow-up, PR 13221 follow-up)	2021-04-13 20:39:52 +02:00
Jonas Jenwald	2b2234fd5a	[api-minor] Ensure that `PDFDocumentProxy.hasJSActions` won't fail if `MissingDataException`s are thrown during the associated worker-thread parsing With the current implementation of `PDFDocument.hasJSActions`, in the worker-thread, we're not actually handling not-yet-loaded data correctly. This can thus fail in two different ways: - The `PDFDocument.fieldObjects` getter (and its helper method), while it may return a Promise, still fetches all of its data synchronously and it can thus throw a `MissingDataException` during parsing. - The `Catalog.jsActions` getter, which is completely synchronous, can obviously throw a `MissingDataException` during parsing. If either of these cases occur currently, the `PDFDocumentProxy.hasJSActions` method in the API can either return a rejected Promise (which it never should) or possibly "hang" and never resolve. Please note: While I've not yet seen this error in an actual PDF document, it can happen during loading if you're unlucky enough with e.g. the structure of the PDF document and/or the download speed offered by the server. This patch is thus based on code-inspection and on manually throwing a `MissingDataException` on the first access of `Catalog.jsActions` to simulate this situation. Finally, this patch adds a couple of API unit-tests for this (since none existed).	2021-04-13 14:33:56 +02:00
Jonas Jenwald	4aa27cc645	Re-factor `Catalog._collectJavaScript` to use a `Map` rather than an Object Given that this is only an internal helper method, used by the `Catalog.{javaScript, jsActions}` getters, this change simplifies iteration of the returned data. We can also (slightly) re-factor the code of the `jsActions` getter, and remove an obsolete[1] JSDoc-comment from the `openAction` getter. --- [1] Not really relevant now that we've got proper scripting support.	2021-04-13 14:16:17 +02:00
Calixte Denizet	a4c986515f	XFA -- Display text content - display xhtml; - allow spaces in xhtml (xfa-spacerun:yes); - support column layout; - fix some border issues.	2021-04-12 14:13:49 +02:00
Jonas Jenwald	54ef4370a2	Ensure that the data is loaded, in the "GetPageJSActions" message handler Similar to all other data accesses, note e.g. the "GetDocJSActions" handler just above, we need to ensure that a `MissingDataException` isn't propagated to the main-thread if this data is accessed while the PDF document is still loading.	2021-04-12 13:54:37 +02:00
Jonas Jenwald	9360c7cbdc	Avoid unnecessary parsing, in `Page.getStructTree`, when no structTree is available (PR 13221 follow-up) It's obviously (a bit) more efficient to return early in `Page.getStructTree`, rather than trying to first "parse" an empty structTree-root. Somehow I didn't think of this yesterday, but this feels like a much better solution overall; sorry about the churn here!	2021-04-12 08:54:21 +02:00
Jonas Jenwald	0d2dd6c2fe	Remove the unused "GetIsPureXfa" message handler in the worker (PR 13069 follow-up) Looking at the API, there's no code which actually sends this message. Most likely it's a left-over from a previous version of PR 13069, since the `isPureXfa` parameter is being included in the "GetDoc" message.	2021-04-12 08:52:27 +02:00
Jonas Jenwald	5adee0cdd1	[api-minor] Let `PDFPageProxy.getStructTree` return `null`, rather than an empty structTree, for documents without any accessibility data (PR 13171 follow-up) This is first of all consistent with existing API-methods, where we return `null` when the data in question doesn't exist. Secondly, it should also be (slightly) more efficient since there's less dummy-data that we need to transfer between threads. Finally, this prevents us from adding an empty/unnecessary span to every single page even in documents without any structure tree data.	2021-04-11 12:35:33 +02:00
Jonas Jenwald	ff4dae05b0	Ensure that `getStructTree` won't break with `disableAutoFetch = true` set (PR 13171 follow-up) Open http://localhost:8888/web/viewer.html?file=/test/pdfs/pdf.pdf#disableStream=true&disableAutoFetch=true and observe the following message in the console (repeated for each page of the document): ``` Uncaught (in promise) Object { message: "Missing data [19787293, 19787294)", name: "UnknownErrorException", details: "MissingDataException: Missing data [19787293, 19787294)", stack: "BaseExceptionClosure@http://localhost:8888/src/shared/util.js:458:29\n@http://localhost:8888/src/shared/util.js:462:3\n" } ```	2021-04-11 12:15:33 +02:00
Tim van der Meij	d9d626a5e1	Merge pull request #13214 from calixteman/signatures Display widget signature	2021-04-10 19:35:16 +02:00
Calixte Denizet	5875ebb1ca	Display widget signature - but don't validate them for now; - Firefox will display a bar to warn that the signature validation is not supported (see https://bugzilla.mozilla.org/show_bug.cgi?id=854315) - almost all (all?) PDF readers display signatures; - validation is done in Edge but for now it's behind a pref.	2021-04-10 19:13:28 +02:00
Tim van der Meij	03c8c89002	Merge pull request #13171 from brendandahl/struct-tree [api-minor] Add support for basic structure tree for accessibility.	2021-04-09 21:32:44 +02:00
Tim van der Meij	b0473eb353	Merge pull request #13207 from Snuffleupagus/api-AnnotationStorage-params [api-minor] Remove the manual passing of an `AnnotationStorage`-instance when calling various API-methods	2021-04-09 21:09:16 +02:00
Brendan Dahl	fc9501a637	Add support for basic structure tree for accessibility. When a PDF is "marked" we now generate a separate DOM that represents the structure tree from the PDF. This DOM is inserted into the <canvas> element and allows screen readers to walk the tree and have more information about headings, images, links, etc. To link the structure tree DOM (which is empty) to the text layer, `aria-owns` is used. This required modifying the text layer creation so that marked items are now tracked.	2021-04-09 09:56:28 -07:00
Jonas Jenwald	737a8e846d	Add `deprecated` handling of the now removed `AnnotationStorage` API-parameters These changes are done separately, to make it easier to remove them in the future.	2021-04-09 13:25:03 +02:00
Jonas Jenwald	72ef183085	[api-minor] Remove the manual passing of an `AnnotationStorage`-instance when calling various API-methods Note how we purposely don't expose the `AnnotationStorage`-class directly in the official API (see `src/pdf.js`), since trying to use multiple ones simultaneously doesn't really make sense (e.g. in the viewer). Instead we lazily initialize, and cache, just one instance via `PDFDocumentProxy.annotationStorage` which should thus be available internally in the API itself without having to be manually passed to various methods. To support these changes, the `AnnotationStorage`-instance initialization is moved into the `WorkerTransport`-class to allow both `PDFDocumentProxy` and `PDFPageProxy` to access it. This patch implements the following simplifications: - Remove the `annotationStorage`-parameter from `PDFDocumentProxy.saveDocument`, since it's already available internally. Furthermore, while it's currently possible to call that method without an `AnnotationStorage`-instance, that really does not make any sense at all. In this case you're effectively reducing `PDFDocumentProxy.saveDocument` to a "regular" `PDFDocumentProxy.getData` call, but with a lot more overhead, which was obviously not the intention of the `PDFDocumentProxy.saveDocument`-method. - Try to discourage third-party users from calling `PDFDocumentProxy.saveDocument` unconditionally, as a replacement for `PDFDocumentProxy.getData` (note the previous point). - Replace the `annotationStorage`-parameter, in `PDFPageProxy.render`, with a boolean `includeAnnotationStorage`-parameter which simply indicates if the (internally available) `AnnotationStorage`-instance should be used during rendering (e.g. for printing). - By removing the need to manually provide `annotationStorage`-parameters to various API-methods, using the API should become simpler (e.g. for third-parties) since you no longer need to worry about manually fetching and passing around this data.	2021-04-09 13:24:25 +02:00
Ikko Ashimine	c4c4333d54	Fix typo in canvas.js Reseting -> Resetting	2021-04-08 23:45:24 +09:00
Tim van der Meij	6429ccc002	Merge pull request #13194 from Snuffleupagus/ttcf-fuzzy-match Fuzzy-match the fontName, for TrueType Collection fonts, where the "name"-table is wrong (issue 13193)	2021-04-07 20:50:19 +02:00
Tim van der Meij	5945f7c4a1	Merge pull request #13186 from Snuffleupagus/rm-deprecated-code Remove some `deprecated` code	2021-04-07 20:38:59 +02:00
Jonas Jenwald	f986ccdf0e	Fuzzy-match the fontName, for TrueType Collection fonts, where the "name"-table is wrong (issue 13193) The fontName, as defined in the PDF document, cannot be found in any of the "name"-tables in the TrueType Collection font. To work-around that, this patch adds a fallback code-path to allow using an approximately matching fontName rather than outright failing.	2021-04-07 15:25:32 +02:00
Jonas Jenwald	4e81e0e14f	Remove the deprecated `AnnotationStorage.getOrCreateValue`-method (PR 12759 follow-up) While this method has only been deprecated for one release now, the `AnnotationStorage`-functionality is new enough that third-party implementations hopefully don't rely heavily on it just yet. (And removing this quickly should help reduce the likelihood that someone starts using it.)	2021-04-06 13:22:06 +02:00
Tim van der Meij	fc0cd4a443	Convert the `startXRefParsedCache` variable, in `src/core/obj.js`, from an object to a set We only want to track XRef starting points instead of actual data, so using a set conveys that intention more clearly and is slightly more efficient.	2021-04-05 19:32:58 +02:00
Tim van der Meij	228adbf673	Merge pull request #13172 from Snuffleupagus/cleanup-keepFonts [api-minor] Add an option, in `PDFDocumentProxy.cleanup`, to allow fonts to remain attached to the DOM	2021-04-05 14:21:34 +02:00
Jonas Jenwald	16fd838f52	Convert the `renderTasks`, used in `PDFPageProxy.render`/`PDFPageProxy.getOperatorList`, to a Set When removing tasks we're currently forced to indirectly iterate through the array, which can be avoided by using a Set instead. Furthermore, we can also (slightly) modernize the code responsible for initializing the `renderTasks`.	2021-04-05 10:51:28 +02:00
Jonas Jenwald	68d3a333ac	Change the `seenStyles` object, in `PartialEvaluator.getTextContent`, to a Set Given that what we actually want is only to keep track of the loadedFont-names, rather than storing any actual data, using an object isn't really necessary here. Furthermore, in the current code, we're also using `in` when checking if the data exists, which is generally less efficient than just checking for the value directly.	2021-04-05 10:34:02 +02:00
Jonas Jenwald	a2bc6481a0	[api-minor] Add an option, in `PDFDocumentProxy.cleanup`, to allow fonts to remain attached to the DOM As mentioned in the JSDoc comment, this should not be used unless you know what you're doing, since it will lead to increased memory usage. However, in some situations (e.g. SVG-rendering), we still want to be able to run general clean-up on both the main/worker-thread while keeping loaded fonts attached to the DOM.[1] As part of these changes, `WorkerTransport.startCleanup` is converted to an async method and we'll also skip clean-up when destruction has started (since it's redundant). --- [1] The SVG-rendering mode is obviously not officially supported, since it's both rather incomplete and inherently slower. However with recent changes, whereby we cache repeated images on the document rather than the page level, memory usage can be a lot worse than before if we never attempt to release e.g. cached image-data when the viewer is in SVG-rendering mode.	2021-04-02 12:32:31 +02:00
Jonas Jenwald	48ff20493f	Mark some internal `PDFDocumentProxy`-properties as "private" These two properties were never intended to be anything but "private", hence it really cannot hurt to actually indicate that they're not part of any official API.	2021-04-02 12:26:32 +02:00
Jonas Jenwald	0eb1433c78	[api-minor] Change the format of the `fontName`-property, in `defaultAppearanceData`, on Annotation-instances (PR 12831 follow-up) Currently the `fontName`-property contains an actual /Name-instance, which is a problem given that its fallback value is an empty string; see `ca7f546828/src/core/default_appearance.js (L35)` The reason that this is a problem can be seen in `ca7f546828/src/core/primitives.js (L30-L34)`, since an empty string short-circuits the cache. Essentially, in PDF documents, a /Name-instance cannot be empty and the way that the `DefaultAppearanceEvaluator` does things is unfortunately not entirely correct. Hence the `fontName`-property is changed to instead contain a string, rather than a /Name-instance, which simplifies the code overall. Please note: I'm tagging this patch with "[api-minor]", since PR 12831 is included in the current pre-release (although we're not using the `fontName`-property in the display-layer).	2021-04-01 16:47:30 +02:00
Tim van der Meij	ca7f546828	Merge pull request #12908 from calixteman/11918 Slightly rescale lineWidth to workaround chrome rendering issue	2021-03-31 21:56:31 +02:00
Calixte Denizet	a0cfb0841f	Slightly rescale lineWidth to workaround chrome rendering issue	2021-03-31 21:49:00 +02:00
Tim van der Meij	5a64157a2f	Merge pull request #13168 from janpe2/ttf-uni-glyphs Use post table when Encoding has only Differences	2021-03-31 21:35:13 +02:00
Tim van der Meij	1a4af17d07	Merge pull request #13165 from Snuffleupagus/Annotation-rm-defaultAppearance-export [api-minor] Stop exposing the raw `defaultAppearance`-string on Annotation-instances	2021-03-31 21:30:50 +02:00
Tim van der Meij	5be0fbe8f1	Merge pull request #13166 from Snuffleupagus/getDocument-URL [api-minor] Support proper `URL`-objects, in addition to URL-strings, in `getDocument`	2021-03-31 21:20:08 +02:00
Tim van der Meij	2fb4d02ea5	Merge pull request #13158 from Snuffleupagus/rm-URL-polyfill Remove the `URL` polyfill	2021-03-31 20:22:02 +02:00
Jani Pehkonen	0117ee5071	Use post table when Encoding has only Differences Fixes #13107 In the issue, some TrueType glyph names have the format `uniXXXX`. Font's `Encoding` dictionary has the entry `Differences` but no `BaseEncoding`. `uniXXXX` names are converted to glyph indices using font's `post` table but currently that is done only when `BaseEncoding` exists. We must enable the conversion also when only `Differences` exists.	2021-03-31 17:58:44 +03:00
Jonas Jenwald	db1e1612df	[api-minor] Support proper `URL`-objects, in addition to URL-strings, in `getDocument` Currently only URL-strings are officially supported by `getDocument`, however at this point in time I cannot really see any compelling reason to not support `URL`-objects as well. Most likely the reason that we don't already support `URL`-objects, in `getDocument`, is that historically `URL` wasn't fully implemented across browsers and our old polyfill wasn't perfect; see https://developer.mozilla.org/en-US/docs/Web/API/URL/URL#browser_compatibility Please note: Because of how the `url` parameter is currently handled, there are actually some cases where passing a `URL`-object to `getDocument` already works. That, in my opinion, provides additional motivation for supporting `URL`-objects officially, since it makes the API more consistent. The following is an attempt to summarize the current situation, based on the actual code rather than the JSDocs: - `getDocument("url string")` works and is documented.[1] - `getDocument({ url: "url string", })` works and is documented.[1] - `getDocument(new URL(...))` throws immediately, since no supported parameters are found. - `getDocument({ url: new URL(...), })` actually works even though it's not documented.[1] Originally, when data was fetched on the worker-thread, this would likely have thrown since `URL` isn't clonable.[2] - `getDocument({ url: { abc: 123, }, })`, or some similarly meaningless input, will be "accepted" by `getDocument` and then throw a `MissingPDFException` when attempting to fetch the bogus data. With the changes in this patch, not only are `URL`-objects now officially supported and documented when calling `getDocument`, but we'll also do a much better job at actually validating any URL-data passed to `getDocument` (and instead fail early). --- [1] In browsers, we create a valid URL thus indirectly validating the input. In Node.js environments, on the other hand, no validation is done since obtaining a baseUrl is more difficult (and PDF.js is primarily written for browsers anyway). [2] https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm#supported_types	2021-03-31 16:21:41 +02:00
Jonas Jenwald	27add0f1f3	Re-factor the `source` parsing, in `getDocument`, to use `switch` rather than `if...else` Given the number of parameters that we now need to parse here, this code is no longer as readable as one would like. Hence this re-factoring, which will improve overall readability and also help with the next patch.	2021-03-31 16:21:37 +02:00
Jonas Jenwald	9c6770748c	Move the `PDFDocumentStats` typedef closer to its usage Currently this typedef appears slightly out-of-place, in the middle of the arguably much more important `getDocument` JSDocs.	2021-03-31 16:21:22 +02:00


4442 Commits