*Please note:* The reduced test-case is *not* a perfect reproduction of the original PDF document, since the reduced one fails to open in e.g. Adobe Reader, but I do believe that it captures the most important points here.
For corrupt *and* encrypted PDF documents, it's possible that only some trailer dictionaries actually contain an /Encrypt-entry. Previously we could easily miss that, since we generally pick the first not obviously corrupt trailer dictionary; the solution implemented here is to simply pre-parse all trailer dictionaries to see if there are any /Encrypt-entries.
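A rough sketch of the idea, with a hypothetical helper name and dictionary API (not the actual pdf.js code):

```js
// Rough sketch (hypothetical helper): check *every* candidate trailer
// dictionary for an /Encrypt entry, rather than only the first usable one.
function trailersContainEncrypt(trailerDicts) {
  for (const trailerDict of trailerDicts) {
    if (trailerDict.has("Encrypt")) {
      return true; // at least one trailer references an /Encrypt dictionary
    }
  }
  return false;
}
```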
In GeckoView, when an event occurs, a callback must be executed with the result of an action,
but the callback can only be used once.
So for each FindInPage event, we must trigger only one matches-count update.
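A minimal sketch of the resulting guard, with hypothetical names (`callback`, `onMatchesCountUpdated`) standing in for the actual GeckoView plumbing:

```js
// Hypothetical sketch: make sure the single-use GeckoView callback is only
// ever invoked once per FindInPage event.
let responded = false;

function onMatchesCountUpdated(current, total) {
  if (responded) {
    return; // the callback has already been consumed for this event
  }
  responded = true;
  callback.onSuccess({ current, total });
}
```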
This was deprecated in PR 15758. Given that it's quite unlikely that any third-party users rely on this functionality, since it was only ever added to support telemetry reporting in the Firefox PDF Viewer, it should hopefully be fine to remove it fairly quickly.
These changes reduce the bundle size of the Firefox PDF Viewer by 4.5 kB in total.
When trying to find incomplete objects, i.e. those missing the "endobj"-string at the end, there are unfortunately a number of possible operators that we need to check for. Otherwise we could miss e.g. the "trailer" at the end of a corrupt PDF document, which is why the referenced document didn't work.
Currently we do all searching on the "raw" bytes of the PDF document, for efficiency; however, this doesn't really work when we need to check for *multiple* potential command-strings. To keep the complexity manageable we'll instead use regular expressions here, but we can at least avoid creating lots of substrings thanks to the `RegExp.lastIndex` property, which is well supported across browsers; see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/lastIndex#browser_compatibility
Note that this repeated regular expression usage could perhaps be slightly less efficient than the old code, however this method is only invoked for corrupt PDF documents.
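To illustrate the `RegExp.lastIndex` trick in isolation (a generic example, not the actual parser code): with the `g` flag set, assigning to `lastIndex` makes `exec` start matching at that offset, so we can look for several command-strings from a given position without slicing the data into substrings.

```js
// Generic illustration: find the next of several command-strings, starting
// at `startPos`, without creating any substrings.
const CMD_RE = /\b(?:trailer|startxref|obj)\b/g;

function findNextCommand(data, startPos) {
  CMD_RE.lastIndex = startPos; // begin matching at the given offset
  const match = CMD_RE.exec(data);
  return match ? { cmd: match[0], pos: match.index } : null;
}
```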
Previously we'd abort all parsing if an Error was encountered, despite the fact that multiple `startXRefQueue`-entries may be available and that continued parsing could thus eventually find usable data.
Note that in the referenced PDF document the `startxref`-operator, at the end of the file, points to a position in the middle of an arbitrary `stream` which is why things break.
This is done to support upcoming viewer-changes, and in order to prevent third-party users from outright breaking things we'll simply ignore too large values.
It's a follow-up of #14950: some format actions are run when the document is opened,
but we must be sure we have everything ready for that, hence we have to run some
named actions before running the global format.
In playing with the form, I discovered that the blur event wasn't triggered when
JS called `setFocus` (because in such a case the mouse was never down). So I removed
the mouseState thing to just use the correct commitKey when blur is triggered by a
TAB key.
In order to move the annotations in the DOM so that their order corresponds
to the visual order, we need their dimensions/positions, which means that
the parent must have some dimensions.
An annotation editor layer can be destroyed when it's invisible, hence some
annotations can have a null parent. But when printing/saving, or when changing the
font size, color, ... of all added annotations (when selected with ctrl+a), we
still need some parent properties, especially the page dimensions, global
scale factor and global rotation angle.
This patch aims to remove all the references to the parent in the editor instances
except in some cases where an editor should obviously have one.
It fixes #15780.
The main issue is due to the fact that an editor's parent can be null when
we want to serialize it, and that leads to an exception which breaks the whole
saving/printing process.
So this incomplete patch fixes only the saving/printing issue, but not the
underlying problem (i.e. having a null parent), and doesn't bring that much
complexity, so it should help to uplift it to the next Firefox release.
Rather than handling these parameters separately, which is a left-over from back when streaming of textContent was originally added, we can simply pass either data directly to the `TextLayer` and let it handle things accordingly.
Also, improves a few JSDoc comments and `typedef`-imports.
The idea is just to reuse what we got on the first draw.
Now, we only update the scaleX of the different spans and the other values
depend on --scale-factor.
Move some properties into the CSS in order to avoid any updates in JS.
This can't be a particularly common feature, since we've supported Optional Content for over two years and this is the very first TilingPattern-case we've seen.
The reason for the issue is that we use the generic `getFilenameFromUrl` helper function, which was originally intended for regular URLs.
For the filenames we're dealing with in FileAttachments, we really only want to strip the path when one exists[1].
---
[1] See [bug 1230933](https://bugzilla.mozilla.org/show_bug.cgi?id=1230933) for an example of such a case.
Given that this helper function is only used on the worker-thread, there's no reason to duplicate it in both of the `pdf.js` and `pdf.worker.js` files.
Given that these functions are virtually identical, with the latter only adding a BOM, we can combine the two. Furthermore, since both functions were only used on the worker-thread, there's no reason to duplicate this functionality in both of the `pdf.js` and `pdf.worker.js` files.
Given that this PDF document is an interesting test-case for performance reasons, w.r.t. inline image caching, it probably can't hurt to add it to the test-suite to make it more readily available.
Considering the contents of that PDF document I'm not sure if we can include it directly in the repository, which is why a *linked* test-case was chosen here.
- For text fields
* when printing, we generate a fake font which contains some widths computed thanks to
an OffscreenCanvas and its measureText method.
In order to avoid having to lay out the glyphs ourselves, we just render all of them
in one call in the showText method, using the system sans-serif/monospace fonts
(see the sketch after this list).
* when saving, we continue to create the appearance streams if the fonts contain the chars,
but when a char is missing we just set, in the AcroForm dict, the flag /NeedAppearances
to true and remove the appearance stream. This way, we let the different readers handle
the rendering of the strings.
- For FreeText annotations
* when printing, we use the same trick as for text fields.
* there is no need to save an appearance since Acrobat is able to infer one from the
Content entry.
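Below is a rough sketch of the width-measurement idea for the printing case, using the standard `OffscreenCanvas` API (illustrative only, not the actual pdf.js code):

```js
// Illustrative sketch: measure glyph widths with a system font via an
// OffscreenCanvas 2D context, instead of laying out the glyphs ourselves.
const ctx = new OffscreenCanvas(1, 1).getContext("2d");
ctx.font = "30px sans-serif";

function measureGlyphWidth(char) {
  return ctx.measureText(char).width;
}
```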
*Please note:* This only fixes the "wrong letter" part of bug 1799927.
It appears that the simple `computeAdler32` function, used when caching inline images, generates hash collisions for some (very short) TypedArrays. In this case that leads to some of the "letters", which are actually inline images, being rendered incorrectly.
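To illustrate why very short inputs are prone to collisions, here is a minimal Adler-32 implementation (not necessarily identical, detail for detail, to the `computeAdler32` helper) together with a concrete collision:

```js
// Minimal Adler-32 sketch: only a plain byte-sum and a positionally-weighted
// sum are tracked, so distinct short inputs can easily produce the same hash.
function adler32(bytes) {
  const MOD = 65521;
  let a = 1,
    b = 0;
  for (const byte of bytes) {
    a = (a + byte) % MOD;
    b = (b + a) % MOD;
  }
  return ((b << 16) | a) >>> 0;
}

console.log(adler32(new Uint8Array([0, 2, 0]))); // 458755
console.log(adler32(new Uint8Array([1, 0, 1]))); // 458755 -- collision
```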
Rather than switching to another hashing algorithm, e.g. the `MurmurHash3_64` class, we simply cache using a stringified version of the inline image data as the cacheKey to prevent any future collisions. While this will (naturally) lead to slightly higher peak memory usage, it'll however be limited to the current `Parser`-instance which means that it's not persistent.
One small benefit of these changes is that we can avoid creating lots of `Stream`-instances for already cached inline images.
In order to support opening certain corrupt PDF documents, particularly hand-edited ones, this patch adds support for letting the `Catalog.getAllPageDicts` method fallback to returning an *empty* dictionary to replace (only) the first /Page of the document.
Given that the viewer cannot initialize/load without access to the first page, this will thus allow e.g. document-level scripting to run as expected. Note that by effectively replacing a corrupt or missing first /Page in this way[1], we'll now render nothing but a *blank* page for certain cases of broken/corrupt PDF documents which may look weird.
*Please note:* This functionality is controlled via the existing `stopAtErrors` option, that can be passed to `getDocument`, since it's easy to imagine use-cases where this sort of fallback behaviour isn't desirable.
---
[1] Currently we still require that a /Pages-dictionary is found though, however it *may* be possible to relax even that assumption if that becomes absolutely necessary in future corrupt documents.
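For completeness, a small usage sketch of the `stopAtErrors` option mentioned above (assuming the standard `pdfjsLib` entry point; the filename is hypothetical):

```js
// Usage sketch: with `stopAtErrors` enabled, loading aborts on parsing errors
// instead of falling back to an *empty* dictionary for the first /Page.
const loadingTask = pdfjsLib.getDocument({
  url: "hand-edited.pdf", // hypothetical corrupt document
  stopAtErrors: true,
});
```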
Note that the "trailer"-case is already a fallback, since normally we're able to use the "xref"-operator even in corrupt documents. However, when a "trailer"-operator is found we still expect "startxref" to exist and be usable in order to advance the stream position. When that's not the case, as happens in the referenced issue, we use a simple fallback to find the first "obj" occurrence instead.
This *partially* fixes issue 15590, since without this patch we fail to find any objects at all during `XRef.indexObjects`. However, note that the PDF document is still corrupt and won't render since there's no actual /Pages-dictionary and the /Root-entry simply points to the /OpenAction-dictionary instead.
When a form isn't changed, we use the appearances we had in the file, but when
/NeedAppearances is true, all the appearances have to be regenerated whatever they are.