pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	0c1fb4e740	[api-minor] Remove the `PDFDocumentProxy.stats` getter (PR 15758 follow-up) This was deprecated in PR 15758 and given that it's quite unlikely that any third-party users are relying on this functionality, since it was only ever added to support telemetry reporting in the Firefox PDF Viewer, it should hopefully be fine to remove this fairly quickly. These changes reduce the bundle size of the Firefox PDF Viewer by 4.5 kB in total.	2023-01-01 17:06:47 +01:00
Jonas Jenwald	91524d1a60	[api-minor] Allow specifying an extra-delay, in `RenderTask.cancel`, for worker-thread aborting of operatorList parsing This is done to support upcoming viewer-changes, and in order to prevent third-party users from outright breaking things we'll simply ignore too large values.	2022-12-14 12:34:16 +01:00
Jonas Jenwald	fe8fded23b	[api-minor] Combine the `textContent`/`textContentStream` parameters Rather than handling these parameters separately, which is a left-over from back when streaming of textContent was originally added, we can simply pass either data directly to the `TextLayer` and let it handle things accordingly. Also, improves a few JSDoc comments and `typedef`-imports.	2022-12-04 21:22:14 +01:00
Tim van der Meij	99cfef882f	Merge pull request #15752 from Snuffleupagus/no-typeof-undefined Enable the `no-typeof-undefined` ESLint plugin rule	2022-12-02 19:48:16 +01:00
Calixte Denizet	eed9bf71c5	Refactor the text layer code in order to avoid to recompute it on each draw The idea is just to resuse what we got on the first draw. Now, we only update the scaleX of the different spans and the other values are dependant of --scale-factor. Move some properties in the CSS in order to avoid any updates in JS.	2022-12-01 18:42:43 +01:00
Jonas Jenwald	47dbfc4ade	Enable the `no-typeof-undefined` ESLint plugin rule Please see https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/no-typeof-undefined.md	2022-12-01 18:20:39 +01:00
Calixte Denizet	ea1995991b	Don't add an extra space after a Katakana or a Hiragana at the eol when searching	2022-11-29 10:46:48 +01:00
Calixte Denizet	ae7da6ae48	[JS] By default, a text field value must be treated as a number (bug 1802888)	2022-11-28 16:24:01 +01:00
Calixte Denizet	4ee0c83548	[JS] Fix a rounding issue in printf (bug 1802888)	2022-11-28 14:37:15 +01:00
Jonas Jenwald	0ba242ea4a	Support FileAttachments with hash-signs in the filename (issue 15729) The reason for the issue is that we use the generic `getFilenameFromUrl` helper function, which was originally intended for regular URLs. For the filenames we're dealing with in FileAttachments, we really only want to strip the path when one exists[1]. --- [1] See [bug 1230933](https://bugzilla.mozilla.org/show_bug.cgi?id=1230933) for an example of such a case.	2022-11-23 10:47:33 +01:00
Jonas Jenwald	7d029f8bfe	Add a basic `stringToUTF16HexString` unit-test	2022-11-16 12:39:35 +01:00
Jonas Jenwald	9adc7859c8	Move the `escapeString` helper function into the worker-thread Given that this helper function is only used on the worker-thread, there's no reason to duplicate it in both of the `pdf.js` and `pdf.worker.js` files.	2022-11-16 12:35:48 +01:00
Jonas Jenwald	e5859e145d	Move the `isAscii` helper function into the worker-thread Given that this helper function is only used on the worker-thread, there's no reason to duplicate it in both of the `pdf.js` and `pdf.worker.js` files.	2022-11-16 12:35:48 +01:00
Jonas Jenwald	2eaa708e3a	Combine the `stringToUTF16String` and `stringToUTF16BEString` helper functions Given that these functions are virtually identical, with the latter only adding a BOM, we can combine the two. Furthermore, since both functions were only used on the worker-thread, there's no reason to duplicate this functionality in both of the `pdf.js` and `pdf.worker.js` files.	2022-11-16 12:35:44 +01:00
Calixte Denizet	2be64d63e1	Normalize fullwidth, halfwidth and circled chars when searching	2022-11-14 19:27:51 +01:00
Calixte Denizet	3ca03603c2	[Annotation] Fix printing/saving for annotations containing some non-ascii chars and with no fonts to handle them (bug 1666824) - For text fields * when printing, we generate a fake font which contains some widths computed thanks to an OffscreenCanvas and its method measureText. In order to avoid to have to layout the glyphs ourselves, we just render all of them in one call in the showText method in using the system sans-serif/monospace fonts. * when saving, we continue to create the appearance streams if the fonts contain the char but when a char is missing, we just set, in the AcroForm dict, the flag /NeedAppearances to true and remove the appearance stream. This way, we let the different readers handle the rendering of the strings. - For FreeText annotations * when printing, we use the same trick as for text fields. * there is no need to save an appearance since Acrobat is able to infer one from the Content entry.	2022-11-10 19:05:39 +01:00
Samuel Yuan	36fb5c1e2b	Propagate the translated font name to TextContentItems. This allows font data for system fonts to be looked up in the PDFObjects.	2022-11-08 11:16:21 -08:00
Jonas Jenwald	23930a249e	[api-minor] Let `Catalog.getAllPageDicts` return an empty dictionary when loading the first /Page fails (issue 15590) In order to support opening certain corrupt PDF documents, particularly hand-edited ones, this patch adds support for letting the `Catalog.getAllPageDicts` method fallback to returning an empty dictionary to replace (only) the first /Page of the document. Given that the viewer cannot initialize/load without access to the first page, this will thus allow e.g. document-level scripting to run as expected. Note that by effectively replacing a corrupt or missing first /Page in this way[1], we'll now render nothing but a blank page for certain cases of broken/corrupt PDF documents which may look weird. Please note: This functionality is controlled via the existing `stopAtErrors` option, that can be passed to `getDocument`, since it's easy to imagine use-cases where this sort of fallback behaviour isn't desirable. --- [1] Currently we still require that a /Pages-dictionary is found though, however it may be possible to relax even that assumption if that becomes absolutely necessary in future corrupt documents.	2022-11-03 12:51:48 +01:00
Jonas Jenwald	2516ffa78e	Fallback to finding the first "obj" occurrence, when the trailer-dictionary is incomplete (issue 15590) Note that the "trailer"-case is already a fallback, since normally we're able to use the "xref"-operator even in corrupt documents. However, when a "trailer"-operator is found we still expect "startxref" to exist and be usable in order to advance the stream position. When that's not the case, as happens in the referenced issue, we use a simple fallback to find the first "obj" occurrence instead. This partially fixes issue 15590, since without this patch we fail to find any objects at all during `XRef.indexObjects`. However, note that the PDF document is still corrupt and won't render since there's no actual /Pages-dictionary and the /Root-entry simply points to the /OpenAction-dictionary instead.	2022-11-03 12:46:30 +01:00
Jonas Jenwald	71bd8b4de9	Let `Lexer.getNumber` treat more invalid "numbers" as zero (issue 15604) In the referenced PDF document there are "numbers" which consist only of `-.`, and while that's obviously not valid Adobe Reader seems to handle it just fine. Letting this method ignore more invalid "numbers" was suggested during the review of PR 14543, so let's simply relax our the validation here.	2022-10-20 22:36:15 +02:00
Calixte Denizet	e756bb69e4	[JS] Take into account all the required fields for some computations - Fix Field::getArray in order to collect only the fields which have a value; - Fix AFSimple_Calculate: * allow to have a string with a list of field names as argument; * since a field can be non-terminal, use Field::getArray to collect the field under it and then apply the calculation on all the descendants.	2022-10-13 18:33:12 +02:00
Jonas Jenwald	ce66fefbff	[api-minor] Add partial support for the "GoToE" action (issue 8844) Please note: The referenced issue is the only mention that I can find, in either GitHub or Bugzilla, of "GoToE" actions. Hence why I've purposely settled for a very simple, and partial, "GoToE" implementation to avoid complicating things initially.[1] In particular, this patch only supports "GoToE" actions that references the /EmbeddedFiles-dict in the PDF document. See https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2048909 --- [1] Usually I always prefer having real-world test-cases to work with, whenever I'm implementing new features.	2022-10-06 10:33:07 +02:00
Jonas Jenwald	fe5d9b4b6a	Remove duplicated `destroy`-calls in the "custom ownerDocument" unit-tests Given that `PDFDocumentProxy.destroy` is nothing but an alias for `PDFDocumentLoadingTask.destroy` calling both methods is obviously not useful.	2022-10-02 12:01:41 +02:00
Jonas Jenwald	7b24931f67	Merge pull request #15517 from Snuffleupagus/issue-15516 Add more non-standard ligatures in the `glyphlist.js` file (issue 15516)	2022-09-30 23:30:50 +02:00
Calixte Denizet	330048ad6b	[JS] Add the function AFExactMatch	2022-09-29 14:23:56 -10:00
Jonas Jenwald	c87f90102c	Add more non-standard ligatures in the `glyphlist.js` file (issue 15516) Note that this PR only adds the "underscore"-variant of actually existing ligatures, however the referenced PDF document also uses a couple of non-standard ones (e.g. `ft`, `Th`, and `fh`) that we cannot easily support without larger changes (since they don't have official Unicode-entries). Given that it's clearly the PDF document, and its fonts, that's the culprit here it's not entirely clear to me that we actually want to attempt a larger refactoring/rewriting of the `glyphlist.js` code, assuming it's even generally possible. Especially when this patch alone already improves our copy-paste behaviour when compared to both Adobe Reader and PDFium, and that this is only the second time this sort of bug has been reported.	2022-09-27 16:31:51 +02:00
calixteman	da1780f826	Merge pull request #15486 from nmtigor/fix_orders_of_prop Fix property chain orders of Operators in isDotExpression	2022-09-25 04:13:25 -10:00
Jonas Jenwald	6538409282	Replace some `Array.prototype`-usage with spread syntax We have a few, quite old, call-sites that use the `Array.prototype`-format and which can now be replaced with spread syntax instead.	2022-09-23 09:35:30 +02:00
Calixte Denizet	9e40938a29	[JS] Try to guess what the date is when it doesn't follow the given format (issue #15490 ) We use the format to guess in which order we can find month, day, ... we get the numbers in the date and consider them as month, day, ...	2022-09-22 16:30:39 +02:00
nmtigor	22cc9b7dc7	Fix property chain orders of Operators in isDotExpression and isSomPredicate	2022-09-21 17:20:23 +02:00
Calixte Denizet	f5b835157b	[XFA] Fix an hidden issue in the FormCalc lexer Since there are no script engine with XFA, the FormCalc parser is not used irl. The bug @nmtigor noticed was hidden by another one (the wrong check on `match`).	2022-09-20 13:53:55 +02:00
Jonas Jenwald	21fe5017bb	Remove the abstract `BaseViewer`-class After the changes in PR 14112 the `PDFViewer`-class is now "identical" to the `BaseViewer`-class and the `PDFSinglePageViewer`-class is just a very thin wrapper around the `BaseViewer`-class. Hence we can rename these files, and also remove the abstract `BaseViewer`-class, which helps reduce some unnecessary "closures" in the built viewer. Please note: These changes are made in two separate commits, to allow GitHub to preserve `blame` for the affected files.	2022-09-08 12:38:17 +02:00
Jonas Jenwald	6dc4c994b8	Remove the abstract `BaseViewer`-class After the changes in PR 14112 the `PDFViewer`-class is now "identical" to the `BaseViewer`-class and the `PDFSinglePageViewer`-class is just a very thin wrapper around the `BaseViewer`-class. Hence we can rename these files, and also remove the abstract `BaseViewer`-class, which helps reduce some unnecessary "closures" in the built viewer. Please note: These changes are made in two separate commits, to allow GitHub to preserve `blame` for the affected files.	2022-09-08 12:38:17 +02:00
Calixte Denizet	6c6f6fb2b8	Don't replace cr by a white space when the last char on the line is an ideographic char	2022-09-04 14:21:05 +02:00
Jonas Jenwald	cc4baa2fe9	[api-minor] Add basic support for the `SetOCGState` action (issue 15372) Note that this patch implements the `SetOCGState`-handling in `PDFLinkService`, rather than as a new method in `OptionalContentConfig`[1], since this action is nothing but a series of `setVisibility`-calls and that it seems quite uncommon in real-world PDF documents. The new functionality also required some tweaks in the `PDFLayerViewer`, to ensure that the `layersView` in the sidebar is updated correctly when the optional-content visibility changes from "outside" of `PDFLayerViewer`. --- [1] We can obviously move this code into `OptionalContentConfig` instead, if deemed necessary, but for an initial implementation I figured that doing it this way might be acceptable.	2022-09-01 17:34:24 +02:00
Jonas Jenwald	216b86a082	[api-minor] Support Named-actions in the outline (issue 15367) Apparently this is implemented in e.g. Adobe Reader, and the specification does support it, however it cannot be commonly used in real-world PDF documents since it took over ten years for this feature to be requested.	2022-08-30 18:47:45 +02:00
Calixte Denizet	c06c5f7cbd	[Annotations] charLimit === 0 means unlimited (bug 1782564) Changing the charLimit in JS had no impact, so this patch aims to fix that and add an integration test for it.	2022-08-19 11:28:28 +02:00
Jonas Jenwald	0024165f1f	Move `binarySearchFirstItem` back to the `web/`-folder (PR 15237 follow-up) This was moved into the `src/display/`-folder in PR 15110, for the initial editor-a11y patch. However, with the changes in PR 15237 we're again only using `binarySearchFirstItem` in the `web/`-folder and it thus seem reasonable to move it back there. The primary reason for moving it back is that `binarySearchFirstItem` is currently exposed in the public API, and we always want to avoid that unless it's either PDF-related functionality or code that simply must be shared between the `src/`- and `web/`-folders. In this case, `binarySearchFirstItem` is a general helper function that doesn't really satisfy either of those alternatives.	2022-08-14 11:38:17 +02:00
Jonas Jenwald	dd95e4f851	Add official support for passing `ArrayBuffer`-data to `getDocument` (issue 15269) While this has always worked, as a consequence of the implementation, it's never been officially supported. In addition to adding basic unit-tests, this patch also introduces a couple of new JSDoc `@typedef`s in the API to avoid overly long lines.	2022-08-10 14:13:01 +02:00
Jonas Jenwald	f6db7975c5	Enable the ESLint `prefer-spread` rule Note that in a couple of spots the argument could be `undefined` and there we simply disable the rule instead. Please refer to https://eslint.org/docs/latest/rules/prefer-spread	2022-08-06 10:17:00 +02:00
calixteman	b985eaa98c	Merge pull request #15267 from calixteman/freetext_a11y [Annotation] Add a div containing the text of a FreeText annotation (bug 1780375)	2022-08-04 11:49:29 +02:00
Calixte Denizet	31155740c3	[Annotation] Add a div containing the text of a FreeText annotation (bug 1780375) An annotation doesn't have to be in the text flow, hence it's likely a bad idea to insert its text in the text layer. But the text must be visible from a screen reader point of view so it must somewhere in the DOM. So with this patch, the text from a FreeText annotation is extracted and added in a div in its HTML counterpart, and with the patch #15237 the text should be visible and positioned relatively to the text flow.	2022-08-04 11:14:05 +02:00
Calixte Denizet	6916fabd51	Skip unknown fields when calculating a value in using AFSimple_Calculate	2022-08-03 23:40:09 +02:00
Jonas Jenwald	0c31320c12	[api-minor] Improve `thumbnail` handling in documents that contain interactive forms To improve performance of the sidebar we use the page-canvases to generate the thumbnails whenever possible, since that avoids unnecessary re-rendering when the sidebar is open. This works generally well, however there's an old problem in PDF documents that contain interactive forms (when those are enabled): Note how the thumbnails become partially (or fully) blank, since those Annotations are not included in the OperatorList.[1] We obviously want to keep using the `PDFThumbnailView.setImage`-method for most documents, however we need a way to skip it only for those pages that contain interactive forms. As it turns out it's unfortunately not all that simple to tell, after the fact, from looking only at the OperatorList that some Annotations were skipped. While it might have been possible to try and infer that in the viewer, it'd not have been pretty considering that at the time when rendering finishes the annotationLayer has not yet been built. The overall simplest solution that I could come up with, was instead to include a summary of the interactive form-state when doing the final "flushing" of the OperatorList and expose that information in the API. --- [1] Some examples from our test-suite: `annotation-tx2.pdf` where the thumbnail is completely blank, and `bug1737260.pdf` where the thumbnail is missing the "buttons" found on the page.	2022-07-30 16:53:32 +02:00
Jonas Jenwald	2fb083f3e2	Ensure that the `isUsingOwnCanvas`-parameter is consistently included in operatorLists (PR 14247 follow-up) Currently some `OPS.beginAnnotation` arguments will contain a `Number` value for the `isUsingOwnCanvas`-parameter, or in some cases an `undefined` value, which is inconsistent from an API perspective.	2022-07-28 13:37:37 +02:00
Calixte Denizet	7831a100b3	[Editor] Add the possibility to change line opacity in Ink editor	2022-07-27 18:46:25 +02:00
Calixte Denizet	af41a5cb49	[Editor] Simplify the command manager The previous version was maybe functional but definitely painful to maintain (maybe more efficient... I don't know) so this patch aims to simplify it and it adds some basic unit tests.	2022-07-21 18:44:41 +02:00
Jonas Jenwald	f46895d750	Merge pull request #15110 from calixteman/editing_a11y [Editor] Improve a11y for newly added element (#15109)	2022-07-19 20:02:53 +02:00
Calixte Denizet	624b26e1de	[Editor] Improve a11y for newly added element (#15109 ) - In the annotationEditorLayer, reorder the editors in the DOM according the position of the elements on the screen; - add an aria-owns attribute on the "nearest" element in the text layer which points to the added editor.	2022-07-19 18:52:17 +02:00
Jonas Jenwald	37ebc28756	Use more `for...of` loops in the code-base Note that these cases, which are all in older code, were found using the [`unicorn/no-for-loop`](https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/no-for-loop.md) ESLint plugin rule. However, note that I've opted not to enable this rule by default since there's still some cases where I do think that it makes sense to allow "regular" for-loops.	2022-07-17 16:18:54 +02:00

1 2 3 4 5 ...

1031 Commits