pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	f2f0a1e871	[api-minor] Stop sending "UnsupportedFeature" from the worker-thread GetOperatorList-handling This code was added all the way back in PR 6698, almost seven years ago, for backwards compatibility reasons. At this point in time, it seems that we can remove that since: - We have more fine-grained "UnsupportedFeature" reporting elsewhere in the worker-thread code nowadays. - The GetOperatorList-handling is now using `ReadableStream`s, which means that errors are being forwarded to the main-thread anyway. - We're also no longer displaying a notification-bar, in the built-in Firefox PDF Viewer, for any of these "UnsupportedFeature" messages.	2022-10-13 11:46:17 +02:00
Jonas Jenwald	858d941ff8	Take the /CIDToGIDMap into account when getting the glyph mapping for CFF fonts (issue 15559) Please note: I don't really know what I'm doing here, however the patch appears to fix the referenced issue when comparing the rendering with Adobe Reader (with the caveat that I don't speak the language in question).	2022-10-13 10:02:25 +02:00
Jonas Jenwald	5bc6f964db	Slightly re-factor the version fetching in `PDFDocument.checkHeader` Note how after having found the "%PDF-" prefix we then read both the prefix and the version in the loop, only to then remove the prefix at the end. It seems better to instead advance the stream position past the "%PDF-" prefix, and then read only the version data. Finally the loop-condition can also be simplified slightly, to further clean-up some very old code.	2022-10-11 13:15:01 +02:00
Jonas Jenwald	081e897588	Ensure that `Page.getOperatorList` handles Annotation parsing errors correctly (issue 15557) Fixes a regression from PR 15246, sorry about that! The return value of all `Annotation.getOperatorList` methods was changed in PR 15246, however I missed updating the error code-path in `Page.getOperatorList` which thus breaks all operatorList-parsing for pages with corrupt Annotations.	2022-10-10 09:48:01 +02:00
Tim van der Meij	dff444d441	Merge pull request #15555 from Snuffleupagus/improve-GetDocRequest Clean-up the data that we're sending with "GetDocRequest"	2022-10-09 14:10:44 +02:00
Jonas Jenwald	8a4f6aca97	Stop using the `source`-object when sending "GetDocRequest" Looking at the code on the worker-thread, there doesn't appear to be any particular reason for placing some of the properties in a `source`-object when sending them with "GetDocRequest". As is often the case the explanation for this structure is rather "for historical reasons", since originally we simply sent the `source`-object as-is. Doing that was obviously a bad idea, for a couple of reasons: - It makes it less clear what is/isn't actually needed on the worker-thread. - Sending unused properties will unnecessarily increase memory usage. - The `source`-object may contain unclonable data, which would break the library.	2022-10-09 12:45:24 +02:00
Jonas Jenwald	c84b717773	Group the `evaluatorOptions` on the main-thread, when sending "GetDocRequest" Rather than sending all of these parameters individually and then grouping them together on the worker-thread, we can simply handle that in the API instead.	2022-10-09 12:31:03 +02:00
Jonas Jenwald	4cc98de6d7	Remove the unused `CMapCompressionType.STREAM` value This was added in PR 8064, over five years ago, for a possible future CMap file-format that was never implemented.	2022-10-08 17:10:05 +02:00
Calixte Denizet	c0e165bf97	Simplify the way to compute the remainder modulo 3 in PDF20Hash function I noticed that 256 % 3 is equal to 1, so I slightly simplified the code. The sum of the 16 Uint8 values doesn't exceed 2^12, hence we can just take the sum modulo 3.	2022-10-07 14:43:31 +02:00
Jonas Jenwald	3cb119cb32	Merge pull request #15539 from Snuffleupagus/DecryptStream-set Replace loop with `TypedArray.prototype.set` in the `DecryptStream.readBlock` method	2022-10-07 11:14:28 +02:00
Jonas Jenwald	1ea4c4b519	[api-minor] Make `isOffscreenCanvasSupported` configurable via the API (issue 14952) This patch first of all makes `isOffscreenCanvasSupported` configurable, defaulting to `true` in browsers and `false` in Node.js environments, with a new `getDocument` parameter. While you normally want to use this, in order to improve performance, it should still be possible for users to control it (similar to e.g. `isEvalSupported`). The specific problem, as reported in issue 14952, is that the SVG back-end doesn't support the new ImageMask data-format that's introduced in PR 14754. In particular: - When the SVG back-end is used in Node.js environments, this patch will "just work" without the user needing to make any code changes. - If the SVG back-end is used in browsers, this patch will require that `isOffscreenCanvasSupported: false` is added to the `getDocument`-call.	2022-10-07 00:10:46 +02:00
Jonas Jenwald	6877d8b9e2	Replace loop with `TypedArray.prototype.set` in the `DecryptStream.readBlock` method There's no reason to use a manual loop, when a native method exists.	2022-10-06 14:43:24 +02:00
Jonas Jenwald	ce66fefbff	[api-minor] Add partial support for the "GoToE" action (issue 8844) Please note: The referenced issue is the only mention that I can find, in either GitHub or Bugzilla, of "GoToE" actions. Hence why I've purposely settled for a very simple, and partial, "GoToE" implementation to avoid complicating things initially.[1] In particular, this patch only supports "GoToE" actions that reference the /EmbeddedFiles-dict in the PDF document. See https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2048909 --- [1] Usually I prefer having real-world test-cases to work with, whenever I'm implementing new features.	2022-10-06 10:33:07 +02:00
Jonas Jenwald	60f6272ed9	Use more `for...of` loops in the code-base Most, if not all, of this code is old enough to predate the general availability of `for...of` iteration.	2022-10-03 13:08:38 +02:00
Jonas Jenwald	c87f90102c	Add more non-standard ligatures in the `glyphlist.js` file (issue 15516) Note that this PR only adds the "underscore"-variant of actually existing ligatures, however the referenced PDF document also uses a couple of non-standard ones (e.g. `ft`, `Th`, and `fh`) that we cannot easily support without larger changes (since they don't have official Unicode-entries). Given that it's clearly the PDF document, and its fonts, that's the culprit here it's not entirely clear to me that we actually want to attempt a larger refactoring/rewriting of the `glyphlist.js` code, assuming it's even generally possible. Especially when this patch alone already improves our copy-paste behaviour when compared to both Adobe Reader and PDFium, and that this is only the second time this sort of bug has been reported.	2022-09-27 16:31:51 +02:00
calixteman	da1780f826	Merge pull request #15486 from nmtigor/fix_orders_of_prop Fix property chain orders of Operators in isDotExpression	2022-09-25 04:13:25 -10:00
Jonas Jenwald	6538409282	Replace some `Array.prototype`-usage with spread syntax We have a few, quite old, call-sites that use the `Array.prototype`-format and which can now be replaced with spread syntax instead.	2022-09-23 09:35:30 +02:00
Jonas Jenwald	f1b0dc6f04	Tweak the heuristic that handles JPEG images with a wildly incorrect SOF (Start of Frame) `scanLines` parameter (issue 15492)	2022-09-22 14:09:04 +02:00
nmtigor	22cc9b7dc7	Fix property chain orders of Operators in isDotExpression and isSomPredicate	2022-09-21 17:20:23 +02:00
Calixte Denizet	198e9a3db1	Initialize values in the path bounding box before flushing the operator list (bug 1791583) OperatorList.addOp can trigger a flush if it's required, hence the values passed to it must be correctly initialized in order to avoid some wrong values in the renderer. Because of that a clip path was considered as empty, nothing was clipped, hence the wrong rendering in bug 1791583.	2022-09-20 20:01:54 +02:00
Calixte Denizet	f5b835157b	[XFA] Fix a hidden issue in the FormCalc lexer Since there is no script engine with XFA, the FormCalc parser is not used irl. The bug @nmtigor noticed was hidden by another one (the wrong check on `match`).	2022-09-20 13:53:55 +02:00
Jonas Jenwald	20b9887476	Enable the `unicorn/prefer-regexp-test` ESLint plugin rule Please see https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-regexp-test.md	2022-09-19 16:34:01 +02:00
Jonas Jenwald	bb75b36b77	Replace some unnecessary `String.prototype.search` usage Most of the `String.prototype.search` call-sites found throughout the code-base are actually not necessary, since we usually only want a boolean, and those can be replaced with `RegExp.prototype.test` instead.	2022-09-19 12:51:46 +02:00
Jonas Jenwald	7a19def34c	Extend `getSupplementalGlyphMapForCalibri` with more entries (issue 15443)	2022-09-15 22:19:16 +02:00
Jonas Jenwald	2f2ecad8fd	Extend `getGlyphMapForStandardFonts` with some quote-entries (issue 15441)	2022-09-15 11:37:20 +02:00
Jonas Jenwald	947d390421	Fall back to a standard font when a Type1 font program is empty (issue 15292) Please note: This is only a, hopefully generally helpful, work-around rather than a proper solution to issue 15292. There's something that's "special" about the Type1 fonts in the referenced PDF document, since we don't manage to find any actual font programs and thus cannot render anything. Given that it shouldn't make sense for a Type1 font program to ever be empty, since that means that there's no glyph-data to render, we simply fall back to a standard font to at least try and render something in these rare cases.	2022-09-05 12:07:19 +02:00
Jonas Jenwald	12d60e0acf	Don't allow `adjustToUnicode` to extend a built-in /ToUnicode map (issue 15352) Given that the change in PR 13393 was slightly speculative, given the lack of test-cases, let's just revert part of that to fix the referenced issue. Based on a quick look at old issues and existing test-cases, it seems that most (if not all) PDF documents that benefit from using the font-data in this way lack any /ToUnicode maps which should mean that they're unaffected by these changes.	2022-09-03 23:11:42 +02:00
Jonas Jenwald	cc4baa2fe9	[api-minor] Add basic support for the `SetOCGState` action (issue 15372) Note that this patch implements the `SetOCGState`-handling in `PDFLinkService`, rather than as a new method in `OptionalContentConfig`[1], since this action is nothing but a series of `setVisibility`-calls and that it seems quite uncommon in real-world PDF documents. The new functionality also required some tweaks in the `PDFLayerViewer`, to ensure that the `layersView` in the sidebar is updated correctly when the optional-content visibility changes from "outside" of `PDFLayerViewer`. --- [1] We can obviously move this code into `OptionalContentConfig` instead, if deemed necessary, but for an initial implementation I figured that doing it this way might be acceptable.	2022-09-01 17:34:24 +02:00
Jonas Jenwald	216b86a082	[api-minor] Support Named-actions in the outline (issue 15367) Apparently this is implemented in e.g. Adobe Reader, and the specification does support it, however it cannot be commonly used in real-world PDF documents since it took over ten years for this feature to be requested.	2022-08-30 18:47:45 +02:00
Calixte Denizet	c06c5f7cbd	[Annotations] charLimit === 0 means unlimited (bug 1782564) Changing the charLimit in JS had no impact, so this patch aims to fix that and add an integration test for it.	2022-08-19 11:28:28 +02:00
Jonas Jenwald	6a2c2a646f	Remove the remaining closures in the `src/core/type1_parser.js` file Given that the code is written with JavaScript module-syntax, none of this functionality will "leak" outside of this file with these changes. By removing these closures the file-size is decreased, even for the built `pdf.worker.js` file, since there's now less overall indentation in the code.	2022-08-14 12:50:26 +02:00
Jonas Jenwald	e5e756c0b4	Remove the remaining closures in the `src/core/cff_parser.js` file Given that the code is written with JavaScript module-syntax, none of this functionality will "leak" outside of this file with these changes. For e.g. the `gulp mozcentral` command the built `pdf.worker.js` file-size decreases `~2 kB` with this patch, and most of the improvement comes from having less overall indentation in the code.	2022-08-13 19:48:17 +02:00
Jonas Jenwald	9dcfdb9578	Remove the remaining closure in the `src/core/function.js` file Given that the code is written with JavaScript module-syntax, none of this functionality will "leak" outside of this file with these changes. By removing this closure the file-size is decreased, even for the built `pdf.worker.js` file, since there's now less overall indentation in the code.	2022-08-13 12:52:36 +02:00
Calixte Denizet	04f78c935c	Fix OTS issue with empty index (#15289)	2022-08-08 22:56:26 +02:00
Tim van der Meij	2a84a3078b	Merge pull request #15283 from Snuffleupagus/sort-PopupAnnotation [api-minor] Sort PopupAnnotations already on the worker-thread (PR 11535 follow-up)	2022-08-06 15:07:09 +02:00
Jonas Jenwald	876a02a504	[api-minor] Sort PopupAnnotations already on the worker-thread (PR 11535 follow-up) By doing this in the worker-thread this code will only need to run once, whereas currently re-rendering of a page forces this to be repeated (e.g. after it's been scrolled out-of-view and then back into view again).	2022-08-06 11:42:45 +02:00
Jonas Jenwald	f6db7975c5	Enable the ESLint `prefer-spread` rule Note that in a couple of spots the argument could be `undefined` and there we simply disable the rule instead. Please refer to https://eslint.org/docs/latest/rules/prefer-spread	2022-08-06 10:17:00 +02:00
Calixte Denizet	31155740c3	[Annotation] Add a div containing the text of a FreeText annotation (bug 1780375) An annotation doesn't have to be in the text flow, hence it's likely a bad idea to insert its text in the text layer. But the text must be visible from a screen reader point of view, so it must be somewhere in the DOM. So with this patch, the text from a FreeText annotation is extracted and added in a div in its HTML counterpart, and with patch #15237 the text should be visible and positioned relative to the text flow.	2022-08-04 11:14:05 +02:00
Jonas Jenwald	0c31320c12	[api-minor] Improve `thumbnail` handling in documents that contain interactive forms To improve performance of the sidebar we use the page-canvases to generate the thumbnails whenever possible, since that avoids unnecessary re-rendering when the sidebar is open. This works generally well, however there's an old problem in PDF documents that contain interactive forms (when those are enabled): Note how the thumbnails become partially (or fully) blank, since those Annotations are not included in the OperatorList.[1] We obviously want to keep using the `PDFThumbnailView.setImage`-method for most documents, however we need a way to skip it only for those pages that contain interactive forms. As it turns out it's unfortunately not all that simple to tell, after the fact, from looking only at the OperatorList that some Annotations were skipped. While it might have been possible to try and infer that in the viewer, it'd not have been pretty considering that at the time when rendering finishes the annotationLayer has not yet been built. The overall simplest solution that I could come up with, was instead to include a summary of the interactive form-state when doing the final "flushing" of the OperatorList and expose that information in the API. --- [1] Some examples from our test-suite: `annotation-tx2.pdf` where the thumbnail is completely blank, and `bug1737260.pdf` where the thumbnail is missing the "buttons" found on the page.	2022-07-30 16:53:32 +02:00
Calixte Denizet	d092a85b6c	Fix wrong order of arguments when calling the CipherTransform ctor (bug 1782186)	2022-07-29 12:46:45 +02:00
Jonas Jenwald	2fb083f3e2	Ensure that the `isUsingOwnCanvas`-parameter is consistently included in operatorLists (PR 14247 follow-up) Currently some `OPS.beginAnnotation` arguments will contain a `Number` value for the `isUsingOwnCanvas`-parameter, or in some cases an `undefined` value, which is inconsistent from an API perspective.	2022-07-28 13:37:37 +02:00
Calixte Denizet	7831a100b3	[Editor] Add the possibility to change line opacity in Ink editor	2022-07-27 18:46:25 +02:00
Jonas Jenwald	fc018ea9ea	Support images with /Filter-entries that contain Arrays (issue 15220) This patch "borrows" the code found in the `Parser.makeInlineImage`-method, to ensure that JBIG2 and JPX images can be rendered correctly.	2022-07-25 08:41:37 +02:00
Jonas Jenwald	60bd9580e2	Ignore invalid /CIDToGIDMap-entries when parsing fonts (issue 15139) In the referenced PDF document the fonts have /CIDToGIDMap-entries that cannot be loaded. Hence, only when `ignoreErrors` is set, we'll now ignore these corrupt /CIDToGIDMap-entries and fall back to simply assuming that no such data is available. Given that this is clearly a case of a corrupt PDF document, there's no guarantee that this will "fix" things in the general case since a /CIDToGIDMap may be required in order for some composite fonts to render correctly. However, attempting to render something is surely better than skipping a font altogether.	2022-07-20 11:58:44 +02:00
Jonas Jenwald	37ebc28756	Use more `for...of` loops in the code-base Note that these cases, which are all in older code, were found using the [`unicorn/no-for-loop`](https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/no-for-loop.md) ESLint plugin rule. However, note that I've opted not to enable this rule by default since there's still some cases where I do think that it makes sense to allow "regular" for-loops.	2022-07-17 16:18:54 +02:00
Jonas Jenwald	de7d1d2167	Merge pull request #15170 from calixteman/js_rm_null [JS] Embedded JS scripts can have some null chars	2022-07-15 17:11:29 +02:00
Jonas Jenwald	acd61a138e	Handle errors in the "Loading by ref" code-path in `PartialEvaluator.loadFont` Note how we currently throw a "raw" Error, which is problematical since all of the `PartialEvaluator.loadFont` call-sites expect a Promise to be returned. Furthermore, this also means that we don't benefit from the fallback code-path that now exists below. Please note: Unfortunately I don't have a test-case that fails without this patch, since it's something I happened to notice when reading the code while working on another patch.	2022-07-15 16:33:36 +02:00
Calixte Denizet	5f0c95e70e	[JS] Embedded JS scripts can have some null chars	2022-07-15 16:05:25 +02:00
calixteman	41b2f52f70	Merge pull request #15157 from calixteman/1778484 Add unicode mapping in the font cmap to have correct chars when printing in pdf (bug 1778484)	2022-07-13 14:45:12 +02:00
Calixte Denizet	680c293c34	Add unicode mapping in the font cmap to have correct chars when printing in pdf (bug 1778484) It aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1778484.	2022-07-13 14:38:27 +02:00
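
Note on commit c0e165bf97 (remainder modulo 3): the message relies on 256 % 3 being 1, which means every power of 256 is also congruent to 1 modulo 3. A minimal sketch of that arithmetic follows, assuming the 16 bytes are read as one big-endian unsigned integer. The function names `mod3BigInt` and `mod3BySum` are hypothetical and are not taken from the pdf.js source.

```javascript
// Hypothetical helpers for illustration only; not the pdf.js implementation.

// Reference version: build the big-endian integer, then reduce modulo 3.
function mod3BigInt(bytes) {
  let value = 0n;
  for (const b of bytes) {
    value = (value << 8n) | BigInt(b);
  }
  return Number(value % 3n);
}

// Shortcut: since 256 ≡ 1 (mod 3), each byte contributes just its own
// value, so the remainder equals the byte sum modulo 3.
// For 16 bytes the sum is at most 16 * 255 = 4080, which is below 2^12.
function mod3BySum(bytes) {
  let sum = 0;
  for (const b of bytes) {
    sum += b;
  }
  return sum % 3;
}

// Usage: both approaches agree on an arbitrary 16-byte block.
const block = Uint8Array.from({ length: 16 }, (_, i) => (i * 37 + 11) & 0xff);
console.log(mod3BigInt(block) === mod3BySum(block));
```

The sum-based version avoids building a large intermediate number, which appears to be the simplification the commit describes.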


2687 Commits