pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	8b1d01816b	Re-factor the `charsCache` on `Font`-instances Currently `charsCache` is initialized lazily, which considering that it just contains a simple `Object` doesn't seem entirely necessary. This first of all forces us to do repeated exists-checks in the `Font.charsToGlyphs` method, and secondly the similar/related `glyphCache` is already initialized eagerly. Furthermore, this patch also does a bit of clean-up in the `Font.charsToGlyphs` method since this code is quite old.	2021-05-26 13:13:44 +02:00
Tim van der Meij	3da9f077be	Merge pull request #13435 from Snuffleupagus/eslint-no-array-push-push Enable the `unicorn/no-array-push-push` ESLint plugin rule	2021-05-25 21:10:01 +02:00
Calixte Denizet	45c3f00a27	XFA - Move the fake HTML representation of XFA from the worker to the main thread - the only goal of this patch is to be able to get synchronously the fake html when printing from firefox: - in order to print we need to inject some html in beforeprint callback but we cannot block in waiting for all the pages. - from a memory point of view: it doesn't change anything since the fake HTML is deleted in the worker; - this way we don't break any assumptions.	2021-05-25 19:33:07 +02:00
Calixte Denizet	9478d2f064	XFA - Add a storage to save fields values - this is required to be able to print (or save) a document. Some pages can be unloaded (because pdf.js is lazy) and this storage will help to save their data in order to reuse them when printing or just when displaying a page again.	2021-05-25 19:25:09 +02:00
Calixte Denizet	7cebdbd58c	XFA - Fix a lot of layout issues - I thought it was possible to rely on the browser layout engine to handle layout stuff but it isn't possible - mainly because when a contentArea overflows, we must continue to layout in the next contentArea - when no more contentArea is available then we must go to the next page... - we must handle breakBefore and breakAfter, which allow us to "break" the layout to go to the next container - Sometimes some containers don't provide their dimensions so we must compute them in order to know where to put them in their parents but to compute those dimensions we need to layout the container itself... - See top of file layout.js for more explanations about layout. - fix a few bugs in other places that I found during my work on layout.	2021-05-25 17:51:36 +02:00
Jonas Jenwald	ec3bcadf56	Enable the `unicorn/no-array-push-push` ESLint plugin rule There's generally speaking no need to use multiple consecutive `Array.prototype.push()` calls, since that method accepts multiple arguments, and this ESLint rule helps enforce that pattern. Please see https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/no-array-push-push.md for additional information.	2021-05-25 13:54:46 +02:00
Calixte Denizet	209ac5ca57	XFA - Don't display images with a href	2021-05-22 15:09:43 +02:00
calixteman	0df1a56619	Merge pull request #13417 from Snuffleupagus/xfa-URL-clone [XFA] Send URLs as strings, rather than objects (issue 1773)	2021-05-22 14:31:59 +02:00
Tim van der Meij	de680d7777	Merge pull request #13381 from Snuffleupagus/buildFontPaths-ignoreErrors Handle errors gracefully, in PartialEvaluator.buildFontPaths, when glyph path building fails	2021-05-22 13:06:31 +02:00
Jonas Jenwald	53a70244d0	Use the `stringToBytes` helper function in more places Rather than manually reimplementing, more-or-less, this functionality in a few spots we can simply use the existing helper function instead.	2021-05-22 12:23:09 +02:00
Jonas Jenwald	ba13bd8c2d	[XFA] Send `URL`s as strings, rather than objects (issue 1773) Given that `URL`s aren't supported by the structured clone algorithm, see https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm, the document in issue 1773 will cause the browser to throw `DataCloneError: The object could not be cloned.`-errors and nothing will render. To fix this, we'll instead simply send the stringified version of the `URL` to prevent these errors from occurring.	2021-05-22 11:58:53 +02:00
Jonas Jenwald	c4429bc3f2	Do the `isType3Font`-check once, rather than repeating it, in `PartialEvaluator.translateFont` This is a small piece of clean-up that I happened to notice while browsing the code.	2021-05-22 11:46:37 +02:00
Jonas Jenwald	68350378c0	Handle errors gracefully, in `PartialEvaluator.buildFontPaths`, when glyph path building fails The building of glyph paths, in the `FontRendererFactory`, can fail in various ways for corrupt font data. However, we're currently not attempting to handle any such errors in the evaluator, which means that a single broken glyph can prevent an entire page from rendering. To address this we simply have to pass along, and check, the existing `ignoreErrors` option in `PartialEvaluator.buildFontPaths` similar to the rest of the `PartialEvaluator` code.	2021-05-22 11:46:31 +02:00
Tim van der Meij	b2ffebe978	Merge pull request #13416 from calixteman/xfa_config XFA - Fix wrong function name	2021-05-21 20:33:35 +02:00
Calixte Denizet	8a8879aed2	XFA - Fix wrong function name	2021-05-21 20:25:26 +02:00
Tim van der Meij	d1d9b9043d	Merge pull request #13415 from Snuffleupagus/getDestination-out-of-order Improve handling of named destinations in out-of-order NameTrees (PR 10274 follow-up)	2021-05-21 20:15:09 +02:00
Jonas Jenwald	8d5689387b	Improve handling of named destinations in out-of-order NameTrees (PR 10274 follow-up) According to the specification, see https://web.archive.org/web/20210404042322if_/https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.2384179, the keys of a NameTree/NumberTree should be ordered. For corrupt PDF files, which violate this assumption, it's thus possible that trying to lookup a single entry fails. Previously, in PR 10274, we implemented a fallback that only applies to the "bottom" node of a NameTree/NumberTree, which in general might not actually help for sufficiently corrupt NameTree/NumberTree data. Instead we remove the current limited fallback from `NameOrNumberTree.get`, and defer to the call-site to handle this case explicitly e.g. by using `NameOrNumberTree.getAll` for data where that makes sense. For well-formed documents, these changes should not lead to any additional data fetching/parsing. Finally, as part of these changes, the validation of named destination data is improved in the `Catalog` and a new unit-test is also added.	2021-05-21 15:48:37 +02:00
Jonas Jenwald	1a8d05fdcf	Remove some, with Prettier `2.3.0`, unnecessary `// prettier-ignore` comments To get the maximum benefit from something like Prettier, you obviously don't want to disable the automatic formatting unless absolutely necessary. When we added Prettier there were a number of cases, mostly involving larger Arrays, which required disabling of the automatic formatting for overall readability and/or to not break inline comments. With changes in Prettier version `2.3.0`, see [the release notes](https://prettier.io/blog/2021/05/09/2.3.0.html#concise-formatting-of-number-only-arrays-10106httpsgithubcomprettierprettierpull10106-10160httpsgithubcomprettierprettierpull10160-by-thorn0httpsgithubcomthorn0), there's now better formatting support for Arrays containing only numbers. Hence we can now remove a number of `// prettier-ignore` comments, and thus get the benefit of automatic formatting in (slightly) more of the code-base.	2021-05-19 11:36:03 +02:00
calixteman	faf6b10939	Merge pull request #13394 from calixteman/xml_parser Handle PI with no value in xml parser	2021-05-18 11:14:48 +02:00
Calixte Denizet	4544ebf38a	Handle PI with no value in xml parser - an XML PI contains a target and optionally some content (see https://en.wikipedia.org/wiki/Processing_Instruction) - the parser expected to always have some content and so it could lead to wrong parsing.	2021-05-18 10:22:18 +02:00
Brendan Dahl	239d0097fa	Merge pull request #13390 from calixteman/opentype_and_xfa XFA - Don't move glyphs in private area with non-truetype fonts	2021-05-17 12:39:10 -07:00
Brendan Dahl	46c2eeb19a	Merge pull request #13389 from calixteman/width_in_cff Get any width (if one is present) in CFF parser	2021-05-17 09:13:45 -07:00
Brendan Dahl	17e9cfcd2a	Merge pull request #13328 from calixteman/js_display1 JS - Add support for display property	2021-05-17 08:47:13 -07:00
Calixte Denizet	a74d19262a	XFA - Don't move glyphs in private area with non-truetype fonts - it has been done in PR #13146 but only for truetype fonts.	2021-05-17 16:52:39 +02:00
Calixte Denizet	d394188835	Get any width (if one is present) in CFF parser - in charstring specs at page 21 (section 4.2): "Also, it may appear in the charstring as the difference from nominalWidthX" so the number we've on the stack doesn't have to be positive. - currently this bug has probably no visible effect - but when the font is loaded to be used with XFA, then the rendering is incorrect.	2021-05-17 14:17:08 +02:00
Jonas Jenwald	718f7bf7e1	Fix a few safe ESLint `no-var` failures in `src/core/evaluator.js` (13371 follow-up) As can be seen in PR 13371, some of the `no-var` changes in the `PartialEvaluator.{getOperatorList, getTextContent}` methods caused errors in `gulp server`-mode. However, there's a handful of instances of `var` in other methods which should be completely safe to convert since there's no strange scope-issues present in that code.	2021-05-16 15:22:43 +02:00
Tim van der Meij	a5c74f53c1	Merge pull request #13386 from timvandermeij/src-core-bidi-no-var Enable the `no-var` linting rule in `src/core/bidi.js`	2021-05-16 15:02:18 +02:00
Tim van der Meij	b8a5e797c5	Enable the `no-var` linting rule in `src/core/bidi.js` This is done automatically with `gulp lint --fix` and the following manual changes: ```diff diff --git a/src/core/bidi.js b/src/core/bidi.js index e9e0a7217..32691c0c6 100644 --- a/src/core/bidi.js +++ b/src/core/bidi.js @@ -82,7 +82,8 @@ function isEven(i) { } function findUnequal(arr, start, value) { - for (var j = start, jj = arr.length; j < jj; ++j) { + let j, jj; + for (j = start, jj = arr.length; j < jj; ++j) { if (arr[j] !== value) { return j; } @@ -251,15 +252,14 @@ function bidi(str, startLevel, vertical) { for (i = 0; i < strLength; ++i) { if (types[i] === "EN") { // do before - var j; - for (j = i - 1; j >= 0; --j) { + for (let j = i - 1; j >= 0; --j) { if (types[j] !== "ET") { break; } types[j] = "EN"; } // do after - for (j = i + 1; j < strLength; ++j) { + for (let j = i + 1; j < strLength; ++j) { if (types[j] !== "ET") { break; } ```	2021-05-16 14:14:26 +02:00
Jonas Jenwald	3cfa316d40	Convert `src/core/operator_list.js` to use standard classes With modern JavaScript modules, where only explicitly exported properties are visible to the outside, the `QueueOptimizerClosure` should no longer be necessary. Furthermore, to reduce the possibility of `NullOptimizer` and `QueueOptimizer` getting out of sync (note e.g. the inconsistency fixed in PR 10784), we now let the latter extend the former one.	2021-05-16 13:39:54 +02:00
Jonas Jenwald	8943bcd3c3	Account for formatting changes in Prettier version `2.3.0` With the exception of one tweaked `eslint-disable` comment, in `web/generic_scripting.js`, this patch was generated automatically using `gulp lint --fix`. Please find additional information at: - https://github.com/prettier/prettier/releases/tag/2.3.0 - https://prettier.io/blog/2021/05/09/2.3.0.html	2021-05-16 11:44:05 +02:00
Tim van der Meij	d2e7161f2c	Merge pull request #13377 from Snuffleupagus/pattern-class Re-factor and convert the code in `src/core/pattern.js` to use standard classes	2021-05-14 22:23:44 +02:00
Jonas Jenwald	ebe3ee4f25	Modernize the `Shadings` structure, in `src/core/pattern.js`, to use standard classes This patch replaces the old structure with an abstract base-class, which the new RadialAxial/Mesh-shading classes then inherit from.[1] The old `MeshClosure` can now be removed, since it's not necessary any more, and most of the functions inside of it are now instead methods on the new `MeshShading` class. This is particularly nice, in my opinion, since we previously were manually passing around a reference to the current `Mesh`-instance. --- [1] If we want/need to, in the future, split e.g. the Mesh-handling into multiple classes that should now be easy to do.	2021-05-14 21:44:41 +02:00
Jonas Jenwald	6acb2db4be	Convert `src/core/pattern.js` to use standard classes Note that this patch only covers `Pattern` and `MeshStreamReader`, since the `Shadings`-implementation required additional re-factoring.	2021-05-14 21:42:21 +02:00
Calixte Denizet	f92e1fa160	Replace terminal null char by an endchar command in CFF charstrings to make OTS happy	2021-05-14 18:34:51 +02:00
Jonas Jenwald	612b43852b	Remove unused properties from the `Shadings`-implementations in `src/core/pattern.js` Neither the `type` nor the `cs` properties are used outside of the "constructors", and we can thus remove them.[1] Note that a lot of this code is very old, and that it actually predates the main/worker-thread split before which the same file was used on both the main- and worker-threads. --- [1] On the main-thread, a similar `type` property was removed in PR 12591.	2021-05-14 16:11:48 +02:00
Calixte Denizet	1a2cea21a5	Replace command with not enough args by an endchar in CFF font - Right now, a glyph with an erroneous outline is replaced by an empty glyph; if the error is far enough from the start there's likely something to render, so the idea is to replace a command with args by an endchar when no args are on the stack: this way OTS is likely happy (no remaining args on stack) and we can draw something which is likely better than nothing.	2021-05-14 13:45:45 +02:00
Jonas Jenwald	4248f0745c	Improve the `Page.content` and `Page.getContentStream` methods First of all, by using `Dict.getArray` in the `Page.content` getter we remove the need to manually iterate through and fetch the sub-streams (when they exist) in the `Page.getContentStream` method. Secondly, we can simplify the code in `Page.{getOperatorList, extractTextContent}` by letting `Page.getContentStream` ensure that `content` is available and returning a Promise instead.	2021-05-14 11:47:34 +02:00
Jonas Jenwald	70113131de	Inline the data lookup in the `Dict.getArray` method Similar to the `get`/`getAsync` methods, this should be a tiny bit more efficient which cannot hurt considering that `getArray` is now used a lot more than when initially added.	2021-05-14 11:24:27 +02:00
Jonas Jenwald	75208d36c2	Revert "Fix the remaining `no-var` failures, which couldn't be handled automatically, in the `src/core/evaluator.js` file" (PR 13344 follow-up) This reverts commit 0ef9b5aafc88094f19fec793c174c622e7e15542, since it causes a lot of warnings (see below) locally with e.g. the document from issue 9627. Strangely enough, this only occurs with `gulp server`-mode and the actual builds are apparently fine. It seems that this may be some unfortunate interaction with the old Babel-plugin that's used together with SystemJS. ``` Warning: getTextContent - ignoring ExtGState: "FormatError: ExtGState should be a dictionary.". ``` Rather than taking the risk that this could actually cover a more serious bug, and since I cannot immediately figure out what's wrong, it thus seems safest to revert this for now and we can (carefully) revisit this once SystemJS has been removed (see PR 12563).	2021-05-13 11:19:46 +02:00
Tim van der Meij	ba99e54c66	Merge pull request #13361 from brendandahl/patterns-fixes Fix several issues with radial/axial shadings and tiling patterns.	2021-05-12 20:27:37 +02:00
Jonas Jenwald	757636d519	Convert the remaining functions in `src/core/primitives.js` to use standard classes This patch was tested using the PDF file from issue 2618, i.e. https://bug570667.bugzilla-attachments.gnome.org/attachment.cgi?id=226471, with the following manifest file: ``` [ { "id": "issue2618", "file": "../web/pdfs/issue2618.pdf", "md5": "", "rounds": 50, "type": "eq" } ] ``` which gave the following results when comparing this patch against the `master` branch: ``` -- Grouped By browser, stat -- browser \| stat \| Count \| Baseline(ms) \| Current(ms) \| +/- \| % \| Result(P<.05) ------- \| ------------ \| ----- \| ------------ \| ----------- \| --- \| ---- \| ------------- firefox \| Overall \| 50 \| 3417 \| 3426 \| 9 \| 0.27 \| firefox \| Page Request \| 50 \| 1 \| 1 \| 0 \| 5.41 \| firefox \| Rendering \| 50 \| 3416 \| 3426 \| 9 \| 0.27 \| ``` Based on these results, there's no significant performance regression from using standard classes and this patch should thus be OK.	2021-05-12 09:36:28 +02:00
Brendan Dahl	ac44afa70e	Fix several issues with radial/axial shadings and tiling patterns. Previously, we set the base transformation and pattern matrix directly to the main rendering ctx of the page, however doing this caused the current transform to be lost. This would cause issues with things like shear missing so the pattern was misaligned or when stroke was used the scale of the line width or dash would be wrong. Instead we should leave the current transform and use setTransform on the pattern so it is applied correctly. For axial and radial shadings I had to create a temporary canvas to draw the shading so I could in turn use setTransform. Fixes: #13325, #6769, #7847, #11018, #11597, #11473 The following already in the corpus are improved: issue8078-page1 issue1877-page1	2021-05-11 16:32:24 -07:00
Jonas Jenwald	6eef69de22	Export the "raw" `toUnicode`-data from `PartialEvaluator.preEvaluateFont` Compared to other data-structures, such as e.g. `Dict`s, we're purposely not caching Streams on the `XRef`-instance.[1] The, somewhat unfortunate, effect of Streams not being cached is that repeatedly getting the same Stream-data requires re-parsing/re-initializing of a bunch of data; see `XRef.fetch` and related methods. For the font-parsing in particular we're currently fetching the `toUnicode`-data, which is very often a Stream, in `PartialEvaluator.preEvaluateFont` and then again in `PartialEvaluator.extractDataStructures` soon afterwards. By instead letting `PartialEvaluator.preEvaluateFont` export the "raw" `toUnicode`-data, we can avoid some unnecessary re-parsing/re-initializing when handling fonts. Please note: In this particular case, given that `PartialEvaluator.preEvaluateFont` only accesses the "raw" `toUnicode` data, exporting a Stream should be safe. --- [1] The reasons for this include: - Streams, especially `DecodeStream`-instances, can become very large once read. Hence caching them really isn't a good idea simply because of the (potential) memory impact of doing so. - Attempting to read from the same Stream-instance more than once won't work, unless it's `reset` in between, since using any method such as e.g. `getBytes` always starts at the current data position. - Given that parsing, even in the worker-thread, is now fairly asynchronous it's generally impossible to assert that any one Stream-instance isn't being accessed "concurrently" by e.g. different `getOperatorList` calls. Hence `reset`-ing a cached Stream-instance isn't going to work in the general case.	2021-05-08 12:04:13 +02:00
Jonas Jenwald	13fb1654dc	Export the `firstChar`/`lastChar`-data from `PartialEvaluator.preEvaluateFont` Rather than re-fetching/re-parsing these properties immediately in `PartialEvaluator.translateFont`, we can simply export them instead. (Obviously the effect will be really tiny, but there is less parsing overall this way.)	2021-05-08 12:02:49 +02:00
Jonas Jenwald	8a1cb82aee	Ensure that the `Widths` array is parsed correctly in `PartialEvaluator.preEvaluateFont` Please note: While I don't have a document that this patch fixes, the current code is however not entirely correct as far as I can tell. Looking at how the `Widths` array is parsed in `PartialEvaluator.extractWidths`, it's clear that the implementation in `PartialEvaluator.preEvaluateFont` is a bit too simplistic. In particular, by only wrapping the data into a TypedArray, there's no attempt to handle indirect objects which could potentially lead to colliding `hash`es being computed.	2021-05-07 21:23:44 +02:00
Jonas Jenwald	30b2739adf	Ensure that composite/non-composite fonts won't get the same `hash` in `PartialEvaluator.preEvaluateFont` To hopefully help prevent any future bugs, make sure that composite/non-composite fonts cannot accidentally get matching `hash`es. Given the differences between those font types, that's very unlikely to be useful or even correct in general.	2021-05-07 21:22:37 +02:00
Jonas Jenwald	fc59a5f709	Take the `W` array into account when computing the hash, in `PartialEvaluator.preEvaluateFont`, for composite fonts (issue 13343) Without this some composite fonts may incorrectly end up with matching `hash`es, thus breaking rendering since we'll not actually try to load/parse some of the fonts. Please note: Given that the document, in the referenced issue, doesn't embed any of its fonts there's no guarantee that it renders correctly in all configurations even with this patch.	2021-05-07 21:22:36 +02:00
Tim van der Meij	a3632c0f38	Merge pull request #13344 from Snuffleupagus/evaluator-no-var Enable the `no-var` rule in the `src/core/evaluator.js` file	2021-05-07 21:02:46 +02:00
Tim van der Meij	5248d0a77d	Merge pull request #13338 from Snuffleupagus/images-class Convert the `src/core/{jbig2, jpg, jpx}.js` files to use standard classes	2021-05-07 20:59:58 +02:00
Calixte Denizet	af125cd299	JS - Add support for display property - in annotation_layer, move common properties treatment in a common method instead having duplicated code in each widget.	2021-05-06 11:15:38 +02:00
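
The `unicorn/no-array-push-push` commit (ec3bcadf56) above points out that `Array.prototype.push()` accepts multiple arguments. A minimal sketch of the pattern the rule enforces; the array names and values are hypothetical and are not taken from the pdf.js code:

```javascript
// Hypothetical arrays, for illustration only (not from pdf.js).

// Flagged by the rule: several consecutive push() calls on the same array.
const before = [];
before.push("save");
before.push("transform");
before.push("restore");

// Preferred: a single push() call with multiple arguments.
const after = [];
after.push("save", "transform", "restore");
```

Both arrays end up with the same three elements in the same order; only the number of calls differs.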


2466 Commits
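
The "[XFA] Send `URL`s as strings" commit (ba13bd8c2d) above relies on the fact that `URL` objects are not supported by the structured clone algorithm, whereas plain strings are. A rough sketch of that idea, assuming a hypothetical `serializeUrl` helper and a made-up payload shape; neither is the actual pdf.js implementation:

```javascript
// Hypothetical helper, not part of pdf.js: replace a URL instance with its
// string form so the surrounding data only contains cloneable values.
function serializeUrl(value) {
  return value instanceof URL ? value.href : value;
}

// Made-up example data; the real XFA data structures differ.
const link = new URL("https://example.com/doc.pdf#page=2");
const payload = { name: "a", attributes: { href: serializeUrl(link) } };

// The receiving side can rebuild a URL object from the string if needed.
const restored = new URL(payload.attributes.href);
```

Sending the string keeps the message payload limited to types that can be cloned, and the receiver loses nothing since the `URL` can be reconstructed from it.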