Commit Graph

4592 Commits

Author SHA1 Message Date
Tim van der Meij
b2ffebe978
Merge pull request #13416 from calixteman/xfa_config
XFA - Fix wrong function name
2021-05-21 20:33:35 +02:00
Calixte Denizet
8a8879aed2 XFA - Fix wrong function name 2021-05-21 20:25:26 +02:00
Tim van der Meij
d1d9b9043d
Merge pull request #13415 from Snuffleupagus/getDestination-out-of-order
Improve handling of named destinations in out-of-order NameTrees (PR 10274 follow-up)
2021-05-21 20:15:09 +02:00
Jonas Jenwald
8d5689387b Improve handling of named destinations in out-of-order NameTrees (PR 10274 follow-up)
According to the specification, see https://web.archive.org/web/20210404042322if_/https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.2384179, the keys of a NameTree/NumberTree should be ordered.
For corrupt PDF files, which violate this assumption, it's thus possible that trying to lookup a single entry fails.

Previously, in PR 10274, we implemented a fallback that only applies to the "bottom" node of a NameTree/NumberTree, which in general might not actually help for sufficiently corrupt NameTree/NumberTree data.
Instead we remove the current *limited* fallback from `NameOrNumberTree.get`, and defer to the call-site to handle this case explicitly e.g. by using `NameOrNumberTree.getAll` for data where that makes sense. For well-formed documents, these changes should *not* lead to any additional data fetching/parsing.

Finally, as part of these changes, the validation of named destination data is improved in the `Catalog` and a new unit-test is also added.
2021-05-21 15:48:37 +02:00
Jonas Jenwald
1a8d05fdcf Remove some, with Prettier 2.3.0, unnecessary // prettier-ignore comments
To get the maximum benefit from something like Prettier, you obviously don't want to disable the automatic formatting unless absolutely necessary. When we added Prettier there were a number of cases, mostly involving larger Arrays, which required disabling of the automatic formatting for overall readability and/or to not break inline comments.

With changes in Prettier version `2.3.0`, see [the release notes](https://prettier.io/blog/2021/05/09/2.3.0.html#concise-formatting-of-number-only-arrays-10106httpsgithubcomprettierprettierpull10106-10160httpsgithubcomprettierprettierpull10160-by-thorn0httpsgithubcomthorn0), there's now better formatting support for Arrays containing only numbers. Hence we can now remove a number of `// prettier-ignore` comments, and thus get the benefit of automatic formatting in (slightly) more of the code-base.
2021-05-19 11:36:03 +02:00
calixteman
faf6b10939
Merge pull request #13394 from calixteman/xml_parser
Handle PI with no value in xml parser
2021-05-18 11:14:48 +02:00
Calixte Denizet
4544ebf38a Handle PI with no value in xml parser
- an XML PI contains a target and optionally some content (see https://en.wikipedia.org/wiki/Processing_Instruction)
  - the parser expected to always have some content and so it could lead to wrong parsing.
2021-05-18 10:22:18 +02:00
Brendan Dahl
239d0097fa
Merge pull request #13390 from calixteman/opentype_and_xfa
XFA - Don't move glyphes in private area with non-truetype fonts
2021-05-17 12:39:10 -07:00
Brendan Dahl
46c2eeb19a
Merge pull request #13389 from calixteman/width_in_cff
Get any width (if one is present) in CFF parser
2021-05-17 09:13:45 -07:00
Brendan Dahl
17e9cfcd2a
Merge pull request #13328 from calixteman/js_display1
JS - Add support for display property
2021-05-17 08:47:13 -07:00
Calixte Denizet
a74d19262a XFA - Don't move glyphes in private area with non-truetype fonts
- it has been done in PR #13146 but only for truetype fonts.
2021-05-17 16:52:39 +02:00
Calixte Denizet
d394188835 Get any width (if one is present) in CFF parser
- in charstring specs at page 21 (section 4.2): "Also, it may appear in the charstring as the difference from nominalWidthX" so the number we've on the stack doesn't have to be positive.
  - currently this bug has probably no visible effect
  - but when the font is loaded to be used with XFA, then the rendering is incorrect.
2021-05-17 14:17:08 +02:00
Jonas Jenwald
718f7bf7e1 Fix a few *safe* ESLint no-var failures in src/core/evaluator.js (13371 follow-up)
As can be seen in PR 13371, some of the `no-var` changes in the `PartialEvaluator.{getOperatorList, getTextContent}` methods caused errors in `gulp server`-mode.
However, there's a handful of instances of `var` in other methods which should be completely *safe* to convert since there's no strange scope-issues present in that code.
2021-05-16 15:22:43 +02:00
Tim van der Meij
a5c74f53c1
Merge pull request #13386 from timvandermeij/src-core-bidi-no-var
Enable the `no-var` linting rule in `src/core/bidi.js`
2021-05-16 15:02:18 +02:00
Tim van der Meij
b8a5e797c5
Enable the no-var linting rule in src/core/bidi.js
This is done automatically with `gulp lint --fix` and the following
manual changes:

```diff
diff --git a/src/core/bidi.js b/src/core/bidi.js
index e9e0a7217..32691c0c6 100644
--- a/src/core/bidi.js
+++ b/src/core/bidi.js
@@ -82,7 +82,8 @@ function isEven(i) {
 }

 function findUnequal(arr, start, value) {
-  for (var j = start, jj = arr.length; j < jj; ++j) {
+  let j, jj;
+  for (j = start, jj = arr.length; j < jj; ++j) {
     if (arr[j] !== value) {
       return j;
     }
@@ -251,15 +252,14 @@ function bidi(str, startLevel, vertical) {
   for (i = 0; i < strLength; ++i) {
     if (types[i] === "EN") {
       // do before
-      var j;
-      for (j = i - 1; j >= 0; --j) {
+      for (let j = i - 1; j >= 0; --j) {
         if (types[j] !== "ET") {
           break;
         }
         types[j] = "EN";
       }
       // do after
-      for (j = i + 1; j < strLength; ++j) {
+      for (let j = i + 1; j < strLength; ++j) {
         if (types[j] !== "ET") {
           break;
         }
```
2021-05-16 14:14:26 +02:00
Jonas Jenwald
3cfa316d40 Convert src/core/operator_list.js to use standard classes
With modern JavaScript modules, where only *explicitly* exported properties are visible to the outside, the `QueueOptimizerClosure` should no longer be necessary.

Furthermore, to reduce the possibility of `NullOptimizer` and `QueueOptimizer` getting out of sync (note e.g. the inconsistency fixed in PR 10784), we now let the latter extend the former one.
2021-05-16 13:39:54 +02:00
Tim van der Meij
8a8a67de3b
Merge pull request #13380 from Snuffleupagus/pattern_helper-class
Re-factor and convert the code in `src/display/pattern_helper.js` to use standard classes
2021-05-16 13:11:04 +02:00
Jonas Jenwald
8943bcd3c3 Account for formatting changes in Prettier version 2.3.0
With the exception of one tweaked `eslint-disable` comment, in `web/generic_scripting.js`, this patch was generated automatically using `gulp lint --fix`.

Please find additional information at:
 - https://github.com/prettier/prettier/releases/tag/2.3.0
 - https://prettier.io/blog/2021/05/09/2.3.0.html
2021-05-16 11:44:05 +02:00
Jonas Jenwald
a984431046 Modernize the ShadingIRs structure, in src/display/pattern_helper.js, to use standard classes
This patch replaces the old structure with an abstract base-class, which the new ShadingPattern classes then inherit from.
The old `createMeshCanvasClosure` can now be removed, since it's not necessary any more with modern JavaScript, and the `createMeshCanvas` function is now instead a method on the new `MeshShadingPattern` class (avoids unnecessary parameter passing).
2021-05-15 16:00:00 +02:00
Jonas Jenwald
40939d5955 Convert src/display/pattern_helper.js to use standard classes
Note that this patch only covers `TilingPattern`, since the `ShadingIRs`-implementation required additional re-factoring.
2021-05-15 13:03:07 +02:00
Jonas Jenwald
bb8e15c971 [api-minor] Update minimum supported browser versions (PR 13361 follow-up)
With the changes in PR 13361, we're now using the `CanvasPattern.setTransform()` method when rendering certain Shadings/Patterns.
Note that while `CanvasPattern` itself has been supported since basically "forever", its `setTransform` method is a slightly newer addition to the specification; please refer to https://developer.mozilla.org/en-US/docs/Web/API/CanvasPattern#browser_compatibility

Rather than trying to re-write PR 13361 to not use, or possibly spending time/effort (if possible) polyfilling, `CanvasPattern.setTransform()` this patch thus suggests that we simply update the *minimum* supported browser versions instead.

According to the compatibility data linked above, the *minimum* supported browser versions in the PDF.js library are now as follows:
 - Chrome >= 68, which was released on 2018-07-24.[1]
 - Firefox ESR, see https://wiki.mozilla.org/Release_Management/Calendar.
 - Safari >= 11.1, which was release on 2018-03-29.[2]

(Given that the PDF.js contributors cannot realistically test a bunch of old browsers, it's not unimaginable that some older browser versions are already not working with the PDF.js library.)

Based on these changes, which we should ensure are reflected in the Wiki as well, we can also remove a number of now redundant polyfills. Furthermore we'll no longer "claim" to support Windows XP, note the `gulpfile.js` changes, which should definitely *not* be an issue given that it's no longer officially supported.[3]

---
[1] According to https://en.wikipedia.org/wiki/Google_Chrome_version_history

[2] According to https://en.wikipedia.org/wiki/Safari_version_history#Safari_11

[3] According to https://en.wikipedia.org/wiki/Windows_XP#End_of_support
2021-05-15 09:57:34 +02:00
Tim van der Meij
d2e7161f2c
Merge pull request #13377 from Snuffleupagus/pattern-class
Re-factor and convert the code in `src/core/pattern.js` to use standard classes
2021-05-14 22:23:44 +02:00
Jonas Jenwald
ebe3ee4f25 Modernize the Shadings structure, in src/core/pattern.js, to use standard classes
This patch replaces the old structure with a abstract base-class, which the new RadialAxial/Mesh-shading classes then inherit from.[1]
The old `MeshClosure` can now be removed, since it's not necessary any more, and most of the functions inside of it are now instead methods on the new `MeshShading` class. This is particularly nice, in my opinion, since we previously were *manually* passing around a reference to the current `Mesh`-instance.

---
[1] If we want/need to, in the future, split e.g. the Mesh-handling into multiple classes that should now be easy to do.
2021-05-14 21:44:41 +02:00
Jonas Jenwald
6acb2db4be Convert src/core/pattern.js to use standard classes
Note that this patch only covers `Pattern` and `MeshStreamReader`, since the `Shadings`-implementation required additional re-factoring.
2021-05-14 21:42:21 +02:00
Jonas Jenwald
612b43852b Remove unused properties from the Shadings-implementations in src/core/pattern.js
Neither the `type` or the `cs` properties are used outside of the "constructors", and we can thus remove them.[1]
Note that a lot of this code is very old, and that it actually predates the main/worker-thread split before which the *same* file was used on both the main- *and* worker-threads.

---
[1] On the main-thread, a similar `type` property was removed in PR 12591.
2021-05-14 16:11:48 +02:00
Jonas Jenwald
4248f0745c Improve the Page.content and Page.getContentStream methods
First of all, by using `Dict.getArray` in the `Page.content` getter we remove the need to manually iterate through and fetch the sub-streams (when they exist) in the `Page.getContentStream` method.
Secondly, we can simplify the code in `Page.{getOperatorList, extractTextContent}` by letting `Page.getContentStream` ensure that `content` is available and returning a Promise instead.
2021-05-14 11:47:34 +02:00
Jonas Jenwald
70113131de Inline the data lookup in the Dict.getArray method
Similar to the `get`/`getAsync` methods, this should be a *tiny* bit more efficient which cannot hurt considering that `getArray` is now used a lot more than when initially added.
2021-05-14 11:24:27 +02:00
Tim van der Meij
e394da5861
Merge pull request #13369 from brendandahl/smask-pattern
Fix tiling pattern with smask.
2021-05-13 13:26:38 +02:00
Jonas Jenwald
75208d36c2 Revert "Fix the remaining no-var failures, which couldn't be handled automatically, in the src/core/evaluator.js file" (PR 13344 follow-up)
This reverts commit 0ef9b5aafc, since it cases a lot of warnings (see below) *locally* with e.g. the document from issue 9627.
Strangely enough, this only occurs with `gulp server`-mode and the actual builds are apparently fine. It seems that this *may* be some unfortunate interaction with the old Babel-plugin that's used together with SystemJS.

```
Warning: getTextContent - ignoring ExtGState: "FormatError: ExtGState should be a dictionary.".
```

Rather than taking the risk that this could actually cover a more serious bug, and since I cannot immediately figure out what's wrong, it thus seem safest to revert this for now and we can (carefully) revisit this once SystemJS has been removed (see PR 12563).
2021-05-13 11:19:46 +02:00
Brendan Dahl
53991d0924 Fix tiling pattern with smask.
After drawing a tiling pattern we were not calling
endDrawing, which handles compositing any
active smasks.

Fixes #8565.
2021-05-12 11:42:08 -07:00
Tim van der Meij
ba99e54c66
Merge pull request #13361 from brendandahl/patterns-fixes
Fix several issues with radial/axial shadings and tiling patterns.
2021-05-12 20:27:37 +02:00
Tim van der Meij
1cf9f42ca2
Merge pull request #13366 from Snuffleupagus/primitives-class
Convert the remaining functions in `src/core/primitives.js` to use standard classes
2021-05-12 20:20:35 +02:00
Tim van der Meij
0a3e483c7f
Merge pull request #13360 from Snuffleupagus/renderer-conditional-pref
Only include the `renderer`-preference in builds where `SVGGraphics` is defined
2021-05-12 20:16:53 +02:00
Jonas Jenwald
64c55d381d Fix the Jbig2Image export for the gulp image_decoders build (PR 9729 follow-up, issue 13367) 2021-05-12 19:41:29 +02:00
Jonas Jenwald
757636d519 Convert the remaining functions in src/core/primitives.js to use standard classes
This patch was tested using the PDF file from issue 2618, i.e. https://bug570667.bugzilla-attachments.gnome.org/attachment.cgi?id=226471, with the following manifest file:
```
[
    {  "id": "issue2618",
       "file": "../web/pdfs/issue2618.pdf",
       "md5": "",
       "rounds": 50,
       "type": "eq"
    }
]
```

which gave the following results when comparing this patch against the `master` branch:
```
-- Grouped By browser, stat --
browser | stat         | Count | Baseline(ms) | Current(ms) | +/- |   %  | Result(P<.05)
------- | ------------ | ----- | ------------ | ----------- | --- | ---- | -------------
firefox | Overall      |    50 |         3417 |        3426 |   9 | 0.27 |
firefox | Page Request |    50 |            1 |           1 |   0 | 5.41 |
firefox | Rendering    |    50 |         3416 |        3426 |   9 | 0.27 |
```

Based on these results, there's no significant performance regression from using standard classes and this patch should thus be OK.
2021-05-12 09:36:28 +02:00
Brendan Dahl
ac44afa70e Fix several issues with radial/axial shadings and tiling patterns.
Previously, we set the base transformation and pattern matrix
directly to the main rendering ctx of the page, however doing this
caused the current transform to be lost. This would cause issues
with things like shear missing so the pattern was misaligned or when
stroke was used the scale of the line width or dash would be wrong.
Instead we should leave the current transform and use setTransfrom
on the pattern so it is applied correctly. For axial and radial shadings I had
to create a temporary canvas to draw the shading so I could in turn
use setTransform.

Fixes: #13325, #6769, #7847, #11018, #11597, #11473

The following already in the corpus are improved:
issue8078-page1
issue1877-page1
2021-05-11 16:32:24 -07:00
Jonas Jenwald
b068882bd0 Clean-up usage of the TESTING-define in src/pdf.sandbox.js
This patch moves the `PDFJSDev`-checks *inline*, similar to the rest of the code-base, such that the code in question is actually being removed from the *built* files in e.g. the official releases.
2021-05-11 12:39:33 +02:00
Jonas Jenwald
7548dc5ea2 Only include the renderer-preference in builds where SVGGraphics is defined
After PR 13117 it's now (finally) possible for *different* build targets to specify individual options/preferences, and we can utilize that to only expose the `renderer`-preference in builds where `SVGGraphics` is actually defined.
Note that for e.g. `MOZCENTRAL`-builds, trying to enable SVG-rendering will throw immediately and the preference thus doesn't make sense to include there.

Also, update the dummy `SVGGraphics` to use a class, tweak the `PDFJSDev`-check in `src/display/svg.js` to agree fully with the option/preference, and remove an unnecessary `eslint-disable`.
2021-05-10 12:03:53 +02:00
Jonas Jenwald
2ba4b65ca8 [api-minor] Remove the WebGL implementation
Reasons for the removal include:
 - This functionality was always somewhat experimental and has never been enabled by default, partly because of worries about rendering bugs caused by e.g. bad/outdated graphics drivers.

 - After the initial implementation, in PR 4286 (back in 2014), no additional functionality has been added to the WebGL implementation.

 - The vast majority of all documents do not benefit from WebGL rendering, since only a couple of *specific* features are supported (e.g. some Soft Masks and Patterns).

 - There is, and has always been, *zero* test-coverage for the WebGL implementation.

 - Overall performance, in the PDF.js library, has improved since the experimental WebGL implementation was added.

Rather than shipping unused *and* untested code, it seems reasonable to simply remove the WebGL implementation for now; thanks to version control it's always possible to bring back the code should the need ever arise.
2021-05-09 16:38:44 +02:00
Jonas Jenwald
6eef69de22 Export the "raw" toUnicode-data from PartialEvaluator.preEvaluateFont
Compared to other data-structures, such as e.g. `Dict`s, we're purposely *not* caching Streams on the `XRef`-instance.[1]
The, somewhat unfortunate, effect of Streams not being cached is that repeatedly getting the *same* Stream-data requires re-parsing/re-initializing of a bunch of data; see `XRef.fetch` and related methods.

For the font-parsing in particular we're currently fetching the `toUnicode`-data, which is very often a Stream, in `PartialEvaluator.preEvaluateFont` and then *again* in `PartialEvaluator.extractDataStructures` soon afterwards.
By instead letting `PartialEvaluator.preEvaluateFont` export the "raw" `toUnicode`-data, we can avoid *some* unnecessary re-parsing/re-initializing when handling fonts.
*Please note:* In this particular case, given that `PartialEvaluator.preEvaluateFont` only accesses the "raw" `toUnicode` data, exporting a Stream should be safe.

---
[1] The reasons for this include:
 - Streams, especially `DecodeStream`-instances, can become *very* large once read. Hence caching them really isn't a good idea simply because of the (potential) memory impact of doing so.

 - Attempting to read from the *same* Stream-instance more than once won't work, unless it's `reset` in between, since using any method such as e.g. `getBytes` always starts at the current data position.

 - Given that parsing, even in the worker-thread, is now fairly asynchronous it's generally impossible to assert that any one Stream-instance isn't being accessed "concurrently" by e.g. different `getOperatorList` calls. Hence `reset`-ing a cached Stream-instance isn't going to work in the general case.
2021-05-08 12:04:13 +02:00
Jonas Jenwald
13fb1654dc Export the firstChar/lastChar-data from PartialEvaluator.preEvaluateFont
Rather than re-fetching/re-parsing these properties immediately in `PartialEvaluator.translateFont`, we can simply export them instead. (Obviously the effect will be really tiny, but there is less parsing overall this way.)
2021-05-08 12:02:49 +02:00
Jonas Jenwald
8a1cb82aee Ensure that the Widths array is parsed correctly in PartialEvaluator.preEvaluateFont
*Please note:* While I don't have a document that this patches fixes, the current code is however not entirely correct as far as I can tell.

Looking at how the `Widths` array is parsed in `PartialEvaluator.extractWidths`, it's clear that the implementation in `PartialEvaluator.preEvaluateFont` is a bit too simplistic. In particular, by only wrapping the data into a TypedArray, there's no attempt to handle *indirect* objects which could potentially lead to colliding `hash`es being computed.
2021-05-07 21:23:44 +02:00
Jonas Jenwald
30b2739adf Ensure that composite/non-composite fonts won't get the same hash in PartialEvaluator.preEvaluateFont
To hopefully help prevent any future bugs, make sure that composite/non-composite fonts cannot accidentally get matching `hash`es. Given the differences between those font types, that's very unlikely to be useful or even correct in general.
2021-05-07 21:22:37 +02:00
Jonas Jenwald
fc59a5f709 Take the W array into account when computing the hash, in PartialEvaluator.preEvaluateFont, for composite fonts (issue 13343)
Without this some *composite* fonts may incorrectly end up with matching `hash`es, thus breaking rendering since we'll not actually try to load/parse some of the fonts.

*Please note:* Given that the document, in the referenced issue, doesn't embed *any* of its fonts there's no guarantee that it renders correctly in all configurations even with this patch.
2021-05-07 21:22:36 +02:00
Tim van der Meij
a3632c0f38
Merge pull request #13344 from Snuffleupagus/evaluator-no-var
Enable the `no-var` rule in the `src/core/evaluator.js` file
2021-05-07 21:02:46 +02:00
Tim van der Meij
5248d0a77d
Merge pull request #13338 from Snuffleupagus/images-class
Convert the `src/core/{jbig2, jpg, jpx}.js` files to use standard classes
2021-05-07 20:59:58 +02:00
Calixte Denizet
af125cd299 JS - Add support for display property
- in annotation_layer, move common properties treatment in a common method instead having duplicated code in each widget.
2021-05-06 11:15:38 +02:00
Jonas Jenwald
0ef9b5aafc Fix the remaining no-var failures, which couldn't be handled automatically, in the src/core/evaluator.js file
The only *slight* complication here were some of the `switch`-cases, in `getOperatorList`/`getTextContent`, where the parsing is done asynchronously.
However, those cases are easy to deal with by wrapping the code within its own block; please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/switch#block-scope_variables_within_switch_statements
2021-05-06 10:21:05 +02:00
Jonas Jenwald
f93c3b9aa7 Enable the no-var rule in the src/core/evaluator.js file
These changes were made automatically, using `gulp lint --fix`.
2021-05-06 09:39:21 +02:00
Jonas Jenwald
0a32ad3e42 Remove unnecessary closure in the src/core/font_renderer.js file
With modern JavaScript modules, where you explicitly list the properties that should be exported, it's no longer necessary to wrap *all* of the code within one file into a top-level closure.[1]

This patch reduces the size, of even the *built* `pdf.worker.js` file, since there's now a lot less unnecessary whitespace.

---
[1] For files which contain *different* functionality, some closures may however still make sense in order to separate the code.
It might be possible to remove some of those cases later, e.g. once private class fields becomes generally available/usable in browsers.
2021-05-05 22:35:52 +02:00