Commit Graph

2240 Commits

Author SHA1 Message Date
Tim van der Meij
a5c74f53c1
Merge pull request #13386 from timvandermeij/src-core-bidi-no-var
Enable the `no-var` linting rule in `src/core/bidi.js`
2021-05-16 15:02:18 +02:00
Tim van der Meij
b8a5e797c5
Enable the no-var linting rule in src/core/bidi.js
This is done automatically with `gulp lint --fix` and the following
manual changes:

```diff
diff --git a/src/core/bidi.js b/src/core/bidi.js
index e9e0a7217..32691c0c6 100644
--- a/src/core/bidi.js
+++ b/src/core/bidi.js
@@ -82,7 +82,8 @@ function isEven(i) {
 }

 function findUnequal(arr, start, value) {
-  for (var j = start, jj = arr.length; j < jj; ++j) {
+  let j, jj;
+  for (j = start, jj = arr.length; j < jj; ++j) {
     if (arr[j] !== value) {
       return j;
     }
@@ -251,15 +252,14 @@ function bidi(str, startLevel, vertical) {
   for (i = 0; i < strLength; ++i) {
     if (types[i] === "EN") {
       // do before
-      var j;
-      for (j = i - 1; j >= 0; --j) {
+      for (let j = i - 1; j >= 0; --j) {
         if (types[j] !== "ET") {
           break;
         }
         types[j] = "EN";
       }
       // do after
-      for (j = i + 1; j < strLength; ++j) {
+      for (let j = i + 1; j < strLength; ++j) {
         if (types[j] !== "ET") {
           break;
         }
```
2021-05-16 14:14:26 +02:00
Jonas Jenwald
3cfa316d40 Convert src/core/operator_list.js to use standard classes
With modern JavaScript modules, where only *explicitly* exported properties are visible to the outside, the `QueueOptimizerClosure` should no longer be necessary.

Furthermore, to reduce the possibility of `NullOptimizer` and `QueueOptimizer` getting out of sync (note e.g. the inconsistency fixed in PR 10784), we now let the latter extend the former one.
2021-05-16 13:39:54 +02:00
Jonas Jenwald
8943bcd3c3 Account for formatting changes in Prettier version 2.3.0
With the exception of one tweaked `eslint-disable` comment, in `web/generic_scripting.js`, this patch was generated automatically using `gulp lint --fix`.

Please find additional information at:
 - https://github.com/prettier/prettier/releases/tag/2.3.0
 - https://prettier.io/blog/2021/05/09/2.3.0.html
2021-05-16 11:44:05 +02:00
Tim van der Meij
d2e7161f2c
Merge pull request #13377 from Snuffleupagus/pattern-class
Re-factor and convert the code in `src/core/pattern.js` to use standard classes
2021-05-14 22:23:44 +02:00
Jonas Jenwald
ebe3ee4f25 Modernize the Shadings structure, in src/core/pattern.js, to use standard classes
This patch replaces the old structure with a abstract base-class, which the new RadialAxial/Mesh-shading classes then inherit from.[1]
The old `MeshClosure` can now be removed, since it's not necessary any more, and most of the functions inside of it are now instead methods on the new `MeshShading` class. This is particularly nice, in my opinion, since we previously were *manually* passing around a reference to the current `Mesh`-instance.

---
[1] If we want/need to, in the future, split e.g. the Mesh-handling into multiple classes that should now be easy to do.
2021-05-14 21:44:41 +02:00
Jonas Jenwald
6acb2db4be Convert src/core/pattern.js to use standard classes
Note that this patch only covers `Pattern` and `MeshStreamReader`, since the `Shadings`-implementation required additional re-factoring.
2021-05-14 21:42:21 +02:00
Calixte Denizet
f92e1fa160 Replace terminal null char by a endchar command in CFF charstrings to make OTS happy 2021-05-14 18:34:51 +02:00
Jonas Jenwald
612b43852b Remove unused properties from the Shadings-implementations in src/core/pattern.js
Neither the `type` or the `cs` properties are used outside of the "constructors", and we can thus remove them.[1]
Note that a lot of this code is very old, and that it actually predates the main/worker-thread split before which the *same* file was used on both the main- *and* worker-threads.

---
[1] On the main-thread, a similar `type` property was removed in PR 12591.
2021-05-14 16:11:48 +02:00
Calixte Denizet
1a2cea21a5 Replace command with not enough args by an endchar in CFF font
- Right now, a glyph with an erroneous outline is replaced by an empty glyph
    if the error is far enough from the start there's likely something to render
    so the idea is to replace a command with args by an endchar when no args are
    on the stack: this way OTS is likely happy (no remaining args on stack) and we
    can draw something which is likely better than nothing.
2021-05-14 13:45:45 +02:00
Jonas Jenwald
4248f0745c Improve the Page.content and Page.getContentStream methods
First of all, by using `Dict.getArray` in the `Page.content` getter we remove the need to manually iterate through and fetch the sub-streams (when they exist) in the `Page.getContentStream` method.
Secondly, we can simplify the code in `Page.{getOperatorList, extractTextContent}` by letting `Page.getContentStream` ensure that `content` is available and returning a Promise instead.
2021-05-14 11:47:34 +02:00
Jonas Jenwald
70113131de Inline the data lookup in the Dict.getArray method
Similar to the `get`/`getAsync` methods, this should be a *tiny* bit more efficient which cannot hurt considering that `getArray` is now used a lot more than when initially added.
2021-05-14 11:24:27 +02:00
Jonas Jenwald
75208d36c2 Revert "Fix the remaining no-var failures, which couldn't be handled automatically, in the src/core/evaluator.js file" (PR 13344 follow-up)
This reverts commit 0ef9b5aafc, since it cases a lot of warnings (see below) *locally* with e.g. the document from issue 9627.
Strangely enough, this only occurs with `gulp server`-mode and the actual builds are apparently fine. It seems that this *may* be some unfortunate interaction with the old Babel-plugin that's used together with SystemJS.

```
Warning: getTextContent - ignoring ExtGState: "FormatError: ExtGState should be a dictionary.".
```

Rather than taking the risk that this could actually cover a more serious bug, and since I cannot immediately figure out what's wrong, it thus seem safest to revert this for now and we can (carefully) revisit this once SystemJS has been removed (see PR 12563).
2021-05-13 11:19:46 +02:00
Tim van der Meij
ba99e54c66
Merge pull request #13361 from brendandahl/patterns-fixes
Fix several issues with radial/axial shadings and tiling patterns.
2021-05-12 20:27:37 +02:00
Jonas Jenwald
757636d519 Convert the remaining functions in src/core/primitives.js to use standard classes
This patch was tested using the PDF file from issue 2618, i.e. https://bug570667.bugzilla-attachments.gnome.org/attachment.cgi?id=226471, with the following manifest file:
```
[
    {  "id": "issue2618",
       "file": "../web/pdfs/issue2618.pdf",
       "md5": "",
       "rounds": 50,
       "type": "eq"
    }
]
```

which gave the following results when comparing this patch against the `master` branch:
```
-- Grouped By browser, stat --
browser | stat         | Count | Baseline(ms) | Current(ms) | +/- |   %  | Result(P<.05)
------- | ------------ | ----- | ------------ | ----------- | --- | ---- | -------------
firefox | Overall      |    50 |         3417 |        3426 |   9 | 0.27 |
firefox | Page Request |    50 |            1 |           1 |   0 | 5.41 |
firefox | Rendering    |    50 |         3416 |        3426 |   9 | 0.27 |
```

Based on these results, there's no significant performance regression from using standard classes and this patch should thus be OK.
2021-05-12 09:36:28 +02:00
Brendan Dahl
ac44afa70e Fix several issues with radial/axial shadings and tiling patterns.
Previously, we set the base transformation and pattern matrix
directly to the main rendering ctx of the page, however doing this
caused the current transform to be lost. This would cause issues
with things like shear missing so the pattern was misaligned or when
stroke was used the scale of the line width or dash would be wrong.
Instead we should leave the current transform and use setTransfrom
on the pattern so it is applied correctly. For axial and radial shadings I had
to create a temporary canvas to draw the shading so I could in turn
use setTransform.

Fixes: #13325, #6769, #7847, #11018, #11597, #11473

The following already in the corpus are improved:
issue8078-page1
issue1877-page1
2021-05-11 16:32:24 -07:00
Jonas Jenwald
6eef69de22 Export the "raw" toUnicode-data from PartialEvaluator.preEvaluateFont
Compared to other data-structures, such as e.g. `Dict`s, we're purposely *not* caching Streams on the `XRef`-instance.[1]
The, somewhat unfortunate, effect of Streams not being cached is that repeatedly getting the *same* Stream-data requires re-parsing/re-initializing of a bunch of data; see `XRef.fetch` and related methods.

For the font-parsing in particular we're currently fetching the `toUnicode`-data, which is very often a Stream, in `PartialEvaluator.preEvaluateFont` and then *again* in `PartialEvaluator.extractDataStructures` soon afterwards.
By instead letting `PartialEvaluator.preEvaluateFont` export the "raw" `toUnicode`-data, we can avoid *some* unnecessary re-parsing/re-initializing when handling fonts.
*Please note:* In this particular case, given that `PartialEvaluator.preEvaluateFont` only accesses the "raw" `toUnicode` data, exporting a Stream should be safe.

---
[1] The reasons for this include:
 - Streams, especially `DecodeStream`-instances, can become *very* large once read. Hence caching them really isn't a good idea simply because of the (potential) memory impact of doing so.

 - Attempting to read from the *same* Stream-instance more than once won't work, unless it's `reset` in between, since using any method such as e.g. `getBytes` always starts at the current data position.

 - Given that parsing, even in the worker-thread, is now fairly asynchronous it's generally impossible to assert that any one Stream-instance isn't being accessed "concurrently" by e.g. different `getOperatorList` calls. Hence `reset`-ing a cached Stream-instance isn't going to work in the general case.
2021-05-08 12:04:13 +02:00
Jonas Jenwald
13fb1654dc Export the firstChar/lastChar-data from PartialEvaluator.preEvaluateFont
Rather than re-fetching/re-parsing these properties immediately in `PartialEvaluator.translateFont`, we can simply export them instead. (Obviously the effect will be really tiny, but there is less parsing overall this way.)
2021-05-08 12:02:49 +02:00
Jonas Jenwald
8a1cb82aee Ensure that the Widths array is parsed correctly in PartialEvaluator.preEvaluateFont
*Please note:* While I don't have a document that this patches fixes, the current code is however not entirely correct as far as I can tell.

Looking at how the `Widths` array is parsed in `PartialEvaluator.extractWidths`, it's clear that the implementation in `PartialEvaluator.preEvaluateFont` is a bit too simplistic. In particular, by only wrapping the data into a TypedArray, there's no attempt to handle *indirect* objects which could potentially lead to colliding `hash`es being computed.
2021-05-07 21:23:44 +02:00
Jonas Jenwald
30b2739adf Ensure that composite/non-composite fonts won't get the same hash in PartialEvaluator.preEvaluateFont
To hopefully help prevent any future bugs, make sure that composite/non-composite fonts cannot accidentally get matching `hash`es. Given the differences between those font types, that's very unlikely to be useful or even correct in general.
2021-05-07 21:22:37 +02:00
Jonas Jenwald
fc59a5f709 Take the W array into account when computing the hash, in PartialEvaluator.preEvaluateFont, for composite fonts (issue 13343)
Without this some *composite* fonts may incorrectly end up with matching `hash`es, thus breaking rendering since we'll not actually try to load/parse some of the fonts.

*Please note:* Given that the document, in the referenced issue, doesn't embed *any* of its fonts there's no guarantee that it renders correctly in all configurations even with this patch.
2021-05-07 21:22:36 +02:00
Tim van der Meij
a3632c0f38
Merge pull request #13344 from Snuffleupagus/evaluator-no-var
Enable the `no-var` rule in the `src/core/evaluator.js` file
2021-05-07 21:02:46 +02:00
Tim van der Meij
5248d0a77d
Merge pull request #13338 from Snuffleupagus/images-class
Convert the `src/core/{jbig2, jpg, jpx}.js` files to use standard classes
2021-05-07 20:59:58 +02:00
Calixte Denizet
af125cd299 JS - Add support for display property
- in annotation_layer, move common properties treatment in a common method instead having duplicated code in each widget.
2021-05-06 11:15:38 +02:00
Jonas Jenwald
0ef9b5aafc Fix the remaining no-var failures, which couldn't be handled automatically, in the src/core/evaluator.js file
The only *slight* complication here were some of the `switch`-cases, in `getOperatorList`/`getTextContent`, where the parsing is done asynchronously.
However, those cases are easy to deal with by wrapping the code within its own block; please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/switch#block-scope_variables_within_switch_statements
2021-05-06 10:21:05 +02:00
Jonas Jenwald
f93c3b9aa7 Enable the no-var rule in the src/core/evaluator.js file
These changes were made automatically, using `gulp lint --fix`.
2021-05-06 09:39:21 +02:00
Jonas Jenwald
0a32ad3e42 Remove unnecessary closure in the src/core/font_renderer.js file
With modern JavaScript modules, where you explicitly list the properties that should be exported, it's no longer necessary to wrap *all* of the code within one file into a top-level closure.[1]

This patch reduces the size, of even the *built* `pdf.worker.js` file, since there's now a lot less unnecessary whitespace.

---
[1] For files which contain *different* functionality, some closures may however still make sense in order to separate the code.
It might be possible to remove some of those cases later, e.g. once private class fields becomes generally available/usable in browsers.
2021-05-05 22:35:52 +02:00
Tim van der Meij
afb8c4fd25
Merge pull request #13327 from Snuffleupagus/split-fonts
Split the functionality in `src/core/fonts.js` into multiple files, and use standard classes
2021-05-05 20:16:24 +02:00
Jonas Jenwald
ce14171cf0 Convert src/core/jpx.js to use standard classes
*Please note:* Ignoring whitespace-only changes is probably necessary in order to review this.
2021-05-05 14:02:21 +02:00
Jonas Jenwald
cb65b762eb Fix the remaining no-var failures, which couldn't be handled automatically, in the src/core/jpx.js file 2021-05-05 14:02:21 +02:00
Jonas Jenwald
a273599a12 Enable the no-var rule in the src/core/jpx.js file
These changes were made automatically, using `gulp lint --fix`.
2021-05-05 14:02:21 +02:00
Jonas Jenwald
69dea39a42 Convert src/core/jpg.js to use standard classes
*Please note:* Ignoring whitespace-only changes is probably necessary in order to review this.
2021-05-05 14:02:21 +02:00
Jonas Jenwald
d0a299713c Fix the remaining no-var failures, which couldn't be handled automatically, in the src/core/jpg.js file 2021-05-05 14:02:21 +02:00
Jonas Jenwald
1e5a179600 Enable the no-var rule in the src/core/jpg.js file
These changes were made automatically, using `gulp lint --fix`.
2021-05-05 14:02:21 +02:00
Jonas Jenwald
0addf3a0d4 Convert src/core/jbig2.js to use standard classes
*Please note:* Ignoring whitespace-only changes is probably necessary in order to review this.
2021-05-05 14:02:21 +02:00
Jonas Jenwald
d59c9ab3ab Fix the remaining no-var failures, which couldn't be handled automatically, in the src/core/jbig2.js file 2021-05-05 14:02:21 +02:00
Jonas Jenwald
7ca3a34e1f Enable the no-var rule in the src/core/jbig2.js file
These changes were made automatically, using `gulp lint --fix`.
2021-05-05 14:02:21 +02:00
Jonas Jenwald
99fae47c8e [Regression] Move the super-call in the PredictorStream-constructor to prevent errors (PR 13303)
*My apologies for breaking this; thankfully PR 13303 hasn't reach mozilla-central yet.*

It's (obviously) necessary to initialize a `PredictorStream`-instance fully, since otherwise breakage may occur if there's errors during the actual stream parsing.
To reproduce this issue, try opening the PDF document from issue 13051 locally and observe the following message in the console:
```
Warning: Invalid stream: "ReferenceError: this hasn't been initialised - super() hasn't been called"
```
2021-05-05 13:24:12 +02:00
Calixte Denizet
549aae6c3d JS -- add support for page property in field 2021-05-03 15:46:29 +02:00
Jonas Jenwald
5e5daca407 Remove unnecessary MissingDataException check from getHeaderBlock
It shouldn't be possible for the `getBytes`-call to throw a `MissingDataException`, since all resources are loaded *before* e.g. font-parsing ever starts; see f0817015bd/src/core/object_loader.js (L111-L126)

Furthermore, even if we'd *somehow* re-throw a `MissingDataException` here that still won't help considering where the `Type1Font`-instance is created. Note how in the `Font`-constructor we simply catch any errors and fallback to a standard font, which means that a `MissingDataException` would just lead to rendering errors anyway; see f0817015bd/src/core/fonts.js (L648-L691)

All-in-all, it's not possible for a `MissingDataException` to be thrown in `getHeaderBlock` and this code-path can thus be removed.
2021-05-03 13:57:30 +02:00
Jonas Jenwald
b487edd05d Convert src/core/fonts.js to use standard classes
Obviously the `Font`-class is still *very* large, given particularly how TrueType fonts are handled, however this patch-series at least improves things by moving a number of functions/classes into their own files.
As a follow-up it might make sense to try and re-factor/extract the TrueType parsing into its own file, since all of this code is quite old, however that's probably best left for another time.

For e.g. `gulp mozcentral`, the *built* `pdf.worker.js` files decreases from `1 620 332` to `1 617 466` bytes with this patch-series.
2021-05-03 13:57:25 +02:00
Jonas Jenwald
cadc20d8b9 Fix the remaining no-var failures, which couldn't be handled automatically, in the src/core/fonts.js file 2021-05-02 21:00:29 +02:00
Jonas Jenwald
b9cd080c01 Enable the no-var rule in the src/core/fonts.js file
These changes were made automatically, using `gulp lint --fix`.
Given the large size of this patch, the manual fixes are done separately in the next commit.
2021-05-02 21:00:29 +02:00
Jonas Jenwald
f64b7922b3 Convert src/core/type1_font.js to use standard classes 2021-05-02 21:00:29 +02:00
Jonas Jenwald
4bd69556ab Enable the no-var rule in the src/core/type1_font.js file
These changes were made *mostly* automatically, using `gulp lint --fix`, with the following manual changes:

```diff
diff --git a/src/core/type1_font.js b/src/core/type1_font.js
index 50a3e49e6..55a2005fb 100644
--- a/src/core/type1_font.js
+++ b/src/core/type1_font.js
@@ -38,10 +38,9 @@ const Type1Font = (function Type1FontClosure() {
     const scanLength = streamBytesLength - signatureLength;

     let i = startIndex,
-      j,
       found = false;
     while (i < scanLength) {
-      j = 0;
+      let j = 0;
       while (j < signatureLength && streamBytes[i + j] === signature[j]) {
         j++;
       }
@@ -248,14 +247,14 @@ const Type1Font = (function Type1FontClosure() {
         return charCodeToGlyphId;
       }

-      let glyphNames = [".notdef"],
-        glyphId;
+      const glyphNames = [".notdef"];
+      let builtInEncoding, glyphId;
       for (glyphId = 0; glyphId < charstrings.length; glyphId++) {
         glyphNames.push(charstrings[glyphId].glyphName);
       }
       const encoding = properties.builtInEncoding;
       if (encoding) {
-        var builtInEncoding = Object.create(null);
+        builtInEncoding = Object.create(null);
         for (const charCode in encoding) {
           glyphId = glyphNames.indexOf(encoding[charCode]);
           if (glyphId >= 0
```
2021-05-02 21:00:29 +02:00
Jonas Jenwald
ff85bcfc0e Move the Type1Font from src/core/fonts.js and into its own file 2021-05-02 21:00:29 +02:00
Jonas Jenwald
e803584fe7 Convert src/core/cff_font.js to use standard classes 2021-05-02 21:00:29 +02:00
Jonas Jenwald
542ee0d798 Enable the no-var rule in the src/core/cff_font.js file
These changes were made automatically, using `gulp lint --fix`.
2021-05-02 21:00:29 +02:00
Jonas Jenwald
d5d73e3168 Move the CFFFont from src/core/fonts.js and into its own file 2021-05-02 21:00:29 +02:00
Jonas Jenwald
d4606712f2 Enable the no-var rule in the src/core/fonts_utils.js file
These changes were made *mostly* automatically, using `gulp lint --fix`, with the following manual changes:

```diff
diff --git a/src/core/fonts_utils.js b/src/core/fonts_utils.js
index f88ce4a8c..c4b3f3808 100644
--- a/src/core/fonts_utils.js
+++ b/src/core/fonts_utils.js
@@ -167,8 +167,8 @@ function type1FontGlyphMapping(properties, builtInEncoding,
glyphNames) {
   }

   // Lastly, merge in the differences.
-  let differences = properties.differences,
-    glyphsUnicodeMap;
+  const differences = properties.differences;
+  let glyphsUnicodeMap;
   if (differences) {
     for (charCode in differences) {
       const glyphName = differences[charCode];
```
2021-05-02 21:00:29 +02:00
Jonas Jenwald
77b258440b Move some constants and helper functions from src/core/fonts.js and into their own file
- `FontFlags`, is used in both `src/core/fonts.js` and `src/core/evaluator.js`.
 - `getFontType`, same as the above.
 - `MacStandardGlyphOrdering`, is a fairly large data-structure and `src/core/fonts.js` is already a *very* large file.
 - `recoverGlyphName`, a dependency of `type1FontGlyphMapping`; please see below.
 - `SEAC_ANALYSIS_ENABLED`, is used by both `Type1Font`, `CFFFont`, and unit-tests; please see below.
 - `type1FontGlyphMapping`, is used by both `Type1Font` and `CFFFont` which a later patch will move to their own files.
2021-05-02 21:00:29 +02:00
Jonas Jenwald
22539b52fa Convert src/core/to_unicode_map.js to use standard classes 2021-05-02 21:00:29 +02:00
Jonas Jenwald
33ea6b1131 Enable the no-var rule in the src/core/to_unicode_map.js file
These changes were made automatically, using `gulp lint --fix`.
2021-05-02 21:00:29 +02:00
Jonas Jenwald
6912bb5e0a Move the IdentityToUnicodeMap/ToUnicodeMap from src/core/fonts.js and into its own file 2021-05-02 21:00:29 +02:00
Jonas Jenwald
8c1d1a58f7 Convert src/core/opentype_file_builder.js to use standard classes 2021-05-02 21:00:28 +02:00
Jonas Jenwald
1808b2dc96 Enable the no-var rule in the src/core/opentype_file_builder.js file
These changes were made automatically, using `gulp lint --fix`.
2021-05-02 21:00:28 +02:00
Jonas Jenwald
a783c7ca79 Move the OpenTypeFileBuilder from src/core/fonts.js and into its own file 2021-05-02 21:00:28 +02:00
Tim van der Meij
af9feb1307
Merge pull request #13321 from timvandermeij/src-core-no-var
Enable the `no-var` linting rule in `src/core/{crypto,function}.js`
2021-05-02 13:45:33 +02:00
Tim van der Meij
b661cf2b80
Fix no-var linting rule violations in src/core/crypto.js that couldn't be changed automatically by ESLint
This is done in a separate commit due to the required number of changes
so that reviewing is easier than in a plain-text diff in the commit
message.
2021-05-02 13:32:34 +02:00
Tim van der Meij
1f8b452354
Enable the no-var linting rule in src/core/crypto.js
This is done automatically with `gulp lint --fix`.
2021-05-01 20:34:35 +02:00
Tim van der Meij
58e568fe62
Enable the no-var linting rule in src/core/function.js
This is done automatically with `gulp lint --fix` and the following
manual changes:

```diff
diff --git a/src/core/function.js b/src/core/function.js
index 878001057..b7e3e6ccf 100644
--- a/src/core/function.js
+++ b/src/core/function.js
@@ -131,7 +131,7 @@ function toNumberArray(arr) {
   return arr;
 }

-var PDFFunction = (function PDFFunctionClosure() {
+const PDFFunction = (function PDFFunctionClosure() {
   const CONSTRUCT_SAMPLED = 0;
   const CONSTRUCT_INTERPOLATED = 2;
   const CONSTRUCT_STICHED = 3;
@@ -484,7 +484,9 @@ var PDFFunction = (function PDFFunctionClosure() {
         // clip to domain
         const v = clip(src[srcOffset], domain[0], domain[1]);
         // calculate which bound the value is in
-        for (var i = 0, ii = bounds.length; i < ii; ++i) {
+        const length = bounds.length;
+        let i;
+        for (i = 0; i < length; ++i) {
           if (v < bounds[i]) {
             break;
           }
@@ -673,23 +675,21 @@ const PostScriptStack = (function PostScriptStackClosure() {
     roll(n, p) {
       const stack = this.stack;
       const l = stack.length - n;
-      let r = stack.length - 1,
-        c = l + (p - Math.floor(p / n) * n),
-        i,
-        j,
-        t;
-      for (i = l, j = r; i < j; i++, j--) {
-        t = stack[i];
+      const r = stack.length - 1;
+      const c = l + (p - Math.floor(p / n) * n);
+
+      for (let i = l, j = r; i < j; i++, j--) {
+        const t = stack[i];
         stack[i] = stack[j];
         stack[j] = t;
       }
-      for (i = l, j = c - 1; i < j; i++, j--) {
-        t = stack[i];
+      for (let i = l, j = c - 1; i < j; i++, j--) {
+        const t = stack[i];
         stack[i] = stack[j];
         stack[j] = t;
       }
-      for (i = c, j = r; i < j; i++, j--) {
-        t = stack[i];
+      for (let i = c, j = r; i < j; i++, j--) {
+        const t = stack[i];
         stack[i] = stack[j];
         stack[j] = t;
       }
@@ -939,7 +939,7 @@ class PostScriptEvaluator {
 // We can compile most of such programs, and at the same moment, we can
 // optimize some expressions using basic math properties. Keeping track of
 // min/max values will allow us to avoid extra Math.min/Math.max calls.
-var PostScriptCompiler = (function PostScriptCompilerClosure() {
+const PostScriptCompiler = (function PostScriptCompilerClosure() {
   class AstNode {
     constructor(type) {
       this.type = type;
```
2021-05-01 20:04:58 +02:00
Jonas Jenwald
3624f9eac7 Add a new BaseStream.getString(...) method to replace manual bytesToString(BaseStream.getBytes(...)) calls
Given that the `bytesToString(BaseStream.getBytes(...))` pattern is somewhat common throughout the `src/core/` code, it cannot hurt to add a new `BaseStream`-method which handles that case internally.
2021-05-01 19:20:36 +02:00
Tim van der Meij
f6f335173d
Merge pull request #13303 from Snuffleupagus/BaseStream
Add an abstract base-class, which all the various Stream implementations inherit from
2021-05-01 19:13:36 +02:00
calixteman
af4dc55019
[api-minor] Fix the way to chunk the strings (#13257)
- Improve chunking in order to fix some bugs where the spaces aren't here:
    * track the last position where a glyph has been drawn;
    * when a new glyph (first glyph in a chunk) is added then compare its position with the last saved one and add a space or break:
      - there are multiple ways to move the glyphs and to avoid to have to deal with all the different possibilities it's a way easier to just compare positions;
      - and so there is now one function (i.e. "compareWithLastPosition") where all the job is done.
  - Add some breaks in order to get lines;
  - Remove the multiple whites spaces:
    * some spaces were filled with several whites spaces and so it makes harder to find some sequences of words using the search tool;
    * other pdf readers replace spaces by one white space.

Update src/core/evaluator.js

Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>

Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>
2021-04-30 14:41:13 +02:00
Jonas Jenwald
2ac4ad3111 Let ChunkedStream extend Stream, rather than BaseStream directly
Looking at the `ChunkedStream` implementation, it's basically a "regular" `Stream` but with added functionality in order to deal with fetching/loading of missing data.
Hence, by letting `ChunkedStream` extend `Stream`, we can remove some duplicate methods from the `ChunkedStream` class.
2021-04-28 14:05:25 +02:00
Jonas Jenwald
fb0775525e Stop special-casing the dict parameter in the Jbig2Stream/JpegStream/JpxStream constructors
For all of the other `DecodeStream`s we're not passing in a `Dict`-instance manually, but instead get it from the `stream`-parameter. Hence there's no particularly good reason, as far as I can tell, to not do the same thing in `Jbig2Stream`/`JpegStream`/`JpxStream` as well.
2021-04-28 13:44:47 +02:00
Jonas Jenwald
67a1cfc1b1 Improve the handling getBaseStreams, on the various Stream implementations
The way that `getBaseStreams` is currently handled has bothered me from time to time, especially how we're checking if the method exists before calling it.
By adding a dummy `BaseStream.getBaseStreams` method, and having the call-sites simply check the return value, we can improve some of the relevant code.

Note in particular how the `ObjectLoader._walk` method didn't actually check that the data in question is a Stream instance, and instead only checked the `currentNode` (which could be anything) for the existence of a `getBaseStreams` property.
2021-04-28 13:44:47 +02:00
Jonas Jenwald
67415bfabe Add an abstract base-class, which all the various Stream implementations inherit from
By having an abstract base-class, it becomes a lot clearer exactly which methods/getters are expected to exist on all Stream instances.
Furthermore, since a number of the methods are *identical* for all Stream implementations, this reduces unnecessary code duplication in the `Stream`, `DecodeStream`, and `ChunkedStream` classes.

For e.g. `gulp mozcentral`, the *built* `pdf.worker.js` files decreases from `1 619 329` to `1 616 115` bytes with this patch-series.
2021-04-28 13:44:45 +02:00
Jonas Jenwald
6151b4ecac Convert src/core/stream.js to use standard classes 2021-04-28 13:44:10 +02:00
Jonas Jenwald
29cf415a69 Enable the no-var rule in the src/core/stream.js file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
b11f012e52 Convert src/core/decode_stream.js to use standard classes 2021-04-28 10:16:51 +02:00
Jonas Jenwald
8ce2cae4a7 Enable the no-var rule in the src/core/decode_stream.js file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
30a22a168d Move the DecodeStream and StreamsSequenceStream from src/core/stream.js and into its own file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
213e1c389c Convert src/core/flate_stream.js to use standard classes 2021-04-28 10:16:51 +02:00
Jonas Jenwald
aa1deaf93c Enable the no-var rule in the src/core/flate_stream.js file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
1e5bf352a5 Move the FlateStream from src/core/stream.js and into its own file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
40c342ec6c Convert src/core/predictor_stream.js to use standard classes 2021-04-28 10:16:51 +02:00
Jonas Jenwald
b08f9a8182 Enable the no-var rule in the src/core/predictor_stream.js file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
66d9d83dcb Move the PredictorStream from src/core/stream.js and into its own file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
e938c05edb Convert src/core/decrypt_stream.js to use standard classes 2021-04-28 10:16:51 +02:00
Jonas Jenwald
a9476e7dd0 Enable the no-var rule in the src/core/decrypt_stream.js file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
28b0809e60 Move the DecryptStream from src/core/stream.js and into its own file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
cdb583b764 Convert src/core/ascii_85_stream.js to use standard classes 2021-04-28 10:16:51 +02:00
Jonas Jenwald
f6c7a65202 Enable the no-var rule in the src/core/ascii_85_stream.js file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
3294d4d5a3 Move the Ascii85Stream from src/core/stream.js and into its own file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
d2227a7d10 Convert src/core/ascii_hex_stream.js to use standard classes 2021-04-28 10:16:50 +02:00
Jonas Jenwald
59591f8788 Enable the no-var rule in the src/core/ascii_hex_stream.js file 2021-04-28 10:16:50 +02:00
Jonas Jenwald
d63df04854 Move the AsciiHexStream from src/core/stream.js and into its own file 2021-04-28 10:16:50 +02:00
Jonas Jenwald
704514c7cd Convert src/core/run_length_stream.js to use standard classes 2021-04-28 10:16:50 +02:00
Jonas Jenwald
66b898eb58 Enable the no-var rule in the src/core/run_length_stream.js file 2021-04-28 10:16:50 +02:00
Jonas Jenwald
342b0c1bbc Move the RunLengthStream from src/core/stream.js and into its own file 2021-04-28 10:16:50 +02:00
Jonas Jenwald
1f0685cee6 Convert src/core/lzw_stream.js to use standard classes 2021-04-28 10:16:50 +02:00
Jonas Jenwald
1f9b134c6a Enable the no-var rule in the src/core/src/core/lzw_stream.js file 2021-04-28 10:16:50 +02:00
Jonas Jenwald
6c1a321500 Move the LZWStream from src/core/stream.js and into its own file 2021-04-28 10:16:50 +02:00
Tim van der Meij
0acd801b1e
Merge pull request #13305 from timvandermeij/annotation-polygon-polyline-no-appearance-stream
Implement rendering polyline/polygon annotations without appearance stream
2021-04-27 20:03:35 +02:00
Tim van der Meij
60ab15427f
Implement rendering polyline/polygon annotations without appearance stream 2021-04-27 19:02:20 +02:00
Jonas Jenwald
0ecb42f4d7 Convert src/core/jpx_stream.js to use standard classes 2021-04-27 13:29:09 +02:00
Jonas Jenwald
c51ef1f21f Convert src/core/jbig2_stream.js to use standard classes 2021-04-27 13:29:09 +02:00
Jonas Jenwald
d9c1bf96b6 Convert src/core/jpeg_stream.js to use standard classes 2021-04-27 13:29:09 +02:00
Jonas Jenwald
0ca63f94b4 Convert src/core/ccitt_stream.js to use standard classes 2021-04-27 13:29:09 +02:00
Jonas Jenwald
8ff213871b Convert src/core/ccitt.js to use standard classes
Given that we're using modules, meaning that only explicitly `export`ed things are visible to the outside, it's no longer necessary to wrap all of the code in a closure.
2021-04-27 13:29:09 +02:00
Jonas Jenwald
6f4394fcd8
Support InkAnnotations without appearance streams (issue 13298) (#13301)
For now, we keep things purposely simple by using straight lines (rather than curves); please see https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G11.2096579
2021-04-27 11:49:03 +02:00
Tim van der Meij
270e56dae8
Enable the no-var linting rule in src/core/image.js
This is done automatically with `gulp lint --fix` and the following
manual changes:

```diff
diff --git a/src/core/image.js b/src/core/image.js
index 35c06b8ab..e718b9937 100644
--- a/src/core/image.js
+++ b/src/core/image.js
@@ -97,7 +97,7 @@ class PDFImage {
     if (isName(filter)) {
       switch (filter.name) {
         case "JPXDecode":
-          var jpxImage = new JpxImage();
+          const jpxImage = new JpxImage();
           jpxImage.parseImageProperties(image.stream);
           image.stream.reset();
```
2021-04-25 17:40:00 +02:00
Tim van der Meij
16efd09c9f
Enable the no-var linting rule in src/core/worker.js
This is done automatically with `gulp lint --fix` and the following
manual changes:

```diff
diff --git a/src/core/worker.js b/src/core/worker.js
index aec9c1d39..f88691622 100644
--- a/src/core/worker.js
+++ b/src/core/worker.js
@@ -300,7 +300,7 @@ class WorkerMessageHandler {
         cachedChunks = [];
       };
       const readPromise = new Promise(function (resolve, reject) {
-        var readChunk = function ({ value, done }) {
+        const readChunk = function ({ value, done }) {
           try {
             ensureNotTerminated();
             if (done) {
```
2021-04-25 17:40:00 +02:00
Tim van der Meij
85659b4cf0
Enable the no-var linting rule in src/core/cmap.js
This is done automatically with `gulp lint --fix` and the following
manual changes:

```diff
diff --git a/src/core/cmap.js b/src/core/cmap.js
index 850275a19..8794726dd 100644
--- a/src/core/cmap.js
+++ b/src/core/cmap.js
@@ -519,8 +519,8 @@ const BinaryCMapReader = (function BinaryCMapReaderClosure() {

     readHexNumber(num, size) {
       let last;
-      let stack = this.tmpBuf,
-        sp = 0;
+      const stack = this.tmpBuf;
+      let sp = 0;
       do {
         const b = this.readByte();
         if (b < 0) {
@@ -603,7 +603,6 @@ const BinaryCMapReader = (function BinaryCMapReaderClosure() {

         const ucs2DataSize = 1;
         const subitemsCount = stream.readNumber();
-        var i;
         switch (type) {
           case 0: // codespacerange
             stream.readHex(start, dataSize);
@@ -614,7 +613,7 @@ const BinaryCMapReader = (function BinaryCMapReaderClosure() {
               hexToInt(start, dataSize),
               hexToInt(end, dataSize)
             );
-            for (i = 1; i < subitemsCount; i++) {
+            for (let i = 1; i < subitemsCount; i++) {
               incHex(end, dataSize);
               stream.readHexNumber(start, dataSize);
               addHex(start, end, dataSize);
@@ -633,7 +632,7 @@ const BinaryCMapReader = (function BinaryCMapReaderClosure() {
             addHex(end, start, dataSize);
             stream.readNumber(); // code
             // undefined range, skipping
-            for (i = 1; i < subitemsCount; i++) {
+            for (let i = 1; i < subitemsCount; i++) {
               incHex(end, dataSize);
               stream.readHexNumber(start, dataSize);
               addHex(start, end, dataSize);
@@ -647,7 +646,7 @@ const BinaryCMapReader = (function BinaryCMapReaderClosure() {
             stream.readHex(char, dataSize);
             code = stream.readNumber();
             cMap.mapOne(hexToInt(char, dataSize), code);
-            for (i = 1; i < subitemsCount; i++) {
+            for (let i = 1; i < subitemsCount; i++) {
               incHex(char, dataSize);
               if (!sequence) {
                 stream.readHexNumber(tmp, dataSize);
@@ -667,7 +666,7 @@ const BinaryCMapReader = (function BinaryCMapReaderClosure() {
               hexToInt(end, dataSize),
               code
             );
-            for (i = 1; i < subitemsCount; i++) {
+            for (let i = 1; i < subitemsCount; i++) {
               incHex(end, dataSize);
               if (!sequence) {
                 stream.readHexNumber(start, dataSize);
@@ -692,7 +691,7 @@ const BinaryCMapReader = (function BinaryCMapReaderClosure() {
               hexToInt(char, ucs2DataSize),
               hexToStr(charCode, dataSize)
             );
-            for (i = 1; i < subitemsCount; i++) {
+            for (let i = 1; i < subitemsCount; i++) {
               incHex(char, ucs2DataSize);
               if (!sequence) {
                 stream.readHexNumber(tmp, ucs2DataSize);
@@ -717,7 +716,7 @@ const BinaryCMapReader = (function BinaryCMapReaderClosure() {
               hexToInt(end, ucs2DataSize),
               hexToStr(charCode, dataSize)
             );
-            for (i = 1; i < subitemsCount; i++) {
+            for (let i = 1; i < subitemsCount; i++) {
               incHex(end, ucs2DataSize);
               if (!sequence) {
                 stream.readHexNumber(start, ucs2DataSize);
```
2021-04-25 17:40:00 +02:00
Jonas Jenwald
da22146b95 Replace a bunch of Array.prototype.forEach() cases with for...of loops instead
Using `for...of` is a modern and generally much nicer pattern, since it gets rid of unnecessary callback-functions. (In a couple of spots, a "regular" `for` loop had to be used.)
2021-04-24 13:00:19 +02:00
Jonas Jenwald
4ec0a4fb43 Re-factor the Catalog._collectJavaScript method slightly
This patch first of all moves all checking/validation into the `appendIfJavaScriptDict` function, to avoid duplicating it in multiple places. Secondly, also removes what's now an outdated/incorrect comment since we have implemented scripting support.
2021-04-23 09:42:32 +02:00
Jonas Jenwald
83f7009e4b Change NameOrNumberTree.getAll to return a Map rather than an Object
Given that we're (almost) always iterating through the result of the `getAll`-calls, using a `Map` seems nicer overall since it's more suited to iteration compared to a regular Object.

Also, add a couple of `Dict`-checks in existing code touched by this patch, since it really cannot hurt to prevent *potential* errors in a corrupt PDF document.
2021-04-22 13:15:50 +02:00
Jonas Jenwald
57a1ea840f
Ensure that saveDocument works if there's no /ID-entry in the PDF document (issue 13279) (#13280)
First of all, while it should be very unlikely that the /ID-entry is an *indirect* object, note how we're using `Dict.get` when parsing it e.g. in `PDFDocument.fingerprint`. Hence we definitely should be consistent here, since if the /ID-entry is an *indirect* object the existing code in `src/core/writer.js` would already fail.
Secondly, to fix the referenced issue, we also need to check that the /ID-entry actually is an Array before attempting to access its contents in `src/core/writer.js`.

*Drive-by change:* In the `xrefInfo` object passed to the `incrementalUpdate` function, re-name the `encrypt` property to `encryptRef` since its data is fetched using `Dict.getRaw` (given the names of the other properties fetched similarly).
2021-04-22 12:08:56 +02:00
Brendan Dahl
066cbcfb27
Merge pull request #13277 from Snuffleupagus/adjustToUnicode-cff
For CFF fonts without proper `ToUnicode`/`Encoding` data, utilize the "charset"/"Encoding"-data from the font file to improve text-selection (issue 13260)
2021-04-21 10:41:36 -07:00
Jonas Jenwald
7fab73ed23 For CFF fonts without proper ToUnicode/Encoding data, utilize the "charset"/"Encoding"-data from the font file to improve text-selection (issue 13260)
This patch extends the approach, implemented in PR 7550, to also apply to CFF fonts.
2021-04-20 20:48:44 +02:00
Jonas Jenwald
8f6543c218 Ensure that the /Properties, used with optional content, is actually loaded *before* parsing the operatorList/textContent (PR 12095 follow-up)
By not waiting for the /Properties to load, before parsing of the operatorList/textContent starts, there's a very real risk that a `MissingDataException` will be thrown when trying to access the data in the `PartialEvaluator.parseMarkedContentProps` method.
If this ever happens it will thus lead to incomplete and/or outright broken rendering, and with e.g. `disableAutoFetch=true` set the likelihood of this occuring would increase quite a bit.

*Please note:* While I've not yet seen this error in an actual PDF document, it can happen during loading if you're unlucky enough with e.g. the structure of the PDF document and/or the download speed offered by the server.
2021-04-20 20:22:44 +02:00
Jonas Jenwald
f560fe6875 A couple of small scripting/XFA-related tweaks in the worker-code
- Use `PDFManager.ensureDoc`, rather than `PDFManager.ensure`, in a couple of spots in the code. If there exists a short-hand format, we should obviously use it whenever possible.

 - Fix a unit-test helper, to account for the previous changes. (Also, converts a function to be `async` instead.)

 - Add one more exists-check in `PDFDocument.loadXfaFonts`, which I missed to suggest in PR 13146, to prevent any possible errors if the method is ever called in a situation where it shouldn't be.
   Also, print a warning if the actual font-loading fails since that could help future debugging. (Finally, reduce overall indentation in the loop.)

 - Slightly unrelated, but make a small tweak of a comment in `src/core/fonts.js` to reduce possible confusion.
2021-04-17 10:34:22 +02:00
Brendan Dahl
ac3fa1e3d7
Merge pull request #13146 from calixteman/xfa_fonts
XFA -- Load fonts permanently from the pdf
2021-04-16 12:55:12 -07:00
Calixte Denizet
7e9579045f XFA -- Load fonts permanently from the pdf
- Different fonts can be used in xfa and some of them are embedded in the pdf.
  - Load all the fonts in window.document.

Update src/core/document.js

Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>

Update src/core/worker.js

Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>
2021-04-15 17:57:42 +02:00
Jani Pehkonen
3a96977ea8 Implement visibility expressions for optional content 2021-04-14 17:39:41 +03:00
Jonas Jenwald
1d6d476cab Rename the src/core/obj.js file to src/core/catalog.js
Now that only the `Catalog` remains in this file, after the previous patches, it makes sense to rename the file to reduce confusion.
2021-04-13 21:00:30 +02:00
Jonas Jenwald
088a55f80d Enable the no-var rule in the src/core/xref.js file 2021-04-13 21:00:30 +02:00
Jonas Jenwald
bc828cd41f Convert the XRef to a "normal" class 2021-04-13 21:00:30 +02:00
Jonas Jenwald
e8750cfe95 Move the XRef from src/core/obj.js and into its own file
The size of the `src/core/obj.js` file has increased slowly over the years, and it also contains a fair amount of *distinct* functionality.
In order to improve readability and make it easier to navigate through the code, this patch moves the `XRef` into its own file.
2021-04-13 21:00:30 +02:00
Jonas Jenwald
24e5ecdf76 Move NameTree/NumberTree from src/core/obj.js and into its own file
The size of the `src/core/obj.js` file has increased slowly over the years, and it also contains a fair amount of *distinct* functionality.
In order to improve readability and make it easier to navigate through the code, this patch moves `NameTree`/`NumberTree` into its own file.
2021-04-13 21:00:30 +02:00
Jonas Jenwald
92141e0468 Enable the no-var rule in the src/core/file_spec.js file 2021-04-13 21:00:30 +02:00
Jonas Jenwald
22a066e657 Convert the FileSpec to a "normal" class 2021-04-13 21:00:30 +02:00
Jonas Jenwald
e02d17da93 Move the FileSpec from src/core/obj.js and into its own file
The size of the `src/core/obj.js` file has increased slowly over the years, and it also contains a fair amount of *distinct* functionality.
In order to improve readability and make it easier to navigate through the code, this patch moves the `FileSpec` into its own file.
2021-04-13 21:00:30 +02:00
Jonas Jenwald
6a935682fd Covert the ObjectLoader to a "normal" class 2021-04-13 21:00:30 +02:00
Jonas Jenwald
604cd6d600 Move the ObjectLoader from src/core/obj.js and into its own file
The size of the `src/core/obj.js` file has increased slowly over the years, and it also contains a fair amount of *distinct* functionality.
In order to improve readability and make it easier to navigate through the code, this patch moves the `ObjectLoader` into its own file.
2021-04-13 21:00:30 +02:00
Tim van der Meij
ebeb3f7999
Merge pull request #13234 from Snuffleupagus/hasJSActions-MissingDataException
[api-minor] Ensure that `PDFDocumentProxy.hasJSActions` won't fail if `MissingDataException`s are thrown during the associated worker-thread parsing
2021-04-13 20:44:58 +02:00
Tim van der Meij
3d2d8002b0
Merge pull request #13223 from Snuffleupagus/worker-xfa-structTree-tweaks
Remove the unused "GetIsPureXfa" message handler; and avoid unnecessary parsing when no structTree is available (PR 13069 follow-up, PR 13221 follow-up)
2021-04-13 20:39:52 +02:00
Jonas Jenwald
2b2234fd5a [api-minor] Ensure that PDFDocumentProxy.hasJSActions won't fail if MissingDataExceptions are thrown during the associated worker-thread parsing
With the current implementation of `PDFDocument.hasJSActions`, in the worker-thread, we're not actually handling not-yet-loaded data correctly. This can thus fail in *two* different ways:
 - The `PDFDocument.fieldObjects` getter (and its helper method), while it may *return* a Promise, still fetches all of its data synchronously and it can thus throw a `MissingDataException` during parsing.
 - The `Catalog.jsActions` getter, which is completely synchronous, can obviously throw a `MissingDataException` during parsing.

If either of these cases occur currently, the `PDFDocumentProxy.hasJSActions` method in the API can either return a *rejected* Promise (which it never should) or possibly "hang" and never resolve.

*Please note:* While I've not *yet* seen this error in an actual PDF document, it can happen during loading if you're unlucky enough with e.g. the structure of the PDF document and/or the download speed offered by the server.
This patch is thus based on code-inspection *and* on manually throwing a `MissingDataException` on the first access of `Catalog.jsActions` to simulate this situation.

Finally, this patch adds a couple of *API* unit-tests for this (since none existed).
2021-04-13 14:33:56 +02:00
Jonas Jenwald
4aa27cc645 Re-factor Catalog._collectJavaScript to use a Map rather than an Object
Given that this only an internal helper method, used by the `Catalog.{javaScript, jsActions}` getters, this change simplifies iteration of the returned data.
We can also (slightly) re-factor the code of the `jsActions` getter, and remove an obsolete[1] JSDoc-comment from the `openAction` getter.

---
[1] Not really relevant now that we've got proper scripting support.
2021-04-13 14:16:17 +02:00
Calixte Denizet
a4c986515f XFA -- Display text content
- display xhtml;
  - allow spaces in xhtml (xfa-spacerun:yes);
  - support column layout;
  - fix some border issues.
2021-04-12 14:13:49 +02:00
Jonas Jenwald
54ef4370a2 Ensure that the data is loaded, in the "GetPageJSActions" message handler
Similar to all other data accesses, note e.g. the "GetDocJSActions" handler just above, we need to ensure that a `MissingDataException` isn't propagated to the main-thread if this data is accessed while the PDF document is still loading.
2021-04-12 13:54:37 +02:00
Jonas Jenwald
9360c7cbdc Avoid unnecessary parsing, in Page.GetStructTree, when no structTree is available (PR 13221 follow-up)
It's obviously (a bit) more efficient to return early in `Page.getStructTree`, rather than trying to first "parse" an *empty* structTree-root.

*Somehow I didn't think of this yesterday, but this feels like a much better solution overall; sorry about the churn here!*
2021-04-12 08:54:21 +02:00
Jonas Jenwald
0d2dd6c2fe Remove the unused "GetIsPureXfa" message handler in the worker (PR 13069 follow-up)
Looking at the API, there's no code which actually sends this message. Most likely it's a left-over from a previous version of PR 13069, since the `isPureXfa` parameter is being included in the "GetDoc" message.
2021-04-12 08:52:27 +02:00
Jonas Jenwald
5adee0cdd1 [api-minor] Let PDFPageProxy.getStructTree return null, rather than an empty structTree, for documents without any accessibility data (PR 13171 follow-up)
This is first of all consistent with existing API-methods, where we return `null` when the data in question doesn't exist. Secondly, it should also be (slightly) more efficient since there's less dummy-data that we need to transfer between threads.
Finally, this prevents us from adding an empty/unnecessary span to *every* single page even in documents without any structure tree data.
2021-04-11 12:35:33 +02:00
Jonas Jenwald
ff4dae05b0 Ensure that getStructTree won't break with disableAutoFetch = true set (PR 13171 follow-up)
Open http://localhost:8888/web/viewer.html?file=/test/pdfs/pdf.pdf#disableStream=true&disableAutoFetch=true and observe the following message in the console (repeated for each page of the document):
```
Uncaught (in promise)
Object { message: "Missing data [19787293, 19787294)", name: "UnknownErrorException", details: "MissingDataException: Missing data [19787293, 19787294)", stack: "BaseExceptionClosure@http://localhost:8888/src/shared/util.js:458:29\n@http://localhost:8888/src/shared/util.js:462:3\n" }
```
2021-04-11 12:15:33 +02:00
Tim van der Meij
d9d626a5e1
Merge pull request #13214 from calixteman/signatures
Display widget signature
2021-04-10 19:35:16 +02:00
Calixte Denizet
5875ebb1ca Display widget signature
- but don't validate them for now;
  - Firefox will display a bar to warn that the signature validation is not supported (see https://bugzilla.mozilla.org/show_bug.cgi?id=854315)
  - almost all (all ?) pdf readers display signatures;
  - validation is done in edge but for now it's behind a pref.
2021-04-10 19:13:28 +02:00
Brendan Dahl
fc9501a637 Add support for basic structure tree for accessibility.
When a PDF is "marked" we now generate a separate DOM that represents
the structure tree from the PDF.  This DOM is inserted into the <canvas>
element and allows screen readers to walk the tree and have more
information about headings, images, links, etc. To link the structure
tree DOM (which is empty) to the text layer aria-owns is used. This
required modifying the text layer creation so that marked items are
now tracked.
2021-04-09 09:56:28 -07:00
Tim van der Meij
6429ccc002
Merge pull request #13194 from Snuffleupagus/ttcf-fuzzy-match
Fuzzy-match the fontName, for TrueType Collection fonts, where the "name"-table is wrong (issue 13193)
2021-04-07 20:50:19 +02:00
Jonas Jenwald
f986ccdf0e Fuzzy-match the fontName, for TrueType Collection fonts, where the "name"-table is wrong (issue 13193)
The fontName, as defined in the PDF document, cannot be found in *any* of the "name"-tables in the TrueType Collection font. To work-around that, this patch adds a *fallback* code-path to allow using an approximately matching fontName rather than outright failing.
2021-04-07 15:25:32 +02:00
Tim van der Meij
fc0cd4a443
Convert the startXRefParsedCache variable, in src/core/obj.js, from an object to a set
We only want to track XRef starting points instead of actual data, so
using a set conveys that intention more clearly and is slightly more
efficient.
2021-04-05 19:32:58 +02:00
Jonas Jenwald
68d3a333ac Change the seenStyles object, in PartialEvaluator.getTextContent, to a Set
Given that what we actually want is only to keep track of the loadedFont-names, rather than storing any actual data, using an object isn't really necessary here. Furthermore, in the current code, we're also using `in` when checking if the data exists, which is generally less efficient than just checking for the value directly.
2021-04-05 10:34:02 +02:00
Jonas Jenwald
0eb1433c78 [api-minor] Change the format of the fontName-property, in defaultAppearanceData, on Annotation-instances (PR 12831 follow-up)
Currently the `fontName`-property contains an actual /Name-instance, which is a problem given that its fallback value is an empty string; see ca7f546828/src/core/default_appearance.js (L35)
The reason that this is a problem can be seen in ca7f546828/src/core/primitives.js (L30-L34), since an empty string short-circuits the cache. Essentially, in PDF documents, a /Name-instance cannot be empty and the way that the `DefaultAppearanceEvaluator` does things is unfortunately not entirely correct.

Hence the `fontName`-property is changed to instead contain a string, rather than a /Name-instance, which simplifies the code overall.

*Please note:* I'm tagging this patch with "[api-minor]", since PR 12831 is included in the current pre-release (although we're not using the `fontName`-property in the display-layer).
2021-04-01 16:47:30 +02:00
Tim van der Meij
5a64157a2f
Merge pull request #13168 from janpe2/ttf-uni-glyphs
Use post table when Encoding has only Differences
2021-03-31 21:35:13 +02:00
Tim van der Meij
1a4af17d07
Merge pull request #13165 from Snuffleupagus/Annotation-rm-defaultAppearance-export
[api-minor] Stop exposing the *raw* `defaultAppearance`-string on Annotation-instances
2021-03-31 21:30:50 +02:00
Jani Pehkonen
0117ee5071 Use post table when Encoding has only Differences
Fixes #13107
In the issue, some TrueType glyph names have the format `uniXXXX`.
Font's `Encoding` dictionary has the entry `Differences` but no
`BaseEncoding`. `uniXXXX` names are converted to glyph indices
using font's `post` table but currently that is done only when
`BaseEncoding` exists. We must enable the conversion also when only
`Differences` exists.
2021-03-31 17:58:44 +03:00
calixteman
b3528868c1
XFA - Add support for few ui elements (#13115)
- input;
  - layout;
  - border;
  - margin;
  - color.
2021-03-31 15:42:21 +02:00
Jonas Jenwald
3df24254e3 [api-minor] Stop exposing the *raw* defaultAppearance-string on Annotation-instances
The reasons for making this change are:
 - This property is not, nor has it ever been, used anywhere in the PDF.js display-layer.
 - Related to the previous point, the format of the `defaultAppearance`-string is such that it'd be difficult to use it as-is in the display-layer anyway.
 - It (usually) contains the "raw" appearance-string, from the PDF document, which is neither parsed nor validated and could thus be bogus.
 - We now expose a `defaultAppearanceData`-property, which is first of all used in the display-layer and secondly contains actually parsed/validated data.
 - In the event that a third-party implementation needs the `defaultAppearance`-string, it could be easily constructed from the recently added `defaultAppearanceData`-property.

All-in-all, I'm thus suggesting that we stop exposing an unused and unnecessary property on all Annotation-instances.
2021-03-31 15:09:18 +02:00
Jonas Jenwald
38acde8375 Use template strings, to reduce unnecessary verbosity in a few warn(...) calls in src/core/annotation.js 2021-03-31 14:40:21 +02:00
calixteman
84d7cccb1d
JS - Handle correctly hierarchy of fields (#13133)
* JS - Handle correctly hierarchy of fields
  - it aims to fix #13132;
  - annotations can inherit their actions from the parent field;
  - there are some fields which act as a container for other fields:
    - they can be access through js so need to add them with an empty type (nothing in the spec about that but checked in Acrobat);
    - calculation order list (CO) can reference them so need make them through this.getField;
    - getArray method must return kids.
  - field values are number, string, ... depending of their type but nothing in the spec on how to know what's the type:
    - according to the comment for Canonical Format: https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=461
    - it seems that this "type" can be guessed from js action Format (when setting a type in Acrobat DC, the only affected thing is this action).
  - util.scand with an empty string returns the current date.
2021-03-30 08:50:35 -07:00
Calixte Denizet
9296ee6986 Skip extra objects in object stream in using offsets 2021-03-28 13:03:05 +02:00
calixteman
81c602c61c
Set CFF header to 4 when writing it because it contains 4 elements (#13149) 2021-03-26 18:23:18 +01:00
calixteman
63471bcbbe
XFA - Convert some template properties into CSS ones (#13082)
- implement few positioning properties: position, width, height, anchor;
  - implement font element;
  - implement fill element (used by font) and its children (linear, radial, ...);
  - font property is inherited from ancestor container (see https://www.pdfa.org/wp-content/uploads/2020/07/XFA-3_3.pdf#page=43) so let CSS handles that stuff;
  - in order to reduce the number of properties to set, only set non default properties and put the default in CSS;
  - set a background to some containers to be able to see them (will be removed in a future commit).
2021-03-25 13:02:39 +01:00
Tim van der Meij
8269ddbd16
Merge pull request #13105 from Snuffleupagus/BasePdfManager-parseDocBaseUrl
Improve memory usage around the `BasePdfManager.docBaseUrl` parameter (PR 7689 follow-up)
2021-03-19 23:03:20 +01:00
calixteman
24e598a895
XFA - Add a layer to display XFA forms (#13069)
- add an option to enable XFA rendering if any;
  - for now, let the canvas layer: it could be useful to implement XFAF forms (embedded pdf in xml stream for the background and xfa form for the foreground);
  - ui elements in template DOM are pretty close to their html counterpart so we generate a fake html DOM from template one:
    - it makes easier to translate template properties to html ones;
    - it makes faster the creation of the html element in the main thread.
2021-03-19 10:11:40 +01:00
Jonas Jenwald
c4c7216171 Improve memory usage around the BasePdfManager.docBaseUrl parameter (PR 7689 follow-up)
While there is nothing *outright* wrong with the existing implementation, it can however lead to increased memory usage in one particular case (that I completely overlooked when implementing this):
For "data:"-URLs, which by definition contains the entire PDF document and can thus be arbitrarily large, we obviously want to avoid sending, storing, and/or logging the "raw" docBaseUrl in that case.

To address this, this patch makes the following changes:
 - Ignore any non-string in the `docBaseUrl` option passed to `getDocument`, since those are unsupported anyway, already on the main-thread.

 - Ignore "data:"-URLs in the `docBaseUrl` option passed to `getDocument`, to avoid having to send what could potentially be a *very* long string to the worker-thread.

 - Parse the `docBaseUrl` option *directly* in the `BasePdfManager`-constructors, on the worker-thread, to avoid having to store the "raw" docBaseUrl in the first place.
2021-03-17 15:48:24 +01:00
Jonas Jenwald
5099f1977f Support LineAnnotations with empty /Rect-entries (issue 6564)
This extends PR 13033 slightly, with a heuristic to support corrupt PDF documents where the `LineAnnotation`s have an empty /Rect-entry. Please note that while I have no idea if this is "correct", this patch at least makes us output the same /BBox as re-saving in Adobe Reader does.
2021-03-15 16:33:43 +01:00
Tim van der Meij
a2f0573a64
Enable the no-var linting rule for src/core/operator_list.js
This is mostly done using `gulp lint --fix` with a few manual changes in
the following diff:

```diff
diff --git a/src/core/operator_list.js b/src/core/operator_list.js
index 66c26fe05..cbcd12d97 100644
--- a/src/core/operator_list.js
+++ b/src/core/operator_list.js
@@ -40,7 +40,8 @@ const QueueOptimizer = (function QueueOptimizerClosure() {
     // 'count' groups of (save, transform, paintImageMaskXObject, restore)+
     // have been found at iFirstSave.
     const iFirstPIMXO = iFirstSave + 2;
-    for (var i = 0; i < count; i++) {
+    let i;
+    for (i = 0; i < count; i++) {
       const arg = argsArray[iFirstPIMXO + 4 * i];
       const imageMask = arg.length === 1 && arg[0];
       if (
@@ -106,8 +107,8 @@ const QueueOptimizer = (function QueueOptimizerClosure() {
       // assuming that heights of those image is too small (~1 pixel)
       // packing as much as possible by lines
       let maxX = 0;
-      let map = [],
-        maxLineHeight = 0;
+      const map = [];
+      let maxLineHeight = 0;
       let currentX = IMAGE_PADDING,
         currentY = IMAGE_PADDING;
       let q;
@@ -326,9 +327,9 @@ const QueueOptimizer = (function QueueOptimizerClosure() {
           if (fnArray[i] !== OPS.transform) {
             return false;
           }
-          var iFirstTransform = context.iCurr - 2;
-          var firstTransformArg0 = argsArray[iFirstTransform][0];
-          var firstTransformArg3 = argsArray[iFirstTransform][3];
+          const iFirstTransform = context.iCurr - 2;
+          const firstTransformArg0 = argsArray[iFirstTransform][0];
+          const firstTransformArg3 = argsArray[iFirstTransform][3];
           if (
             argsArray[i][0] !== firstTransformArg0 ||
             argsArray[i][1] !== 0 ||
@@ -342,8 +343,8 @@ const QueueOptimizer = (function QueueOptimizerClosure() {
           if (fnArray[i] !== OPS.paintImageXObject) {
             return false;
           }
-          var iFirstPIXO = context.iCurr - 1;
-          var firstPIXOArg0 = argsArray[iFirstPIXO][0];
+          const iFirstPIXO = context.iCurr - 1;
+          const firstPIXOArg0 = argsArray[iFirstPIXO][0];
           if (argsArray[i][0] !== firstPIXOArg0) {
             return false; // images don't match
           }
@@ -423,9 +424,9 @@ const QueueOptimizer = (function QueueOptimizerClosure() {
           if (fnArray[i] !== OPS.showText) {
             return false;
           }
-          var iFirstSetFont = context.iCurr - 3;
-          var firstSetFontArg0 = argsArray[iFirstSetFont][0];
-          var firstSetFontArg1 = argsArray[iFirstSetFont][1];
+          const iFirstSetFont = context.iCurr - 3;
+          const firstSetFontArg0 = argsArray[iFirstSetFont][0];
+          const firstSetFontArg1 = argsArray[iFirstSetFont][1];
           if (
             argsArray[i][0] !== firstSetFontArg0 ||
             argsArray[i][1] !== firstSetFontArg1
```
2021-03-14 11:49:31 +01:00
Tim van der Meij
24ff738e7b
Enable the no-var linting rule for src/core/pattern.js
This is mostly done using `gulp lint --fix` with a few manual changes in
the following diff:

```diff
diff --git a/src/core/pattern.js b/src/core/pattern.js
index 365491ed3..eedd8b686 100644
--- a/src/core/pattern.js
+++ b/src/core/pattern.js
@@ -105,7 +105,7 @@ const Pattern = (function PatternClosure() {
   return Pattern;
 })();

-var Shadings = {};
+const Shadings = {};

 // A small number to offset the first/last color stops so we can insert ones to
 // support extend. Number.MIN_VALUE is too small and breaks the extend.
@@ -597,16 +597,15 @@ Shadings.Mesh = (function MeshClosure() {
       if (!(0 <= f && f <= 3)) {
         throw new FormatError("Unknown type6 flag");
       }
-      var i, ii;
       const pi = coords.length;
-      for (i = 0, ii = f !== 0 ? 8 : 12; i < ii; i++) {
+      for (let i = 0, ii = f !== 0 ? 8 : 12; i < ii; i++) {
         coords.push(reader.readCoordinate());
       }
       const ci = colors.length;
-      for (i = 0, ii = f !== 0 ? 2 : 4; i < ii; i++) {
+      for (let i = 0, ii = f !== 0 ? 2 : 4; i < ii; i++) {
         colors.push(reader.readComponents());
       }
-      var tmp1, tmp2, tmp3, tmp4;
+      let tmp1, tmp2, tmp3, tmp4;
       switch (f) {
         // prettier-ignore
         case 0:
@@ -729,16 +728,15 @@ Shadings.Mesh = (function MeshClosure() {
       if (!(0 <= f && f <= 3)) {
         throw new FormatError("Unknown type7 flag");
       }
-      var i, ii;
       const pi = coords.length;
-      for (i = 0, ii = f !== 0 ? 12 : 16; i < ii; i++) {
+      for (let i = 0, ii = f !== 0 ? 12 : 16; i < ii; i++) {
         coords.push(reader.readCoordinate());
       }
       const ci = colors.length;
-      for (i = 0, ii = f !== 0 ? 2 : 4; i < ii; i++) {
+      for (let i = 0, ii = f !== 0 ? 2 : 4; i < ii; i++) {
         colors.push(reader.readComponents());
       }
-      var tmp1, tmp2, tmp3, tmp4;
+      let tmp1, tmp2, tmp3, tmp4;
       switch (f) {
         // prettier-ignore
         case 0:
@@ -897,7 +895,7 @@ Shadings.Mesh = (function MeshClosure() {
         decodeType4Shading(this, reader);
         break;
       case ShadingType.LATTICE_FORM_MESH:
-        var verticesPerRow = dict.get("VerticesPerRow") | 0;
+        const verticesPerRow = dict.get("VerticesPerRow") | 0;
         if (verticesPerRow < 2) {
           throw new FormatError("Invalid VerticesPerRow");
         }
```
2021-03-14 11:43:05 +01:00
Jonas Jenwald
5b5061afa8 Enable the ESLint no-var rule globally
A significant portion of the code-base has now been converted to use `let`/`const`, rather than `var`, hence it should be possible to simply enable the ESLint `no-var` rule globally.
This way we can ensure that new code won't accidentally use `var`, and it also removes the need to manually enable the rule in various folders.

Obviously it makes sense to continue the efforts to replace `var`, but that should probably happen on a file and/or folder basis.

Please note that this patch excludes the following code:
 - The `extensions/` folder, since that seemed easiest for now (and I don't know exactly what the support situation is for the Chromium-extension).

 - The entire `external/` folder is ignored, since most of it's currently excluded from linting.
   For the code that isn't imported from elsewhere (and should be ignored), we should probably (at some point) bring the code up to the same linting/formatting standard as the rest of the code-base.

 - Various files in the `test/` folder are ignored, as necessary, since the way that a lot of this code is loaded will require some care (or perhaps larger re-factoring) when removing `var` usage.
2021-03-13 16:12:53 +01:00
Jonas Jenwald
ab91f42a5e Convert code in src/core/type1_parser.js to use "standard" classes
All of this code predates the existence of native JS classes, however we can now clean this up a little bit.
2021-03-12 12:16:50 +01:00
Jonas Jenwald
8fc8dc020e Enable the ESLint no-var rule in the src/core/type1_parser.js file
Note that the majority of these changes were done automatically, by using `gulp lint --fix`, and the manual changes were limited to the following diff:

```diff
diff --git a/src/core/type1_parser.js b/src/core/type1_parser.js
index 192781de1..05c5fe2e5 100644
--- a/src/core/type1_parser.js
+++ b/src/core/type1_parser.js
@@ -251,7 +251,7 @@ const Type1CharString = (function Type1CharStringClosure() {
               // vhea tables reconstruction -- ignoring it.
               this.stack.pop(); // wy
               wx = this.stack.pop();
-              var sby = this.stack.pop();
+              const sby = this.stack.pop();
               sbx = this.stack.pop();
               this.lsb = sbx;
               this.width = wx;
@@ -263,8 +263,8 @@ const Type1CharString = (function Type1CharStringClosure() {
                 error = true;
                 break;
               }
-              var num2 = this.stack.pop();
-              var num1 = this.stack.pop();
+              const num2 = this.stack.pop();
+              const num1 = this.stack.pop();
               this.stack.push(num1 / num2);
               break;
             case (12 << 8) + 16: // callothersubr
@@ -273,7 +273,7 @@ const Type1CharString = (function Type1CharStringClosure() {
                 break;
               }
               subrNumber = this.stack.pop();
-              var numArgs = this.stack.pop();
+              const numArgs = this.stack.pop();
               if (subrNumber === 0 && numArgs === 3) {
                 const flexArgs = this.stack.splice(this.stack.length - 17, 17);
                 this.stack.push(
@@ -397,9 +397,9 @@ const Type1Parser = (function Type1ParserClosure() {
     if (discardNumber >= data.length) {
       return new Uint8Array(0);
     }
+    const c1 = 52845,
+      c2 = 22719;
     let r = key | 0,
-      c1 = 52845,
-      c2 = 22719,
       i,
       j;
     for (i = 0; i < discardNumber; i++) {
@@ -416,9 +416,9 @@ const Type1Parser = (function Type1ParserClosure() {
   }

   function decryptAscii(data, key, discardNumber) {
-    let r = key | 0,
-      c1 = 52845,
+    const c1 = 52845,
       c2 = 22719;
+    let r = key | 0;
     const count = data.length,
       maybeLength = count >>> 1;
     const decrypted = new Uint8Array(maybeLength);
@@ -429,7 +429,7 @@ const Type1Parser = (function Type1ParserClosure() {
         continue;
       }
       i++;
-      var digit2;
+      let digit2;
       while (i < count && !isHexDigit((digit2 = data[i]))) {
         i++;
       }
@@ -599,7 +599,7 @@ const Type1Parser = (function Type1ParserClosure() {
               if (token !== "/") {
                 continue;
               }
-              var glyph = this.getToken();
+              const glyph = this.getToken();
               length = this.readInt();
               this.getToken(); // read in 'RD' or '-|'
               data = length > 0 ? stream.getBytes(length) : new Uint8Array(0);
@@ -638,7 +638,7 @@ const Type1Parser = (function Type1ParserClosure() {
           case "OtherBlues":
           case "FamilyBlues":
           case "FamilyOtherBlues":
-            var blueArray = this.readNumberArray();
+            const blueArray = this.readNumberArray();
             // *Blue* values may contain invalid data: disables reading of
             // those values when hinting is disabled.
             if (
@@ -672,7 +672,7 @@ const Type1Parser = (function Type1ParserClosure() {
       }

       for (let i = 0; i < charstrings.length; i++) {
-        glyph = charstrings[i].glyph;
+        const glyph = charstrings[i].glyph;
         encoded = charstrings[i].encoded;
         const charString = new Type1CharString();
         const error = charString.convert(
@@ -728,12 +728,12 @@ const Type1Parser = (function Type1ParserClosure() {
         token = this.getToken();
         switch (token) {
           case "FontMatrix":
-            var matrix = this.readNumberArray();
+            const matrix = this.readNumberArray();
             properties.fontMatrix = matrix;
             break;
           case "Encoding":
-            var encodingArg = this.getToken();
-            var encoding;
+            const encodingArg = this.getToken();
+            let encoding;
             if (!/^\d+$/.test(encodingArg)) {
               // encoding name is specified
               encoding = getEncoding(encodingArg);
@@ -764,7 +764,7 @@ const Type1Parser = (function Type1ParserClosure() {
             properties.builtInEncoding = encoding;
             break;
           case "FontBBox":
-            var fontBBox = this.readNumberArray();
+            const fontBBox = this.readNumberArray();
             // adjusting ascent/descent
             properties.ascent = Math.max(fontBBox[3], fontBBox[1]);
             properties.descent = Math.min(fontBBox[1], fontBBox[3]);
```
2021-03-12 12:05:48 +01:00
Jonas Jenwald
82062f7e0d Enable the ESLint no-var rule in the src/core/cff_parser.js file
Note that the majority of these changes were done automatically, by using `gulp lint --fix`, and the manual changes were limited to the following diff:

```diff
diff --git a/src/core/cff_parser.js b/src/core/cff_parser.js
index d684c200e..2e2b811e4 100644
--- a/src/core/cff_parser.js
+++ b/src/core/cff_parser.js
@@ -555,7 +555,7 @@ const CFFParser = (function CFFParserClosure() {
           stackSize %= 2;
           validationCommand = CharstringValidationData[value];
         } else if (value === 10 || value === 29) {
-          var subrsIndex;
+          let subrsIndex;
           if (value === 10) {
             subrsIndex = localSubrIndex;
           } else {
@@ -886,15 +886,15 @@ const CFFParser = (function CFFParserClosure() {
         format = bytes[pos++];
         switch (format & 0x7f) {
           case 0:
-            var glyphsCount = bytes[pos++];
+            const glyphsCount = bytes[pos++];
             for (i = 1; i <= glyphsCount; i++) {
               encoding[bytes[pos++]] = i;
             }
             break;

           case 1:
-            var rangesCount = bytes[pos++];
-            var gid = 1;
+            const rangesCount = bytes[pos++];
+            let gid = 1;
             for (i = 0; i < rangesCount; i++) {
               const start = bytes[pos++];
               const left = bytes[pos++];
@@ -938,7 +938,7 @@ const CFFParser = (function CFFParserClosure() {
           }
           break;
         case 3:
-          var rangesCount = (bytes[pos++] << 8) | bytes[pos++];
+          const rangesCount = (bytes[pos++] << 8) | bytes[pos++];
           for (i = 0; i < rangesCount; ++i) {
             let first = (bytes[pos++] << 8) | bytes[pos++];
             if (i === 0 && first !== 0) {
@@ -1173,7 +1173,7 @@ class CFFDict {
   }
 }

-var CFFTopDict = (function CFFTopDictClosure() {
+const CFFTopDict = (function CFFTopDictClosure() {
   const layout = [
     [[12, 30], "ROS", ["sid", "sid", "num"], null],
     [[12, 20], "SyntheticBase", "num", null],
@@ -1229,7 +1229,7 @@ var CFFTopDict = (function CFFTopDictClosure() {
   return CFFTopDict;
 })();

-var CFFPrivateDict = (function CFFPrivateDictClosure() {
+const CFFPrivateDict = (function CFFPrivateDictClosure() {
   const layout = [
     [6, "BlueValues", "delta", null],
     [7, "OtherBlues", "delta", null],
@@ -1265,11 +1265,12 @@ var CFFPrivateDict = (function CFFPrivateDictClosure() {
   return CFFPrivateDict;
 })();

-var CFFCharsetPredefinedTypes = {
+const CFFCharsetPredefinedTypes = {
   ISO_ADOBE: 0,
   EXPERT: 1,
   EXPERT_SUBSET: 2,
 };
+
 class CFFCharset {
   constructor(predefined, format, charset, raw) {
     this.predefined = predefined;
@@ -1695,7 +1696,7 @@ class CFFCompiler {
             // For offsets we just insert a 32bit integer so we don't have to
             // deal with figuring out the length of the offset when it gets
             // replaced later on by the compiler.
-            var name = dict.keyToNameMap[key];
+            const name = dict.keyToNameMap[key];
             // Some offsets have the offset and the length, so just record the
             // position of the first one.
             if (!offsetTracker.isTracking(name)) {
```
2021-03-12 12:00:38 +01:00
Jonas Jenwald
f3948aeb90 Enable the ESLint no-var rule in the src/core/font_renderer.js file
Note that the majority of these changes were done automatically, by using `gulp lint --fix`, and the manual changes were limited to the following diff:

```diff
diff --git a/src/core/font_renderer.js b/src/core/font_renderer.js
index e1538c481..00f5424cd 100644
--- a/src/core/font_renderer.js
+++ b/src/core/font_renderer.js
@@ -152,9 +152,9 @@ const FontRendererFactory = (function FontRendererFactoryClosure() {
   }

   function lookupCmap(ranges, unicode) {
-    let code = unicode.codePointAt(0),
-      gid = 0;
-    let l = 0,
+    const code = unicode.codePointAt(0);
+    let gid = 0,
+      l = 0,
       r = ranges.length - 1;
     while (l < r) {
       const c = (l + r + 1) >> 1;
@@ -199,7 +199,7 @@ const FontRendererFactory = (function FontRendererFactoryClosure() {
         flags = (code[i] << 8) | code[i + 1];
         const glyphIndex = (code[i + 2] << 8) | code[i + 3];
         i += 4;
-        var arg1, arg2;
+        let arg1, arg2;
         if (flags & 0x01) {
           arg1 = ((code[i] << 24) | (code[i + 1] << 16)) >> 16;
           arg2 = ((code[i + 2] << 24) | (code[i + 3] << 16)) >> 16;
@@ -366,7 +366,7 @@ const FontRendererFactory = (function FontRendererFactoryClosure() {
       while (i < code.length) {
         let stackClean = false;
         let v = code[i++];
-        var xa, xb, ya, yb, y1, y2, y3, n, subrCode;
+        let xa, xb, ya, yb, y1, y2, y3, n, subrCode;
         switch (v) {
           case 1: // hstem
             stems += stack.length >> 1;
@@ -494,7 +494,7 @@ const FontRendererFactory = (function FontRendererFactoryClosure() {
                 bezierCurveTo(xa, y2, xb, y3, x, y);
                 break;
               case 37: // flex1
-                var x0 = x,
+                const x0 = x,
                   y0 = y;
                 xa = x + stack.shift();
                 ya = y + stack.shift();
```
2021-03-12 11:57:27 +01:00
Calixte Denizet
3243672727 XFA - Create Form DOM in merging template and data trees
- Spec: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=171;
  - support for the 2 ways of merging: consumeData and matchTemplate;
  - create additional nodes in template DOM when occur node allows it;
  - support for global values in data DOM.
2021-03-08 14:10:30 +01:00
Tim van der Meij
5828ff6cb0
Implement rendering line annotations without appearance stream 2021-02-28 18:57:58 +01:00
Tim van der Meij
d6e0b2d92e
Merge pull request #13032 from Snuffleupagus/parseDestDictionary-actionName-warn
Don't warn about actions that require scripting support in `Catalog.parseDestDictionary`
2021-02-28 14:52:06 +01:00
Jonas Jenwald
39cf4a0008 Don't warn about actions that require scripting support in Catalog.parseDestDictionary
Now that we have scripting support, warning about e.g. JavaScript actions doesn't seem necessary anymore. Especially considering that scripting-related actions are/will not be parsed by the `Catalog.parseDestDictionary` method anyway, since it's intended for handling "simple" actions.
2021-02-28 13:13:17 +01:00
Tim van der Meij
fa6cebf045
Implement rendering square/circle annotations without appearance stream 2021-02-27 19:05:12 +01:00
Jonas Jenwald
05de20071a Modernize some of the code in src/core/cmap.js by using classes and async/await
This converts a couple of our old "classes" to proper ECMAScript classes, and replaces a lot of manual Promise-wrapping with async/await instead.
2021-02-27 14:20:43 +01:00
Tim van der Meij
4e96d59fca
Use a buffer instead of string concatenation in reverseIfRtl in src/core/unicode.js
This avoids creating intermediate strings and should be slightly more
efficient.
2021-02-27 13:20:09 +01:00
Tim van der Meij
24f80f1e38
Enable the no-var linting rule in src/core/primitives.js 2021-02-27 12:51:01 +01:00
Tim van der Meij
ed33727419
Enable the no-var linting rule in src/core/glyphlist.js 2021-02-27 12:46:57 +01:00
Tim van der Meij
e051d4d029
Enable the no-var linting rule in src/core/ccitt_stream.js 2021-02-27 12:44:55 +01:00
Tim van der Meij
0897dddbbe
Enable the no-var linting rule in src/core/unicode.js 2021-02-27 12:44:50 +01:00
Tim van der Meij
cb82dda755
Enable the no-var linting rule in src/core/metrics.js 2021-02-27 12:44:45 +01:00
Tim van der Meij
55786a4880
Merge pull request #13026 from Snuffleupagus/crypto-classes
Convert code in `src/core/crypto.js` to use "normal" classes
2021-02-26 22:39:30 +01:00
Jonas Jenwald
6b4c4f80e4 Convert code in src/core/crypto.js to use "normal" classes
All of this code predates the existence of native JS classes, however we can now clean this up a bit. This patch thus let us remove some variable "shadowing" from the code.
2021-02-26 15:51:45 +01:00
Jonas Jenwald
b884757873 Inline the concatArrays function in calculatePDF20Hash
This helper function is first of all only called *twice*, and secondly it also leads to unnecessary intermediate allocations given how the `TypedArray`s are handled.
Hence we can simply inline this small function, and thus directly allocate the combined `TypedArray` instead.
2021-02-26 15:51:39 +01:00
Jonas Jenwald
9a9a5b2365 Replace the compareByteArrays functions, in src/core/crypto.js, with the isArrayEqual helper function
The `compareByteArrays` is first of all duplicated in multiple closures in the `src/core/crypto.js` file. Secondly, despite its name, it's also functionally equivalent to the now existing `isArrayEqual` helper function.

The `isArrayEqual` helper function is changed to use a standard `for`-loop, rather than `Array.prototype.every`, since that ought to be slightly more efficient given that we're now using it with (potentially) larger data.
2021-02-26 15:51:32 +01:00
Jonas Jenwald
e69e8622a9 Convert code in src/core/function.js to use "normal" classes
All of this code predates the existence of native JS classes, however we can now clean this up a bit. This patch thus let us remove some variable "shadowing" from the code.
2021-02-26 13:20:59 +01:00
calixteman
45329af926
XFA -- Add support for SOM expressions (#12983)
- specifications: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=87;
 - add a parser for SOM expressions;
 - add search functions to resolve those expressions;
 - search functions will be used to bind data into template.
2021-02-24 10:13:02 +01:00
Jonas Jenwald
e9038cc3d1 Send the AnnotationStorage-data to the worker-thread as a Map
Rather than converting the `AnnotationStorage`-data to an Object, before sending it to the worker-thread, we should be able to simply send the internal `Map` directly.
The "structured clone algorithm" doesn't have a problem with `Map`s, however the `LoopbackPort` used when workers are *disabled* (e.g. in Node.js environments) didn't use to support them. With PR 12997 having lifted that restriction, we should now be able to simply send the `AnnotationStorage`-data as-is rather than having to iterate through it to first create an Object.

*Please note:* The changes in `src/core/annotation.js` could have been a lot more compact if we were able to use optional chaining in the `src/core` folder. Unfortunately that's still not possible, since SystemJS is being used in the development viewer (i.g. `gulp server`) and fixing that is *still* blocked by [bug 1247687](https://bugzilla.mozilla.org/show_bug.cgi?id=1247687).
2021-02-18 17:13:43 +01:00
calixteman
0fa9976268
XFA - Add support for prototypes (#12979)
- specifications: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=225&zoom=auto,-207,784
 - add a clone method on nodes in order to be able to clone a proto;
 - support ids in template namespace;
 - prevent from cycle when applying protos.
2021-02-18 10:32:25 +01:00
Tim van der Meij
4619b1b568
Merge pull request #12997 from Snuffleupagus/metadata-worker
Move the Metadata parsing to the worker-thread
2021-02-17 20:57:46 +01:00
calixteman
b5be515375
XFA - Add a lexer/parser for FormCalc language (#12936)
- the language specifications are: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=1049
 - it can be used to:
   * as a scripting language for calculation, validations, ...
   * in SOM expressions to select nodes: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=101
2021-02-17 20:28:06 +01:00
Jonas Jenwald
d366bbdf51 Move the encodeToXmlString helper function to src/core/core_utils.js
With the previous patch this function is now *only* accessed on the worker-thread, hence it's no longer necessary to include it in the *built* `pdf.js` file.
2021-02-17 13:12:01 +01:00
Jonas Jenwald
b66f294f64 Move the XML-parser to the src/core/-folder
With the previous patch this functionality is now *only* accessed on the worker-thread, hence it's no longer necessary to include it in the *built* `pdf.js` file.
2021-02-17 13:12:01 +01:00
Jonas Jenwald
cc3a6563ee Move the Metadata parsing to the worker-thread
The only reason, as far as I can tell, for parsing the Metadata on the main-thread is how it was originally implemented. When Metadata support was first implemented, it utilized the [`DOMParser`](https://developer.mozilla.org/en-US/docs/Web/API/DOMParser) which isn't available in workers.
Today, with the custom XML-parser being used, that's no longer an issue and it seems reasonable to move the Metadata parsing to the worker-thread[1], since that's where all parsing should happen (for performance reasons).

Based on these changes, we'll be able to reduce the now unnecessary duplication of the XML-parser (and related code) in both of the *built* `pdf.js`/`pdf.worker.js` files.

Finally, this patch changes the `_repair` method to use "Array + join" rather than string concatenation.

---
[1] This needed the previous patch, to enable sending of `Map`s between threads with workers disabled.
2021-02-17 13:12:01 +01:00
Calixte Denizet
0fc8267576 Avoid infinite loop when getting annotation field name
- aims to fix issue #12963;
 - use a Set to track already visited objects;
 - remove the loop limit in getInheritableProperty and use a RefSet too.
2021-02-14 19:58:19 +01:00
Jonas Jenwald
1ee747a620 Remove unneeded instanceof MissingDataException checks
The following checks are all unneeded, and could easily cause confusion when reading the code. (All of them are my fault as well, since I've sometimes added those checks without really thinking about the surrounding code.)

 - In `PartialEvaluator.hasBlendModes` there cannot be any `MissingDataException`s thrown, given that the `Page.getOperatorList` method waits for all the necessary /Resources to load first. Furthermore, note also that if an error is thrown from `PartialEvaluator.hasBlendModes` then it'd completely break rendering of that page, since any errors thrown from `Page.getOperatorList` are simply sent to the main-thread.

 - In `PartialEvaluator.handleColorN` there cannot be any `MissingDataException`s thrown, given that again the `Page.getOperatorList` method waits for all the necessary /Resources to load before operatorList parsing starts.

 - In `XRef.readXRef` there cannot be any `MissingDataException`s thrown, given that we're *explicitly* requesting (and waiting for) the entire document in `pdfManagerReady` (in `src/core/worker.js`) before re-parsing of a corrupt document starts.
2021-02-13 12:26:05 +01:00
Calixte Denizet
ea06bb0e36 [api-minor] Annotation -- Don't compute appearance when nothing has changed
* don't set a value in annotationStorage by default:
   - having an undefined when the annotation is rendered for saving/printing means nothing has changed so use normal appearance
   - aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1681687
 * change the way to compute font size when this one is null in DA:
   - make fontSize proportional to line height
   - in multiline case, take into account the number of lines for text entered to adapt the font size
2021-02-12 19:27:21 +01:00
calixteman
0479deef4e
XFA -- Add other objects (#12949)
- connectionSet: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=969
 - datasets: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=1038
  - signature: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=1040
  - stylesheet: the same
  - xhtml: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=1187
2021-02-11 12:30:37 +01:00
calixteman
3787bd41ef
XFA -- Add localset object (#12948)
- Specifications: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=943
2021-02-10 18:04:43 +01:00
Jonas Jenwald
0068dba009 [api-minor] Rename -es5 to -legacy, to reduce confusion over what's actually supported (issue 12976)
*Please note that this will also require some edits of the Wiki.*
2021-02-10 16:01:59 +01:00
Jonas Jenwald
31098c404d
Use Math.hypot, instead of Math.sqrt with manual squaring (#12973)
When the PDF.js project started `Math.hypot` didn't exist yet, and until recently we still supported browsers (IE 11) without a native `Math.hypot` implementation; please see this compatibility information: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/hypot#browser_compatibility

Furthermore, somewhat recently there were performance improvements of `Math.hypot` in Firefox; see https://bugzilla.mozilla.org/show_bug.cgi?id=1648820

Finally, this patch also replaces a couple of multiplications with the exponentiation operator.
2021-02-10 12:28:49 +01:00
Jonas Jenwald
e6fe8a7d53 Handle errors gracefully, in PartialEvaluator.translateFont, when fetching the font file (issue 9462)
The *third* page of the referenced PDF document currently fails to render completely, since one of its font files fail to load.
Since that error isn't handled, a large part of the text is thus missing which looks quite bad. By "replacing" the font data with an *empty* stream, we'll thus be able to fallback to rendering the text with a standard font (instead of using `ErrorFont`). While there's obviously no guarantee that things will look perfect, actually rendering the text at all should be an improvement in general.

Also, print a warning in `PartialEvaluator.loadFont` when the `PartialEvaluator.translateFont` method rejects, since that'd have helped debug/fix the issue faster.
2021-02-06 19:44:53 +01:00
Jonas Jenwald
d3e65f24e3 Request all data, rather than throwing, when encountering general errors in ObjectLoader._walk (issue 9462, PR 3289 follow-up)
*As far as I can tell, this has been broken ever since PR 3289 (back in 2013) without anyone noticing.*

For any non-`MissingDataException` errors encountered in `ObjectLoader._walk`, we're simply throwing immediately which thus has the potential to *completely* break rendering of an entire page.
In practice this is obviously only an issue for PDF documents which are in one way or another corrupt, since that's the only way that `XRef.fetch` will throw non-`MissingDataException` errors. To make matters worse these errors are *intermittent*, since they can only occur if the document is still loading when the `ObjectLoader`-code runs (note the early return in `ObjectLoader.load`).

Please note that we cannot simply catch the error and let "normal" parsing continue in `ObjectLoader._walk`, since that could lead to errors elsewhere given that resources "below" the current one (in the graph) might not be checked as intended then.
All-in-all, the only way to make absolutely sure that we won't cause *unexpected* `MissingDataException`s somewhere else in the code-base is to fallback to fetching the *entire* document in this edge-case.
2021-02-06 14:33:50 +01:00
Brendan Dahl
a392082e30
Merge pull request #12944 from calixteman/xfa_config
XFA -- Update config object
2021-02-05 15:06:09 -08:00