Ignore invalid /Encoding-entries when parsing fonts (issue 14821)

In the referenced PDF document the fonts have /Encoding-entries that are Streams (containing completely bogus data), which are thus obviously not valid here. Hence, only when `ignoreErrors` is set, we'll now ignore these corrupt /Encoding-entries and fall back to the existing code to try and infer a usable encoding. Given that this is *clearly* a case of corrupt PDF documents, there's no guarantee that this will "fix" all such cases; however, it's the best that we can do here and shouldn't really be worse than ignoring an entire font.
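
The behaviour described above can be sketched in isolation as follows. This is only a minimal illustration, not the actual `PartialEvaluator` code: `resolveBaseEncodingName`, the `warnings` array, and the use of a `Map` as a stand-in for a PDF Dict are all hypothetical, and `FormatError`, `Name`, and `warn` are simplified local stubs rather than the pdf.js implementations.

```javascript
// Simplified local stubs (assumptions, not the real pdf.js classes).
class FormatError extends Error {
  constructor(msg) {
    super(msg);
    this.name = "FormatError";
  }
}

class Name {
  constructor(name) {
    this.name = name;
  }
}

// Collects warnings so that callers (and tests) can inspect them.
const warnings = [];
function warn(msg) {
  warnings.push(msg);
}

// Hypothetical helper: returns the base encoding name, or `null` when the
// /Encoding-entry is unusable and the caller should infer an encoding.
function resolveBaseEncodingName(encoding, { ignoreErrors = false } = {}) {
  if (encoding instanceof Map) {
    // Stand-in for a Dict: only look at its /BaseEncoding entry.
    const base = encoding.get("BaseEncoding");
    return base instanceof Name ? base.name : null;
  }
  if (encoding instanceof Name) {
    return encoding.name;
  }
  const msg = "Encoding is not a Name nor a Dict";

  if (!ignoreErrors) {
    // Strict mode: a corrupt entry is a hard error.
    throw new FormatError(msg);
  }
  // Lenient mode: record the problem and let the caller fall back.
  warn(msg);
  return null;
}
```

With `ignoreErrors` left unset, a bogus value (for example a Stream) still throws; with `ignoreErrors: true` the helper warns and returns `null`, leaving the caller free to infer an encoding from other font data.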
2022-04-22 11:40:13 +02:00
commit e723da7261
parent 452a98b0e0
4 changed files with 13 additions and 1 deletions
--- a/src/core/evaluator.js
+++ b/src/core/evaluator.js
@@ -3405,7 +3405,12 @@ class PartialEvaluator {
      } else if (encoding instanceof Name) {
        baseEncodingName = encoding.name;
      } else {
-        throw new FormatError("Encoding is not a Name nor a Dict");
+        const msg = "Encoding is not a Name nor a Dict";
+
+        if (!this.options.ignoreErrors) {
+          throw new FormatError(msg);
+        }
+        warn(msg);
      }
      // According to table 114 if the encoding is a named encoding it must be
      // one of these predefined encodings.
--- a/test/pdfs/.gitignore
+++ b/test/pdfs/.gitignore
@@ -129,6 +129,7 @@
 !asciihexdecode.pdf
 !bug766086.pdf
 !bug793632.pdf
+!issue14821.pdf
 !bug1020858.pdf
 !prefilled_f1040.pdf
 !bug1050040.pdf
--- a/test/pdfs/issue14821.pdf
+++ b/test/pdfs/issue14821.pdf
--- a/test/test_manifest.json
+++ b/test/test_manifest.json
@@ -3612,6 +3612,12 @@
       "rounds": 1,
       "type": "eq"
    },
+    {  "id": "issue14821",
+       "file": "pdfs/issue14821.pdf",
+       "md5": "ae77afb0f98c62e6b7fe7f912c84a75c",
+       "rounds": 1,
+       "type": "eq"
+    },
    {
      "id": "issue6165",
      "file": "pdfs/issue6165.pdf",