Commit Graph

35 Commits

Author SHA1 Message Date
Jonas Jenwald
89785a23f3 Convert Metadata to use private class fields
Please refer to https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Classes/Private_class_fields
2021-10-22 22:01:19 +02:00
Jonas Jenwald
a0e584eeb2 Replace the objectFromEntries helper function with an objectFromMap one instead
Given that it's only used with `Map`s, and that it's currently implemented in such a way that we (indirectly) must iterate through the data *twice*, some simplification cannot hurt here.
Note that the only reason that we're not using `Object.fromEntries(...)` directly, at each call-site, is that that one won't guarantee that a `null` prototype is being used.
2021-03-11 16:37:34 +01:00
Jonas Jenwald
cc3a6563ee Move the Metadata parsing to the worker-thread
The only reason, as far as I can tell, for parsing the Metadata on the main-thread is how it was originally implemented. When Metadata support was first implemented, it utilized the [`DOMParser`](https://developer.mozilla.org/en-US/docs/Web/API/DOMParser) which isn't available in workers.
Today, with the custom XML-parser being used, that's no longer an issue and it seems reasonable to move the Metadata parsing to the worker-thread[1], since that's where all parsing should happen (for performance reasons).

Based on these changes, we'll be able to reduce the now unnecessary duplication of the XML-parser (and related code) in both of the *built* `pdf.js`/`pdf.worker.js` files.

Finally, this patch changes the `_repair` method to use "Array + join" rather than string concatenation.

---
[1] This needed the previous patch, to enable sending of `Map`s between threads with workers disabled.
2021-02-17 13:12:01 +01:00
Jonas Jenwald
b26c7974fe [api-minor] Change the dc:subject Metadata field to an Array
This patch simply extends the existing handling of the `dc:creator` field, which should hopefully suffice here; please refer to https://wwwimages2.adobe.com/content/dam/acom/en/devnet/xmp/pdfs/XMP%20SDK%20Release%20cc-2016-08/XMPSpecificationPart1.pdf#page=34
2021-02-14 17:16:40 +01:00
Jonas Jenwald
298ee5cfbb Replace some ternary operators with optional chaining, and nullish coalescing, in the src/display/-folder
This way, we can further reduce unnecessary code-repetition in some cases.
2021-01-19 17:20:02 +01:00
Calixte Denizet
43d5512f5c [api-minor] Change the "dc:creator" Metadata field to an Array
- add scripting support for doc.info.authors
 - doc.info.metadata is the raw string with xml code
2021-01-11 21:34:07 +01:00
Jonas Jenwald
820fb7f969 Update all Object.fromEntries call-sites to ensure that a null prototype is used
Given that `Object.fromEntries` doesn't seem to *guarantee* that a `null` prototype is used, we thus hack around that by using `Object.assign` with `Object.create(null)`.
2020-10-28 14:43:44 +01:00
calixteman
68b99c59ee
Save form data in XFA datasets when pdf is a mix of acroforms and xfa (#12344)
* Move display/xml_parser.js in shared to use it in worker

* Save form data in XFA datasets when pdf is a mix of acroforms and xfa

Co-authored-by: Brendan Dahl <brendan.dahl@gmail.com>
2020-09-08 15:13:52 -07:00
Jonas Jenwald
16fa9dc4ea Add support for Object.fromEntries
This provides a simpler way of creating an `Object` from e.g. a `Map`, without having to manually iterate over it.
Please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/fromEntries
2020-08-06 14:39:51 +02:00
Jonas Jenwald
426945b480 Update Prettier to version 2.0
Please note that these changes were done automatically, using `gulp lint --fix`.

Given that the major version number was increased, there's a fair number of (primarily whitespace) changes; please see https://prettier.io/blog/2020/03/21/2.0.0.html
In order to reduce the size of these changes somewhat, this patch maintains the old "arrowParens" style for now (once mozilla-central updates Prettier we can simply choose the same formatting, assuming it will differ here).
2020-04-14 12:28:14 +02:00
Jonas Jenwald
5cdfff4a47 Re-factor how Metadata class instances store its data internally
Please note that these changes do *not* affect the *public* interface of the `Metadata` class, but only touches internal structures.[1]

These changes were prompted by looking at the `getAll` method, which simply returns the "private" metadata object to the consumer. This seems wrong conceptually, since it allows way too easy/accidental changes to the internal parsed metadata.
As part of fixing this, the internal metadata was changed to use a `Map` rather than a plain Object.

---
[1] Basically, we shouldn't need to worry about someone depending on internal implementation details.
2020-02-13 18:23:15 +01:00
Jonas Jenwald
3f1568b51a A couple of small improvements of the Metadata._repair method
- Remove the "capturing group" in the regular expression that removes leading "junk" from the raw metadata, since it's not necessary here (it's simply a case of too much copy-pasting in a prior patch).
   According to [MDN](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions/Cheatsheet#Groups_and_ranges) you want to, for performance reasons, avoid "capturing groups" unless actually needed.

 - Add inline comments to document a bunch of magic values in the code.
2020-02-13 17:20:52 +01:00
Jonas Jenwald
9e262ae7fa Enable the ESLint prefer-const rule globally (PR 11450 follow-up)
Please find additional details about the ESLint rule at https://eslint.org/docs/rules/prefer-const

With the recent introduction of Prettier this sort of mass enabling of ESLint rules becomes a lot easier, since the code will be automatically reformatted as necessary to account for e.g. changed line lengths.

Note that this patch is generated automatically, by using the ESLint `--fix` argument, and will thus require some additional clean-up (which is done separately).
2020-01-25 00:20:22 +01:00
Jonas Jenwald
36881e3770 Ensure that all import and require statements, in the entire code-base, have a .js file extension
In order to eventually get rid of SystemJS and start using native `import`s instead, we'll need to provide "complete" file identifiers since otherwise there'll be MIME type errors when attempting to use `import`.
2020-01-04 13:01:43 +01:00
Jonas Jenwald
de36b2aaba Enable auto-formatting of the entire code-base using Prettier (issue 11444)
Note that Prettier, purposely, has only limited [configuration options](https://prettier.io/docs/en/options.html). The configuration file is based on [the one in `mozilla central`](https://searchfox.org/mozilla-central/source/.prettierrc) with just a few additions (to avoid future breakage if the defaults ever changes).

Prettier is being used for a couple of reasons:

 - To be consistent with `mozilla-central`, where Prettier is already in use across the tree.

 - To ensure a *consistent* coding style everywhere, which is automatically enforced during linting (since Prettier is used as an ESLint plugin). This thus ends "all" formatting disussions once and for all, removing the need for review comments on most stylistic matters.

Many ESLint options are now redundant, and I've tried my best to remove all the now unnecessary options (but I may have missed some).
Note also that since Prettier considers the `printWidth` option as a guide, rather than a hard rule, this patch resorts to a small hack in the ESLint config to ensure that *comments* won't become too long.

*Please note:* This patch is generated automatically, by appending the `--fix` argument to the ESLint call used in the `gulp lint` task. It will thus require some additional clean-up, which will be done in a *separate* commit.

(On a more personal note, I'll readily admit that some of the changes Prettier makes are *extremely* ugly. However, in the name of consistency we'll probably have to live with that.)
2019-12-26 12:34:24 +01:00
Jonas Jenwald
9f45f8dfda When parsing Metadata, attempt to remove "junk" before the first tag (PR 10398 follow-up)
This will allow the Metadata to be successfully extracted from the PDF file in issue 10395.
Furthermore, this patch also fixes a bug in `Metadata.get` which causes the method to return `null` rather than an empty string or zero (since either ought to be allowed).
2019-01-16 12:44:27 +01:00
Yury Delendik
655c8d34d0 New XML parser 2018-03-19 20:51:41 -05:00
Jonas Jenwald
ad5ed37059 Handle broken, Ghostscript generated, Metadata that contains HTML character names (bug 1424938)
Please note that while this could be considered a regression in user-facing behaviour, I'm not convinced that it's really a regression as such since prior to PR 8912 the Metadata would fail to parse (with an XML error) and thus be ignored when setting the viewer title.
With the refactored Metadata parsing we're now able to parse this, which uncovered issues with a subset of broken Ghostscript Metadata that uses HTML character names.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1424938
2017-12-13 14:32:47 +01:00
Jonas Jenwald
0f90d5130c [api-major] Remove all remaining deprecated functions/methods 2017-10-15 13:27:10 +02:00
Tim van der Meij
d4309614f9
Replace DOMParser with SimpleXMLParser
The `DOMParser` is most likely overkill and may be less secure.
Moreover, it is not supported in Node.js environments.

This patch replaces the `DOMParser` with a simple XML parser. This
should be faster and gives us Node.js support for free. The simple XML
parser is a port of the one that existed in the examples folder with a
small regex fix to make the parsing work correctly.

The unit tests are extended for increased test coverage of the metadata
code. The new method `getAll` is provided so the example does not have
to access internal properties of the object anymore.
2017-09-19 23:09:07 +02:00
Tim van der Meij
bc9afdf3c4
Convert src/display/metadata.js to ES6 syntax 2017-09-19 22:13:59 +02:00
Yury Delendik
d028c26210 Removes error() 2017-07-07 09:40:24 -05:00
Jonas Jenwald
f20d2cd2ae Fix inconsistent spacing and trailing commas in objects in remaining src/ files, so we can enable the comma-dangle and object-curly-spacing ESLint rules later on
http://eslint.org/docs/rules/comma-dangle
http://eslint.org/docs/rules/object-curly-spacing

Given that we currently have quite inconsistent object formatting, fixing this in *one* big patch probably wouldn't be feasible (since I cannot imagine anyone wanting to review that); hence I've opted to try and do this piecewise instead.

Please note: This patch was created automatically, using the ESLint `--fix` command line option. In a couple of places this caused lines to become too long, and I've fixed those manually; please refer to the interdiff below for the only hand-edits in this patch.

```diff
diff --git a/src/display/canvas.js b/src/display/canvas.js
index 5739f6f2..4216b2d2 100644
--- a/src/display/canvas.js
+++ b/src/display/canvas.js
@@ -2071,7 +2071,7 @@ var CanvasGraphics = (function CanvasGraphicsClosure() {
       var map = [];
       for (var i = 0, ii = positions.length; i < ii; i += 2) {
         map.push({ transform: [scaleX, 0, 0, scaleY, positions[i],
-                 positions[i + 1]], x: 0, y: 0, w: width, h: height, });
+                   positions[i + 1]], x: 0, y: 0, w: width, h: height, });
       }
       this.paintInlineImageXObjectGroup(imgData, map);
     },
diff --git a/src/display/svg.js b/src/display/svg.js
index 9eb05dfa..2aa21482 100644
--- a/src/display/svg.js
+++ b/src/display/svg.js
@@ -458,7 +458,11 @@ SVGGraphics = (function SVGGraphicsClosure() {

       for (var x = 0; x < fnArrayLen; x++) {
         var fnId = fnArray[x];
-        opList.push({ 'fnId': fnId, 'fn': REVOPS[fnId], 'args': argsArray[x], });
+        opList.push({
+          'fnId': fnId,
+          'fn': REVOPS[fnId],
+          'args': argsArray[x],
+        });
       }
       return opListToTree(opList);
     },
```
2017-06-02 12:32:18 +02:00
Jonas Jenwald
52e3de3c0a Convert the files in the /src/display folder to ES6 modules
Also disables ES2015 transpilation in development mode.
2017-04-16 12:19:10 +02:00
Jonas Jenwald
c850968fa7 Remove globals that are now unnecessary thanks to the use of various ESLint environments (e.g. Node, ShellJS, Jasmine) 2016-12-16 21:09:55 +01:00
Jonas Jenwald
77bcc9232e Remove a misplaced false from a condition in fixMetadata, in metadata.js, since it currently short circuits the entire condition
This looks to me like a simple oversight, which has existed ever since PR 1598 all the way back in 2012.
2016-12-07 22:51:46 +01:00
Yury Delendik
1d12aed5ca Move all PDFJS.xxx settings into display/global. 2016-04-07 13:46:07 -05:00
Yury Delendik
bda5e6235e Removes global PDFJS usage from the src/core/. 2016-03-23 19:24:37 -05:00
Yury Delendik
2edf2792dc Replaces literal {} created lookup tables with Object.create 2016-01-28 12:18:38 -06:00
Yury Delendik
6b60c8f4db Adds UMD headers to core, display and shared files. 2015-12-15 13:24:39 -06:00
Jonas Jenwald
373da010ac Move the globals comments in bidi.js and metadata.js to after the Copyright comments 2015-11-21 18:43:08 +01:00
Manas
a2ba1b8189 Uses editorconfig to maintain consistent coding styles
Removes the following as they unnecessary
/* -*- Mode: Java; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
/* vim: set shiftwidth=2 tabstop=2 autoindent cindent expandtab: */
2015-11-14 07:32:18 +05:30
Jonas Jenwald
ec6ec13506 Add strict equalities in src/display/metadata.js 2014-08-01 12:40:16 +02:00
Tim van der Meij
7d86fa859f Making src/display/metadat.js adhere to the style guide 2014-03-09 12:41:01 +01:00
Brendan Dahl
5ecce4996b Split files into worker and main thread pieces. 2013-08-12 10:48:06 -07:00