Commit Graph

858 Commits

Author SHA1 Message Date
calixteman
a9385bbb52
Revert "XFA - Fix the way to select page on breaking" 2021-06-21 15:45:04 +02:00
Calixte Denizet
7aea8faa34 XFA - Fix the way to select page on breaking
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1716838.
  - some fonts in the pdf in the bug where bold when they shouldn't so write the font properties in the html to avoid to use some wrong inherited ones.
2021-06-21 12:45:23 +02:00
Calixte Denizet
7cdbc98716 XFA - Match font family correctly
- partial fix for https://bugzilla.mozilla.org/show_bug.cgi?id=1716980;
  - some pdf can contain an invalid font family (e.g. 'Windings 3') so in this case remove the space;
  - the font family in typeface attribute doesn't always match the one defined in the FontDescriptor dictionary.
2021-06-20 15:16:28 +02:00
Calixte Denizet
df08b1548b XFA - Fix layout issues
- PR #13554 is buggy, so this patch aims to fix bugs.
  - check if a component fits into its parent in taking into account the parent layout.
  - introduce method isSplittable for template nodes to know if a component can be splitted in case of overflow.
2021-06-17 16:09:22 +02:00
Calixte Denizet
8eeb7ab4a3 XFA - Add the possibily to layout and measure text
- some containers doesn't always have their 2 dimensions and those dimensions re based on contents;
  - so in order to measure text, we must get the glyph widths (for the xfa fonts) before starting the layout;
  - implement a word-wrap algorithm;
  - handle font change during text layout.
2021-06-17 14:17:02 +02:00
Calixte Denizet
793a0156ce XFA - By default a text ui has only one line when in a field element
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1716809.
2021-06-16 20:18:29 +02:00
Calixte Denizet
d89c429d78 XFA - Handle maxChars property for text fields
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1716294.
2021-06-14 13:07:06 +02:00
Brendan Dahl
d333af7848
Merge pull request #13527 from calixteman/bind_inf_loop
XFA - Avoid infinite loop when creating some nodes in data
2021-06-09 12:37:29 -07:00
Brendan Dahl
aa2712744d
Merge pull request #13502 from calixteman/contentarea
XFA - contentarea must be on top of the other containers in a pageArea
2021-06-09 12:36:21 -07:00
Calixte Denizet
cddc1d869d XFA - Avoid infinite loop when creating some nodes in data 2021-06-09 19:07:59 +02:00
Jonas Jenwald
a01c599247 Cache the "raw" standard font data in the worker-thread (PR 12726 follow-up)
*This implementation is basically a copy of the pre-existing `builtInCMapCache` implementation.*

For some, badly generated, PDF documents it's possible that we'll end up having to fetch the *same* standard font data over and over (which is obviously inefficient).
While not common, it's certainly possible that a PDF document uses *custom* font names where the actual font then references one of the standard fonts; see e.g. issue 11399 for one such example.

Note that I did suggest adding worker-thread caching of standard font data in PR 12726, however it wasn't deemed necessary at the time. Now that we have a real-world example that benefit from caching, I think that we should simply implement this now.
2021-06-09 18:27:51 +02:00
Calixte Denizet
34a2fa72c7 XFA - Add Liberation-Sans font as a substitution for some missing fonts
- Some js files contain scale factors for each glyph in order to rescale Liberation to have a final font with the correct width.
  - A lot of XFA have some containers where their dimensions are based on their text content, so using default font from browser can lead to an almost unreadable pdf.
2021-06-09 16:55:45 +02:00
Calixte Denizet
1486608f32 XFA - contentarea must be on top of the other containers in a pageArea 2021-06-09 15:29:29 +02:00
Calixte Denizet
cfa727474e XFA - Fix layout issues (again)
- some elements weren't displayed because their rotation angle was not taken into account;
  - fix box model (XFA concept):
    - remove use of outline;
    - position correctly border which isn't part of box dimensions;
    - fix margins issues (see issue #13474).
  - move border on button instead of having it on wrapping div;
2021-06-08 17:42:53 +02:00
Jonas Jenwald
e7dc822e74
Merge pull request #12726 from brendandahl/standard-fonts
[api-minor] Include and use the 14 standard font files.
2021-06-08 10:09:40 +02:00
Brendan Dahl
4c1dd47e65 Include and use the 14 standard fonts files. 2021-06-07 11:10:11 -07:00
Calixte Denizet
5dc7f4ade8 XFA - CDATA can be xml so parse it when required 2021-06-07 10:38:39 +02:00
Calixte Denizet
112645ea3d XFA - Don't bind a form node with an empty value when the data node doesn't exist 2021-06-06 17:59:01 +02:00
Calixte Denizet
11573ddd16 XFA - Implement usehref support
- attribute 'use' was already implemented but not usehref
  - in general, usehref should make reference to current document
  - add support for SOM expressions in use and usehref to search a node.
  - get prototype for all nodes if any.
2021-06-04 14:57:05 +02:00
Jonas Jenwald
af78ba64bd Don't change options of the globally used PartialEvaluator in the "should render checkbox with fallback font for printing" unit-test
Given that the same `PartialEvaluator`-instance is used for a lot of these unit-tests, manually changing the options in any one test-case could lead to intermittently failing unit-tests since they're run in a random order.
To fix this, we simply have to use the existing method to clone the `PartialEvaluator`-instance but with the custom options.
2021-05-31 12:14:58 +02:00
Calixte Denizet
45c3f00a27 XFA - Move the fake HTML representation of XFA from the worker to the main thread
- the only goal of this patch is to be able to get synchronously the fake html when printing from firefox:
    - in order to print we need to inject some html in beforeprint callback but we cannot block in waiting for all the pages.
  - from a memory point of view: it doesn't change anything since the fake HTML is deleted in the worker;
  - this way we don't break any assumptions.
2021-05-25 19:33:07 +02:00
Calixte Denizet
7cebdbd58c XFA - Fix lot of layout issues
- I thought it was possible to rely on browser layout engine to handle layout stuff but it isn't possible
    - mainly because when a contentArea overflows, we must continue to layout in the next contentArea
    - when no more contentArea is available then we must go to the next page...
    - we must handle breakBefore and breakAfter which allows to "break" the layout to go to the next container
  - Sometimes some containers don't provide their dimensions so we must compute them in order to know where to put
    them in their parents but to compute those dimensions we need to layout the container itself...
  - See top of file layout.js for more explanations about layout.
  - fix few bugs in other places I met during my work on layout.
2021-05-25 17:51:36 +02:00
Tim van der Meij
d1d9b9043d
Merge pull request #13415 from Snuffleupagus/getDestination-out-of-order
Improve handling of named destinations in out-of-order NameTrees (PR 10274 follow-up)
2021-05-21 20:15:09 +02:00
Jonas Jenwald
8d5689387b Improve handling of named destinations in out-of-order NameTrees (PR 10274 follow-up)
According to the specification, see https://web.archive.org/web/20210404042322if_/https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.2384179, the keys of a NameTree/NumberTree should be ordered.
For corrupt PDF files, which violate this assumption, it's thus possible that trying to lookup a single entry fails.

Previously, in PR 10274, we implemented a fallback that only applies to the "bottom" node of a NameTree/NumberTree, which in general might not actually help for sufficiently corrupt NameTree/NumberTree data.
Instead we remove the current *limited* fallback from `NameOrNumberTree.get`, and defer to the call-site to handle this case explicitly e.g. by using `NameOrNumberTree.getAll` for data where that makes sense. For well-formed documents, these changes should *not* lead to any additional data fetching/parsing.

Finally, as part of these changes, the validation of named destination data is improved in the `Catalog` and a new unit-test is also added.
2021-05-21 15:48:37 +02:00
Jonas Jenwald
1a8d05fdcf Remove some, with Prettier 2.3.0, unnecessary // prettier-ignore comments
To get the maximum benefit from something like Prettier, you obviously don't want to disable the automatic formatting unless absolutely necessary. When we added Prettier there were a number of cases, mostly involving larger Arrays, which required disabling of the automatic formatting for overall readability and/or to not break inline comments.

With changes in Prettier version `2.3.0`, see [the release notes](https://prettier.io/blog/2021/05/09/2.3.0.html#concise-formatting-of-number-only-arrays-10106httpsgithubcomprettierprettierpull10106-10160httpsgithubcomprettierprettierpull10160-by-thorn0httpsgithubcomthorn0), there's now better formatting support for Arrays containing only numbers. Hence we can now remove a number of `// prettier-ignore` comments, and thus get the benefit of automatic formatting in (slightly) more of the code-base.
2021-05-19 11:36:03 +02:00
Calixte Denizet
4544ebf38a Handle PI with no value in xml parser
- an XML PI contains a target and optionally some content (see https://en.wikipedia.org/wiki/Processing_Instruction)
  - the parser expected to always have some content and so it could lead to wrong parsing.
2021-05-18 10:22:18 +02:00
Jonas Jenwald
8943bcd3c3 Account for formatting changes in Prettier version 2.3.0
With the exception of one tweaked `eslint-disable` comment, in `web/generic_scripting.js`, this patch was generated automatically using `gulp lint --fix`.

Please find additional information at:
 - https://github.com/prettier/prettier/releases/tag/2.3.0
 - https://prettier.io/blog/2021/05/09/2.3.0.html
2021-05-16 11:44:05 +02:00
Jonas Jenwald
757636d519 Convert the remaining functions in src/core/primitives.js to use standard classes
This patch was tested using the PDF file from issue 2618, i.e. https://bug570667.bugzilla-attachments.gnome.org/attachment.cgi?id=226471, with the following manifest file:
```
[
    {  "id": "issue2618",
       "file": "../web/pdfs/issue2618.pdf",
       "md5": "",
       "rounds": 50,
       "type": "eq"
    }
]
```

which gave the following results when comparing this patch against the `master` branch:
```
-- Grouped By browser, stat --
browser | stat         | Count | Baseline(ms) | Current(ms) | +/- |   %  | Result(P<.05)
------- | ------------ | ----- | ------------ | ----------- | --- | ---- | -------------
firefox | Overall      |    50 |         3417 |        3426 |   9 | 0.27 |
firefox | Page Request |    50 |            1 |           1 |   0 | 5.41 |
firefox | Rendering    |    50 |         3416 |        3426 |   9 | 0.27 |
```

Based on these results, there's no significant performance regression from using standard classes and this patch should thus be OK.
2021-05-12 09:36:28 +02:00
Jonas Jenwald
77b258440b Move some constants and helper functions from src/core/fonts.js and into their own file
- `FontFlags`, is used in both `src/core/fonts.js` and `src/core/evaluator.js`.
 - `getFontType`, same as the above.
 - `MacStandardGlyphOrdering`, is a fairly large data-structure and `src/core/fonts.js` is already a *very* large file.
 - `recoverGlyphName`, a dependency of `type1FontGlyphMapping`; please see below.
 - `SEAC_ANALYSIS_ENABLED`, is used by both `Type1Font`, `CFFFont`, and unit-tests; please see below.
 - `type1FontGlyphMapping`, is used by both `Type1Font` and `CFFFont` which a later patch will move to their own files.
2021-05-02 21:00:29 +02:00
Tim van der Meij
f6f335173d
Merge pull request #13303 from Snuffleupagus/BaseStream
Add an abstract base-class, which all the various Stream implementations inherit from
2021-05-01 19:13:36 +02:00
calixteman
af4dc55019
[api-minor] Fix the way to chunk the strings (#13257)
- Improve chunking in order to fix some bugs where the spaces aren't here:
    * track the last position where a glyph has been drawn;
    * when a new glyph (first glyph in a chunk) is added then compare its position with the last saved one and add a space or break:
      - there are multiple ways to move the glyphs and to avoid to have to deal with all the different possibilities it's a way easier to just compare positions;
      - and so there is now one function (i.e. "compareWithLastPosition") where all the job is done.
  - Add some breaks in order to get lines;
  - Remove the multiple whites spaces:
    * some spaces were filled with several whites spaces and so it makes harder to find some sequences of words using the search tool;
    * other pdf readers replace spaces by one white space.

Update src/core/evaluator.js

Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>

Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>
2021-04-30 14:41:13 +02:00
Jonas Jenwald
66d9d83dcb Move the PredictorStream from src/core/stream.js and into its own file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
57a1ea840f
Ensure that saveDocument works if there's no /ID-entry in the PDF document (issue 13279) (#13280)
First of all, while it should be very unlikely that the /ID-entry is an *indirect* object, note how we're using `Dict.get` when parsing it e.g. in `PDFDocument.fingerprint`. Hence we definitely should be consistent here, since if the /ID-entry is an *indirect* object the existing code in `src/core/writer.js` would already fail.
Secondly, to fix the referenced issue, we also need to check that the /ID-entry actually is an Array before attempting to access its contents in `src/core/writer.js`.

*Drive-by change:* In the `xrefInfo` object passed to the `incrementalUpdate` function, re-name the `encrypt` property to `encryptRef` since its data is fetched using `Dict.getRaw` (given the names of the other properties fetched similarly).
2021-04-22 12:08:56 +02:00
Tim van der Meij
d42f3d0bfe
Convert done callbacks to async/await in test/unit/evaluator_spec.js 2021-04-18 14:20:54 +02:00
Tim van der Meij
f4237d3a09
Convert done callbacks to async/await in test/unit/annotation_spec.js 2021-04-17 19:59:18 +02:00
Tim van der Meij
c2f3a71eca
Convert done callbacks to async/await in test/unit/api_spec.js 2021-04-17 17:52:23 +02:00
Jonas Jenwald
f560fe6875 A couple of small scripting/XFA-related tweaks in the worker-code
- Use `PDFManager.ensureDoc`, rather than `PDFManager.ensure`, in a couple of spots in the code. If there exists a short-hand format, we should obviously use it whenever possible.

 - Fix a unit-test helper, to account for the previous changes. (Also, converts a function to be `async` instead.)

 - Add one more exists-check in `PDFDocument.loadXfaFonts`, which I missed to suggest in PR 13146, to prevent any possible errors if the method is ever called in a situation where it shouldn't be.
   Also, print a warning if the actual font-loading fails since that could help future debugging. (Finally, reduce overall indentation in the loop.)

 - Slightly unrelated, but make a small tweak of a comment in `src/core/fonts.js` to reduce possible confusion.
2021-04-17 10:34:22 +02:00
Brendan Dahl
ac3fa1e3d7
Merge pull request #13146 from calixteman/xfa_fonts
XFA -- Load fonts permanently from the pdf
2021-04-16 12:55:12 -07:00
Calixte Denizet
7e9579045f XFA -- Load fonts permanently from the pdf
- Different fonts can be used in xfa and some of them are embedded in the pdf.
  - Load all the fonts in window.document.

Update src/core/document.js

Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>

Update src/core/worker.js

Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>
2021-04-15 17:57:42 +02:00
Tim van der Meij
38ed655562
Convert done callbacks to async/await in test/unit/cmap_spec.js 2021-04-14 22:24:28 +02:00
Tim van der Meij
046467ff47
Drop obsolete done callbacks in test/unit/annotation_storage_spec.js
There is no asynchronous code involved here, so we can get rid of all
done callbacks here and simply use the fact that if the function call
ends without failed assertion that the test passed.
2021-04-14 22:11:45 +02:00
Tim van der Meij
82bdba78fb
Drop obsolete done callbacks in test/unit/crypto_spec.js
There is no asynchronous code involved here, so we can get rid of all
done callbacks here and simply use the fact that if the function call
ends without failed assertion that the test passed.
2021-04-14 22:09:17 +02:00
Tim van der Meij
43eb4302ff
Convert done callbacks to async/await in test/unit/message_handler_spec.js 2021-04-14 21:59:13 +02:00
Tim van der Meij
bc8c0bbbfd
Convert done callbacks to async/await in test/unit/display_svg_spec.js 2021-04-14 21:59:13 +02:00
Tim van der Meij
cd2c4e277c
Merge pull request #13222 from timvandermeij/unit-test-async
Convert done callbacks to async/await in the smaller unit test files
2021-04-14 20:37:17 +02:00
Tim van der Meij
c1e9f6025f
Convert done callbacks to async/await in test/unit/custom_spec.js 2021-04-13 21:51:27 +02:00
Tim van der Meij
a1c1e1b9f8
Convert done callbacks to async/await in test/unit/fetch_stream_spec.js 2021-04-13 21:51:27 +02:00
Tim van der Meij
5607484402
Convert done callbacks to async/await in test/unit/network_spec.js 2021-04-13 21:51:26 +02:00
Tim van der Meij
fcf4d02fca
Convert done callbacks to async/await in test/unit/node_stream_spec.js 2021-04-13 21:51:26 +02:00
Tim van der Meij
99dc0d6b65
Convert done callbacks to async/await in test/unit/primitives_spec.js 2021-04-13 21:50:13 +02:00
Tim van der Meij
a56ffb92be
Convert done callbacks to async/await in test/unit/ui_utils_spec.js 2021-04-13 21:50:13 +02:00
Tim van der Meij
a2811e925d
Convert done callbacks to async/await in test/unit/util_spec.js 2021-04-13 21:47:53 +02:00
Jonas Jenwald
2b2234fd5a [api-minor] Ensure that PDFDocumentProxy.hasJSActions won't fail if MissingDataExceptions are thrown during the associated worker-thread parsing
With the current implementation of `PDFDocument.hasJSActions`, in the worker-thread, we're not actually handling not-yet-loaded data correctly. This can thus fail in *two* different ways:
 - The `PDFDocument.fieldObjects` getter (and its helper method), while it may *return* a Promise, still fetches all of its data synchronously and it can thus throw a `MissingDataException` during parsing.
 - The `Catalog.jsActions` getter, which is completely synchronous, can obviously throw a `MissingDataException` during parsing.

If either of these cases occur currently, the `PDFDocumentProxy.hasJSActions` method in the API can either return a *rejected* Promise (which it never should) or possibly "hang" and never resolve.

*Please note:* While I've not *yet* seen this error in an actual PDF document, it can happen during loading if you're unlucky enough with e.g. the structure of the PDF document and/or the download speed offered by the server.
This patch is thus based on code-inspection *and* on manually throwing a `MissingDataException` on the first access of `Catalog.jsActions` to simulate this situation.

Finally, this patch adds a couple of *API* unit-tests for this (since none existed).
2021-04-13 14:33:56 +02:00
Calixte Denizet
a4c986515f XFA -- Display text content
- display xhtml;
  - allow spaces in xhtml (xfa-spacerun:yes);
  - support column layout;
  - fix some border issues.
2021-04-12 14:13:49 +02:00
Jonas Jenwald
5adee0cdd1 [api-minor] Let PDFPageProxy.getStructTree return null, rather than an empty structTree, for documents without any accessibility data (PR 13171 follow-up)
This is first of all consistent with existing API-methods, where we return `null` when the data in question doesn't exist. Secondly, it should also be (slightly) more efficient since there's less dummy-data that we need to transfer between threads.
Finally, this prevents us from adding an empty/unnecessary span to *every* single page even in documents without any structure tree data.
2021-04-11 12:35:33 +02:00
Tim van der Meij
10574a0f8a
Remove obsolete done callbacks from the unit tests
The done callbacks are an outdated mechanism to signal Jasmine that a
unit test is done, mostly in cases where a unit test needed to wait for
an asynchronous operation to complete before doing its assertions.
Nowadays a much better mechanism is in place for that, namely simply
passing an asynchronous function to Jasmine, so we don't need callbacks
anymore (which require more code and may be more difficult to reason
about).

In these particular cases though the done callbacks never had any real
use since nothing asynchronous happens in these places. Synchronous
functions don't need to use done callbacks since Jasmine simply knows
it's done when the function reaches its normal end, so we can safely get
rid of these callbacks. The telltale sign is if the done callback is
used unconditionally at the end of the function.

This is all done in an effort to over time get rid of all callbacks in
the unit test code.
2021-04-10 20:29:39 +02:00
Tim van der Meij
d9d626a5e1
Merge pull request #13214 from calixteman/signatures
Display widget signature
2021-04-10 19:35:16 +02:00
Calixte Denizet
5875ebb1ca Display widget signature
- but don't validate them for now;
  - Firefox will display a bar to warn that the signature validation is not supported (see https://bugzilla.mozilla.org/show_bug.cgi?id=854315)
  - almost all (all ?) pdf readers display signatures;
  - validation is done in edge but for now it's behind a pref.
2021-04-10 19:13:28 +02:00
Brendan Dahl
fc9501a637 Add support for basic structure tree for accessibility.
When a PDF is "marked" we now generate a separate DOM that represents
the structure tree from the PDF.  This DOM is inserted into the <canvas>
element and allows screen readers to walk the tree and have more
information about headings, images, links, etc. To link the structure
tree DOM (which is empty) to the text layer aria-owns is used. This
required modifying the text layer creation so that marked items are
now tracked.
2021-04-09 09:56:28 -07:00
Tim van der Meij
228adbf673
Merge pull request #13172 from Snuffleupagus/cleanup-keepFonts
[api-minor] Add an option, in `PDFDocumentProxy.cleanup`, to allow fonts to remain attached to the DOM
2021-04-05 14:21:34 +02:00
Jonas Jenwald
232fbd28e1 Re-factor the PDFDocumentProxy.cleanup unit-tests to use async/await 2021-04-02 12:32:35 +02:00
Jonas Jenwald
0eb1433c78 [api-minor] Change the format of the fontName-property, in defaultAppearanceData, on Annotation-instances (PR 12831 follow-up)
Currently the `fontName`-property contains an actual /Name-instance, which is a problem given that its fallback value is an empty string; see ca7f546828/src/core/default_appearance.js (L35)
The reason that this is a problem can be seen in ca7f546828/src/core/primitives.js (L30-L34), since an empty string short-circuits the cache. Essentially, in PDF documents, a /Name-instance cannot be empty and the way that the `DefaultAppearanceEvaluator` does things is unfortunately not entirely correct.

Hence the `fontName`-property is changed to instead contain a string, rather than a /Name-instance, which simplifies the code overall.

*Please note:* I'm tagging this patch with "[api-minor]", since PR 12831 is included in the current pre-release (although we're not using the `fontName`-property in the display-layer).
2021-04-01 16:47:30 +02:00
Jonas Jenwald
db1e1612df [api-minor] Support proper URL-objects, in addition to URL-strings, in getDocument
Currently only URL-strings are officially supported by `getDocument`, however at this point in time I cannot really see any compelling reason to not support `URL`-objects as well.

Most likely the reason that we've don't already support `URL`-objects, in `getDocument`, is that historically `URL` wasn't fully implemented across browsers and our old polyfill wasn't perfect; see https://developer.mozilla.org/en-US/docs/Web/API/URL/URL#browser_compatibility

*Please note:* Because of how the `url` parameter is currently handled, there's actually *some* cases where passing a `URL`-object to `getDocument` already works. That, in my opinion, provides additional motivation for supporting `URL`-objects officially, since it makes the API more consistent.

The following is an attempt to summarize the *current* situation, based on the actual code rather than the JSDocs:
 - `getDocument("url string")` works and is documented.[1]
 - `getDocument({ url: "url string", })` works and is documented.[1]
 - `getDocument(new URL(...))` throws immediately, since no supported parameters are found.
 - `getDocument({ url: new URL(...), })` actually works even though it's not documented.[1] Originally, when data was fetched on the worker-thread, this would likely have thrown since `URL` isn't clonable.[2]
 - `getDocument({ url: { abc: 123, }, })`, or some similarily meaningless input, will be "accepted" by `getDocument` and then throw a `MissingPDFException` when attempting to fetch the bogus data.

With the changes in this patch, not only is `URL`-objects now officially supported and documented when calling `getDocument`, but we'll also do a much better job at actually validating any URL-data passed to `getDocument` (and instead fail early).

---
[1] In *browsers*, we create a valid URL thus indirectly validating the input. In Node.js environments, on the other hand, no validation is done since obtaining a baseUrl is more difficult (and PDF.js is primarily written for browsers anyway).

[2] https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm#supported_types
2021-03-31 16:21:41 +02:00
calixteman
b3528868c1
XFA - Add support for few ui elements (#13115)
- input;
  - layout;
  - border;
  - margin;
  - color.
2021-03-31 15:42:21 +02:00
calixteman
63471bcbbe
XFA - Convert some template properties into CSS ones (#13082)
- implement few positioning properties: position, width, height, anchor;
  - implement font element;
  - implement fill element (used by font) and its children (linear, radial, ...);
  - font property is inherited from ancestor container (see https://www.pdfa.org/wp-content/uploads/2020/07/XFA-3_3.pdf#page=43) so let CSS handles that stuff;
  - in order to reduce the number of properties to set, only set non default properties and put the default in CSS;
  - set a background to some containers to be able to see them (will be removed in a future commit).
2021-03-25 13:02:39 +01:00
Jonas Jenwald
eeda2215d7 Remove redundant done-callback functions from unit-tests which are async
For unit-tests which are asynchronous, using a `done`-callback is redundant and future Jasmine versions will stop supporting that pattern.
2021-03-21 11:33:39 +01:00
Tim van der Meij
8269ddbd16
Merge pull request #13105 from Snuffleupagus/BasePdfManager-parseDocBaseUrl
Improve memory usage around the `BasePdfManager.docBaseUrl` parameter (PR 7689 follow-up)
2021-03-19 23:03:20 +01:00
calixteman
24e598a895
XFA - Add a layer to display XFA forms (#13069)
- add an option to enable XFA rendering if any;
  - for now, let the canvas layer: it could be useful to implement XFAF forms (embedded pdf in xml stream for the background and xfa form for the foreground);
  - ui elements in template DOM are pretty close to their html counterpart so we generate a fake html DOM from template one:
    - it makes easier to translate template properties to html ones;
    - it makes faster the creation of the html element in the main thread.
2021-03-19 10:11:40 +01:00
Jonas Jenwald
bd9dee1544 Move the getPdfFilenameFromUrl helper function from web/ui_utils.js and into src/display/display_utils.js
It seems reasonable to place this alongside the *similar* `getFilenameFromUrl` helper function. This way, with the changes in the next patch, we also avoid having to expose the `isDataScheme` function in the API itself and we instead expose `getPdfFilenameFromUrl` in the API (which feels overall more appropriate).
2021-03-17 15:48:24 +01:00
Jonas Jenwald
5b5061afa8 Enable the ESLint no-var rule globally
A significant portion of the code-base has now been converted to use `let`/`const`, rather than `var`, hence it should be possible to simply enable the ESLint `no-var` rule globally.
This way we can ensure that new code won't accidentally use `var`, and it also removes the need to manually enable the rule in various folders.

Obviously it makes sense to continue the efforts to replace `var`, but that should probably happen on a file and/or folder basis.

Please note that this patch excludes the following code:
 - The `extensions/` folder, since that seemed easiest for now (and I don't know exactly what the support situation is for the Chromium-extension).

 - The entire `external/` folder is ignored, since most of it's currently excluded from linting.
   For the code that isn't imported from elsewhere (and should be ignored), we should probably (at some point) bring the code up to the same linting/formatting standard as the rest of the code-base.

 - Various files in the `test/` folder are ignored, as necessary, since the way that a lot of this code is loaded will require some care (or perhaps larger re-factoring) when removing `var` usage.
2021-03-13 16:12:53 +01:00
Jonas Jenwald
22e0ed51c6 Remove unnecessary /* eslint no-var: error */ lines in the test/unit/ folder (PR 12528 follow-up)
These lines are no longer needed, since the ESLint `no-var` rule has been enabled in the entire folder.
2021-03-13 11:50:11 +01:00
Brendan Dahl
47a5550f10 Disable intermittent unit test "creates pdf doc from non-existent URL"
Disable this test so we don't have to manually review unit test
failure log all the time.
2021-03-11 11:40:40 -08:00
Calixte Denizet
3243672727 XFA - Create Form DOM in merging template and data trees
- Spec: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=171;
  - support for the 2 ways of merging: consumeData and matchTemplate;
  - create additional nodes in template DOM when occur node allows it;
  - support for global values in data DOM.
2021-03-08 14:10:30 +01:00
calixteman
45329af926
XFA -- Add support for SOM expressions (#12983)
- specifications: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=87;
 - add a parser for SOM expressions;
 - add search functions to resolve those expressions;
 - search functions will be used to bind data into template.
2021-02-24 10:13:02 +01:00
Tim van der Meij
f3aa4408a5
Merge pull request #13005 from calixteman/colors
JS - Fix setting a color on an annotation
2021-02-21 14:50:03 +01:00
Calixte Denizet
4a5f1d1b7a JS - Fix setting a color on an annotation
- strokeColor corresponds to borderColor;
 - support fillColor and textColor;
 - support colors on the different annotations;
 - fix typo in aforms (+test).
2021-02-20 15:24:37 +01:00
Jonas Jenwald
e9038cc3d1 Send the AnnotationStorage-data to the worker-thread as a Map
Rather than converting the `AnnotationStorage`-data to an Object, before sending it to the worker-thread, we should be able to simply send the internal `Map` directly.
The "structured clone algorithm" doesn't have a problem with `Map`s, however the `LoopbackPort` used when workers are *disabled* (e.g. in Node.js environments) didn't use to support them. With PR 12997 having lifted that restriction, we should now be able to simply send the `AnnotationStorage`-data as-is rather than having to iterate through it to first create an Object.

*Please note:* The changes in `src/core/annotation.js` could have been a lot more compact if we were able to use optional chaining in the `src/core` folder. Unfortunately that's still not possible, since SystemJS is being used in the development viewer (i.g. `gulp server`) and fixing that is *still* blocked by [bug 1247687](https://bugzilla.mozilla.org/show_bug.cgi?id=1247687).
2021-02-18 17:13:43 +01:00
calixteman
0fa9976268
XFA - Add support for prototypes (#12979)
- specifications: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=225&zoom=auto,-207,784
 - add a clone method on nodes in order to be able to clone a proto;
 - support ids in template namespace;
 - prevent from cycle when applying protos.
2021-02-18 10:32:25 +01:00
Tim van der Meij
4619b1b568
Merge pull request #12997 from Snuffleupagus/metadata-worker
Move the Metadata parsing to the worker-thread
2021-02-17 20:57:46 +01:00
calixteman
b5be515375
XFA - Add a lexer/parser for FormCalc language (#12936)
- the language specifications are: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=1049
 - it can be used to:
   * as a scripting language for calculation, validations, ...
   * in SOM expressions to select nodes: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=101
2021-02-17 20:28:06 +01:00
Jonas Jenwald
d366bbdf51 Move the encodeToXmlString helper function to src/core/core_utils.js
With the previous patch this function is now *only* accessed on the worker-thread, hence it's no longer necessary to include it in the *built* `pdf.js` file.
2021-02-17 13:12:01 +01:00
Jonas Jenwald
b66f294f64 Move the XML-parser to the src/core/-folder
With the previous patch this functionality is now *only* accessed on the worker-thread, hence it's no longer necessary to include it in the *built* `pdf.js` file.
2021-02-17 13:12:01 +01:00
Jonas Jenwald
cc3a6563ee Move the Metadata parsing to the worker-thread
The only reason, as far as I can tell, for parsing the Metadata on the main-thread is how it was originally implemented. When Metadata support was first implemented, it utilized the [`DOMParser`](https://developer.mozilla.org/en-US/docs/Web/API/DOMParser) which isn't available in workers.
Today, with the custom XML-parser being used, that's no longer an issue and it seems reasonable to move the Metadata parsing to the worker-thread[1], since that's where all parsing should happen (for performance reasons).

Based on these changes, we'll be able to reduce the now unnecessary duplication of the XML-parser (and related code) in both of the *built* `pdf.js`/`pdf.worker.js` files.

Finally, this patch changes the `_repair` method to use "Array + join" rather than string concatenation.

---
[1] This needed the previous patch, to enable sending of `Map`s between threads with workers disabled.
2021-02-17 13:12:01 +01:00
Calixte Denizet
ccef734ebb Remove Promise.all and async+done from unit/scripting_spec 2021-02-17 11:19:39 +01:00
Calixte Denizet
82f75a8ac2 JS -- Fix doc.getField and add missing field methods
- getField("foo") was wrongly returning a field named "foobar";
 - field object had few missing unimplemented methods
2021-02-17 10:42:52 +01:00
Tim van der Meij
bab059d8fd
Merge pull request #12964 from calixteman/12963
Avoid infinite loop when getting annotation field name
2021-02-16 22:36:24 +01:00
Calixte Denizet
0fc8267576 Avoid infinite loop when getting annotation field name
- aims to fix issue #12963;
 - use a Set to track already visited objects;
 - remove the loop limit in getInheritableProperty and use a RefSet too.
2021-02-14 19:58:19 +01:00
Jonas Jenwald
b26c7974fe [api-minor] Change the dc:subject Metadata field to an Array
This patch simply extends the existing handling of the `dc:creator` field, which should hopefully suffice here; please refer to https://wwwimages2.adobe.com/content/dam/acom/en/devnet/xmp/pdfs/XMP%20SDK%20Release%20cc-2016-08/XMPSpecificationPart1.pdf#page=34
2021-02-14 17:16:40 +01:00
Calixte Denizet
ea06bb0e36 [api-minor] Annotation -- Don't compute appearance when nothing has changed
* don't set a value in annotationStorage by default:
   - having an undefined when the annotation is rendered for saving/printing means nothing has changed so use normal appearance
   - aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1681687
 * change the way to compute font size when this one is null in DA:
   - make fontSize proportional to line height
   - in multiline case, take into account the number of lines for text entered to adapt the font size
2021-02-12 19:27:21 +01:00
calixteman
a8021208ea
Restore window.alert after use in scripting test (#12987) 2021-02-12 14:19:58 +01:00
Jonas Jenwald
4733f163e8 Replace a few new Date().getTime() instances with Date.now()
The former format is not only more verbose, but it's also *slightly* less efficient since it creates a new `Date` object.
2021-02-11 23:00:42 +01:00
calixteman
0479deef4e
XFA -- Add other objects (#12949)
- connectionSet: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=969
 - datasets: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=1038
  - signature: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=1040
  - stylesheet: the same
  - xhtml: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=1187
2021-02-11 12:30:37 +01:00
Jonas Jenwald
0068dba009 [api-minor] Rename -es5 to -legacy, to reduce confusion over what's actually supported (issue 12976)
*Please note that this will also require some edits of the Wiki.*
2021-02-10 16:01:59 +01:00
Calixte Denizet
652ff57897 XFA -- Add template object
- Specifications: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=596
2021-02-03 21:05:10 +01:00
Calixte Denizet
0ff5cd7eb5 XFA - Add a parser for XFA files
- the parser is base on a class extending XMLParserBase
 - it handle xml namespaces:
   * each namespace is assocated with a builder
   * builder builds nodes belonging to the namespace
   * when a node is inserted in the parent namespace compatibility is checked (if required)
 - to avoid name collision between xml names and object properties, use Symbol.
2021-02-01 13:45:31 +01:00
Tim van der Meij
f2c7338b02
Merge pull request #12897 from calixteman/12895
JS - Fix mouse event names
2021-01-24 12:28:24 +01:00
Calixte Denizet
34d2e72df2 JS - Fix mouse event names
- fix issue #12895
2021-01-23 20:26:22 +01:00
Tim van der Meij
d4c4f5d4e5
Merge pull request #12870 from Snuffleupagus/page-advance
Add previous/next-page functionality that takes scroll/spread-modes into account (issue 11946)
2021-01-23 19:35:08 +01:00
Jonas Jenwald
ef1d33a29e Use slightly less verbose font-names in the "Default appearance" unit-tests
The new names are not only less verbose, but also uses a *very* common PDF font-naming convention.
2021-01-23 15:34:22 +01:00
Jonas Jenwald
6bcb4e3ad9 Ensure that parseDefaultAppearance won't attempt to access a not yet defined variable (PR 12831 follow-up)
Note how, in the `if (this.stateManager.stateStack.length !== 0) {` branch, we're attempting to access the not yet defined variable[1] `args`. If this code-path is ever hit, an Error will be thrown and parsing will thus be aborted immediately (likely leading to e.g. rendering bugs).

Note that I found this purely by accident, since I happened to glance at the LGTM report. However, I've since found that the error is also present during the unit-test[2] and with this patch we're actually testing the *intended* thing here.

As part of fixing this, and to avoid re-introducing a similar bug in the future, we'll now instead always reset `args.length` *before* attempting to read the next operator.
Also, we can use the existing `EvaluatorPreprocessor.savedStatesDepth` getter to simplify the save/restore detection a tiny bit.

---
[1] The ESLint rule `no-use-before-define` would have helped catch this problem, but unfortunately we cannot enable that without quite a bit of refactoring all over the code-base.

[2] The unit-test was updated such that it would fail in the `master`-branch.
2021-01-23 15:33:28 +01:00