Commit Graph

4304 Commits

Author SHA1 Message Date
Calixte Denizet
9d47e69771 XFA -- Update config object 2021-02-05 19:22:51 +01:00
Calixte Denizet
7e0554afe2 XFA -- Add attributes and children in XFAObject
- in order to evaluate SOM expressions nodes and their attributes must be checked in the same order as in the xml;
 - add an object XFAObjectArray with a parameter max to handle multiple children with the same name.
2021-02-03 18:56:00 +01:00
Calixte Denizet
0ff5cd7eb5 XFA - Add a parser for XFA files
- the parser is base on a class extending XMLParserBase
 - it handle xml namespaces:
   * each namespace is assocated with a builder
   * builder builds nodes belonging to the namespace
   * when a node is inserted in the parent namespace compatibility is checked (if required)
 - to avoid name collision between xml names and object properties, use Symbol.
2021-02-01 13:45:31 +01:00
Jonas Jenwald
fec8c4c43f Access this._onUnsupportedFeature directly in FontFaceObject.getPathGenerator
Given that `FontFaceObject` is not exposed in the public API, but only accessed internally, there's no need to assume that a `FontFaceObject`-instance is ever initialized without `onUnsupportedFeature` being provided. This is also consistent with the `BaseFontLoader` implementation.
2021-01-29 16:48:55 +01:00
Tim van der Meij
e4e92d10e8
Merge pull request #12922 from Snuffleupagus/getTextContent-globalImageCache
Ignore globally cached images in `PartialEvaluator.getTextContent` (PR 11930 follow-up)
2021-01-28 23:44:10 +01:00
Tim van der Meij
8805614a03
Merge pull request #12924 from brendandahl/fix-clone
Fix font data clone error when pdfBug is enabled.
2021-01-28 23:42:12 +01:00
Jonas Jenwald
72da2aa166 Ignore globally cached images in PartialEvaluator.getTextContent (PR 11930 follow-up)
Given that we'll only cache `/XObject`s of the `Image`-type globally, we can utilize that in `PartialEvaluator.getTextContent` as well. This way, in cases such as e.g. issue 12098, we can avoid having to fetch/parse `/XObject`s that we already know to be `Image`s. This is helpful, since `Stream`s are not cached on the `XRef` instance (given their potential size) and the lookup can thus be somewhat expensive in general.

Also, skip a redundant `RefSetCache.has` check in the `GlobalImageCache.getData` method.
2021-01-28 10:19:26 +01:00
Brendan Dahl
52fb5abb0b Fix font data clone error when pdfBug is enabled.
The widths property should be an object to match what metrics returns.

In ZapfDingbats.pdf I was getting a data clone error with pdfBug enabled.
In buildCharCodeToWidth() there was an encoding with the name "at" which
is also the name of a method on an array. buildCharCodeToWidth assumes an
object is passed in, so when it checked for the "at" property, it found the
method and copied it over.

This only seemed to affect Firefox.
2021-01-27 14:38:43 -08:00
Tim van der Meij
d52e5b0505
Merge pull request #12903 from Snuffleupagus/GlobalImageCache-byteSize
Improve global image caching for small images (PR 11912 follow-up, issue 12098)
2021-01-27 22:20:58 +01:00
Tim van der Meij
286271152f
Merge pull request #12910 from calixteman/bidi
Add back dir property in spans in text layer
2021-01-27 22:09:00 +01:00
calixteman
465697eb10
JS - QuickJS sandbox initialization must be the last evaluated string (#12914)
- aims to fix issue #12912
2021-01-26 14:56:01 +01:00
Jonas Jenwald
1ab6d2c604 Improve global image caching for small images (PR 11912 follow-up, issue 12098)
When implementing the `GlobalImageCache` functionality I was mostly worried about the effect of *very large* images, hence the maximum number of cached images were purposely kept quite low[1].
However, there's one fairly obvious problem with that approach: In documents with hundreds, or even thousands, of *small* images the `GlobalImageCache` as implemented becomes essentially pointless.

Hence this patch, where the `GlobalImageCache`-implementation is changed in the following ways:
 - We're still guaranteed to be able to cache a *minimum* number of images, set to `10` (similar as before).
 - If the *total* size of all the cached image data is below a threshold[2], we're allowed to cache additional images.

This patch thus *improve*, but doesn't completely fix, issue 12098. Note that that document is created by a *very poor* PDF generator, since every single page contains the *entire* document (with all of its /Resources) and to create the individual pages clipping is used.[3]

---
[1] Currently set to `10` images; imagine what would happen to overall memory usage if we encountered e.g. 50 images each 10 MB in size.

[2] This value was chosen, somewhat randomly, to be `40` megabytes; basically five times the [maximum individual image size per page](6249ef517d/src/display/api.js (L2483-L2484)).

[3] This surely has to be some kind of record w.r.t. how badly PDF generators can mess things up...
2021-01-26 12:00:12 +01:00
Calixte Denizet
539256c351 Add back dir property in spans in text layer
- aims to fix #12909
2021-01-26 12:00:05 +01:00
calixteman
a3f6882b06
JS -- add support for choice widget (#12826) 2021-01-25 23:40:57 +01:00
Tim van der Meij
f2c7338b02
Merge pull request #12897 from calixteman/12895
JS - Fix mouse event names
2021-01-24 12:28:24 +01:00
Jonas Jenwald
eb015ca023 Update the ESLint env to use "es2021"
Currently this is inconsistent, since we're using ECMAScript 2021 in the parser options.
2021-01-24 11:25:43 +01:00
Calixte Denizet
34d2e72df2 JS - Fix mouse event names
- fix issue #12895
2021-01-23 20:26:22 +01:00
Tim van der Meij
25b84ce84c
Merge pull request #12828 from dhufnagel/feature/annotation_layer_display_fontsize
[api-minor] Set font size and color for text widget annotations
2021-01-23 16:08:07 +01:00
Jonas Jenwald
6bcb4e3ad9 Ensure that parseDefaultAppearance won't attempt to access a not yet defined variable (PR 12831 follow-up)
Note how, in the `if (this.stateManager.stateStack.length !== 0) {` branch, we're attempting to access the not yet defined variable[1] `args`. If this code-path is ever hit, an Error will be thrown and parsing will thus be aborted immediately (likely leading to e.g. rendering bugs).

Note that I found this purely by accident, since I happened to glance at the LGTM report. However, I've since found that the error is also present during the unit-test[2] and with this patch we're actually testing the *intended* thing here.

As part of fixing this, and to avoid re-introducing a similar bug in the future, we'll now instead always reset `args.length` *before* attempting to read the next operator.
Also, we can use the existing `EvaluatorPreprocessor.savedStatesDepth` getter to simplify the save/restore detection a tiny bit.

---
[1] The ESLint rule `no-use-before-define` would have helped catch this problem, but unfortunately we cannot enable that without quite a bit of refactoring all over the code-base.

[2] The unit-test was updated such that it would fail in the `master`-branch.
2021-01-23 15:33:28 +01:00
Dominik Hufnagel
c5083cda02 set font size and color on annotation layer
use the default appearance to set the font size and color of a text
annotation widget
2021-01-22 23:12:14 +01:00
Tim van der Meij
6ffb6b1c0c
Merge pull request #12885 from Snuffleupagus/worker-tweak-caching
Simplify the `PDFFunctionFactory._localFunctionCache` initialization (PR 12034 follow-up); Fix the `gStateObj` lookup in `TranslatedFont._removeType3ColorOperators` (PR 12718 follow-up)
2021-01-22 20:24:33 +01:00
Jonas Jenwald
ca1f58ea42 Use _defaultAppearanceData directly in WidgetAnnotation._getSaveFieldResources (PR 12831 follow-up)
With the changes in PR 12831, it's no longer necessary to keep track of the `fontName`-string separately since it's available through the `_defaultAppearanceData`-property as well.
2021-01-22 13:23:04 +01:00
Jonas Jenwald
8137c0547d Fix the gStateObj lookup in TranslatedFont._removeType3ColorOperators (PR 12718 follow-up)
As can be seen in 2cba290361/src/core/evaluator.js (L986) the `gStateObj` (which is actually an Array despite its name), is wrapped in Array when it's inserted into the OperatorList. Hence we obviously need to take this into account when accessing it in `TranslatedFont._removeType3ColorOperators`; this mistake happened because we don't have any test-cases for this particular code-path as far as I know.
2021-01-22 12:27:38 +01:00
Jonas Jenwald
cfaf23dee2 Simplify the PDFFunctionFactory._localFunctionCache initialization (PR 12034 follow-up)
By changing this a `shadow`ed getter, we can simply access it directly and not worry about it being initialized. I have no idea why I didn't just implement it this way in the first place.
2021-01-22 12:25:05 +01:00
Brendan Dahl
2cba290361
Merge pull request #12836 from calixteman/update_buttons
JS -- update radio/checkbox values even if there are no actions
2021-01-21 14:00:26 -08:00
calixteman
1039698697
Add a parser to get font data from the default appearance (#12831)
* Add a parser to get font data from the default appearance
 - pdfium & poppler use a special parser too to get these info.

* Update src/core/default_appearance.js

Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>

Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>
2021-01-21 20:15:31 +01:00
Brendan Dahl
4142001fc2
Merge pull request #12869 from calixteman/lw
Fix zoom issue with too thin lines
2021-01-21 08:31:59 -08:00
Tim van der Meij
9d4bad91e2
Merge pull request #12878 from Snuffleupagus/worker-compat-checks
Remove redundant compatibility checks, for modern `generic` builds, in `src/core/worker.js`
2021-01-20 21:03:04 +01:00
Brendan Dahl
f45ba02fd3
Merge pull request #12850 from calixteman/missing_cstes
JS -- Add few missing constants in global scope
2021-01-20 11:33:02 -08:00
Jonas Jenwald
b4eb55250e Remove redundant compatibility checks, for modern generic builds, in src/core/worker.js
With the recent additions of optional chaining and nullish coalescing to the PDF.js code-base, a couple of the checks in `src/core/worker.js` are now redundant; please see this compatibility information:
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Optional_chaining#browser_compatibility
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Nullish_coalescing_operator#browser_compatibility

In practice, for the non-translated/non-polyfilled PDF.js builds, browsers without support for optional chaining and nullish coalescing will simply throw immediately upon loading of the code.
Hence both the `globalThis` and `Promise.allSettled` checks are now unnecessary, given this compatibility information:
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/globalThis#browser_compatibility
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/allSettled#browser_compatibility

*Please note:* The `ReadableStream` check is however still necessary, since Node.js doesn't support that.
2021-01-20 13:09:56 +01:00
Jonas Jenwald
298ee5cfbb Replace some ternary operators with optional chaining, and nullish coalescing, in the src/display/-folder
This way, we can further reduce unnecessary code-repetition in some cases.
2021-01-19 17:20:02 +01:00
Calixte Denizet
9754216c60 Fix zoom issue with too thin lines
- aims to fix issue #12868: apply zoom factor to linewidth after setting it to 1.
 - only apply 1px-width when required
 - the sign of getSinglePixelWidth is used to know if 1px-width is required
2021-01-16 15:52:27 +01:00
Calixte Denizet
0d1b19632d Enforce linewidth to 1px when at least one of scale factor is lower than 1 2021-01-15 13:18:24 +01:00
Jonas Jenwald
cf7eb87934 Remove a duplicated reference test (PR 12812 follow-up)
- Remove a *duplicated* reference test, see "issue12810", from the manifest.

 - Improve the spelling in a couple of comments in `src/core/canvas.js`, most notable of the word "parallelogram".

 - Update a comment, also in `src/core/canvas.js`, to actually agree with the value used to reduce confusion when reading the code.
2021-01-15 10:57:15 +01:00
Brendan Dahl
6619f1f3f2
Merge pull request #12812 from calixteman/too_thin
Enforce line width to be at least 1px after applied transform
2021-01-14 15:21:44 -08:00
Jonas Jenwald
2600e59acb Always re-measure non-embedded ArialNarrow fonts (bug 1671312, PR 12725 follow-up)
While PR 12725 fixed bug 1671312 as reported, i.e. the "In the upper right corner "Purposes' has bad kerning."-part, it however broke other parts of the text rendering.
Note in particular the tables, e.g. on page 2 and beyond, where the glyphs are now rendered too close together. The reason for this is that the fonts in question are non-embedded ArialNarrow, which we just replace with Helvetica which obviously is not narrow. Given that the font replacement isn't a perfect fit for non-embedded ArialNarrow, we still need to re-measure the glyph widths in this case.
2021-01-14 15:51:48 +01:00
Jonas Jenwald
13742eb82d Inlude the JS actions for the page when dispatching the "pageopen"-event in the BaseViewer
Note first of all how the `PDFDocumentProxy.getJSActions` method in the API caches the result, which makes repeated lookups cheap enough to not really be an issue.
Secondly, with the previous patch, we're now only dispatching "pageopen"/"pageclose"-events when there's actually a sandbox that listens for them.

All-in-all, with these changes we can thus simplify the default-viewer "pageopen"-event handler a fair bit.
2021-01-12 20:28:50 +01:00
calixteman
1de1ae0be6
Merge pull request #12838 from calixteman/authors
[api-minor] Change the "dc:creator" Metadata field to an Array
2021-01-12 02:44:58 -08:00
Calixte Denizet
43d5512f5c [api-minor] Change the "dc:creator" Metadata field to an Array
- add scripting support for doc.info.authors
 - doc.info.metadata is the raw string with xml code
2021-01-11 21:34:07 +01:00
Calixte Denizet
8e6bec6e2e JS -- Add few missing constants in global scope
- these constants are available in pdfium implementation too
 - fix error code in aform.js
2021-01-11 17:19:28 +01:00
Calixte Denizet
b3dccd66ab Enforce line width to be at least 1px after applied transform
* add a comment to explain how minimal linewidth is computed.
 * when context.linewidth < 1 after transform, firefox and chrome
   don't render in the same way (issue #12810).
 * set lineWidth to 1 after transform and before stroking
   - aims fix issue #12295
   - a pixel can be transformed into a rectangle with both heights < 1.
     A single rescale leads to a rectangle with dim equals to 1 and
     the other to something greater than 1.
 * change the way to render rectangle with null dimensions:
   - right now we rely on the lineWidth set before "re" but
     it can be set after "re" and before "S" and in this case the rendering
     will be wrong.
   - render such rectangles as a single line.
2021-01-10 18:02:12 +01:00
Tim van der Meij
f85b8721d1
Merge pull request #12842 from Snuffleupagus/issue-12841
Improve handling of JPEG images without an EOI marker (issue 12841)
2021-01-10 13:21:28 +01:00
Jonas Jenwald
81525fd446 Use ESLint to ensure that exports are sorted alphabetically
There's built-in ESLint rule, see `sort-imports`, to ensure that all `import`-statements are sorted alphabetically, since that often helps with readability.
Unfortunately there's no corresponding rule to sort `export`-statements alphabetically, however there's an ESLint plugin which does this; please see https://www.npmjs.com/package/eslint-plugin-sort-exports

The only downside here is that it's not automatically fixable, but the re-ordering is a one-time "cost" and the plugin will help maintain a *consistent* ordering of `export`-statements in the future.
*Note:* To reduce the possibility of introducing any errors here, the re-ordering was done by simply selecting the relevant lines and then using the built-in sort-functionality of my editor.
2021-01-09 20:37:51 +01:00
Jonas Jenwald
cd9422a075 Improve handling of JPEG images without an EOI marker (issue 12841)
Given that the PDF document in the issue contains the same very large JPEG image *three* times, this patch includes a test-case where only the first page has been extracted from it.
2021-01-09 20:19:39 +01:00
Tim van der Meij
c0a6d6cd21
Merge pull request #12394 from calixteman/appearance
In a text widget, Font resources can be in the appearance
2021-01-08 21:03:41 +01:00
Jonas Jenwald
941b65f683 Remove unncessary CanvasFactory/CMapReaderFactory/FileReaderFactory duplication in unit-tests
Given that the API will now, after PR 12039, automatically pick the correct factories to use depending on the environment (browser vs. Node.js), we can utilize that in the unit-tests as well. This way we don't have to manually repeat the same initialization code in *multiple* unit-tests.
*Note:* The *official* PDF.js API is defined in `src/pdf.js`, hence the new exports in `src/display/api.js` will not affect that.

Also, updates the unit-test `FileReaderFactory` helpers similarily.

*Drive-by change:* Fix the `CMapReaderFactory` usage in the annotation unit-tests, since the cache should only contain raw data and not a Promise. While this obviously works as-is, having unit-tests that "abuse" the intended data format can easily lead to unnecessary failures if changes are made to the relevant `src/core/` code.
2021-01-08 17:33:59 +01:00
Calixte Denizet
7172f0a928 JS -- update radio/checkbox values even if there are no actions 2021-01-08 16:43:16 +01:00
Calixte Denizet
83119b9000 In a text widget, Font resources can be in the appearance 2021-01-08 10:13:47 +01:00
Tim van der Meij
048081fb69
Merge pull request #12824 from Snuffleupagus/preEvaluateFont-errors
Improve the handling of errors, in `PartialEvaluator.loadFont`, occuring in `PartialEvaluator.preEvaluateFont` (issue 12823)
2021-01-07 23:15:41 +01:00
Tim van der Meij
5bde4b71f8
Merge pull request #12292 from calixteman/encoding
Fix encoding issues when printing/saving a form with non-ascii characters
2021-01-07 22:56:42 +01:00