Commit Graph

5943 Commits

Author SHA1 Message Date
Jonas Jenwald
6a2c2a646f Remove the remaining closures in the src/core/type1_parser.js file
Given that the code is written with JavaScript module-syntax, none of this functionality will "leak" outside of this file with these changes.
By removing this closure the file-size is decreased, even for the *built* `pdf.worker.js` file, since there's now less overall indentation in the code.
2022-08-14 12:50:26 +02:00
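A minimal sketch of the kind of change described in the commit above, not the actual `type1_parser.js` code. The names `ClosureTokenizer`, `FlatTokenizer`, and the whitespace helpers are hypothetical and exist only for illustration.

```javascript
// Before: a closure (IIFE) hides the helper, at the cost of one extra
// indentation level for everything inside it.
const ClosureTokenizer = (function ClosureTokenizerClosure() {
  function isWhitespace(ch) {
    return ch === " " || ch === "\n" || ch === "\t";
  }

  class ClosureTokenizer {
    countSpaces(str) {
      let count = 0;
      for (const ch of str) {
        if (isWhitespace(ch)) {
          count++;
        }
      }
      return count;
    }
  }
  return ClosureTokenizer;
})();

// After: in a JavaScript module, top-level declarations stay private to the
// file unless they are exported, so the wrapper can simply be dropped.
function isWhitespaceFlat(ch) {
  return ch === " " || ch === "\n" || ch === "\t";
}

class FlatTokenizer {
  countSpaces(str) {
    let count = 0;
    for (const ch of str) {
      if (isWhitespaceFlat(ch)) {
        count++;
      }
    }
    return count;
  }
}
// In a real module only the class would be exported, e.g.
//   export { FlatTokenizer };
```

Both variants behave identically; the second one just has less nesting, which is where the file-size reduction mentioned in the commit message comes from.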
Tim van der Meij
e6fe127433
Merge pull request #15313 from Snuffleupagus/move-binarySearchFirstItem
Move `binarySearchFirstItem` back to the `web/`-folder (PR 15237 follow-up)
2022-08-14 12:14:32 +02:00
Jonas Jenwald
0024165f1f Move binarySearchFirstItem back to the web/-folder (PR 15237 follow-up)
This was moved into the `src/display/`-folder in PR 15110, for the initial editor-a11y patch. However, with the changes in PR 15237 we're again only using `binarySearchFirstItem` in the `web/`-folder and it thus seems reasonable to move it back there.
The primary reason for moving it back is that `binarySearchFirstItem` is currently exposed in the public API, and we always want to avoid that unless it's either PDF-related functionality or code that simply must be shared between the `src/`- and `web/`-folders. In this case, `binarySearchFirstItem` is a general helper function that doesn't really satisfy either of those alternatives.
2022-08-14 11:38:17 +02:00
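For context, a general helper of this kind could look roughly like the sketch below. It is an illustration of a "first item matching a condition" binary search, assuming the items are ordered so that every item failing the condition comes before every item passing it; the real helper in the repository may differ in details.

```javascript
// Returns the index of the first item for which `condition` is true, or
// `items.length` if there is no such item. Assumes that all items failing
// the condition precede all items passing it.
function binarySearchFirstItem(items, condition, start = 0) {
  let minIndex = start;
  let maxIndex = items.length - 1;

  // No items at all, or not even the last item passes.
  if (maxIndex < 0 || !condition(items[maxIndex])) {
    return items.length;
  }
  // The very first candidate already passes.
  if (condition(items[minIndex])) {
    return minIndex;
  }

  while (minIndex < maxIndex) {
    const currentIndex = (minIndex + maxIndex) >> 1;
    if (condition(items[currentIndex])) {
      maxIndex = currentIndex;
    } else {
      minIndex = currentIndex + 1;
    }
  }
  return minIndex;
}
```

Nothing in such a function is PDF-specific, which is the argument the commit message makes for keeping it out of the public API.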
Jonas Jenwald
e5e756c0b4 Remove the remaining closures in the src/core/cff_parser.js file
Given that the code is written with JavaScript module-syntax, none of this functionality will "leak" outside of this file with these changes.
For e.g. the `gulp mozcentral` command the *built* `pdf.worker.js` file-size decreases `~2 kB` with this patch, and most of the improvement comes from having less overall indentation in the code.
2022-08-13 19:48:17 +02:00
Tim van der Meij
f212341d01
Merge pull request #15306 from Snuffleupagus/Type3-only-Path2D
Only compile Type3 glyphs when `Path2D` is supported
2022-08-13 15:23:11 +02:00
Jonas Jenwald
9dcfdb9578 Remove the remaining closure in the src/core/function.js file
Given that the code is written with JavaScript module-syntax, none of this functionality will "leak" outside of this file with these changes.
By removing this closure the file-size is decreased, even for the *built* `pdf.worker.js` file, since there's now less overall indentation in the code.
2022-08-13 12:52:36 +02:00
Calixte Denizet
2916910ea1 [Annotation] Add an aria role comment for FreeText annotations 2022-08-12 15:59:21 +02:00
calixteman
6b4c2464ad
Merge pull request #15237 from calixteman/annotation_a11y
[Annotations] Add some aria-owns in the text layer to link to annotations (bug 1780375)
2022-08-12 15:04:56 +02:00
Calixte Denizet
f316300113 [Annotations] Add some aria-owns in the text layer to link to annotations (bug 1780375)
This patch doesn't structurally change the text layer: it just adds some aria-owns
attributes to some spans.
The aria-owns attribute expects an element id, which is why this patch adds back an
id on the element rendering an annotation; the id is built using crypto.randomUUID
to avoid any potential issues with the hash in the url.
The elements in the annotation layer are moved into the DOM in order to have them in the
same "order" as they visually are.
The overall goal is to help screen readers to present to the user the annotations as
they visually are and as they come in the text flow.
It is clearly not perfect, but it should improve readability for some people with visual
disabilities.
2022-08-12 14:35:26 +02:00
Jonas Jenwald
e9e9fee833 Only compile Type3 glyphs when Path2D is supported
According to MDN `Path2D` is available in all browsers that we currently support, see https://developer.mozilla.org/en-US/docs/Web/API/Path2D#browser_compatibility
Hence only Node.js is currently lagging behind here, and requires that we keep the old code as a fallback in the `compileType3Glyph` function. However, there's an open PR in the `node-canvas` repository for adding `Path2D` support.

As far as I'm concerned, there are two possible solutions here:
 - We land this patch now, since it removes unnecessary code in e.g. the Firefox PDF Viewer, which means that compilation of Type3 glyphs will be disabled in Node.js until that PR is landed.[1]
   If users report bugs about Type3 glyphs looking "inconsistent" in Node.js and/or being slow to render, we could perhaps encourage them to upvote and otherwise help out getting that PR landed?

 - We wait for the mentioned PR to land *first*, before moving forward with this patch. Given that there's been no updates on that PR for almost two months, this alternative may possibly take a while.

---
[1] Note that Type3 fonts are first of all not very common in PDF documents, and secondly that compilation only applies specifically to Type3 glyphs that contain /ImageMask-data (i.e. not all Type3 fonts are affected).
2022-08-12 13:06:42 +02:00
Jonas Jenwald
dd95e4f851 Add *official* support for passing ArrayBuffer-data to getDocument (issue 15269)
While this has always worked, as a consequence of the implementation, it's never been officially supported.
In addition to adding basic unit-tests, this patch also introduces a couple of new JSDoc `@typedef`s in the API to avoid overly long lines.
2022-08-10 14:13:01 +02:00
calixteman
cef2ac99e5
Merge pull request #15298 from calixteman/ink_min_size
[Editor] Ensure an ink editor has the minimal required size after having been pasted
2022-08-10 10:27:37 +02:00
Calixte Denizet
63361dcfc7 [Editor] Ensure an ink editor has the minimal required size after having been pasted 2022-08-10 10:15:23 +02:00
Calixte Denizet
71ca249d2b [Editor] Avoid creation of an editor on "wrong" clicks 2022-08-10 10:05:04 +02:00
Calixte Denizet
04f78c935c Fix OTS issue with empty index (#15289) 2022-08-08 22:56:26 +02:00
Calixte Denizet
5e0ddfb0e6 [Editor] Remove use of innerHtml 2022-08-07 13:39:41 +02:00
Tim van der Meij
2a84a3078b
Merge pull request #15283 from Snuffleupagus/sort-PopupAnnotation
[api-minor] Sort PopupAnnotations already on the worker-thread (PR 11535 follow-up)
2022-08-06 15:07:09 +02:00
Jonas Jenwald
358a0607fe Remove mozCurrentTransform/mozCurrentTransformInverse usage
These canvas-context properties are Mozilla-specific, and have obviously never been implemented anywhere else. Currently they are in the process of being removed, see [bug 1782651](https://bugzilla.mozilla.org/show_bug.cgi?id=1782651) and [bug 1294360](https://bugzilla.mozilla.org/show_bug.cgi?id=1294360), which in practice means that in e.g. Firefox Nightly the `addContextCurrentTransform`-function is now being used in the *built-in* PDF Viewer (which was obviously never intended).

We should thus be able to replace these Mozilla-specific properties with `CanvasRenderingContext2D.getTransform()`, which is available in all browsers that we currently support: https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/getTransform#browser_compatibility
2022-08-06 14:24:57 +02:00
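A rough sketch of how the standard `getTransform()` method can stand in for the Mozilla-specific property. The helper name `getCurrentTransform` is used here for illustration only, and the array layout `[a, b, c, d, e, f]` is an assumption about what calling code would expect.

```javascript
// Reads the current transformation matrix through the standard
// `CanvasRenderingContext2D.getTransform()` method and returns it as a
// six-element array, instead of relying on `ctx.mozCurrentTransform`.
function getCurrentTransform(ctx) {
  const { a, b, c, d, e, f } = ctx.getTransform();
  return [a, b, c, d, e, f];
}
```

In a browser this would be called with a real 2D context, e.g. `getCurrentTransform(canvas.getContext("2d"))`.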
Jonas Jenwald
876a02a504 [api-minor] Sort PopupAnnotations already on the worker-thread (PR 11535 follow-up)
By doing this in the worker-thread this code will only need to run *once*, whereas currently re-rendering of a page forces this to be repeated (e.g. after it's been scrolled out-of-view and then back into view again).
2022-08-06 11:42:45 +02:00
Jonas Jenwald
f6db7975c5 Enable the ESLint prefer-spread rule
Note that in a couple of spots the argument could be `undefined` and there we simply disable the rule instead.

Please refer to https://eslint.org/docs/latest/rules/prefer-spread
2022-08-06 10:17:00 +02:00
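As an illustration of what the rule flags, and of the `undefined` caveat mentioned in the commit message, consider the following sketch. The function name `maxOrNegativeInfinity` is hypothetical.

```javascript
const values = [3, 9, 4];

// Flagged by `prefer-spread`:
const maxWithApply = Math.max.apply(null, values);

// Preferred form:
const maxWithSpread = Math.max(...values);

// When the argument list may be `undefined`, the two forms are not
// equivalent: `apply` treats `undefined` as "no arguments", whereas
// spreading `undefined` throws a TypeError. In such spots the rule can be
// disabled locally instead.
function maxOrNegativeInfinity(maybeValues) {
  // eslint-disable-next-line prefer-spread
  return Math.max.apply(null, maybeValues);
}
```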
Calixte Denizet
fce83f8656 [Editor] Remove some a11y properties only useful when a FreeText editor is edited 2022-08-04 15:28:25 +02:00
Calixte Denizet
3c8d8f0d02 [Editor] A pasted FreeText editor was missing when printing/saving
When a FreeText editor is pasted, it doesn't have an editorDiv yet when added
to the layer, hence it's empty.
So this patch just moves the call to addToAnnotationStorage to ensure that we
have what we need.
2022-08-04 13:00:45 +02:00
calixteman
b985eaa98c
Merge pull request #15267 from calixteman/freetext_a11y
[Annotation] Add a div containing the text of a FreeText annotation (bug 1780375)
2022-08-04 11:49:29 +02:00
Calixte Denizet
31155740c3 [Annotation] Add a div containing the text of a FreeText annotation (bug 1780375)
An annotation doesn't have to be in the text flow, hence it's likely a bad idea
to insert its text in the text layer. But the text must be visible from a screen
reader point of view so it must be somewhere in the DOM.
So with this patch, the text from a FreeText annotation is extracted and added in
a div in its HTML counterpart, and with the patch #15237 the text should be visible
and positioned relatively to the text flow.
2022-08-04 11:14:05 +02:00
Calixte Denizet
6916fabd51 Skip unknown fields when calculating a value using AFSimple_Calculate 2022-08-03 23:40:09 +02:00
Jonas Jenwald
4f6cd05a53
Merge pull request #15264 from calixteman/editing_telemetry
[Editor] Add some telemetry to know how often the editing features are used (bug 1782254)
2022-08-03 11:28:08 +02:00
Calixte Denizet
94f57e5dd7 [Editor] Add some telemetry to know how often the editing features are used (bug 1782254) 2022-08-03 09:54:27 +02:00
Jonas Jenwald
899fc29eef Always set a border-radius for RadioButton annotations (issue 15262) 2022-08-02 13:58:20 +02:00
Jonas Jenwald
0c31320c12 [api-minor] Improve thumbnail handling in documents that contain interactive forms
To improve performance of the sidebar we use the page-canvases to generate the thumbnails whenever possible, since that avoids unnecessary re-rendering when the sidebar is open. This works generally well, however there's an old problem in PDF documents that contain interactive forms (when those are enabled): Note how the thumbnails become partially (or fully) blank, since those Annotations are not included in the OperatorList.[1]

We obviously want to keep using the `PDFThumbnailView.setImage`-method for most documents, however we need a way to skip it only for those pages that contain interactive forms.
As it turns out it's unfortunately not all that simple to tell, after the fact, from looking only at the OperatorList that some Annotations were skipped. While it might have been possible to try and infer that in the viewer, it'd not have been pretty considering that at the time when rendering finishes the annotationLayer has not yet been built.
The overall simplest solution that I could come up with, was instead to include a *summary* of the interactive form-state when doing the final "flushing" of the OperatorList and expose that information in the API.

---
[1] Some examples from our test-suite: `annotation-tx2.pdf` where the thumbnail is completely blank, and `bug1737260.pdf` where the thumbnail is missing the "buttons" found on the page.
2022-07-30 16:53:32 +02:00
Tim van der Meij
c7b71a3376
Merge pull request #15215 from Snuffleupagus/optional-content-initial
[api-minor] Improve how we disable `PDFThumbnailView.setImage` for documents with Optional Content
2022-07-30 12:04:23 +02:00
Jonas Jenwald
ee8fab929c
Merge pull request #15244 from calixteman/15241
[Editor] Add an editor in the annotation storage only when it's non-empty (#15241)
2022-07-29 20:58:53 +02:00
Calixte Denizet
e819834505 [Editor] Add an editor in the annotation storage only when it's non-empty (#15241) 2022-07-29 18:00:52 +02:00
Calixte Denizet
9a464b70c1 [Editor] Avoid to slightly move ink editor when undoing/redoing 2022-07-29 16:53:03 +02:00
Calixte Denizet
d092a85b6c Fix wrong order of arguments when calling the CipherTransform ctor (bug 1782186) 2022-07-29 12:46:45 +02:00
Calixte Denizet
51c8e2f3ab Fix text selection with hdpi screens (#15229) 2022-07-28 19:44:13 +02:00
Jonas Jenwald
2fb083f3e2 Ensure that the isUsingOwnCanvas-parameter is consistently included in operatorLists (PR 14247 follow-up)
Currently some `OPS.beginAnnotation` arguments will contain a `Number` value for the `isUsingOwnCanvas`-parameter, or in some cases an `undefined` value, which is inconsistent from an API perspective.
2022-07-28 13:37:37 +02:00
Calixte Denizet
3c10c71a91 [Editor] Reset the queue when a command is added after having undone all the commands 2022-07-27 23:23:28 +02:00
Calixte Denizet
759116f4c5 [Editor] Avoid to add unexpected commands in the undo/redo queue when undoing/redoing (bug 1781790)
We can undo/redo a command which will at some point add a command in the queue: typically
this can happen when redoing an addition.
So the idea is to lock the queue when undoing/redoing.
2022-07-27 19:12:06 +02:00
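The locking idea can be sketched as follows. This is a simplified illustration, not the editor's actual command manager: the class shape, the `size` getter, and the choice to also lock while a newly added command executes are all assumptions.

```javascript
class CommandManager {
  #commands = [];
  #position = -1; // index of the last executed command
  #locked = false;

  add({ cmd, undo }) {
    if (this.#locked) {
      // A command is being (re)played: ignore commands it tries to add.
      return;
    }
    // Drop everything after the current position; after undoing all
    // commands this resets the queue completely.
    this.#commands.length = this.#position + 1;
    this.#commands.push({ cmd, undo });
    this.#position = this.#commands.length - 1;
    this.#run(cmd);
  }

  undo() {
    if (this.#position === -1) {
      return;
    }
    this.#run(this.#commands[this.#position].undo);
    this.#position -= 1;
  }

  redo() {
    if (this.#position >= this.#commands.length - 1) {
      return;
    }
    this.#position += 1;
    this.#run(this.#commands[this.#position].cmd);
  }

  get size() {
    return this.#commands.length;
  }

  // Runs a callback with the queue locked, unlocking even if it throws.
  #run(callback) {
    this.#locked = true;
    try {
      callback();
    } finally {
      this.#locked = false;
    }
  }
}
```

With this shape, a command whose `cmd` callback itself calls `add` (as can happen when redoing an addition) no longer pushes an unexpected entry onto the queue.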
Calixte Denizet
7831a100b3 [Editor] Add the possibility to change line opacity in Ink editor 2022-07-27 18:46:25 +02:00
calixteman
45b9e8417d
Merge pull request #15231 from calixteman/bug1781763
[Editor] Don't set as active an editor which is not (bug 1781763)
2022-07-27 18:45:07 +02:00
Calixte Denizet
ce4144eee4 [Editor] Avoid editor creation/selection on right click (bug 1781762) 2022-07-27 17:53:22 +02:00
Calixte Denizet
59580d8986 [Editor] Don't set as active an editor which is not (bug 1781763) 2022-07-27 14:46:36 +02:00
Jonas Jenwald
8ebd2d3dfd
Merge pull request #15221 from Snuffleupagus/issue-15220
Support images with /Filter-entries that contain Arrays (issue 15220)
2022-07-25 11:14:51 +02:00
Jonas Jenwald
fc018ea9ea Support images with /Filter-entries that contain Arrays (issue 15220)
This patch "borrows" the code found in the `Parser.makeInlineImage`-method, to ensure that JBIG2 and JPX images can be rendered correctly.
2022-07-25 08:41:37 +02:00
Calixte Denizet
85f3e23e7f [Editor] Fix few keyboard shortcuts on mac 2022-07-24 22:22:27 +02:00
Jonas Jenwald
ceb4f8a6ab [api-minor] Add a new method, in OptionalContentConfig, to detect the initial Optional Content visibility state
This will allow us to improve the `PDFThumbnailView.setImage` handling in the viewer, and thanks to the added caching this should be reasonably efficient.
2022-07-24 17:29:37 +02:00
Jonas Jenwald
f3d76b42b3 Ensure that OptionalContentGroup.visible cannot be modified from the "outside"
Given that Optional Content visibility is only intended/supported to be updated via the `OptionalContentConfig.setVisibility`-method, this patch actually enforces that now.
Note that this will be used by the next patch in the series, and will help prevent inconsistent state in the `OptionalContentConfig`-class.

*Please note:* This patch also uncovered a pre-existing bug, related to iterating through the visibility groups in the constructor, for the `baseState === "OFF"` case.
2022-07-24 17:28:08 +02:00
Jonas Jenwald
2d6ebc5801 Convert the OptionalContentConfig to use *properly* private fields/methods
To ensure that this data cannot be directly changed from the outside, use private fields/methods now that those are available.
2022-07-24 13:40:59 +02:00
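A small sketch of what "properly private" means in practice. The class `VisibilityConfig` and its methods are hypothetical stand-ins, not the real `OptionalContentConfig` API.

```javascript
class VisibilityConfig {
  // Private field: not reachable as a property from outside the class.
  #groups = new Map();

  constructor(groupIds) {
    for (const id of groupIds) {
      this.#groups.set(id, true);
    }
  }

  // The only supported way to change visibility.
  setVisibility(id, visible = true) {
    if (!this.#groups.has(id)) {
      return;
    }
    this.#groups.set(id, !!visible);
  }

  isVisible(id) {
    return this.#groups.get(id) ?? false;
  }
}
```

Compared with an underscore-prefixed property such as `_groups`, which is private by convention only, outside code cannot read or replace `#groups` at all.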
Calixte Denizet
a118e268af [Editor] Fix multi-selection on touch screens 2022-07-22 16:11:58 +02:00
Calixte Denizet
7b25b39a17 [Editor] Replace mouse events by pointer ones (bug 1779015)
The goal is to be able to edit a pdf on a touchscreen.
2022-07-22 13:46:39 +02:00
Calixte Denizet
5bbe0d0782 [Editor] Unselect correctly removed editors
- After undoing a deletion of several editors, they appeared to be selected (they had a red border)
when in fact they were not, consequently, this patch aims to remove the selectedEditor class when
an editor is removed;
- Add a test with some ink editors.
2022-07-22 13:21:08 +02:00
Calixte Denizet
d6b9ca48a5 [Editor] Add the ability to make multiple selections (bug 1779582)
- several editors can be selected/unselected using ctrl+click;
- and then they can be copied, pasted, their properties can be changed.
2022-07-21 22:53:52 +02:00
Calixte Denizet
af41a5cb49 [Editor] Simplify the command manager
The previous version was maybe functional but definitely painful to maintain
(maybe more efficient... I don't know) so this patch aims to simplify it and
it adds some basic unit tests.
2022-07-21 18:44:41 +02:00
Jonas Jenwald
5e7eab4dd8
Merge pull request #15196 from calixteman/zindex
[Editor] Add a z-index in order to draw them in the right order
2022-07-21 09:38:12 +02:00
Calixte Denizet
a7a5e98b7e [Editor] Add a z-index in order to draw them in the right order
The elements in the annotationEditor layer are rearranged to make them
more accessible, but we must draw them in the order they have been created,
hence this patch adds a z-index to the editors.
2022-07-20 15:47:43 +02:00
calixteman
408c10b5bb
Merge pull request #15195 from calixteman/empty_editor
[Editor] No need to click twice to create an editor when the last one is empty
2022-07-20 14:29:16 +02:00
Calixte Denizet
6d0676fd86 [Editor] No need to click twice to create an editor when the last one is empty 2022-07-20 14:15:00 +02:00
Jonas Jenwald
a9fc8792c8
Merge pull request #15192 from Snuffleupagus/issue-15139
Ignore invalid /CIDToGIDMap-entries when parsing fonts (issue 15139)
2022-07-20 12:45:04 +02:00
Calixte Denizet
e1f28d3504 [Editor] Move the keyboard manager at the container level
- This way, the keyboard callbacks are called even if the page has not
the focus, hence the user doesn't have to guess that they have to click
on the page which is a bit painful especially in Ink mode.
- Add two keyboard shortcuts to commit a Freetext editor (ctrl+enter and
escape).
2022-07-20 12:24:30 +02:00
Jonas Jenwald
60bd9580e2 Ignore invalid /CIDToGIDMap-entries when parsing fonts (issue 15139)
In the referenced PDF document the fonts have /CIDToGIDMap-entries that cannot be loaded. Hence, only when `ignoreErrors` is set, we'll now ignore these corrupt /CIDToGIDMap-entries and fall back to simply assuming that no such data is available.

Given that this is *clearly* a case of a corrupt PDF document, there's no guarantee that this will "fix" things in the general case since a /CIDToGIDMap may be *required* in order for some composite fonts to render correctly. However, attempting to render *something* is surely better than skipping a font altogether.
2022-07-20 11:58:44 +02:00
calixteman
7a4b72ed11
Merge pull request #15185 from calixteman/ink_translation
[Editor] Ink editor was too much translated after commit
2022-07-20 10:34:37 +02:00
Calixte Denizet
964fb77fa5 [Editor] Remove useless and potentially deleted editors
After a deletion, a reference on a deleted editor can still be used
(for example in changing the font size just after having deleted all
editors).
2022-07-19 23:04:33 +02:00
Jonas Jenwald
f46895d750
Merge pull request #15110 from calixteman/editing_a11y
[Editor] Improve a11y for newly added element (#15109)
2022-07-19 20:02:53 +02:00
Jonas Jenwald
98f70d87f6
Merge pull request #15174 from Snuffleupagus/more-for-of
Use more `for...of` loops in the code-base
2022-07-19 19:04:47 +02:00
Calixte Denizet
624b26e1de [Editor] Improve a11y for newly added element (#15109)
- In the annotationEditorLayer, reorder the editors in the DOM according
  to the position of the elements on the screen;
- add an aria-owns attribute on the "nearest" element in the text layer
  which points to the added editor.
2022-07-19 18:52:17 +02:00
calixteman
ad15532235
Merge pull request #15179 from calixteman/editor_cp
[Editor] Use serialized data when copying/pasting
2022-07-19 18:31:37 +02:00
Calixte Denizet
3c17dbb43e [Editor] Use serialized data when copying/pasting
- by using the global clipboard, it'll be possible to copy from a
  pdf and paste in another one;
- it'll allow to edit a previously created annotation;
- copy the editors in the current page.
2022-07-19 17:54:06 +02:00
Calixte Denizet
7024a53e79 [Editor] Simplify the way to create an editor on click
Previously, we had to set the #allowClick property by hand which was
a bit painful because it's easy to overlook one case or another.
So with this patch a new editor (for now FreeText one only because the
Ink one is a bit different) is created on the first click if none is selected
on mousedown, else the first click will just commit the data and then the
second will create a new editor.
2022-07-19 17:41:35 +02:00
Calixte Denizet
35671127d9 [Editor] Ink editor was too much translated after commit
The problem is clearly visible when the thickness is at max.
It's mainly because the thickness was not taken into account when
translating the div but it was when the line is drawn on the canvas.
2022-07-19 17:33:34 +02:00
Jonas Jenwald
37ebc28756 Use more for...of loops in the code-base
Note that these cases, which are all in older code, were found using the [`unicorn/no-for-loop`](https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/no-for-loop.md) ESLint plugin rule.
However, note that I've opted not to enable this rule by default since there's still *some* cases where I do think that it makes sense to allow "regular" for-loops.
2022-07-17 16:18:54 +02:00
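For illustration, the kind of rewrite this commit performs, next to a case where a "regular" for-loop arguably still makes sense. The data and variable names are made up.

```javascript
const glyphWidths = [500, 250, 333];

// Older style: the index is only used to read the current element.
let totalIndexed = 0;
for (let i = 0, ii = glyphWidths.length; i < ii; i++) {
  totalIndexed += glyphWidths[i];
}

// Equivalent `for...of` loop.
let totalForOf = 0;
for (const width of glyphWidths) {
  totalForOf += width;
}

// A case where a regular loop remains reasonable: stepping through a
// flat array two entries at a time.
const flatCoords = [0, 1, 10, 11];
const points = [];
for (let i = 0; i < flatCoords.length; i += 2) {
  points.push([flatCoords[i], flatCoords[i + 1]]);
}
```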
Jonas Jenwald
90bf659b29 [api-minor] Deprecate the SVG back-end 2022-07-16 10:24:24 +02:00
Jonas Jenwald
de7d1d2167
Merge pull request #15170 from calixteman/js_rm_null
[JS] Embedded JS scripts can have some null chars
2022-07-15 17:11:29 +02:00
Jonas Jenwald
acd61a138e Handle errors in the "Loading by ref" code-path in PartialEvaluator.loadFont
Note how we currently throw a "raw" Error, which is problematical since all of the `PartialEvaluator.loadFont` call-sites expect a Promise to be returned. Furthermore, this also means that we don't benefit from the fallback code-path that now exists below.

*Please note:* Unfortunately I don't have a test-case that fails without this patch, since it's something I happened to notice when reading the code while working on another patch.
2022-07-15 16:33:36 +02:00
Calixte Denizet
5f0c95e70e [JS] Embedded JS scripts can have some null chars 2022-07-15 16:05:25 +02:00
calixteman
41b2f52f70
Merge pull request #15157 from calixteman/1778484
Add unicode mapping in the font cmap to have correct chars when printing in pdf (bug 1778484)
2022-07-13 14:45:12 +02:00
Calixte Denizet
680c293c34 Add unicode mapping in the font cmap to have correct chars when printing in pdf (bug 1778484)
It aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1778484.
2022-07-13 14:38:27 +02:00
Jonas Jenwald
b4a3fd31c8
Merge pull request #15162 from Snuffleupagus/prefer-logical-operator-over-ternary
Enable the `unicorn/prefer-logical-operator-over-ternary` ESLint plugin rule
2022-07-13 09:50:04 +02:00
calixteman
1301b71b7c
Merge pull request #15163 from calixteman/prepare_touch
[Editor] Always have an ink editor (when in ink mode)
2022-07-12 19:35:32 +02:00
Calixte Denizet
2df2defa02 [Editor] Always have an ink editor (when in ink mode)
Previously it was created only on mouseover event but on a touch screen
there is no fingerover event...
The idea behind creating the ink editor on mouseover was to avoid to have
a canvas on each visible page.
So now, when the editor is created, the canvas has dimensions 1x1 and
only when the user starts drawing the dimensions are set to the page ones.
2022-07-12 19:18:37 +02:00
Jonas Jenwald
dcc73423e5 Enable the unicorn/prefer-logical-operator-over-ternary ESLint plugin rule
This leads to ever so slightly more compact code, and can in some cases remove the need for a temporary variable.

Please find additional information here:
https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-logical-operator-over-ternary.md
2022-07-12 10:52:37 +02:00
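A short sketch of the pattern the rule targets, including the temporary-variable case mentioned above. The helper `lookupTitle` and both wrapper functions are hypothetical.

```javascript
// Hypothetical lookup that may return an empty or missing value.
function lookupTitle(info) {
  return info.title;
}

// Flagged: the ternary repeats its test as the "truthy" result, which here
// also forces a temporary variable to avoid calling the lookup twice.
function titleWithTernary(info) {
  const title = lookupTitle(info);
  return title ? title : "Untitled";
}

// Preferred: a logical operator expresses the same fallback directly.
function titleWithLogicalOperator(info) {
  return lookupTitle(info) || "Untitled";
}
```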
calixteman
aa6512e70f
Merge pull request #15159 from calixteman/1778982
[Editor] Avoid to have the ink editor smaller than the resizer (bug 1778982)
2022-07-11 18:53:16 +02:00
Calixte Denizet
1b3c0f1799 [Editor] Avoid to have the ink editor smaller than the resizer (bug 1778982) 2022-07-11 17:00:34 +02:00
Jonas Jenwald
4b7bf74da2 Replace element ids with custom attributes in the xfaLayer
We want to avoid adding regular `id`s to xfaLayer-elements, since that means that they become "linkable" through the URL hash in a way that's not supported/intended. This could end up clashing with "named destinations", and that could easily lead to bugs; see issue 11499 and PR 11503 for some context.

Rather than using `id`s, we'll instead use a *custom* `data-element-id` attribute such that it's still possible to access the DOM-elements directly if needed. *Please note:* This is basically the xfaLayer-equivalent of PR 15057.
2022-07-10 15:44:54 +02:00
Tim van der Meij
220f980e12
Merge pull request #15145 from Snuffleupagus/deprecated-enhanceTextSelection
[api-minor] Deprecate the `enhanceTextSelection` functionality
2022-07-09 12:38:41 +02:00
calixteman
2b6a67c5d0
Merge pull request #15153 from calixteman/1778692
[Annotation] A push button can have no action (bug 1778692)
2022-07-08 21:06:53 +02:00
Calixte Denizet
8f26ba5487 [Annotation] A push button can have no action (bug 1778692) 2022-07-08 15:39:56 +02:00
Jonas Jenwald
4b493c2c33
Merge pull request #15152 from Snuffleupagus/validate-Resources
Ensure that the /Resources-entry is actually a dictionary (issue 15150)
2022-07-08 13:24:47 +02:00
Jonas Jenwald
c2f7942aea Ensure that the /Resources-entry is actually a dictionary (issue 15150)
Prevent issues in *corrupt* PDF documents, if the /Resources-entry is not of the correct and expected type.
2022-07-08 12:43:43 +02:00
Calixte Denizet
cec2172225 [Editor] Remove useless and faulty code when destroying the global manager 2022-07-08 11:49:19 +02:00
calixteman
657edb3892
Merge pull request #15147 from calixteman/editor_size
[Editor] Avoid to resize and redraw the ink canvas when it's useless
2022-07-07 19:49:59 +02:00
Jonas Jenwald
d6a75262d5
Merge pull request #15143 from bernatgy/typescript-compilation-fix
[jsdoc] failing typescript builds - wrong type
2022-07-07 19:27:53 +02:00
Calixte Denizet
9c4077ebc4 [Editor] Avoid to resize and redraw the ink canvas when it's useless
- and because of rounding errors it led to slightly resize again and again
the ink container;
- when zooming the size is changing but not the ratio, so in this case we
don't need to change the dimension of the container.
2022-07-07 18:39:20 +02:00
Calixte Denizet
edc9ad13bf [Editor] Change the cursor to a pen for the Ink editor 2022-07-07 18:23:59 +02:00
Bernát Gyovai
3d62f09fbd [jsdoc] failing typescript builds - wrong type
`HTMLSectionElement` is not part of the DOM, so the generated typescript definitions contain a non-existing type.

HTML Section elements have to be handled as simple `HTMLElements`.

fixing punctuation and lint problems
2022-07-07 17:03:46 +02:00
Jonas Jenwald
815c28da0e [api-minor] Deprecate the enhanceTextSelection functionality 2022-07-07 16:15:31 +02:00
Calixte Denizet
a4329d326c [Editor] Allow editors deletion on Backspace or Delete keys 2022-07-07 15:16:01 +02:00
Jonas Jenwald
345bb18575 [editor] Use the fit-curve package (issue 15004)
Rather than including all of this external code in the PDF.js repository, we should be using the npm package instead.
Unfortunately this is slightly more complicated than you'd hope, since the `fit-curve` package (which is older) isn't directly compatible with modern JavaScript modules.
In particular, the following cases needed to be considered:
 - For the development viewer (i.e. `gulp server`) and the unit-tests, we thus need to build a fitCurve-bundle that can be directly `import`ed.
 - For the actual PDF.js build-targets, we can slightly reduce the sizes by depending on the "raw" `fit-curve` source-code.
 - For the Node.js unit-tests, the `fit-curve` package can be used as-is.
2022-07-07 10:43:43 +02:00
Jonas Jenwald
bde46632d4
Merge pull request #15130 from calixteman/context_menu
[Editor] Dispatch an event when some global states are changing (bug 1777695)
2022-07-05 22:40:12 +02:00
Calixte Denizet
ec0f9f6dcf [Editor] Dispatch an event when some global states are changing
- this way the context menu in Firefox can take into account what we
  have in the clipboard, if an editor is selected, ...
- when the user will click on a context menu item, an action will be
  triggered, hence this patch adds what is required to handle it;
- some tests will be added in the Firefox' patch.
2022-07-05 22:12:56 +02:00
Jonas Jenwald
79cfc548fc Improve text-selection for Type3 fonts with bogus /FontBBox-entries (issue 14999)
This extends PR 13461, by also building a fallback bounding box for Type3 fonts that contain a much too small /FontBBox-entry.

*Please note:* While this patch improves things overall, copy-and-pasting still doesn't work perfectly for this document. In particular the lowercase letter "c" cannot be selected/copied, however this can be reproduced in both Adobe Reader and PDFium (in Google Chrome) too, which is caused by a lack of proper /ToUnicode-data in the PDF document.
2022-07-05 14:27:14 +02:00
Jonas Jenwald
552ee9decd Call AnnotationLayer.setDimensions as part of the render/update-methods (PR 15036 follow-up)
Rather than forcing the user to *manually* call `setDimensions`, which is also breaking any existing third-party code, it seems that we can simply let the `AnnotationLayer.{render, update}`-methods handle that internally.

As far as I can tell, based on testing manually in the viewer *and* running the browser-tests, everything still appears to work correctly with this patch.
2022-07-04 12:27:20 +02:00
Jonas Jenwald
ca8b112e8c
Merge pull request #15125 from Snuffleupagus/FileAttachmentAnnotationElement-trigger
Fix the Popup-trigger for `FileAttachmentAnnotationElement` (PR 15036 follow-up)
2022-07-04 11:00:24 +02:00
Calixte Denizet
ae2cf7e1e7 [Editor] Update the id for a l10n string 2022-07-04 10:18:42 +02:00
Jonas Jenwald
315f450b01 Fix the Popup-trigger for FileAttachmentAnnotationElement (PR 15036 follow-up)
After the changes in PR 15036, the trigger-element created in `FileAttachmentAnnotationElement.render` is now too small. This can be fixed by using the same approach as in PR 15065, and the patch can be tested using the `annotation-fileattachment.pdf` document in the test-suite.
2022-07-04 09:33:27 +02:00
Calixte Denizet
9723c5d377 [Editor] Handle correctly colors when saving a document in HCM
- for example in Dusk theme (Windows 11), black appears to be white, so
  the user will draw something in white. But if they want to print or
  save the used color must be black.
- fix a bug with the color input which only accepts hex string colors;
- adjust outline color of the selected/hovered editors in HCM.
2022-06-30 09:56:34 +02:00
Calixte Denizet
a694e360a4 [Editor] Allow to select a freetext editor when in ink mode
- and when in ink mode, change the toolbar active button when
  a freetext editor is edited.
2022-06-29 19:35:40 +02:00
Calixte Denizet
bc5b6cd08c [Editor] Set the freetext editor dimensions when changing the font size 2022-06-29 16:11:11 +02:00
Jonas Jenwald
4a4c6b9851 [editor] Introduce a proper annotationEditorMode option/preference (PR 15075 follow-up)
This replaces the boolean `annotationEditorEnabled` option/preference with a "proper" `annotationEditorMode` one. This way it's not only possible for the user to control if Editing is enabled/disabled, but also which *specific* Editing-mode should become enabled upon PDF document load.

Given that Editing is not enabled/released yet, I cannot imagine that changing the name and type of the option/preference should be an issue.
2022-06-29 11:35:58 +02:00
Calixte Denizet
1a3ef2a0aa [editor] Add some UI elements in order to set font size & color, and ink thickness & color 2022-06-28 12:05:04 +02:00
Calixte Denizet
3789dab307 Always flush the current item with MarkedContent stuff when getting text (#15094) 2022-06-25 17:19:57 +02:00
calixteman
23fcdabb37
Merge pull request #15088 from calixteman/editor_rotation
Support rotating editor layer
2022-06-25 16:18:07 +02:00
Calixte Denizet
0c420f5135 Support rotating editor layer
- As in the annotation layer, use percent instead of pixels as unit;
- handle the rotation of the editor layer by allowing editing when the rotation
  angle is not zero;
- the different editors are rotated counterclockwise in order to be usable
  when the main page is itself rotated;
- add support for saving/printing rotated editors.
2022-06-24 20:02:32 +02:00
Jonas Jenwald
cd35b9bfac
Merge pull request #15095 from Snuffleupagus/Annotation-OC
Add (basic) support for Optional Content in Annotations
2022-06-24 19:10:11 +02:00
calixteman
b5fea8ff14
Merge pull request #15093 from calixteman/issue15092
[JS] Update siblings when a field is updated after a calculation (#15092)
2022-06-24 16:17:59 +02:00
Jonas Jenwald
c48dc251e0 Add (basic) support for Optional Content in Annotations
Given that Annotations can also have an `OC`-entry, we need to take that into account when generating their operatorLists.

Note that in order to simplify the patch the `getOperatorList`-methods, for the Annotation-classes, were converted to be `async`.
2022-06-24 15:19:56 +02:00
Calixte Denizet
a334a21a1d [JS] Update siblings when a field is updated after a calculation (#15092) 2022-06-24 14:23:06 +02:00
Jonas Jenwald
3fab4af949
Merge pull request #15043 from Snuffleupagus/PrintAnnotationStorage
[api-minor] Introduce a `PrintAnnotationStorage` with *frozen* serializable data
2022-06-24 09:30:07 +02:00
Calixte Denizet
e49d039853 Correctly order added annotations when saving or printing
- the annotations must be rendered in chronological order, i.e. the order in which they were added.
- fix a bug in document.js which prevented a saved pdf from being read correctly in Acrobat:
  there is no need to reset the xref state: it's done in worker.js once everything
  has been saved.
2022-06-23 17:39:12 +02:00
Jonas Jenwald
1cc7cecc7b [api-minor] Introduce a PrintAnnotationStorage with *frozen* serializable data
Given that printing is triggered *synchronously* in browsers, it's thus possible for scripting (in PDF documents) to modify the Annotation-data while printing is currently ongoing.
To work-around that we add a new printing-specific `AnnotationStorage`, where the serializable data is *frozen* upon initialization, which the viewer can thus create/utilize during printing.
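
The snapshot idea described above can be illustrated with a minimal sketch. The class name `FrozenStorageSketch` and its shape are hypothetical and are not the actual `PrintAnnotationStorage` implementation; the sketch only assumes that the live data is a `Map` of plain, clonable objects.

```javascript
// Hypothetical sketch; NOT the actual PDF.js `PrintAnnotationStorage` class.
class FrozenStorageSketch {
  #serializable;

  constructor(liveMap) {
    // Copy the live data once, at initialization, so that later
    // modifications of the live storage cannot leak into the snapshot.
    const snapshot = new Map();
    for (const [key, value] of liveMap) {
      // `Object.freeze` is shallow; it guards the top level of each copy.
      snapshot.set(key, Object.freeze(structuredClone(value)));
    }
    this.#serializable = snapshot;
  }

  get serializable() {
    return this.#serializable;
  }
}

const live = new Map([["field1", { value: "before print" }]]);
const frozen = new FrozenStorageSketch(live);

// Simulate scripting that modifies the data while printing is ongoing.
live.get("field1").value = "changed during print";

console.log(frozen.serializable.get("field1").value); // prints "before print"
```

Taking the copy in the constructor keeps the cost to a single clone per print job, at the price of ignoring any edits made after printing has started, which is the intended behaviour here.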
2022-06-23 17:06:46 +02:00
Calixte Denizet
30c63eb0ec [Editor] Add support for printing newly added FreeText annotations 2022-06-22 13:26:09 +02:00
Jonas Jenwald
eca939d904
Merge pull request #15076 from Snuffleupagus/prefer-array-index-of
Enable the `prefer-array-index-of` ESLint plugin rule
2022-06-21 18:57:51 +02:00
Calixte Denizet
f27c8c4471 [Editor] Add support for printing newly added Ink annotations 2022-06-21 18:21:49 +02:00
calixteman
8d466f5dac
Merge pull request #15060 from calixteman/annotation_rotation
Rotate annotations based on the MK::R value (bug 1675139)
2022-06-21 18:03:09 +02:00
Calixte Denizet
cdc58b7a52 Rotate annotations based on the MK::R value (bug 1675139)
- it aims to fix: https://bugzilla.mozilla.org/show_bug.cgi?id=1675139;
- An annotation can be rotated (counterclockwise);
- the rotation can be set using JS.
2022-06-21 17:57:26 +02:00
Jonas Jenwald
1c9a702f73 Enable the prefer-array-index-of ESLint plugin rule
https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-array-index-of.md
2022-06-21 16:54:32 +02:00
Jonas Jenwald
db6f675baa
Merge pull request #15069 from Snuffleupagus/annotationLayer-dimensions
Ensure that the annotationLayer has the correct dimensions (PR 15036 follow-up)
2022-06-21 15:57:21 +02:00
calixteman
6ee538e0ba
Merge pull request #15074 from calixteman/ink3
Only activate ink editor if none is selected
2022-06-20 22:45:36 +02:00
Calixte Denizet
c44ab94d28 Only activate ink editor if none is selected 2022-06-20 22:24:04 +02:00
Jonas Jenwald
aa3fc5844a
Merge pull request #15062 from Snuffleupagus/save-newRefs
Simplify the `newRefs` computation in the "SaveDocument"-handler in the worker-thread
2022-06-20 22:18:43 +02:00
Jonas Jenwald
7cce3fb6ff Ensure that the annotationLayer has the correct dimensions (PR 15036 follow-up)
Note how the "page"-div, "canvasWrapper"-div, and `textLayer`-div all have *integer* dimensions (rounded down) rather than using the "raw" viewport-dimensions.
Hence it seems reasonable that the same should apply to the "annotationLayer"-div, now that it has explicit dimensions set.
2022-06-20 09:38:46 +02:00
Jonas Jenwald
8d154d7f6a
Merge pull request #15064 from calixteman/rescale_followup
Avoid having overflowing sections (#15036 follow-up)
2022-06-20 09:33:13 +02:00
Calixte Denizet
2ff65dd514 Popup trigger area must fill its parent (fix #15063) 2022-06-19 22:44:58 +02:00
Calixte Denizet
af47a0b7e0 Avoid having overflowing sections (#15036 follow-up) 2022-06-19 22:09:02 +02:00
Jonas Jenwald
57c10ac213 Simplify the newRefs computation in the "SaveDocument"-handler in the worker-thread
- Let the `Page.save`-method filter out "empty" entries, similar to the `Page._parsedAnnotations`-getter, since that on its own already simplifies the "SaveDocument"-handler a tiny bit.

 - The existing `reduce` and `concat` construction isn't exactly a wonder of readability :-)
   Thanks to modern JavaScript features it should be possible to replace all of this with `Array.prototype.flat()` instead, which at least to me feels a lot easier to understand.
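
As a rough illustration of the second point, the following sketch compares the two styles. The `perPage` data is invented for the example and does not reflect the real shape of the "SaveDocument" results.

```javascript
// Hypothetical per-page save results: an array of refs, or null when
// a page has nothing to save.
const perPage = [[{ ref: "1R" }], null, [{ ref: "2R" }, { ref: "3R" }]];

// Older style: `reduce` combined with `concat`.
const viaReduce = perPage
  .filter(Boolean)
  .reduce((acc, refs) => acc.concat(refs), []);

// Newer style: drop the "empty" entries, then flatten one level.
const viaFlat = perPage.filter(Boolean).flat();

console.log(viaFlat.map(entry => entry.ref)); // prints [ '1R', '2R', '3R' ]
```

Note that `flat()` only unwraps nested arrays; a `null` entry would be kept as-is, which is why the "empty" entries still need to be filtered out first.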
2022-06-19 18:21:51 +02:00
Jonas Jenwald
c21f4faaf8 Reduce unnecessary usage of Array.prototype.concat()
There are obviously cases where using `concat` makes perfect sense, since that method doesn't change any of the existing Arrays; see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/concat

However, in a few cases throughout the code-base that's not an issue and using `concat` only leads to unnecessary intermediate allocations. With modern JavaScript we can thus replace those with a combination of `push` and spread-syntax, which wasn't originally possible when the code was written.
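
A small sketch of the difference, using made-up array contents:

```javascript
const extra = ["closePath", "stroke"];

// `concat` leaves the original untouched and allocates a new Array.
const operators = ["moveTo", "lineTo"];
const combined = operators.concat(extra);

// When mutating the existing Array is acceptable, append in place instead.
const target = ["moveTo", "lineTo"];
target.push(...extra);

console.log(operators.length, combined.length, target.length); // prints 2 4 4
```

One caveat: spreading a very large Array into `push` passes every element as a function argument, which can exceed engine limits, so this pattern suits small to medium inputs.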
2022-06-19 13:40:52 +02:00
Calixte Denizet
e2db9bacef Get rid of CSS transform on each annotation in the annotation layer
- each annotation has its coordinates/dimensions expressed in percentage,
  hence it's correctly positioned whatever the scale factor is;
- the font sizes are expressed in percentage too and the main font size
  is scaled thanks to a CSS var (--scale-factor);
- the rotation is now applied on the div annotationLayer;
- this patch improves the rendering of some strings where the glyph spacing
  was not correct (it's a Firefox bug);
- it helps to simplify the code and it should slightly improve the update of
  page (on zoom or rotation).
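
The percentage-based positioning from the first bullet could look roughly like the sketch below. The helper `rectToPercentStyle` is hypothetical, and it assumes a PDF rectangle `[x1, y1, x2, y2]` with a bottom-left origin on a page whose view box starts at `(0, 0)`.

```javascript
// Hypothetical helper; not part of the PDF.js code-base.
function rectToPercentStyle([x1, y1, x2, y2], pageWidth, pageHeight) {
  return {
    left: `${(100 * x1) / pageWidth}%`,
    // CSS measures `top` from the upper edge, PDF from the lower one.
    top: `${(100 * (pageHeight - y2)) / pageHeight}%`,
    width: `${(100 * (x2 - x1)) / pageWidth}%`,
    height: `${(100 * (y2 - y1)) / pageHeight}%`,
  };
}

const style = rectToPercentStyle([100, 600, 300, 700], 500, 800);
console.log(style);
// prints { left: '20%', top: '12.5%', width: '40%', height: '12.5%' }
```

Because every value is relative to the page, the same style stays valid at any zoom level; only the container's size has to change.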
2022-06-18 17:54:59 +02:00
Jonas Jenwald
03757d82b7 Replace element ids with custom attributes for Widget-annotations (issue 15056)
We want to avoid adding regular `id`s to Annotation-elements, since that means that they become "linkable" through the URL hash in a way that's not supported/intended. This could end up clashing with "named destinations", and that could easily lead to bugs; see issue 11499 and PR 11503 for some context.

Rather than using `id`s, we'll instead use a *custom* `data-element-id` attribute such that it's still possible to access the Annotation-elements directly.
Unfortunately these changes required updating most of the integration-tests, and to reduce the amount of repeated code a couple of helper functions were added.
2022-06-18 16:43:05 +02:00
Jonas Jenwald
be2dfe45f9
Merge pull request #15035 from Snuffleupagus/prefer-modern-dom-apis-2
Use modern DOM methods a bit more (PR 15031 follow-up)
2022-06-17 19:37:43 +02:00
Calixte Denizet
7e3941da9d [JS] Hide field borders and buttons (#15053)
- Since the border belongs to the section containing the HTML
  counterpart of an annotation, this section must be hidden when
  a JS action requires it;
- it wasn't possible to hide a button using JS.
2022-06-17 17:36:38 +02:00
calixteman
b8688128e3
Merge pull request #15050 from calixteman/make_ink_better
[Editor] - Add the ability to directly draw after selecting ink tool
2022-06-16 20:34:56 +02:00
Calixte Denizet
e7dc1ef4f3 [Editor] - Add the ability to directly draw after selecting ink tool
- Right now, we must select the tool, then click to select a page and
  click to start drawing and it's a bit painful;
- so just create a new ink editor when we're hovering a page without one.
2022-06-16 19:53:07 +02:00
Jonas Jenwald
64cce1269e Add basic support for non-embedded ArialUnicodeMS fonts (issue 15044)
This appears to be a Microsoft-specific version of the regular Arial font, hence we simply map this to Helvetica in the same way that we treat many other Arial-named fonts.
2022-06-15 10:37:20 +02:00
Jonas Jenwald
4902ad8923 Use modern DOM methods a bit more (PR 15031 follow-up)
Apparently the ESLint rule added in PR 15031 wasn't able to catch all cases that can be converted, which is probably not all that surprising given how some of these call-sites look.

 - Use `Element.prepend()` to insert nodes before all other ones in the element, rather than using `firstChild` with `insertBefore`-calls; see https://developer.mozilla.org/en-US/docs/Web/API/Element/prepend

 - Fix one *incorrect* `insertBefore` call, in the AnnotationLayer-code.
   Initially the patch simply changed that to an `Element.before()`-call, however that broke one of the integration-tests. It turns out that the `index` may try to access a non-existent select-child, which triggers undefined behaviour; note the warning in https://developer.mozilla.org/en-US/docs/Web/API/Node/insertBefore#parameters
2022-06-13 10:47:37 +02:00
Jonas Jenwald
2dca14028d Extend getGlyphMapForStandardFonts with some Hebrew entries (issue 15033)
This only adds the minimum entries required in order to render the referenced document correctly, rather than trying to support "all" Hebrew glyphs, to ensure that all lines in `getGlyphMapForStandardFonts` are covered by tests.
2022-06-13 10:08:39 +02:00
Tim van der Meij
1a6ae5f034
Merge pull request #15031 from Snuffleupagus/prefer-modern-dom-apis
Enable the `unicorn/prefer-modern-dom-apis` ESLint plugin rule
2022-06-12 20:36:40 +02:00
Tim van der Meij
720f77c7cd
Merge pull request #15028 from Snuffleupagus/update-compat
[api-minor] Update the minimum supported browsers/environments
2022-06-12 20:12:33 +02:00
Tim van der Meij
26ae50e449
Merge pull request #15023 from Snuffleupagus/prefer-array-flat
Enable the `unicorn/prefer-array-flat` and `unicorn/prefer-array-flat-map` ESLint plugin rules
2022-06-12 20:10:52 +02:00
Jonas Jenwald
4d39898823 Enable the unicorn/prefer-modern-dom-apis ESLint plugin rule
This rule will help enforce slightly shorter code, and according to MDN both `Element.replaceWith()` and `Element.before()` are available in all browsers that we currently support.

Please find additional information here:
 - https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-modern-dom-apis.md
 - https://developer.mozilla.org/en-US/docs/Web/API/Element/replaceWith
 - https://developer.mozilla.org/en-US/docs/Web/API/Element/before
2022-06-12 20:05:05 +02:00
Jonas Jenwald
8129815538 Enable the unicorn/prefer-dom-node-append ESLint plugin rule
This rule will help enforce slightly shorter code, especially since you can insert multiple elements at once, and according to MDN `Element.append()` is available in all browsers that we currently support.

Please find additional information here:
 - https://developer.mozilla.org/en-US/docs/Web/API/Element/append
 - https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-dom-node-append.md
2022-06-12 13:07:03 +02:00
Jonas Jenwald
2f0ed3a9ca [api-minor] Update the minimum supported browsers/environments
*Please note:* The dates below are still a little ways off, however that obviously won't affect the existing PDF.js releases. Hence I think that we can make these changes now, since by the time of the *next* official PDF.js release they'll likely match up pretty well.[1]

While we "support" some (by now) fairly old browsers, that essentially means that the library (and viewer) will load and that the basic functionality will work as intended.[2]
However, in older browsers, some functionality may not be available and generally we'll ask users to update to a modern browser when bugs (specific to old browsers) are reported.[3]

Since we've previously settled on only supporting browsers/environments that are approximately *three years old*, this patch updates the minimum supported browsers/environments as follows:
 - Chrome 76, which was released on 2019-07-30; see https://en.wikipedia.org/wiki/Google_Chrome_version_history
 - Firefox ESR (as before); see https://wiki.mozilla.org/Release_Management/Calendar
 - Safari 13, which was released on 2019-09-19; see https://en.wikipedia.org/wiki/Safari_version_history#Safari_13
 - Node.js 14, which was released on 2020-04-21 (all older versions have reached EOL); see https://en.wikipedia.org/wiki/Node.js#Releases

---
[1] Given that the releases usually happen every two to three months.

[2] Assuming that a `legacy/`-build is being used, of course.

[3] In general it's never a good idea to use old/outdated browsers, since those may contain *known* security vulnerabilities.
2022-06-11 16:50:01 +02:00
Jonas Jenwald
4b2526ebf2 Remove superfluous trailing arguments from parseFloat-calls (PR 14978 follow-up)
Fixes two recent "Code scanning alerts" on GitHub, which likely happened because these calls originally used `parseInt` instead (during initial development).
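
For context, a short sketch of why the trailing argument is superfluous; the input strings are invented:

```javascript
// `parseInt` takes a radix as its second argument...
const asInt = parseInt("12px", 10);

// ...but `parseFloat` only accepts the string; anything else is ignored.
const withExtra = parseFloat("1.5em", 10);
const withoutExtra = parseFloat("1.5em");

console.log(asInt, withExtra, withoutExtra); // prints 12 1.5 1.5
```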
2022-06-11 15:11:34 +02:00
Jonas Jenwald
bbf857d635 [api-minor] Stop using the beginAnnotations/endAnnotations operators (PR 14998 follow-up)
After the changes in PR 14998, these operators are now no-ops in the `src/display/canvas.js` code and should no longer be necessary.
Given that `beginAnnotations`/`endAnnotations` are not in the PDF specification, but are rather *custom* PDF.js operators, it seems reasonable to stop using them now that they've become no-ops.
2022-06-11 14:21:26 +02:00
Jonas Jenwald
010d996b74 Enable the unicorn/prefer-array-flat and unicorn/prefer-array-flat-map ESLint plugin rules
These rules will help enforce shorter and more readable code, and according to MDN these Array-methods are available in all browsers/environments that we currently support:
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/flat#browser_compatibility
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/flatMap#browser_compatibility

Please find additional information about these ESLint rules here:
 - https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-array-flat.md
 - https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-array-flat-map.md
2022-06-11 11:33:43 +02:00
Tim van der Meij
a57a4bc6c2
Merge pull request #15018 from Snuffleupagus/issue-15016
Expose `TextLayerRenderTask` in the TypeScript definitions (issue 15016, PR 14013 follow-up)
2022-06-10 22:18:35 +02:00
Tim van der Meij
f0b5aee6b8
Merge pull request #15014 from Snuffleupagus/prefer-at
Enable the `unicorn/prefer-at` ESLint plugin rule (PR 15008 follow-up)
2022-06-10 22:12:35 +02:00
Jonas Jenwald
e046b811b7 Expose TextLayerRenderTask in the TypeScript definitions (issue 15016, PR 14013 follow-up)
While `TextLayerRenderTask` apparently makes sense in TypeScript environments, given that it's being returned by the `renderTextLayer`-function in the API, we really don't want to extend the *public* API by simply exporting the class directly in `src/pdf.js` since it should never be called/initialized manually.
Hence we follow the same pattern as in PR 14013, and add some very basic unit-tests to ensure that `renderTextLayer` always returns a `TextLayerRenderTask`-instance as expected.
2022-06-10 22:12:32 +02:00
calixteman
6e6d94ab8d
Merge pull request #15020 from calixteman/1773680
Add an empty entry in combo list when nothing is selected (bug 1773680)
2022-06-10 19:18:31 +02:00
Calixte Denizet
bfe816d0d2 Add an empty entry in combo list when nothing is selected (bug 1773680)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1773680
- the empty entry is removed once something is selected.
2022-06-10 18:45:02 +02:00
jerry1100
b716e82d18 Extend TextLayerRenderParameters.container type to include HTMLElement.
In PR #14717, the type was changed from a HTMLElement to a DocumentFragment.
This broke TypeScript projects that use a HTMLElement container.

To remedy this, we extend the type of container to also include HTMLElement.
2022-06-10 06:50:47 -07:00
Jonas Jenwald
9ac4536693 Enable the unicorn/prefer-at ESLint plugin rule (PR 15008 follow-up)
Please find additional information here:
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/at
 - https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-at.md
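
A minimal sketch of the pattern that this rule targets, with an invented Array:

```javascript
const stack = ["save", "transform", "restore"];

// Before: index arithmetic to reach the last element.
const lastOld = stack[stack.length - 1];

// After: negative indices count from the end.
const lastNew = stack.at(-1);

console.log(lastOld === lastNew, [].at(-1)); // prints true undefined
```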
2022-06-09 21:21:19 +02:00
calixteman
5d88233fbb
Merge pull request #15006 from calixteman/ink2
[editor] Add support for saving newly added Ink
2022-06-09 21:13:53 +02:00
calixteman
61a65344a3
Merge pull request #14989 from calixteman/ink1
[editor] Add an Ink editor
2022-06-09 21:13:06 +02:00
Calixte Denizet
c161a86ba1 [editor] Add an Ink editor
- Approximate the drawn curve by a set of Bezier curves using
  JS code from https://github.com/soswow/fit-curves.
  The code has been slightly modified in order to make the linter
  happy.
2022-06-09 19:35:59 +02:00
Jonas Jenwald
3d244cb6a8 Render PopupAnnotations even if they have missing or empty /Rect-entries (issue 15012, PR 14439 follow-up)
This only applies to *corrupt* PDF documents, where Annotations are missing the required /Rect-entry. Rendering PopupAnnotations unconditionally shouldn't be a problem, since we're not using a `BaseSVGFactory`-instance in that case.
2022-06-09 15:10:54 +02:00
Jonas Jenwald
66bbc0e7ee Call WidgetAnnotation._getTextWidth correctly from the ChoiceWidgetAnnotation-class (PR 14720 follow-up)
In the "no fontSize available" code-path, in the `ChoiceWidgetAnnotation._getAppearance` method, we don't provide the necessary second argument when calling the `_getTextWidth`-method which will cause errors to be thrown.
2022-06-09 10:11:01 +02:00
Jonas Jenwald
b5cad9be03 Fix a bug in the ColorConverters.CMYK_HTML method (PR 12631 follow-up)
Because of a small oversight, this method accidentally handled the intermediate array incorrectly.
2022-06-09 10:03:36 +02:00
Calixte Denizet
36aae436bf [editor] Add support for saving newly added Ink 2022-06-08 22:16:01 +02:00
Jonas Jenwald
9e24a1660e Polyfill Array.prototype.at with core-js (PR 14976 follow-up)
This Array-method is a fairly new addition to the ECMAScript specification, hence we need a polyfill to avoid the library/viewer breaking in older browsers.

Please find additional information at:
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/at
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/at#browser_compatibility
2022-06-08 22:10:59 +02:00
calixteman
2fbf14ace8
Merge pull request #14978 from calixteman/editor2
[editor] Add support for saving a newly added FreeText
2022-06-08 15:51:03 +02:00
Calixte Denizet
7773b3f5be [edition] Add support for saving a newly added FreeText 2022-06-08 14:34:09 +02:00
Calixte Denizet
2dd0c861bf Outline fields which are required (bug 1724918)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1724918;

- it applies for both Acroform and XFA.
2022-06-07 17:02:11 +02:00
Calixte Denizet
96d0d22d66 Reset all the canvas states after rendering each annotation (#14105)
- each annotation must be rendered independently of the others. So
  after having rendered each annotation, the canvas states are reset
  in order to have something clean to render the next one.
2022-06-07 14:59:02 +02:00
Tim van der Meij
135b9fbcfb
Merge pull request #14994 from Snuffleupagus/parseJbig2-conditional
Conditionally bundle `gulp image_decoders`-specific code in `src/core/jbig2.js` (PR 9729 follow-up)
2022-06-06 12:28:53 +02:00
Jonas Jenwald
e82ad79eb9 Conditionally bundle gulp image_decoders-specific code in src/core/jbig2.js (PR 9729 follow-up)
This method/function was added only for the `gulp image_decoders`-builds, and is completely unused elsewhere (e.g. in the Firefox PDF Viewer).
While this only reduces the size of the *built* `pdf.worker.js` file by a little over 1 kB, it can't hurt to remove completely unused code from the "normal" builds.
2022-06-05 15:38:28 +02:00
Jonas Jenwald
51c47acb41 [editor] Update the AnnotationStorage.hash-getter to support editing
While calling `JSON.stringify(...)` on a class-instance obviously "works" (as in it doesn't throw), since it's really just an Object, it doesn't really make much sense in the context of the `AnnotationStorage.hash`-getter.

Also, access the *inverse* Viewport-transform correctly in `FreeTextEditor.serialize` to prevent errors being thrown when that method is invoked.

Finally, slightly updates the `AnnotationStorage.serializable`-getter to improve consistency within the class.
2022-06-05 14:05:44 +02:00
Jonas Jenwald
59dd4ea2b0 Lookup image-data correctly in paintImageMaskXObjectGroup (issue 14990)
*This fixes a regression from PR 14754.*

We didn't lookup the image-data correctly, with the result that we tried to render some ImageMasks using a string rather than the intended TypedArray. To make matters worse, this code-path was apparently not *properly* covered by existing test-cases.
2022-06-05 12:39:23 +02:00
Jonas Jenwald
51bf928061 [editor] A couple of small FreeText-related fixes (PR 14976 follow-up)
 - Ensure that the modified-warning won't be displayed, when navigating away from the viewer, if the user has added custom Annotations and then *removed all* of them.
 - Ensure that the *initial* editor-buttons state, i.e. the `toggled`-class, is correctly displayed in the toolbar when the viewer loads.
 - Tweak the CSS-classes for the editor-buttons, such that they use the correct focus/hover-rules (similar to the sidebar-buttons).
 - Remove a no longer accurate comment from the `BaseViewer.annotationEditorMode`-setter.
 - Address a couple of *smaller* outstanding review comments, including some re-formatting changes, from PR 14976.
2022-06-04 21:48:11 +02:00
Calixte Denizet
be1aa11986 [edition] Add a FreeText editor (#14970)
- add a basic UI to edit some text in a pdf;
- an editor can be moved, deleted, cut, copied, pasted, selected;
- add an undo/redo manager.
2022-06-04 18:20:11 +02:00
Jonas Jenwald
7e852851fd A small memory-usage improvement for PDF documents opened from TypedArray-data
This patch contains a small optimization specifically for the case when `getDocument` is called with TypedArray-data. In that case we'll still hold onto that data, which could obviously be large, even after the "GetDocRequest"-message has been sent to the worker-thread.

In practice this will most likely not affect memory usage in any noticeable way, since the application calling `getDocument` will probably also be keeping a reference to the TypedArray-data. However, it seems like a good idea to ensure that the PDF.js API *itself* won't unnecessarily keep this data alive.
2022-05-29 16:37:18 +02:00
Calixte Denizet
66b513fc00 [Annotations] Show buttons even if they've no actions
- it's a regression from PR #14247:
 - before the PR, the button was rendered on the canvas whatever its status was;
 - after the PR, the button image has been moved to another canvas so when the button is not renderable
   (because it has no actions) then the image is not added to the HTML element.
- the buttons in the pdf in bug 1737260 or in the pdf in #14308 were not visible
- make the button always renderable but don't add the link element if it's useless.
2022-05-28 23:50:50 +02:00
Calixte Denizet
9d82106d20 Set the text fields font size based on their height
- right now we're using the font size from the pdf itself but we use another font
  in the annotation layer. So this size doesn't really make sense and leads to bad
  rendering (see pdf in #14928);
- use a sans-serif font for the fields containing text (fix issue #14736);
- remove useless padding in text-based fields (fix issue #14301);
- text fields allow/disallow scrolling bars (see bit 24 in Ff entry), so use this
  value to hide/show scrollbars in annotation layer.
2022-05-28 18:00:39 +02:00
Jonas Jenwald
5a2899c57e Skip bogus d1 operators in Type3-glyphs (issue 14953)
In the `src/display/canvas.js` code the `d1` operator will be used to set the clipping region, and it obviously cannot be empty since that prevents the Type3-glyph from rendering.

Also, the patch removes an outdated comment; refer to PR 12718.
2022-05-24 12:20:31 +02:00
Calixte Denizet
9407adc416 [JS] Format all the fields if any when the document is open (bug 1766987)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1766987.
2022-05-22 15:50:42 +02:00
Calixte Denizet
60498c67e4 Display background when printing or saving a text widget (issue #14928) 2022-05-19 16:41:54 +02:00
Jonas Jenwald
5a774b7ed3 Adjust the heuristics for handling of incomplete path operators (issue 14917)
This limits the heuristics for handling of incomplete path operators, see PR 9838, to only apply to *sequences* of such operators. In practice a couple of invalid path operators are (hopefully) unlikely to completely break rendering, whereas a sequence of them will easily lead to fairly chaotic rendering artifacts.
2022-05-15 11:24:39 +02:00
Jonas Jenwald
d540df0582 Use TypedArray.prototype.fill() a bit more in the code-base
Please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/TypedArray/fill, which is implemented in all browsers that we currently support.
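
As an illustration of the kind of replacement this enables (sizes and values invented):

```javascript
const SIZE = 8;

// Before: a manual loop to set a range of bytes.
const viaLoop = new Uint8Array(SIZE);
for (let i = 2; i < 6; i++) {
  viaLoop[i] = 0xff;
}

// After: `fill(value, start, end)`, where `end` is exclusive.
const viaFill = new Uint8Array(SIZE).fill(0xff, 2, 6);

console.log(Array.from(viaFill)); // prints [ 0, 0, 255, 255, 255, 255, 0, 0 ]
```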
2022-05-13 12:42:51 +02:00
Tim van der Meij
fbc7981c98
Merge pull request #14894 from Snuffleupagus/rm-mozOpaque
Try to remove the `mozOpaque` canvas-property (PR 6551 follow-up)
2022-05-12 22:03:05 +02:00
Jonas Jenwald
6bcc5b615d [api-minor] Include line endings in Line/Polyline Annotation-data (issue 14896)
Please refer to:
 - https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2109792
 - https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2096489
 - https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2096447

Note that we still won't attempt to use the /LE-data when creating fallback appearance streams, as mentioned in PR 13448, since custom line endings aren't common enough to warrant the added complexity.
Finally, note that according to the PDF specification we should *potentially* also take the line endings into account for FreeText Annotations. However, in that case their use is conditional on other parameters that we currently don't support.
2022-05-12 11:08:30 +02:00
Jonas Jenwald
af5789125f Try to remove the mozOpaque canvas-property (PR 6551 follow-up)
According to MDN, see https://developer.mozilla.org/en-US/docs/Web/API/HTMLCanvasElement/mozOpaque, the `mozOpaque` canvas-property is not only non-standard (obviously) but it's also been deprecated.
Instead it's recommended to use `alpha = false` when getting the canvas-context, see https://developer.mozilla.org/en-US/docs/Web/API/HTMLCanvasElement/getContext#contextattributes, which all of our affected code is already doing.
2022-05-09 13:03:08 +02:00
Jonas Jenwald
38c82357b2
Merge pull request #14890 from calixteman/14889
[JS] Formatted value has to be a string when neither null nor undefined
2022-05-08 17:25:29 +02:00
Calixte Denizet
ab3958d6e8 [JS] Formatted value has to be a string when neither null nor undefined 2022-05-08 16:43:57 +02:00
Jonas Jenwald
472a1f9c91 Ignore pageColors when the background/foreground is identical (PR 14874 follow-up)
If the computed background/foreground colors are identical, the `canvas` would be rendered mostly blank with only images visible. Hence it seems reasonable to also ignore the `pageColors`-option in this case.

Also, the patch tries to *briefly* outline the various cases in which we ignore the `pageColors`-option in a comment.
2022-05-08 11:40:50 +02:00
Tim van der Meij
f8838eb794
Merge pull request #14882 from Snuffleupagus/issue-14881
Add support for TrueType format 12 `cmap`s (issue 14881)
2022-05-07 11:45:54 +02:00
Jonas Jenwald
7f40ef41a5 Simplify the "fileattachmentannotation"-event handling a little bit
*This patch can be tested, in the viewer, using the `annotation-fileattachment.pdf` document from the test-suite.*

Note how the `FileSpec`-implementation already uses `stringToPDFString` during the filename lookup, see cfac6fa511/src/core/file_spec.js (L70)
Hence there's no reason to repeat that again in the `FileAttachmentAnnotationElement`-constructor, and we can thus simplify the "fileattachmentannotation"-event handling a little bit.
2022-05-06 20:55:18 +02:00
Jonas Jenwald
6e7e9d83d8 Add support for TrueType format 12 cmaps (issue 14881)
This is, as far as I can tell, the first case we've seen of a format 12 `cmap`.
Please see https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6cmap.html
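
For readers unfamiliar with the layout, here is a minimal sketch of reading a format 12 subtable, following the structure in the reference above (a 16-byte header followed by 12-byte groups, all big-endian). The function `parseCmapFormat12` is written for this example only and is not the code added by the patch.

```javascript
// Hypothetical parser; NOT the PDF.js implementation.
function parseCmapFormat12(view, offset = 0) {
  const format = view.getUint16(offset);
  if (format !== 12) {
    throw new Error(`Unexpected cmap format: ${format}`);
  }
  // offset + 2: reserved (uint16), offset + 4: length (uint32),
  // offset + 8: language (uint32), offset + 12: number of groups (uint32).
  const numGroups = view.getUint32(offset + 12);
  const mappings = [];
  let pos = offset + 16;
  for (let i = 0; i < numGroups; i++, pos += 12) {
    const startCharCode = view.getUint32(pos);
    const endCharCode = view.getUint32(pos + 4);
    const startGlyphId = view.getUint32(pos + 8);
    // Glyph ids increase sequentially within a group. Real code should
    // validate the range first, since corrupt fonts may contain huge groups.
    for (let code = startCharCode; code <= endCharCode; code++) {
      mappings.push({
        charCode: code,
        glyphId: startGlyphId + (code - startCharCode),
      });
    }
  }
  return mappings;
}

// Build a tiny subtable with one group: U+1F600..U+1F602 -> glyphs 5..7.
const buffer = new ArrayBuffer(16 + 12);
const view = new DataView(buffer);
view.setUint16(0, 12); // format
view.setUint16(2, 0); // reserved
view.setUint32(4, buffer.byteLength); // length
view.setUint32(8, 0); // language
view.setUint32(12, 1); // number of groups
view.setUint32(16, 0x1f600); // startCharCode
view.setUint32(20, 0x1f602); // endCharCode
view.setUint32(24, 5); // startGlyphID

const cmapMappings = parseCmapFormat12(view);
console.log(cmapMappings.length, cmapMappings[2].glyphId); // prints 3 7
```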
2022-05-06 11:11:38 +02:00
calixteman
cfac6fa511
Merge pull request #14874 from calixteman/colors
[api-minor] Improve pdf reading in high contrast mode
2022-05-05 21:48:19 +02:00
Calixte Denizet
c8afd6ce8c [api-minor] Improve pdf reading in high contrast mode
- Use Canvas & CanvasText color when they don't have their default value
  as background and foreground colors.
- The colors used to draw (stroke/fill) in a pdf are replaced by the bg/fg
  ones according to their luminance.
2022-05-05 16:34:51 +02:00
Tim van der Meij
899e4d58d6
Merge pull request #14870 from Snuffleupagus/isNodeJS-cleanup
Only bundle the `src/display/node_utils.js` file in GENERIC-builds
2022-05-04 22:38:21 +02:00
Jonas Jenwald
8267fd8a52 Replace the AnnotationStorage.lastModified-getter with a proper hash-method
The current `lastModified`-getter, which only contains a time-stamp, is a fairly crude way of detecting if the stored data has actually been changed. In particular, when the `getRawValue`-method is used, the `lastModified`-getter doesn't cope with data being modified from the "outside".

To fix these issues[1], and to prevent any future bugs in this code, this patch introduces a new `AnnotationStorage.hash`-getter which computes a hash of the currently stored data. To simplify things this re-uses the existing `MurmurHash3_64`-implementation, which required moving that file into the `src/shared/`-folder, since its performance should be good enough here.

---
[1] Given how the `AnnotationStorage.lastModified`-getter was used, this would have been limited to *printing* of forms.
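
To make the difference concrete, here is a sketch of a content-based hash getter. `StorageSketch` is a hypothetical stand-in for `AnnotationStorage`, and a simple 32-bit FNV-1a hash is swapped in for `MurmurHash3_64` purely to keep the example self-contained.

```javascript
// FNV-1a (32-bit) over UTF-16 code units; a stand-in for MurmurHash3_64.
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Hypothetical sketch; NOT the actual PDF.js `AnnotationStorage` class.
class StorageSketch {
  #map = new Map();

  setValue(key, value) {
    this.#map.set(key, value);
  }

  getRawValue(key) {
    return this.#map.get(key);
  }

  // Derived from the stored data itself, not from a time-stamp.
  get hash() {
    let serialized = "";
    for (const [key, value] of this.#map) {
      serialized += `${key}:${JSON.stringify(value)}`;
    }
    return fnv1a(serialized);
  }
}

const storage = new StorageSketch();
storage.setValue("field1", { value: "a" });
const before = storage.hash;

// Modify the data from the "outside", bypassing `setValue`.
storage.getRawValue("field1").value = "b";
const after = storage.hash;

console.log(before !== after); // prints true
```

A time-stamp that is only updated in `setValue` would have missed the second modification, whereas a hash computed on demand picks it up.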
2022-05-04 15:21:30 +02:00
Jonas Jenwald
8135d7ccf6
Merge pull request #14869 from calixteman/14862
[JS] Fix few bugs present in the pdf for issue #14862
2022-05-03 18:31:31 +02:00
Calixte Denizet
094ff38da0 [JS] Fix few bugs present in the pdf for issue #14862
- since the resetForm function resets field values, a calculateNow is consequently triggered.
  But the calculate callback can itself call resetForm, hence an infinite recursive loop.
  So basically, prevent calculateNow from being triggered by itself.
- in Firefox, the letters entered in some fields were duplicated: "AaBb" instead of "AB".
  It was mainly because beforeInput was triggering a Keystroke which was itself triggering
  an input value update and then the input event was triggered.
  So in order to avoid that, beforeInput calls preventDefault and then it's up to the JS to
  handle the event.
- fields have a property valueAsString which returns the value as a string. In the
  implementation it was wrongly used to store the formatted value of a field (2€ when the user
  entered 2). So this patch implements correctly valueAsString.
- non-rendered fields can be updated using JS, but when they are, the updated properties must go
  through the annotationStorage. It was implemented for field values, but it wasn't for
  display, colors, ...
- it fixes #14862 and #14705.
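
The first point above can be sketched with a simple re-entrancy guard. The `SketchDoc` class and its `inCalculate` flag are hypothetical and only model the idea; the real scripting code is organised differently.

```javascript
// Hypothetical, simplified model; the names are not the real scripting API.
class SketchDoc {
  constructor() {
    this.calculateCalls = 0;
    this.inCalculate = false;
    this.value = 1;
  }

  resetForm() {
    this.value = 0;
    // Resetting a value triggers a recalculation.
    this.calculateNow();
  }

  calculateNow() {
    if (this.inCalculate) {
      return; // Already calculating: don't let calculateNow trigger itself.
    }
    this.inCalculate = true;
    try {
      this.calculateCalls++;
      // Models a calculate callback that itself calls resetForm.
      this.resetForm();
    } finally {
      this.inCalculate = false;
    }
  }
}
```

Without the guard, `resetForm` and `calculateNow` would call each other until the stack overflows.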
2022-05-03 15:48:44 +02:00
Jonas Jenwald
d4fe4fd97b Simplify a couple of isNodeJS-dependent getDocument default values
Given that the `isNodeJS`-constant will, after PR 14858, *always* be `false` in non-GENERIC builds we can simplify a couple of `getDocument`-parameter default values slightly.
The old format, with inline `PDFJSDev`-checks, wasn't exactly a wonder of readability; which was my fault.
2022-05-03 11:36:10 +02:00
Jonas Jenwald
7df47c289f Only bundle the src/display/node_utils.js file in GENERIC-builds
This first of all simplifies the file, since we no longer need dummy-classes and can instead *directly* define the actual classes.
Furthermore, and more importantly, this means that we no longer need to bundle this code in e.g. MOZCENTRAL-builds which reduces the size of the *built* `pdf.js` file slightly.
2022-05-03 11:34:35 +02:00
Jonas Jenwald
67719af9df Immediately release the temporary Uint8Arrays used during Type3-compilation
Given that the `compileType3Glyph` function *returns* a function, see `drawOutline`, we'll thus keep the surrounding scope alive. Hence it shouldn't hurt to *explicitly* mark the temporary `Uint8Array`s, used during parsing, as no longer needed. Given the current `MAX_SIZE_TO_COMPILE`-value these `Uint8Array`s may be approximately two mega-bytes large *for every* Type3-glyph.
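
A minimal sketch of the pattern, assuming a hypothetical `compileSketch` function (this is not the real `compileType3Glyph`): the returned function keeps the enclosing scope reachable, so the temporary buffer is explicitly dropped once parsing is done.

```javascript
// Hypothetical sketch, not the real compileType3Glyph.
function compileSketch(width, height) {
  // Large temporary buffer that's only needed while "parsing".
  let points = new Uint8Array(width * height);
  points[0] = 1;
  const outline = [points[0], width, height];

  // Explicitly drop the reference, so that releasing the buffer doesn't
  // depend on how the engine handles the scope captured by drawOutline.
  points = null;

  return function drawOutline() {
    return outline;
  };
}
```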
2022-05-02 13:25:48 +02:00
Jonas Jenwald
df5a4fd0a7 Support encoded dest-strings in /GoTo destination dictionaries (issue 14864)
Interestingly enough this appears to be the very first case of *encoded* dest-strings, in /GoTo destination dictionaries, that we've actually come across. What's really fascinating is that it's less than a week after issue 14847, given that these issues are *somewhat* similar.
2022-05-02 10:14:32 +02:00
Jonas Jenwald
e658acffbc Slightly re-factor the compileType3Glyph function
This moves the `COMPILE_TYPE3_GLYPHS`/`MAX_SIZE_TO_COMPILE`-checks into the `compileType3Glyph` function itself, which allows for some simplification at the call-site.
These changes also mean that the `COMPILE_TYPE3_GLYPHS`-check is now done *once* per Type3-glyph, rather than every time that the glyph is being rendered.
2022-05-01 13:56:35 +02:00
Jonas Jenwald
c2488c7864 Use Path2D, if available, when rendering Type3-fonts (bug 810214)
Note that in order to avoid unnecessary allocations we build the `Path2D`-object *inline* during parsing, rather than iterating through the complete `outlines`-Array at the end.

This patch was tested using the PDF file from bug 810214, i.e. https://bug810214.bmoattachments.org/attachment.cgi?id=9254990, with the following manifest file:
```
[
    {  "id": "bug810214",
       "file": "../web/pdfs/bug810214.pdf",
       "md5": "2b7243178f5dd5fd3edc7b6649e4bdf3",
       "rounds": 100,
       "lastPage": 25,
       "type": "eq"
    }
]

```

which gave the following results when comparing this patch against the `master` branch:
 - Overall
```
-- Grouped By browser, stat --
browser | stat         | Count | Baseline(ms) | Current(ms) | +/- |     %  | Result(P<.05)
------- | ------------ | ----- | ------------ | ----------- | --- | ------ | -------------
firefox | Overall      |  2500 |          123 |          78 | -44 | -36.25 |        faster
firefox | Page Request |  2500 |            2 |           2 |   0 |   9.11 |        slower
firefox | Rendering    |  2500 |          121 |          76 | -45 | -36.93 |        faster
```

 - Page-specific
```
-- Grouped By browser, page, stat --
browser | page | stat         | Count | Baseline(ms) | Current(ms) | +/- |     %  | Result(P<.05)
------- | ---- | ------------ | ----- | ------------ | ----------- | --- | ------ | -------------
firefox | 0    | Overall      |   100 |           36 |          35 |  -1 |  -2.89 |
firefox | 0    | Page Request |   100 |            2 |           2 |   0 |   7.33 |
firefox | 0    | Rendering    |   100 |           34 |          33 |  -1 |  -3.47 |
firefox | 1    | Overall      |   100 |          123 |          81 | -42 | -33.92 |        faster
firefox | 1    | Page Request |   100 |            2 |           2 |   0 |  -3.31 |
firefox | 1    | Rendering    |   100 |          121 |          79 | -42 | -34.44 |        faster
firefox | 2    | Overall      |   100 |          129 |          82 | -47 | -36.61 |        faster
firefox | 2    | Page Request |   100 |            2 |           2 |   0 |  24.84 |        slower
firefox | 2    | Rendering    |   100 |          127 |          80 | -47 | -37.33 |        faster
firefox | 3    | Overall      |   100 |          114 |          68 | -46 | -40.18 |        faster
firefox | 3    | Page Request |   100 |            2 |           2 |   0 |  15.63 |        slower
firefox | 3    | Rendering    |   100 |          112 |          66 | -46 | -41.07 |        faster
firefox | 4    | Overall      |   100 |          102 |          75 | -27 | -26.09 |        faster
firefox | 4    | Page Request |   100 |            2 |           2 |   0 |   9.62 |
firefox | 4    | Rendering    |   100 |          100 |          73 | -27 | -26.71 |        faster
firefox | 5    | Overall      |   100 |          103 |          77 | -26 | -25.15 |        faster
firefox | 5    | Page Request |   100 |            2 |           2 |   0 |  -6.86 |
firefox | 5    | Rendering    |   100 |          100 |          75 | -26 | -25.53 |        faster
firefox | 6    | Overall      |   100 |           48 |          37 | -11 | -22.56 |        faster
firefox | 6    | Page Request |   100 |            2 |           2 |   0 | -10.14 |
firefox | 6    | Rendering    |   100 |           46 |          35 | -11 | -23.16 |        faster
firefox | 7    | Overall      |   100 |          109 |          70 | -39 | -35.59 |        faster
firefox | 7    | Page Request |   100 |            2 |           2 |   0 |   5.29 |
firefox | 7    | Rendering    |   100 |          107 |          68 | -39 | -36.23 |        faster
firefox | 8    | Overall      |   100 |           39 |          31 |  -9 | -22.14 |        faster
firefox | 8    | Page Request |   100 |            2 |           2 |   0 |   1.72 |
firefox | 8    | Rendering    |   100 |           38 |          29 |  -9 | -23.38 |        faster
firefox | 9    | Overall      |   100 |          156 |          96 | -60 | -38.49 |        faster
firefox | 9    | Page Request |   100 |            1 |           2 |   0 |  13.61 |
firefox | 9    | Rendering    |   100 |          155 |          94 | -60 | -38.98 |        faster
firefox | 10   | Overall      |   100 |          173 |         105 | -68 | -39.20 |        faster
firefox | 10   | Page Request |   100 |            2 |           2 |   0 |  -8.81 |
firefox | 10   | Rendering    |   100 |          171 |         103 | -68 | -39.60 |        faster
firefox | 11   | Overall      |   100 |          152 |          89 | -64 | -41.88 |        faster
firefox | 11   | Page Request |   100 |            2 |           2 |   0 |   6.04 |
firefox | 11   | Rendering    |   100 |          150 |          87 | -64 | -42.47 |        faster
firefox | 12   | Overall      |   100 |          141 |          90 | -51 | -35.91 |        faster
firefox | 12   | Page Request |   100 |            2 |           2 |   0 |  17.37 |
firefox | 12   | Rendering    |   100 |          139 |          88 | -51 | -36.60 |        faster
firefox | 13   | Overall      |   100 |           97 |          61 | -36 | -36.79 |        faster
firefox | 13   | Page Request |   100 |            2 |           2 |   0 |  25.44 |        slower
firefox | 13   | Rendering    |   100 |           95 |          59 | -36 | -37.87 |        faster
firefox | 14   | Overall      |   100 |          118 |          82 | -36 | -30.33 |        faster
firefox | 14   | Page Request |   100 |            2 |           2 |   0 |   9.20 |
firefox | 14   | Rendering    |   100 |          117 |          80 | -36 | -30.95 |        faster
firefox | 15   | Overall      |   100 |          111 |          73 | -37 | -33.85 |        faster
firefox | 15   | Page Request |   100 |            2 |           2 |   0 |  13.25 |
firefox | 15   | Rendering    |   100 |          109 |          71 | -38 | -34.61 |        faster
firefox | 16   | Overall      |   100 |          145 |          88 | -57 | -39.19 |        faster
firefox | 16   | Page Request |   100 |            2 |           2 |   1 |  33.75 |        slower
firefox | 16   | Rendering    |   100 |          143 |          86 | -57 | -40.03 |        faster
firefox | 17   | Overall      |   100 |          171 |         126 | -45 | -26.27 |        faster
firefox | 17   | Page Request |   100 |            2 |           2 |   0 |  17.92 |        slower
firefox | 17   | Rendering    |   100 |          169 |         124 | -45 | -26.69 |        faster
firefox | 18   | Overall      |   100 |          126 |          78 | -47 | -37.71 |        faster
firefox | 18   | Page Request |   100 |            2 |           2 |   0 |   2.43 |
firefox | 18   | Rendering    |   100 |          124 |          76 | -48 | -38.43 |        faster
firefox | 19   | Overall      |   100 |           92 |          58 | -34 | -37.19 |        faster
firefox | 19   | Page Request |   100 |            2 |           2 |   0 |  12.74 |
firefox | 19   | Rendering    |   100 |           90 |          56 | -35 | -38.13 |        faster
firefox | 20   | Overall      |   100 |          178 |          96 | -82 | -46.18 |        faster
firefox | 20   | Page Request |   100 |            2 |           2 |   0 |  -2.23 |
firefox | 20   | Rendering    |   100 |          176 |          94 | -82 | -46.67 |        faster
firefox | 21   | Overall      |   100 |          181 |         102 | -79 | -43.77 |        faster
firefox | 21   | Page Request |   100 |            2 |           2 |   0 |  12.36 |        slower
firefox | 21   | Rendering    |   100 |          179 |          99 | -79 | -44.34 |        faster
firefox | 22   | Overall      |   100 |          140 |          84 | -55 | -39.59 |        faster
firefox | 22   | Page Request |   100 |            2 |           2 |   0 |  12.50 |
firefox | 22   | Rendering    |   100 |          138 |          82 | -55 | -40.25 |        faster
firefox | 23   | Overall      |   100 |          119 |          73 | -46 | -38.48 |        faster
firefox | 23   | Page Request |   100 |            2 |           2 |   1 |  35.71 |        slower
firefox | 23   | Rendering    |   100 |          117 |          71 | -46 | -39.48 |        faster
firefox | 24   | Overall      |   100 |          165 |          96 | -68 | -41.51 |        faster
firefox | 24   | Page Request |   100 |            2 |           2 |   0 |   2.81 |
firefox | 24   | Rendering    |   100 |          163 |          94 | -68 | -42.00 |        faster
```
2022-05-01 13:56:35 +02:00
calixteman
b10b8dad7d
Merge pull request #14853 from calixteman/white_lines
Use integer coordinates when drawing images (bug 1264608, issue #3351)
2022-04-29 18:15:03 +02:00
Calixte Denizet
624d8a8e3e Use integer coordinates when drawing images (bug 1264608, issue #3351)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1264608;
- it's only a partial fix for #3351;
- some tiled images have some spurious white lines between the tiles.
  When the current transform is applied the corners of an image can have
  some non-integer coordinates leading to some extra transparency added
  to handle that. So with this patch the current transform is applied on the
  point and on the dimensions in order to have at the end only integer values.
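
One possible way to end up with only integer values is sketched below. It assumes an axis-aligned transform with positive scale factors, and it rounds both corners of the image rectangle, which is a variant of the idea and not necessarily what the patch does; `getIntegerImageRect` is a hypothetical helper.

```javascript
// Assumes an axis-aligned transform [a, 0, 0, d, e, f] with a > 0 and d > 0.
function getIntegerImageRect(transform, x, y, width, height) {
  const [a, , , d, e, f] = transform;
  const x0 = Math.round(a * x + e);
  const y0 = Math.round(d * y + f);
  // Round the far corner too, so that adjacent tiles share an integer edge.
  const x1 = Math.round(a * (x + width) + e);
  const y1 = Math.round(d * (y + height) + f);
  return [x0, y0, x1 - x0, y1 - y0];
}
```

With this, the right edge of one tile and the left edge of the next tile land on the same device pixel, which leaves no partially covered column between them.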
2022-04-29 16:01:34 +02:00
Jonas Jenwald
fbf6dee8ee [api-minor] Remove the forceClamped-functionality in the Streams (issue 14849)
As it turns out, most of the code-paths in the `PDFImage`-class won't actually pass the TypedArray (containing the image-data) to the `ColorSpace`-code. Hence we *generally* don't need to force the image-data to be a `Uint8ClampedArray`, and can just as well directly use a `Uint8Array` instead.

In the following cases we're returning the data without any `ColorSpace`-parsing, and the exact TypedArray used shouldn't matter:
 - b72a448327/src/core/image.js (L714)
 - b72a448327/src/core/image.js (L751)

In the following cases the image-data is only used *internally*, and again the exact TypedArray used shouldn't matter:
 - b72a448327/src/core/image.js (L762) with the actual image-data being defined (as `Uint8ClampedArray`) further below
 - b72a448327/src/core/image.js (L837)

*Please note:* This is tagged `api-minor` because it's API-observable, given that *some* image/mask-data will now be returned as `Uint8Array` rather than using `Uint8ClampedArray` unconditionally. However, that seems like a small price to pay to (slightly) reduce memory usage during image-conversion.
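
For readers unfamiliar with the difference between the two TypedArrays, this short example shows the standard JavaScript behaviour; it only matters when out-of-range values are written, which is why the exact type is irrelevant for data that's already in the 0-255 range.

```javascript
const plain = new Uint8Array(2);
const clamped = new Uint8ClampedArray(2);

plain[0] = 300; // Wraps modulo 256.
clamped[0] = 300; // Clamps to the maximum value.

plain[1] = -5; // Wraps modulo 256.
clamped[1] = -5; // Clamps to the minimum value.

console.log(Array.from(plain)); // [44, 251]
console.log(Array.from(clamped)); // [255, 0]
```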
2022-04-29 14:46:30 +02:00
Jonas Jenwald
71370d012b Support destinations in NameTrees with encoded keys (issue 14847)
Initially I considered updating the `NameOrNumberTree`-implementation to handle encoded keys, however that quickly became somewhat messy (especially in the `NameOrNumberTree.get`-method) since only NameTrees use string-keys.
Hence the easiest solution, as far as I'm concerned, was to just update the `Catalog.destinations`-getter instead. Please note that in the referenced PDF document the `Catalog.destination`-method will thus fall back to fetching all destinations, which should be fine since this is the very first case of encoded keys that we've seen.

Also changes the `NameOrNumberTree.getAll`-method to prevent a possible run-time error, although we've so far not seen such a case, for any non-Array Kids-entries found in a NameTree/NumberTree.

Finally, to improve overall consistency and to hopefully prevent future bugs, the patch also updates a couple of other `NameTree` call-sites to correctly handle encoded keys. (Note that the `Catalog.attachments`-getter was already doing this.)
2022-04-27 11:19:55 +02:00
Jonas Jenwald
e18edf38db Add a helper function for incrementing the count of cached ImageMasks
While working on PR 14825, I couldn't help noticing that the code to increment the `count` for cached ImageMasks was repeated multiple times. Hence it makes sense, as far as I'm concerned, to move this into a helper function instead.
2022-04-24 11:10:02 +02:00
Tim van der Meij
752dee5caa
Merge pull request #14825 from Snuffleupagus/issue-14824
Ensure that worker-thread image caching doesn't break optional content (issue 14824)
2022-04-23 13:19:56 +02:00
Tim van der Meij
f9e54d9226
Merge pull request #14823 from Snuffleupagus/issue-14821
Ignore invalid /Encoding-entries when parsing fonts (issue 14821)
2022-04-23 13:19:26 +02:00
Jonas Jenwald
6c229dffb1 Ensure that worker-thread image caching doesn't break optional content (issue 14824)
Currently we only insert optionalContent-data into the operatorList the first time that an image is parsed, which will (in hindsight) obviously cause problems for cached images.
Hence we also need to insert the optionalContent-data in the various worker-thread image caches, such that it can be accessed in the fast-paths that are used to skip re-parsing of images.

In order to reduce the amount of repeated code, this patch also adds a new `OperatorList`-method that takes care of inserting the necessary data in the operatorList.
2022-04-22 14:49:16 +02:00
Jonas Jenwald
e723da7261 Ignore invalid /Encoding-entries when parsing fonts (issue 14821)
In the referenced PDF document the fonts have /Encoding-entries that are Streams (containing completely bogus data), which are thus obviously not valid here.
Hence, only when `ignoreErrors` is set, we'll now ignore these corrupt /Encoding-entries and fall back to the existing code to try and infer a usable encoding.

Given that this is *clearly* a case of corrupt PDF documents, there's no guarantee that this will "fix" all such cases, however it's the best that we can do here and shouldn't really be worse than ignoring an entire font.
2022-04-22 11:49:03 +02:00
Tim van der Meij
f39219cd45
Merge pull request #14815 from Snuffleupagus/issue-14814
Ignore non-Stream /SMask-entries when parsing images (issue 14814)
2022-04-22 11:39:13 +02:00
Sean Wei
6bf978404e Use correct case for JavaScript 2022-04-21 23:56:28 +08:00
Jonas Jenwald
39d1bdde09 Ignore non-Stream /SMask-entries when parsing images (issue 14814)
This is similar to the pre-existing check used in the /Mask-case below, to handle *corrupt* PDF documents that include non-Stream /SMask-entries in images; please refer to the PDF specification:
https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=216

*Please note:* Adobe Reader also fails to render the image on the second page, and displays an error message.
2022-04-21 12:14:08 +02:00
Jonas Jenwald
5bc7339c1b Add support for the /Catalog Base-URI when resolving URLs (issue 14802)
As far as I can tell, this is actually the very first time that we've seen a PDF document with a Base-URI specified in the /Catalog; please refer to the specification:
https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2097122

To simplify the overall implementation, this new parameter is accessed via the existing `BasePdfManager.docBaseUrl`-getter and will thus override any user-specified `docBaseUrl` API-parameter.
2022-04-19 17:14:52 +02:00
Calixte Denizet
c2aa03e194 Fix clipping issue with pattern (follow-up of #14797) 2022-04-18 12:41:14 +02:00
Jonas Jenwald
5bbed400f2
Merge pull request #14797 from calixteman/12306
Don't clip when the clip path is empty (issue #12306)
2022-04-18 11:18:32 +02:00
Calixte Denizet
3d74d2c6cb Don't clip when the clip path is empty (issue #12306) 2022-04-18 10:33:44 +02:00
Calixte Denizet
4b7691baf6 Simplify min/max computations in constructPath (bug 1135277)
- most of the time the current transform is a scaling one (modulo translation),
  hence it's possible to avoid applying the transform on each bbox and to apply
  it a posteriori instead;
- compute the bbox when it's possible in the worker.
2022-04-17 17:25:54 +02:00
Calixte Denizet
f62d961dfe Improve performances with image masks (bug 857031)
- it's the second part of the fix for https://bugzilla.mozilla.org/show_bug.cgi?id=857031;
- some image masks can be used several times but at different positions;
- an image needs to be pre-processed before being rendered:
  * rescale it;
  * use the fill color/pattern.
- the two operations above are time consuming so we can cache the generated canvas;
- the cache key is based on the current transform matrix (without the translation part)
  and the current fill color when it isn't a pattern.
- the rendering of the pdf in the above bug is really faster than without this patch.
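
The caching described above could look roughly like the following sketch. The `getMaskCacheKey`/`getProcessedMask` helpers and the `Map`-based cache are hypothetical, and the fill color is assumed to be a plain string (i.e. not a pattern).

```javascript
// Hypothetical cache; the real implementation may be organised differently.
const maskCache = new Map();

function getMaskCacheKey(transform, fillColor) {
  // Drop the translation part (e, f): the same mask drawn at another
  // position can re-use the already generated canvas.
  const [a, b, c, d] = transform;
  return JSON.stringify([a, b, c, d, fillColor]);
}

function getProcessedMask(transform, fillColor, process) {
  const key = getMaskCacheKey(transform, fillColor);
  let cached = maskCache.get(key);
  if (cached === undefined) {
    cached = process(); // The expensive rescale + fill step.
    maskCache.set(key, cached);
  }
  return cached;
}
```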
2022-04-16 20:48:39 +02:00
Tim van der Meij
b73a6cc213
Merge pull request #14785 from Snuffleupagus/core-js-structuredClone-transfers
Update `core-js` to allow removing a `structuredClone` work-around
2022-04-16 12:36:44 +02:00
calixteman
681a9b8927
Merge pull request #14784 from calixteman/intersect
Improve performance of shared/utils.js::intersect (bug 1135277)
2022-04-15 22:38:19 +02:00
Calixte Denizet
7501fe6f30 Improve performance of shared/utils.js::intersect
- avoid calling normalizeRect, which clones the rectangles: it's useless
  and time consuming;
- in profiling the pdf in bug 1135277, the time spent in intersect drops
  from ~1s to ~30ms.
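
A sketch of an intersection that doesn't clone or normalize its inputs is given below; it is written from the description above and may differ in detail from the actual `intersect` code. Rectangles are assumed to be `[x1, y1, x2, y2]` arrays whose corners may be given in any order.

```javascript
// Returns the intersection as [xLow, yLow, xHigh, yHigh], or null if the
// rectangles don't overlap. No intermediate rectangles are allocated.
function intersect(rect1, rect2) {
  const xLow = Math.max(
    Math.min(rect1[0], rect1[2]),
    Math.min(rect2[0], rect2[2])
  );
  const xHigh = Math.min(
    Math.max(rect1[0], rect1[2]),
    Math.max(rect2[0], rect2[2])
  );
  if (xLow > xHigh) {
    return null;
  }
  const yLow = Math.max(
    Math.min(rect1[1], rect1[3]),
    Math.min(rect2[1], rect2[3])
  );
  const yHigh = Math.min(
    Math.max(rect1[1], rect1[3]),
    Math.max(rect2[1], rect2[3])
  );
  if (yLow > yHigh) {
    return null;
  }
  return [xLow, yLow, xHigh, yHigh];
}
```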
2022-04-15 22:24:26 +02:00
Jonas Jenwald
b996e107c3 Update core-js to allow removing a structuredClone work-around
Because of a bug in previous `core-js` versions, which caused an Error to be thrown if its `structuredClone` polyfill was called with an *explicit* `null`/`undefined` transfer-parameter, the `LoopbackPort`-class contained a work-around.
In the latest `core-js` version this has been fixed, and we can thus simplify our code ever so slightly; please see https://github.com/zloirock/core-js/releases/tag/v3.22.0
2022-04-15 22:12:02 +02:00
Jonas Jenwald
e67cd7fae0 Replace the --viewport-scale-factor CSS variable
This CSS variable is only used together with the `annotationCanvasMap`-functionality in the canvas-code, however its value can be *trivially* computed by using the older `--zoom-factor` CSS variable together with the `PixelsPerInch`-structure.
Rather than having *two different* CSS variables that are this closely linked, it seems better to simplify things by using just one CSS variable instead.
2022-04-14 12:43:57 +02:00
Tim van der Meij
cdb3481d6c
Merge pull request #14764 from apeltop/correct-typos
Correct typos
2022-04-10 14:55:08 +02:00
Calixte Denizet
687c9a8710 Improve performance of applyMaskImageData
- write some uint32 instead of uint8 to avoid the check before clamping;
- unroll the loop to write data in the buffer
- but keep a loop for the last element of a line: it likely doesn't hurt
  that much since it's executed only once per line;
- I tested on a macbook with an Apple chip, and on Firefox nightly the new
  code is almost 3.5x faster than before (~1.8x with Chrome).
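
The uint32 idea can be sketched as follows: each RGBA pixel is written with a single 32-bit store instead of four 8-bit stores. The `expandMask` function is hypothetical, the loop unrolling from the patch is left out for brevity, and the choice that a set bit means "opaque black" is an arbitrary assumption.

```javascript
// Expands a 1 bit-per-pixel mask into RGBA data, one 32-bit write per pixel.
function expandMask(src, width, height) {
  const dest = new Uint8ClampedArray(width * height * 4);
  const dest32 = new Uint32Array(dest.buffer);
  // Opaque black is the byte sequence [0, 0, 0, 255]; the matching 32-bit
  // value depends on the endianness of the platform.
  const littleEndian = new Uint8Array(new Uint32Array([1]).buffer)[0] === 1;
  const opaque = littleEndian ? 0xff000000 : 0x000000ff;
  const rowBytes = (width + 7) >> 3; // Every row starts on a byte boundary.

  let pos = 0;
  for (let row = 0; row < height; row++) {
    for (let col = 0; col < width; col++) {
      const byte = src[row * rowBytes + (col >> 3)];
      const bit = (byte >> (7 - (col & 7))) & 1;
      dest32[pos++] = bit ? opaque : 0;
    }
  }
  return dest;
}
```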
2022-04-09 22:19:02 +02:00
Calixte Denizet
040fcae5ab Improve performance with image masks (bug 857031)
- it aims to partially fix performance issue reported: https://bugzilla.mozilla.org/show_bug.cgi?id=857031;
- the idea is to avoid using byte arrays and to use ImageBitmap instead, which is way faster to draw:
  * an ImageBitmap is Transferable which means that it can be built in the worker instead of in the main thread:
    - this is achieved by using an OffscreenCanvas when it's available, there is a bug to enable them
      for pdf.js: https://bugzilla.mozilla.org/show_bug.cgi?id=1763330;
    - or by using createImageBitmap: in Firefox a task is sent to the main thread to build the bitmap so
      it's slightly slower than using an OffscreenCanvas.
  * it's transferred from the worker to the main thread by "reference";
  * the byte buffers used to create the image data have a very short lifetime and ergo the memory used is globally
    less than before.
- Use the localImageCache for the mask;
- Fix the pdf issue4436r.pdf: it was expected to have a binary stream for the image;
- Move the singlePixel trick from operator_list to image: this way we can use this trick even if it isn't in a set
  as defined in operator_list.
2022-04-09 18:26:26 +02:00
apeltop
a97dd26389 Correct typos 2022-04-09 09:43:18 +09:00
Jonas Jenwald
a919959d83 Slightly simplify the Catalog._readMarkInfo method
We don't need to first check if the Dictionary contains the key, since trying to get a non-existent key simply returns `undefined` and we're already ensuring that the value is a boolean.
Furthermore, we shouldn't need to worry about the `Object.prototype` containing enumerable properties since the checks (in `src/core/worker.js`) done for `Array.prototype` *indirectly* also cover `Object`s. (Keep in mind that an `Array` is just a special kind of `Object` in JavaScript.)
2022-04-05 16:37:51 +02:00
Jonas Jenwald
1dc4713a0b Re-factor the isLittleEndian/isEvalSupported caching
This functionality is very old, hence we should be able to improve the caching a little bit with modern JavaScript features.
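
One way to cache such a feature test is a getter that replaces itself with a plain data property on first access, as sketched below. The `FeatureTestSketch` class and the local `shadow` helper are written for illustration only and are not claimed to match the patch.

```javascript
// Local helper: replaces an accessor with a read-only data property.
function shadow(obj, prop, value) {
  Object.defineProperty(obj, prop, {
    value,
    enumerable: true,
    configurable: true,
    writable: false,
  });
  return value;
}

class FeatureTestSketch {
  static get isLittleEndian() {
    const buffer8 = new Uint8Array(4);
    buffer8[0] = 1;
    const view32 = new Uint32Array(buffer8.buffer, 0, 1);
    // Computed once; later accesses hit the data property directly.
    return shadow(this, "isLittleEndian", view32[0] === 1);
  }
}
```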
2022-04-05 16:01:01 +02:00
Calixte Denizet
f4fcb59a5e Refactor some xfa*** getters in document.js
- it's a follow-up of PR #14735.
2022-04-03 20:38:12 +02:00
Jonas Jenwald
f33ce5fc2d Decode non-ASCII values found in the xfa:datasets (PR 14735 follow-up)
*Please note:* This is possibly bad/wrong in general, but I figured that submitting it for review wouldn't hurt.

It seems that even Adobe Reader doesn't handle the non-ASCII characters that appear in some of the fields correctly, however it should be pretty easy to improve things on the PDF.js side.
2022-04-01 11:54:34 +02:00
Jonas Jenwald
36a289d747
Merge pull request #14735 from calixteman/14685
[Annotations] Some annotations can have their values stored in the xfa:datasets
2022-04-01 11:30:16 +02:00
Calixte Denizet
0b597304c1 [Annotations] Some annotations can have their values stored in the xfa:datasets
- it aims to fix #14685;
- add a basic object to get values from the parsed datasets;
- these annotations don't have an appearance so we must create one when printing or saving.
2022-04-01 10:28:04 +02:00
Jonas Jenwald
addb4cb12b Use String.prototype.repeat() in a couple of spots
Rather than using a temporary Array to manually create repeated strings, we can use `String.prototype.repeat()` instead.
The reason that we didn't use this from the start is most likely because some browsers, notably IE, didn't support this; note https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/repeat#browser_compatibility
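
For comparison, the two approaches look roughly like this; both function names are made up for the example.

```javascript
// Old style: build the repeated string via a temporary Array.
function repeatWithArray(str, count) {
  const buf = [];
  for (let i = 0; i < count; i++) {
    buf.push(str);
  }
  return buf.join("");
}

// New style: let the built-in method do the work.
function repeatNative(str, count) {
  return str.repeat(count);
}

console.log(repeatWithArray("ab", 3)); // "ababab"
console.log(repeatNative("ab", 3)); // "ababab"
```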
2022-03-30 15:42:40 +02:00
Calixte Denizet
ad3fb71a02 [Annotations] Add support for printing/saving choice list with multiple selections
- it aims to fix issue #12189.
2022-03-29 18:59:44 +02:00
Jonas Jenwald
0dd6bc9a85
Merge pull request #14703 from calixteman/14627
[text selection] Add the whitespaces present in the pdf in the text chunk
2022-03-27 15:20:19 +02:00
Calixte Denizet
18e79e3c0b [text selection] Add the whitespaces present in the pdf in the text chunk
- it aims to fix issue #14627;
- the basic idea of the recent text refactoring was to only consider the rendered visible whitespaces.
  But sometimes, the heuristics aren't correct and although some whitespaces are in the text stream
  they weren't in the text chunks because they were too small. Hence we added some exceptions, for example,
  we always add a whitespace when it is between two non-whitespace chars but only when in the same Tj.
  So basically, this patch removes the constraint to have the chars in the same Tj
  (by using a circular buffer to save the two last chars) but doesn't add a space when the visible space is really
  too small (hence `NOT_A_SPACE_FACTOR`).
2022-03-27 14:34:56 +02:00
Jonas Jenwald
7f0589c74a Change the type of the container property, in the TextLayerRenderParameters typedef (issue 14716)
Given that the textLayer-code has been using a `DocumentFragment` ever since PR 3356 (back in 2013), simply updating the type of the `container` property should be fine.
This patch also tries to, ever so slightly, improve the grammar of a couple of other properties in the typedef.
2022-03-24 22:42:37 +01:00
Jonas Jenwald
849de5a508 Slightly improve validation of (some) parameters in getDocument
There's a couple of `getDocument` parameters that should be numbers, but which are currently not *fully* validated to prevent issues elsewhere in the code-base.
Also, improves validation of the `ownerDocument` parameter since we currently accept more-or-less anything here.
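
A minimal sketch of this kind of validation is shown below. The option names (`chunkSize`, `maxSize`) and their default values are placeholders chosen for the example, not the actual `getDocument` parameters or defaults.

```javascript
// Falls back to a default unless `value` is an integer that's >= `min`.
function normalizeInteger(value, fallback, min = 0) {
  return Number.isInteger(value) && value >= min ? value : fallback;
}

// Hypothetical option names and defaults, for illustration only.
function normalizeOptions(src) {
  return {
    chunkSize: normalizeInteger(src.chunkSize, 65536, 1),
    maxSize: normalizeInteger(src.maxSize, -1, -1),
  };
}
```

Normalizing once at the API boundary means that the rest of the code can rely on the values being valid numbers.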
2022-03-21 13:32:17 +01:00
Jonas Jenwald
73d2ddac0d Update npm packages
Note that the Prettier update made it possible to move a couple of comments after `default:`-cases back to their original/intended positions, please see https://prettier.io/blog/2022/03/16/2.6.0.html
2022-03-20 10:59:13 +01:00
Calixte Denizet
f0b549c2a2 [JS] - Parse a date in using the given format first and then try the default date parser
- it aims to fix #14672.
2022-03-19 16:07:43 +01:00
Tim van der Meij
5de6af4e64
Merge pull request #14683 from Snuffleupagus/sendTest-cleanup
[src/display/api.js] Simplify the `sendTest` function, used with Worker initialization (PR 14291 follow-up)
2022-03-19 13:38:05 +01:00
Jonas Jenwald
c0736647f9 Add general iteration support in the RefSet and RefSetCache classes
This patch removes the existing `forEach` methods, in favor of making the classes properly iterable instead. Given that the classes are using a `Set` respectively a `Map` internally, implementing this is very easy/efficient and allows us to simplify some existing code.
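
Making a `Set`-backed class iterable only takes a `Symbol.iterator` method that forwards to the internal collection. The `RefSetSketch` class below is a simplified illustration, with refs assumed to be anything that has a `toString` method.

```javascript
// Simplified illustration; not the real RefSet class.
class RefSetSketch {
  #set = new Set();

  put(ref) {
    this.#set.add(ref.toString());
  }

  has(ref) {
    return this.#set.has(ref.toString());
  }

  // Forward iteration to the internal Set.
  [Symbol.iterator]() {
    return this.#set.values();
  }
}

const refs = new RefSetSketch();
refs.put("1R");
refs.put("2R");
for (const ref of refs) {
  console.log(ref);
}
```

Callers can then use `for...of` or spread syntax, rather than passing a callback to `forEach`.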
2022-03-18 14:27:34 +01:00
Jonas Jenwald
be2b1d5d2a [src/display/api.js] Simplify the sendTest function, used with Worker initialization (PR 14291 follow-up)
Given that we now only use Workers when `postMessage` transfers are supported, there's really no point in trying to send a "test" message *without* transfers present.
Hence, if `postMessage` transfers are not supported by the browser, we'll now fall back to "fake" Workers immediately instead. The comment about Opera is also removed, since it was originally added back in PR 983 and mentions Opera `11.60` [which was released in 2011](https://en.wikipedia.org/wiki/History_of_the_Opera_web_browser#Version_11).
2022-03-16 13:25:41 +01:00
Jonas Jenwald
d5c9be341d [src/display/api.js] Use private static class fields, rather than shadowed getter work-arounds (PR 13813, 13882 follow-up)
At the time private static class fields were too new, however that's no longer an issue and we can thus (ever so slightly) simplify the code.
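
As an illustration of the language feature, a private static field keeps shared state on the class without exposing it. The `DocTaskSketch` class is a made-up example and doesn't mirror the classes in `src/display/api.js`.

```javascript
// Made-up example of a private static class field.
class DocTaskSketch {
  // Shared by all instances, but not reachable from outside the class.
  static #docId = 0;

  constructor() {
    this.docId = `d${DocTaskSketch.#docId++}`;
  }
}
```

Every new instance gets the next id, while `DocTaskSketch.docId` stays `undefined` for outside code.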
2022-03-16 13:02:34 +01:00
Jonas Jenwald
0c349c701f Remove the addLinkAttributes warnings in the Annotation/XFA-layers (PR 14092 follow-up)
These warnings have now been present in three releases, see PR 14092, hence it should (hopefully) be fine to remove them now.
2022-03-13 11:38:56 +01:00
Tim van der Meij
790735eaf1
Merge pull request #14658 from Snuffleupagus/api-validate-cMapUrl-standardFontDataUrl
Validate the `cMapUrl`/`standardFontDataUrl` parameters in `getDocument`
2022-03-11 21:09:58 +01:00
Jonas Jenwald
a60b98412f Validate the cMapUrl/standardFontDataUrl parameters in getDocument
These changes make sense for two reasons:
 - Given that the parameters are potentially passed to the worker-thread, depending on the `useWorkerFetch` parameter, we need to prevent errors if the user provides values that aren't clonable.
 - By ensuring that the default values are indeed `null`, we'll trigger main-thread fetching (of CMaps and Standard fonts) as intended in the `PartialEvaluator` and thus potentially provide better Error messages.
2022-03-10 16:33:10 +01:00
Jonas Jenwald
537ed37835 Move the isSameOrigin helper function
This function is currently placed in the `src/shared/util.js` file, which means that the code is duplicated in both of the *built* `pdf.js` and `pdf.worker.js` files. Furthermore, it only has a single call-site which is also specific to the `GENERIC`-build of the PDF.js library.

Hence this helper function is instead moved into the `src/display/api.js` file, in such a way that it's conditionally defined but still can be unit-tested.
2022-03-10 13:51:09 +01:00
Tim van der Meij
e85bb0b599
Merge pull request #14645 from Snuffleupagus/Node-DOMMatrix-polyfill
[api-minor] Remove the, in `legacy` builds, bundled `DOMMatrix` polyfill
2022-03-09 20:38:26 +01:00
Tim van der Meij
55a931e454
Merge pull request #14648 from Snuffleupagus/PDFDocument-stream
Simplify the `PDFDocument` constructor
2022-03-09 20:36:49 +01:00
Jonas Jenwald
6a78f20b17 Simplify the PDFDocument constructor
Originally the code in the `src/`-folder was shared between the main/worker-threads, and back then it probably made sense that the `PDFDocument` constructor accepted different arguments.
However, for many years we've not been passing anything *except* Streams to `PDFDocument` and we should thus be able to slightly simplify that code. Note that for e.g. unit-tests of this code, using either a `NullStream` or a `StringStream` works just fine.
2022-03-08 17:13:47 +01:00
Jonas Jenwald
157a71d404 [api-minor] Remove the, in legacy builds, bundled DOMMatrix polyfill
According to the MDN compatibility data, see https://developer.mozilla.org/en-US/docs/Web/API/DOMMatrix/DOMMatrix#browser_compatibility, all browsers that we support have native `DOMMatrix` implementations (since quite some time too).

Hence Node.js is the only environment that lacks `DOMMatrix` support, which probably isn't that surprising given that it's browser functionality.
While the `DOMMatrix` polyfill isn't that large, it nonetheless seems completely unnecessary to bundle it in the `legacy` builds when it's not needed in browsers. However, we can avoid that by simply listing `dommatrix` as a dependency for the `pdfjs-dist` library.
2022-03-08 10:29:11 +01:00
Jonas Jenwald
6f600befdd Update TypeScript to version 4.6.2 and work-around stricter type checks
I'm guessing that we're now running into the class-related improvements mentioned in https://devblogs.microsoft.com/typescript/announcing-typescript-4-6/#target-es2022
To unblock this update, and any future ones, this patch simply tweaks the JSDocs to get `gulp typestest` to run without errors.
2022-03-07 11:55:17 +01:00
Tim van der Meij
5242c38af5
Merge pull request #14628 from Snuffleupagus/issue-14626
When `stopAtErrors` is set, throw rather than warn when exceeding `maxImageSize` (issue 14626)
2022-03-05 13:09:36 +01:00
Tim van der Meij
5d12ac576b
Merge pull request #14631 from Snuffleupagus/typedef-fixes
Fix a couple of small typos in JSDoc `typedef` comments
2022-03-05 13:06:53 +01:00
Jonas Jenwald
939e6f0c4c Fix a couple of small typos in JSDoc typedef comments
While this doesn't affect the official API documentation, these cases should nonetheless be fixed.
2022-03-04 12:11:52 +01:00
Jonas Jenwald
1a7921dbf0 Compute the loca table endOffset, of the "first" glyph, correctly (issue 14618)
When there are *multiple* empty glyphs at the start of the data, ensure that the "first" glyph gets a correct `endOffset` to avoid skipping it during parsing in the `sanitizeGlyph` function.
2022-03-03 14:22:45 +01:00
Jonas Jenwald
d0d5c596fb When stopAtErrors is set, throw rather than warn when exceeding maxImageSize (issue 14626)
The situation described in issue 14626 seems like a fairly special case, and it thus seems reasonable that we simply follow the same pattern as elsewhere in the `PartialEvaluator` when the `stopAtErrors` API-option is being used.
2022-03-03 13:11:29 +01:00
Brendan Dahl
85ff7b117e
Merge pull request #14536 from calixteman/thin_line
Fix some issues with lineWidth < 1 after transform (bug 1753075, bug 1743245, bug 1710019)
2022-03-02 09:46:15 -08:00
Jonas Jenwald
ab55071568 Remove the JSDocs "External: Promise"-page, since Promises are now a standard feature
The "External: Promise"-page in the JSDocs pre-dates the introduction of `Promise`s, as a generally available standard JS feature, by a number of years. Hence it no longer seems necessary, as far as I can tell, to include this "special" page in the documentation.

Also, while unrelated to the rest of the patch, updates the `test/`-folder description in the documentation.
2022-02-26 23:53:11 +01:00
calixteman
046ff07ee3
Merge pull request #14610 from Snuffleupagus/jpx-resetContextProbabilities
[JPEG 2000] Add support for resetContextProbabilities (bug 1731483)
2022-02-26 18:26:39 +01:00
Jonas Jenwald
99cd24ce3e Remove the isString helper function
The call-sites are replaced by direct `typeof`-checks instead, which removes unnecessary function calls. Note that in the `src/`-folder we already had more `typeof`-cases than `isString`-calls.
2022-02-26 16:33:41 +01:00
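The replacement pattern described in the commit above can be sketched as follows. This is a minimal illustration only: `isString` is re-declared here purely for comparison, and `titleBefore`/`titleAfter` are hypothetical call-sites, not code from the PDF.js repository.

```javascript
// Stand-in for the removed helper, shown only for comparison.
function isString(v) {
  return typeof v === "string";
}

// Hypothetical call-site before the change: one extra function call per check.
function titleBefore(value) {
  return isString(value) ? value.trim() : null;
}

// Hypothetical call-site after the change: the `typeof`-check is inlined.
function titleAfter(value) {
  return typeof value === "string" ? value.trim() : null;
}
```

Both variants behave identically; the inlined form only avoids the indirection.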
Jonas Jenwald
6bd4e0f5af Re-factor the PDFDocument.documentInfo method
This removes the `DocumentInfoValidators` structure, and thus (slightly) simplifies the code overall. With these changes we only have to iterate through, and validate, the actually available Dictionary entries.
2022-02-26 16:33:21 +01:00
Tim van der Meij
f782f5e5bb
Merge pull request #14607 from Snuffleupagus/wrapReason-unreachable
Simplify the `wrapReason` helper function
2022-02-26 15:37:29 +01:00
Tim van der Meij
cf7ce0aa7e
Merge pull request #14600 from Snuffleupagus/getPageIndex-more-validation
[api-minor] Add validation for the `PDFDocumentProxy.getPageIndex` method
2022-02-26 15:30:00 +01:00
Jeff Muizelaar
9b9609a6d8 [JPEG 2000] Add support for resetContextProbabilities (bug 1731483) 2022-02-26 13:05:23 +01:00
Calixte Denizet
46369e4aa5 Fix some issues with lineWidth < 1 after transform (bug 1753075, bug 1743245, bug 1710019)
- it aims to fix:
   - https://bugzilla.mozilla.org/show_bug.cgi?id=1753075;
   - https://bugzilla.mozilla.org/show_bug.cgi?id=1743245;
   - https://bugzilla.mozilla.org/show_bug.cgi?id=1710019;
   - issue #13211;
   - issue #14521.
 - previously we tried to adjust lineWidth so that it would be correct after the current transform is applied, but this approach was wrong because the pixel ends up rescaled with the same factor in both directions.
  And sometimes those factors must be different (see bug 1753075).
 - So the idea of this patch is to apply a scale matrix to the current transform just before setting lineWidth and stroking. This scale matrix is computed to ensure that, after the transform, the thickness of a pixel is greater than 1 in both directions.
2022-02-25 18:37:34 +01:00
Jonas Jenwald
28fc8248f0 Simplify the wrapReason helper function
All call-sites that use `wrapReason` should be passing a (possibly cloned) `Error` to the helper function, hence we shouldn't need to have a fallback code-path for any other data.
Note that for the `cancel`/`error` methods on Streams, since PR 11115 we've been asserting that the argument is in fact an `Error` as intended.
When calling `wrapReason` from *rejected* Promises, we should also be guaranteed that an `Error` is provided thanks to the ESLint rules `no-throw-literal` and `prefer-promise-reject-errors`.
2022-02-25 18:31:12 +01:00
Jonas Jenwald
172d007598 [api-minor] Add validation for the PDFDocumentProxy.getPageIndex method
Currently we'll happily attempt to send any argument passed to this method over to the worker-thread, without doing any sort of validation.
That could obviously be quite bad, since there's first of all no protection against sending unclonable data. Secondly, it's also possible to pass data that will cause the `Ref.get` call in the worker-thread to fail immediately.

In order to address all of these issues, we'll now properly validate the argument passed to `PDFDocumentProxy.getPageIndex` and when necessary reject already on the main-thread instead.
2022-02-24 12:01:51 +01:00
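The main-thread validation described in the commit above could look roughly like the sketch below. It assumes that a page reference is a plain object with non-negative integer `num` and `gen` properties; `isValidRefLike`, the error message, and the `sendToWorker` parameter are hypothetical names used for illustration, not the actual API code.

```javascript
// Hypothetical validator: accepts only objects with integer `num`/`gen` >= 0.
function isValidRefLike(ref) {
  return (
    typeof ref === "object" &&
    ref !== null &&
    Number.isInteger(ref.num) &&
    ref.num >= 0 &&
    Number.isInteger(ref.gen) &&
    ref.gen >= 0
  );
}

// Hypothetical wrapper: `sendToWorker` stands in for the real messaging call.
function getPageIndex(ref, sendToWorker) {
  if (!isValidRefLike(ref)) {
    // Reject on the main-thread, so nothing unclonable reaches the worker.
    return Promise.reject(new Error("Invalid page reference."));
  }
  // Only the validated, clonable fields are forwarded.
  return sendToWorker({ num: ref.num, gen: ref.gen });
}
```

Rejecting early keeps unclonable values (functions, DOM nodes) from ever being posted to the worker-thread.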
Jonas Jenwald
2be8036eb7 [api-minor] Reduce duplication in the "gets non-existent page" unit-test 2022-02-24 11:25:21 +01:00
Jonas Jenwald
ec87995050 Ensure that Cmd/Name is only initialized with string arguments
Trying to use a non-string argument in either a `Cmd` or a `Name` is not intended, and would basically be an implementation error. Hence we can add a non-PRODUCTION check to enforce this, similar to the existing one used e.g. in the `Dict.set` method.
2022-02-23 22:39:12 +01:00
Tim van der Meij
2bb96a708c
Merge pull request #14598 from Snuffleupagus/rm-isBool
Re-factor the `Catalog.viewerPreferences` method and remove the `isBool` helper function
2022-02-23 20:36:56 +01:00
Tim van der Meij
409cbfc817
Merge pull request #14597 from Snuffleupagus/Dict-set-validate-key
Ensure that `Dict.set` only accepts string `key`s
2022-02-23 20:31:36 +01:00
Tim van der Meij
1b51e10c9c
Merge pull request #14595 from Snuffleupagus/structuredClone-comment-support
Update the support information for `structuredClone` (PR 14392 follow-up)
2022-02-23 20:27:35 +01:00
Jonas Jenwald
3704283f5b Remove the isBool helper function
The call-sites are replaced by direct `typeof`-checks instead, which removes unnecessary function calls.
2022-02-23 13:31:03 +01:00
Jonas Jenwald
82f1ee1755 Re-factor the Catalog.viewerPreferences method
This removes the `ViewerPreferencesValidators` structure, and thus (slightly) simplifies the code overall. With these changes we only have to iterate through, and validate, the actually available Dictionary entries.
2022-02-23 13:25:56 +01:00
Jonas Jenwald
a2f9031e9a Ensure that Dict.set only accepts string keys
Trying to use a non-string `key` in a `Dict` is not intended, and would basically be an implementation error. Hence we can add a non-PRODUCTION check to enforce this, complementing the existing `value` check added in PR 11672.
2022-02-22 16:35:20 +01:00
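A development-only check of the kind described in the commit above might be sketched like this. `MiniDict` and the `IS_PRODUCTION` flag are hypothetical stand-ins; the actual `Dict` class and the build-time PRODUCTION mechanism are not reproduced here.

```javascript
// Hypothetical flag; in a real build this would be decided at build time.
const IS_PRODUCTION = false;

class MiniDict {
  constructor() {
    this._map = new Map();
  }

  set(key, value) {
    if (!IS_PRODUCTION && typeof key !== "string") {
      // A non-string key indicates an implementation error, so fail loudly.
      throw new Error('MiniDict.set: The "key" must be a string.');
    }
    this._map.set(key, value);
  }

  get(key) {
    return this._map.get(key);
  }
}
```

Because the check is skipped in production builds, it costs nothing at run time for end users while still catching mistakes during development and testing.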
Jonas Jenwald
48985bd221 Update the support information for structuredClone (PR 14392 follow-up)
When the `structuredClone` polyfill was added, the support information in Safari was unclear. Given that an actual version *number* is now available, see below, it seems like a good idea to update the comment accordingly.

https://developer.mozilla.org/en-US/docs/Web/API/structuredClone#browser_compatibility
2022-02-22 12:30:54 +01:00
Jonas Jenwald
05edd91bdb Remove the isNum helper function
The call-sites are replaced by direct `typeof`-checks instead, which removes unnecessary function calls. Note that in the `src/`-folder we already had more `typeof`-cases than `isNum`-calls.

These changes were *mostly* done using regular expression search-and-replace, with two exceptions:
 - In `Font._charToGlyph` we no longer unconditionally update the `width`, since that seems completely unnecessary.
 - In `PDFDocument.documentInfo`, when parsing custom entries, we now do the `typeof`-check once.
2022-02-22 11:55:34 +01:00
Jonas Jenwald
b282814e38 Prefer instanceof Name rather than calling isName() with one argument
Unless you actually need to check that something is both a `Name` and also of the *correct* type, using `instanceof Name` directly should be a tiny bit more efficient since it avoids one function call and an unnecessary `undefined` check.

This patch uses ESLint to enforce this, since we obviously still want to keep the `isName` helper function for where it makes sense.
2022-02-21 12:45:00 +01:00
Jonas Jenwald
4df82ad31e Prefer instanceof Dict rather than calling isDict() with one argument
Unless you actually need to check that something is both a `Dict` and also of the *correct* type, using `instanceof Dict` directly should be a tiny bit more efficient since it avoids one function call and an unnecessary `undefined` check.

This patch uses ESLint to enforce this, since we obviously still want to keep the `isDict` helper function for where it makes sense.
2022-02-21 12:44:56 +01:00
Jonas Jenwald
67b658e8d5 Prefer instanceof Cmd rather than calling isCmd() with *one* argument
Unless you actually need to check that something is both a `Cmd` and also of the *correct* type, using `instanceof Cmd` directly should be a tiny bit more efficient since it avoids one function call and an unnecessary `undefined` check.

This patch uses ESLint to enforce this, since we obviously still want to keep the `isCmd` helper function for where it makes sense.
2022-02-21 12:44:51 +01:00
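The distinction drawn in the three commits above, between a plain type check and a check for a specific value, can be illustrated with the following sketch. The `Name` class and `isName` helper here are simplified stand-ins written for this example, not copies of the PDF.js primitives.

```javascript
// Simplified stand-in for the primitive.
class Name {
  constructor(name) {
    this.name = name;
  }
}

// Two-argument helper: still useful when a *specific* name is required.
function isName(v, name) {
  return v instanceof Name && (name === undefined || v.name === name);
}

const fontName = new Name("Font");

// One-argument case: `instanceof` alone avoids the call and the
// `name === undefined` comparison.
const isAnyName = fontName instanceof Name;

// Two-argument case: the helper remains the clearer option.
const isFontName = isName(fontName, "Font");
```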
Jonas Jenwald
bad15894fc Improve the JSDocs for the PDFObjects class
Given that we expose `PDFObjects`-instances, via the `commonObjs` and `objs` properties, on the `PDFPageProxy`-instances this ought to help provide slightly better TypeScript definitions.
2022-02-20 13:02:14 +01:00
Jonas Jenwald
f4712bc0ad Simplify the data stored on PDFObjects-instances
The manually tracked `resolved`-property is no longer necessary, since the same information is now directly available on all `PromiseCapability`-instances.
Furthermore, since the `PDFObjects.resolve` method is not documented as accepting e.g. only Object-data, we probably shouldn't resolve the `PromiseCapability` with the `data` and instead only store it on the `PDFObjects`-instance.[1]

---
[1] While Objects are passed by reference in JavaScript, other primitives such as e.g. strings are passed by value and the current implementation *could* thus lead to increased memory usage. Given how we're using `PDFObjects` in the PDF.js code-base none of this should be an issue, but it still cannot hurt to change this.
2022-02-20 12:33:33 +01:00
Jonas Jenwald
beecde3229 Introduce (some) private properties/methods in the PDFObjects class
This ensures that the underlying data cannot be accessed directly, from the outside, since that's definitely not intended here.
Note that we expose `PDFObjects`-instances, via the `commonObjs` and `objs` properties, on the `PDFPageProxy`-instances hence these changes really cannot hurt.
2022-02-20 12:23:30 +01:00
Jonas Jenwald
2cb2f633ac Remove the isRef helper function
This helper function is not really needed, since it's just a wrapper around a simple `instanceof` check, and it only adds unnecessary indirection in the code.
2022-02-19 15:33:42 +01:00
Tim van der Meij
df0aa1a9c4
Merge pull request #14575 from Snuffleupagus/rm-isStream
Remove the `isStream` helper function
2022-02-19 14:59:19 +01:00
Jonas Jenwald
05efe3017b Change PixelsPerInch to a class with static properties (issue 14579)
*Please note:* I'm completely fine with this patch being rejected, and the issue instead closed as WONTFIX, since this is unfortunately a case where the TypeScript definitions dictate how we can/cannot write JavaScript code.

Apparently the TypeScript definitions generation converts the existing `PixelsPerInch` code into a `namespace` and simply ignores the getter; please see a7fc0d33a1/types/src/display/display_utils.d.ts (L223-L226)

Initially I tried tagging `PixelsPerInch` as an `@enum`, see https://jsdoc.app/tags-enum.html, however that unfortunately didn't help.
Hence the only good/simple solution, as far as I'm concerned, is to convert `PixelsPerInch` into a class with `static` properties. This patch results in the following diff, for the `gulp types` build target:
```diff
@@ -195,9 +195,10 @@
      */
     static toDateObject(input: string): Date | null;
 }
-export namespace PixelsPerInch {
-    const CSS: number;
-    const PDF: number;
+export class PixelsPerInch {
+    static CSS: number;
+    static PDF: number;
+    static PDF_TO_CSS_UNITS: number;
 }
 declare const RenderingCancelledException_base: any;
 export class RenderingCancelledException extends RenderingCancelledException_base {
```
2022-02-19 09:05:40 +01:00
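A class with `static` properties, matching the shape of the generated type definitions in the diff above, could be written roughly as follows. The numeric values (96 CSS units and 72 PDF units per inch) are assumptions based on the conventional definitions of those units; the commit message itself does not state them.

```javascript
class PixelsPerInch {
  // Assumed value: CSS defines 96 units per inch.
  static CSS = 96.0;

  // Assumed value: PDF user space defaults to 72 units per inch.
  static PDF = 72.0;

  // Derived once from the two values above; in a static initializer
  // `this` refers to the class itself.
  static PDF_TO_CSS_UNITS = this.CSS / this.PDF;
}
```

Since all three members are plain static properties rather than getters, a type generator has no accessor to ignore.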
Jonas Jenwald
530af48b8e
Merge pull request #14569 from brendandahl/smask-state
Fix canvas state getting out of sync from smasks. (bug 1755507)
2022-02-18 19:35:58 +01:00
Brendan Dahl
7def6d12c8 Fix canvas state getting out of sync from smasks. (bug 1755507)
Soft masks can be enabled/disabled at anytime and at different
points in the save/restore stack. This can lead to
the amount of save/restores becoming unbalanced across the
two canvases. Instead of save/restoring on the temporary canvas
change it so we only track state on the main (suspended canvas).

I was also getting an out-of-balance stack from patterns, so I've also
fixed that and added a warning that will at least show up on chrome.
It would be nice to add this to Firefox at some point too.

Fixes #11328, #14297 and bug 1755507
2022-02-17 17:38:32 -08:00
Jonas Jenwald
1a31855977 Remove the isStream helper function
At this point all the various Stream-classes extend an abstract base-class, hence this helper function is no longer necessary and only adds unnecessary indirection in the code.
2022-02-17 13:51:36 +01:00
Jonas Jenwald
fd319e94b3 Add a missing string-check in the _collectJS helper function
Unfortunately I don't have a test-case that breaks without this change, however the `stringToPDFString` helper function will fail if anything other than a string is passed to it.
The changes in this patch thus make this code more-or-less identical to that found in the `Catalog.{_collectJavaScript, parseDestDictionary}` methods.
2022-02-16 13:43:42 +01:00
Calixte Denizet
18e3a98c2b [api-minor] Don't add in the text content the chars which are out-of-page (bug 1755201)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1755201;
- if the glyph position is not within the view then skip it.
2022-02-13 21:07:11 +01:00
Tim van der Meij
c37d785b2a
Merge pull request #14560 from Snuffleupagus/Node-ReadableStream-polyfill
[api-minor] Remove the, in `legacy` builds, bundled `ReadableStream` polyfill
2022-02-13 14:08:22 +01:00
Jonas Jenwald
b89595fd20 [api-minor] Remove the, in legacy builds, bundled ReadableStream polyfill
According to the MDN compatibility data, see https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream#browser_compatibility, all browsers that we support have native `ReadableStream` implementations (since quite some time too).

Hence only Node.js is now lagging behind w.r.t. `ReadableStream` support, and its experimental implementation doesn't really help us given the life-span of the LTS releases (see https://en.wikipedia.org/wiki/Node.js#Releases).
It seems quite unfortunate to bundle a `ReadableStream` polyfill in the `legacy` builds when it's unnecessary in browsers, given its overall size, but fortunately we can avoid that by simply listing `web-streams-polyfill` as a dependency for the `pdfjs-dist` library.
2022-02-13 10:15:58 +01:00
Jonas Jenwald
d642d34500 Remove the UTF-8 fallback, when TextDecoder is missing, from the Content-Disposition parser
Given that `TextDecoder` is now supported by all modern browsers/environments, please see https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder#browser_compatibility, there's no longer any good reason to keep a UTF-8 fallback in the Content-Disposition parser.
2022-02-12 10:30:25 +01:00
Jonas Jenwald
b87a243222 [api-minor] Stop exposing the createObjectURL helper function in the API
With recent changes, specifically PR 14515 *and* the previous patch, the `createObjectURL` helper function is now only used with the SVG back-end.
All other call-sites, throughout the code-base, are now using `URL.createObjectURL(...)` directly and it no longer seems necessary to keep exposing the helper function in the API.
Finally, the `createObjectURL` helper function is moved into the `src/display/svg.js` file to avoid unnecessarily duplicating this code on both the main- and worker-threads.
2022-02-10 12:01:35 +01:00
Brendan Dahl
f8b2a99ddc
Merge pull request #14543 from Snuffleupagus/bug-1753983
Let `Lexer.getNumber` treat a single minus sign as zero (bug 1753983)
2022-02-09 14:06:35 -08:00
Jonas Jenwald
1f0fb270b1 [api-minor] Ensure that the PDFDocumentLoadingTask-promise is rejected when cancelling the PasswordPrompt (bug 1754421)
This is essentially a *continuation* of PR 7926, where we added support for rejecting the current `PDFDocumentLoadingTask`-promise by throwing inside of the `onPassword`-callback.
Hence the naive way to address [bug 1754421](https://bugzilla.mozilla.org/show_bug.cgi?id=1754421) would be to simply throw in the `onPassword`-callback used in the default viewer. However it unfortunately turns out to not work, since the password input/validation is asynchronous, and we thus need another approach.

The simplest solution that I can come up with here, is thus to *extend* the `onPassword`-callback to also reject the current `PDFDocumentLoadingTask`-instance if an `Error` is explicitly passed as the input to the callback function. (This doesn't feel great, but I cannot see a better solution that isn't really complicated.)
2022-02-09 15:09:20 +01:00
Jonas Jenwald
64f3dbeb48 Let Lexer.getNumber treat a single minus sign as zero (bug 1753983)
This appears to be consistent with the behaviour in both Adobe Reader and PDFium (in Google Chrome); this is essentially the same approach as used for a single decimal point in PR 9827.
2022-02-07 17:09:47 +01:00
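The lenient behaviour described in the commit above can be sketched with a much-simplified number reader. `readNumber` is a hypothetical function operating on a whole string; it is not the actual `Lexer.getNumber` implementation, which reads from a stream.

```javascript
// Simplified, hypothetical number reader (not the actual Lexer code).
function readNumber(text) {
  let i = 0;
  let sign = 1;
  if (text[i] === "-") {
    sign = -1;
    i++;
  } else if (text[i] === "+") {
    i++;
  }
  let digits = "";
  while (i < text.length && /[0-9.]/.test(text[i])) {
    digits += text[i++];
  }
  if (digits === "") {
    if (sign === -1 && i === text.length) {
      // A lone minus sign carries no digits: treat it as zero.
      return 0;
    }
    throw new Error(`Invalid number: ${text}`);
  }
  if (digits === ".") {
    // A lone decimal point is likewise treated as zero.
    return 0;
  }
  return sign * parseFloat(digits);
}
```

Returning zero instead of throwing lets parsing continue past such malformed tokens.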
Jonas Jenwald
03f5f6a421 [api-minor] Update the minimum supported browser versions
Please note that while we "support" some (by now) fairly old browsers, that essentially means that the library (and viewer) will load and that the basic functionality will work as intended.[1]
However, in older browsers, some functionality may not be available and generally we'll ask users to update to a modern browser when bugs (specific to old browsers) are reported.[2]

There's always a question of just how old browsers the PDF.js contributors can realistically support, and here I'm suggesting that we place the cut-off point at approximately *three* years.
With that in mind, this patch updates the *minimum* supported browsers (and environments) as follows:
 - Chrome 73, which was released on 2019-03-12; see https://en.wikipedia.org/wiki/Google_Chrome_version_history
 - Firefox ESR (as before); see https://wiki.mozilla.org/Release_Management/Calendar
 - Safari 12.1, which was released on 2019-03-25; see https://en.wikipedia.org/wiki/Safari_version_history#Safari_12
 - Node.js 12, which was released on 2019-04-23 (and will soon reach EOL); see https://en.wikipedia.org/wiki/Node.js#Releases

---
[1] Assuming a `legacy`-build is being used, of course.

[2] In general it's never a good idea to use an old/outdated browser, since those may contain *known* security vulnerabilities.
2022-02-06 13:06:43 +01:00
Jonas Jenwald
403baa7bba [api-minor] Remove the normalizeWhitespace option in the PDFPageProxy.{getTextContent, streamTextContent} methods (issue 14519, PR 14428 follow-up)
With these changes, we'll now *always* replace all whitespaces with standard spaces (0x20). This behaviour has already been the default, for many years, in both the viewer and the browser-tests.
2022-02-03 09:17:22 +01:00
calixteman
7a034706ba
Merge pull request #14510 from calixteman/14502
[api-minor] Annotations - Adjust the font size in text field in considering the total width (bug 1721335)
2022-01-30 15:58:51 +01:00
Calixte Denizet
ae842e1c3a [api-minor] Annotations - Adjust the font size in text field in considering the total width (bug 1721335)
- it aims to fix #14502 and bug 1721335;
 - Acrobat and Pdfium do the same;
 - it avoids truncated data when printing;
 - change the factor used to compute the font size from the field height: lineHeight = 1.35*fontSize
  - this is the value used by Acrobat.
 - in order to avoid strings being truncated at the bottom, add a few basic metrics for standard fonts.
2022-01-30 15:53:31 +01:00
Jonas Jenwald
7cc761a8c0 Polyfill structuredClone with core-js (PR 13948 follow-up)
This allows us to remove the manually implemented `structuredClone` polyfill, thus reducing the maintenance burden for the `LoopbackPort` class; refer to https://github.com/zloirock/core-js#structuredclone

*Please note:* While `structuredClone` support landed already in Firefox 94, Google Chrome only added it in version 98 (currently in Beta). However, given that the `LoopbackPort` will only be used together with *fake workers* in browsers this shouldn't be too much of a problem.[1]
For Node.js environments, where *fake workers* are unfortunately necessary, using a `legacy/`-build is already required which thus guarantees that the `structuredClone` polyfill is available.

Also, the patch updates core-js to the latest version since that one includes `structuredClone` improvements; please see https://github.com/zloirock/core-js/releases/tag/v3.20.3

---
[1] Given that we only support browsers with proper worker support, if *fake workers* are being used that essentially indicates a configuration problem/error.
2022-01-27 21:11:42 +01:00
Jonas Jenwald
8f6965b197
Merge pull request #14506 from Snuffleupagus/license_header_2022
Update the year in the `license_header` files
2022-01-27 19:34:56 +01:00
Jonas Jenwald
00bd549e82 Update the year in the license_header files
This also includes a couple of files that are included as-is in the `pdfjs-dist` library.
2022-01-27 19:24:31 +01:00
calixteman
838909f8c1
Merge pull request #14491 from quaoaris/lines-rendered-too-thick
fix for lines (stroke) are rendered too thick  (Bug 1743245)
2022-01-27 18:46:26 +01:00
Calixte Denizet
3a7004ca25 Take into account all rotations before comparing glyph positions
- it aims to fix #14497;
 - previously, only rotations with an angle 0, 90, 180 or 270 were taken into account;
 - so generalize to any angle but keep the fast path for 0, 90, ... because they're likely more common than anything else.
2022-01-26 17:19:00 +01:00
quaoaris
3f77d80f31 fix for lines (stroke) are rendered too thick (Bug 1743245)
This commit fixes Bug 1743245 (Grided PDF file lines rendered too thick), which was introduced by a fix for #12868.
The lineWidth was set to round(1 * this._combinedScaleFactor) when the pixel is drawn as a parallelogram with a height < 1. This fix changes this to floor(1 * this._combinedScaleFactor).

This change shows a visual result comparable to Chrome and Acrobat.
Compared with the last PR, 3 statements in canvas.js are affected and change with this commit (stroke and paintChar).

Rename the reference files to follow the naming convention.
2022-01-25 10:27:30 +01:00
Jonas Jenwald
8836593b9e Add a (global) cache to the getCharUnicodeCategory function
Given that the regular expression has already become more complex (after the initial patch adding it), it seems to me that it probably cannot hurt to add a global cache to reduce unnecessary re-parsing.
Obviously the `Glyph`-instances are being cached *per* font, however in most documents multiple fonts are being used and in practice there's very often a fair amount of overlap between the /ToUnicode-data in different fonts[1].

Consider for example loading and rendering the entire `tracemonkey.pdf` document (from the test-suite), which isn't a particularly large document. In that case the `getCharUnicodeCategory` function is being called a total of `601` times, however there's only `106` *unique* unicode-chars being checked.

*Please note:* In practice I suppose that this won't have a *huge* effect on overall performance, however given the relative simplicity of this patch I figured that it'd not hurt to submit it for review.

---
[1] Consider e.g. how there's usually different fonts used for regular, bold, respectively italic text.
2022-01-25 09:59:34 +01:00
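The global cache described in the commit above amounts to memoizing the classification per character. The sketch below assumes a `Map` keyed by the character; `classifyChar` and the returned properties are hypothetical stand-ins for the real regular-expression based classification.

```javascript
// Module-level cache, shared by all fonts in a document.
const categoryCache = new Map();

// Counts how often the (comparatively expensive) classification runs.
let classifyCalls = 0;

// Hypothetical stand-in for the regular-expression based classification.
function classifyChar(char) {
  classifyCalls++;
  return {
    isWhitespace: /^\s$/u.test(char),
    isDigit: /^\p{Nd}$/u.test(char),
  };
}

function getCategoryCached(char) {
  let category = categoryCache.get(char);
  if (category === undefined) {
    category = classifyChar(char);
    categoryCache.set(char, category);
  }
  return category;
}
```

With numbers like those quoted above (601 calls, 106 unique characters), a cache of this shape would run the classification 106 times instead of 601.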
Calixte Denizet
e1d3a3b414 Remove the invisible format marks from the text chunks
- it aims to fix issue #9186.
2022-01-24 13:47:24 +01:00
calixteman
88236e1163
Merge pull request #14430 from calixteman/beforeinput
[JS] Use beforeinput event to trigger a keystroke event in the sandbox
2022-01-23 20:42:33 +01:00
Calixte Denizet
6ac296e48e [JS] Use beforeinput event to trigger a keystroke event in the sandbox
- it aims to fix issue #14307;
 - this event has been added recently in Firefox and we can now use it;
 - fix a few bugs in aform.js or in annotation_layer.js;
 - add some integration tests to test keystroke events (see `AFSpecial_Keystroke`);
 - make dispatchEvent in the quickjs sandbox async.
2022-01-23 19:53:01 +01:00
Tim van der Meij
23b6fde9fc
Merge pull request #14464 from Snuffleupagus/issue-14462
Support Type1 font files with incomplete /CharStrings definitions (issue 14462)
2022-01-19 20:38:46 +01:00
calixteman
b0231cc887
Merge pull request #14456 from calixteman/1749563
Font renderer - get int8 instead of uint8 in composite glyphs (bug 1749563)
2022-01-19 01:20:49 -08:00
Calixte Denizet
74f25d2755 Font renderer - get int8 instead of uint8 in composite glyphs (bug 1749563)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1749563;
 - use some helper functions to get (u|i)int** values from the buffer, which makes the code clearer;
 - in composite glyphs the translation values that come with a transformation are signed, so read them as int8 instead of uint8;
 - add a few TODOs.
2022-01-18 22:06:23 +01:00
Jonas Jenwald
a13ae5d97d Support Type1 font files with incomplete /CharStrings definitions (issue 14462)
Please refer to https://www.pdfa.org/norm-refs/Type1Fonts.pdf#page=15 for the expected format for the /CharStrings entries.
In the referenced PDF document the /CharStrings are missing the expected end-token, which causes us to swallow the start of the next glyph name.
2022-01-17 18:55:22 +01:00
Jonas Jenwald
ba37d600d7 Make the normalizeWhitespace handling, in the PartialEvaluator, more efficient (PR 14428 follow-up)
After the changes in PR 14428 we can *directly*, and more efficiently, handle whitespace conversion in `PartialEvaluator.getTextContent` when the `normalizeWhitespace` option is being used.
This way we no longer need a separate helper function for this, and can avoid having to (again) iterate through the text and check each character. Finally, this also removes the need for using a regular expression on e.g. all non-ASCII text.
2022-01-16 08:29:21 +01:00
calixteman
da953f4b64
Merge pull request #14428 from calixteman/typo
Use the correct dimension to know if we have to add an EOL in vertical mode
2022-01-15 12:47:10 -08:00
Calixte Denizet
9dae421a0d Handle all the whitespaces the same way when creating text chunks 2022-01-15 21:44:00 +01:00
Tim van der Meij
922dac035c
Merge pull request #14448 from Snuffleupagus/Type3-circular-refs
Prevent circular references in Type3 fonts
2022-01-15 14:11:47 +01:00
Tim van der Meij
a72d188599
Merge pull request #14439 from Snuffleupagus/issue-14438
Ignore Annotations with empty /Rect-entries in the display-layer (issue 14438)
2022-01-15 14:11:25 +01:00
Tim van der Meij
c0d2932faf
Merge pull request #14454 from Snuffleupagus/util-more-unreachable
Replace some `assert` usage with `unreachable` in the `src/shared/util.js` file
2022-01-15 13:52:10 +01:00
Tim van der Meij
625f829842
Merge pull request #14446 from Snuffleupagus/issue-14435
Expose even more API-functionality in the TypeScript definitions (issue 14435, PR 14013 follow-up)
2022-01-15 13:46:11 +01:00
Jonas Jenwald
0e1b93bf20 Replace some assert usage with unreachable in the src/shared/util.js file
Inlining the checks should be a *tiny bit* more efficient, since it avoids having to make *unconditional* function calls in these fairly commonly used helper functions.
2022-01-15 13:01:25 +01:00
Jonas Jenwald
12d8f0b64d Re-factor the stringToPDFString helper function for UTF-16 strings
This patch changes the function to instead utilize the `TextDecoder` for both kinds of UTF-16 BOM strings.
2022-01-14 20:38:40 +01:00
Jonas Jenwald
76444888fb Add (basic) UTF-8 support in the stringToPDFString helper function (issue 14449)
This patch implements this by looking for the UTF-8 BOM, i.e. `\xEF\xBB\xBF`, in order to determine the encoding.[1]
The actual conversion is done using the `TextDecoder` interface, which should be available in all environments/browsers that we support; please see https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder#browser_compatibility

---
[1] Assuming that everything lacking a UTF-16 BOM would have to be UTF-8 encoded really doesn't seem correct.
2022-01-14 18:57:07 +01:00
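BOM-based detection followed by `TextDecoder` conversion, as described in the commit above, can be sketched as follows. `decodeUtf8IfMarked` is a hypothetical helper that assumes the input is a "binary string" (one byte per character); it is not the actual `stringToPDFString` code and ignores the UTF-16 and PDFDocEncoding paths.

```javascript
// Hypothetical helper: returns the decoded text, or `null` when no UTF-8 BOM
// is present (so the caller can fall back to another encoding).
function decodeUtf8IfMarked(str) {
  if (!str.startsWith("\xEF\xBB\xBF")) {
    return null;
  }
  // Copy the bytes after the three-byte BOM into a typed array.
  const bytes = new Uint8Array(str.length - 3);
  for (let i = 3; i < str.length; i++) {
    bytes[i - 3] = str.charCodeAt(i) & 0xff;
  }
  return new TextDecoder("utf-8").decode(bytes);
}
```

Requiring the BOM, rather than guessing, follows the footnote above: a string without a UTF-16 BOM is not necessarily UTF-8.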
Jonas Jenwald
53d4ee7990 Prevent circular references in Type3 fonts
In corrupt PDF documents Type3 fonts may introduce circular dependencies, thus resulting in the affected font(s) never loading and parsing/rendering never completing.
Note that I've not seen any real-world examples of this kind of font corruption, but the attached PDF document was rather found in https://github.com/pdf-association/safedocs/tree/main/Miscellaneous%20Targeted%20Test%20PDFs

*Please note:* That repository contains a number of reduced test-cases that are specifically intended to test interoperability (between PDF viewers) and parsing/rendering for various kinds of strange/corrupt PDF documents.
Some of the test-cases found there may thus not make sense to try and "fix" upfront, in my opinion, unless the problems are also found in real-world PDF documents.
2022-01-13 17:58:37 +01:00
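The failure mode described in the commit above, where fonts that depend on each other never finish loading, can be avoided with a generic "currently being loaded" guard. The sketch below is not the approach taken by the patch itself, which the commit message does not detail; `loadFont` and the `fonts` dependency map are hypothetical.

```javascript
// Hypothetical font graph: each key lists the fonts that its glyph
// procedures depend on.
function loadFont(name, fonts, inProgress = new Set(), loaded = []) {
  if (inProgress.has(name)) {
    // The font is (indirectly) referencing itself: stop instead of recursing.
    throw new Error(`Circular font reference: ${name}`);
  }
  inProgress.add(name);
  for (const dependency of fonts[name] || []) {
    loadFont(dependency, fonts, inProgress, loaded);
  }
  inProgress.delete(name);
  loaded.push(name);
  return loaded;
}
```

A real implementation would more likely skip or warn about the offending font than throw, so that the rest of the page can still render.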
Jonas Jenwald
b9849e38b8 Expose even more API-functionality in the TypeScript definitions (issue 14435, PR 14013 follow-up)
While `PageViewport` apparently makes sense in TypeScript environments, given that it's being returned by the `PDFPageProxy.getViewport`-method in the API, we really don't want to extend the *public* API by simply exporting the class directly in `src/pdf.js` since it should never be called/initialized manually.
Hence we follow the same pattern as in PR 14013, and also extend the API unit-tests to ensure that `PDFPageProxy.getViewport` always returns a `PageViewport`-instance as expected.
2022-01-13 12:05:40 +01:00
Jonas Jenwald
08d88a0235 Ignore Annotations with empty /Rect-entries in the display-layer (issue 14438)
This prevents the `BaseSVGFactory.create`-method from throwing, and thus preventing any remaining Annotations (on the page) from rendering in corrupt documents.
2022-01-11 13:54:35 +01:00
Tim van der Meij
8ac0ccc227
Merge pull request #14424 from Snuffleupagus/mv-addLinkAttributes
[api-minor] Move `addLinkAttributes`, `LinkTarget`, and `removeNullCharacters` into the viewer (PR 14092 follow-up)
2022-01-08 13:19:11 +01:00
Calixte Denizet
6369617e6f [JS] Fix a few errors around AFSpecial_Keystroke
- @cincodenada found some errors which are fixed in this patch;
 - it partially fixes issue #14306;
 - add some tests.
2022-01-08 12:34:56 +01:00
Calixte Denizet
9bb636402a Use the correct dimension to know if we have to add an EOL in vertical mode 2022-01-07 15:19:03 +01:00
Jonas Jenwald
7b8794b37e [api-minor] Move removeNullCharacters into the viewer
This helper function has never been used in e.g. the worker-thread, hence its placement in `src/shared/util.js` led to a *small* amount of unnecessary duplication.
After the previous patches this helper function is now *only* used in the viewer, hence it no longer seems necessary to expose it through the official API.

*Please note:* It seems somewhat unlikely that third-party users were relying *directly* on this helper function, which is why it's not being exported as part of the viewer components. (If necessary, we can always change this later on.)
2022-01-06 12:25:33 +01:00
Jonas Jenwald
2d2b6463b8 [api-minor] Move addLinkAttributes and LinkTarget into the viewer
As part of the changes/improvement in PR 14092, we're no longer using the `addLinkAttributes` directly in e.g. the AnnotationLayer-code.
Given that the helper function is now *only* used in the viewer, it no longer seems necessary to expose it through the official API.

*Please note:* It seems somewhat unlikely that third-party users were relying *directly* on the helper function, which is why it's not being exported as part of the viewer components. (If necessary, we can always change this later on.)
2022-01-06 12:25:33 +01:00
Calixte Denizet
6cdae5ac4d Use positive dimensions for text chunks in the text layer (issue #14415). 2022-01-05 10:49:56 +01:00
Jonas Jenwald
b0e774d9c5 Convert Catalog.getAllPageDicts to an async method
The patch in PR 14335 *essentially* re-introduced the old code from before PR 3848, however looking at this code a bit closer it should be possible to simplify it by making the method asynchronous.

While this method is currently only used as a *fallback* in corrupt documents, the way that `MissingDataException`s are handled is less than ideal. Note that if a `MissingDataException` is thrown, we're forced to re-parse the *entire* /Pages tree[1].
With this method now being asynchronous, we're able to handle fetching of References in a *much* easier/nicer way than before without having to throw `MissingDataException`s and re-parse anything.
These changes also let us simplify the call-site slightly, by calling the method *directly* instead of using the `PDFManager`-instance (since again it will no longer throw `MissingDataException`s).

Furthermore, this patch contains the following other changes:
 - Reduce unnecessary duplication in the various `catch` handlers throughout the method, by simply moving the `XRefEntryException` handling into the `addPageError` helper function instead.
 - Move the "circular references"-check to occur slightly earlier, since there's obviously no point in asynchronously fetching data just to then throw an Error *immediately* afterwards.

---
[1] Imagine e.g. a thousand page document, where there's a `MissingDataException` thrown when fetching/parsing page 900.
2021-12-31 22:03:10 +01:00
Jonas Jenwald
1491459dea Improve caching for the Catalog.getPageIndex method (PR 13319 follow-up)
This method is now being used a lot more, compared to when it was added, since it's now used together with scripting as part of the `PDFDocument.fieldObjects` parsing (called during viewer initialization).
For /Page Dictionaries that we've already parsed, the `pageIndex` corresponding to a particular Reference is already known and we're thus able to skip *all* parsing in the `Catalog.getPageIndex` method for those cases.
2021-12-29 20:29:14 +01:00
Jonas Jenwald
a20393e6e4 Update PDFDocument._getLinearizationPage to do the /Type-check correctly (PR 14400 follow-up)
I forgot about this in PR 14400, since we should obviously be consistent *and* given that the existing check is actually wrong; sorry about this!
2021-12-29 13:26:58 +01:00
Tim van der Meij
e42d54e1b5
Merge pull request #14400 from Snuffleupagus/getPageDict-async
[api-minor] Convert `Catalog.getPageDict` to an asynchronous method
2021-12-28 19:40:34 +01:00
Jonas Jenwald
b513c64d9d [api-minor] Convert Catalog.getPageDict to an asynchronous method
Besides converting `Catalog.getPageDict` to an `async` method, thus simplifying the code, this patch also allows us to pro-actively fix an existing issue.
Note how we're looking up References in such a way that `MissingDataException`s won't cause trouble, however it's *technically possible* that the entries (i.e. /Count, /Kids, and /Type) in a /Pages Dictionary could actually be indirect objects as well. In the existing code this could lead to *some*, or even all, pages failing to load/render as intended.
In practice that doesn't *appear* to happen in real-world PDF documents, but given all the weird things that PDF software does I'd prefer to fix this pro-actively (rather than waiting for a bug report).
With `Catalog.getPageDict` being `async` this is now really simple to address, however I didn't want to introduce a bunch more *unconditional* asynchronicity in this method if it could be avoided (since that could slow things down). Hence we'll *synchronously* lookup the *raw* data in a /Pages Dictionary, and only fallback to asynchronous data lookup when a Reference was encountered.
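
As a rough illustration of the "synchronous raw lookup, asynchronous only for References" idea described above, here is a minimal sketch. It is not the actual PDF.js code: `Ref`, `fetchAsync`, `getCount`, and the `Map`-based dictionaries are hypothetical stand-ins.

```javascript
// Hypothetical stand-ins: `Ref` marks an indirect object, and `fetchAsync`
// represents an asynchronous XRef lookup.
class Ref {
  constructor(num) {
    this.num = num;
  }
}

let asyncLookups = 0;
const objects = new Map([[7, 3]]);

function fetchAsync(ref) {
  asyncLookups++;
  return Promise.resolve(objects.get(ref.num));
}

async function getCount(pagesDict) {
  // Read the *raw* entry synchronously first...
  let count = pagesDict.get("Count");
  if (count instanceof Ref) {
    // ...and only pay for an asynchronous lookup when it is a Reference.
    count = await fetchAsync(count);
  }
  return count;
}

// One dictionary stores /Count directly, the other as an indirect object.
const directDict = new Map([["Count", 3]]);
const indirectDict = new Map([["Count", new Ref(7)]]);
```

Calling `getCount(directDict)` never touches `fetchAsync`, whereas `getCount(indirectDict)` triggers exactly one asynchronous lookup.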

In addition to the above, this patch also makes the following notable changes:
 - Let `Catalog.getPageDict` *consistently* reject with the actual error, regardless of what data we're fetching. Previously we'd "swallow" the actual errors except when looking up Dictionary entries, which is inconsistent and thus seems unfortunate. As can be seen from the updated unit-tests this change is API-observable, hence why the patch is tagged `[api-minor]`.

 - Improve the consistency of the Dictionary /Type-checks in both the `Catalog.getPageDict` and `Catalog.getAllPageDicts` methods.
   In `Catalog.getPageDict` there's a fallback code-path where we're *incorrectly* checking the /Page Dictionary for a /Contents-entry, which is wrong since a /Page Dictionary doesn't need to have a /Contents-entry in order to be valid.
   For consistency the `Catalog.getAllPageDicts` method is also updated to handle errors in the /Type-lookup correctly.

 - Reduce the `PagesCountLimit.PAUSE_EAGER_PAGE_INIT` viewer constant, to further improve loading/rendering performance of the *second* page during initialization of very long documents; PR 14359 follow-up.
2021-12-25 15:22:48 +01:00
KouWakai
98158b67a3 Handle non-integer Annotation border widths correctly (issue 14203)
The existing code appears to be wrong, since according to the PDF specification the border width of an Annotation only has to be a number and not specifically an integer. Please see:
 - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=392
 - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2096210
 - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G6.1965562
2021-12-24 22:10:19 +09:00
Jonas Jenwald
e0dba504d2 Fix broken/missing JSDocs and typedefs, to allow updating TypeScript to the latest version (issue 14342)
This patch circumvents the issues seen when trying to update TypeScript to version `4.5`, by "simply" fixing the broken/missing JSDocs and `typedef`s such that `gulp typestest` now passes.
As always, given that I don't really know anything about TypeScript, I cannot tell if this is a "correct" and/or proper way of doing things; we'll need TypeScript users to help out with testing!

*Please note:* I'm sorry about the size of this patch, but given how intertwined all of this unfortunately is it just didn't seem easy to split this into smaller parts.
However, one good thing about this TypeScript update is that it helped uncover a number of pre-existing bugs in our JSDocs comments.
2021-12-15 23:14:25 +01:00
Tim van der Meij
d3e1d7090a
Merge pull request #14370 from Snuffleupagus/getPageDict-sync-Pages
Slightly reduce asynchronicity in the `Catalog.getPageDict` method (PR 14338 follow-up)
2021-12-15 19:40:39 +01:00
Jonas Jenwald
760f765e56 Move the /Lang handling into the BaseViewer (PR 14114 follow-up)
In PR 14114 this was only added to the default viewer, which means that in the viewer components the user would need to *manually* implement /Lang handling. This was (obviously) a bad choice, since the viewer components already support e.g. structTrees by default; sorry about overlooking this!

To avoid having to make *two* `getMetadata` API-calls[1] very early during initialization, in the default viewer, the API will now cache its result. This will also come in handy elsewhere in the default viewer, e.g. by reducing parsing when opening the "document properties" dialog.
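
A minimal sketch of this kind of result caching, assuming a hypothetical `DocumentProxySketch` class and an injected `fetchMetadata` callback (neither is the real API implementation): the first call stores the Promise, and later calls reuse it rather than making another round-trip.

```javascript
// Hypothetical proxy class; `fetchMetadata` stands in for the round-trip to
// the worker-thread.
class DocumentProxySketch {
  constructor(fetchMetadata) {
    this._fetchMetadata = fetchMetadata;
    this._metadataPromise = null;
  }

  getMetadata() {
    if (!this._metadataPromise) {
      // First call: start the (expensive) lookup and remember the Promise.
      this._metadataPromise = this._fetchMetadata();
    }
    // Later calls: hand back the very same Promise.
    return this._metadataPromise;
  }
}
```

Caching the Promise itself, rather than the resolved value, means that two calls made before the first one has settled still share a single lookup.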

---
[1] This not only includes a round-trip to the worker-thread, but also having to re-parse the /Metadata-entry when it exists.
2021-12-14 13:19:05 +01:00
Jonas Jenwald
fa51fd9428 Slightly reduce asynchronicity in the Catalog.getPageDict method (PR 14338 follow-up)
After the changes in PR 14338, specifically in the `XRef.parse`-method, the /Pages-entry will now always have been fetched/validated when the `Catalog`-instance is created.
Hence we can directly access the /Pages-entry in `Catalog.getPageDict` and thus avoid *one* asynchronous data-lookup per page in the document. (In practice this is unlikely to show up in e.g. benchmarks, but it really cannot hurt.)

Finally, make sure that the `getPageDict`/`getAllPageDicts`-methods track the /Pages-tree reference correctly to prevent circular references in corrupt documents.
2021-12-13 21:18:06 +01:00
Tim van der Meij
a6dd39b645
Merge pull request #14358 from Snuffleupagus/checkLastPage-improvements
Improve `PDFDocument.checkLastPage`/`Catalog.getAllPageDicts` for documents with corrupt XRef tables (PR 14311, 14335 follow-up)
2021-12-11 13:07:54 +01:00
Tim van der Meij
70809a80ce
Merge pull request #14355 from Snuffleupagus/api-page-caches-Map
Change `WorkerTransport.{pageCache, pagePromises}` from an Array to a Map
2021-12-11 13:00:11 +01:00
Jonas Jenwald
70ac6b1694 Update Catalog.getAllPageDicts to always propagate the actual Errors (PR 14335 follow-up)
Rather than "swallowing" the actual Errors, when data fetching fails, ensure that they're always being propagated as intended to the call-site instead.
Note that we purposely handle `XRefEntryException` specially, to make it possible to fallback to indexing all XRef objects.
2021-12-10 15:22:36 +01:00
Jonas Jenwald
47f9eef584 Improve PDFDocument.checkLastPage for documents with corrupt XRef tables (PR 14311, 14335 follow-up)
Rather than trying, and failing, to fetch the entire /Pages-tree for documents with corrupt XRef tables, let's fallback to indexing all objects *before* trying to invoke the `Catalog.getAllPageDicts` method.
2021-12-10 11:45:09 +01:00
Jonas Jenwald
f39536a30b Change WorkerTransport.pagePromises from an Array to a Map
Given that not all pages necessarily are being accessed, or that the pages may be accessed out of order, using a `Map` seems like a more appropriate data-structure here.

Finally, also change the `pagePromises` to a *private* property since it's not supposed to be accessed from the "outside".
2021-12-09 15:30:10 +01:00
Jonas Jenwald
c5525dcb69 Change WorkerTransport.pageCache from an Array to a Map
Given that not all pages necessarily are being accessed, or that the pages may be accessed out of order, using a `Map` seems like a more appropriate data-structure here.
For one thing, this simplifies iteration since we no longer have to worry about/check if `pageCache`-entries are undefined (which will happen for *sparse* `Array`s).
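
To illustrate the iteration difference, here is a small sketch with hypothetical caches (`arrayCache` and `mapCache` are illustrative names, not the real `WorkerTransport` properties) where only pages 0 and 4 have been accessed.

```javascript
// Hypothetical page caches; pages 0 and 4 are accessed, pages 1-3 never are.
const arrayCache = [];
arrayCache[0] = { pageIndex: 0 };
arrayCache[4] = { pageIndex: 4 };

const mapCache = new Map();
mapCache.set(0, { pageIndex: 0 });
mapCache.set(4, { pageIndex: 4 });

// With a sparse Array, an index loop must skip the holes explicitly.
const fromArray = [];
for (let i = 0; i < arrayCache.length; i++) {
  const page = arrayCache[i];
  if (!page) {
    continue; // Hole: this page was never requested.
  }
  fromArray.push(page.pageIndex);
}

// With a Map, only the entries that were actually set are visited.
const fromMap = [];
for (const page of mapCache.values()) {
  fromMap.push(page.pageIndex);
}
```

Both loops collect the same two pages, but the `Map` version needs no `undefined` check and its `size` reflects the number of cached pages rather than the highest index.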

Of particular note is that we're no longer attempting to "null" the `pageCache`-entry from within the `PDFPageProxy._destroy`-method. Given that *synchronous* JavaScript will always run to completion[1] and that we're looping through all pages in `WorkerTransport.destroy` and immediately clear the cache afterwards, that code did/does not really make a lot of sense (as far as I can tell).

Finally, also change the `pageCache` to a *private* property since it's not supposed to be accessed from the "outside".

---
[1] Unless there are errors, of course.
2021-12-09 15:29:47 +01:00
Jonas Jenwald
8a05db230e Further improve caching in Catalog.getPageDict, for disableAutoFetch mode (PR 8207 follow-up)
PR 8207 added caching to improve the performance of `Catalog.getPageDict`, by not having to repeatedly fetch the same data and also reducing the asynchronicity of that method.
However, because of *another* oversight on my part, we're only caching /Page references once we've found the correct page. As long as all pages are loaded *in order* this doesn't really matter (happens by default in the viewer), but when `disableAutoFetch` is used the pages may be fetched in a more random order (this patch reduces the asynchronicity of `Catalog.getPageDict` slightly in that case).
2021-12-09 12:54:49 +01:00
Tim van der Meij
97dc048e56
Merge pull request #14350 from Snuffleupagus/ccitt-infinite-loop
Prevent an infinite loop when parsing corrupt /CCITTFaxDecode data (issue 14305)
2021-12-08 20:01:21 +01:00
Jonas Jenwald
e8562173b8 Prevent an infinite loop when parsing corrupt /CCITTFaxDecode data (issue 14305)
Fixes one of the documents in issue 14305.
2021-12-07 13:57:25 +01:00
Jonas Jenwald
5f295ba280 Improve caching in Catalog.getPageDict (PR 8207 follow-up)
PR 8207 added caching to improve the performance of `Catalog.getPageDict`, by not having to repeatedly fetch the same data and also reducing the asynchronicity of that method.
However, because of annoying off-by-one errors[1] the caching became less efficient than it could/should be.[2] Note here that the /Pages-tree is zero-indexed, and that e.g. `pageIndex = 5` thus corresponds to the *sixth* page of the document.
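
The off-by-one can be sketched as follows. The first check is the one quoted in footnote [1]; the second, using `<=`, is an assumption about what the corrected comparison looks like, and both function names are hypothetical.

```javascript
// A subtree that starts at `currentPageIndex` and holds `count` pages covers
// the zero-based indices [currentPageIndex, currentPageIndex + count - 1].

// Check quoted in footnote [1]: too strict by one.
function canSkipSubtreeOld(currentPageIndex, count, pageIndex) {
  return currentPageIndex + count < pageIndex;
}

// Assumed corrected check: the subtree can be skipped as soon as its last
// page lies before the requested one.
function canSkipSubtreeNew(currentPageIndex, count, pageIndex) {
  return currentPageIndex + count <= pageIndex;
}
```

For example, a subtree with five pages starting at index `0` covers indices `0` to `4`, so it cannot contain `pageIndex = 5`; the strict comparison nevertheless refuses to skip it.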

---
[1] In particular the `currentPageIndex + count < pageIndex` part.

[2] For example, even when loading a relatively small/simple document such as `tracemonkey.pdf` in the viewer, the number of `xref.fetchAsync(currentNode)` calls is reduced from `56` to `44` with this patch.
2021-12-06 11:49:31 +01:00
Tim van der Meij
335c4c8a43
Merge pull request #14338 from Snuffleupagus/XRef-more-Pages-validation
[api-minor] Clear all caches in `XRef.indexObjects`, and improve /Root dictionary validation in `XRef.parse` (issue 14303)
2021-12-04 13:23:40 +01:00
Tim van der Meij
3117985c55
Merge pull request #14340 from Snuffleupagus/Metadata-fetch-error
Handle errors when fetching the raw /Metadata (issue 14305)
2021-12-04 13:19:37 +01:00
Jonas Jenwald
d9fac34596 Ensure that the shadow helper function is passed a valid property (PR 14152 follow-up)
Trying to shadow a non-existent property is always an implementation mistake, since it leads to the `shadow`-call not having any effect.

In PR 14152 I overlooked the fact that it's fairly easy to enforce this during development/testing, since that can help catch e.g. simple spelling bugs.
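
A minimal sketch of a `shadow`-style helper with such a check, shown with a hypothetical `Example` class. Note that the sketch validates unconditionally, whereas the patch describes enforcing this only during development/testing.

```javascript
// Replace a (lazy) getter with a plain data property holding its value.
function shadow(obj, prop, value) {
  if (!(prop in obj)) {
    // Shadowing a non-existent property would silently have no effect.
    throw new Error(`shadow: Property "${String(prop)}" not found in object.`);
  }
  Object.defineProperty(obj, prop, {
    value,
    enumerable: true,
    configurable: true,
    writable: false,
  });
  return value;
}

// Hypothetical class using the helper to cache a computed value.
class Example {
  get expensive() {
    Example.computations++;
    return shadow(this, "expensive", 42);
  }
}
Example.computations = 0;
```

With this in place a misspelled property name, e.g. `shadow(this, "expensve", 42)`, throws immediately instead of recomputing the value on every access.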
2021-12-04 10:07:21 +01:00
Jonas Jenwald
40291d1943 Handle errors when fetching the raw /Metadata (issue 14305)
Currently the `Catalog.metadata` getter only handles errors during parsing, however in a *corrupt* PDF document fetching of the raw /Metadata can obviously fail as well.
Without this patch the `PDFDocumentProxy.getMetadata` method, in the API, can thus fail which it *never* should and this will cause the viewer to not initialize all state as expected.

Fixes one of the documents in issue 14305.
2021-12-04 09:41:42 +01:00
Jonas Jenwald
ad3a271fc4 [api-minor] Clear all caches in XRef.indexObjects, and improve /Root dictionary validation in XRef.parse (issue 14303)
*This patch improves handling of a couple of PDF documents from issue 14303.*

 - Update `XRef.indexObjects` to actually clear *all* XRef-caches. Invalid XRef tables *usually* cause issues early enough during parsing that we've not populated the XRef-cache, however to prevent any issues we obviously need to clear that one as well.

 - Improve the /Root dictionary validation in `XRef.parse` (PR 9827 follow-up). In addition to checking that a /Pages entry exists, we'll now also check that it can be successfully fetched *and* that it's of the correct type. There's really no point trying to use a /Root dictionary that e.g. `Catalog.toplevelPagesDict` will reject, and this way we'll be able to fallback to indexing the objects in corrupt documents.

 - Throw an `InvalidPDFException`, rather than a general `FormatError`, in `XRef.parse` when no usable /Root dictionary could be found. That really seems more appropriate overall, since all attempts at parsing/recovery have failed. (This part of the patch is API-observable, hence the tag.)

With these changes, two existing test-cases are improved and the unit-tests are updated/re-factored to highlight that. In particular `GHOSTSCRIPT-698804-1-fuzzed.pdf` will now both load and "render" correctly, whereas `poppler-395-0-fuzzed.pdf` will now fail immediately upon loading (rather than *appearing* to work).
2021-12-03 11:57:38 +01:00
Jonas Jenwald
1fac6371d3 [Regression] Eagerly fetch/parse the entire /Pages-tree in corrupt documents (issue 14303, PR 14311 follow-up)
*Please note:* This is similar to the method that existed prior to PR 3848, but the new method will *only* be used as a fallback when parsing of corrupt PDF documents.

The implementation in PR 14311 unfortunately turned out to be *way* too simplistic, as evident by the recently added test-files in issue 14303, since it may *cause* infinite loops in `PDFDocument.checkLastPage` for some corrupt PDF documents.[1]
To avoid this, the easiest solution that I could come up with was to fallback to eagerly parsing the *entire* /Pages-tree when the /Count-entry validation fails during document initialization.

Fixes *at least* two of the issues listed in issue 14303, namely the `poppler-395-0.pdf...` and `GHOSTSCRIPT-698804-1.pdf...` documents.

---
[1] The whole point of PR 14311 was obviously to *get rid of* infinite loops during document initialization, not to introduce any more of those.
2021-12-02 14:31:04 +01:00
Jonas Jenwald
e045cd4520 Remove the unused skipCount parameter from Catalog.getPageDict (PR 14311 follow-up)
This was added in PR 14311, but given that I completely forgot to update the `PDFDocument.getPage` signature accordingly it's completely unused.
Given that things work just as fine as-is, let's simply remove that optional parameter for now; sorry about the churn here!
2021-12-02 11:51:38 +01:00
Jonas Jenwald
63be23f05b Handle errors correctly when data lookup fails during /Pages-tree parsing (issue 14303)
This only applies to severely corrupt documents, where it's possible that the `Parser` throws when we try to access e.g. a /Kids-entry in the /Pages-tree.

Fixes two of the issues listed in issue 14303, namely the `poppler-742-0.pdf...` and `poppler-937-0.pdf...` documents.
2021-12-02 10:54:40 +01:00
Jonas Jenwald
a807ffe907 Prevent circular references in XRef tables from hanging the worker-thread (issue 14303)
*Please note:* This patch on its own is sufficient to prevent the worker-thread from hanging, however in combination with PR 14311 these PDF documents will both load *and* render correctly.

Rather than focusing on the particular structure of these PDF documents, it seemed (at least to me) to make sense to try and prevent all circular references when fetching/looking-up data using the XRef table.
To avoid a solution that required tracking the references manually everywhere, the implementation settled on here instead handles that internally in the `XRef.fetch`-method. This should work, since that method *and* the `Parser`/`Lexer`-implementations are completely synchronous.

Note also that the existing `XRef`-caching, used for all data-types *except* Streams, should hopefully help to lessen the performance impact of these changes.
One *potential* problem with these changes could be certain *browser* exceptions, since those are generally not catchable in JavaScript code, however those would most likely "stop" worker-thread parsing anyway (at least I hope so).

Finally, note that I settled on returning dummy-data rather than throwing an exception. This was done to allow parsing, for the rest of the document, to continue such that *one* bad reference doesn't prevent an entire document from loading.
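
The general approach can be sketched as below, under stated assumptions: `entries`, `fetchEntry`, `pendingRefs`, and `DUMMY` are hypothetical names, and the real `XRef.fetch` works on parsed PDF objects rather than this toy table.

```javascript
// Hypothetical stand-in for an XRef table: each entry either holds a value or
// points at another entry. Entries 2 and 3 reference each other (corrupt).
const entries = new Map([
  [1, { value: "page data" }],
  [2, { ref: 3 }],
  [3, { ref: 2 }],
]);

const DUMMY = null; // Assumed placeholder returned for circular references.
const pendingRefs = new Set(); // References currently being resolved.

function fetchEntry(num) {
  if (pendingRefs.has(num)) {
    // Already resolving this reference further up the call stack.
    return DUMMY;
  }
  pendingRefs.add(num);
  try {
    const entry = entries.get(num);
    if (!entry) {
      return DUMMY;
    }
    return "ref" in entry ? fetchEntry(entry.ref) : entry.value;
  } finally {
    // Everything here is synchronous, so cleaning up on the way out suffices.
    pendingRefs.delete(num);
  }
}
```

Because the tracking lives inside the lookup function, callers do not have to pass a visited-set around; `fetchEntry(2)` simply returns the dummy value instead of recursing forever.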

Fixes two of the issues listed in issue 14303, namely the `poppler-91414-0.zip-2.gz-53.pdf` and `poppler-91414-0.zip-2.gz-54.pdf` documents.
2021-11-27 23:50:26 +01:00
Jonas Jenwald
a669fce762 Inline the isDict, isRef, and isStream checks in the src/core/xref.js file 2021-11-27 23:49:17 +01:00
Jonas Jenwald
680e0efb9d Use Array-destructuring in the XRef.readXRefStream-method 2021-11-27 23:49:17 +01:00
Jonas Jenwald
d0c4bbd828 [api-minor] Validate the /Pages-tree /Count entry during document initialization (issue 14303)
*This patch basically extends the approach from PR 10392, by also checking the last page.*

Currently, in e.g. the `Catalog.numPages`-getter, we're simply assuming that if the /Pages-tree has an *integer* /Count entry it must also be correct/valid.
As can be seen in the referenced PDF documents, that entry may be completely bogus, which causes general parsing to break down elsewhere in the worker-thread (and hangs the browser).

Rather than hoping that the /Count entry is correct, similar to all other data found in PDF documents, we obviously need to validate it. This turns out to be a little less straightforward than one would like, since the only way to do this (as far as I know) is to parse the *entire* /Pages-tree and essentially count the pages.
To avoid doing that for all documents, this patch tries to take a short-cut by checking if the last page (based on the /Count entry) can be successfully fetched. If so, we assume that the /Count entry is correct and use it as-is, otherwise we'll iterate through (potentially) the *entire* /Pages-tree to determine the number of pages.
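
A simplified sketch of that short-cut, assuming a flat array as a stand-in for the /Pages-tree (the helper names `getPage`, `countPagesSlowly`, and `validatedPageCount` are hypothetical):

```javascript
// Hypothetical page lookup; throws when the page cannot be fetched.
function getPage(pages, index) {
  if (index < 0 || index >= pages.length) {
    throw new Error(`Page index ${index} not found.`);
  }
  return pages[index];
}

// Slow path: count the pages that can actually be fetched.
function countPagesSlowly(pages) {
  let n = 0;
  while (true) {
    try {
      getPage(pages, n);
    } catch {
      break;
    }
    n++;
  }
  return n;
}

function validatedPageCount(pages, claimedCount) {
  if (Number.isInteger(claimedCount) && claimedCount > 0) {
    try {
      // Short-cut: if the claimed last page exists, trust the /Count entry.
      getPage(pages, claimedCount - 1);
      return claimedCount;
    } catch {
      // Fall through to the slow path.
    }
  }
  return countPagesSlowly(pages);
}
```

Note that the short-cut only detects a /Count that is too large or not a positive integer; a value that is too small still passes, which matches the "assume it is correct" trade-off described above.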

Unfortunately these changes will have a number of *somewhat* negative side-effects, please see a possibly incomplete list below, however I cannot see a better way to address this bug.
 - This will slow down initial loading/rendering of all documents, at least by some amount, since we now need to fetch/parse more of the /Pages-tree in order to be able to access the *last* page of the PDF documents.
 - For poorly generated PDF documents, where the entire /Pages-tree only has *one* level, we'll unfortunately need to fetch/parse the *entire* /Pages-tree to get to the last page. While there's a cache to help reduce repeated data lookups, this will affect initial loading/rendering of *some* long PDF documents.
 - This will affect the `disableAutoFetch = true` mode negatively, since we now need to fetch/parse more data during document initialization. While the `disableAutoFetch = true` mode should still be helpful in larger/longer PDF documents, for smaller ones the effect/usefulness may unfortunately be lost.

As one *small* additional bonus, we should now also be able to support opening PDF documents where the /Pages-tree /Count entry is completely invalid (e.g. contains a non-integer value).

Fixes two of the issues listed in issue 14303, namely the `poppler-67295-0.pdf` and `poppler-85140-0.pdf` documents.
2021-11-27 21:57:35 +01:00
Tim van der Meij
9a1e27efc5
Merge pull request #14313 from Snuffleupagus/PDFDocument_pagePromises-map
Change the `_pagePromises` cache, in the worker, from an Array to a Map
2021-11-27 20:58:23 +01:00
calixteman
bbd8b5ce9f
Merge pull request #14319 from calixteman/xfa_arc
XFA - Draw arcs correctly
2021-11-27 11:32:32 -08:00
Calixte Denizet
31e13515f5 XFA - Draw arcs correctly
- it aims to fix #14315;
- take into account the startAngle to compute the coordinates of the final point.
2021-11-27 19:30:12 +01:00
Calixte Denizet
cfdaa57353 Handle sub/super-scripts in rich text
- it aims to fix #14317;
 - change the fontSize and the verticalAlign properties according to the position of the text.
2021-11-27 16:06:09 +01:00
Jonas Jenwald
4c56214ab4 Convert PDFDocument._getLinearizationPage to an async method
This, ever so slightly, simplifies the code and reduces overall indentation.
2021-11-26 19:57:47 +01:00
Jonas Jenwald
080996ac68 Change the _pagePromises cache, in the worker, from an Array to a Map
Given that not all pages necessarily are being accessed, or that the pages may be accessed out of order, using a `Map` seems like a more appropriate data-structure here.
Furthermore, this patch also adds (currently missing) caching for XFA-documents. Loading a couple of such documents in the viewer, with logging added, shows that we're currently re-creating `Page`-instances unnecessarily for XFA-documents.
2021-11-26 19:53:57 +01:00
Jonas Jenwald
ca8d2bdce4 Abort parsing when the XRef /W-array contains bogus entries (issue 14303)
For this particular PDF document, we have `/W [1 2 166666666666666666666666666]` which obviously makes no sense.

While this patch makes no attempt at actually validating the entries in the /W-array, we'll now simply abort all processing when the end of the PDF document has been reached (thus preventing hanging the browser).
Please note that this patch doesn't enable the PDF document to be loaded/rendered, but at least it fails "correctly" now.

Fixes one of the issues listed in issue 14303, namely the `REDHAT-1531897-0.pdf` document.
2021-11-25 18:35:08 +01:00
Jonas Jenwald
ae4f1ae3e7 Ensure that ChunkedStream won't attempt to request data *beyond* the document size (issue 14303)
This bug was surprisingly difficult to track down, since it didn't just depend on range-requests being used but also on how quickly the document was loaded. To even be able to reproduce this locally, I had to use a very small `rangeChunkSize`-value (note the unit-test).

The cause of this bug is a bogus entry in the XRef-table, causing us to attempt to request data from *beyond* the actual document size and thus getting into an infinite loop.
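
One way to guard against this kind of request is to clamp the requested range to the document length before computing which chunks are needed. The sketch below is purely illustrative: `chunksForRange` and its parameters are hypothetical and do not reflect the actual `ChunkedStream` implementation.

```javascript
// Return the chunk indices needed to cover [begin, end), never going beyond
// the document `length`.
function chunksForRange(begin, end, length, chunkSize) {
  const clampedEnd = Math.min(end, length);
  const chunks = [];
  if (begin >= clampedEnd) {
    // The whole range lies beyond the document; there's nothing to request.
    return chunks;
  }
  const beginChunk = Math.floor(begin / chunkSize);
  const endChunk = Math.floor((clampedEnd - 1) / chunkSize) + 1;
  for (let chunk = beginChunk; chunk < endChunk; chunk++) {
    chunks.push(chunk);
  }
  return chunks;
}
```

With a 100-byte document and a chunk size of 32, a bogus range such as `[200, 300)` yields no chunks at all, so there is nothing to wait for indefinitely.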

Fixes *one* of the issues listed in issue 14303, namely the `PDFBOX-4352-0.pdf` document.
2021-11-24 19:19:43 +01:00
Jonas Jenwald
6da0944fc7 [api-minor] Replace PDFDocumentProxy.getStats with a synchronous PDFDocumentProxy.stats getter
*Please note:* These changes will primarily benefit longer documents, somewhat at the expense of e.g. one-page documents.

The existing `PDFDocumentProxy.getStats` function requires a round-trip to the worker-thread in order to obtain the current document stats, and in the default viewer we currently make one such API-call for *every rendered* page.
This patch proposes replacing that method with a *synchronous* `PDFDocumentProxy.stats` getter instead, combined with re-factoring the worker-thread code by adding a `DocStats`-class to track Stream/Font-types and *only send* them to the main-thread *the first time* that a type is encountered.

Note that in practice most PDF documents only use a fairly limited number of Stream/Font-types, which means that in longer documents most of the `PDFDocumentProxy.getStats`-calls will return the same data.[1]
This re-factoring will obviously benefit longer documents the most[2], and could actually be seen as a regression for one-page documents, since in practice there'll usually be a couple of "DocStats" messages sent during the parsing of the first page. However, if the user zooms/rotates the document (which causes re-rendering), note that even a one-page document would start to benefit from these changes.

Another benefit of having the data available/cached in the API is that unless the document stats change during parsing, repeated `PDFDocumentProxy.stats`-calls will return *the same identical* object.
This is something that we can easily take advantage of in the default viewer, by now *only* reporting "documentStats" telemetry[3] when the data actually have changed rather than once per rendered page (again beneficial in longer documents).
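
The "only send the first time a type is encountered" idea can be sketched like this. `DocStatsSketch` and its `send` callback are hypothetical; in particular, `send` merely stands in for the worker-to-main-thread message channel.

```javascript
// Hypothetical stats tracker that only reports when something new is seen.
class DocStatsSketch {
  constructor(send) {
    this._send = send;
    this._streamTypes = new Set();
    this._fontTypes = new Set();
  }

  addStreamType(type) {
    this._add(this._streamTypes, type);
  }

  addFontType(type) {
    this._add(this._fontTypes, type);
  }

  _add(set, type) {
    if (set.has(type)) {
      return; // Already reported; no message needed.
    }
    set.add(type);
    this._send({
      streamTypes: [...this._streamTypes],
      fontTypes: [...this._fontTypes],
    });
  }
}
```

However many pages reuse the same Stream/Font-types, the number of messages is bounded by the number of distinct types rather than by the page count.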

---
[1] Furthermore, the maximum number of `StreamType`/`FontType` entries is `10` and `12` respectively, which means that regardless of the complexity and page count in a PDF document there'll never be more than twenty-two "DocStats" messages sent; see 41ac3f0c07/src/shared/util.js (L206-L232)

[2] One example is the `pdf.pdf` document in the test-suite, where rendering all of its 1310 pages only result in a total of seven "DocStats" messages being sent from the worker-thread.

[3] Reporting telemetry, in Firefox, includes using `JSON.stringify` on the data and then sending an event to the `PdfStreamConverter.jsm`-code.
In that code the event is handled and `JSON.parse` is used to retrieve the data, and in the "documentStats"-case we'll then iterate through the data to avoid double-reporting telemetry; see https://searchfox.org/mozilla-central/rev/8f4c180b87e52f3345ef8a3432d6e54bd1eb18dc/toolkit/components/pdfjs/content/PdfStreamConverter.jsm#515-549
2021-11-20 12:20:55 +01:00
Tim van der Meij
41ac3f0c07
Merge pull request #14291 from Snuffleupagus/force-postMessageTransfers
[api-minor] Only use Workers when `postMessage` transfers are supported (PR 11123 follow-up)
2021-11-19 20:02:51 +01:00
Brendan Dahl
c6cb39ef30
Merge pull request #14262 from Snuffleupagus/issue-14261
Include the /Lang-property, when it exists, in the StructTree-data (issue 14261)
2021-11-19 07:51:21 -08:00
Jonas Jenwald
6f22327e61 [api-minor] Only use Workers when postMessage transfers are supported (PR 11123 follow-up)
Given that all modern browsers now support `postMessage` transfers, and have for years, it no longer seems necessary for the PDF.js library to support using Workers unless the `postMessage` transfers functionality is available.
This patch is a follow-up to PR 11123, which made it impossible to *manually* disable `postMessage` transfers for performance reasons (since it increases memory usage), which hasn't caused any bug reports as far as I know.[1]

Hence we'll now only support *proper* Worker implementations, with fully working `postMessage` transfers, and fallback to using "fake" Workers otherwise.

---
[1] At the time of that PR we still "supported" IE, which is why this code was left intact.
2021-11-19 16:47:58 +01:00
Tim van der Meij
3dccaccbb4
Merge pull request #14278 from Snuffleupagus/rm-removeChild
Replace the remaining `Node.removeChild()` instances with `Element.remove()`
2021-11-17 20:17:55 +01:00
Jonas Jenwald
4ef1a129fa Replace the remaining Node.removeChild() instances with Element.remove()
Using `Element.remove()` is a slightly more compact way of removing an element, since you no longer need to explicitly find/use its parent element.
Furthermore, the patch also replaces a couple of loops that're used to delete all elements under a node with simply overwriting the contents directly (a pattern already used throughout the viewer).

See also:
 - https://developer.mozilla.org/en-US/docs/Web/API/Node/removeChild
 - https://developer.mozilla.org/en-US/docs/Web/API/Element/remove
2021-11-16 17:52:50 +01:00
Brendan Dahl
3209c013c4
Merge pull request #14247 from calixteman/button
[api-minor] Render pushbuttons on their own canvas (bug 1737260)
2021-11-16 08:10:40 -08:00
Jonas Jenwald
971ac8e993 Include the /Lang-property, when it exists, in the StructTree-data (issue 14261)
*Please note:* This is a tentative patch, since I don't have the necessary a11y-software to actually test it.
2021-11-14 12:37:41 +01:00
Jonas Jenwald
a54bed4963 Enable the ESLint no-loss-of-precision rule
Please refer to https://eslint.org/docs/rules/no-loss-of-precision
2021-11-14 10:48:50 +01:00
calixteman
85c6dd59ce
Merge pull request #14268 from calixteman/outline
Remove non-displayable chars from outline title (#14267)
2021-11-13 08:12:56 -08:00
Calixte Denizet
7041c62ccf Remove non-displayable chars from outline title (#14267)
- it aims to fix #14267;
 - there is nothing about chars in range [0-1F] in the specs but Acrobat doesn't display them in any way.
2021-11-13 16:56:08 +01:00
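The idea in commit 7041c62ccf can be sketched roughly as follows. The helper name `removeControlChars` is hypothetical and this is not the actual PDF.js implementation; it only shows one way of dropping characters in the range [0-1F] from a title string.

```javascript
// Hypothetical helper: strips the control characters U+0000..U+001F,
// which Acrobat reportedly doesn't display in outline titles.
function removeControlChars(title) {
  return title.replace(/[\x00-\x1F]/g, "");
}

console.log(removeControlChars("Chapter\u0000 1\r\n")); // "Chapter 1"
```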
Jonas Jenwald
afcc99a86d When parsing corrupt documents without any trailer-dictionary, fallback to the "top"-dictionary (issue 14269)
There's obviously no guarantee that this will work in general, if the document is sufficiently corrupt, but it should hopefully be better than just throwing `InvalidPDFException` as currently happens.

Please note that, as is often the case with corrupt documents, it's somewhat difficult to know if we're rendering the document "correctly" with this patch[1]. In this case even Adobe Reader cannot open the document, which is always a good sign that it's *really* corrupt, however we're at least able to render *something* with this patch.

---
[1] Whatever "correct" even means when dealing with corrupt PDF documents, where often times different PDF viewers won't agree completely.
2021-11-13 13:21:38 +01:00
Jonas Jenwald
28fb3975eb
Merge pull request #14266 from calixteman/bug931481
Don't consider space as real space when there is an extra spacing (bug 931481)
2021-11-12 21:42:32 +01:00
Calixte Denizet
a88ff34eb7 Don't consider space as real space when there is an extra spacing (bug 931481)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=931481;
 - real space chars are pushed in the chunk but when there is an extra spacing, the next char position must be compared with the previous one;
 - for example, an extra spacing can cancel a space so visually there is no space.
2021-11-12 18:53:48 +01:00
Calixte Denizet
5b7e1f5232 XFA - Avoid an exception when looking for a font in a parent node
- it aims to fix issue https://github.com/mozilla/pdf.js/issues/14150;
  - a parent can be null in case the root has been reached, so just add a check.
2021-11-12 16:27:08 +01:00
Calixte Denizet
33ea817b20 [api-minor] Render pushbuttons on their own canvas (bug 1737260)
- First step to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1737260;
 - several interactive pdfs use the possibility to hide/show buttons to show different icons;
 - render pushbuttons on their own canvas and then insert it into the annotation_layer;
 - update test/driver.js in order to convert canvases for pushbuttons into images.
2021-11-12 15:37:33 +01:00
Jonas Jenwald
ea1c348c67 Always prefer abbreviated keys, over full ones, when doing any dictionary lookups (issue 14256)
Note that issue 14256 was specifically about *inline* images, please refer to:
 - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G7.1852045
 - https://www.pdfa.org/safedocs-unearths-pdf-inline-image-issue/
 - https://pdf-issues.pdfa.org/32000-2-2020/clause08.html#H8.9.7

However, during review of the initial PR in https://github.com/mozilla/pdf.js/pull/14257#issuecomment-964469710, it was suggested that we instead do this *unconditionally for all* dictionary lookups.
In addition to re-ordering the existing call-sites in the `src/core`-code, and adding non-PRODUCTION/TESTING asserts to catch future errors, for consistency a number of existing `if`/`switch`-blocks were re-factored to also check the abbreviated keys first.
2021-11-10 11:56:18 +01:00
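A minimal sketch of the lookup order described in commit ea1c348c67, assuming a simplified dictionary class. `SimpleDict` is a stand-in invented for this example, not the PDF.js `Dict` implementation.

```javascript
// Simplified stand-in for a PDF dictionary with a two-key lookup.
class SimpleDict {
  constructor(entries) {
    this.map = new Map(Object.entries(entries));
  }

  // Returns the value of the first key that is present, so callers
  // pass the abbreviated key first and the full key second.
  get(key1, key2) {
    if (this.map.has(key1)) {
      return this.map.get(key1);
    }
    if (key2 !== undefined && this.map.has(key2)) {
      return this.map.get(key2);
    }
    return undefined;
  }
}

// An inline image dictionary that (incorrectly) contains both forms.
const inlineImage = new SimpleDict({ W: 10, Width: 999, Height: 20 });

console.log(inlineImage.get("W", "Width")); // 10 (abbreviated key wins)
console.log(inlineImage.get("H", "Height")); // 20 (falls back to the full key)
```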
calixteman
4bb9de4b00
Merge pull request #14239 from calixteman/1739502
XFA - Fix a breakBefore issue when target is a contentArea and startNew is 1 (bug 1739502)
2021-11-08 03:14:42 -08:00
Calixte Denizet
13ae6d493a XFA - Encode tag names in UTF-8 when saving (fix #14249)
2021-11-07 21:41:37 +01:00
calixteman
efb4455749
Merge pull request #14240 from calixteman/14014
XFA - Get each page asynchronously in order to avoid blocking the event loop (#14014)
2021-11-06 13:21:43 -07:00
Calixte Denizet
1681e25008 XFA - Get each page asynchronously in order to avoid blocking the event loop (#14014)
2021-11-06 13:25:03 +01:00
Brendan Dahl
b56cca0324 Create shading patterns the size of the current path. (bug 1722807)
Previously, when we created a shading pattern canvas we created it
as the same size as the page. This was good for caching if the same
pattern was used over and over again, but when lots of different
shadings are created that caused us to create many full page
canvases.

Instead of creating the full page canvases, create the canvas
as the same size as the current path bounding box. This reduces memory
consumption by a lot since most paths are pretty small. Also, in real world
PDFs it's rare for a shading (non shading fill) to be reused over and over again.
Bug 1721949 is an example where the same pattern is reused and it will be slightly
slower than before.
2021-11-05 20:44:18 -07:00
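To give a feel for the memory argument in commit b56cca0324, here is a back-of-the-envelope calculation. The page and path sizes are illustrative assumptions (a 612x792 page at 1x scale and a 50x20 path bounding box), and 4 bytes per pixel assumes an RGBA backing store.

```javascript
// Illustrative numbers only, not measurements from PDF.js.
const BYTES_PER_PIXEL = 4; // RGBA canvas backing store

function canvasBytes(width, height) {
  return Math.ceil(width) * Math.ceil(height) * BYTES_PER_PIXEL;
}

const fullPageBytes = canvasBytes(612, 792); // 1,938,816 bytes
const pathBoxBytes = canvasBytes(50, 20); // 4,000 bytes

console.log(fullPageBytes, pathBoxBytes);
```

With these assumed sizes, the per-pattern canvas is several hundred times smaller than a full-page one, which is why many distinct shadings no longer add up to many full-page canvases.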
Brendan Dahl
8161d3f29d Don't double apply a group xobject's bbox.
In `beginGroup` we create a new canvas that is the size of the
bounding box and we translate it to the offset. This means we don't need to
also apply the bounding box during `paintFormXObjectBegin`.

This improves #6961 quite a bit, but it still is missing the indentation
in the ruler.
2021-11-05 15:40:58 -07:00
Calixte Denizet
a08763f4aa XFA - Fix a breakBefore issue when target is a contentArea and startNew is 1 (bug 1739502)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1739502;
 - when the target area was the current content area, everything was pushed in it instead of creating a new one (and consequently a new pageArea is created).
 - the pdf shows an alignment issue on page 4:
   - the hAlign is "center" but the subform was the width of its parent, so compute the real width of the subform with tb layout;
 - there is an extra empty page at the end of the pdf:
   - there is a subform with some hidden elements which are not rendered for now (since no JS engine is plugged in, it isn't possible to display them by changing their visibility).
   - so in case a subform is empty and has no real dimensions (at least one is 0), we just consider it as empty.
2021-11-05 18:59:55 +01:00
calixteman
e136afbabc
Merge pull request #14218 from janekotovich/subform_min_0
XFA subform with occur min=0 and no bound data displaying.
2021-11-05 04:12:34 -07:00
Jonas Jenwald
8222d6530b
Merge pull request #14232 from brendandahl/show-text-pattern
Use correct matrix for patterns with showText.
2021-11-05 10:04:56 +01:00
Brendan Dahl
1c7048399b Use correct matrix for patterns with showText.
We were incorrectly using the transform in the pattern before it had been
adjusted causing the pattern to be misplaced relative to the page.

Fixes: ShowText-ShadingPattern.pdf (already in corpus)
Fixes: #8111
Fixes: #9243
2021-11-04 16:57:36 -07:00
Jane-Kotovich
56b502391c XFA subform with occur min=0 and no bound data displaying
Subform nomin displays even though its subform is set to <occur max=-1 min=0>
If we look through specs of XFA 3.3 : https://www.pdfa.org/norm-refs/XFA-3_3.pdf
- The min attribute is used when processing a form that contains data. Regardless of the data at least this number of instances is included. It is permissible to set this value to zero, in which case the container is entirely excluded if there is no data for it.

However, in our case this doesn't happen, because we let our empty dataNode get through. By adding a clause:
- eliminate unmatched data with occur min=0
we now check for empty data and send it to the uselessNode array, from which it gets removed at the end.
2021-11-04 20:22:05 +10:00
Jonas Jenwald
e1a35e7bb6
Merge pull request #14213 from Snuffleupagus/issue-11656
Tweak the Bidi-detection heuristics for very short RTL strings (issue 11656)
2021-11-03 22:09:14 +01:00
Jonas Jenwald
5f77d3719b Tweak the Bidi-detection heuristics for very short RTL strings (issue 11656)
Very short strings can narrowly miss the existing Bidi-detection threshold, leading to incorrect text-selection and copying behaviour.

In my testing, neither Adobe Reader nor PDFium seem to handle copying "correctly" for this document. Hence it's not entirely clear to me that we actually want to fix this, since tweaking these heuristics can *obviously* cause regressions elsewhere (and our test coverage for RTL-text isn't exactly great).
2021-11-03 20:31:57 +01:00
Brendan Dahl
039a7a670f Reset path bounding box tracking when starting a new path.
Starting a new path will wipe out any of the current subpaths in the
current graphics state, so we should reset the min/maxes.

This makes a number of the bounding boxes smaller and reduces the number
of composed pixels. For the smask tests in the corpus, the number of
composed pixels goes from 19,872,109 to 19,676,905. The difference is much
larger on other PDFs though.
2021-11-03 11:46:52 -07:00
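The min/max reset described in commit 039a7a670f can be sketched as below. `PathBoundsTracker` and its method names are hypothetical and only meant to show why stale extents from an earlier path would otherwise inflate the box.

```javascript
// Hypothetical tracker for the bounding box of the current path.
class PathBoundsTracker {
  constructor() {
    this.reset();
  }

  // Called when a new path starts: forget all previous subpaths.
  reset() {
    this.minX = Infinity;
    this.minY = Infinity;
    this.maxX = -Infinity;
    this.maxY = -Infinity;
  }

  addPoint(x, y) {
    this.minX = Math.min(this.minX, x);
    this.minY = Math.min(this.minY, y);
    this.maxX = Math.max(this.maxX, x);
    this.maxY = Math.max(this.maxY, y);
  }

  getBox() {
    return [this.minX, this.minY, this.maxX, this.maxY];
  }
}

const tracker = new PathBoundsTracker();
tracker.addPoint(0, 0);
tracker.addPoint(100, 100); // first, large path

tracker.reset(); // a new path begins
tracker.addPoint(10, 10);
tracker.addPoint(20, 20);

console.log(tracker.getBox()); // [10, 10, 20, 20]
```

Without the `reset()` call the box would still be `[0, 0, 100, 100]`, so a much larger region than necessary would be composed.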
Jonas Jenwald
8c70258065
Merge pull request #14182 from calixteman/richtext
Support rich content in markup annotation
2021-10-31 14:41:56 +01:00
Calixte Denizet
cf8dc750d6 Support rich content in markup annotation
- use the xfa parser but in the xhtml namespace.
2021-10-31 13:44:51 +01:00
calixteman
2d8b6fda8f
Merge pull request #14207 from janekotovich/forms_version_popup
JS - Avoid a popup to ask for specific version of Acrobat
2021-10-30 05:45:31 -07:00
Tim van der Meij
ec1633c33c
Merge pull request #14201 from Snuffleupagus/bug-1219400
Use the correct border-style for Annotations, when a dash array is specified (bug 1219400)
2021-10-30 12:39:46 +02:00
Jane-Kotovich
12f89d2ab1 JS - Avoid a popup to ask for specific version of Acrobat
Embedded JS in a PDF keeps throwing an alert regarding a specific version of Acrobat (Spanish and version 5.0 or greater).
This happens because:
- JS in pdf is enabled
- PDF contains some unsupported features (e.g. XFA)
The alert comes when app.formVersion = undefined || app.formVersion < 5.0
In pdf.js we were using FORMS_VERSION = undefined. After researching based on https://opensource.adobe.com/dc-acrobat-sdk-docs/acrobatsdk/pdfs/acrobatsdk_jsapiref.pdf\#G4.1993509 and Acrobat DC we decided to go with the larger number to avoid unnecessary popups.
Through investigation we realised that VIEWER_VERSION should have the same value - a number.
Due to all that, we implemented 21.00720099 as the value for both FORMS_VERSION and VIEWER_VERSION
2021-10-29 23:09:59 +10:00
Tim van der Meij
0e7614df7f
Merge pull request #14180 from Snuffleupagus/bug-1627427
Handle ranges that "overflow" the last byte in `CMap.mapBfRange` (bug 1627427)
2021-10-27 20:06:09 +02:00
Jonas Jenwald
884caf602e Use the correct border-style for Annotations, when a dash array is specified (bug 1219400)
Even though we cannot use the dash array in the display layer, at least ensure that we use the correct border-style.
2021-10-27 13:20:21 +02:00
Jane-Kotovich
91fc643ff9 [api-minor] Implement securityHandler in the scripting API (bug 1731578)
2021-10-26 23:42:04 +10:00
Jonas Jenwald
aa1b78684f Handle ranges that "overflow" the last byte in CMap.mapBfRange (bug 1627427)
2021-10-24 13:48:38 +02:00
Tim van der Meij
0aaa4e3dbe
Merge pull request #14156 from Snuffleupagus/escodegen-fork
Add support for modern ECMAScript `class` features
2021-10-23 19:12:44 +02:00
Jonas Jenwald
52372b9378
Merge pull request #14175 from brendandahl/smask-v2
Use a new method for handling soft masks.
2021-10-23 09:27:18 +02:00
Brendan Dahl
82681ea20c Track the clipping box and bounding box of the path.
This allows us to compose much smaller regions of soft
mask making them much faster. This should also allow
for further optimizations in the pattern code.

For example locally I see issue #6573 go from 55s
to 5s with this change.

Fixes #6573
2021-10-22 13:41:29 -07:00
Brendan Dahl
2d1f9ff7a3 Use a new method for handling soft masks.
The old method of handling soft masks had a number of issues where the temporary
drawing canvas and the suspended main canvas could get out of sync
(e.g. mismatched save/restores or clip state) or we could end up compositing at
the wrong time. A good example of things getting out of sync is the reduced test
case in #9017.

To fix this I've changed two big things:

1) Duplicate all the needed graphics state from the temporary canvas to the
suspended main canvas. This ensures the canvases stay in sync so that when we
switch back to the main canvas the graphics state stack is the same
(e.g. transforms, clip paths).

2) Immediately composite after each drawing operation. This ensures that if
there's an active clip region that we'll still be able to composite the correct
portions of the canvas. Note: This solution could be avoided by using
getImageData and putImageData since those ignore clipping region, but this is
very very slow. Note2: I also think the old way of only compositing at the end
of the soft mask is incorrect and can lead to wrong colors if drawing over the
same region, but in practice this doesn't seem to matter much.

Fixes: #5781
Fixes: #5853
Fixes: #7267
Fixes: #7891
Fixes: #8403
Fixes: #8624
Fixes: #12798
Fixes: #13891
Fixes: #9017 (reduced test case)
Fixes: https://bugzilla.mozilla.org/show_bug.cgi?id=1703683
2021-10-22 13:41:21 -07:00
Jonas Jenwald
89785a23f3 Convert Metadata to use private class fields
Please refer to https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Classes/Private_class_fields
2021-10-22 22:01:19 +02:00
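For readers unfamiliar with the syntax referenced in commit 89785a23f3, here is a minimal sketch of private class fields. The `DocumentMetadata` class is invented for this example and does not mirror the actual `Metadata` class.

```javascript
// Hypothetical class: `#entries` is only reachable from inside the class body.
class DocumentMetadata {
  #entries = new Map();

  constructor(pairs) {
    for (const [name, value] of pairs) {
      this.#entries.set(name, value);
    }
  }

  get(name) {
    return this.#entries.get(name) ?? null;
  }

  has(name) {
    return this.#entries.has(name);
  }
}

const metadata = new DocumentMetadata([["dc:title", "Example"]]);
console.log(metadata.get("dc:title")); // "Example"
console.log(metadata.entries); // undefined, the private field is not a property
console.log(Object.keys(metadata).length); // 0
```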
Tim van der Meij
11f030d301
Merge pull request #14171 from Snuffleupagus/issue-14170
Prevent run-time errors in Node.js versions with `URL.createObjectURL` support (issue 14170)
2021-10-22 21:07:19 +02:00
Jonas Jenwald
044197808a Prevent double-rendering borders for PushButton-annotations (PR 14083 follow-up)
With ResetForm-action support added in PR 14083, there's a regression in the `issue12716` test-case. More specifically the border around the "Clear Form"-link is now rendered *twice*, once in the canvas via the appearance-stream and once in the annotationLayer via the border-data.
This looks slightly weird, and was most likely not intended, which is why this patch suggests that we ignore the border in the annotationLayer when an appearance-stream exists.
2021-10-21 13:31:16 +02:00
Jonas Jenwald
ff9d2b2ab1 Prevent run-time errors in Node.js versions with URL.createObjectURL support (issue 14170)
Apparently Node.js has added *global* `URL.createObjectURL` support, but not done the same thing for `Blob`. Hence we also need to check for the availability of `Blob` in the `createObjectURL` helper function, and it's probably a good idea to also update `examples/node/pdf2svg.js` to work-around this until these changes reach an official PDF.js release.
2021-10-21 10:32:44 +02:00
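The extra availability check mentioned in commit ff9d2b2ab1 could look roughly like this. The function name `canCreateObjectURL` is an assumption made for this sketch; it is not the helper from the PDF.js source.

```javascript
// Hypothetical feature check: `URL.createObjectURL` alone is not enough,
// since the environment must also provide `Blob` to build the object.
function canCreateObjectURL() {
  return (
    typeof URL !== "undefined" &&
    typeof URL.createObjectURL === "function" &&
    typeof Blob !== "undefined"
  );
}

console.log(canCreateObjectURL());
```

The result depends on the environment, so code calling it should keep a fallback path (for example a `data:` URL) for the `false` case.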
Tim van der Meij
382be22c11
Merge pull request #14160 from Snuffleupagus/pr-13770-followup
Fix pattern handling regression in `SVGGraphics` (PR 13770 follow-up)
2021-10-19 19:31:18 +02:00
Brendan Dahl
b66239d6dc
Merge pull request #14114 from Snuffleupagus/issue-14110
[api-minor] Include the /Lang-property in the `documentInfo`, and use it in the viewer (issue 14110)
2021-10-19 08:08:08 -07:00
Jonas Jenwald
68e6622c57 Ignore Square/Circle-annotations with a zero borderWidth when creating a fallback appearance stream (issue 14164)
Trying to render these Annotation-types, when the borderWidth is `0`, causes a "hairline" border to appear. If these Annotations included an appearance stream, as they are supposed to, this wouldn't have happened and the simplest solution here seems to be to just ignore these particular Annotations.
2021-10-19 15:27:42 +02:00
Jonas Jenwald
8c6f1e45c7 Fix pattern handling regression in SVGGraphics (PR 13770 follow-up)
While the FAQ clearly lists the SVG back-end as unsupported, see https://github.com/mozilla/pdf.js/wiki/Frequently-Asked-Questions#backends, I suppose that small/simple regressions still make sense to fix.
2021-10-18 21:40:10 +02:00
calixteman
bbb64369f1
Merge pull request #13424 from calixteman/chunks2
[api-minor] Fix issues in text selection
2021-10-18 06:14:15 -07:00
Calixte Denizet
61d1063276 Fix issues in text selection
- PR #13257 fixed a lot of issues but not all and this patch aims to fix almost all remaining issues.
  - the idea in this new patch is to compare position of new glyph with the last position where a glyph has been drawn;
    - no spaces are "drawn": it just moves the cursor but they aren't added in the chunk;
    - so this way a space followed by a cursor move can be treated as only one space: it helps to merge all spaces into one.
  - to tell the difference between real spaces and tracking ones, we used a factor of the space width (from the font)
    - it was a pretty good idea in general but it fails with some fonts where space was too big:
    - in Poppler, they're using a factor of the font size: this is an excellent idea (<= 0.1 * fontSize implies tracking space).
2021-10-17 16:27:05 +02:00
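The font-size based threshold mentioned in commit 61d1063276 can be illustrated with a short sketch. The function and constant names are hypothetical; only the `<= 0.1 * fontSize` rule is taken from the commit message.

```javascript
// Hypothetical names; the 0.1 factor is the one quoted from Poppler above.
const TRACKING_SPACE_FACTOR = 0.1;

// `advance` is the horizontal gap between the previous glyph's end
// position and the new glyph, in the same units as `fontSize`.
function isTrackingSpace(advance, fontSize) {
  return Math.abs(advance) <= TRACKING_SPACE_FACTOR * fontSize;
}

console.log(isTrackingSpace(0.5, 12)); // true, small gap: just tracking
console.log(isTrackingSpace(3, 12)); // false, large gap: a real space
```

Using the font size rather than the font's own space width avoids the failure mode described above, where a font declares an unusually wide space glyph.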
Jonas Jenwald
00720d059a [api-minor] Include the /Lang-property in the documentInfo, and use it in the viewer (issue 14110)
*Please note:* This is a tentative patch, since I don't have the necessary a11y-software to actually test it.

To avoid having to add a new API-method just for a single string, I figured that adding the new property to the existing `documentInfo`-data (accessed via `PDFDocumentProxy.getMetadata` in the API) will hopefully be deemed acceptable.
2021-10-16 14:27:47 +02:00
Jonas Jenwald
0041230072 Re-name the XFAFactory.numberPages getter to XFAFactory.numPages for consistency
All other similar getters are called `numPages` throughout the code-base, and improved consistency should always be a good thing.
2021-10-16 12:56:21 +02:00
Jonas Jenwald
0e5348180e Fix the inconsistent return type of the PDFDocument.isPureXfa getter
Also (slightly) simplifies a couple of small getters/methods related to the `XFAFactory`-instance.
2021-10-16 12:56:20 +02:00
Jonas Jenwald
cd94a44ca1 Remove some duplication in *simple* shadowed getters in src/core/-code
In these cases there's no good reason, in my opinion, to duplicate the `shadow`-lines since that unnecessarily increases the risk of simple typos (see the previous patch).
2021-10-16 12:56:17 +02:00
Jonas Jenwald
1450da4168 Fix a xfaFaxtory typo in the shadowing in the PDFDocument.xfaFactory getter
With this typo the shadowing doesn't actually work, which causes these checks to be unnecessarily repeated. In this particular case it didn't have a significant performance impact, however we should definitely fix this nonetheless.
2021-10-16 11:54:12 +02:00
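To show why a typo in the shadowed property name (commit 1450da4168) defeats the caching, here is a small sketch. The `shadow` helper below is a simplified version written for this example, and the two classes are hypothetical.

```javascript
// Simplified helper: replaces a getter with a plain data property on the
// instance, so the getter body only runs once.
function shadow(obj, prop, value) {
  Object.defineProperty(obj, prop, {
    value,
    enumerable: true,
    configurable: true,
    writable: false,
  });
  return value;
}

class CachedDoc {
  computeCount = 0;

  get xfaFactory() {
    this.computeCount++;
    return shadow(this, "xfaFactory", { numPages: 3 });
  }
}

class TypoDoc {
  computeCount = 0;

  get xfaFactory() {
    this.computeCount++;
    // Misspelled name: the getter itself is never shadowed.
    return shadow(this, "xfaFaxtory", { numPages: 3 });
  }
}

const cached = new CachedDoc();
cached.xfaFactory;
cached.xfaFactory;
console.log(cached.computeCount); // 1

const typo = new TypoDoc();
typo.xfaFactory;
typo.xfaFactory;
console.log(typo.computeCount); // 2
```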
Jane-Kotovich
c2af309917 XFA - Embedded image is missing
2021-10-15 21:12:29 +10:00
Tim van der Meij
f6d9d91965
Merge pull request #14116 from Snuffleupagus/api-more-optional-chaining
Use even more optional chaining in the `src/display/api.js` file
2021-10-13 19:38:03 +02:00
Jay Berkenbilt
586295fad6 Implement TrueType character map "format 2" (fixes #14117)
If a PDF included an embedded TrueType font whose preferred character
map (cmap) was in "format 2", the code would select that character map
and then refuse to read it because of an unsupported format, thus
causing the characters not to be rendered. This commit implements
support for format 2 as described at the link below.

https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6cmap.html
2021-10-13 07:37:14 -04:00
Jonas Jenwald
8fc9c7e41c Use even more optional chaining in the src/display/api.js file
This patch (slightly) simplifies a couple of `onProgress` and `onUnsupportedFeature` call-sites.
Finally, while unrelated, also removes some unnecessary `return undefined;` statements (PR 11601 follow-up).
2021-10-12 12:05:59 +02:00
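The kind of simplification described in commit 8fc9c7e41c looks roughly like this. The `Loader` class is a made-up example, not the actual code from `src/display/api.js`.

```javascript
// Hypothetical class with an optional progress callback.
class Loader {
  constructor(onProgress) {
    this.onProgress = onProgress;
  }

  report(loaded, total) {
    // Before: if (this.onProgress) { this.onProgress({ loaded, total }); }
    this.onProgress?.({ loaded, total });
  }
}

const events = [];
new Loader(evt => events.push(evt)).report(10, 100);
new Loader().report(10, 100); // no callback: nothing happens, no error

console.log(events.length); // 1
```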
Jonas Jenwald
8721557a08 For Annotations that define a closed area, make all of it toggle the PopupAnnotation (issue 14107)
For Circle, Square, and Polygon Annotations it's currently only possible to toggle the associated PopupAnnotation by clicking on its border. Depending on the border width, and also the current zoom-level in the viewer, that can make interacting with certain Annotations *practically* impossible (which is the case in issue 14107).
Hence, in order to improve this, change the "fill"-property of the SVG element in the annotationLayer to make the *entire* element part of the click/mouse-over target.

*Please note:* Given that this is a viewer-related issue, there's no simple way to test this as far as I can tell.
2021-10-09 15:55:15 +02:00
Tim van der Meij
56e3ef68d4
Merge pull request #14106 from calixteman/names
Empty name is allowed in ISO 32000
2021-10-09 14:29:10 +02:00
Jonas Jenwald
69a97bcba7 Take the /CIDToGIDMap data into account when computing the hash, in PartialEvaluator.preEvaluateFont, for composite fonts (bug 1734802)
This is unfortunately *yet another* bug in the `preEvaluateFont`-implementation, and I've lost count of the number of times I've had to tweak this code over the years :-(
I really cannot help thinking that PR 4423 was way too simplistic, since it missed a bunch of cases that lead to broken font rendering in many PDF documents.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1734802
2021-10-08 13:15:21 +02:00
Calixte Denizet
f384ad2356 Empty name is allowed in ISO 32000
- the exact sentence from the spec:
    "The token SOLIDUS (a slash followed by no regular characters) introduces a unique valid name defined by the empty sequence of characters."
  - so just remove the warning.
2021-10-06 20:50:39 +02:00
Jonas Jenwald
d49b1bf2ee Use the native structuredClone implementation when it's available
With a recent addition to the HTML specification, the internal structured clone algorithm used in browsers is (or will be, once it's implemented) *directly* accessible to JavaScript; please see https://developer.mozilla.org/en-US/docs/Web/API/WindowOrWorkerGlobalScope/structuredClone

Hence we'll *eventually* not need to maintain our own structured clone functionality in the `LoopbackPort`-class in the API, however for the time being we'll feature detect `structuredClone` and fall back to the existing PDF.js implementation.

Given that https://bugzilla.mozilla.org/show_bug.cgi?id=1722576 has landed in Firefox 94, we should no longer need the manually implemented `cloneValue`-functionality in MOZCENTRAL builds. Note also that in the Firefox built-in PDF Viewer it's not possible for users to *easily* disable workers, which should further reduce the risk of these changes.
2021-10-03 10:55:33 +02:00
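The feature detection described in commit d49b1bf2ee can be sketched as follows. The JSON-based fallback is a deliberately simplified stand-in chosen for this example; the hand-written PDF.js clone logic handles more data types than this.

```javascript
// Prefer the native implementation when the environment provides it.
function cloneValue(value) {
  if (typeof structuredClone === "function") {
    return structuredClone(value);
  }
  // Simplified stand-in for a manual fallback: JSON-compatible data only.
  return JSON.parse(JSON.stringify(value));
}

const original = { items: [1, 2, { nested: 3 }] };
const copy = cloneValue(original);

console.log(copy !== original); // true, a new object
console.log(copy.items[2].nested); // 3
```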
Jonas Jenwald
8cb6efec2d [api-minor] Add a wrapper around the addLinkAttributes-function, in the API, to the PDFLinkService implementations
This patch helps reduce some duplication, given that we now have a few essentially identical `addLinkAttributes` call-sites in the code-base.
To prevent runtime errors in the Annotation/XFA-layer code, we'll warn if a custom/incomplete `PDFLinkService` is being used (limited to GENERIC builds).
2021-10-02 12:28:00 +02:00
Jonas Jenwald
bb9c905c5d Ensure that various URL-related options are applied in the xfaLayer too
Note how both the annotationLayer and the document outline will apply various URL-related options when creating the link-elements.
For consistency the `xfaLayer`-rendering should obviously use the same options, to ensure that the existing options are indeed applied to all URLs regardless of where they originate.
2021-10-02 09:32:23 +02:00
Jonas Jenwald
284d259054
Merge pull request #14057 from Snuffleupagus/bug-920426
Support CMap-data with only strings, when parsing TrueType composite fonts (bug 920426)
2021-10-01 23:22:25 +02:00
Jonas Jenwald
67a642c826 Replace a couple of Array.prototype.forEach-invocations with for..of instead
Given that `NodeList`s can be iterated using `for..of` we can use that instead, since it's a little bit nicer and easier to read than the `Array.prototype.forEach` format.
2021-10-01 09:06:17 +02:00
Calixte Denizet
aecbd7cd89 AcroForm: Add support for ResetForm action
- it aims to fix #12721.
  - Thanks to PR #14023, we now have the fieldObjects in the annotation layer so we can easily map field names to their ids if needed.
  - Reset values in the storage, in the JS sandbox and in the visible html elements.
2021-09-30 22:02:33 +02:00
Jonas Jenwald
d3ca28bc34 Support CMap-data with only strings, when parsing TrueType composite fonts (bug 920426)
In the referenced bug, the embedded fonts contain custom CMap-data that only include strings. Note how for embedded composite TrueType fonts we're using the CMap-data when building the glyph mapping, and currently we end up with a completely empty map because the code expects only CID *numbers*.
Furthermore, just fixing the glyph mapping alone isn't sufficient to fully address the bug, since we also need to consider this "special" kind of CMap-data when looking up glyph widths.
2021-09-30 18:10:47 +02:00
Tim van der Meij
9a74f3e6e0
Merge pull request #14049 from calixteman/bg_from_mk
Annotation - Use border and background colors from MK dictionary
2021-09-29 21:13:20 +02:00
Calixte Denizet
0776cd9b90 Annotation - Use border and background colors from MK dictionary
- it aims to fix #13003;
  - set the bg and fg colors as they're in the pdf;
  - put a transparent overlay to help to see the fields.
2021-09-26 20:49:26 +02:00
Jonas Jenwald
e6e04694f4 [api-minor] Move the addDefaultProtocolToUrl/tryConvertUrlEncoding functionality into the createValidAbsoluteUrl function
Having recently worked with, and reviewed patches touching, this code it seemed that it's probably not a bad idea to move that functionality into `createValidAbsoluteUrl` as new options instead.

For the `addDefaultProtocolToUrl` functionality in particular, the existing helper function was not only moved but slightly improved as well. Looking at the code, I realized that there's a small risk that it would incorrectly match a *relative* URL-string too.

With these changes, the `createValidAbsoluteUrl` call-sites in the `src/core/`-code can be simplified a little bit.

*Please note:* This patch may, indirectly, change the format of the `unsafeUrl`-property returned with relevant Annotations and OutlineItems; hence the `api-minor` tag.
However, I'd argue that it's actually more correct this way since the whole purpose of `unsafeUrl` is/was to return the URL data as-is without any parsing done.
2021-09-26 14:29:54 +02:00
Calixte Denizet
558e58f354 XFA - Add <a> element in button when an url is detected (bug 1716758)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1716758;
  - some buttons have a JS action with the pattern `app.launchURL(...)` (or similar) so extract the url when possible and generate a <a> element with the href equal to the found url;
  - pdf.js already had some code to handle that so this patch slightly refactors that.
2021-09-25 21:59:39 +02:00
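As a rough illustration of the URL extraction described in commit 558e58f354, the sketch below pulls the first string argument out of an `app.launchURL(...)` call. The regular expression and the name `extractLaunchUrl` are assumptions made for this example; the pattern actually used in PDF.js may differ and covers similar calls too.

```javascript
// Hypothetical, simplified matcher for `app.launchURL("...")` actions.
function extractLaunchUrl(js) {
  const match = /app\.launchURL\(\s*(["'])(.*?)\1/.exec(js);
  return match ? match[2] : null;
}

console.log(extractLaunchUrl('app.launchURL("https://example.com", true);'));
// "https://example.com"
console.log(extractLaunchUrl("this.print();")); // null
```

The extracted string would still need to be validated as an absolute URL before being used as an `href`.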
Calixte Denizet
c0e9108d00 Annotation - Some checkboxes have an empty N dictionary
- it aims to fix #14021;
  - the N dict is empty here so just create a default one;
  - it implies that the checked checkbox has no appearance so create a default one too in order to print it;
  - in the pdf in the issue, a checked box is not printed because it has no default appearance so we need to guess its appearance from its state.
2021-09-25 16:00:47 +02:00
Tim van der Meij
cc110b8542
Merge pull request #14064 from Snuffleupagus/issue-13845
Fallback to font name matching, when checking for serif fonts (issue 13845)
2021-09-25 12:41:57 +02:00
Jonas Jenwald
b23b8d8a5d
Merge pull request #14074 from Snuffleupagus/issue-14046
[api-minor] Add basic support for RTL text-content in PopupAnnotations (issue 14046)
2021-09-25 12:37:44 +02:00
Tim van der Meij
36dc93fe5d
Merge pull request #14065 from Snuffleupagus/fewer-EXPORT_DATA_PROPERTIES
[api-minor] Stop exporting, by default, a few additional Font properties (PR 11777 follow-up)
2021-09-25 12:25:56 +02:00
Tim van der Meij
ee34572fd0
Merge pull request #14070 from Snuffleupagus/MessageHandler-local-vars
Some small readability improvements in the `MessageHandler` code
2021-09-25 12:22:17 +02:00
Tim van der Meij
07558c158d
Merge pull request #14069 from Snuffleupagus/deprecate-OPS-paintJpegXObject
Mark the `paintJpegXObject` operator as deprecated (PR 11601 follow-up)
2021-09-25 12:15:33 +02:00
Jonas Jenwald
1dcd2f0cd3 [api-minor] Add basic support for RTL text-content in PopupAnnotations (issue 14046)
In order to implement this, we utilize the existing `bidi` function to infer the text-direction of /T and /Contents entries. While this may not be perfect in cases where one PopupAnnotation mixes LTR and RTL languages, it should work well enough in most cases.
To avoid having to add *two new* properties in lots of annotations, supplementing the existing `title`/`contents`-properties, this patch instead re-factors the existing code such that the properties are replaced by Objects (containing `str` and `dir`).

*Please note:* In order to avoid breaking existing third-party implementations, `GENERIC`-builds of the PDF.js library will still provide the old `title`/`contents`-properties on annotations returned by `PDFPageProxy.getAnnotations`.
2021-09-25 09:18:58 +02:00
calixteman
104e049338
Merge pull request #14073 from calixteman/bindItems
XFA - Bind items when there's a bindItems entry
2021-09-24 09:01:52 -07:00
Calixte Denizet
97c1e076a1 XFA - Bind items when there's a bindItems entry
- In the pdf in issue #14071, some select fields don't contain any values;
  - the corresponding node has both bindItems and bind elements, and the _bindItems function was simply not called.
2021-09-24 16:08:58 +02:00
Calixte Denizet
cd73e282eb XFA - Create a new page in case of overflow
- it aims to fix #14071;
  - a subform is overflowing and the target in case of overflow is itself. In this case we must create a new page.
2021-09-24 14:57:55 +02:00
Jonas Jenwald
890a6c1108 Some small readability improvements in the MessageHandler code
In particular the `_processStreamMessage`-method is a bit cumbersome to read, given the way that the current streamController/streamSink is accessed, which we can improve with a couple of local variables.
2021-09-24 13:07:20 +02:00
Jonas Jenwald
7d56fb4cbf Mark the paintJpegXObject operator as deprecated (PR 11601 follow-up)
After PR 11601, the `paintJpegXObject` operator is no longer used for anything. While I don't think we can just remove it, and essentially leave a "hole" in the `OPS` structure, we should at least mark it as explicitly unused to aid readability/maintainability of the code.
2021-09-24 12:47:28 +02:00
Brendan Dahl
d370a281c4
Merge pull request #14067 from calixteman/1732344
Don't save anything in XFA entry if no XFA! (bug 1732344)
2021-09-23 15:07:00 -07:00
Calixte Denizet
4b0538d07a Don't save anything in XFA entry if no XFA! (bug 1732344)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1732344
  - rename some variables to make the code clearer;
  - and last but not least, add a unit test to test saving.
2021-09-23 19:51:23 +02:00
Jonas Jenwald
fd1f0f647f Print a special warning message, in the viewer, for XFA Foreground documents
Currently XFAF documents use the same warning message as in the XFA *disabled* case, which is neither helpful nor correct.
2021-09-23 15:02:24 +02:00
Jonas Jenwald
6cba5509f2 Re-factor document.getElementsByName lookups in the AnnotationLayer (issue 14003)
This replaces direct `document.getElementsByName` lookups with a helper method which:
 - Lets the AnnotationLayer use the data returned by the `PDFDocumentProxy.getFieldObjects` API-method, such that we can directly lookup only the necessary DOM elements.
 - Falls back to using `document.getElementsByName` as before, such that e.g. the standalone viewer components still work.

Finally, to fix the problems reported in issue 14003, regardless of the code-path we now also enforce that the DOM elements found were actually created by the AnnotationLayer code.
With these changes we'll thus be able to update form elements on all visible pages just as before, but we'll additionally update the AnnotationStorage for not-yet-rendered elements thus fixing a pre-existing bug.
2021-09-23 13:05:18 +02:00
Jonas Jenwald
9acfe486d4 Fallback to font name matching, when checking for serif fonts (issue 13845)
In order to handle fonts that specify completely bogus /Flags-entries, fall back to font name matching to determine if the font is a serif one.
2021-09-23 01:11:57 +02:00
Jonas Jenwald
e027748627 [api-minor] Stop exporting, by default, a few additional Font properties (PR 11777 follow-up)
*This is similar to the "isSymbolicFont"-property, which is no longer exported by default after PR 11777.*

Both "isMonospace" and "isSerifFont" are internal properties, used during font parsing and building of the glyph mapping on the worker-thread.
However both of these properties are completely unused on the main-thread and/or in the API, and accessing them will now require setting the `fontExtraProperties`-option when calling `getDocument`.
2021-09-23 00:44:43 +02:00
Tim van der Meij
5254676ef3
Merge pull request #14055 from Snuffleupagus/PDF_TO_CSS_UNITS
Add `PDF_TO_CSS_UNITS` to the `PixelsPerInch`-structure
2021-09-22 22:24:51 +02:00
Jonas Jenwald
81a1c1cef7 Correctly validate URLs in XFA documents (bug 1731240)
With this patch we'll ensure that only valid absolute URLs can be used in XFA documents, similar to the existing validation done for "regular" PDF documents.
Furthermore, we'll also attempt to add a default protocol (i.e. `http`) to URLs beginning with "www." in XFA documents as well; this on its own is enough to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1731240
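
The validation described above can be sketched roughly as follows. This is a minimal illustration and not the actual pdf.js code: the helper name `toValidAbsoluteUrl`, its `addDefaultProtocol` option, and the protocol allow-list are all assumptions made for the example.

```javascript
// Hypothetical helper; the name, option and allow-list are assumptions.
const ALLOWED_PROTOCOLS = ["http:", "https:", "ftp:", "mailto:", "tel:"];

function toValidAbsoluteUrl(url, { addDefaultProtocol = false } = {}) {
  if (typeof url !== "string" || url.length === 0) {
    return null;
  }
  let candidate = url;
  // Optionally prepend a default protocol to URLs such as "www.example.com".
  if (addDefaultProtocol && candidate.startsWith("www.")) {
    candidate = `http://${candidate}`;
  }
  try {
    // The URL constructor throws for anything that isn't an absolute URL.
    const parsed = new URL(candidate);
    return ALLOWED_PROTOCOLS.includes(parsed.protocol) ? parsed.href : null;
  } catch {
    return null;
  }
}
```

With this sketch a relative path or a `javascript:` URL is rejected, while a "www."-prefixed string is only accepted when the default protocol is explicitly requested.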
2021-09-21 21:21:01 +02:00
Jonas Jenwald
3e550f392a Add PDF_TO_CSS_UNITS to the PixelsPerInch-structure
Rather than re-computing this value in a number of different places throughout the code-base[1], we can expose this in the API via the existing `PixelsPerInch`-structure instead.
There have also been feature requests asking for the old `CSS_UNITS` viewer constant to be made accessible, such that it could be used in third-party implementations.

I suppose that it could be argued that it's somewhat confusing to place a unitless property in `PixelsPerInch`, however given that the `PDF_TO_CSS_UNITS`-property is defined strictly in terms of the existing properties this is hopefully deemed reasonable.

---
[1] These include:
 - The viewer, with the `CSS_UNITS` name.
 - The reference-tests.
 - The display-layer, when rendering images; see PR 13991.
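
As a rough sketch of the idea (not the actual pdf.js source), the ratio can be derived from the two existing properties rather than being hard-coded. The values 96 and 72 are the conventional CSS and PDF resolutions; the `pdfPointsToCssPixels` helper is a hypothetical name added only for illustration.

```javascript
// Sketch only: a frozen object standing in for the `PixelsPerInch`-structure.
const PixelsPerInch = Object.freeze({
  CSS: 96.0, // CSS pixels per inch.
  PDF: 72.0, // Default PDF user-space units per inch.
  // Unitless ratio, defined strictly in terms of the two properties above.
  get PDF_TO_CSS_UNITS() {
    return this.CSS / this.PDF;
  },
});

// Hypothetical helper: convert a length in PDF units to CSS pixels.
function pdfPointsToCssPixels(points) {
  return points * PixelsPerInch.PDF_TO_CSS_UNITS;
}
```

For example, a US Letter page that is 612 PDF units wide comes out at approximately 816 CSS pixels at 100% zoom.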
2021-09-20 13:20:09 +02:00
Jonas Jenwald
8ea27ce157 Tweak how fonts with an /Encoding are handled in adjustToUnicode (issue 14048, PR 13277 follow-up)
Currently we only exclude /Encoding entries that also contain a /Differences array, which is the cause of the text-selection problem in the referenced issue.
In order to address this we'll now also exclude /Encoding entries that contain one of the predefined *named* encodings, and no longer require that it also contains a /Differences array.

*Please note:* This patch causes a small "regression" in the `bug1130815-text` test-case, however this is actually an improvement when compared with Adobe Reader and PDFium (in Google Chrome).
2021-09-18 22:44:25 +02:00
Tim van der Meij
83d3bb43f4
Merge pull request #14041 from Snuffleupagus/issue-9367
Support cmaps with only CID characters, when building the ToUnicode-map (issue 9367)
2021-09-18 16:47:06 +02:00
Jonas Jenwald
20eb6ca2ec
Merge pull request #14044 from calixteman/bug1719148
Annotations - Avoid empty value in text field when storage contains something for it (bug 1719148)
2021-09-18 16:31:45 +02:00
Jonas Jenwald
6634afd646
Merge pull request #14045 from calixteman/noise
XFA - Only warn about the wrong xfa type when there is an xfa thing
2021-09-18 16:13:20 +02:00
Tim van der Meij
c870fb489e
Merge pull request #14013 from Snuffleupagus/api-unittest-instanceof
Improve the API unit-tests, and try to expose more API-functionality in the TypeScript definitions
2021-09-18 16:08:19 +02:00
Calixte Denizet
2fc10727c5 XFA - Only warn about the wrong xfa type when there is an xfa thing 2021-09-18 15:44:05 +02:00
calixteman
ffa2572bdf
Merge pull request #14038 from calixteman/saveas
JS - Implement few possibilities with app.execMenuItem (bug 1724399)
2021-09-18 15:33:03 +02:00
Calixte Denizet
eb762ad624 Annotations - Avoid empty value in text field when storage contains something for it (bug 1719148)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1719148;
  - JS can set a property for a non-rendered annotation using the annotationStorage but the other unset default properties must be used when the annotation is finally rendered;
  - so this patch just adds the properties already set in the annotationStorage to the default value.
2021-09-18 15:08:22 +02:00
Calixte Denizet
bfd570038d JS - Implement few possibilities with app.execMenuItem (bug 1724399)
- it aims to fix: https://bugzilla.mozilla.org/show_bug.cgi?id=1724399.
2021-09-18 13:52:32 +02:00
Jonas Jenwald
e3223b68fc Extract some of the glyphMap handling, for non-embedded composite standard fonts, into a helper function
This reduces some unnecessary duplication, since we currently have essentially the same code in a handful of places in the `Font.fallbackToSystemFont`-method.
2021-09-18 12:39:48 +02:00
Jonas Jenwald
ed73cf6d50 Support cmaps with only CID characters, when building the ToUnicode-map (issue 9367)
In this particular case the `CMap`-data that we create contains only numbers, but no strings, which causes `PartialEvaluator.readToUnicode` to create a ToUnicode-map with only empty strings.

*Please note:* This is yet another case where I don't know if it's necessarily the best and most correct solution, but it does fix the referenced issue.
2021-09-18 00:26:15 +02:00
Calixte Denizet
e87c12bf34 JS - Avoid the Stay/Leave popup when clicking on a button with a JS action
- it aims to fix #14039.
2021-09-17 21:04:07 +02:00
Calixte Denizet
5bef8120e7 Annotation - For checkboxes, get field value from AS (if any) instead of V (bug 1722036)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1722036.
  - AS and V should share the same value for checkbox: it's at least what the specs say;
  - the pdf in the above bug opens correctly in Acrobat so it likely means that AS is chosen over V.
2021-09-17 13:04:16 +02:00
Brendan Dahl
d6a27860e3
Merge pull request #14025 from Snuffleupagus/issue-11915
Improve glyph mapping for non-embedded composite standard fonts with a /CIDToGIDMap (issue 11915)
2021-09-16 08:06:35 -07:00
Calixte Denizet
a3aa6dd6ab Annotation - Checkboxes with the same name and export values must be in unison
- it aims to fix #14024.
  - this patch adds an attribute `acroformExportValue` to the HTML input, so that the checked attribute can be set by taking the exportValue into account for checkboxes with the same name.
2021-09-15 15:30:24 +02:00
Jonas Jenwald
a11343e9af Improve glyph mapping for non-embedded composite standard fonts with a /CIDToGIDMap (issue 11915)
*Please note:* All of this feels very handwavy, but at least it passes all tests locally. Hopefully we have enough tests for this part of the font code.

For non-embedded composite standard fonts with an "incomplete" /CIDToGIDMap, we'll now fall back to an *explicitly defined* /ToUnicode map even when that one happens to be an /Identity-H or /Identity-V map.

The `Font.fallbackToSystemFont` method is unfortunately getting more and more special-cases, however that might be unavoidable given all the weird non-embedded fonts found in the wild :-(
2021-09-15 11:30:40 +02:00
Calixte Denizet
9812e35916 XFA - Don't create images for unsupported mime types 2021-09-14 10:55:25 +02:00
Jonas Jenwald
95057a4e56 Try to expose more API-functionality in the TypeScript definitions
While these types apparently make sense in TypeScript environments, we really don't want to extend the *public* API by simply exporting the relevant classes directly in `src/pdf.js` (since they should never be called/initialized manually).

Please see e.g. issue 12384 where this was first requested, and note that a possible work-around was also provided there. This patch simply implements that work-around[1], which will hopefully be helpful to TypeScript users.

---
[1] Based on the discussion in PR 13957, the two previous patches appear to be necessary for this to actually work.
2021-09-13 13:57:56 +02:00
Jonas Jenwald
d854352cd5 Improve the API unit-tests by checking that PDFPageProxy.render returns a RenderTask-instance
This is similar to existing unit-tests, which check for `PDFDocumentProxy`- and `PDFPageProxy`-instances.
2021-09-13 13:34:37 +02:00
Jonas Jenwald
fa7a607d33 Improve the API unit-tests by checking that getDocument returns a PDFDocumentLoadingTask-instance
This is similar to existing unit-tests, which check for `PDFDocumentProxy`- and `PDFPageProxy`-instances.
2021-09-13 13:34:28 +02:00
Jonas Jenwald
7025b9f859 [src/core/writer.js] Support null values in the writeValue function
*This fixes something that I noticed, having recently looked at both the `Lexer.getObj` and `writeValue` code.*

Please note that I unfortunately don't have an example of a form where saving fails without this patch. However, given its overall simplicity and that unit-tests are added, it's hopefully deemed useful to fix this potential issue pro-actively rather than waiting for a bug report.

At this point one might, and rightly so, wonder if there are actually any real-world PDF documents where a `null` value is being used?
Unfortunately the answer is *yes*, and we have a couple of examples in the test-suite (although none of those are related to forms); please see: `issue1015`, `issue2642`, `issue10402`, `issue12823`, `issue13823`, and `pr12564`.
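
To illustrate the kind of handling involved, here is a heavily simplified sketch of serializing JavaScript values into PDF syntax, including `null` and unwrapped booleans. The function name `writeValueSketch` is hypothetical, and the real `writeValue` in `src/core/writer.js` deals with many more types (names, dictionaries, references, streams) than are shown here.

```javascript
// Simplified sketch; not the actual `writeValue` implementation.
function writeValueSketch(value) {
  if (value === null) {
    return "null"; // The PDF null object.
  }
  if (typeof value === "boolean") {
    return value.toString(); // "true"/"false", without string-wrapping.
  }
  if (typeof value === "number") {
    return String(value);
  }
  if (typeof value === "string") {
    // Escape backslashes and parentheses inside a literal string.
    return `(${value.replace(/[()\\]/g, "\\$&")})`;
  }
  if (Array.isArray(value)) {
    return `[${value.map(writeValueSketch).join(" ")}]`;
  }
  throw new Error(`Unsupported value type: ${typeof value}`);
}
```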
2021-09-12 18:24:37 +02:00
Jonas Jenwald
5d578ea36a [src/core/writer.js] Remove unnecessary string-wrapping for boolean values in writeValue (PR 13998 follow-up) 2021-09-12 15:45:45 +02:00
Jonas Jenwald
761519ef3f
Merge pull request #13998 from calixteman/bug1729971
Write boolean value when saving a form (bug 1729971)
2021-09-12 15:38:10 +02:00
Jonas Jenwald
a47844d1fc Let Lexer.getObj return a dummy-Cmd for commands that start with a non-visible ASCII character (issue 13999)
This way we avoid breaking badly generated PDF documents where a non-visible ASCII character is "glued" to a valid command.
2021-09-11 19:54:13 +02:00
Tim van der Meij
e97f01b17c
Merge pull request #13977 from Snuffleupagus/enqueueChunk-batch
[api-minor] Reduce `postMessage` overhead, in `PartialEvaluator.getTextContent`, by sending text chunks in batches (issue 13962)
2021-09-11 13:34:07 +02:00
Jonas Jenwald
0e54f568fb Re-factor the CSS_PIXELS_PER_INCH/PDF_PIXELS_PER_INCH exports (PR 13991 follow-up)
For improved maintainability, since these constants are being exposed in the official API, this patch moves them into an Object instead.
2021-09-11 11:15:25 +02:00
Jonas Jenwald
bd51bbfd16 Remove mozImageSmoothingEnabled fallback in CanvasGraphics.endGroup
This was added all the way back in PR 2936, however it's been unnecessary ever since Firefox 51 (released on 2017-01-24); please see the MDN compatibility data:
https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/imageSmoothingEnabled#browser_compatibility
2021-09-11 10:30:39 +02:00
Jonas Jenwald
9ce63a6dc6
Merge pull request #13991 from brendandahl/interpolate
Enable/disable image smoothing based on image interpolate value. (bug 1722191)
2021-09-11 10:02:53 +02:00
Brendan Dahl
f38fb42b42 Enable/disable image smoothing based on image interpolate value. (bug 1722191)
While some of the output looks worse to my eye, this behavior more
closely matches what I see when I open the PDFs in Adobe Acrobat.

Fixes: #4706, #9713, #8245, #1344
2021-09-10 14:23:35 -07:00
Calixte Denizet
474ab7c86d Write boolean value when saving a form (bug 1729971)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1729971#c4.
2021-09-10 14:10:25 +02:00
calixteman
57b80074a2
Merge pull request #13995 from calixteman/xfa_record
XFA - Handle $record shortcut in SOM expression (issue #13994)
2021-09-10 13:57:50 +02:00
Calixte Denizet
c5841b3794 XFA - Handle shortcut in SOM expression (issue #13994) 2021-09-09 19:54:45 +02:00
Calixte Denizet
623860bf8f XFA - Remove the checked attribute from the checkbox when unchecked (bug 1729877)
- it aims to fix: https://bugzilla.mozilla.org/show_bug.cgi?id=1729877.
2021-09-09 19:14:16 +02:00
Jonas Jenwald
45ddb12f61 Remove no-op onPull/onCancel streamSink callbacks from the "GetTextContent"-handler
The `MessageHandler`-implementation already handles either of these callbacks being undefined, hence there's no particular reason (as far as I can tell) to add no-op functions here.

Also, in a couple of `MessageHandler`-methods, utilize an already existing local variable more.
2021-09-09 00:01:10 +02:00
Jonas Jenwald
f90f9466e3 [api-minor] Reduce postMessage overhead, in PartialEvaluator.getTextContent, by sending text chunks in batches (issue 13962)
Following the STR in the issue, this patch reduces the number of `PartialEvaluator.getTextContent`-related `postMessage`-calls by approximately 78 percent.[1]
Note that by enforcing a relatively low value when batching text chunks, we should thus improve worst-case scenarios while not negatively affecting all `textLayer` building.

While working on these changes I noticed, thanks to our unit-tests, that the implementation of the `appendEOL` function unfortunately means that the number and content of the textItems could actually be affected by the particular chunking used.
That seems *extremely* unfortunate, since in practice this means that the particular chunking used is thus observable through the API. Obviously that should be a completely internal implementation detail, which is why this patch also modifies `appendEOL` to mitigate that.[2]

Given that this patch adds a *minimum* batch size in `enqueueChunk`, there's obviously nothing preventing it from becoming a lot larger than the limit (depending e.g. on the PDF structure and the CPU load/speed).
While sending more text chunks at once isn't an issue in itself, it could become problematic at the main-thread during `textLayer` building. Note how both the `PartialEvaluator` and `CanvasGraphics` implementations utilize `Date.now()`-checks, to prevent long-running parsing/rendering from "hanging" the respective thread. In the `textLayer` building we don't utilize such a construction[3], and streaming of textContent is thus essentially acting as a *simple* stand-in for that functionality.
Hence why we want to avoid choosing a too large minimum batch size, since that could thus indirectly affect main-thread performance negatively.

---
[1] While it'd be possible to go even lower, that'd likely require more invasive re-factoring/changes to the `PartialEvaluator.getTextContent`-code to ensure that the batches don't become too large.

[2] This should also, as far as I can tell, explain some of the regressions observed in the "enhance" text-selection tests back in PR 13257.
    Looking closer at the `appendEOL` function it should potentially be changed even more, however that should probably not be done here.

[3] I'd really like to avoid implementing something like that for the `textLayer` building as well, given that it'd require adding a fair bit of complexity.
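
The batching idea can be sketched as below. This is only an illustration under stated assumptions: `createBatchingSink`, `MIN_BATCH_SIZE` and its value of 10 are hypothetical, and the `send` callback stands in for whatever ultimately triggers a `postMessage`-call.

```javascript
// Hypothetical names and threshold; not the actual `enqueueChunk` code.
const MIN_BATCH_SIZE = 10;

function createBatchingSink(send, minBatchSize = MIN_BATCH_SIZE) {
  let pending = [];

  return {
    // Queue items and only forward them once the batch is large enough,
    // or when `force` is set (e.g. at the end of the stream).
    enqueue(items, force = false) {
      pending.push(...items);
      if (pending.length === 0) {
        return;
      }
      if (!force && pending.length < minBatchSize) {
        return;
      }
      send(pending);
      pending = [];
    },
    // Send whatever is left, regardless of the batch size.
    flush() {
      this.enqueue([], /* force = */ true);
    },
  };
}
```

Since the threshold is a *minimum*, a single `enqueue` call carrying many items is still sent as one large batch, which matches the caveat about batches growing beyond the limit.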
2021-09-09 00:01:07 +02:00
Jonas Jenwald
69034ab8dc Improve glyph mapping for non-embedded composite standard fonts (issue 11088)
For non-embedded CIDFontType2 fonts with a non-/Identity encoding, use the /ToUnicode data to improve the glyph mapping.
2021-09-08 15:15:33 +02:00
Jonas Jenwald
4c1b586dd2 Reduce the size of TextLayerRenderTask._textDivProperties in "regular" text-selection mode
While these changes will obviously not have a significant effect on overall memory usage, it cannot hurt as far as I'm concerned. This patch makes the following changes:
 - Clear out `_textDivProperties` once rendering is done, since those properties are only necessary to keep alive when *enhanced* text-selection is being used.

 - Reduce the size of the `_textDivProperties`-entries by default, since a majority of the properties are only relevant when *enhanced* text-selection is being used.
2021-09-05 12:12:34 +02:00
Tim van der Meij
1b20f61b56
Merge pull request #13972 from Snuffleupagus/issue-13971
Treat all content as visible when no optional content groups are defined (issue 13971)
2021-09-04 15:53:44 +02:00
Tim van der Meij
680f33c31c
Merge pull request #13961 from Snuffleupagus/simpler-regexp
Simplify some regular expressions
2021-09-04 15:39:30 +02:00
Jonas Jenwald
6318ccf6d2 Treat all content as visible when no optional content groups are defined (issue 13971)
In the referenced PDF document the /Contents stream contains MarkedContent-operators, however no optional content dictionary exists; according to [the specification](https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G7.3883825):

> Null values or references to deleted objects shall be ignored. If this entry is
  not present, is an empty array, or contains references only to null or deleted
  objects, the membership dictionary shall have no effect on the visibility of
  any content.
2021-09-04 08:13:37 +02:00
Jonas Jenwald
3ccf277f58 Fallback to the /ToUnicode map for TrueType fonts with (3, 1) and (1, 0) cmap-tables (issue 13316)
In the PDF document some of the glyphs have bogus `differences`-entries[1] that cannot be resolved to valid glyph names, thus causing the glyph mapping to fail.
My initial idea was to use a similar approach as in the `PartialEvaluator._simpleFontToUnicode`-method, to extract the charCodes from those entries, however it turned out that that didn't actually help in this case (the mapping was still wrong).

To fix this I'm thus proposing that we fall back to the /ToUnicode map when no other useable data exists (e.g. no post-table), since it *hopefully* shouldn't make things any worse than leaving parts of the glyph map empty (which currently happens).

---
[1] As can be seen below, some of the entries are completely normal while others are non-standard:
```
Differences (array)
    0 = 65
    1 = /g5167
    2 = /space
    3 = /g11927
    4 = /g17737
    5 = /g11540
    6 = /g2180
    7 = /K
    8 = /P
    9 = /two
    10 = /zero
    11 = /one
    12 = /five
    13 = /four
    14 = /g6932
    15 = /g7246
    16 = /g1691
    17 = /g2343
    18 = /g14792
    19 = /g3325
    20 = /g4280
    21 = /g20383
    22 = /g18166
    23 = /g16988
    24 = /g17943
    25 = /g19223
    26 = /g10830
    27 = 97
    28 = /g982
    29 = /g1226
    30 = /g5059
    31 = /g2677
    32 = /g1042
    33 = /g11568
    34 = /L
    35 = /three
    36 = /seven
    37 = /g2364
    38 = /g12063
    39 = /g5356
    40 = /g2173
    41 = /g17877
    42 = /g7273
    43 = /g7647
    44 = /g7224
    45 = /g19327
    46 = /g5054
    47 = /g2342
    48 = /g10136
    49 = /g6856
    50 = /g13381
    51 = /g7257
    52 = /g12093
    53 = /g2359
```
2021-09-04 07:38:22 +02:00
Brendan Dahl
da15dbf962
Merge pull request #13698 from linfangrong/master
[FIX] fix jpx tag tree decode (issue 11957)
2021-09-03 10:00:19 -07:00
Brendan Dahl
a8ce15a2d7
Merge pull request #13966 from calixteman/no_ns
XFA - Created data node mustn't belong to datasets namespace
2021-09-03 09:59:40 -07:00
Calixte Denizet
77b9657e57 XFA - Overwrite AcroForm dictionary when saving if no datasets in XFA (bug 1720179)
- aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1720179
  - in some pdfs the XFA array in AcroForm dictionary doesn't contain an entry for 'datasets' (which contains saved data), so basically this patch allows overwriting the AcroForm dictionary with an updated XFA array when doing an incremental update.
2021-09-03 17:04:03 +02:00
Calixte Denizet
57ae3a5a76 XFA - Created data node mustn't belong to datasets namespace
- when some named nodes in the template don't have their counterpart in datasets we create some nodes: the main node mustn't belong to the datasets namespace because it doesn't make sense and Acrobat Reader isn't able to read pdf with such nodes.
  - so created nodes under a datasets node have a namespaceId set to -1 and consequently when serialized no namespace prefix will appear.
2021-09-03 15:43:25 +02:00
Brendan Dahl
804abb3786
Merge pull request #13959 from calixteman/encrypt
Correctly pad strings when saving an encrypted pdf (bug 1726789)
2021-09-02 11:41:02 -07:00
Jonas Jenwald
c42887221a Simplify some regular expressions
There's a fair number of regular expressions throughout the code-base which are slightly more verbose than strictly necessary, in particular:
 - We have a lot of regular expressions that use `[0-9]` explicitly, and those can be simplified to use `\d` instead.
 - We have one instance of a regular expression containing a `A-Za-z0-9_` sequence, which can be simplified to use `\w` instead.
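
A small illustration of the two simplifications listed above; the patterns themselves are made-up examples rather than ones taken from the code-base. Without the `u`/`i` flag combination, `\d` matches exactly `[0-9]` and `\w` matches exactly `[A-Za-z0-9_]` in JavaScript, so the shorter forms are equivalent.

```javascript
// Made-up patterns, used only to demonstrate the equivalence.
const verboseDigits = /^[0-9]+$/;
const conciseDigits = /^\d+$/;

const verboseWord = /^[A-Za-z0-9_]+$/;
const conciseWord = /^\w+$/;

// Returns true when both patterns agree on every sample.
function agreeOnSamples(a, b, samples) {
  return samples.every(sample => a.test(sample) === b.test(sample));
}

const samples = ["2021", "09a", "", "font_Name1", "with space", "é", "١٢٣"];
const digitsAgree = agreeOnSamples(verboseDigits, conciseDigits, samples);
const wordsAgree = agreeOnSamples(verboseWord, conciseWord, samples);
```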
2021-09-02 11:50:42 +02:00
Calixte Denizet
9619bf92be Correctly pad strings when saving an encrypted pdf (bug 1726789) 2021-09-02 10:37:21 +02:00
Tim van der Meij
0a366dda6a
Merge pull request #13955 from Snuffleupagus/issue-13433
Always prefer the post-table for TrueType fonts with (0, x) cmap-tables (issue 13433)
2021-09-01 21:46:34 +02:00
Tim van der Meij
19ce2de6f7
Merge pull request #13952 from Snuffleupagus/ItcSymbol
Extend `getNonStdFontMap` for non-embedded versions of the ItcSymbol font (issue 11532)
2021-09-01 21:38:59 +02:00
Jonas Jenwald
b7b6076294 Always prefer the post-table for TrueType fonts with (0, x) cmap-tables (issue 13433)
While I don't know if this is necessarily the "correct" solution, it does fix issue 13433 without breaking any of the existing reference-tests.
2021-09-01 12:35:49 +02:00
Jonas Jenwald
ba9f004097 Extend getNonStdFontMap for non-embedded versions of the ItcSymbol font (issue 11532)
Despite its name, the fonts in the ItcSymbol-family are "regular" fonts and not Symbol ones. However, given that the font name contains the word "Symbol" we ended up picking the wrong code-path in the `Font.fallbackToSystemFont`-method.

*Please note:* While this patch ensures that the text becomes readable, by falling back to a standard font, the rendering will obviously not be perfect. However, that's the PDF generator's "fault" since non-embedded fonts cannot be guaranteed to render correctly in all environments.
2021-08-31 23:21:16 +02:00
Jonas Jenwald
1f56451d56 Implement PDFNetworkStreamRangeRequestReader._onError, to handle range request errors with XMLHttpRequest (issue 9883)
Given that the Fetch API is normally being used now, these changes are probably less important now than they used to be. However, given that it's simple enough to implement this I figured why not just fix issue 9883 (better late than never I suppose).
2021-08-31 10:23:57 +02:00
Jonas Jenwald
bd9a92a161 Use optional chaining more in the src/display/network.js file
Also changes the different `_onDone`/`_onProgress` methods to use consistent parameter names, and some other small improvements.
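
For readers unfamiliar with the syntax, the following sketch contrasts an explicit guard with its optional-chaining equivalent. The function and property names are hypothetical and are not taken from `src/display/network.js`.

```javascript
// Hypothetical example; names are not from the pdf.js source.
function notifyProgressVerbose(listeners, evt) {
  if (listeners && listeners.onProgress) {
    listeners.onProgress(evt);
  }
}

// Same behaviour for null/undefined listeners or callbacks, using
// optional chaining for both the property access and the call.
function notifyProgress(listeners, evt) {
  listeners?.onProgress?.(evt);
}
```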
2021-08-31 10:23:54 +02:00
linfangrong
369f1899c6 [FIX] fix jpx tag tree decode (issue 11957) 2021-08-31 11:44:26 +08:00
Brendan Dahl
a7f807b059 Only use base encoding if it's populated. (bug 1727053)
The font dict in this file has an encoding entry, but only specifies a
differences map. The base encoding is empty in this case and shouldn't
be used.
2021-08-30 12:51:59 -07:00
Brendan Dahl
306119b12a
Merge pull request #13932 from Snuffleupagus/oc-images
Support Optional Content in Image-/XObjects (issue 13931)
2021-08-30 10:10:14 -07:00
Jonas Jenwald
cf0ccc4bab
Merge pull request #13937 from overleaf/jpa-fix-error-handling
Fix handling of fetch errors
2021-08-30 15:50:03 +02:00
Jakob Ackermann
291ffd3059
Fix handling of fetch errors
Testing:
- delete the pdf file while the initial request is inflight
- delete the pdf file after the initial request has finished

Repeat for a small file and large file, exercising both one-off and
 chunked transports.
2021-08-30 12:43:28 +01:00
Tim van der Meij
954e1a1694
Merge pull request #13943 from Snuffleupagus/api-more-async
Use `async` a bit more in the API
2021-08-29 14:34:14 +02:00
Jonas Jenwald
ce3f5ea2bf Use async a bit more in the API
This patch changes the `PDFDocumentLoadingTask.destroy`-method and the `_fetchDocument`-function to be `async`, which slightly simplifies the relevant code.

Furthermore, remove the catch-handler from the `WorkerTransport.getPageIndex`-method since it's no longer needed. Given that the `MessageHandler` is nowadays wrapping every possible Exception, it's no longer necessary to try and re-wrap the reason here.
2021-08-29 12:31:28 +02:00
Jonas Jenwald
9ea3fa0747 Ensure that PasswordException is handled correctly in the wrapReason function
While running the unit-tests with some logging statements added to this code, I noticed that `PasswordException` was missing from the list of potential Errors that could be passed to the `wrapReason` function.
2021-08-28 12:24:12 +02:00
Tim van der Meij
153d058b3a
Merge pull request #13933 from brendandahl/xfa-checkbox2
Fix saving of XFA checkboxes. (bug 1726381)
2021-08-27 22:45:44 +02:00
Jonas Jenwald
b34d2cdc42 Ensure that beginMarkedContentProps/endMarkedContent-operators, for /XObjects, are balanced in corrupt documents (PR 13854 follow-up)
Something that I *just* realized is that while PR 13854 fixed an issue as reported, it could still cause bugs in other similarly broken documents since we'll not insert a matching endMarkedContent-operator in the operatorList.
2021-08-26 17:05:30 +02:00
Jonas Jenwald
853b1172a1 Support Optional Content in Image-/XObjects (issue 13931)
Currently, in the `PartialEvaluator`, we only support Optional Content in Form-/XObjects. Hence this patch adds support for Image-/XObjects as well, which looks like a simple oversight in PR 12095 since the canvas-implementation already contains the necessary code to support this.
2021-08-26 16:54:15 +02:00
Brendan Dahl
6d2193a812 Fix saving of XFA checkboxes. (bug 1726381)
Previously we were always setting the storage value to the on value.
2021-08-24 15:53:55 -07:00