Commit Graph

544 Commits

Author SHA1 Message Date
Jay Berkenbilt
586295fad6 Implement TrueType character map "format 2" (fixes #14117)
If a PDF included an embedded TrueType font whose preferred character
map (cmap) was in "format 2", the code would select that character map
and then refuse to read it because of an unsupported format, thus
causing the characters not to be rendered. This commit implements
support for format 2 as described at the link below.

https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6cmap.html
2021-10-13 07:37:14 -04:00
Jonas Jenwald
284d259054
Merge pull request #14057 from Snuffleupagus/bug-920426
Support CMap-data with only strings, when parsing TrueType composite fonts (bug 920426)
2021-10-01 23:22:25 +02:00
Calixte Denizet
aecbd7cd89 AcroForm: Add support for ResetForm action
- it aims to fix #12721.
  - Thanks to PR #14023, we've now the fieldObjects in the annotation layer so we can easily map fields names on their id if needed.
  - Reset values in the storage, in the JS sandbox and in the visible html elements.
2021-09-30 22:02:33 +02:00
Jonas Jenwald
d3ca28bc34 Support CMap-data with only strings, when parsing TrueType composite fonts (bug 920426)
In the referenced bug, the embedded fonts contain custom CMap-data that only include strings. Note how for embedded composite TrueType fonts we're using the CMap-data when building the glyph mapping, and currently we end up with a completely empty map because the code expects only CID *numbers*.
Furthermore, just fixing the glyph mapping alone isn't sufficient to fully address the bug, since we also need to consider this "special" kind of CMap-data when looking up glyph widths.
2021-09-30 18:10:47 +02:00
Calixte Denizet
748ab4983c Add the missing pdf file for the test in the PR #14049 2021-09-29 22:07:07 +02:00
Jonas Jenwald
1dcd2f0cd3 [api-minor] Add basic support for RTL text-content in PopupAnnotations (issue 14046)
In order to implement this, we utilize the existing `bidi` function to infer the text-direction of /T and /Contents entries. While this may not be perfect in cases where one PopupAnnotation mixes LTR and RTL languages, it should work well enough in most cases.
To avoid having to add *two new* properties in lots of annotations, supplementing the existing `title`/`contents`-properties, this patch instead re-factors the existing code such that the properties are replaced by Objects (containing `str` and `dir`).

*Please note:* In order avoid breaking existing third-party implementations, `GENERIC`-builds of the PDF.js library will still provide the old `title`/`contents`-properties on annotations returned by `PDFPageProxy.getAnnotations`.
2021-09-25 09:18:58 +02:00
Calixte Denizet
386acf5bdd Integration test for PR #14023 2021-09-23 13:05:18 +02:00
Jonas Jenwald
8ea27ce157 Tweak how fonts with an /Encoding are handled in adjustToUnicode (issue 14048, PR 13277 follow-up)
Currently we only exclude /Encoding entries that also contains a /Differences array, which is the cause of the text-selection problem in the referenced issue.
In order to address this we'll now also exclude /Encoding entries that contain one of the predefined *named* encodings, and no longer require that it also contains a /Differences array.

*Please note:* This patch cases a small "regression" in the `bug1130815-text` test-case, however this is actually an improvement when compared with Adobe Reader and PDFium (in Google Chrome).
2021-09-18 22:44:25 +02:00
Jonas Jenwald
a11343e9af Improve glyph mapping for non-embedded composite standard fonts with a /CIDToGIDMap (issue 11915)
*Please note:* All of this feels very handwavy, but at least it passes all tests locally. Hopefully we have enough tests for this part of the font code.

For non-embedded composite standard fonts with an "incomplete" /CIDToGIDMap, we'll now fallback to an *explicitly defined* /ToUnicode map even when that one happens to be an /Identity-H or /Identity-V map.

The `Font.fallbackToSystemFont` method is unfortunately getting more and more special-cases, however that might be unavoidable given all the weird non-embedded fonts found in the wild :-(
2021-09-15 11:30:40 +02:00
Brendan Dahl
f38fb42b42 Enable/disable image smoothing based on image interpolate value. (bug 1722191)
While some of the output looks worse to my eye, this behavior more
closely matches what I see when I open the PDFs in Adobe acrobat.

Fixes: #4706, #9713, #8245, #1344
2021-09-10 14:23:35 -07:00
Jonas Jenwald
3ccf277f58 Fallback to the /ToUnicode map for TrueType fonts with (3, 1) and (1, 0) cmap-tables (issue 13316)
In the PDF document some of the glyphs have bogus `differences`-entries[1] that cannot be resolved to valid glyph names, thus causing the glyph mapping to fail.
My initial idea was to use a similar approach as in the `PartialEvaluator._simpleFontToUnicode`-method, to extract the charCodes from those entries, however it turned out that that didn't actually help in this case (the mapping was still wrong).

To fix this I'm thus proposing that we fallback to the /ToUnicode map when no other useable data exists (e.g. no post-table), since it *hopefully* shouldn't make things any worse than leaving parts of the glyph map empty (which currently happens).

---
[1] As can be seem below, some of the entries are completely normal while others are non-standard:
```
Differences (array)
    0 = 65
    1 = /g5167
    2 = /space
    3 = /g11927
    4 = /g17737
    5 = /g11540
    6 = /g2180
    7 = /K
    8 = /P
    9 = /two
    10 = /zero
    11 = /one
    12 = /five
    13 = /four
    14 = /g6932
    15 = /g7246
    16 = /g1691
    17 = /g2343
    18 = /g14792
    19 = /g3325
    20 = /g4280
    21 = /g20383
    22 = /g18166
    23 = /g16988
    24 = /g17943
    25 = /g19223
    26 = /g10830
    27 = 97
    28 = /g982
    29 = /g1226
    30 = /g5059
    31 = /g2677
    32 = /g1042
    33 = /g11568
    34 = /L
    35 = /three
    36 = /seven
    37 = /g2364
    38 = /g12063
    39 = /g5356
    40 = /g2173
    41 = /g17877
    42 = /g7273
    43 = /g7647
    44 = /g7224
    45 = /g19327
    46 = /g5054
    47 = /g2342
    48 = /g10136
    49 = /g6856
    50 = /g13381
    51 = /g7257
    52 = /g12093
    53 = /g2359
```
2021-09-04 07:38:22 +02:00
Brendan Dahl
a7f807b059 Only use base encoding if it's populated. (bug 1727053)
The font dict in this file has an encoding entry, but only specifies a
differences map. The base encoding is empty in this case and shouldn't
be used.
2021-08-30 12:51:59 -07:00
Jonas Jenwald
853b1172a1 Support Optional Content in Image-/XObjects (issue 13931)
Currently, in the `PartialEvaluator`, we only support Optional Content in Form-/XObjects. Hence this patch adds support for Image-/XObjects as well, which looks like a simple oversight in PR 12095 since the canvas-implementation already contains the necessary code to support this.
2021-08-26 16:54:15 +02:00
Jonas Jenwald
ac27f96987 Extend the glyph maps for standard respectively Calibri fonts (issue 13916) 2021-08-21 00:48:38 +02:00
Brendan Dahl
da1af02ac8 Improve performance of reused patterns.
Bug 1721218 has a shading pattern that was used thousands of times.
To improve performance of this PDF:
 - add a cache for patterns in the evaluator and only send the IR form once
   to the main thread (this also makes caching in canvas easier)
 - cache the created canvas radial/axial patterns
 - for shading fill radial/axial use the pattern directly instead of creating temporary
   canvas
2021-07-22 16:47:40 -07:00
Brendan Dahl
a52c0c6988 Fix transformations when painting image masks and tiling patterns.
Previously, when we filled image masks we didn't copy over the current transformation,
this caused patterns to be misaligned when painted. Now we create a temporary
canvas with the mask and have the transform copied over and offset it relative to
where the mask would be painted. We also weren't properly offsetting tiling patterns.
This isn't usually noticeable since patters repeat, but in the case of #13561 the pattern
is only drawn once and has to be in the correct position to line up with the mask image.

These fixes broke #11473, but highlighted that we were drawing that correctly by
accident and not correctly handling negative bounding boxes on tiling patterns.

Fixes #6297,  #13561, #13441

Partially fixes #1344 (still blurry but boxes are in correct position now)
2021-07-06 17:29:32 -07:00
Calixte Denizet
429ffdcd2f XFA - Save filled data in the pdf when downloading the file (Bug 1716288)
- when binding (after parsing) we get a map between some template nodes and some data nodes;
  - so set user data in input handlers in using data node uids in the annotation storage;
  - to save the form, just put the value we have in the storage in the correct data nodes, serialize the xml as a string and then write the string at the end of the pdf using src/core/writer.js;
  - fix few bugs around data bindings:
    - the "Off" issue in Bug 1716980.
2021-06-25 18:57:01 +02:00
Brendan Dahl
5efaaa0fea Fix how patterns are applied to image mask objects.
Note, this only really fixes Radial/Axial shading patterns with masks.
I'm guessing tiling patterns and mesh patterns would also be broken
if applied like the test pdf. Hopefully I'll have some time to make
test cases for the other shadings.

Fixes #13372
2021-06-16 20:06:41 -07:00
Jonas Jenwald
e7dc822e74
Merge pull request #12726 from brendandahl/standard-fonts
[api-minor] Include and use the 14 standard font files.
2021-06-08 10:09:40 +02:00
Brendan Dahl
4c1dd47e65 Include and use the 14 standard fonts files. 2021-06-07 11:10:11 -07:00
Jonas Jenwald
20770cb06a Improve text-selection for Type3 fonts with empty /FontBBox-entries (issue 6605)
For Type3 fonts where the /CharProcs-streams of the individual glyph starts with a `d1` operator, we can use that to build a fallback bounding box for the font and thus improve text-selection in some cases.
2021-06-05 08:09:29 +02:00
Brendan Dahl
6255c2a8f3
Merge pull request #13376 from calixteman/6132
Replace command with not enough args by an endchar in CFF font
2021-06-04 14:00:51 -07:00
Tim van der Meij
a0ce3cb3b4
Merge pull request #13448 from Snuffleupagus/_setDefaultAppearance-alpha
Support strokeAlpha/fillAlpha when creating a fallback appearance stream (issue 6810)
2021-05-28 23:39:36 +02:00
Jonas Jenwald
707a9e3b02 Work-around for HighlightAnnotations without a top-level /ExtGState-entry (issue 13242)
For HighlightAnnotations with a built-in appearance stream, we still rely on it to specify the opacity correctly via a suitable blend mode. However, if the Annotation-drawing operators are placed *within* a /XObject of the /Form-type, the /ExtGState won't apply to the final rendering and the result is that the highlighting obscures the underlying text.

The more *correct* and general solution would likely be to somehow modify the implementation in `src/display/canvas.js`, to special-case handling of /Form-type /XObjects when rendering Annotations. Since we can very easily work-around this problem for now by using the "no appearance stream" code-path, doing *something* here ought to be preferable.

This patch is (obviously) merely a work-around, but given that the referenced issue is (as far as I know) the first case we've seen of this problem a simple solution will hopefully suffice for now.
2021-05-28 13:49:27 +02:00
Jonas Jenwald
a6447f2ca2 Support strokeAlpha/fillAlpha when creating a fallback appearance stream (issue 6810)
This fixes the colours, by respecting the strokeAlpha/fillAlpha-values, for a couple of Annotations in the PDF document from issue 13447.[1]

---
[1] Some of the annotations still won't render at all, when compared with Adobe Reader, but that could/should probably be handled separately.
2021-05-27 16:23:18 +02:00
Calixte Denizet
1a2cea21a5 Replace command with not enough args by an endchar in CFF font
- Right now, a glyph with an erroneous outline is replaced by an empty glyph
    if the error is far enough from the start there's likely something to render
    so the idea is to replace a command with args by an endchar when no args are
    on the stack: this way OTS is likely happy (no remaining args on stack) and we
    can draw something which is likely better than nothing.
2021-05-14 13:45:45 +02:00
Brendan Dahl
53991d0924 Fix tiling pattern with smask.
After drawing a tiling pattern we were not calling
endDrawing, which handles compositing any
active smasks.

Fixes #8565.
2021-05-12 11:42:08 -07:00
Tim van der Meij
ba99e54c66
Merge pull request #13361 from brendandahl/patterns-fixes
Fix several issues with radial/axial shadings and tiling patterns.
2021-05-12 20:27:37 +02:00
Brendan Dahl
ac44afa70e Fix several issues with radial/axial shadings and tiling patterns.
Previously, we set the base transformation and pattern matrix
directly to the main rendering ctx of the page, however doing this
caused the current transform to be lost. This would cause issues
with things like shear missing so the pattern was misaligned or when
stroke was used the scale of the line width or dash would be wrong.
Instead we should leave the current transform and use setTransfrom
on the pattern so it is applied correctly. For axial and radial shadings I had
to create a temporary canvas to draw the shading so I could in turn
use setTransform.

Fixes: #13325, #6769, #7847, #11018, #11597, #11473

The following already in the corpus are improved:
issue8078-page1
issue1877-page1
2021-05-11 16:32:24 -07:00
Jonas Jenwald
fc59a5f709 Take the W array into account when computing the hash, in PartialEvaluator.preEvaluateFont, for composite fonts (issue 13343)
Without this some *composite* fonts may incorrectly end up with matching `hash`es, thus breaking rendering since we'll not actually try to load/parse some of the fonts.

*Please note:* Given that the document, in the referenced issue, doesn't embed *any* of its fonts there's no guarantee that it renders correctly in all configurations even with this patch.
2021-05-07 21:22:36 +02:00
Calixte Denizet
3f29892d63 [JS] Fix several issues found in pdf in #13269
- app.alert and few other function can use an object as parameter ({cMsg: ...});
  - support app.alert with a question and a yes/no answer;
  - update field siblings when one is changed in an action;
  - stop calculation if calculate is set to false in the middle of calculations;
  - get a boolean for checkboxes when they've been set through annotationStorage instead of a string.
2021-05-04 19:21:51 +02:00
Calixte Denizet
549aae6c3d JS -- add support for page property in field 2021-05-03 15:46:29 +02:00
Brendan Dahl
d10da907da
Fix position of highlighted all text. (#13306)
Adds a new integration test to ensure we don't
regress this again.
2021-04-28 10:15:31 +02:00
Tim van der Meij
60ab15427f
Implement rendering polyline/polygon annotations without appearance stream 2021-04-27 19:02:20 +02:00
Jonas Jenwald
6f4394fcd8
Support InkAnnotations without appearance streams (issue 13298) (#13301)
For now, we keep things purposely simple by using straight lines (rather than curves); please see https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G11.2096579
2021-04-27 11:49:03 +02:00
Calixte Denizet
e868ab0051 Update all the text widgets having the same name with the same value 2021-04-20 20:03:19 +02:00
Jani Pehkonen
3a96977ea8 Implement visibility expressions for optional content 2021-04-14 17:39:41 +03:00
Tim van der Meij
d9d626a5e1
Merge pull request #13214 from calixteman/signatures
Display widget signature
2021-04-10 19:35:16 +02:00
Calixte Denizet
5875ebb1ca Display widget signature
- but don't validate them for now;
  - Firefox will display a bar to warn that the signature validation is not supported (see https://bugzilla.mozilla.org/show_bug.cgi?id=854315)
  - almost all (all ?) pdf readers display signatures;
  - validation is done in edge but for now it's behind a pref.
2021-04-10 19:13:28 +02:00
Brendan Dahl
fc9501a637 Add support for basic structure tree for accessibility.
When a PDF is "marked" we now generate a separate DOM that represents
the structure tree from the PDF.  This DOM is inserted into the <canvas>
element and allows screen readers to walk the tree and have more
information about headings, images, links, etc. To link the structure
tree DOM (which is empty) to the text layer aria-owns is used. This
required modifying the text layer creation so that marked items are
now tracked.
2021-04-09 09:56:28 -07:00
Jonas Jenwald
f986ccdf0e Fuzzy-match the fontName, for TrueType Collection fonts, where the "name"-table is wrong (issue 13193)
The fontName, as defined in the PDF document, cannot be found in *any* of the "name"-tables in the TrueType Collection font. To work-around that, this patch adds a *fallback* code-path to allow using an approximately matching fontName rather than outright failing.
2021-04-07 15:25:32 +02:00
Jani Pehkonen
0117ee5071 Use post table when Encoding has only Differences
Fixes #13107
In the issue, some TrueType glyph names have the format `uniXXXX`.
Font's `Encoding` dictionary has the entry `Differences` but no
`BaseEncoding`. `uniXXXX` names are converted to glyph indices
using font's `post` table but currently that is done only when
`BaseEncoding` exists. We must enable the conversion also when only
`Differences` exists.
2021-03-31 17:58:44 +03:00
calixteman
81c602c61c
Set CFF header to 4 when writing it because it contains 4 elements (#13149) 2021-03-26 18:23:18 +01:00
Jonas Jenwald
5099f1977f Support LineAnnotations with empty /Rect-entries (issue 6564)
This extends PR 13033 slightly, with a heuristic to support corrupt PDF documents where the `LineAnnotation`s have an empty /Rect-entry. Please note that while I have no idea if this is "correct", this patch at least makes us output the same /BBox as re-saving in Adobe Reader does.
2021-03-15 16:33:43 +01:00
Tim van der Meij
5828ff6cb0
Implement rendering line annotations without appearance stream 2021-02-28 18:57:58 +01:00
Tim van der Meij
fa6cebf045
Implement rendering square/circle annotations without appearance stream 2021-02-27 19:05:12 +01:00
Calixte Denizet
4a5f1d1b7a JS - Fix setting a color on an annotation
- strokeColor corresponds to borderColor;
 - support fillColor and textColor;
 - support colors on the different annotations;
 - fix typo in aforms (+test).
2021-02-20 15:24:37 +01:00
Calixte Denizet
0fc8267576 Avoid infinite loop when getting annotation field name
- aims to fix issue #12963;
 - use a Set to track already visited objects;
 - remove the loop limit in getInheritableProperty and use a RefSet too.
2021-02-14 19:58:19 +01:00
calixteman
a3f6882b06
JS -- add support for choice widget (#12826) 2021-01-25 23:40:57 +01:00
Dominik Hufnagel
c5083cda02 set font size and color on annotation layer
use the default appearance to set the font size and color of a text
annotation widget
2021-01-22 23:12:14 +01:00