Commit Graph

365 Commits

Author SHA1 Message Date
Jonas Jenwald
b7b6076294 Always prefer the post-table for TrueType fonts with (0, x) cmap-tables (issue 13433)
While I don't know if this is necessarily the "correct" solution, it does fix issue 13433 without breaking any of the existing reference-tests.
2021-09-01 12:35:49 +02:00
Brendan Dahl
a7f807b059 Only use base encoding if it's populated. (bug 1727053)
The font dict in this file has an encoding entry, but only specifies a
differences map. The base encoding is empty in this case and shouldn't
be used.
2021-08-30 12:51:59 -07:00
Calixte Denizet
4a4591bd2c XFA - Fix font scale factors (bug 1720888)
- All the scale factors in for the substitution font were wrong because of different glyph positions between Liberation and the other ones:
    - regenerate all the factors
  - Text may have polish chars for example and in this case the glyph widths were wrong:
    - treat substitution font as a composite one
    - add a map glyphIndex to unicode for Liberation in order to generate width array for cid font
2021-07-28 19:10:42 +02:00
Calixte Denizet
76d882b560 XFA - Fix auto-sized fields (bug 1722030)
- In order to better compute text fields size, use line height with no gaps (and consequently guessed height for text are slightly better in general).
  - Fix default background color in fields.
2021-07-28 09:43:15 +02:00
Calixte Denizet
58e1f51688 XFA - Fix text positions (bug 1718741)
- font line height is taken into account by acrobat when it isn't with masterpdfeditor: I extracted a font from a pdf, modified some ascent/descent properties thanks to ttx and the reinjected the font in the pdf: only Acrobat is taken it into account. So in this patch, line heights for some substituted fonts are added.
  - it seems that Acrobat is using a line height of 1.2 when the line height in the font is not enough (it's the only way I found to fix correctly bug 1718741).
   - don't use flex in wrapper container (which was causing an horizontal overflow in the above bug).
   - consequently, the above fixes introduced a lot of small regressions, so in order to see real improvements on reftests, I fixed the regressions in this patch:
     - replace margin by padding in some case where padding is a part of a container dimensions;
     - remove some flex display: some containers are wrongly sized when rendered;
     - set letter-spacing to 0.01px: it helps to be sure that text is not broken because of not enough width in Firefox.
2021-07-09 18:11:12 +02:00
Jonas Jenwald
273d8cb746 Add non-PRODUCTION/TESTING overflow asserts to various string helper-functions (issue 6759) 2021-06-27 16:06:30 +02:00
Jonas Jenwald
50edd5da63 Suppress OTS warnings about the caretOffset in the hhea-table
- https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6hhea.html
 - https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6head.html
2021-06-25 17:02:02 +02:00
Jonas Jenwald
185be678ec Check that TrueType (3, 0) cmap tables, for symbolic fonts, are sorted correctly (issue 13626)
According to a comment in `readCmapTable`, we're assuming that the cmap tables (when more than one exist) are sorted in ascending order. If that's not the case, keep checking the following cmap tables in order to fix the referenced issue.
2021-06-25 16:56:00 +02:00
Calixte Denizet
7cdbc98716 XFA - Match font family correctly
- partial fix for https://bugzilla.mozilla.org/show_bug.cgi?id=1716980;
  - some pdf can contain an invalid font family (e.g. 'Windings 3') so in this case remove the space;
  - the font family in typeface attribute doesn't always match the one defined in the FontDescriptor dictionary.
2021-06-20 15:16:28 +02:00
Calixte Denizet
8eeb7ab4a3 XFA - Add the possibily to layout and measure text
- some containers doesn't always have their 2 dimensions and those dimensions re based on contents;
  - so in order to measure text, we must get the glyph widths (for the xfa fonts) before starting the layout;
  - implement a word-wrap algorithm;
  - handle font change during text layout.
2021-06-17 14:17:02 +02:00
Jonas Jenwald
229a49b9b9 Re-factor the fallbackToUnicode functionality (PR 9192 follow-up)
Rather than having to create and check a *separate* `ToUnicodeMap` to handle these cases, we can simply use the `fallbackToUnicode`-data (when it exists) to directly supplement *missing* /ToUnicode entires in the regular `ToUnicodeMap` instead.
2021-06-14 15:05:14 +02:00
Jonas Jenwald
7190bc23a8 Remove unnecessary in checks of Arrays, when building the charCodeToGlyphId for TrueType fonts
Note that all standard Encodings have the same length (i.e. `256` elements) and that missing entries are always represented by empty strings, hence why a separate exists-check isn't necessary in the `baseEncoding` case.
2021-06-14 15:05:14 +02:00
Jonas Jenwald
3660aaac85 Tweak adjustToUnicode to allow extending a built-in /ToUnicode map
*This is somewhat similiar to the recent changes, in PR 13277, for fonts with an /Encoding entry.*

Currently we're *completely* ignoring the `builtInEncoding`, from the font data itself, for fonts which have a built-in /ToUnicode map.
While it (obviously) doesn't seem like a good idea in general to simply overwrite existing built-in /ToUnicode entries, it should however not hurt to use the `builtInEncoding` to supplement *missing* /ToUnicode entires.
2021-06-14 15:05:14 +02:00
Calixte Denizet
fd1110adb4 Add the possibility to rescale each glyph in a font
- a lot of xfa files are using Myriad pro or Arial fonts without embedding them and some containers have some dimensions based on those font metrics. So not having the exact same font leads to a wrong display.
  - since it's pretty hard to find a replacement font with the exact same metrics, this patch gives the possibility to read glyf table, rescale each glyph and then write a new table.
  - so once PR #12726 is merged we could rescale for example Helvetica to replace Myriad Pro.
2021-06-09 16:01:13 +02:00
Jonas Jenwald
e7dc822e74
Merge pull request #12726 from brendandahl/standard-fonts
[api-minor] Include and use the 14 standard font files.
2021-06-08 10:09:40 +02:00
Brendan Dahl
4c1dd47e65 Include and use the 14 standard fonts files. 2021-06-07 11:10:11 -07:00
Brendan Dahl
17f1857556 Add more info for showText operator in stepper.
Adds a table that shows original char code, font char code, and unicode.
2021-06-04 13:58:05 -07:00
Tim van der Meij
0d56b1c365
Merge pull request #13443 from Snuffleupagus/charsCache
Re-factor the `charsCache` on `Font`-instances
2021-05-28 21:29:57 +02:00
Calixte Denizet
0c698346b8 Fix Postscript name in font to avoid bug when saving in pdf
- for xfa rendering, fonts are loaded and used in html;
  - when printed and saved in pdf, on linux, Firefox uses cairo backend
  - when subsetting a font, cairo uses the font postscript name and when this one is empty that leads to a bug
    (the append at 63f0d62684/src/cairo-cff-subset.c (L2049) is failing because of null length)
  - so this patch adds a postscript name to the font to make cairo happy.
2021-05-27 12:45:40 +02:00
Jonas Jenwald
8b1d01816b Re-factor the charsCache on Font-instances
Currently `charsCache` is initialized *lazily*, which considering that it just contains a simple `Object` doesn't seem entirely necessary. This first of all forces us to do repeated exists-checks in the `Font.charsToGlyphs` method, and secondly the similar/related `glyphCache` is already initialized eagerly.

Furthermore, this patch also does a bit of clean-up in the `Font.charsToGlyphs` method since this code is quite old.
2021-05-26 13:13:44 +02:00
Jonas Jenwald
1a8d05fdcf Remove some, with Prettier 2.3.0, unnecessary // prettier-ignore comments
To get the maximum benefit from something like Prettier, you obviously don't want to disable the automatic formatting unless absolutely necessary. When we added Prettier there were a number of cases, mostly involving larger Arrays, which required disabling of the automatic formatting for overall readability and/or to not break inline comments.

With changes in Prettier version `2.3.0`, see [the release notes](https://prettier.io/blog/2021/05/09/2.3.0.html#concise-formatting-of-number-only-arrays-10106httpsgithubcomprettierprettierpull10106-10160httpsgithubcomprettierprettierpull10160-by-thorn0httpsgithubcomthorn0), there's now better formatting support for Arrays containing only numbers. Hence we can now remove a number of `// prettier-ignore` comments, and thus get the benefit of automatic formatting in (slightly) more of the code-base.
2021-05-19 11:36:03 +02:00
Calixte Denizet
a74d19262a XFA - Don't move glyphes in private area with non-truetype fonts
- it has been done in PR #13146 but only for truetype fonts.
2021-05-17 16:52:39 +02:00
Jonas Jenwald
8943bcd3c3 Account for formatting changes in Prettier version 2.3.0
With the exception of one tweaked `eslint-disable` comment, in `web/generic_scripting.js`, this patch was generated automatically using `gulp lint --fix`.

Please find additional information at:
 - https://github.com/prettier/prettier/releases/tag/2.3.0
 - https://prettier.io/blog/2021/05/09/2.3.0.html
2021-05-16 11:44:05 +02:00
Jonas Jenwald
b487edd05d Convert src/core/fonts.js to use standard classes
Obviously the `Font`-class is still *very* large, given particularly how TrueType fonts are handled, however this patch-series at least improves things by moving a number of functions/classes into their own files.
As a follow-up it might make sense to try and re-factor/extract the TrueType parsing into its own file, since all of this code is quite old, however that's probably best left for another time.

For e.g. `gulp mozcentral`, the *built* `pdf.worker.js` files decreases from `1 620 332` to `1 617 466` bytes with this patch-series.
2021-05-03 13:57:25 +02:00
Jonas Jenwald
cadc20d8b9 Fix the remaining no-var failures, which couldn't be handled automatically, in the src/core/fonts.js file 2021-05-02 21:00:29 +02:00
Jonas Jenwald
b9cd080c01 Enable the no-var rule in the src/core/fonts.js file
These changes were made automatically, using `gulp lint --fix`.
Given the large size of this patch, the manual fixes are done separately in the next commit.
2021-05-02 21:00:29 +02:00
Jonas Jenwald
ff85bcfc0e Move the Type1Font from src/core/fonts.js and into its own file 2021-05-02 21:00:29 +02:00
Jonas Jenwald
d5d73e3168 Move the CFFFont from src/core/fonts.js and into its own file 2021-05-02 21:00:29 +02:00
Jonas Jenwald
77b258440b Move some constants and helper functions from src/core/fonts.js and into their own file
- `FontFlags`, is used in both `src/core/fonts.js` and `src/core/evaluator.js`.
 - `getFontType`, same as the above.
 - `MacStandardGlyphOrdering`, is a fairly large data-structure and `src/core/fonts.js` is already a *very* large file.
 - `recoverGlyphName`, a dependency of `type1FontGlyphMapping`; please see below.
 - `SEAC_ANALYSIS_ENABLED`, is used by both `Type1Font`, `CFFFont`, and unit-tests; please see below.
 - `type1FontGlyphMapping`, is used by both `Type1Font` and `CFFFont` which a later patch will move to their own files.
2021-05-02 21:00:29 +02:00
Jonas Jenwald
6912bb5e0a Move the IdentityToUnicodeMap/ToUnicodeMap from src/core/fonts.js and into its own file 2021-05-02 21:00:29 +02:00
Jonas Jenwald
a783c7ca79 Move the OpenTypeFileBuilder from src/core/fonts.js and into its own file 2021-05-02 21:00:28 +02:00
Jonas Jenwald
3624f9eac7 Add a new BaseStream.getString(...) method to replace manual bytesToString(BaseStream.getBytes(...)) calls
Given that the `bytesToString(BaseStream.getBytes(...))` pattern is somewhat common throughout the `src/core/` code, it cannot hurt to add a new `BaseStream`-method which handles that case internally.
2021-05-01 19:20:36 +02:00
Jonas Jenwald
7fab73ed23 For CFF fonts without proper ToUnicode/Encoding data, utilize the "charset"/"Encoding"-data from the font file to improve text-selection (issue 13260)
This patch extends the approach, implemented in PR 7550, to also apply to CFF fonts.
2021-04-20 20:48:44 +02:00
Jonas Jenwald
f560fe6875 A couple of small scripting/XFA-related tweaks in the worker-code
- Use `PDFManager.ensureDoc`, rather than `PDFManager.ensure`, in a couple of spots in the code. If there exists a short-hand format, we should obviously use it whenever possible.

 - Fix a unit-test helper, to account for the previous changes. (Also, converts a function to be `async` instead.)

 - Add one more exists-check in `PDFDocument.loadXfaFonts`, which I missed to suggest in PR 13146, to prevent any possible errors if the method is ever called in a situation where it shouldn't be.
   Also, print a warning if the actual font-loading fails since that could help future debugging. (Finally, reduce overall indentation in the loop.)

 - Slightly unrelated, but make a small tweak of a comment in `src/core/fonts.js` to reduce possible confusion.
2021-04-17 10:34:22 +02:00
Calixte Denizet
7e9579045f XFA -- Load fonts permanently from the pdf
- Different fonts can be used in xfa and some of them are embedded in the pdf.
  - Load all the fonts in window.document.

Update src/core/document.js

Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>

Update src/core/worker.js

Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>
2021-04-15 17:57:42 +02:00
Jonas Jenwald
f986ccdf0e Fuzzy-match the fontName, for TrueType Collection fonts, where the "name"-table is wrong (issue 13193)
The fontName, as defined in the PDF document, cannot be found in *any* of the "name"-tables in the TrueType Collection font. To work-around that, this patch adds a *fallback* code-path to allow using an approximately matching fontName rather than outright failing.
2021-04-07 15:25:32 +02:00
Jani Pehkonen
0117ee5071 Use post table when Encoding has only Differences
Fixes #13107
In the issue, some TrueType glyph names have the format `uniXXXX`.
Font's `Encoding` dictionary has the entry `Differences` but no
`BaseEncoding`. `uniXXXX` names are converted to glyph indices
using font's `post` table but currently that is done only when
`BaseEncoding` exists. We must enable the conversion also when only
`Differences` exists.
2021-03-31 17:58:44 +03:00
Jonas Jenwald
2600e59acb Always re-measure non-embedded ArialNarrow fonts (bug 1671312, PR 12725 follow-up)
While PR 12725 fixed bug 1671312 as reported, i.e. the "In the upper right corner "Purposes' has bad kerning."-part, it however broke other parts of the text rendering.
Note in particular the tables, e.g. on page 2 and beyond, where the glyphs are now rendered too close together. The reason for this is that the fonts in question are non-embedded ArialNarrow, which we just replace with Helvetica which obviously is not narrow. Given that the font replacement isn't a perfect fit for non-embedded ArialNarrow, we still need to re-measure the glyph widths in this case.
2021-01-14 15:51:48 +01:00
Jonas Jenwald
81525fd446 Use ESLint to ensure that exports are sorted alphabetically
There's built-in ESLint rule, see `sort-imports`, to ensure that all `import`-statements are sorted alphabetically, since that often helps with readability.
Unfortunately there's no corresponding rule to sort `export`-statements alphabetically, however there's an ESLint plugin which does this; please see https://www.npmjs.com/package/eslint-plugin-sort-exports

The only downside here is that it's not automatically fixable, but the re-ordering is a one-time "cost" and the plugin will help maintain a *consistent* ordering of `export`-statements in the future.
*Note:* To reduce the possibility of introducing any errors here, the re-ordering was done by simply selecting the relevant lines and then using the built-in sort-functionality of my editor.
2021-01-09 20:37:51 +01:00
Calixte Denizet
56424967f2 Fix encoding issues when printing/saving a form with non-ascii characters 2021-01-05 17:23:18 +01:00
Brendan Dahl
45d9ab6e45 Use widths defined by font for standard fonts.
There doesn't seem to be anything definitive about this in
the spec, but from experimenting, it seems acrobat lets
PDFs override the widths of the standard fonts.
2020-12-10 15:30:39 -08:00
Jonas Jenwald
8540b4cc76 Stop calling Font.charsToGlyphs, in src/core/annotation.js, with unused arguments
As can be seen in `src/core/fonts.js`, this method only accepts *one* parameter, hence it's somewhat difficult to understand what the Annotation-code is actually attempting to do here.
The only possible explanation that I can imagine, is that the intention was initially to call `Font.charToGlyph` *directly* instead. However, note that that'd would not actually have been correct, since that'd ignore one level of font-caching (see `this.charsCache`). Hence the unused arguments are removed, in `src/core/annotation.js`, and the `Font.charToGlyph` method is now marked as "private" as intended.
2020-10-30 13:17:52 +01:00
Jonas Jenwald
9416b14e8b Re-factor how the ESLint no-var rule is enabled in the src/ folder
This simplifies/consolidates the ESLint configuration slightly in the `src/` folder, and prevents the addition of any new files where `var` is being used.[1]
Hence we no longer need to manually add `/* eslint no-var: error */` in files, which is easy to forget, and can instead disable the rule in the `src/core/` files where `var` is still in use.

---
[1] Obviously the `no-var` rule can, in the same way as every other rule, be disabled on a case-by-case basis where actually necessary.
2020-10-03 20:15:29 +02:00
Jonas Jenwald
bd3b15b897 Use the cidToGidMap, if it exists, when building the glyph mapping for non-embedded composite fonts (issue 12418) 2020-09-28 14:40:43 +02:00
Jonas Jenwald
6c8f1f7d6f Run gulp lint --fix, to account for changes in Prettier version 2.1.x 2020-09-06 12:23:59 +02:00
Brendan Dahl
45e8a31cc0 Fix handling of symbolic fonts and unicode cmaps.
In issue 12120, the font has a 1,0 cmap and is marked symbolic which
according to the spec means we should directly use the cmap instead of
the extra steps that are defined in 9.6.6.4.

However, just fixing that caused bug 1057544 to break. The font in bug
1057544 has a 0,1 cmap (Unicode 1.1) which we were not using, but is
easy to support. We're also easily able to support some of the other
unicode cmaps, so I added those as well.

There was also a second issue with bug 1057544, the cmap doesn't have
a mapping for the "quoteright" glyph, but it is defined in the post
table. To handle this, I've moved post table as a  fallback for any
font that has an encoding.
2020-08-27 14:33:11 -07:00
Brendan Dahl
f6dff81223 Fix bad truetype loca tables.
Some fonts have loca tables that aren't sorted or use 0 as an offset to
signal a missing glyph. This fixes the bad loca tables by sorting them
and then rewriting the loca table and potentially re-ordering the glyf
table to match.

Fixes #11131 and bug 1650302.
2020-08-10 14:15:49 -07:00
Jonas Jenwald
4cc6797f17 Re-factor the idFactory functionality, used in the core/-code, and move the fontID generation into it
Note how the `getFontID`-method in `src/core/fonts.js` is *completely* global, rather than properly tied to the current document. This means that if you repeatedly open and parse/render, and then close, even the *same* PDF document the `fontID`s will still be incremented continuously.

For comparison the `createObjId` method, on `idFactory`, will always create a *consistent* id, assuming of course that the document and its pages are parsed/rendered in the same order.

In order to address this inconsistency, it thus seems reasonable to add a new `createFontId` method on the `idFactory` and use that when obtaining `fontID`s. (When the current `getFontID` method was added the `idFactory` didn't actually exist yet, which explains why the code looks the way it does.)
*Please note:* Since the document id is (still) part of the `loadedName`, it's thus not possible for different documents to have identical font names.
2020-07-07 16:33:31 +02:00
Jonas Jenwald
695140728a [src/core/fonts.js] Improve the validateOS2Table function
Rather than creating a new `Stream` just to validate the OS/2 TrueType table, it's simpler/better to just pass in a reference to the font data and use that instead (similar to other TrueType helper functions).
2020-04-19 11:25:25 +02:00
Jonas Jenwald
033d27fc25 [src/core/fonts.js] Replace some unnecessary Stream.getUint16() calls with Stream.skip(2) instead
There's a handful of cases in the code where the intention is simply to advance the `Stream` position, but rather than only doing that the code instead fetches/computes a Uint16 value (and without using the result for anything).
2020-04-19 11:18:20 +02:00