Sakurai/pdf.js - pdf.js - Gitea on kemo


Author	SHA1	Message	Date
calixteman	41b2f52f70	Merge pull request #15157 from calixteman/1778484 Add unicode mapping in the font cmap to have correct chars when printing in pdf (bug 1778484)	2022-07-13 14:45:12 +02:00
Calixte Denizet	680c293c34	Add unicode mapping in the font cmap to have correct chars when printing in pdf (bug 1778484) It aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1778484.	2022-07-13 14:38:27 +02:00
Jonas Jenwald	dcc73423e5	Enable the `unicorn/prefer-logical-operator-over-ternary` ESLint plugin rule This leads to ever so slightly more compact code, and can in some cases remove the need for a temporary variable. Please find additional information here: https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-logical-operator-over-ternary.md	2022-07-12 10:52:37 +02:00
Jonas Jenwald	9ac4536693	Enable the `unicorn/prefer-at` ESLint plugin rule (PR 15008 follow-up) Please find additional information here: - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/at - https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-at.md	2022-06-09 21:21:19 +02:00
Jonas Jenwald	6e7e9d83d8	Add support for TrueType format 12 `cmap`s (issue 14881) This is, as far as I can tell, the first case we've seen of a format 12 `cmap`. Please see https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6cmap.html	2022-05-06 11:11:38 +02:00
Jonas Jenwald	1a7921dbf0	Compute the loca table `endOffset`, of the "first" glyph, correctly (issue 14618) When there are multiple empty glyphs at the start of the data, ensure that the "first" glyph gets a correct `endOffset` to avoid skipping it during parsing in the `sanitizeGlyph` function.	2022-03-03 14:22:45 +01:00
Jonas Jenwald	05edd91bdb	Remove the `isNum` helper function The call-sites are replaced by direct `typeof`-checks instead, which removes unnecessary function calls. Note that in the `src/`-folder we already had more `typeof`-cases than `isNum`-calls. These changes were mostly done using regular expression search-and-replace, with two exceptions: - In `Font._charToGlyph` we no longer unconditionally update the `width`, since that seems completely unnecessary. - In `PDFDocument.documentInfo`, when parsing custom entries, we now do the `typeof`-check once.	2022-02-22 11:55:34 +01:00
Calixte Denizet	ae842e1c3a	[api-minor] Annotations - Adjust the font size in text field in considering the total width (bug 1721335) - it aims to fix #14502 and bug 1721335; - Acrobat and Pdfium do the same; - it'll avoid to have truncated data when printed; - change the factor to compute font size in using field height: lineHeight = 1.35*fontSize - this is the value used by Acrobat. - in order to not have truncated strings on the bottom, add few basic metrics for standard fonts.	2022-01-30 15:53:31 +01:00
Calixte Denizet	e1d3a3b414	Remove the invisible format marks from the text chunks - it aims to fix issue #9186.	2022-01-24 13:47:24 +01:00
Calixte Denizet	9dae421a0d	Handle all the whitespaces the same way when creating text chunks	2022-01-15 21:44:00 +01:00
Jay Berkenbilt	586295fad6	Implement TrueType character map "format 2" (fixes #14117 ) If a PDF included an embedded TrueType font whose preferred character map (cmap) was in "format 2", the code would select that character map and then refuse to read it because of an unsupported format, thus causing the characters not to be rendered. This commit implements support for format 2 as described at the link below. https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6cmap.html	2021-10-13 07:37:14 -04:00
Jonas Jenwald	d3ca28bc34	Support CMap-data with only strings, when parsing TrueType composite fonts (bug 920426) In the referenced bug, the embedded fonts contain custom CMap-data that only include strings. Note how for embedded composite TrueType fonts we're using the CMap-data when building the glyph mapping, and currently we end up with a completely empty map because the code expects only CID numbers. Furthermore, just fixing the glyph mapping alone isn't sufficient to fully address the bug, since we also need to consider this "special" kind of CMap-data when looking up glyph widths.	2021-09-30 18:10:47 +02:00
Tim van der Meij	cc110b8542	Merge pull request #14064 from Snuffleupagus/issue-13845 Fallback to font name matching, when checking for serif fonts (issue 13845)	2021-09-25 12:41:57 +02:00
Jonas Jenwald	9acfe486d4	Fallback to font name matching, when checking for serif fonts (issue 13845) In order to handle fonts that specify completely bogus /Flags-entries, fallback to font name matching to determine if the font is a serif one.	2021-09-23 01:11:57 +02:00
Jonas Jenwald	e027748627	[api-minor] Stop exporting, by default, a few additional Font properties (PR 11777 follow-up) This is similar to the "isSymbolicFont"-property, which is no longer exported by default after PR 11777. Both "isMonospace" and "isSerifFont" are internal properties, used during font parsing and building of the glyph mapping on the worker-thread. However both of these properties are completely unused on the main-thread and/or in the API, and accessing them will now require setting the `fontExtraProperties`-option when calling `getDocument`.	2021-09-23 00:44:43 +02:00
Jonas Jenwald	8ea27ce157	Tweak how fonts with an /Encoding are handled in `adjustToUnicode` (issue 14048, PR 13277 follow-up) Currently we only exclude /Encoding entries that also contain a /Differences array, which is the cause of the text-selection problem in the referenced issue. In order to address this we'll now also exclude /Encoding entries that contain one of the predefined named encodings, and no longer require that it also contains a /Differences array. Please note: This patch causes a small "regression" in the `bug1130815-text` test-case, however this is actually an improvement when compared with Adobe Reader and PDFium (in Google Chrome).	2021-09-18 22:44:25 +02:00
Jonas Jenwald	e3223b68fc	Extract some of the glyphMap handling, for non-embedded composite standard fonts, into a helper function This reduces some unnecessary duplication, since we currently have essentially the same code in a handful of places in the `Font.fallbackToSystemFont`-method.	2021-09-18 12:39:48 +02:00
Jonas Jenwald	a11343e9af	Improve glyph mapping for non-embedded composite standard fonts with a /CIDToGIDMap (issue 11915) Please note: All of this feels very handwavy, but at least it passes all tests locally. Hopefully we have enough tests for this part of the font code. For non-embedded composite standard fonts with an "incomplete" /CIDToGIDMap, we'll now fallback to an explicitly defined /ToUnicode map even when that one happens to be an /Identity-H or /Identity-V map. The `Font.fallbackToSystemFont` method is unfortunately getting more and more special-cases, however that might be unavoidable given all the weird non-embedded fonts found in the wild :-(	2021-09-15 11:30:40 +02:00
Jonas Jenwald	69034ab8dc	Improve glyph mapping for non-embedded composite standard fonts (issue 11088) For non-embedded CIDFontType2 fonts with a non-/Identity encoding, use the /ToUnicode data to improve the glyph mapping.	2021-09-08 15:15:33 +02:00
Jonas Jenwald	3ccf277f58	Fallback to the /ToUnicode map for TrueType fonts with (3, 1) and (1, 0) cmap-tables (issue 13316) In the PDF document some of the glyphs have bogus `differences`-entries[1] that cannot be resolved to valid glyph names, thus causing the glyph mapping to fail. My initial idea was to use a similar approach as in the `PartialEvaluator._simpleFontToUnicode`-method, to extract the charCodes from those entries, however it turned out that that didn't actually help in this case (the mapping was still wrong). To fix this I'm thus proposing that we fallback to the /ToUnicode map when no other useable data exists (e.g. no post-table), since it hopefully shouldn't make things any worse than leaving parts of the glyph map empty (which currently happens). --- [1] As can be seen below, some of the entries are completely normal while others are non-standard: ``` Differences (array) 0 = 65 1 = /g5167 2 = /space 3 = /g11927 4 = /g17737 5 = /g11540 6 = /g2180 7 = /K 8 = /P 9 = /two 10 = /zero 11 = /one 12 = /five 13 = /four 14 = /g6932 15 = /g7246 16 = /g1691 17 = /g2343 18 = /g14792 19 = /g3325 20 = /g4280 21 = /g20383 22 = /g18166 23 = /g16988 24 = /g17943 25 = /g19223 26 = /g10830 27 = 97 28 = /g982 29 = /g1226 30 = /g5059 31 = /g2677 32 = /g1042 33 = /g11568 34 = /L 35 = /three 36 = /seven 37 = /g2364 38 = /g12063 39 = /g5356 40 = /g2173 41 = /g17877 42 = /g7273 43 = /g7647 44 = /g7224 45 = /g19327 46 = /g5054 47 = /g2342 48 = /g10136 49 = /g6856 50 = /g13381 51 = /g7257 52 = /g12093 53 = /g2359 ```	2021-09-04 07:38:22 +02:00
Jonas Jenwald	b7b6076294	Always prefer the post-table for TrueType fonts with (0, x) cmap-tables (issue 13433) While I don't know if this is necessarily the "correct" solution, it does fix issue 13433 without breaking any of the existing reference-tests.	2021-09-01 12:35:49 +02:00
Brendan Dahl	a7f807b059	Only use base encoding if it's populated. (bug 1727053) The font dict in this file has an encoding entry, but only specifies a differences map. The base encoding is empty in this case and shouldn't be used.	2021-08-30 12:51:59 -07:00
Calixte Denizet	4a4591bd2c	XFA - Fix font scale factors (bug 1720888) - All the scale factors for the substitution font were wrong because of different glyph positions between Liberation and the other ones: - regenerate all the factors - Text may have Polish chars for example and in this case the glyph widths were wrong: - treat substitution font as a composite one - add a map glyphIndex to unicode for Liberation in order to generate width array for cid font	2021-07-28 19:10:42 +02:00
Calixte Denizet	76d882b560	XFA - Fix auto-sized fields (bug 1722030) - In order to better compute text fields size, use line height with no gaps (and consequently guessed height for text are slightly better in general). - Fix default background color in fields.	2021-07-28 09:43:15 +02:00
Calixte Denizet	58e1f51688	XFA - Fix text positions (bug 1718741) - font line height is taken into account by Acrobat when it isn't with masterpdfeditor: I extracted a font from a pdf, modified some ascent/descent properties thanks to ttx and then reinjected the font in the pdf: only Acrobat takes it into account. So in this patch, line heights for some substituted fonts are added. - it seems that Acrobat is using a line height of 1.2 when the line height in the font is not enough (it's the only way I found to fix correctly bug 1718741). - don't use flex in wrapper container (which was causing a horizontal overflow in the above bug). - consequently, the above fixes introduced a lot of small regressions, so in order to see real improvements on reftests, I fixed the regressions in this patch: - replace margin by padding in some cases where padding is a part of a container's dimensions; - remove some flex display: some containers are wrongly sized when rendered; - set letter-spacing to 0.01px: it helps to be sure that text is not broken because of not enough width in Firefox.	2021-07-09 18:11:12 +02:00
Jonas Jenwald	273d8cb746	Add non-PRODUCTION/TESTING overflow `assert`s to various string helper-functions (issue 6759)	2021-06-27 16:06:30 +02:00
Jonas Jenwald	50edd5da63	Suppress OTS warnings about the `caretOffset` in the hhea-table - https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6hhea.html - https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6head.html	2021-06-25 17:02:02 +02:00
Jonas Jenwald	185be678ec	Check that TrueType (3, 0) cmap tables, for symbolic fonts, are sorted correctly (issue 13626) According to a comment in `readCmapTable`, we're assuming that the cmap tables (when more than one exist) are sorted in ascending order. If that's not the case, keep checking the following cmap tables in order to fix the referenced issue.	2021-06-25 16:56:00 +02:00
Calixte Denizet	7cdbc98716	XFA - Match font family correctly - partial fix for https://bugzilla.mozilla.org/show_bug.cgi?id=1716980; - some pdf can contain an invalid font family (e.g. 'Windings 3') so in this case remove the space; - the font family in typeface attribute doesn't always match the one defined in the FontDescriptor dictionary.	2021-06-20 15:16:28 +02:00
Calixte Denizet	8eeb7ab4a3	XFA - Add the possibility to layout and measure text - some containers don't always have their 2 dimensions and those dimensions are based on contents; - so in order to measure text, we must get the glyph widths (for the xfa fonts) before starting the layout; - implement a word-wrap algorithm; - handle font change during text layout.	2021-06-17 14:17:02 +02:00
Jonas Jenwald	229a49b9b9	Re-factor the `fallbackToUnicode` functionality (PR 9192 follow-up) Rather than having to create and check a separate `ToUnicodeMap` to handle these cases, we can simply use the `fallbackToUnicode`-data (when it exists) to directly supplement missing /ToUnicode entries in the regular `ToUnicodeMap` instead.	2021-06-14 15:05:14 +02:00
Jonas Jenwald	7190bc23a8	Remove unnecessary `in` checks of Arrays, when building the `charCodeToGlyphId` for TrueType fonts Note that all standard Encodings have the same length (i.e. `256` elements) and that missing entries are always represented by empty strings, hence why a separate exists-check isn't necessary in the `baseEncoding` case.	2021-06-14 15:05:14 +02:00
Jonas Jenwald	3660aaac85	Tweak `adjustToUnicode` to allow extending a built-in /ToUnicode map This is somewhat similar to the recent changes, in PR 13277, for fonts with an /Encoding entry. Currently we're completely ignoring the `builtInEncoding`, from the font data itself, for fonts which have a built-in /ToUnicode map. While it (obviously) doesn't seem like a good idea in general to simply overwrite existing built-in /ToUnicode entries, it should however not hurt to use the `builtInEncoding` to supplement missing /ToUnicode entries.	2021-06-14 15:05:14 +02:00
Calixte Denizet	fd1110adb4	Add the possibility to rescale each glyph in a font - a lot of xfa files are using Myriad pro or Arial fonts without embedding them and some containers have some dimensions based on those font metrics. So not having the exact same font leads to a wrong display. - since it's pretty hard to find a replacement font with the exact same metrics, this patch gives the possibility to read glyf table, rescale each glyph and then write a new table. - so once PR #12726 is merged we could rescale for example Helvetica to replace Myriad Pro.	2021-06-09 16:01:13 +02:00
Jonas Jenwald	e7dc822e74	Merge pull request #12726 from brendandahl/standard-fonts [api-minor] Include and use the 14 standard font files.	2021-06-08 10:09:40 +02:00
Brendan Dahl	4c1dd47e65	Include and use the 14 standard fonts files.	2021-06-07 11:10:11 -07:00
Brendan Dahl	17f1857556	Add more info for showText operator in stepper. Adds a table that shows original char code, font char code, and unicode.	2021-06-04 13:58:05 -07:00
Tim van der Meij	0d56b1c365	Merge pull request #13443 from Snuffleupagus/charsCache Re-factor the `charsCache` on `Font`-instances	2021-05-28 21:29:57 +02:00
Calixte Denizet	0c698346b8	Fix Postscript name in font to avoid bug when saving in pdf - for xfa rendering, fonts are loaded and used in html; - when printed and saved in pdf, on linux, Firefox uses cairo backend - when subsetting a font, cairo uses the font postscript name and when this one is empty that leads to a bug (the append at `63f0d62684/src/cairo-cff-subset.c (L2049)` is failing because of null length) - so this patch adds a postscript name to the font to make cairo happy.	2021-05-27 12:45:40 +02:00
Jonas Jenwald	8b1d01816b	Re-factor the `charsCache` on `Font`-instances Currently `charsCache` is initialized lazily, which considering that it just contains a simple `Object` doesn't seem entirely necessary. This first of all forces us to do repeated exists-checks in the `Font.charsToGlyphs` method, and secondly the similar/related `glyphCache` is already initialized eagerly. Furthermore, this patch also does a bit of clean-up in the `Font.charsToGlyphs` method since this code is quite old.	2021-05-26 13:13:44 +02:00
Jonas Jenwald	1a8d05fdcf	Remove some, with Prettier `2.3.0`, unnecessary `// prettier-ignore` comments To get the maximum benefit from something like Prettier, you obviously don't want to disable the automatic formatting unless absolutely necessary. When we added Prettier there were a number of cases, mostly involving larger Arrays, which required disabling of the automatic formatting for overall readability and/or to not break inline comments. With changes in Prettier version `2.3.0`, see [the release notes](https://prettier.io/blog/2021/05/09/2.3.0.html#concise-formatting-of-number-only-arrays-10106httpsgithubcomprettierprettierpull10106-10160httpsgithubcomprettierprettierpull10160-by-thorn0httpsgithubcomthorn0), there's now better formatting support for Arrays containing only numbers. Hence we can now remove a number of `// prettier-ignore` comments, and thus get the benefit of automatic formatting in (slightly) more of the code-base.	2021-05-19 11:36:03 +02:00
Calixte Denizet	a74d19262a	XFA - Don't move glyphs in private area with non-truetype fonts - it has been done in PR #13146 but only for truetype fonts.	2021-05-17 16:52:39 +02:00
Jonas Jenwald	8943bcd3c3	Account for formatting changes in Prettier version `2.3.0` With the exception of one tweaked `eslint-disable` comment, in `web/generic_scripting.js`, this patch was generated automatically using `gulp lint --fix`. Please find additional information at: - https://github.com/prettier/prettier/releases/tag/2.3.0 - https://prettier.io/blog/2021/05/09/2.3.0.html	2021-05-16 11:44:05 +02:00
Jonas Jenwald	b487edd05d	Convert `src/core/fonts.js` to use standard classes Obviously the `Font`-class is still very large, given particularly how TrueType fonts are handled, however this patch-series at least improves things by moving a number of functions/classes into their own files. As a follow-up it might make sense to try and re-factor/extract the TrueType parsing into its own file, since all of this code is quite old, however that's probably best left for another time. For e.g. `gulp mozcentral`, the built `pdf.worker.js` file decreases from `1 620 332` to `1 617 466` bytes with this patch-series.	2021-05-03 13:57:25 +02:00
Jonas Jenwald	cadc20d8b9	Fix the remaining `no-var` failures, which couldn't be handled automatically, in the `src/core/fonts.js` file	2021-05-02 21:00:29 +02:00
Jonas Jenwald	b9cd080c01	Enable the `no-var` rule in the `src/core/fonts.js` file These changes were made automatically, using `gulp lint --fix`. Given the large size of this patch, the manual fixes are done separately in the next commit.	2021-05-02 21:00:29 +02:00
Jonas Jenwald	ff85bcfc0e	Move the `Type1Font` from `src/core/fonts.js` and into its own file	2021-05-02 21:00:29 +02:00
Jonas Jenwald	d5d73e3168	Move the `CFFFont` from `src/core/fonts.js` and into its own file	2021-05-02 21:00:29 +02:00
Jonas Jenwald	77b258440b	Move some constants and helper functions from `src/core/fonts.js` and into their own file - `FontFlags`, is used in both `src/core/fonts.js` and `src/core/evaluator.js`. - `getFontType`, same as the above. - `MacStandardGlyphOrdering`, is a fairly large data-structure and `src/core/fonts.js` is already a very large file. - `recoverGlyphName`, a dependency of `type1FontGlyphMapping`; please see below. - `SEAC_ANALYSIS_ENABLED`, is used by both `Type1Font`, `CFFFont`, and unit-tests; please see below. - `type1FontGlyphMapping`, is used by both `Type1Font` and `CFFFont` which a later patch will move to their own files.	2021-05-02 21:00:29 +02:00
Jonas Jenwald	6912bb5e0a	Move the `IdentityToUnicodeMap`/`ToUnicodeMap` from `src/core/fonts.js` and into its own file	2021-05-02 21:00:29 +02:00
