Sakurai/pdf.js - pdf.js - Gitea on kemo

Sakurai/pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	695140728a	[src/core/fonts.js] Improve the `validateOS2Table` function Rather than creating a new `Stream` just to validate the OS/2 TrueType table, it's simpler/better to just pass in a reference to the font data and use that instead (similar to other TrueType helper functions).	2020-04-19 11:25:25 +02:00
Jonas Jenwald	033d27fc25	[src/core/fonts.js] Replace some unnecessary `Stream.getUint16()` calls with `Stream.skip(2)` instead There's a handful of cases in the code where the intention is simply to advance the `Stream` position, but rather than only doing that the code instead fetches/computes a Uint16 value (and without using the result for anything).	2020-04-19 11:18:20 +02:00
Jonas Jenwald	4fae1ac5c4	[src/core/fonts.js] Replace some unnecessary `Stream.getBytes(...)` calls with `Stream.skip(...)` instead There's a handful of cases in the code where the intention is simply to advance the `Stream` position, but rather than only doing that the code instead fetches the bytes in question (and without using the result for anything).	2020-04-19 11:18:15 +02:00
Jonas Jenwald	1cc3dbb694	Enable the `dot-notation` ESLint rule Please note: These changes were done automatically, using the `gulp lint --fix` command. This rule is already enabled in mozilla-central, see https://searchfox.org/mozilla-central/rev/567b68b8ff4b6d607ba34a6f1926873d21a7b4d7/tools/lint/eslint/eslint-plugin-mozilla/lib/configs/recommended.js#103-104 The main advantage, besides improved consistency, of this rule is that it reduces the size of the code (by 3 bytes for each case). In the PDF.js code-base there's close to 8000 instances being fixed by the `dot-notation` ESLint rule, which end up reducing the size of even the built files significantly; the total size of the `gulp mozcentral` build target changes from `3 247 456` to `3 224 278` bytes, which is a reduction of `23 178` bytes (or ~0.7%) for a completely mechanical change. A large number of these changes affect the (large) lookup tables used on the worker-thread, but given that they are still initialized lazily I don't think that the new formatting this patch introduces should undo any of the improvements from PR 6915. Please find additional details about the ESLint rule at https://eslint.org/docs/rules/dot-notation	2020-04-17 12:24:46 +02:00
Jonas Jenwald	426945b480	Update Prettier to version 2.0 Please note that these changes were done automatically, using `gulp lint --fix`. Given that the major version number was increased, there's a fair number of (primarily whitespace) changes; please see https://prettier.io/blog/2020/03/21/2.0.0.html In order to reduce the size of these changes somewhat, this patch maintains the old "arrowParens" style for now (once mozilla-central updates Prettier we can simply choose the same formatting, assuming it will differ here).	2020-04-14 12:28:14 +02:00
Tim van der Meij	70c54ab9d9	Merge pull request #11746 from Snuffleupagus/issue-11740 Create the glyph mapping correctly for composite Type1, i.e. CIDFontType0, fonts (issue 11740)	2020-04-07 00:10:12 +02:00
Jonas Jenwald	2d46230d23	[api-minor] Change `Font.exportData` to, by default, stop exporting properties which are completely unused on the main-thread and/or in the API (PR 11773 follow-up) For years now, the `Font.exportData` method has (because of its previous implementation) been exporting many properties despite them being completely unused on the main-thread and/or in the API. This is unfortunate, since among those properties there's a number of potentially very large data-structures, containing e.g. Arrays and Objects, which thus have to be first structured cloned and then stored on the main-thread. With the changes in this patch, we'll thus by default save memory for every `Font` instance created (there can be a lot in longer documents). The memory savings obviously depends a lot on the actual font data, but some approximate figures are: For non-embedded fonts it can save a couple of kilobytes, for simple embedded fonts a handful of kilobytes, and for composite fonts the size of this auxiliary can even be larger than the actual font program itself. All-in-all, there's no good reason to keep exporting these properties by default when they're unused. However, since we cannot be sure that every property is unused in custom implementations of the PDF.js library, this patch adds a new `getDocument` option (named `fontExtraProperties`) that still allows access to the following properties: - "cMap": An internal data structure, only used with composite fonts and never really intended to be exposed on the main-thread and/or in the API. Note also that the `CMap`/`IdentityCMap` classes are a lot more complex than simple Objects, but only their "internal" properties survive the structured cloning used to send data to the main-thread. Given that CMaps can often be very large, not exporting them can also save a fair bit of memory. - "defaultEncoding": An internal property used with simple fonts, and used when building the glyph mapping on the worker-thread. Considering how complex that topic is, and given that not all font types are handled identically, exposing this on the main-thread and/or in the API most likely isn't useful. - "differences": An internal property used with simple fonts, and used when building the glyph mapping on the worker-thread. Considering how complex that topic is, and given that not all font types are handled identically, exposing this on the main-thread and/or in the API most likely isn't useful. - "isSymbolicFont": An internal property, used during font parsing and building of the glyph mapping on the worker-thread. - "seacMap": An internal map, only potentially used with some Type1/CFF fonts and never intended to be exposed in the API. The existing `Font.{charToGlyph, charToGlyphs}` functionality already takes this data into account when handling text. - "toFontChar": The glyph map, necessary for mapping characters to glyphs in the font, which is built upon the various encoding information contained in the font dictionary and/or font program. This is not directly used on the main-thread and/or in the API. - "toUnicode": The unicode map, necessary for text-extraction to work correctly, which is built upon the ToUnicode/CMap information contained in the font dictionary, but not directly used on the main-thread and/or in the API. - "vmetrics": An array of width data used with fonts which are composite and vertical, but not directly used on the main-thread and/or in the API. - "widths": An array of width data used with most fonts, but not directly used on the main-thread and/or in the API.	2020-04-06 11:47:09 +02:00
Jonas Jenwald	938d519192	Create the glyph mapping correctly for composite Type1, i.e. CIDFontType0, fonts (issue 11740) This updates `Type1Font.getGlyphMapping` with a code-path "borrowed" from `CFFFont.getGlyphMapping`.	2020-04-06 11:21:02 +02:00
Tim van der Meij	0400109b87	Merge pull request #11773 from Snuffleupagus/Font-exportData-1 [api-minor] Change `Font.exportData` to use an explicit white-list of exportable properties, and stop exporting internal/unused properties	2020-04-05 20:50:33 +02:00
Jonas Jenwald	59f54b946d	Ensure that all `Font` instances have the `vertical` property set to a boolean Given that the `vertical` property is always accessed on the main-thread, ensuring that the property is explicitly defined seems like the correct thing to do since it also avoids boolean casting elsewhere in the code-base.	2020-04-05 16:27:50 +02:00
Jonas Jenwald	c5e1fd3fde	Use "standard" shadowing in the `Font.spaceWidth` method With `Font.exportData` now only exporting white-listed properties, there should no longer be any reason to not use standard shadowing in the `Font.spaceWidth` method. Furthermore, considering the amount of other changes to the code-base over the years it's not even clear to me that the special-case was necessary any more (regardless of the preceding patches).	2020-04-05 16:27:50 +02:00
Jonas Jenwald	a5e4cccf13	[api-minor] Prevent `Font.exportData` from exporting internal/unused properties A number of internal font properties, which only make sense on the worker-thread, were previously exported. Some of these properties could also contain potentially large Arrays/Objects, which thus unnecessarily increases memory usage since we're forced to copy these to the main-thread and also store them there. This patch stops exporting the following font properties: - "_shadowWidth": An internal property, which was never intended to be exported. - "charsCache": An internal cache, which was never intended to be exported and doesn't make any sense on the main-thread. Furthermore, by the time `Font.exportData` is called it's usually `undefined` or a mostly empty Object as well. - "cidEncoding": An internal property used with (some) composite fonts. As can be seen in the `PartialEvaluator.translateFont` method, `cidEncoding` will only be assigned a value when the font dictionary has an "Encoding" entry which is a `Name` (and not in the `Stream` case, since those obviously cannot be cloned). All-in-all this property doesn't really make sense on the main-thread and/or in the API, and note also that the resulting `cMap` property is (partially) available already. - "fallbackToUnicode": An internal map, part of the heuristics used to improve text-selection in (some) badly generated PDF documents with simple fonts. This was never intended to be exposed on the main-thread and/or in the API. - "glyphCache": An internal cache, which was never intended to be exported and which doesn't make any sense on the main-thread. Furthermore, by the time `Font.exportData` is called it's usually a mostly empty Object as well. - "isOpenType": An internal property, used only during font parsing on the worker-thread. In the very unlikely event that an API consumer actually needs that information, then `fontType` should be a (generally) much better property to use. Finally, in the (hopefully) unlikely event that any of these properties become necessary on the main-thread, re-adding them to the white-list is easy to do.	2020-04-05 16:27:50 +02:00
Jonas Jenwald	664f7de540	Change `Font.exportData` to use an explicit white-list of exportable properties This patch addresses an existing, and very long standing, TODO in the code such that it's no longer possible to send arbitrary/unnecessary font properties to the main-thread. Furthermore, by having a white-list it's also very easy to see exactly which font properties are being exported. Please note that in its current form, the list of exported properties contains every possible enumerable property that may exist in a `Font` instance. In practice no single font will contain all of these properties, and e.g. embedded/non-embedded/Type3 fonts will all differ slightly with respect to what properties are being defined. Hence why only explicitly set properties are included in the exported data, to avoid half of them being `undefined`, which however should not be a problem for any existing consumer (since they'd already need to handle those cases). Since a fair number of these font properties are completely internal functionality, and doesn't make any sense to expose on the main-thread and/or in the API, follow-up patch(es) will be required to trim down the list. (I purposely included all properties here for brevity and future documentation purposes.)	2020-04-05 16:27:48 +02:00
Jonas Jenwald	87142a635e	Ensure that `Font.charToGlyph` won't fail because `String.fromCodePoint` is given an invalid code point (issue 11768) Please note: This patch on its own is not sufficient to address the underlying problem in the referenced issue, hence why no test-case is included since the actual bug still needs to be fixed. As can be seen in the specification, https://tc39.es/ecma262/#sec-string.fromcodepoint, `String.fromCodePoint` will throw a RangeError for invalid code points. In the event that a CMap, in a composite font, contains invalid data and/or we fail to parse it correctly, it's thus possible that the glyph mapping that we build end up with entires that cause `String.fromCodePoint` to throw and thus `Font.charToGlyph` to break. If that happens, as is the case in issue 11768, significant portions of a page/document may fail to render which seems very unfortunate. While this patch doesn't fix the underlying problem, it's hopefully deemed useful not only for the referenced issue but also to prevent similar bugs in the future.	2020-04-03 09:49:50 +02:00
Jonas Jenwald	14c999e3ee	Remove the unused `sizes` and `encoding` properties on `Font` instances The `sizes` property doesn't appear to have been used ever since the code was first split into main/worker-threads, which is so many years ago that I wasn't able to easily find exactly in which PR/commit it became unused. The `encoding` property is always assigned the `properties.baseEncoding` value, however the `PartialEvaluator` doesn't actually compute/set that value any more. Again it was difficult to determine when it became unused, but it's been that way for years.	2020-03-27 10:12:01 +01:00
Jonas Jenwald	dcb16af968	Whitelist closure related cases to address the remaining `no-shadow` linting errors Given the way that "classes" were previously implemented in PDF.js, using regular functions and closures, there's a fair number of false positives when the `no-shadow` ESLint rule was enabled. Note that while some of these `eslint-disable` statements can be removed if/when the relevant code is converted to proper `class`es, we'll probably never be able to get rid of all of them given our naming/coding conventions (however I don't really see this being a problem).	2020-03-25 11:57:12 +01:00
Jonas Jenwald	216cbca16c	Remove variable shadowing from the JavaScript files in the `src/core/` folder This is part of a series of patches that will try to split PR 11566 into smaller chunks, to make reviewing more feasible. Once all the code has been fixed, we'll be able to eventually enable the ESLint no-shadow rule; see https://eslint.org/docs/rules/no-shadow	2020-03-23 18:28:30 +01:00
Jonas Jenwald	1cd9d5a8fd	Remove the unused `wideChars` property on `Font` instances This property was added in PR 1599 (almost eight years ago), but has been unused ever since PR 3674 (six and a half years ago).	2020-03-20 10:37:32 +01:00
Jonas Jenwald	e011be037e	Enable the `prefer-exponentiation-operator` ESLint rule Please see https://eslint.org/docs/rules/prefer-exponentiation-operator for additional information.	2020-03-19 12:41:25 +01:00
Jonas Jenwald	c5f67300e9	Rename the `isSpace` helper function to `isWhiteSpace` Trying to enable the ESLint rule `no-shadow`, against the `master` branch, would result in a fair number of errors in the `Glyph` class in `src/core/fonts.js`. Since the glyphs are exposed through the API, we can't very well change the `isSpace` property on `Glyph` instances. Thus the best approach seems, at least to me, to simply rename the `isSpace` helper function to `isWhiteSpace` which shouldn't cause any issues given that it's only used in the `src/core/` folder.	2020-03-12 11:36:59 +01:00
Jonas Jenwald	7937165537	Ignore spaces when normalizing the font name in `Font.fallbackToSystemFont` (issue 11578)	2020-02-08 19:59:04 +01:00
Brendan Dahl	09a6e17d22	Merge pull request #11528 from janpe2/type1-nonemb-notdef Hide .notdef glyphs in non-embedded Type1 fonts and don't ignore Widths	2020-02-06 13:30:07 -08:00
Jonas Jenwald	3f031f69c2	Move additional worker-thread only functions from `src/shared/util.js` and into a `src/core/core_utils.js` instead This moves the `log2`, `readInt8`, `readUint16`, `readUint32`, and `isSpace` functions since they are only used in the worker-thread.	2020-01-25 00:33:52 +01:00
Jonas Jenwald	9e262ae7fa	Enable the ESLint `prefer-const` rule globally (PR 11450 follow-up) Please find additional details about the ESLint rule at https://eslint.org/docs/rules/prefer-const With the recent introduction of Prettier this sort of mass enabling of ESLint rules becomes a lot easier, since the code will be automatically reformatted as necessary to account for e.g. changed line lengths. Note that this patch is generated automatically, by using the ESLint `--fix` argument, and will thus require some additional clean-up (which is done separately).	2020-01-25 00:20:22 +01:00
Tim van der Meij	a88dec197f	Merge pull request #11511 from Snuffleupagus/eslint-no-nested-ternary Enable the `no-nested-ternary` ESLint rule (PR 11488 follow-up)	2020-01-22 22:52:59 +01:00
Jonas Jenwald	3b78f4e8f8	Fix a couple of cases where Prettier broke existing formatting (PR 11446 follow-up) These two cases should have been whitelisted prior to re-formatting respectively had the comments fixed afterwards, however I unfortunately missed them because of the massive size of the diff.	2020-01-22 09:12:12 +01:00
Jani Pehkonen	809b96b40c	Hide .notdef glyphs in non-embedded Type1 fonts and don't ignore Widths Fixes #11403 The PDF uses the non-embedded Type1 font Helvetica. Character codes 194 and 160 (`Â` and `NBSP`) are encoded as `.notdef`. We shouldn't show those glyphs because it seems that Acrobat Reader doesn't draw glyphs that are named `.notdef` in fonts like this. In addition to testing `glyphName === ".notdef"`, we must test also `glyphName === ""` because the name `""` is used in `core/encodings.js` for undefined glyphs in encodings like `WinAnsiEncoding`. The solution above hides the `Â` characters but now the replacement character (space) appears to be too wide. I found out that PDF.js ignores font's `Widths` array if the font has no `FontDescriptor` entry. That happens in #11403, so the default widths of Helvetica were used as specified in `core/metrics.js` and `.nodef` got a width of 333. The correct width is 0 as specified by the `Widths` array in the PDF. Thus we must never ignore `Widths`.	2020-01-21 21:35:25 +02:00
Jonas Jenwald	c591826f3b	Enable the `no-nested-ternary` ESLint rule (PR 11488 follow-up) This rule is already enabled in mozilla-central, and helps avoid some confusing formatting, see https://searchfox.org/mozilla-central/rev/9e45d74b956be046e5021a746b0c8912f1c27318/tools/lint/eslint/eslint-plugin-mozilla/lib/configs/recommended.js#209-210 With the recent introduction of Prettier some of the existing nested ternary statements became even more difficult to read, since any possibly helpful indentation was removed. This particular ESLint rule wasn't entirely straightforward to enable, and I do recognize that there's a certain amount of subjectivity in the changes being made. Generally, the changes in this patch fall into three categories: - Cases where a value is only clamped to a certain range (the easiest ones to update). - Cases where the values involved are "simple", such as Numbers and Strings, which are re-factored to initialize the variable with the default value and only update it when necessary by using `if`/`else if` statements. - Cases with more complex and/or larger values, such as TypedArrays, which are re-factored to let the variable be (implicitly) undefined and where all values are then set through `if`/`else if`/`else` statements. Please find additional details about the ESLint rule at https://eslint.org/docs/rules/no-nested-ternary	2020-01-14 17:49:39 +01:00
Jonas Jenwald	36881e3770	Ensure that all `import` and `require` statements, in the entire code-base, have a `.js` file extension In order to eventually get rid of SystemJS and start using native `import`s instead, we'll need to provide "complete" file identifiers since otherwise there'll be MIME type errors when attempting to use `import`.	2020-01-04 13:01:43 +01:00
Jonas Jenwald	a63f7ad486	Fix the linting errors, from the Prettier auto-formatting, that ESLint `--fix` couldn't handle This patch makes the follow changes: - Remove no longer necessary inline `// eslint-disable-...` comments. - Fix `// eslint-disable-...` comments that Prettier moved down, thus causing new linting errors. - Concatenate strings which now fit on just one line. - Fix comments that are now too long. - Finally, and most importantly, adjust comments that Prettier moved down, since the new positions often is confusing or outright wrong.	2019-12-26 12:35:12 +01:00
Jonas Jenwald	de36b2aaba	Enable auto-formatting of the entire code-base using Prettier (issue 11444) Note that Prettier, purposely, has only limited [configuration options](https://prettier.io/docs/en/options.html). The configuration file is based on [the one in `mozilla central`](https://searchfox.org/mozilla-central/source/.prettierrc) with just a few additions (to avoid future breakage if the defaults ever changes). Prettier is being used for a couple of reasons: - To be consistent with `mozilla-central`, where Prettier is already in use across the tree. - To ensure a consistent coding style everywhere, which is automatically enforced during linting (since Prettier is used as an ESLint plugin). This thus ends "all" formatting disussions once and for all, removing the need for review comments on most stylistic matters. Many ESLint options are now redundant, and I've tried my best to remove all the now unnecessary options (but I may have missed some). Note also that since Prettier considers the `printWidth` option as a guide, rather than a hard rule, this patch resorts to a small hack in the ESLint config to ensure that comments won't become too long. Please note: This patch is generated automatically, by appending the `--fix` argument to the ESLint call used in the `gulp lint` task. It will thus require some additional clean-up, which will be done in a separate commit. (On a more personal note, I'll readily admit that some of the changes Prettier makes are extremely ugly. However, in the name of consistency we'll probably have to live with that.)	2019-12-26 12:34:24 +01:00
Jonas Jenwald	8ec1dfde49	Add `// prettier-ignore` comments to prevent re-formatting of certain data structures There's a fair number of (primarily) `Array`s/`TypedArray`s whose formatting we don't want disturb, since in many cases that would lead to the code becoming much more difficult to read and/or break existing inline comments. Please note: It may be a good idea to look through these cases individually, and possibly re-write some of the them (especially the `String` ones) to reduce the need for all of these ignore commands.	2019-12-26 00:14:03 +01:00
Jonas Jenwald	2fcb5afc7b	Add a fallback for non-embedded composite Verdana fonts (issue 11242) Obviously this won't look exactly right, but considering that the PDF file doesn't bother embedding non-standard fonts this is the best that we can do here.	2019-10-17 17:00:55 +02:00
Pedro Luiz Cabral Salomon Prado	4d0c759b7f	Change variable assignment (#11247 ) Remove unused variable assignment in `src/core/fonts.js`	2019-10-16 00:39:25 +02:00
Tim van der Meij	ca3a58f93a	Consistently use `@returns` for returned data types in JSDoc comments Sometimes we also used `@return`, but `@returns` is what the JSDoc documentation recommends. Even though `@return` works as an alias, it's good to use the recommended syntax and to be consistent within the project.	2019-10-13 13:58:17 +02:00
Jonas Jenwald	af22dc9b0c	For Type1 fonts, replace missing font dictionary /Widths entries with ones from the font data (issue 11150) Hopefully this patch makes sense, and in order to reduce the regression risk the implementation ensures that only completely missing widths are being replaced.	2019-09-18 10:15:09 +02:00
Jani Pehkonen	05c527f035	Fix glyph 0 in CIDFontType2 that has a CIDToGIDMap stream	2019-05-07 18:44:37 +03:00
Jani Pehkonen	49c6233fbc	Use CMap in Type0 fonts when CFF is not a CID font	2019-03-26 19:38:44 +02:00
Brendan Dahl	7d6ab081eb	Put the string name of the glyph in the charset array. Also, only warn once per font when missing a glyph name.	2019-03-01 18:03:51 -08:00
Brendan Dahl	34022d2fd1	Merge pull request #10591 from brendandahl/fix-charset Add unique glyph names for CFF fonts.	2019-02-28 17:22:29 -08:00
Brendan Dahl	8a596ef5d5	Add unique glyph names for CFF fonts. Printing on MacOS was broken with the previous approach of just mapping all the glyphs to notdef.	2019-02-27 15:00:29 -08:00
Jonas Jenwald	db5dc14158	Move worker-thread only functions from `src/shared/util.js` and into a new `src/core/core_utils.js` file The `src/shared/util.js` file is being bundled into both the `pdf.js` and `pdf.worker.js` files, meaning that its code is by definition duplicated. Some main-thread only utility functions have already been moved to a separate `src/display/display_utils.js` file, and this patch simply extends that concept to utility functions which are used only on the worker-thread. Note in particular the `getInheritableProperty` function, which expects a `Dict` as input and thus cannot possibly ever be used on the main-thread.	2019-02-24 00:35:39 +01:00
Jonas Jenwald	b6d090cc14	Fallback to the built-in font renderer when font loading fails After PR 9340 all glyphs are now re-mapped to a Private Use Area (PUA) which means that if a font fails to load, for whatever reason[1], all glyphs in the font will now render as Unicode glyph outlines. This obviously doesn't look good, to say the least, and might be seen as a "regression" since previously many glyphs were left in their original positions which provided a slightly better fallback[2]. Hence this patch, which implements a general fallback to the PDF.js built-in font renderer for fonts that fail to load (i.e. are rejected by the sanitizer). One caveat here is that this only works for the Font Loading API, since it's easy to handle errors in that case[3]. The solution implemented in this patch does not in any way delay the loading of valid fonts, which was the problem with my previous attempt at a solution, and will only require a bit of extra work/waiting for those fonts that actually fail to load. Please note: This patch doesn't fix any of the underlying PDF.js font conversion bugs that's responsible for creating corrupt font files, however it does improve rendering in a number of cases; refer to this possibly incomplete list: [Bug 1524888](https://bugzilla.mozilla.org/show_bug.cgi?id=1524888) Issue 10175 Issue 10232 --- [1] Usually because the PDF.js font conversion code wasn't able to parse the font file correctly. [2] Glyphs fell back to some default font, which while not accurate was more useful than the current state. [3] Furthermore I'm not sure how to implement this generally, assuming that's even possible, and don't really have time/interest to look into it either.	2019-02-11 10:27:08 +01:00
Jonas Jenwald	24a688d6c6	Convert some usage of `indexOf` to `startsWith`/`includes` where applicable In many cases in the code you don't actually care about the index itself, but rather just want to know if something exists in a String/Array or if a String starts in a particular way. With modern JavaScript functionality, it's thus possible to remove a number of existing `indexOf` cases.	2019-01-18 17:57:41 +01:00
Brendan Dahl	32eace043b	Fix reading number of HTMX metrics. The length of the HHEA table can be incorrect, so it is better to read the number of metrics offset from beginning of table instead.	2019-01-04 15:13:13 -08:00
Jonas Jenwald	ad3e937816	Replace the `Font.loading` property with, the already existing, `Font.missingFile` property The `Font.loading` property is only ever used once in the code, whereas `Font.missingFile` is more widely used. Furthermore the name `loading` feels, at least to me, slight less clear than `missingFile`. Finally, note that these two properties are the inverse of each other.	2018-09-29 15:57:04 +02:00
Brendan Dahl	b76cf665ec	Map all glyphs to the private use area and duplicate the first glyph. There have been lots of problems with trying to map glyphs to their unicode values. It's more reliable to just use the private use areas so the browser's font renderer doesn't mess with the glyphs. Using the private use area for all glyphs did highlight other issues that this patch also had to fix: * small private use area - Previously, only the BMP private use area was used which can't map many glyphs. Now, the (much bigger) PUP 16 area can also be used. * glyph zero not shown - Browsers will not use the glyph from a font if it is glyph id = 0. This issue was less prevalent when we mapped to unicode values since the fallback font would be used. However, when using the private use area, the glyph would not be drawn at all. This is illustrated in one of the current test cases (issue #8234) where there's an "ä" glyph at position zero. The PDF looked like it rendered correctly, but it was actually not using the glyph from the font. To properly show the first glyph it is always duplicated and appended to the glyphs and the maps are adjusted. * supplementary characters - The private use area PUP 16 is 4 bytes, so String.fromCodePoint must be used where we previously used String.fromCharCode. This is actually an issue that should have been fixed regardless of this patch. * charset - Freetype fails to load fonts when the charset size doesn't match number of glyphs in the font. We now write out a fake charset with the correct length. This also brought up the issue that glyphs with seac/endchar should only ever write a standard charset, but we now write a custom one. To get around this the seac analysis is permanently enabled so those glyphs are instead always drawn as two glyphs.	2018-09-05 14:04:54 -07:00
Jonas Jenwald	06d1ff5af4	Tweak the MMType1 font detection in `getFontFileType` to improve font telemetry (PR 9961 follow-up) Please note that this patch does not affect rendering in any way, however it's relevant for font telemetry[1]. According to the specification, see https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G8.1904956, Type1C is a valid subtype for both Type1 and MMType1 fonts. --- [1] Refer to the font telemetry results in https://telemetry.mozilla.org/new-pipeline/dist.html#!cumulative=0&end_date=2018-06-25&keys=__none__!__none__!__none__&max_channel_version=nightly%252F62&measure=PDF_VIEWER_FONT_TYPES&min_channel_version=nightly%252F59&processType=*&product=Firefox&sanitize=1&sort_keys=submissions&start_date=2018-05-07&table=0&trim=1&use_submission_date=0 See also https://github.com/mozilla/pdf.js/wiki/Enumeration-Assignments-for-the-Telemetry-Histograms#pdf_viewer_font_types for help with interpreting the data.	2018-08-08 12:18:37 +02:00
Tim van der Meij	4111871ac5	Merge pull request #9958 from brendandahl/always-fallback Always fallback to system font on font failure.	2018-08-05 19:58:48 +02:00
Jonas Jenwald	3177f6aa55	Parse the font file to determine the correct type/subtype, rather than relying on the (often incorrect) data in the font dictionary The current font type/subtype detection code is quite inconsistent/unwieldy. In some cases it will simply assume that the font dictionary is correct, in others it will somewhat "arbitrarily" check the actual font file (more of these cases have been added over the years to fix specific bugs). As is evident from e.g. issue 9949, the font type/subtype detection code is continuing to cause issues. In an attempt to get rid of these hacks once and for all, this patch instead re-factors the type/subtype detection to always parse the font file. Please note that, as far as I can tell, we still appear to need to rely on the composite font detection based on the font dictionary. However, even if the composite/non-composite detection would get it wrong, that shouldn't really matter too much given that there's basically only two different code-paths (for "TrueType-like" vs "Type1-like" fonts).	2018-08-05 11:13:16 +02:00

1 2 3 4 5 ...