Commit Graph

123 Commits

Author SHA1 Message Date
Yury Delendik
31ae5f2a3d Merge pull request #5379 from brendandahl/nbsp
Don't map glyphs to unicode non breaking space.
2014-12-18 13:38:03 -06:00
Jonas Jenwald
96a77e9d6a Add basic support for non-embedded Wingdings fonts
This is a tentative patch that adds *very* basic support for non-embedded Wingdings fonts (a Windows version of Dingbats), by falling back to the ZapfDingbats encoding. Obviously this approach will not work perfectly, but in my opinion it seems to work reasonably well in pratice.

Instead of this very simple patch, another option would be to try and include more complete glyph data for Wingdings, e.g. a Unicode map and glyph widths, similar to what was done for ZapfDingbats.
However there is, in my opinion, one important difference between Wingdings and ZapfDingbats: ZapfDingbats is part of the 14 standard fonts, which in previous versions of the PDF specification was assumed to be available in PDF readers. To improve compatibility with older files, it thus makes sense for us to include data for ZapfDingbats.
However Wingdings has never been a standard font in PDF files, hence PDF files using it *should* thus contain all the necessary font data.

Given the above, I thus believe that it should be OK to fall back to ZapfDingbats for now. If non-embedded Wingdings fonts turns out to be *a lot* more common, then we can revisit this later.

Fixes 4301 completely.
Fixes 4837 almost completely. With this patch the bullets are displayed correctly, but the arrows are not of the correct type.
Fixes `artofwar.pdf`, pages 14 and 15.
2014-12-09 00:28:22 +01:00
Brendan Dahl
8a536ac346 Map missing glyphs in encoding to notdef glyph. 2014-10-03 12:11:20 -07:00
Brendan Dahl
2fc5e6a9ad Don't map glyphs to unicode non breaking space. 2014-10-02 10:58:56 -07:00
Jonas Jenwald
df2a4afd36 Use |toUnicode| when creating the glyph map for standard CIDFontType2 fonts without embedded font file 2014-09-27 13:20:04 +02:00
Yury Delendik
744c8e8d7e Merge pull request #5250 from Snuffleupagus/issue-5238
Fix Symbol fonts without font file but with Encoding dictionary (issue 5238)
2014-09-26 15:18:33 -05:00
Jonas Jenwald
3c759e296a Add support for MMType1 fonts with embedded font files 2014-09-18 16:10:46 +02:00
Jonas Jenwald
b16c973d9d Fix Symbol fonts without font file but with Encoding dictionary (issue 5238) 2014-09-16 21:38:53 +02:00
Yury Delendik
15681adbb9 Merge pull request #5245 from Snuffleupagus/issue-5244
Further amend GlyphMapForStandardFonts (issue 5244)
2014-09-16 10:12:07 -05:00
Brendan Dahl
403b7df6e7 Merge pull request #5233 from Snuffleupagus/bug-1057544
Workaround for TrueType fonts with exotic cmap tables (bug 1057544)
2014-09-15 14:47:31 -07:00
Jonas Jenwald
7b3f222787 Add |SpecialPUASymbols| map and refactor |mapSpecialUnicodeValues| 2014-09-04 13:41:15 +02:00
Jonas Jenwald
2d5596172c Add more cases to |mapSpecialUnicodeValues| to fix the rendering of various Symbol encoded brackets 2014-09-04 12:40:15 +02:00
Jonas Jenwald
4bda6ba1b8 Add basic support for ZapfDingbats 2014-09-03 21:54:04 +02:00
Jonas Jenwald
be595d0721 Further amend GlyphMapForStandardFonts (issue 5244) 2014-09-01 10:51:22 +02:00
Jonas Jenwald
cc8710acbf Workaround for TrueType fonts with exotic cmap tables (bug 1057544) 2014-08-23 11:27:41 +02:00
Jonas Jenwald
ae896fc071 Avoid creating intermediate strings in sanitizeMetrics
This patch avoids creating many intermediate strings, when adding dummy width/lsb entries for glyphs where those are missing.
For the relevant PDF files in our test suite, the average number of intermediate strings are well over 1000.
2014-08-20 23:55:57 +02:00
Yury Delendik
a2c2f81167 Use cff glyph width in the hmtx table 2014-08-14 16:11:09 -05:00
Yury Delendik
0ad323f621 Adds width at the beginning of the Type2 charstring 2014-08-13 21:15:40 -05:00
Nicholas Nethercote
61e6b576d4 Avoid an allocation in readCharCode().
readCharCode() returns two values, and currently allocates a length-2
array on every call to do so. This change makes it instead us a
passed-in object which can be reused.

This tiny change reduces the total JS allocations done for the document
in Mozilla bug 992125 by 4.2%.
2014-08-12 16:12:58 -07:00
Yury Delendik
ab8270ae3a Fixes searchRange calculation 2014-08-10 14:11:04 -05:00
Yury Delendik
42771159ca Removes stringToArray 2014-08-10 14:11:04 -05:00
Yury Delendik
350556f085 Removes bytesToString/stringToArray conversions in the font.js 2014-08-10 14:11:04 -05:00
Nicholas Nethercote
f82977caf9 Simplify isIdentityUnicode detection. 2014-08-08 02:02:42 -07:00
Nicholas Nethercote
6c8cca1284 Add IdentityToUnicodeMap class.
When loading the PDF from issue #4935, this change reduces peak RSS from
~2400 to ~300 MiB, and improves overall speed by ~81%, from 6336 ms to
1222 ms.
2014-08-07 20:45:11 -07:00
Nicholas Nethercote
9576047f0d Add ToUnicodeMap class. 2014-08-07 20:05:24 -07:00
Yury Delendik
46a9a35ddc Merge pull request #5071 from nnethercote/font-savings
Optimize a font-heavy document
2014-08-05 18:57:46 -05:00
Yury Delendik
fa53fcbf57 Merge pull request #5095 from Snuffleupagus/issue-5070
Adjust the heuristics to recognize more cases of unknown glyphs for |toUnicode| (issue 5070)
2014-08-05 17:41:38 -05:00
Yury Delendik
6865c284a7 Merge pull request #5111 from nnethercote/better-cidchars
Represent cid chars using integers, not strings.
2014-08-04 22:26:55 -05:00
Jonas Jenwald
8ecbb4da05 Adjust the heuristics to recognize more cases of unknown glyphs for |toUnicode| (issue 5070) 2014-08-03 21:18:23 +02:00
Jonas Jenwald
b918df3547 Re-factor heuristics to recognize unknown glyphs for |toUnicode| 2014-08-03 21:12:36 +02:00
Jonas Jenwald
97b3eadbc4 Add strict equalities in src/core/fonts.js 2014-08-01 21:56:03 +02:00
Nicholas Nethercote
adf58ed687 Represent cid chars using integers, not strings.
cid chars are 16-bit unsigned integers. Currently we convert them to
single-char strings when inserting them into the CMap, and then convert
them back to integers when extracting them from the CMap. This patch
changes CMap so that cid chars stay in integer format throughout, saving
both time and space.

When loading the PDF from issue #4580, this change reduces peak RSS from
~600 to ~370 MiB. It also improves overall speed on that PDF by ~26%,
going from 724 ms to 533 ms.
2014-08-01 02:35:17 -07:00
Nicholas Nethercote
b86daed29d Make CMap.map quasi-private.
This makes it easier for the representation to be improved.
2014-07-30 06:26:35 -07:00
Jonas Jenwald
c3c72948b9 Stop including cidmaps.js
In b5b94a4af3, i.e. PR #4259, we stopped using cidmaps.js. Despite that, it's still included when PDF.js is built. At almost 0.5 MB (and approx. 7000 lines), this is currently the single largest file in the codebase.
Including such a large file in the builds, when it is not actually used, seems extremely wasteful; hence this patch.
2014-07-25 21:53:09 +02:00
Tim van der Meij
62e6265fb3 Merge pull request #5074 from nnethercote/readPostScriptTable-join
Use Array.join to build up strings in readPostScriptTable().
2014-07-25 21:26:54 +02:00
Nicholas Nethercote
1039791472 Use Array.join to build up strings in readPostScriptTable().
This avoids about 5 MiB of string allocations on one test case.
2014-07-24 16:12:08 -07:00
Nicholas Nethercote
c7f02d2c8e Minimize memory usage of font-related arrays.
This patch replaces some vanilla arrays with typed arrays, and avoids
some array copying.

It reduces the peak RSS when viewing
http://www.dynacw.co.jp/Portals/3/fontsamplepdf/sample_4942546800828.pdf
from ~940 MiB to ~750 MiB, and reduces its load time from 83 to 76 ms.
2014-07-22 22:47:45 -07:00
Jonas Jenwald
f13c217b25 Fix another seac regression (issue 4801) 2014-07-22 21:44:13 +02:00
Jonas Jenwald
a7c786775d [CIDFontType2] Map characters missing in toUnicode to the private use area (bug 1028735 and issue 4881) 2014-07-05 00:18:51 +02:00
Jonas Jenwald
c5f4051a75 A few small optimizations of adjustMapping
Replace a couple of |in| checks with comparisons against undefined.
2014-06-27 00:59:42 +02:00
Jonas Jenwald
c121def806 A few small optimizations for CIDFontType2 fonts
Cache a constant length and replace one usage of |in| with a comparison against undefined.
2014-06-27 00:52:54 +02:00
Yury Delendik
10db93be29 Merge pull request #4980 from Snuffleupagus/bug-1027533
Additional heuristics to recognize unknown glyphs for toUnicode (bug 1027533)
2014-06-23 21:56:13 -05:00
Yury Delendik
c28839b2f3 Merge pull request #4944 from Snuffleupagus/issue-4934
Don't blindly trust toUnicode when building toFontChar for non-standard fonts without a font file (issue 4934)
2014-06-23 21:49:24 -05:00
Jonas Jenwald
b19bb74813 Additional heuristics to recognize unknown glyphs for toUnicode (bug 1027533) 2014-06-20 09:57:16 +02:00
Yury Delendik
0cd28ebfa3 Telemetry for used stream and font types 2014-06-16 16:41:04 -05:00
Jonas Jenwald
158790981c Don't blindly trust toUnicode when building toFontChar for non-standard fonts without a font file (issue 4934) 2014-06-14 22:59:08 +02:00
Jonas Jenwald
3c5dedf60d Prevent font error when no preferred cmap table is found (workaround for issue 4800) 2014-05-27 17:30:11 +02:00
Yury Delendik
e5a0d89da9 Refactors loadFont for translateFont be async; fixes type3 dup data 2014-05-19 16:27:54 -05:00
Jonas Jenwald
3e1db41ddd Fix loading of fonts with empty font files (bug 866395 and issue 3522) 2014-05-18 21:41:06 +02:00
Jonas Jenwald
0fa154be4e Amend GlyphMapForStandardFonts to fix issue 4276 2014-04-30 15:56:40 +02:00