Commit Graph

148 Commits

Author SHA1 Message Date
Brendan Dahl
3eaeacfe19 Merge pull request #6476 from Snuffleupagus/PartialEvaluator_readToUnicode-cmap-length
Right-size the `map` array in PartialEvaluator_readToUnicode
2015-10-09 10:31:28 -07:00
Jonas Jenwald
1b8cb52555 Prevent PartialEvaluator_buildFormXObject from failing if the Matrix or BBox contains indirect objects
This patch fixes yet another instance of bad PDF data, specifically a case where the `BBox` array contains indirect objects (i.e. `Ref`s).

Fixes the missing image in http://www.int.washington.edu/talks/WorkShops/int_08_37W/People/Franz_M/Franz.pdf#page=24. *Note:* There are missing images on a number of the pages in that file.
2015-09-29 10:11:49 +02:00
Jonas Jenwald
8d831449ab Right-size the map array in PartialEvaluator_readToUnicode
We can avoid a lot of intermediate resizings, by directly allocating the required number of elements for the `map` array.
2015-09-24 13:08:53 +02:00
Fabian Lange
2564827503 Fix text spacing with vertical fonts (#6387)
According to the PDF spec 5.3.2, a positive value means in horizontal,
that the next glyph is further to the left (so narrower), and in
vertical that it is further down (so wider).
This change fixes the way PDF.js has interpreted the value.
2015-09-15 09:28:45 +02:00
Rob Wu
b0a8c0fa40 cmaps: Use cmap.forEach instead of Array.forEach
CMaps may be sparse. Array.prototype.forEach is terribly slow in Chrome
(and also in Firefox) when the sparse array contains a key with a high
value. E.g.

    console.time('forEach sparse')
    var a = [];
    a[0xFFFFFF] = 1;
    a.forEach(function(){});
    console.timeEnd('forEach sparse');

    // Chrome: 2890ms
    // Firefox: 1345ms

Switching to CMap.prototype.forEach, which is optimized for such
scenarios fixes the problem.
2015-08-08 13:30:30 +02:00
Jonas Jenwald
46a8485db4 Ignore paint form XObject when the name is missing (issue 4558)
Fixes 4558 (since the font issues already appear to be fixed).
2015-06-22 22:10:26 +02:00
Tim van der Meij
90982332bf Merge pull request #5995 from CodingFabian/tweak-char-spacing-text-selection
Apply char spacing only when there are chars.
2015-05-14 20:06:22 +02:00
Fabian Lange
c2013094e7 Apply char spacing only when there are chars. 2015-05-13 23:45:20 +02:00
Tim van der Meij
b34366d2fc Merge pull request #5898 from stri8ed/master
Extract more accurate glyph heights from type3 fonts
2015-05-13 21:07:17 +02:00
Jonas Jenwald
6d2d854f65 Merge pull request #5815 from Snuffleupagus/type1-diff-refs
Ensure that entries in the Differences array of Type1 fonts are either numbers or names
2015-05-07 22:33:23 +02:00
Brendan Dahl
cd53cbe7d4 Merge pull request #5964 from Snuffleupagus/bug-1157493
Handle the Encoding being a dictionary in PartialEvaluator_preEvaluateFont (bug 1157493)
2015-05-05 14:41:32 -07:00
Jonas Jenwald
760222cf0b Handle the Encoding being a dictionary in PartialEvaluator_preEvaluateFont (bug 1157493)
*This is a regression from PR 4423.*

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1157493.
2015-04-25 16:48:14 +02:00
Jonas Jenwald
7c7d05e7a3 Attempt to infer if a CMap file actually contains just a standard Identity-H/Identity-V map 2015-04-25 11:28:33 +02:00
Jonas Jenwald
4c2ad3bc7b Ensure that entries in the Differences array of Type1 fonts are either numbers or names
This patch is yet another installment in the (never ending) series of bugs in PDF files with non-embedded fonts.

Fixes http://www.int.washington.edu/talks/WorkShops/int_08_37W/People/Franz_M/Franz.pdf.
2015-04-17 20:32:27 +02:00
Levi Melamed
a5159a7942 extract more accurate glpyh heights from type-3 fonts 2015-04-03 08:49:06 -05:00
Brendan Dahl
3a8d4a7d72 Merge pull request #5713 from Snuffleupagus/evaluator-IdentityToUnicodeMap
Create a IdentityToUnicodeMap in evaluator.js when toUnicode contains IdentityH/IdentityV
2015-03-25 10:33:29 -07:00
Hengjie
109d67691c Lower threshold
Fixes text selection formatting with https://github.com/vortext/vortext/blob/master/resources/public/examples/TestDocument3.pdf
2015-02-13 22:27:49 -08:00
Jonas Jenwald
f19a1db414 Create a IdentityToUnicodeMap in evaluator.js when toUnicode contains IdentityH/IdentityV
Currently if a font contains a `toUnicode` entry, we always create a new `ToUnicodeMap` in evaluator.js. This is done even for `IdentityV/IdentityH`, despite to possibility to use the much more compact `IdentityToUnicodeMap` representation.
This patch refactors the `IdentityH/IdentityV` cases, to:
 - Avoid calling `IdentityCMap.getMap`, since this prevents allocating and iterating through an array with 65536 elements.

 - Ensure that the handling of `toUnicode` is actually correct in fonts.js.
We rely on `toUnicode instanceof IdentityToUnicodeMap` in a few places, and currently this does not work correctly for `IdentityH/IdentityV`.
2015-02-09 16:52:31 +01:00
Yury Delendik
f5df30f967 Merge pull request #5445 from CodingFabian/fixImageCachingInParser
Fixes caching of inline images during parsing.
2014-12-15 10:51:23 -06:00
palkan
4764c52b5b fix passing null as Promise's onFullfilled (which is broken in Chrome 32) 2014-11-25 16:40:27 +04:00
Fabian Lange
970c048d50 fixes caching of inline images during parsing.
As described in #5444, the evaluator will perform identity checking of
paintImageMaskXObjects to decide if it can use
paintImageMaskXObjectRepeat instead of paintImageMaskXObjectGroup.

This can only ever work if the entry is a cache hit. However the
previous caching implementation was doing a lazy caching, which would
only consider a image cache worthy if it is repeated.
Only then the repeated instance would be cached.
As a result of this the sequence of identical images A1 A2 A3 A4 would
be seen as A1 A2 A2 A2 by the evaluator, which prevents using the
"repeat" optimization. Also only the last encountered image is cached,
so A1 B1 A2 B2, would stay A1 B1 A2 B2.

The new implementation drops the "lazy" init of the cache. The threshold
for enabling an image to be cached is rather small, so the potential waste
in storage and adler32 calculation is rather low. It also caches any
eligible image by its adler32.

The two example from above would now be A1 A1 A1 A1 and A1 B1 A1 B1
which not only saves temporary storage, but also prevents computing
identical masks over and over again (which is the main performance impact
of #2618)
2014-10-28 15:39:41 +01:00
Jonas Jenwald
4bda6ba1b8 Add basic support for ZapfDingbats 2014-09-03 21:54:04 +02:00
Nicholas Nethercote
96b9af68dd Remove setGStateForKey() closure.
setGStateForKey() is a closure that serves no particularly useful
purpose. This change inlines it at the single call site. This avoids 1.7
MiB of allocations (because closures are objects) for the MTA map
mentioned in https://bugzilla.mozilla.org/show_bug.cgi?id=835380#c17.
2014-08-17 22:21:45 -07:00
Yury Delendik
e53a28c996 Merge pull request #5192 from nnethercote/empty-setGState
Ignore setGState no-ops.
2014-08-15 10:20:14 -05:00
Nicholas Nethercote
9674abc542 Ignore setGState no-ops.
For the document in #2504, 11% of the ops are `setGState` with a
`gStateObj` that is an empty array, which is a no-op. This is possible
because we ignore various setGState keys (OP, OPM, BG, etc).

This change prevents these ops from being inserted into the operator
list.
2014-08-14 20:46:28 -07:00
Nicholas Nethercote
7cbd057deb Avoid unnecessary array allocations in EvaluatorPreprocessor_read().
EvaluatorPreprocessor_read() is called in two cases. For the normal
layer, the args array it produces is used beyond the bounds of the loop
in which EvaluatorPreprocessor_read() is called.

But for the text layer, the args array is used in a very short-term
fashion. This change reworks things so that a single array is repeatedly
used for the text layer. This reduces total JS allocations for the
Spoorkaart map by 11%, and has similar effects on many other PDFs.
2014-08-11 16:57:40 -07:00
Yury Delendik
4ce1b1e987 Merge pull request #5150 from nnethercote/toUnicode
Fix #4935
2014-08-10 14:07:26 -05:00
Yury Delendik
99b08ed223 Merge pull request #5162 from yurydelendik/pramodhkp-fixupgstate2
[SVG] Reduces amount of used memory during PNG creation.
2014-08-09 15:56:11 -05:00
pramodhkp
458b69b649 Adds image and mask features, fixes clippath 2014-08-10 01:06:43 +05:30
Jonas Jenwald
66c56ac546 Fixes a regression from PR 4982
After PR 4982, the rendering of the first two pages of http://www.openmagazin.cz/pdf/2011/openMagazin-2011-04.pdf (from issue 215) no longer completes.

The issue is that we cannot have `args === null` in `PartialEvaluator_buildPath`, but *must* use an empty array instead.

In this patch I've also moved the `argsLength` variable definition in `EvaluatorPreprocessor_read`, to make sure that it's always defined.
2014-08-08 13:19:18 +02:00
Nicholas Nethercote
9576047f0d Add ToUnicodeMap class. 2014-08-07 20:05:24 -07:00
Yury Delendik
2b87ff9286 Merge pull request #5008 from nnethercote/better-QueueOpt
Make QueueOptimizer easier to read.
2014-08-05 16:59:26 -05:00
Jonas Jenwald
87038e44cd Add strict equalities in src/core/evaluator.js 2014-08-01 18:40:10 +02:00
Nicholas Nethercote
b86daed29d Make CMap.map quasi-private.
This makes it easier for the representation to be improved.
2014-07-30 06:26:35 -07:00
Jonas Jenwald
2485f11829 Fix loading of PDF files with invalid or missing Type3 characters (issue 5039) 2014-07-24 15:03:22 +02:00
Nicholas Nethercote
a483c80fc3 Make QueueOptimizer easier to read.
QueueOptimizer is really hard to read. Enough so that it's blocking my
efforts to streamline the representation used for operator lists.

This patch improves its readability in the following ways.

- More descriptive variable names make the sequence checking much clearer,
  as do additional comments.

- The addState() functions now return the index of the first op past the
  sequence, instead of setting context.currentOperation to the last op of
  the sequence.

- The loop in optimize() is clearer.

- The array modification in the fourth addState() function is much clearer
  -- we're just removing trios of ops.

- All four |addState| functions are now more consistent with each other.

I used some debug printfs to find documents where these optimizations are
used and then checked that the number of optimized ops was the same before
and after my changes.
2014-07-03 19:16:31 -07:00
Jonas Jenwald
04975acceb Prevent CMapFactory.create from failing by passing the necessary parameters from PartialEvaluator_readToUnicode (issue 5010) 2014-06-27 00:46:16 +02:00
Yury Delendik
6d5a04149b Merge pull request #4993 from pramodhkp/rectelmnt
Combine re element into constructPath
2014-06-24 09:27:21 -05:00
pramodhkp
8407d28c9e Combine re element into constructPath 2014-06-25 00:27:42 +05:30
Fabian Lange
60f67c3961 Restructured EvaluatorPreprocessor_read to be more natural. 2014-06-23 23:35:25 +02:00
Nicholas Nethercote
081866a184 Use null instead of [] for ops with no args.
This reduces peak RSS on one test file from ~600 to ~560 MiB.
2014-06-22 16:03:48 -07:00
Yury Delendik
b557b87fc9 Merge pull request #4972 from nnethercote/preprocessor-read
Avoid allocating return object in EvaluatorPreprocessor_read().
2014-06-18 22:00:31 -05:00
Nicholas Nethercote
17170af3c7 Avoid allocating return object in EvaluatorPreprocessor_read().
This function can be called 100s of 1000s or even millions of times, and the
allocated return object accounts for 10% of all GC thing allocations for some
documents. It's easy to avoid, which reduces stress on the garbage collector,
and this patch does that.
2014-06-18 16:41:29 -07:00
Nicholas Nethercote
bce7601480 Build up textChunk.str more efficiently.
PartialEvaluator.getTextContent() builds up textChunk strings 1 char at a time,
creating many 100s of 1000s of intermediate strings along the way. This patch
make it instead push chars to an array and then join them at the end, as we
have done in numerous other places.
2014-06-18 07:48:22 -07:00
Yury Delendik
5a2e511cbd Merge pull request #4955 from timvandermeij/rename-concatenate
Renames concatenateToArray to appendToArray
2014-06-17 08:21:47 -05:00
Yury Delendik
0cd28ebfa3 Telemetry for used stream and font types 2014-06-16 16:41:04 -05:00
Tim van der Meij
9c072a5d4b Renames concatenateToArray to appendToArray 2014-06-16 22:10:10 +02:00
Nicholas Nethercote
7923eb7edb Fix mishandling of incomplete, inverted masks. 2014-06-13 06:14:52 -07:00
Jonas Jenwald
c0250e16e3 Return ErrorFont in loadFont when the fontRef is undefined 2014-06-12 12:46:39 +02:00
Jonas Jenwald
7802a7ab97 Handle cases where the fontName contains non-alphanumeric characters (issue 4909) 2014-06-10 17:25:49 +02:00