Currently when an exception is thrown, we try to reject `workerReadyCapability` with multiple arguments in src/core/api.js. This obviously doesn't work, hence this patch changes that to instead reject with the exception object as is.
In src/core/worker.js the exception is currently (unncessarily) wrapped in an object, so this patch also simplifies that to directly send the exception object instead.
This patch avoids creating many intermediate strings, when adding dummy width/lsb entries for glyphs where those are missing.
For the relevant PDF files in our test suite, the average number of intermediate strings are well over 1000.
The scanned, black-and-white document at
https://bugzilla.mozilla.org/show_bug.cgi?id=835380 doesn't benefit from
the critical GRAYSCALE_1BPP optimization because the optimization is
skipped if `needsDecode` is set.
This change addresses that, and reduces both rendering time and memory
usage for that document by almost 10x.
setGStateForKey() is a closure that serves no particularly useful
purpose. This change inlines it at the single call site. This avoids 1.7
MiB of allocations (because closures are objects) for the MTA map
mentioned in https://bugzilla.mozilla.org/show_bug.cgi?id=835380#c17.
For the document in #2504, 11% of the ops are `setGState` with a
`gStateObj` that is an empty array, which is a no-op. This is possible
because we ignore various setGState keys (OP, OPM, BG, etc).
This change prevents these ops from being inserted into the operator
list.
This allows the JS engine to do a better job of allocating the right
number of elements for the array, avoiding some resizings. For the PDF
in #2504, this avoids 100s of MiBs of allocations in Firefox.
makeInlineImage() has a "are the next five chars ASCII?" check which is
run after an "EI" sequence has been found. This check involves the
creation of a new object because peekBytes() calls subarray().
Unfortunately, the check is currently run on whitespace chars even when
an "EI" sequence has not yet been found, i.e. when it's not needed. For
the PDF in #2618, there are over 820,000 such checks.
This change reworks the relevant loop so that the check is only done
once an "EI" sequence has been seen. This reduces the number of checks
to 157,000, and speeds up rendering by somewhere between 2% and 7% (the
measurements are noisy).
The `console.log` statement in evaluator_spec.js is obviously not needed. In obj.js it could have been replaced by `info`, but that seemed unnecessary given the already existing `error`.
readCharCode() returns two values, and currently allocates a length-2
array on every call to do so. This change makes it instead us a
passed-in object which can be reused.
This tiny change reduces the total JS allocations done for the document
in Mozilla bug 992125 by 4.2%.
EvaluatorPreprocessor_read() is called in two cases. For the normal
layer, the args array it produces is used beyond the bounds of the loop
in which EvaluatorPreprocessor_read() is called.
But for the text layer, the args array is used in a very short-term
fashion. This change reworks things so that a single array is repeatedly
used for the text layer. This reduces total JS allocations for the
Spoorkaart map by 11%, and has similar effects on many other PDFs.
After PR 4982, the rendering of the first two pages of http://www.openmagazin.cz/pdf/2011/openMagazin-2011-04.pdf (from issue 215) no longer completes.
The issue is that we cannot have `args === null` in `PartialEvaluator_buildPath`, but *must* use an empty array instead.
In this patch I've also moved the `argsLength` variable definition in `EvaluatorPreprocessor_read`, to make sure that it's always defined.
When loading the PDF from issue #4935, this change reduces peak RSS from
~2400 to ~300 MiB, and improves overall speed by ~81%, from 6336 ms to
1222 ms.
IdentityCMap uses an array to represent a 16-bit unsigned identity
function. This is very space-inefficient, and some files cause multiple
IdentityCMaps to be instantiated (e.g. the one from #4580 has 74).
This patch make the representation implicit.
When loading the PDF from issue #4580, this change reduces peak RSS from
~370 to ~280 MiB. It also improves overall speed on that PDF by ~30%,
going from 522 ms to 366 ms.
cid chars are 16-bit unsigned integers. Currently we convert them to
single-char strings when inserting them into the CMap, and then convert
them back to integers when extracting them from the CMap. This patch
changes CMap so that cid chars stay in integer format throughout, saving
both time and space.
When loading the PDF from issue #4580, this change reduces peak RSS from
~600 to ~370 MiB. It also improves overall speed on that PDF by ~26%,
going from 724 ms to 533 ms.
This change avoids the element stringification caused by for..in for the
vast majority of CMaps.
When loading the PDF from issue #4580, this change reduces peak RSS from ~650
to ~600 MiB, and improves overall speed by ~20%, from 902 ms to 713 ms. Other
CMap-heavy documents will also see improvements.
This lets the JS engine resize the array elements buffer immediately,
thus avoiding some intermediate resizings. This can save multiple MiBs
of reallocation in text-heavy files.
In b5b94a4af3, i.e. PR #4259, we stopped using cidmaps.js. Despite that, it's still included when PDF.js is built. At almost 0.5 MB (and approx. 7000 lines), this is currently the single largest file in the codebase.
Including such a large file in the builds, when it is not actually used, seems extremely wasteful; hence this patch.
I have a large PDF where this function is called 1.6 million times
during loading. Minimizing the string concatenations reduces the
cumulative allocations done by Firefox within this function from 113 MB
to 48 MB.
QueueOptimizer is really hard to read. Enough so that it's blocking my
efforts to streamline the representation used for operator lists.
This patch improves its readability in the following ways.
- More descriptive variable names make the sequence checking much clearer,
as do additional comments.
- The addState() functions now return the index of the first op past the
sequence, instead of setting context.currentOperation to the last op of
the sequence.
- The loop in optimize() is clearer.
- The array modification in the fourth addState() function is much clearer
-- we're just removing trios of ops.
- All four |addState| functions are now more consistent with each other.
I used some debug printfs to find documents where these optimizations are
used and then checked that the number of optimized ops was the same before
and after my changes.
DecodeStream currently initializes its |buffer| field to |null|, which
is reasonable, because lots of DecodeStreams never need to instantiate a
buffer. But this requires various special cases in the code.
This patch change it so DecodeStreamClosure has a single empty
Uint8Array which gets shared between all buffers upon initialization.
This avoids the special cases.
DecodeStream.prototype.ensureBuffer() is really hot, and this removes a
test from the fast path. For one 226 page scanned document this sped up
rendering by about 2%.
In src/core/obj.js, we convert a Ref to a string to index into a table like
this: 'R1.0'. This conversion is repeated numerous times.
This patch factors out the conversion into a new function.
Ref.prototype.toString().
This function can be called 100s of 1000s or even millions of times, and the
allocated return object accounts for 10% of all GC thing allocations for some
documents. It's easy to avoid, which reduces stress on the garbage collector,
and this patch does that.
PartialEvaluator.getTextContent() builds up textChunk strings 1 char at a time,
creating many 100s of 1000s of intermediate strings along the way. This patch
make it instead push chars to an array and then join them at the end, as we
have done in numerous other places.
This new function is much faster than ensureRange(pos, pos+1), which is a very
common case.
This speeds up the rendering of some test cases (including the Tracemonkey
paper) by 4--5%.