For the document in #2504, 11% of the ops are `setGState` with a
`gStateObj` that is an empty array, which is a no-op. This is possible
because we ignore various setGState keys (OP, OPM, BG, etc).
This change prevents these ops from being inserted into the operator
list.
This allows the JS engine to do a better job of allocating the right
number of elements for the array, avoiding some resizings. For the PDF
in #2504, this avoids 100s of MiBs of allocations in Firefox.
makeInlineImage() has a "are the next five chars ASCII?" check which is
run after an "EI" sequence has been found. This check involves the
creation of a new object because peekBytes() calls subarray().
Unfortunately, the check is currently run on whitespace chars even when
an "EI" sequence has not yet been found, i.e. when it's not needed. For
the PDF in #2618, there are over 820,000 such checks.
This change reworks the relevant loop so that the check is only done
once an "EI" sequence has been seen. This reduces the number of checks
to 157,000, and speeds up rendering by somewhere between 2% and 7% (the
measurements are noisy).
The `console.log` statement in evaluator_spec.js is obviously not needed. In obj.js it could have been replaced by `info`, but that seemed unnecessary given the already existing `error`.
readCharCode() returns two values, and currently allocates a length-2
array on every call to do so. This change makes it instead us a
passed-in object which can be reused.
This tiny change reduces the total JS allocations done for the document
in Mozilla bug 992125 by 4.2%.
EvaluatorPreprocessor_read() is called in two cases. For the normal
layer, the args array it produces is used beyond the bounds of the loop
in which EvaluatorPreprocessor_read() is called.
But for the text layer, the args array is used in a very short-term
fashion. This change reworks things so that a single array is repeatedly
used for the text layer. This reduces total JS allocations for the
Spoorkaart map by 11%, and has similar effects on many other PDFs.
After PR 4982, the rendering of the first two pages of http://www.openmagazin.cz/pdf/2011/openMagazin-2011-04.pdf (from issue 215) no longer completes.
The issue is that we cannot have `args === null` in `PartialEvaluator_buildPath`, but *must* use an empty array instead.
In this patch I've also moved the `argsLength` variable definition in `EvaluatorPreprocessor_read`, to make sure that it's always defined.
When loading the PDF from issue #4935, this change reduces peak RSS from
~2400 to ~300 MiB, and improves overall speed by ~81%, from 6336 ms to
1222 ms.
IdentityCMap uses an array to represent a 16-bit unsigned identity
function. This is very space-inefficient, and some files cause multiple
IdentityCMaps to be instantiated (e.g. the one from #4580 has 74).
This patch make the representation implicit.
When loading the PDF from issue #4580, this change reduces peak RSS from
~370 to ~280 MiB. It also improves overall speed on that PDF by ~30%,
going from 522 ms to 366 ms.
cid chars are 16-bit unsigned integers. Currently we convert them to
single-char strings when inserting them into the CMap, and then convert
them back to integers when extracting them from the CMap. This patch
changes CMap so that cid chars stay in integer format throughout, saving
both time and space.
When loading the PDF from issue #4580, this change reduces peak RSS from
~600 to ~370 MiB. It also improves overall speed on that PDF by ~26%,
going from 724 ms to 533 ms.
This change avoids the element stringification caused by for..in for the
vast majority of CMaps.
When loading the PDF from issue #4580, this change reduces peak RSS from ~650
to ~600 MiB, and improves overall speed by ~20%, from 902 ms to 713 ms. Other
CMap-heavy documents will also see improvements.
This lets the JS engine resize the array elements buffer immediately,
thus avoiding some intermediate resizings. This can save multiple MiBs
of reallocation in text-heavy files.
In b5b94a4af3, i.e. PR #4259, we stopped using cidmaps.js. Despite that, it's still included when PDF.js is built. At almost 0.5 MB (and approx. 7000 lines), this is currently the single largest file in the codebase.
Including such a large file in the builds, when it is not actually used, seems extremely wasteful; hence this patch.