Commit Graph

529 Commits

Author SHA1 Message Date
Yury Delendik
cc180d7e2b Removes some bind() calls from fetchAsync 2014-08-05 21:22:12 -05:00
Yury Delendik
46a9a35ddc Merge pull request #5071 from nnethercote/font-savings
Optimize a font-heavy document
2014-08-05 18:57:46 -05:00
Yury Delendik
fa53fcbf57 Merge pull request #5095 from Snuffleupagus/issue-5070
Adjust the heuristics to recognize more cases of unknown glyphs for |toUnicode| (issue 5070)
2014-08-05 17:41:38 -05:00
Yury Delendik
2b87ff9286 Merge pull request #5008 from nnethercote/better-QueueOpt
Make QueueOptimizer easier to read.
2014-08-05 16:59:26 -05:00
Jonas Jenwald
cb4a847347 Merge pull request #5134 from yurydelendik/fun4
Improves speed of the functions
2014-08-05 23:51:03 +02:00
Yury Delendik
12b50486de Merge pull request #5136 from timvandermeij/ccitt-lines
Properly set this.eof in CCITTFaxStream
2014-08-05 12:49:50 -05:00
Tim van der Meij
5cc7d23066 Properly set this.eof in CCITTFaxStream 2014-08-05 19:08:00 +02:00
fkaelberer
5b83e0b9a3 Faster JBIG2 bitmap decoding 2014-08-05 16:12:45 +02:00
Nicholas Nethercote
51055e5836 Make IdentityCMaps more compact.
IdentityCMap uses an array to represent a 16-bit unsigned identity
function. This is very space-inefficient, and some files cause multiple
IdentityCMaps to be instantiated (e.g. the one from #4580 has 74).

This patch makes the representation implicit.

When loading the PDF from issue #4580, this change reduces peak RSS from
~370 to ~280 MiB. It also improves overall speed on that PDF by ~30%,
going from 522 ms to 366 ms.
2014-08-05 03:01:39 -07:00
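
The implicit representation described in 51055e5836 might look roughly like the sketch below. It assumes the map only needs a lookup and an iteration method. The class name `ImplicitIdentityMap` and its methods are hypothetical and are not taken from the PDF.js source.

```javascript
// Hypothetical sketch; ImplicitIdentityMap is not the PDF.js IdentityCMap class.
class ImplicitIdentityMap {
  constructor(limit = 0x10000) {
    // Only the upper bound is stored, not 65,536 array entries.
    this.limit = limit;
  }

  // Returns the code itself when it is a valid 16-bit value, else undefined.
  lookup(code) {
    const inRange = Number.isInteger(code) && code >= 0 && code < this.limit;
    return inRange ? code : undefined;
  }

  // Visits every (code, value) pair without materializing them.
  forEach(callback) {
    for (let code = 0; code < this.limit; code++) {
      callback(code, code);
    }
  }
}

const identity = new ImplicitIdentityMap();
console.log(identity.lookup(0x41));
```

Each instance then costs a single number rather than a full array, which is presumably where the savings come from when a file instantiates many such maps.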
Yury Delendik
6865c284a7 Merge pull request #5111 from nnethercote/better-cidchars
Represent cid chars using integers, not strings.
2014-08-04 22:26:55 -05:00
Yury Delendik
f750e35224 Optimizes functions to not create arrays 2014-08-04 11:23:11 -05:00
Yury Delendik
cb81bd6be6 Compiles some of the FunctionType 4 2014-08-04 11:21:31 -05:00
Jonas Jenwald
8ecbb4da05 Adjust the heuristics to recognize more cases of unknown glyphs for |toUnicode| (issue 5070) 2014-08-03 21:18:23 +02:00
Jonas Jenwald
b918df3547 Re-factor heuristics to recognize unknown glyphs for |toUnicode| 2014-08-03 21:12:36 +02:00
Jonas Jenwald
4b54d6fd43 Add strict equalities in src/core/stream.js 2014-08-02 17:59:14 +02:00
Jonas Jenwald
7fa204c805 Add strict equalities in src/core/parser.js 2014-08-02 17:37:24 +02:00
Tim van der Meij
cb59b5772b Merge pull request #5120 from Snuffleupagus/strict-equalities-src-core-2
Add strict equalities in src/core/* (part 2)
2014-08-02 13:51:14 +02:00
Tim van der Meij
4899e9e54f Use strict equalities in src/core/jbig2.js 2014-08-01 23:02:57 +02:00
Tim van der Meij
5d0fde4a2c Use strict equalities in src/core/jpx.js 2014-08-01 23:02:57 +02:00
Tim van der Meij
2796d1bf10 Use strict equalities in src/core/jpg.js 2014-08-01 23:02:56 +02:00
Tim van der Meij
160c7cab33 Use strict equalities in src/core/image.js 2014-08-01 23:02:55 +02:00
Jonas Jenwald
fb9fea2f36 Add strict equalities in src/core/worker.js 2014-08-01 22:17:47 +02:00
Jonas Jenwald
c9fb3e1b6d Add strict equalities in src/core/ps_parser.js 2014-08-01 22:02:10 +02:00
Jonas Jenwald
ee371fe6b2 Add strict equalities in src/core/pattern.js 2014-08-01 21:56:04 +02:00
Jonas Jenwald
ee0c0dd8a9 Add strict equalities in src/core/obj.js 2014-08-01 21:56:04 +02:00
Jonas Jenwald
a154ca2dd3 Add strict equalities in src/core/murmurhash3.js 2014-08-01 21:56:04 +02:00
Jonas Jenwald
8f5894d81a Add strict equalities in src/core/function.js 2014-08-01 21:56:03 +02:00
Jonas Jenwald
97b3eadbc4 Add strict equalities in src/core/fonts.js 2014-08-01 21:56:03 +02:00
Jonas Jenwald
87038e44cd Add strict equalities in src/core/evaluator.js 2014-08-01 18:40:10 +02:00
Jonas Jenwald
83a4c68df9 Add strict equalities in src/core/core.js 2014-08-01 18:40:10 +02:00
Jonas Jenwald
0012b8803c Add strict equalities in src/core/colorspace.js 2014-08-01 18:40:09 +02:00
Jonas Jenwald
84503c656d Add strict equalities in src/core/bidi.js 2014-08-01 18:39:46 +02:00
Jonas Jenwald
2162a19ed9 Add strict equalities in src/core/arithmetic_decoder.js 2014-08-01 18:39:46 +02:00
Jonas Jenwald
9cb09324d2 Add strict equalities in src/core/annotation.js 2014-08-01 18:39:45 +02:00
Nicholas Nethercote
adf58ed687 Represent cid chars using integers, not strings.
cid chars are 16-bit unsigned integers. Currently we convert them to
single-char strings when inserting them into the CMap, and then convert
them back to integers when extracting them from the CMap. This patch
changes CMap so that cid chars stay in integer format throughout, saving
both time and space.

When loading the PDF from issue #4580, this change reduces peak RSS from
~600 to ~370 MiB. It also improves overall speed on that PDF by ~26%,
going from 724 ms to 533 ms.
2014-08-01 02:35:17 -07:00
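
A minimal sketch of the round trip that adf58ed687 removes, assuming a plain array as the backing store. All four helper names are hypothetical and only illustrate the before and after shapes.

```javascript
// Hypothetical helpers; the names are illustrative and not taken from PDF.js.

// Before: each cid is wrapped in a single-char string on insert and
// unwrapped again on lookup.
function insertAsString(map, code, cid) {
  map[code] = String.fromCharCode(cid);
}
function lookupFromString(map, code) {
  return map[code].charCodeAt(0);
}

// After: the cid stays an integer for its whole lifetime.
function insertAsInt(map, code, cid) {
  map[code] = cid;
}
function lookupFromInt(map, code) {
  return map[code];
}

const stringMap = [];
const intMap = [];
insertAsString(stringMap, 0x20, 0x3042);
insertAsInt(intMap, 0x20, 0x3042);
console.log(lookupFromString(stringMap, 0x20) === lookupFromInt(intMap, 0x20));
```

Both variants return the same cid; the integer variant simply skips one string allocation per insert and one conversion per lookup.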
fkaelberer
c03cf20d37 Fix JBIG2 decoding issue #5026 2014-08-01 09:02:25 +02:00
Yury Delendik
ad2ea78280 Merge pull request #5101 from nnethercote/CMap-forEach
Avoid expensive for..in loops involving CMaps
2014-07-31 23:03:25 -05:00
Nicholas Nethercote
28687bca75 Optimize CMap.prototype.forEach().
This change avoids the element stringification caused by for..in for the
vast majority of CMaps.

When loading the PDF from issue #4580, this change reduces peak RSS from ~650
to ~600 MiB, and improves overall speed by ~20%, from 902 ms to 713 ms. Other
CMap-heavy documents will also see improvements.
2014-07-30 06:28:47 -07:00
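
The stringification that 28687bca75 refers to can be seen in a small sketch. The two functions below are hypothetical and are not the actual CMap.prototype.forEach() implementation; they only contrast a for..in loop with a counted loop over the same array.

```javascript
// Hypothetical sketch; not the actual CMap.prototype.forEach implementation.
function forEachWithForIn(map, callback) {
  // for..in yields every index as a string ("3", "7", ...).
  for (const key in map) {
    callback(key, map[key]);
  }
}

function forEachWithIndex(map, callback) {
  // A counted loop keeps the index numeric; holes are skipped explicitly.
  for (let i = 0, ii = map.length; i < ii; i++) {
    if (map[i] !== undefined) {
      callback(i, map[i]);
    }
  }
}

const sparse = [];
sparse[3] = 0x41;
sparse[7] = 0x42;
const keyTypes = [];
forEachWithForIn(sparse, (key) => keyTypes.push(typeof key));
forEachWithIndex(sparse, (key) => keyTypes.push(typeof key));
console.log(keyTypes);
```

The counted loop visits every index up to the array length, so it is only a win when the map is reasonably small or dense. The commit message claims the fast path for "the vast majority of CMaps", which suggests some other path remains for the rest.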
Nicholas Nethercote
b86daed29d Make CMap.map quasi-private.
This makes it easier for the representation to be improved.
2014-07-30 06:26:35 -07:00
Jonas Jenwald
2264748109 Merge pull request #5096 from nnethercote/bidi-length
Right-size |chars.length| and |type.length| in bidi().
2014-07-29 12:19:22 +02:00
Nicholas Nethercote
f1d5ec407e Right-size |chars.length| and |type.length| in bidi().
This lets the JS engine resize the array elements buffer immediately,
thus avoiding some intermediate resizings. This can save multiple MiBs
of reallocation in text-heavy files.
2014-07-28 16:35:45 -07:00
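
The right-sizing idea from f1d5ec407e, sketched under the assumption that the final element count is known before the loop starts. The function `splitIntoChars` is hypothetical and is not the bidi() code itself; whether the engine really allocates once depends on its array implementation.

```javascript
// Hypothetical sketch of the right-sizing idea; not the actual bidi() code.
function splitIntoChars(str) {
  const length = str.length;
  const chars = [];
  // Set the final length up front so the engine can size the backing
  // store once instead of growing it repeatedly while filling.
  chars.length = length;
  for (let i = 0; i < length; i++) {
    chars[i] = str.charAt(i);
  }
  return chars;
}

const chars = splitIntoChars("abc");
console.log(chars.length);
```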
Yury Delendik
6038ee7cff Merge pull request #5063 from Snuffleupagus/ps-parser-avoid-intermediate-string-creation
Avoid creating intermediate strings in the PostScriptLexer
2014-07-28 15:07:32 -05:00
Jonas Jenwald
4960af3a4c Avoid creating intermediate strings in the PostScriptLexer 2014-07-27 13:51:28 +02:00
Jonas Jenwald
a5c98aab36 Re-factor parsing of the Linearization dictionary 2014-07-27 12:56:09 +02:00
Jonas Jenwald
86f9503876 Remove src/core/cidmaps.js 2014-07-25 21:53:17 +02:00
Jonas Jenwald
c3c72948b9 Stop including cidmaps.js
In b5b94a4af3, i.e. PR #4259, we stopped using cidmaps.js. Despite that, it's still included when PDF.js is built. At almost 0.5 MB (and approx. 7000 lines), this is currently the single largest file in the codebase.
Including such a large file in the builds, when it is not actually used, seems extremely wasteful; hence this patch.
2014-07-25 21:53:09 +02:00
Yury Delendik
1e21bac9d3 Merge pull request #5077 from Snuffleupagus/issue-5039
Fix loading of PDF files with invalid or missing Type3 characters (issue 5039)
2014-07-25 14:34:51 -05:00
Tim van der Meij
62e6265fb3 Merge pull request #5074 from nnethercote/readPostScriptTable-join
Use Array.join to build up strings in readPostScriptTable().
2014-07-25 21:26:54 +02:00
Yury Delendik
2aea7d7047 Merge pull request #5078 from nnethercote/Ref-toString
Optimize Ref_toString().
2014-07-25 10:10:10 -05:00
Nicholas Nethercote
1039791472 Use Array.join to build up strings in readPostScriptTable().
This avoids about 5 MiB of string allocations on one test case.
2014-07-24 16:12:08 -07:00
Nicholas Nethercote
856e1c600b Optimize Ref_toString().
I have a large PDF where this function is called 1.6 million times
during loading. Minimizing the string concatenations reduces the
cumulative allocations done by Firefox within this function from 113 MB
to 48 MB.
2014-07-24 06:49:56 -07:00
Jonas Jenwald
2485f11829 Fix loading of PDF files with invalid or missing Type3 characters (issue 5039) 2014-07-24 15:03:22 +02:00
Nicholas Nethercote
501446ccc4 Optimize common cases in hexToStr().
This avoids the creation of over two million array objects when viewing
http://www.dynacw.co.jp/Portals/3/fontsamplepdf/sample_4942546800828.pdf,
and reduces load time from 76 to 73 ms.
2014-07-22 23:26:03 -07:00
Nicholas Nethercote
c7f02d2c8e Minimize memory usage of font-related arrays.
This patch replaces some vanilla arrays with typed arrays, and avoids
some array copying.

It reduces the peak RSS when viewing
http://www.dynacw.co.jp/Portals/3/fontsamplepdf/sample_4942546800828.pdf
from ~940 MiB to ~750 MiB, and reduces its load time from 83 to 76 ms.
2014-07-22 22:47:45 -07:00
Jonas Jenwald
f13c217b25 Fix another seac regression (issue 4801) 2014-07-22 21:44:13 +02:00
Jonas Jenwald
b950118681 Revert commit fc73e2e (PR 5005) for breaking certain PDF files 2014-07-22 21:17:57 +02:00
Yury Delendik
53320ce734 Merge pull request #5012 from Snuffleupagus/issue-5010
Prevent CMapFactory.create from failing by passing the necessary parameters from PartialEvaluator_readToUnicode (issue 5010)
2014-07-22 10:54:35 -05:00
Yury Delendik
584fef90ab Merge pull request #5037 from Snuffleupagus/issue-5036
Add |fillRgb| method to LabCS
2014-07-21 09:55:55 -05:00
Jonas Jenwald
9c6316fc15 Merge pull request #5005 from fkaelberer/faster_ChunkedStream_getByte
Faster chunkedStream_getByte()
2014-07-18 18:23:49 +02:00
Jonas Jenwald
0237d5036a Merge pull request #5025 from nnethercote/share-zero-length-buffers
Improve how DecodeStream handles empty buffers.
2014-07-13 12:13:06 +02:00
Jonas Jenwald
1cb4de2227 Add |fillRgb| method to LabCS 2014-07-10 12:06:19 +02:00
Jonas Jenwald
a7c786775d [CIDFontType2] Map characters missing in toUnicode to the private use area (bug 1028735 and issue 4881) 2014-07-05 00:18:51 +02:00
Nicholas Nethercote
a483c80fc3 Make QueueOptimizer easier to read.
QueueOptimizer is really hard to read. Enough so that it's blocking my
efforts to streamline the representation used for operator lists.

This patch improves its readability in the following ways.

- More descriptive variable names make the sequence checking much clearer,
  as do additional comments.

- The addState() functions now return the index of the first op past the
  sequence, instead of setting context.currentOperation to the last op of
  the sequence.

- The loop in optimize() is clearer.

- The array modification in the fourth addState() function is much clearer
  -- we're just removing trios of ops.

- All four |addState| functions are now more consistent with each other.

I used some debug printfs to find documents where these optimizations are
used and then checked that the number of optimized ops was the same before
and after my changes.
2014-07-03 19:16:31 -07:00
Nicholas Nethercote
db866945b7 Improve how DecodeStream handles empty buffers.
DecodeStream currently initializes its |buffer| field to |null|, which
is reasonable, because lots of DecodeStreams never need to instantiate a
buffer. But this requires various special cases in the code.

This patch changes it so DecodeStreamClosure has a single empty
Uint8Array which gets shared between all buffers upon initialization.
This avoids the special cases.

DecodeStream.prototype.ensureBuffer() is really hot, and this removes a
test from the fast path. For one 226 page scanned document this sped up
rendering by about 2%.
2014-07-02 18:53:21 -07:00
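
A rough sketch of the shared empty buffer from db866945b7. `GrowableStream`, `EMPTY_BUFFER` and the 512-byte minimum capacity are assumptions made for illustration; this is not the DecodeStream source.

```javascript
// Hypothetical sketch; GrowableStream is not the PDF.js DecodeStream class.
const EMPTY_BUFFER = new Uint8Array(0); // one shared, zero-length buffer

class GrowableStream {
  constructor() {
    // Never null, so the hot path needs no "is there a buffer yet?" test.
    this.buffer = EMPTY_BUFFER;
  }

  ensureBuffer(requested) {
    const buffer = this.buffer;
    if (requested <= buffer.byteLength) {
      return buffer; // fast path: a single comparison
    }
    let size = 512; // assumed minimum capacity
    while (size < requested) {
      size *= 2;
    }
    const grown = new Uint8Array(size);
    grown.set(buffer); // copying a zero-length buffer is a no-op
    this.buffer = grown;
    return grown;
  }
}

const first = new GrowableStream();
const second = new GrowableStream();
console.log(first.buffer === second.buffer);
```

Sharing one zero-length array is safe here because nothing can be written into it; any real write goes through ensureBuffer() first and lands in a private, grown copy.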
fkaelberer
fc73e2e173 use getBytes() instead of looping over getByte() 2014-06-27 09:09:54 +02:00
Jonas Jenwald
c5f4051a75 A few small optimizations of adjustMapping
Replace a couple of |in| checks with comparisons against undefined.
2014-06-27 00:59:42 +02:00
Jonas Jenwald
c121def806 A few small optimizations for CIDFontType2 fonts
Cache a constant length and replace one usage of |in| with a comparison against undefined.
2014-06-27 00:52:54 +02:00
Jonas Jenwald
04975acceb Prevent CMapFactory.create from failing by passing the necessary parameters from PartialEvaluator_readToUnicode (issue 5010) 2014-06-27 00:46:16 +02:00
fkaelberer
9a41659ae7 Faster chunkedStream_getByte() 2014-06-26 22:34:00 +02:00
Yury Delendik
6d5a04149b Merge pull request #4993 from pramodhkp/rectelmnt
Combine re element into constructPath
2014-06-24 09:27:21 -05:00
pramodhkp
8407d28c9e Combine re element into constructPath 2014-06-25 00:27:42 +05:30
Yury Delendik
10db93be29 Merge pull request #4980 from Snuffleupagus/bug-1027533
Additional heuristics to recognize unknown glyphs for toUnicode (bug 1027533)
2014-06-23 21:56:13 -05:00
Yury Delendik
bb7e7d33c5 Merge pull request #4976 from CodingFabian/restructure-evaluator-read
Restructured EvaluatorPreprocessor_read to be more natural.
2014-06-23 21:50:14 -05:00
Yury Delendik
c28839b2f3 Merge pull request #4944 from Snuffleupagus/issue-4934
Don't blindly trust toUnicode when building toFontChar for non-standard fonts without a font file (issue 4934)
2014-06-23 21:49:24 -05:00
Fabian Lange
60f67c3961 Restructured EvaluatorPreprocessor_read to be more natural. 2014-06-23 23:35:25 +02:00
Yury Delendik
3ad58db7e8 Merge pull request #4982 from nnethercote/use-null-for-zero-args
Use null instead of [] for ops with no args.
2014-06-23 15:38:48 -05:00
Nicholas Nethercote
081866a184 Use null instead of [] for ops with no args.
This reduces peak RSS on one test file from ~600 to ~560 MiB.
2014-06-22 16:03:48 -07:00
Jonas Jenwald
b19bb74813 Additional heuristics to recognize unknown glyphs for toUnicode (bug 1027533) 2014-06-20 09:57:16 +02:00
Yury Delendik
84157e039d Merge pull request #4973 from nnethercote/better-ref-keys
Factor out repeated Ref key string generation code.
2014-06-19 21:00:09 -05:00
Nicholas Nethercote
1ad3ffbc7b Factor out repeated Ref key string generation code.
In src/core/obj.js, we convert a Ref to a string to index into a table like
this: 'R1.0'. This conversion is repeated numerous times.

This patch factors out the conversion into a new function,
Ref.prototype.toString().
2014-06-19 18:22:39 -07:00
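
The refactoring in 1ad3ffbc7b, sketched with simplified stand-ins. The `Ref` and `RefTable` classes below are hypothetical; only the 'R1.0' key format comes from the commit message.

```javascript
// Hypothetical sketch; this Ref and RefTable are simplified stand-ins.
class Ref {
  constructor(num, gen) {
    this.num = num;
    this.gen = gen;
  }

  // Single place that defines the key format, e.g. 'R1.0'.
  toString() {
    return "R" + this.num + "." + this.gen;
  }
}

class RefTable {
  constructor() {
    this.entries = Object.create(null);
  }
  put(ref, value) {
    this.entries[ref.toString()] = value;
  }
  get(ref) {
    return this.entries[ref.toString()];
  }
  has(ref) {
    return ref.toString() in this.entries;
  }
}

const table = new RefTable();
table.put(new Ref(1, 0), "catalog");
console.log(table.get(new Ref(1, 0)));
```

With the format defined once, a later change to the key representation only has to touch toString().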
Yury Delendik
c0a6b0f308 Merge pull request #4971 from yurydelendik/rm-suppressEncryption
Removes error catch from fetchUncompressed()
2014-06-18 22:03:09 -05:00
Yury Delendik
b557b87fc9 Merge pull request #4972 from nnethercote/preprocessor-read
Avoid allocating return object in EvaluatorPreprocessor_read().
2014-06-18 22:00:31 -05:00
Nicholas Nethercote
17170af3c7 Avoid allocating return object in EvaluatorPreprocessor_read().
This function can be called 100s of 1000s or even millions of times, and the
allocated return object accounts for 10% of all GC thing allocations for some
documents. It's easy to avoid, which reduces stress on the garbage collector,
and this patch does that.
2014-06-18 16:41:29 -07:00
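
One common way to avoid a per-call return object, as 17170af3c7 describes, is to let the caller pass in an object that is filled in place. The sketch below assumes that shape; `OpReader` and its fields are hypothetical and are not EvaluatorPreprocessor.

```javascript
// Hypothetical sketch; OpReader is not EvaluatorPreprocessor.
class OpReader {
  constructor(ops) {
    this.ops = ops;
    this.pos = 0;
  }

  // Fills the caller's object in place instead of returning a new one.
  // Returns false once the input is exhausted.
  read(operation) {
    if (this.pos >= this.ops.length) {
      return false;
    }
    const [fn, args] = this.ops[this.pos++];
    operation.fn = fn;
    operation.args = args;
    return true;
  }
}

const reader = new OpReader([
  ["moveTo", [0, 0]],
  ["lineTo", [10, 10]],
  ["stroke", null],
]);
const operation = { fn: null, args: null }; // allocated once, reused per call
const seen = [];
while (reader.read(operation)) {
  seen.push(operation.fn);
}
console.log(seen);
```

The trade-off is that the caller must copy anything it wants to keep, because the next read() overwrites the same object.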
Yury Delendik
623fa29300 Removes error catch from fetchUncompressed() 2014-06-18 18:30:27 -05:00
Yury Delendik
fbdab2c7c5 Not ignoring MissingDataException exception. 2014-06-18 18:24:54 -05:00
Yury Delendik
cf4bc42e33 Merge pull request #4968 from nnethercote/glyphBuf
Build up textChunk.str more efficiently.
2014-06-18 17:51:07 -05:00
Yury Delendik
88fd1aa78b Removes PDFJS.Annotation 2014-06-18 16:58:11 -05:00
Jonas Jenwald
2282c98500 Merge pull request #4965 from yurydelendik/annotations
Splits shared/annotation.js into core/ and display/
2014-06-18 17:01:38 +02:00
Nicholas Nethercote
bce7601480 Build up textChunk.str more efficiently.
PartialEvaluator.getTextContent() builds up textChunk strings 1 char at a time,
creating many 100s of 1000s of intermediate strings along the way. This patch
makes it instead push chars to an array and then join them at the end, as we
have done in numerous other places.
2014-06-18 07:48:22 -07:00
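
The push-then-join pattern from bce7601480, contrasted with char-by-char concatenation. Both functions are hypothetical sketches and are not the getTextContent() code; how much the join variant saves depends on the engine's string implementation.

```javascript
// Hypothetical sketch; not the actual getTextContent() code.
function buildByConcatenation(glyphs) {
  let str = "";
  for (const glyph of glyphs) {
    str += glyph; // may create a new intermediate string per glyph
  }
  return str;
}

function buildByJoin(glyphs) {
  const parts = [];
  for (const glyph of glyphs) {
    parts.push(glyph); // no string is built yet
  }
  return parts.join(""); // one final string
}

const glyphs = ["P", "D", "F", ".", "j", "s"];
console.log(buildByJoin(glyphs) === buildByConcatenation(glyphs));
```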
Nicholas Nethercote
4428cebdbc Add ChunkedStream.ensureByte().
This new function is much faster than ensureRange(pos, pos+1), which is a very
common case.

This speeds up the rendering of some test cases (including the Tracemonkey
paper) by 4--5%.
2014-06-17 21:33:48 -07:00
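
A sketch of why a dedicated single-byte check can be cheaper than a one-byte range check, as 4428cebdbc reports. `ChunkTracker`, `MissingDataError` and the `Set` of loaded chunks are assumptions made for illustration; they are not the ChunkedStream implementation.

```javascript
// Hypothetical sketch; ChunkTracker and MissingDataError are illustrative names.
class MissingDataError extends Error {
  constructor(begin, end) {
    super("Missing data [" + begin + ", " + end + ")");
    this.begin = begin;
    this.end = end;
  }
}

class ChunkTracker {
  constructor(chunkSize) {
    this.chunkSize = chunkSize;
    this.loadedChunks = new Set();
  }

  // General case: every chunk overlapping [begin, end) must be loaded.
  ensureRange(begin, end) {
    if (begin >= end) {
      return;
    }
    const firstChunk = Math.floor(begin / this.chunkSize);
    const lastChunk = Math.floor((end - 1) / this.chunkSize);
    for (let chunk = firstChunk; chunk <= lastChunk; chunk++) {
      if (!this.loadedChunks.has(chunk)) {
        throw new MissingDataError(begin, end);
      }
    }
  }

  // Single-byte case: one division and one lookup, no loop.
  ensureByte(pos) {
    const chunk = Math.floor(pos / this.chunkSize);
    if (!this.loadedChunks.has(chunk)) {
      throw new MissingDataError(pos, pos + 1);
    }
  }
}

const tracker = new ChunkTracker(1024);
tracker.loadedChunks.add(0);
tracker.ensureByte(1023); // inside the loaded chunk, so this does not throw
```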
Yury Delendik
bdeca30fbf Splits shared/annotation.js into core/ and display/ 2014-06-17 17:43:33 -05:00
Yury Delendik
bad24bf707 Merge pull request #4950 from fkaelberer/fasterJPEGtransform
Faster JPEG transform
2014-06-17 09:03:23 -05:00
Yury Delendik
5a2e511cbd Merge pull request #4955 from timvandermeij/rename-concatenate
Renames concatenateToArray to appendToArray
2014-06-17 08:21:47 -05:00
Jonas Jenwald
ab67e1c272 Let Parser_makeFilter return NullStream when an invalid stream is encountered (issue 3417) 2014-06-17 12:03:34 +02:00
fkaelberer
f9cde5d93e faster JPEG transform 2014-06-17 10:09:17 +02:00
Jonas Jenwald
22cfcbcf8a Merge pull request #4952 from yurydelendik/telemetry
Collect More Telemetry Data
2014-06-17 00:36:58 +02:00
Yury Delendik
0cd28ebfa3 Telemetry for used stream and font types 2014-06-16 16:41:04 -05:00
Tim van der Meij
9c072a5d4b Renames concatenateToArray to appendToArray 2014-06-16 22:10:10 +02:00
Jonas Jenwald
158790981c Don't blindly trust toUnicode when building toFontChar for non-standard fonts without a font file (issue 4934) 2014-06-14 22:59:08 +02:00
Yury Delendik
9f51e46917 Refactoring error reporting in JPX 2014-06-13 18:22:42 -05:00