33 Commits

Author SHA1 Message Date
Jonas Jenwald
4b54d6fd43 Add strict equalities in src/core/stream.js 2014-08-02 17:59:14 +02:00
Jonas Jenwald
b950118681 Revert commit fc73e2e (PR 5005) for breaking certain PDF files 2014-07-22 21:17:57 +02:00
Jonas Jenwald
9c6316fc15 Merge pull request #5005 from fkaelberer/faster_ChunkedStream_getByte
Faster chunkedStream_getByte()
2014-07-18 18:23:49 +02:00
Nicholas Nethercote
db866945b7 Improve how DecodeStream handles empty buffers.
DecodeStream currently initializes its |buffer| field to |null|, which
is reasonable, because lots of DecodeStreams never need to instantiate a
buffer. But this requires various special cases in the code.

This patch changes it so DecodeStreamClosure has a single empty
Uint8Array which gets shared between all buffers upon initialization.
This avoids the special cases.

DecodeStream.prototype.ensureBuffer() is really hot, and this removes a
test from the fast path. For one 226 page scanned document this sped up
rendering by about 2%.
2014-07-02 18:53:21 -07:00
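
A minimal sketch of the idea described in db866945b7, assuming a simplified stream type. `SketchDecodeStream` and its fields are hypothetical stand-ins, not the actual pdf.js source; the point is only that one shared empty Uint8Array lets ensureBuffer() skip a null check.

```javascript
// Hypothetical sketch; names are illustrative, not the actual pdf.js code.
var SketchDecodeStream = (function SketchDecodeStreamClosure() {
  // One empty Uint8Array shared by every stream until real storage is needed.
  var emptyBuffer = new Uint8Array(0);

  function SketchDecodeStream() {
    this.buffer = emptyBuffer;
    this.bufferLength = 0;
  }

  SketchDecodeStream.prototype.ensureBuffer = function (requested) {
    var buffer = this.buffer;
    // No null test on the fast path: the shared buffer has byteLength 0.
    if (requested <= buffer.byteLength) {
      return buffer;
    }
    var size = 512;
    while (size < requested) {
      size *= 2;
    }
    var buffer2 = new Uint8Array(size);
    buffer2.set(buffer); // copy the old contents into the larger buffer
    return (this.buffer = buffer2);
  };

  return SketchDecodeStream;
})();
```

Streams that never grow keep pointing at the same empty array, so no per-stream buffer is allocated for them.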
fkaelberer
fc73e2e173 use getBytes() instead of looping over getByte() 2014-06-27 09:09:54 +02:00
Tim van der Meij
9c072a5d4b Renames concatenateToArray to appendToArray 2014-06-16 22:10:10 +02:00
Yury Delendik
cff2c3afc1 Merge pull request #4892 from yurydelendik/issue4890
Fixes masked JPEG image
2014-06-10 09:16:12 -05:00
Fabian Lange
22a0e7fe65 Optimization for FlateStream_getCode, making more pdfs parsable.
This commit cleans up the FlateStream_getCode method, and removes a few error
conditions.
Previously it would fail if codeSize was less than maxLen when the end of the
stream was reached. However, in the document linked below there is a sub-stream
(the one starting at pos 337) which has maxLen set to 11, but actually
contains only 10. After breaking, the sanity check still applies, and in this
case it passes, validating codeSize(10)==codeLen(10).

 http://www.cafeculture.com/wp-content/uploads/2014/03/V-CM-BR-086-04002-1346-0258-GP-Brazil-Fazenda-Cafe-Cambara-Terra-Preta-Microlot-Sample-0460-13-Pulped-Natural-60Kg.pdf
2014-06-09 20:55:31 +02:00
Yury Delendik
6b411b559d Fixes masked JPEG image 2014-06-04 15:53:46 -05:00
Yury Delendik
6235e3a61c Adds color components decoding to the JPEG 2014-06-03 08:51:57 -05:00
Jonas Jenwald
ea0453f106 Add isEmpty method to Stream, DecodeStream and ChunkedStream 2014-05-18 21:41:05 +02:00
p01
330b99f428 Optimized stream.js / 9-10x faster DecodeStream_ensureBuffer 2014-05-14 17:06:39 +02:00
Thorben Bochenek
e8f0700bfa Move the colour conversion to jpg.js
Benchmarking shows that this improves performance for the invitation document
from https://github.com/mozilla/pdf.js/issues/3809 by 35%
2014-04-24 15:07:12 +02:00
fkaelberer
b06c10cbbd rename getUint32 to getInt32 and collect readInt*() in util.js 2014-04-16 21:31:16 +02:00
fkaelberer
04602c8a5e Less copying in the JPX coder, merged and rebased 2014-04-16 10:40:04 +02:00
Rob Wu
2e97c0d085 Remove some unused variables from src/
Only obviously useless, local variables have been removed.
2014-04-15 17:10:23 +02:00
Tim van der Meij
df91acf239 Fixes lint warning W004 in src/core 2014-04-11 00:41:08 +02:00
Yury Delendik
31f081ae17 Doesn't traverse cyclic references in Dict.getAll; reduces empty-Dict garbage 2014-03-26 09:07:38 -05:00
Jonas Jenwald
6883362a84 Fix coding style in src/core/stream.js 2014-03-22 21:21:01 +01:00
Nicholas Nethercote
6a75e45309 Allocate fewer objects when parsing 2 and 4 byte chunks.
This is achieved by adding getBytes2() and getBytes4() to streams, and by
changing int16() and int32() to take multiple scalar args instead of an array
arg.
2014-03-13 22:15:05 -07:00
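
A rough sketch of the scalar-argument half of 6a75e45309. The exact shape of getBytes2() and getBytes4() is not shown in this log, so `SketchStream` and `readInt16` below are hypothetical names used only to show how a caller can avoid a temporary array.

```javascript
// Scalar-argument helpers: no temporary array is allocated per call.
function int16(b0, b1) {
  return (b0 << 8) + b1;
}

function int32(b0, b1, b2, b3) {
  // Bitwise OR yields a signed 32-bit result.
  return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3;
}

// Hypothetical minimal byte stream, for illustration only.
function SketchStream(bytes) {
  this.bytes = bytes;
  this.pos = 0;
}

SketchStream.prototype.getByte = function () {
  return this.pos < this.bytes.length ? this.bytes[this.pos++] : -1;
};

// Hypothetical reader: combines two bytes without building an array.
// End-of-stream handling is deliberately left out of this sketch.
SketchStream.prototype.readInt16 = function () {
  var b0 = this.getByte();
  var b1 = this.getByte();
  return int16(b0, b1);
};
```

An array-taking int16(bytes) would force each caller to allocate a short-lived two-element array; passing scalars avoids that garbage.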
Nicholas Nethercote
b3024db677 Estimate the size of decoded streams in advance.
When decoding a stream, the decode buffer is often grown multiple times, its
byte size increasing like so: 512, 1024, 2048, etc. This patch estimates the
minimum size in advance (using the length of the encoded stream), often
allowing the smaller sizes to be skipped. It also renames numerous |length|
variables as |maybeLength| to make it clear that they can be |null|.

I measured this change on eight documents. This change reduces the cumulative
size of decode buffer allocations by 0--32%, with 10--20% being typical. This
reduces peak RSS by 10 or 20 MiB for several of them.
2014-03-13 02:06:58 -07:00
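
The doubling behaviour described in b3024db677 can be sketched as follows. This assumes a simplified stream: `SketchSizedStream`, `minBufferLength`, and the `allocations` log are hypothetical names, not the pdf.js implementation.

```javascript
// Hypothetical sketch of pre-sizing a decode buffer from an estimate.
function SketchSizedStream(maybeMinBufferLength) {
  this.buffer = new Uint8Array(0);
  this.minBufferLength = 512;
  if (maybeMinBufferLength) {
    // Round the estimate up to the next power of two, starting from 512.
    while (this.minBufferLength < maybeMinBufferLength) {
      this.minBufferLength *= 2;
    }
  }
  this.allocations = []; // sizes allocated so far, kept for demonstration
}

SketchSizedStream.prototype.ensureBuffer = function (requested) {
  var buffer = this.buffer;
  if (requested <= buffer.byteLength) {
    return buffer;
  }
  var size = this.minBufferLength;
  while (size < requested) {
    size *= 2;
  }
  var buffer2 = new Uint8Array(size);
  buffer2.set(buffer);
  this.allocations.push(size);
  return (this.buffer = buffer2);
};
```

With no estimate, requests of 100, 600, 1500 and 3000 bytes allocate four buffers in turn; with an estimate of 3000, the first allocation is already large enough for all of them.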
Nicholas Nethercote
ea17749b93 Don't get bytes eagerly when creating FlateStream objects. 2014-03-11 16:03:15 -07:00
Nicholas Nethercote
d0253c8291 Don't get bytes eagerly when creating {Jpeg,Jpx,Jbig2}Stream objects.
This avoids lots of unnecessary work when such streams are referred to via
fetch(), and so their bytes aren't subsequently read. This is a large
performance win on some files.
2014-03-11 16:03:15 -07:00
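
A small sketch of the lazy approach in ea17749b93 and d0253c8291, assuming a source object with a getBytes() method. `CountingSource` and `LazySketchStream` are hypothetical, and the "decoding" here is just a copy.

```javascript
// Hypothetical source that counts how often its bytes are read.
function CountingSource(bytes) {
  this.bytes = bytes;
  this.reads = 0;
}

CountingSource.prototype.getBytes = function () {
  this.reads++;
  return this.bytes;
};

function LazySketchStream(source) {
  // Only remember the source; do not call source.getBytes() here.
  this.source = source;
  this.buffer = null;
}

LazySketchStream.prototype.ensureBuffer = function () {
  if (this.buffer === null) {
    // Placeholder for real decoding: copy the bytes on first use.
    this.buffer = new Uint8Array(this.source.getBytes());
  }
  return this.buffer;
};
```

A stream that is created but never read costs nothing beyond the object itself, which is the case the commit message describes for streams reached via fetch().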
fkaelberer
6755ea70b0 Fix infinite loop in DecodeStream_ensureBuffer() 2014-03-06 10:31:18 +01:00
Yury Delendik
f46942758f Merge pull request #4382 from nnethercote/off-by-one
Avoid extra allocations in ensureBuffer() caused by an off-by-one error.
2014-03-04 22:27:33 -06:00
Nicholas Nethercote
fe8b6b0950 Remove FakeStream. 2014-03-04 18:07:15 -08:00
Nicholas Nethercote
64431a9909 Avoid extra allocations in ensureBuffer() caused by an off-by-one error. 2014-03-03 18:03:48 -08:00
Nicholas Nethercote
33dd1b0c3c Remove the unnecessary this.buf in CCITTFaxStream. 2014-02-24 16:45:18 -08:00
Brendan Dahl
2e7c71c75e Merge pull request #4011 from Rob--W/issue-3885
Set eof to true at the end of a FlateStream
2014-01-07 10:52:37 -08:00
Jonas Jenwald
e6c805490b [JBIG2] Fix getting decodeParms when it's an array 2013-12-19 20:23:58 +01:00
Rob Wu
43847d7ff8 Set eof to true at the end of a FlateStream
At the initialization of `Lexer_getObj` (in `parser.js`), there's a loop
that skips whitespace and breaks out whenever EOF is encountered.
(https://github.com/mozilla/pdf.js/blob/88ec2bd1a/src/core/parser.js#L586-L599)

Whenever the current character is not a whitespace character,
`ch = this.nextChar();` is used to find the next character
(using `return this.currentChar = this.stream.getByte()`).

The aforementioned `getByte` method retrieves the next byte using
(https://github.com/mozilla/pdf.js/blob/88ec2bd1a/src/core/stream.js#L122-L128)

      var pos = this.pos;
      while (this.bufferLength <= pos) {
        if (this.eof)
          return -1;
        this.readBlock();
      }
      return this.buffer[this.pos++];

This piece of code relies on this.eof to detect whether the last character
has been read. When the stream is a `FlateStream`, and the end of the stream
has been reached, then **`this.eof` is not set to `true`**, because this check
is done inside a loop that does not occur when the read block size is zero:
(https://github.com/mozilla/pdf.js/blob/88ec2bd1ac/src/core/stream.js#L511-L517)

      for (var n = bufferLength; n < end; ++n) {
        if (typeof (b = bytes[bytesPos++]) == 'undefined') {
          this.eof = true;
          break;
        }
        buffer[n] = b;
      }

This commit fixes the issue by setting this.eof to true whenever the loop is not
going to run (i.e. when bufferLength === end, i.e. blockLen === 0).
2013-12-19 18:37:39 +01:00
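
The fix described in 43847d7ff8 can be sketched with a simplified stream. `SketchBlockStream` and its list of block lengths are hypothetical; a real FlateStream reads the block length from the compressed data.

```javascript
// Hypothetical stand-in for the block-copy path discussed above.
function SketchBlockStream(bytes, blockLens) {
  this.bytes = bytes;
  this.bytesPos = 0;
  this.blockLens = blockLens.slice(); // lengths of successive blocks
  this.buffer = new Uint8Array(bytes.length);
  this.bufferLength = 0;
  this.pos = 0;
  this.eof = false;
}

SketchBlockStream.prototype.readBlock = function () {
  var blockLen = this.blockLens.length > 0 ? this.blockLens.shift() : 0;
  var bufferLength = this.bufferLength;
  var end = bufferLength + blockLen;
  if (blockLen === 0) {
    // The fix: the copy loop below will not run, so flag EOF here.
    this.eof = true;
  }
  for (var n = bufferLength; n < end; ++n) {
    var b = this.bytes[this.bytesPos++];
    if (typeof b === 'undefined') {
      this.eof = true;
      break;
    }
    this.buffer[n] = b;
  }
  this.bufferLength = n;
};

SketchBlockStream.prototype.getByte = function () {
  var pos = this.pos;
  while (this.bufferLength <= pos) {
    if (this.eof) {
      return -1;
    }
    this.readBlock();
  }
  return this.buffer[this.pos++];
};
```

Without the `blockLen === 0` branch, a trailing zero-length block would leave `eof` false and the `while` loop in getByte() would never terminate.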
Yury Delendik
c8af2565f1 Uses blob URL instead of data when possible 2013-11-14 15:21:42 -08:00
Brendan Dahl
5ecce4996b Split files into worker and main thread pieces. 2013-08-12 10:48:06 -07:00