Commit Graph

121 Commits

Author SHA1 Message Date
Jonas Jenwald
ea71d23f74 Fix a stupid spelling error in the ASCII85Decode name in Parser.makeInlineImage (issue 8613)
This is a trivial follow-up to PR 5383, and it's a bit strange that this has been wrong since late 2014 without anyone noticing (maybe because inline images aren't too common).
So, apparently code works better if you actually spell correctly, who knew ;-)

Fixes 8613.
2017-07-05 19:43:09 +02:00
Jonas Jenwald
a8c87f8019 Fix inconsistent spacing and trailing commas in objects in src/core/ files, so we can enable the comma-dangle and object-curly-spacing ESLint rules later on
*Unfortunately this patch is fairly big, even though it only covers the `src/core` folder, but splitting it even further seemed difficult.*

http://eslint.org/docs/rules/comma-dangle
http://eslint.org/docs/rules/object-curly-spacing

Given that we currently have quite inconsistent object formatting, fixing this in *one* big patch probably wouldn't be feasible (since I cannot imagine anyone wanting to review that); hence I've opted to try and do this piecewise instead.

Please note: This patch was created automatically, using the ESLint --fix command line option. In a couple of places this caused lines to become too long, and I've fixed those manually; please refer to the interdiff below for the only hand-edits in this patch.

```diff
diff --git a/src/core/evaluator.js b/src/core/evaluator.js
index abab9027..dcd3594b 100644
--- a/src/core/evaluator.js
+++ b/src/core/evaluator.js
@@ -2785,7 +2785,8 @@ var EvaluatorPreprocessor = (function EvaluatorPreprocessorClosure() {
     t['Tz'] = { id: OPS.setHScale, numArgs: 1, variableArgs: false, };
     t['TL'] = { id: OPS.setLeading, numArgs: 1, variableArgs: false, };
     t['Tf'] = { id: OPS.setFont, numArgs: 2, variableArgs: false, };
-    t['Tr'] = { id: OPS.setTextRenderingMode, numArgs: 1, variableArgs: false, };
+    t['Tr'] = { id: OPS.setTextRenderingMode, numArgs: 1,
+                variableArgs: false, };
     t['Ts'] = { id: OPS.setTextRise, numArgs: 1, variableArgs: false, };
     t['Td'] = { id: OPS.moveText, numArgs: 2, variableArgs: false, };
     t['TD'] = { id: OPS.setLeadingMoveText, numArgs: 2, variableArgs: false, };
diff --git a/src/core/jbig2.js b/src/core/jbig2.js
index 5a17d482..71671541 100644
--- a/src/core/jbig2.js
+++ b/src/core/jbig2.js
@@ -123,19 +123,22 @@ var Jbig2Image = (function Jbig2ImageClosure() {
      { x: -1, y: -1, }, { x: 0, y: -1, }, { x: 1, y: -1, }, { x: -2, y: 0, },
      { x: -1, y: 0, }],
     [{ x: -3, y: -1, }, { x: -2, y: -1, }, { x: -1, y: -1, }, { x: 0, y: -1, },
-     { x: 1, y: -1, }, { x: -4, y: 0, }, { x: -3, y: 0, }, { x: -2, y: 0, }, { x: -1, y: 0, }]
+     { x: 1, y: -1, }, { x: -4, y: 0, }, { x: -3, y: 0, }, { x: -2, y: 0, },
+     { x: -1, y: 0, }]
   ];

   var RefinementTemplates = [
     {
       coding: [{ x: 0, y: -1, }, { x: 1, y: -1, }, { x: -1, y: 0, }],
-      reference: [{ x: 0, y: -1, }, { x: 1, y: -1, }, { x: -1, y: 0, }, { x: 0, y: 0, },
-                  { x: 1, y: 0, }, { x: -1, y: 1, }, { x: 0, y: 1, }, { x: 1, y: 1, }],
+      reference: [{ x: 0, y: -1, }, { x: 1, y: -1, }, { x: -1, y: 0, },
+                  { x: 0, y: 0, }, { x: 1, y: 0, }, { x: -1, y: 1, },
+                  { x: 0, y: 1, }, { x: 1, y: 1, }],
     },
     {
-      coding: [{ x: -1, y: -1, }, { x: 0, y: -1, }, { x: 1, y: -1, }, { x: -1, y: 0, }],
-      reference: [{ x: 0, y: -1, }, { x: -1, y: 0, }, { x: 0, y: 0, }, { x: 1, y: 0, },
-                  { x: 0, y: 1, }, { x: 1, y: 1, }],
+      coding: [{ x: -1, y: -1, }, { x: 0, y: -1, }, { x: 1, y: -1, },
+               { x: -1, y: 0, }],
+      reference: [{ x: 0, y: -1, }, { x: -1, y: 0, }, { x: 0, y: 0, },
+                  { x: 1, y: 0, }, { x: 0, y: 1, }, { x: 1, y: 1, }],
     }
   ];
```
2017-06-02 11:20:19 +02:00
Jonas Jenwald
982b6aa65b Convert the files in the /src/core folder to ES6 modules
Please note that the `glyphlist.js` and `unicode.js` files are converted to CommonJS modules instead, since Babel cannot handle files that large and they are thus excluded from transpilation.
2017-05-30 22:06:21 +02:00
Jonas Jenwald
40feca12c1 Ignore line-breaks between operator and digit in Lexer.getNumber
This is consistent with the behaviour in Adobe Reader (and PDFium), and it fixes the display of page 30 in https://bug1354114.bmoattachments.org/attachment.cgi?id=8855457 (taken from https://bugzilla.mozilla.org/show_bug.cgi?id=1354114).

The patch also makes the `error` message for invalid numbers slightly more useful, by including the charCode as well. (Having that information available would have reduced the time spent on debugging the PDF file above.)
2017-05-02 20:59:42 +02:00
Jonas Jenwald
afc74b0178 Enable the object-shorthand ESLint rule in src/shared
Please see http://eslint.org/docs/rules/object-shorthand.

For the most part, these changes are of the search-and-replace kind, and the previously enabled `no-undef` rule should complement the tests in helping ensure that no stupid errors crept into to the patch.
2017-04-27 17:29:40 +02:00
Jonas Jenwald
23c62cc321 Consume the current character when encountering illegal characters in Lexer.getObject, in order to prevent infinite loops during reading of streams (issue 8061)
*Please note:* The rendering of the PDF file in issue 8061 first regressed in PR 7039, and then PR 7493 exacerbated the problem even further by causing an infinite loop.

In this particular case, when errors were encountered inside of the `Lexer.getObject` method *itself*, we didn't advance the stream position. This thus caused an inifinite loop in `parseCMap`, since the exact same character was then parsed over and over again.

Fixes 8061.
2017-02-11 19:32:48 +01:00
Jonas Jenwald
50c2856097 Move EOF/isEOF from core/parser.js to core/primitives.js
Given the nature of `EOF` and `isEOF`, it seems to me that they really ought to be placed in `core/primitives.js` instead.

In general, it doesn't seem great to have to depend on the entire `core/parser.js` file for such simple primitives/helper functions.
In particular, while `core/ps_parser.js` is completely separate from `core/parser.js` with regards to its function, it still depends on the latter for just *one* primitive.

Note that compared to e.g. PR 7389, this will not reduce the number of dependencies for `core/ps_parser`, however the new dependency IMHO makes more sense.
2017-01-27 13:37:48 +01:00
Jonas Jenwald
2f3805efbc Switch to using ESLint, instead of JSHint, for linting
*Please note that most of the necessary code adjustments were made in PR 7890.*

ESLint has a number of advantageous properties, compared to JSHint. Among those are:
 - The ability to find subtle bugs, thanks to more rules (e.g. PR 7881).
 - Much more customizable in general, and many rules allow fine-tuned behaviour rather than the just the on/off rules in JSHint.
 - Many more rules that can help developers avoid bugs, and a lot of rules that can be used to enforce a consistent coding style. The latter should be particularily useful for new contributors (and reduce the amount of stylistic review comments necessary).
 - The ability to easily specify exactly what rules to use/not to use, as opposed to JSHint which has a default set. *Note:* in future JSHint version some of the rules we depend on will be removed, according to warnings in http://jshint.com/docs/options/, so we wouldn't be able to update without losing lint coverage.
 - More easily disable one, or more, rules temporarily. In JSHint this requires using a numeric code, which isn't very user friendly, whereas in ESLint the rule name is simply used instead.

By default there's no rules enabled in ESLint, but there are some default rule sets available. However, to prevent linting failures if we update ESLint in the future, it seemed easier to just explicitly specify what rules we want.
Obviously this makes the ESLint config file somewhat bigger than the old JSHint config file, but given how rarely that one has been updated over the years I don't think that matters too much.

I've tried, to the best of my ability, to ensure that we enable the same rules for ESLint that we had for JSHint. Furthermore, I've also enabled a number of rules that seemed to make sense, both to catch possible errors *and* various style guide violations.

Despite the ESLint README claiming that it's slower that JSHint, https://github.com/eslint/eslint#how-does-eslint-performance-compare-to-jshint, locally this patch actually reduces the runtime for `gulp` lint (by approximately 20-25%).

A couple of stylistic rules that would have been nice to enable, but where our code currently differs to much to make it feasible:
 - `comma-dangle`, controls trailing commas in Objects and Arrays (among others).
 - `object-curly-spacing`, controls spacing inside of Objects.
 - `spaced-comment`, used to enforce spaces after `//` and `/*. (This is made difficult by the fact that there's still some usage of the old preprocessor left.)

Rules that I indend to look into possibly enabling in follow-ups, if it seems to make sense: `no-else-return`, `no-lonely-if`, `brace-style` with the `allowSingleLine` parameter removed.

Useful links:
 - http://eslint.org/docs/user-guide/configuring
 - http://eslint.org/docs/rules/
2016-12-16 21:06:36 +01:00
Jonas Jenwald
28e50cfa21 Fix errors reported by the space-infix-ops ESLint rule
http://eslint.org/docs/rules/space-infix-ops
2016-12-12 20:36:00 +01:00
Jonas Jenwald
b4ac6bd2f6 Ensure that we resolve indirect objects in Filter and DecodeParms arrays in parser.js
I've not actually, thus far, come across a PDF file that this patch fixes. However, given the string of recent patches that has fixed issues with indirect objects in arrays, I think that it makes sense to proactively avoid any issues in this code.
2016-12-08 11:55:08 +01:00
Jonas Jenwald
c8f83d6487 Let Parser_makeFilter pass in the DecodeParms data to various image Streams, instead of re-fetching it in various [...]Stream.prototype.ensureBuffer methods
In `Parser_filter` the `DecodeParms` data is fetched and passed to `Parser_makeFilter`, where we also make sure that a `Ref` is resolved to a direct object.
We can thus pass this along to the various image `Stream` constructors, to avoid the current situation where we lookup/resolve data that is already available.
Note also that we currently do *not* handle the case where `DecodeParms` is an Array entirely correct in the various image `Stream`s, and this patch fixes that for free.
2016-10-15 12:09:51 +02:00
Yury Delendik
7b2a9ee4e0 Merge pull request #7670 from Snuffleupagus/Parser_makeFilter-maybeLength
Only skip parsing a stream in `Parser_makeFilter` when we know for sure that it is empty (PR 6372 follow-up)
2016-10-05 10:38:12 -05:00
Jonas Jenwald
116ba19dd9 Respect the 'ColorTransform' entry in the image dictionary when decoding JPEG images (bug 956965, issue 6574)
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=956965.
Fixes 6574.
2016-09-26 21:55:43 +02:00
Jonas Jenwald
a22f0ae820 Only skip parsing a stream in Parser_makeFilter when we know for sure that it is empty (PR 6372 follow-up)
For PDF files with multiple `/Filter`s, where the `/Length` entry is zero, we fail to render the file correctly. The reason is that `maybeLength` is `null` for the every filter except the first, and `!maybeLength` is thus truthy.
Hence it seems that we should completely ignore the `/Length` entry and also explicitly check `maybeLength === 0`.

Note that I've not (yet) come across a PDF file with this issue in the wild, but given all the stupid things PDF generators do I wouldn't be surprised if such a file actually exists. In order to prevent a possible future bug, I'm submitting this patch which includes a hand-edited PDF file that we currently cannot render correctly (but e.g. Adobe Reader can).
2016-09-25 12:40:15 +02:00
Jonas Jenwald
544d29f5cb Add a recoveryMode that suppresses errors from the Parser, and utilize it when searching for the main trailer in XRef_indexObjects (bug 1250079)
Instead of having `Parser_getObj` fail unconditionally for the referenced PDF file, this patch attempts to let searching for the main trailer continue even if there are errors.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1250079.
2016-08-17 12:37:35 +02:00
klemens
6f03f62327 trivial spelling fixes 2016-07-17 14:33:41 +02:00
Jonas Jenwald
a36a946976 Move the isSpace utility function from core/parser.js to shared/util.js
Currently the `isSpace` utility function is a member of `Lexer`, which seems suboptimal, given that it's placed in `core/parser.js`. In practice, this means that in a number of `core/*.js` files we thus have an *otherwise* completely unnecessary dependency on `core/parser.js` for a one-line function.

Instead, this patch moves `isSpace` into `shared/util.js` which seems more appropriate for this kind of utility function. Not to mention that since all the affected `core/*.js` files already depends on `shared/util.js`, this doesn't incur any more file dependencies.
2016-06-06 09:11:33 +02:00
Yury Delendik
6038c236b2 Removes core/stream circular dependency on core/parser. 2016-03-22 14:06:01 -05:00
Yury Delendik
825a2225ab Merge pull request #6915 from yurydelendik/lookuptables
Refactor lookup hash tables/objects
2016-01-28 15:01:06 -06:00
Yury Delendik
2edf2792dc Replaces literal {} created lookup tables with Object.create 2016-01-28 12:18:38 -06:00
Jonas Jenwald
15ce96a6eb Prevent failures in the "scanning for endstream" code, in Parser_makeStream, by handling the case where 'endstream' is split between contiguous chunks (issue 1536) 2016-01-26 09:03:51 +01:00
Yury Delendik
6b60c8f4db Adds UMD headers to core, display and shared files. 2015-12-15 13:24:39 -06:00
Jonas Jenwald
995e1a45b8 Ensure that Lexer_getName does not fail if a Name contains in invalid usage of the NUMBER SIGN (#) (issue 6692)
*This is a regression from PR 3424.*

The PDF file in the referenced issue is using `Type3` fonts. In one of those, the `/CharProcs` dictionary contains an entry with the name `/#`. Before the changes to `Lexer_getName` in PR 3424, we were allowing certain invalid `Name` patterns containing the NUMBER SIGN (#).

It's unfortunate that this has been broken for close to two and a half years before the bug surfaced, but it should at least indicate that this is not a widespread issue.

Fixes 6692.
2015-11-28 11:59:09 +01:00
Manas
a2ba1b8189 Uses editorconfig to maintain consistent coding styles
Removes the following as they unnecessary
/* -*- Mode: Java; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
/* vim: set shiftwidth=2 tabstop=2 autoindent cindent expandtab: */
2015-11-14 07:32:18 +05:30
Jonas Jenwald
b1d148a4aa Remove Parser_fetchIfRef since it's obsolete
This code was added in PR 1214, but was made obsolete by PRs 1488/1493. Prior to the latter ones, `Dict_get` retured the raw objects. However, afterwards (and currently) `Dict_get` now resolves indirect objects, which makes `Parser_fetchIfRef` redundant.

*Potential risks with this patch:*
This patch passes all tests locally, but there's a *small* possibility that it could break some weird PDF files.
In the current code, wrapping `Dict_get` inside `Parser_fetchIfRef` will potentially mean two back-to-back call of `XRef_fetch`, if a reference points directly to another reference. I'm not sure if this can actually happen in practice, and I'd think that if that were the case we'd already have run into it elsewhere in the code-base, given that `Parser` is the only place where we try to "double" resolve references.
2015-09-02 23:11:00 +02:00
Tim van der Meij
b42b894570 Merge pull request #6386 from Snuffleupagus/Parser_makeFilter-warn-on-empty-stream
Add a warning when we encounter an empty stream in `Parser_makeFilter`
2015-08-30 23:14:22 +02:00
Rob Wu
582573b96b Merge pull request #6358 from Snuffleupagus/Parser_tryShift-missingDataException
Don't catch `MissingDataException` in `Parser_tryShift`
2015-08-27 14:46:24 +02:00
Jonas Jenwald
f814fdc215 Add a warning when we encounter an empty stream in Parser_makeFilter
Having a warning here would have meant that issue 6360 could have been solved in approximately five minutes, instead of an hour. To avoid that happening again, this patch adds a warning whenever we treat a stream as empty.
2015-08-26 20:14:30 +02:00
Jonas Jenwald
5128603f64 Also check maybeLength when deciding if a stream is empty in Parser_makeFilter (issue 6360)
The problem with the PDF files in the issue, besides the obviously broken XRef tables which we're able to recover from, is that many/most of the streams have Dictionaries where the `Length` entry is set to `0`. This causes us to return `NullStream`, instead of the appropriate one in `Parser_makeFilter`.

Fixes 6360.
2015-08-20 23:04:18 +02:00
Jonas Jenwald
8c3b8238ac Don't catch MissingDataException in Parser_tryShift
I overlooked this while reviewing PR 6197, but I don't think that we should be catching that particular kind of exception here; hence this patch.
2015-08-16 11:35:54 +02:00
Jonas Jenwald
c718d1ab10 Ignore double negative in Lexer_getNumber (issue 6218)
Basic mathematics would suggest that a double negative should always become positive, but it appears that Adobe Reader simply ignores that case. Hence I think that it makes sense for us to do the same.

Fixes 6218.
2015-07-16 12:11:49 +02:00
Rob Wu
e211c25f06 Improve robustness of stream parser (invalid length)
When the parser finds a stream, it retrieves the Length from the stream
dictionary and advances the lexer to the offset as specified in Length.
If this Length is incorrect, the lexer could end up anywhere.

When the lexer gets in an invalid state, it could throw errors. For
example, in issue 6108, the lexer ends up inside the stream data. This
stream has the ASCIIHexDecode filter, so all characters are made up from
ASCII characters, and the lexer interprets it as a command token. Tokens
cannot be longer than 127 bytes, so eventually 128 bytes are consumed
and the lexer throws "Command token too long" error.

Another possible error is "Illegal character: 41" when the lexer happens
to end up at a ')' due to the length mismatch.

These problems are solved by catching lexer errors and recovering the
parser via the existing stream length detection branch.
2015-07-11 20:12:49 +02:00
Rob Wu
456ad438d8 Issue a warning instead of an error for long Names
The PDF specification (cited below) specifies a maximum length of a name
in bytes as a minimal architectural limit. This means that PDF *writers*
should not create names that exceed 127 bytes.

It does not forbid PDF *readers* to accept such names though. These
names are only used internally to link PDF objects to other objects. For
these use cases, the lengths of the names do not really matter. Hence I
have changed the implementation to not treat long names as errors, but
warnings.

> (7.3.5) The length of a name shall be subject to an implementation
> limit; see Annex C.
>
> (Annex C.2) Table C.1 describes the minimum architectural limits that
> should be accommodated by conforming readers running on 32-bit
> machines. Because conforming readers may be subject to these limits,
> conforming writers producing PDF files should remain within them.
>
> (Table C.1) name 127 "Maximum length of a name, in bytes."

http://adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf
2015-07-10 16:10:24 +02:00
Jonas Jenwald
eac168f3cc Refactor searching for end of inline (EI) JPEG image streams
This patch changes searching for EI image streams to rely on the EOI (end-of-image) marker for DCTDecode filters (i.e. JPEG images).
2015-01-10 23:55:55 +01:00
Jonas Jenwald
184880a751 Fix searching for end of inline (EI) images with ASCII85Decode filters (bug 1077808)
This patch changes searching for the end of inline image streams to rely on the EOD marker for the filters: ASCII85Decode and ASCIIHexDecode.
2014-12-15 18:48:29 +01:00
Yury Delendik
f5df30f967 Merge pull request #5445 from CodingFabian/fixImageCachingInParser
Fixes caching of inline images during parsing.
2014-12-15 10:51:23 -06:00
Jonas Jenwald
3e1b5216ac Refactor searching for the SOI marker of inline JPEG image streams 2014-12-05 17:24:27 +01:00
Fabian Lange
970c048d50 fixes caching of inline images during parsing.
As described in #5444, the evaluator will perform identity checking of
paintImageMaskXObjects to decide if it can use
paintImageMaskXObjectRepeat instead of paintImageMaskXObjectGroup.

This can only ever work if the entry is a cache hit. However the
previous caching implementation was doing a lazy caching, which would
only consider a image cache worthy if it is repeated.
Only then the repeated instance would be cached.
As a result of this the sequence of identical images A1 A2 A3 A4 would
be seen as A1 A2 A2 A2 by the evaluator, which prevents using the
"repeat" optimization. Also only the last encountered image is cached,
so A1 B1 A2 B2, would stay A1 B1 A2 B2.

The new implementation drops the "lazy" init of the cache. The threshold
for enabling an image to be cached is rather small, so the potential waste
in storage and adler32 calculation is rather low. It also caches any
eligible image by its adler32.

The two example from above would now be A1 A1 A1 A1 and A1 B1 A1 B1
which not only saves temporary storage, but also prevents computing
identical masks over and over again (which is the main performance impact
of #2618)
2014-10-28 15:39:41 +01:00
Jonas Jenwald
2003d83ea6 Fix loading of inline JPEG images 2014-09-11 16:42:51 +02:00
Jonas Jenwald
d1974eae34 Add peekByte method to Stream, DecodeStream and ChunkedStream 2014-09-11 16:42:41 +02:00
Yury Delendik
aa8d3d98f8 Fetches params in makeFilter 2014-09-09 08:29:31 -05:00
Rob Wu
07a4837763 CCITTFaxStream parser: resolve xref if needed
Fixes #5243
2014-08-31 11:03:24 +02:00
Nicholas Nethercote
ffae848f4e Reduce ASCII checks in makeInlineImage().
makeInlineImage() has a "are the next five chars ASCII?" check which is
run after an "EI" sequence has been found. This check involves the
creation of a new object because peekBytes() calls subarray().

Unfortunately, the check is currently run on whitespace chars even when
an "EI" sequence has not yet been found, i.e. when it's not needed. For
the PDF in #2618, there are over 820,000 such checks.

This change reworks the relevant loop so that the check is only done
once an "EI" sequence has been seen. This reduces the number of checks
to 157,000, and speeds up rendering by somewhere between 2% and 7% (the
measurements are noisy).
2014-08-14 16:20:58 -07:00
Jonas Jenwald
7fa204c805 Add strict equalities in src/core/parser.js 2014-08-02 17:37:24 +02:00
Jonas Jenwald
a5c98aab36 Re-factor parsing of the Linearization dictionary 2014-07-27 12:56:09 +02:00
Yury Delendik
fbdab2c7c5 Not ignoring MissingDataException exception. 2014-06-18 18:24:54 -05:00
Jonas Jenwald
ab67e1c272 Let Parser_makeFilter return NullStream when an invalid stream is encountered (issue 3417) 2014-06-17 12:03:34 +02:00
Yury Delendik
0cd28ebfa3 Telemetry for used stream and font types 2014-06-16 16:41:04 -05:00
Yury Delendik
b2d8e73d54 Merge pull request #4895 from p01/Small_optimizations_1
Small optimizations 1
2014-06-10 10:09:12 -05:00
p01
37c9765ab4 Optimized Lexer_getObj 2x faster 2014-06-10 12:37:36 +02:00
Jonas Jenwald
26bbcedcae Prevent infinite loop when scanning for endstream (bug 1020226) 2014-06-09 22:42:35 +02:00
fkaelberer
f88118dbf9 small optimizations in parser.getObj(), lexer.getObj() 2014-05-23 09:25:36 +02:00
p01
7b68737baa Strict isEOF / ~22% faster on issue2813, from 16.5s to 13.5s 2014-05-20 12:39:58 +02:00
Rob Wu
2e97c0d085 Remove some unused variables from src/
Only obviously useless, local variables have been removed.
2014-04-15 17:10:23 +02:00
Tim van der Meij
df91acf239 Fixes lint warning W004 in src/core 2014-04-11 00:41:08 +02:00
Yury Delendik
31f081ae17 Doesn't traverse cyclic references in Dict.getAll; reduces empty-Dict garbage 2014-03-26 09:07:38 -05:00
Brendan Dahl
1416eca164 Merge pull request #4493 from yurydelendik/issue4491
Fixes ignoring of the escaped CR LF
2014-03-20 14:57:24 -07:00
Tim van der Meij
284288f1d0 Making src/core/{image,obj,parser}.js adhere to the style guide 2014-03-20 20:28:22 +01:00
Yury Delendik
20a91bcdbf Fixes ignoring of the escaped CR LF 2014-03-20 11:50:12 -05:00
Yury Delendik
257898b359 Caching inlined mask images 2014-03-13 11:01:34 -05:00
Nicholas Nethercote
b3024db677 Estimate the size of decoded streams in advance.
When decoding a stream, the decode buffer is often grown multiple times, its
byte size increasing like so: 512, 1024, 2048, etc. This patch estimates the
minimum size in advance (using the length of the encoded stream), often
allowing the smaller sizes to be skipped. It also renames numerous |length|
variables as |maybeLength| to make it clear that they can be |null|.

I measured this change on eight documents. This change reduces the cumulative
size of decode buffer allocations by 0--32%, with 10--20% being typical. This
reduces peak RSS by 10 or 20 MiB for several of them.
2014-03-13 02:06:58 -07:00
Nicholas Nethercote
d0253c8291 Don't get bytes eagerly when creating {Jpeg,Jpx,Jbig2}Stream objects.
This avoids lots of unnecessary work when such streams are referred to via
fetch(), and so their bytes aren't subsequently read. This is a large
performance win on some files.
2014-03-11 16:03:15 -07:00
Tim van der Meij
3df8f89bd4 Fixes off-by-one error when finding missing endstream 2014-03-06 23:57:27 +01:00
Nicholas Nethercote
fdb7c218da Use a cache to minimize the number of Name objects. 2014-02-27 20:41:03 -08:00
Ophir LOJKINE
4a66eccedc Rewrite Lexer_getNumber.
Now, it computes the numbers with only basic arithmetic operations, without first creating a string and then calling parseFloat.
The new function doesn't behave exactly the same as the old one.
In particular, the old behaviour was that when there was a number immediatly followed by an 'E', the 'E' was consumed. Now it's not. It allows for "glued" numbers and operators.
Also, the new function is faster and consumes less memory.
2014-02-01 21:46:09 +01:00
Nicholas Nethercote
164d7a6e15 Don't create a string when lexing all-digit integers. 2014-01-29 18:22:09 -08:00
Nicholas Nethercote
b64cca0bef When lexing numbers, look for digits first. 2014-01-29 18:20:53 -08:00
Nicholas Nethercote
c1ef7e4d63 Use Array.join instead of += to build up strings in the Lexer. 2014-01-29 18:19:58 -08:00
Yury Delendik
09f8f951c8 Extracts evaluator preprocessor and refactor text extraction 2014-01-17 07:16:52 -06:00
Jonas
628f4aaf81 Enable loading of PDFs with undefined or missing stream lengths 2013-08-16 16:32:40 +02:00
Brendan Dahl
5ecce4996b Split files into worker and main thread pieces. 2013-08-12 10:48:06 -07:00