Commit Graph

11333 Commits

Author SHA1 Message Date
Tim van der Meij
7307c60407
Merge pull request #10420 from timvandermeij/misc
Update translations/packages and improve documentation
2019-01-05 15:27:06 +01:00
Tim van der Meij
61dcc41a3c
Clarify that gulp dist-install should be used for the AcroForms example
Fixes #10333.
2019-01-05 15:20:50 +01:00
Tim van der Meij
f32dcbc089
Improve the file layout overview on the website
Mention only the relevant files/folders and update the overview to match
the current file trees.

Fixes #10384.
2019-01-05 15:20:50 +01:00
Tim van der Meij
ca04a397bb
Update packages 2019-01-05 14:27:47 +01:00
Tim van der Meij
825aceb648
Update translations 2019-01-05 14:24:44 +01:00
Tim van der Meij
b81984f0cb
Merge pull request #10417 from brendandahl/metric-length
Fix reading number of HTMX metrics.
2019-01-05 13:35:16 +01:00
Tim van der Meij
a0eb5cf9d5
Merge pull request #10412 from Snuffleupagus/issue-10410
Prevent errors, in `SimpleXMLParser.onEndElement`, when the stack has already been completely parsed (issue 10410)
2019-01-05 12:56:43 +01:00
Jonas Jenwald
e8f4b47d59 Prevent errors, in SimpleXMLParser.onEndElement, when the stack has already been completely parsed (issue 10410)
The error was triggered for a particular set of metadata, where an end tag was encountered without the corresponding begin tag being present in the data.
(The patch also fixes a minor oversight, from a recent PR, in the `SimpleDOMNode.nextSibling` method.)
2019-01-05 11:15:34 +01:00
Brendan Dahl
32eace043b Fix reading number of HTMX metrics.
The length of the HHEA table can be incorrect, so it is better to
read the number of metrics offset from beginning of table instead.
2019-01-04 15:13:13 -08:00
Tim van der Meij
b39ec7af96
Merge pull request #10408 from Snuffleupagus/issue-10407
Prevent errors, because of incorrect scope, in the `XMLParserBase._resolveEntities` method (issue 10407)
2019-01-04 23:45:26 +01:00
Tim van der Meij
3f9e9f0d88
Merge pull request #10411 from Snuffleupagus/issue-10385-2
Adjust how `AnnotationBorderStyle.setWidth` handles the input being a `Name` (issue 10385)
2019-01-04 23:42:08 +01:00
Jonas Jenwald
66fccd860b Adjust how AnnotationBorderStyle.setWidth handles the input being a Name (issue 10385)
In order to be consistent with the behaviour in Adobe Reader, the width will now always be set to zero when the input is a `Name`.
2019-01-04 10:38:10 +01:00
Jonas Jenwald
6cd9ff48f3 Prevent errors, because of incorrect scope, in the XMLParserBase._resolveEntities method (issue 10407) 2019-01-04 10:13:32 +01:00
Tim van der Meij
5a2bd9fc63
Merge pull request #10399 from MohammedEssehemy/master
migrate to canvas 2.x api
2019-01-03 23:32:56 +01:00
Tim van der Meij
2d00bb098b
Merge pull request #10404 from Snuffleupagus/issue-10401
Remove the `for ... of` loop from the `PDFDocument.fingerprint` getter (issue 10401)
2019-01-03 22:46:51 +01:00
Brendan Dahl
e2686db49b
Merge pull request #10277 from janpe2/cff-stems
Repair CFF fonts if stem hints are in wrong order
2019-01-03 10:30:43 -08:00
Jonas Jenwald
8c278530dd Remove the for ... of loop from the PDFDocument.fingerprint getter (issue 10401)
It appears that the `Symbol` polyfill doesn't work well in conjunction with `TypedArray`s, and that part of PR 10393 is thus reverted.
2019-01-03 11:17:45 +01:00
Mohammed Essehemy
f0e9df745c
migrate to canvas 2.x api 2019-01-02 01:10:07 +02:00
Tim van der Meij
1b84b2ed60
Merge pull request #10398 from Snuffleupagus/issue-10395
Prevent errors in various methods in `SimpleDOMNode` when the `childNodes` property is not defined (issue 10395)
2019-01-01 16:22:11 +01:00
Jonas Jenwald
d371d23382 Prevent errors in various methods in SimpleDOMNode when the childNodes property is not defined (issue 10395)
Given that the issue, as filed, is incomplete since no PDF file was provided for debugging, this patch is really the best that we can do here. *Please note:* This patch will *not* enable the Metadata to be successfully parsed, but it should at least prevent the errors.
2018-12-31 13:07:15 +01:00
Tim van der Meij
d8f201ea2a
Merge pull request #10397 from Snuffleupagus/issue-10385
Ensure that `AnnotationBorderStyle.setWidth` is able to handle the input being a `Name`, to correctly deal with corrupt PDF documents (issue 10385)
2018-12-31 12:58:28 +01:00
Tim van der Meij
2cdeb93b5f
Merge pull request #10396 from timvandermeij/optimizations
Optimizations to avoid intermediate string creation
2018-12-31 12:50:00 +01:00
Jonas Jenwald
76a9580aeb Ensure that AnnotationBorderStyle.setWidth is able to handle the input being a Name, to correctly deal with corrupt PDF documents (issue 10385) 2018-12-31 12:21:28 +01:00
Jonas Jenwald
15b3806937 Actually validate the input in AnnotationBorderStyle.setStyle 2018-12-31 12:15:15 +01:00
Tim van der Meij
5b57e69da2
Optimize CanvasGraphics.setFont to avoid intermediate string creation
This method creates quite a few intermediate strings on each call and
it's called often, even for smaller documents like the Tracemonkey
document. Scrolling from top to bottom in that document resulted in
14126 strings being created in this method. With this commit applied,
this is reduced to 2018 strings.
2018-12-30 14:58:32 +01:00
Tim van der Meij
95f9075565
Optimize TextLayerRenderTask._layoutText to avoid intermediate string creation
This method creates quite a few intermediate strings on each call and
it's called often, even for smaller documents like the Tracemonkey
document. Scrolling from top to bottom in that document resulted in
12936 strings being created in this method. With this commit applied,
this is reduced to 3610 strings.
2018-12-30 14:39:08 +01:00
Tim van der Meij
7c080584b6
Merge pull request #10393 from timvandermeij/document
Convert `src/core/document.js` to ES6 syntax
2018-12-30 14:23:38 +01:00
Tim van der Meij
d5e5d18430
Convert the PDFDocument class in src/core/document.js to ES6 syntax 2018-12-30 13:54:43 +01:00
Tim van der Meij
612fc9fcc2
Convert the Page class in src/core/document.js to ES6 syntax 2018-12-30 13:54:43 +01:00
Tim van der Meij
85363f4566
Merge pull request #10394 from timvandermeij/primitives-optimization
Optimize the `Ref` class in `src/core/primitives.js`
2018-12-30 12:30:09 +01:00
Tim van der Meij
aad27ff9a0
Optimize the Ref class in src/core/primitives.js
The `toString` method always creates two string objects (for the 'R'
character and for the `num` concatenation) and in the worst case
creates three string objects (one more for the `gen` concatenation).
For the Tracemonkey paper alone, this resulted in 12000 string
objects when scrolling from the top to the bottom of the document.
Since this is a hot function, it's worth minimizing the number of string
objects, especially for large documents, to reduce peak memory usage.

This commit refactors the `toString` method to always create only one
string object.
2018-12-29 17:48:41 +01:00
Tim van der Meij
e53877f372
Merge pull request #10392 from Snuffleupagus/checkFirstPage
Check that the first page can be successfully loaded, to try and ascertain the validity of the XRef table (issue 7496, issue 10326)
2018-12-29 15:13:19 +01:00
Jonas Jenwald
60bcce184e Check that the first page can be successfully loaded, to try and ascertain the validity of the XRef table (issue 7496, issue 10326)
For PDF documents with sufficiently broken XRef tables, it's usually quite obvious when you need to fallback to indexing the entire file. However, for certain kinds of corrupted PDF documents the XRef table will, for all intents and purposes, appear to be valid. It's not until you actually try to fetch various objects that things will start to break, which is the case in the referenced issues[1].

Since there's generally a real effort being in made PDF.js to load even corrupt PDF documents, this patch contains a suggested approach to attempt to do a bit more validation of the XRef table during the initial document loading phase.

Here the choice is made to attempt to load the *first* page, as a basic sanity check of the validity of the XRef table. Please note that attempting to load a more-or-less arbitrarily chosen object without any context of what it's supposed to be isn't a very useful, which is why this particular choice was made.
Obviously, just because the first page can be loaded successfully that doesn't guarantee that the *entire* XRef table is valid, however if even the first page fails to load you can be reasonably sure that the document is *not* valid[2].

Even though this patch won't cause any significant increase in the amount of parsing required during initial loading of the document[3], it will require loading of more data upfront which thus delays the initial `getDocument` call.
Whether or not this is a problem depends very much on what you actually measure, please consider the following examples:

```javascript
console.time('first');
getDocument(...).promise.then((pdfDocument) => {
  console.timeEnd('first');
});

console.time('second');
getDocument(...).promise.then((pdfDocument) => {
  pdfDocument.getPage(1).then((pdfPage) => { // Note: the API uses `pageNumber >= 1`, the Worker uses `pageIndex >= 0`.
    console.timeEnd('second');
  });
});
```

The first case is pretty much guaranteed to show a small regression, however the second case won't be affected at all since the Worker caches the result of `getPage` calls. Again, please remember that the second case is what matters for the standard PDF.js use-case which is why I'm hoping that this patch is deemed acceptable.

---
[1] In issue 7496, the problem is that the document is edited without the XRef table being correctly updated.
In issue 10326, the generator was sorting the XRef table according to the offsets rather than the objects.

[2] The idea of checking the first page in particular came from the "standard" use-case for the PDF.js library, i.e. the default viewer, where a failure to load the first page basically means that nothing will work; note how `{BaseViewer, PDFThumbnailViewer}.setDocument` depends completely on being able to fetch the *first* page.

[3] The only extra parsing is caused by, potentially, having to traverse *part* of the `Pages` tree to find the first page.
2018-12-29 12:47:25 +01:00
Tim van der Meij
d3868e1bd1
Merge pull request #10376 from timvandermeij/chunked-stream
Convert `src/core/chunked_stream.js` to ES6 syntax
2018-12-25 15:28:00 +01:00
Tim van der Meij
360c3d3813
Remove the unused url argument for the ChunkedStreamManager class 2018-12-24 13:14:42 +01:00
Tim van der Meij
47344197f4
Convert src/core/chunked_stream.js to ES6 syntax 2018-12-24 13:14:42 +01:00
Tim van der Meij
2e05827b87
Merge pull request #10378 from Snuffleupagus/issue-10377
Update remaining examples, and docs, to utilize current API functionality (issue 10377)
2018-12-24 13:13:10 +01:00
Jonas Jenwald
9962ab66ab Update remaining examples, and docs, to utilize current API functionality (issue 10377)
This contains a couple of changes that I missed elsewhere, sorry about that!
2018-12-24 12:33:39 +01:00
Tim van der Meij
103f4616ac
Merge pull request #10334 from Snuffleupagus/OpenAction-dest
[api-minor] Add support for OpenAction destinations (issue 10332)
2018-12-23 20:49:50 +01:00
Tim van der Meij
74461b0aa4
Merge pull request #10375 from timvandermeij/updates
Update translations and packages
2018-12-22 16:42:23 +01:00
Tim van der Meij
18bac1b4fc
Update packages 2018-12-22 16:35:34 +01:00
Tim van der Meij
11bf77aca4
Update translations 2018-12-22 15:54:42 +01:00
Tim van der Meij
98b4aff291
Merge pull request #10369 from Snuffleupagus/getViewport-signature
[api-minor] Change the `getViewport` method, on `PDFPageProxy`, to take a parameter object rather than a bunch of (randomly) ordered parameters
2018-12-22 14:56:16 +01:00
Jonas Jenwald
f0719ed565 [api-minor] Change the getViewport method, on PDFPageProxy, to take a parameter object rather than a bunch of (randomly) ordered parameters
If, as PR 10368 suggests, more parameters should be added to `getViewport` I think that it would be a mistake to not change the signature *first* to avoid needlessly unwieldy call-sites.

To not break any existing code and third-party use-cases, this is obviously implemented with a deprecation warning *and* with a working fallback[1] for the old method signature.

---
[1] This is limited to `GENERIC` builds, which should be sufficient.
2018-12-21 11:55:20 +01:00
Jonas Jenwald
a7e70a50f5 Add OpenAction destination support, off by default, to the viewer
Given that it's really not clear to me if this is actually desired functionality in the default viewer, and considering that it doesn't fit in *great* with the way that `PDFHistory` is initialized, this feature is currently off by default[1].

---
[1] It's controlled with the `disableOpenActionDestination` Preference/AppOption.
2018-12-19 11:45:17 +01:00
Jonas Jenwald
b05f053287 [api-minor] Add support for OpenAction destinations (issue 10332)
Note that the OpenAction dictionary may contain other information besides just a destination array, e.g. instructions for auto-printing[1].
Given first of all that an arbitrary `Dict` cannot be sent from the Worker (since cloning would fail), and second of all that the data obviously needs to be validated, this patch purposely only adds support for fetching a destination from the OpenAction entry[2].

---
[1] This information is, currently in PDF.js, being included through the `getJavaScript` API method.

[2] This significantly reduces the complexity of the implementation, which seems fine for now. If there's ever need for other kinds of OpenAction to be fetched, additional API methods could/should be implemented as necessary (could e.g. follow the `getOpenActionWhatever` naming scheme).
2018-12-19 11:45:16 +01:00
Jonas Jenwald
ba2edeae18 [api-minor] Add support, in getMetadata, for custom information dictionary entries (issue 5970, issue 10344) (#10346)
The custom entries, provided that they exist *and* that their types are safe to include, are exposed through a new `Custom` infoDict entry to clearly separate them from the standard ones.

Fixes 5970.
Fixes 10344.
2018-12-18 23:26:02 +01:00
Thiago da Silva
811c8803b3 Fix small visual quirk in thumbnail viewer 2018-12-18 22:48:26 +01:00
Tim van der Meij
417c234c1c
Merge pull request #10266 from timvandermeij/gulp-4
Upgrade to Gulp 4
2018-12-17 17:00:51 +01:00
Tim van der Meij
fa85f86298
Upgrade to Gulp 4
This required the following changes in the Gulpfile:

- Defining a series of tasks is no longer done with arrays, but with the
  `gulp.series` function. The `web` target is refactored to use a
  smaller number of tasks to prevent tasks from running multiple times.
- Getting all tasks must now be done through the task registry.
- Tasks that don't return anything must call `done` upon completion.

Moreover, this upgrade allows us to use the latest Node.js on Travis CI
again.
2018-12-17 16:20:13 +01:00