Commit Graph

1688 Commits

Author SHA1 Message Date
Jonas Jenwald
77d025dc14 Move the isPortraitOrientation helper function from web/base_viewer.js to web/ui_utils.js
A couple of basic unit-tests are added, and a manual `isLandscape` check (in `web/base_viewer.js`) is also converted to use the helper function instead.
2018-03-25 18:48:53 +02:00
Tim van der Meij
5c1a16ba6e
Merge pull request #9586 from Snuffleupagus/pageSize-api-rotate
Ensure that `PDFPageProxy.pageSizeInches` handles non-default /Rotate entries correctly
2018-03-25 18:03:32 +02:00
Tim van der Meij
6cc0efe1cc
Merge pull request #9576 from timvandermeij/versions
Update packages
2018-03-25 17:52:26 +02:00
Tim van der Meij
95de23e6e3
Update packages
Jasmine had a major version bump and required a few minor changes in our
booting code. Most notably, using `pending` in a `describe` block is no
longer supported, so we can only return early there. On the positive
side, the unit tests now run in a random order by default, which
eliminates any dependencies between unit tests.

Note that upgrading to Webpack 4 is out of scope for this patch since
the bots cannot work well with the newly generated bundles (both
browsers on both bots do not react within 120 seconds). Webpack 4 is not
faster for us than Webpack 3, so for now there is no need to upgrade.
2018-03-25 16:59:50 +02:00
Jonas Jenwald
d547936827 Ensure that PDFPageProxy.pageSizeInches handles non-default /Rotate entries correctly
Without this patch, the pageSize will be incorrectly reported for some PDF files.

---

Move pageSizeInches to ui_utils
2018-03-25 16:48:29 +02:00
Brendan Dahl
63c7aee112
Merge pull request #9565 from brendandahl/new-name
Rename the globals to shorter names.
2018-03-20 13:49:04 -07:00
Jonas Jenwald
e0ae157582 [api-minor] Fix various issues related to the pageSize information
The `getPageSizeInches` method was implemented on `PDFDocumentProxy`, which seems conceptually wrong since the size property isn't global to the document but rather specific to each page. Hence the method is moved into `PDFPageProxy`, as `get pageSizeInches` instead to address this.

Despite the fact that new API functionality was implemented, no unit-tests were added. To prevent issues later on, we should *always* ensure that new functionality has at least some test-coverage; something that this patch also takes care of.

The new `PDFDocumentProperties._parsePageSize` method seemed unnecessary convoluted. Furthermore, in the "no data provided"-case it even returned incorrect data (an array, rather than the expected object).
Finally, the fallback strings didn't actually agree with the `en-US` locale. This inconsistency doesn't look too great, and it's thus addressed here as well.
2018-03-18 09:10:19 +01:00
Brendan Dahl
01bff1a81d Rename the globals to shorter names.
pdfjsDistBuildPdf=pdfjsLib
pdfjsDistWebPdfViewer=pdfjsViewer
pdfjsDistBuildPdfWorker=pdfjsWorker
2018-03-16 11:08:56 -07:00
Jonas Jenwald
d431ae069d Attempt to handle corrupt PDF documents that inline Page dictionaries in a Kids array (issue 9540)
According to the specification, see https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.1942297, the contents of a Kids array should be indirect objects.
2018-03-12 14:13:23 +01:00
Tim van der Meij
f308d73d40
Implement a single getInheritableProperty utility function
This function combines the logic of two separate methods into one.
The loop limit is also a good thing to have for the calls in
`src/core/annotation.js`.

Moreover, since this is important functionality, a set of unit tests and
documentation is added.
2018-03-03 19:19:39 +01:00
Jonas Jenwald
b8606abbc1 [api-major] Completely remove the global PDFJS object 2018-03-01 18:13:27 +01:00
Jonas Jenwald
212553840f Move the pdfBug option from the global PDFJS object and into getDocument instead
Also removes the now unused `getDefaultSetting` helper function.
2018-03-01 18:11:17 +01:00
Jonas Jenwald
b69abf1111 Move the disableRange option from the global PDFJS object and into getDocument instead 2018-03-01 18:11:16 +01:00
Jonas Jenwald
69d7191034 Move the disableAutoFetch option from the global PDFJS object and into getDocument instead
One additional complication with removing this option from the global `PDFJS` object, is that the viewer currently needs to check `disableAutoFetch` in a couple of places. To address this I'm thus proposing adding a getter in `PDFDocumentProxy`, to allow checking the *actually* used values for a particular `getDocument` invocation.
2018-03-01 18:11:16 +01:00
Jonas Jenwald
3c2fbdffe6 Move the cMapUrl and cMapPacked options from the global PDFJS object and into getDocument instead 2018-03-01 18:11:16 +01:00
Jonas Jenwald
83d52518da [api-major] Refactor PDFWorker to be initialized with a parameter object, rather than a bunch of regular parameters 2018-02-16 13:22:35 +01:00
Jonas Jenwald
c3c1fc511d Move the workerSrc option from the global PDFJS object and into GlobalWorkerOptions instead 2018-02-16 13:22:35 +01:00
Rob Wu
a89071bdef
Merge pull request #9470 from Snuffleupagus/issue-4888
Ensure that `JpegImage.getData` returns the correct data length when `forceRGBoutput == true` (issue 4888)
2018-02-16 13:14:21 +01:00
Brendan Dahl
4f5fb78237
Merge pull request #9401 from brendandahl/svg-fail
Make the test framework more resilient to errors.
2018-02-14 17:05:40 -08:00
Brendan Dahl
53d19b619d Make the test framework more resilient to errors. 2018-02-14 17:02:19 -08:00
Tim van der Meij
538dda1096
Merge pull request #9479 from Snuffleupagus/refactor-viewer-options
[api-major] Refactor viewer components initialization to reduce their dependency on the global `PDFJS` object
2018-02-14 22:47:33 +01:00
Jonas Jenwald
11ab3b5c00 Ensure that JpegImage.getData returns the correct data length when forceRGBoutput == true (issue 4888)
With PDF.js version `2.0` we'll only support browsers with built-in `TypedArray` functionality, hence there doesn't seem to be any good reason not to implement this now.

Fixes 4888.
2018-02-13 20:44:21 +01:00
Jonas Jenwald
e95c11a7f0 Remove the undocumented PDFJS.enableStats option
In order to simplify things, the undocumented `enableStats` option was removed and `pdfBug` is now instead used to enabled general debugging *and* page request/rendering stats.
Considering that in the default viewer the `stats` was only used when debugging was also enabled, this simplification (code wise) definitely seem worthwhile to me.
2018-02-13 16:56:57 +01:00
Jonas Jenwald
3a6f6d23d6 Move the externalLinkTarget and externalLinkRel options to PDFLinkService options
This removes the `PDFJS.externalLinkTarget`/`PDFJS.externalLinkRel` dependency from the viewer components, but please note that as a *temporary* solution the default viewer still uses it.
2018-02-13 14:28:40 +01:00
Jonas Jenwald
c45c394364 Move the imageResourcesPath option to a BaseViewer/PDFPageView/AnnotationLayerBuilder option
This removes the `PDFJS.imageResourcesPath` dependency from the viewer components and the test-suite, but please note that as a *temporary* solution the default viewer still uses it.
2018-02-13 14:28:38 +01:00
Jonas Jenwald
f05e5c5460 Take the dictionary, and not just the image data, into account when caching inline images (issue 9398)
The reason for the bug is that we're only computing a checksum of the image data itself, but completely ignore the inline dictionary. The latter is important, since in practice it's not uncommon for inline images to be identical but use e.g. different ColourSpaces.

There's obviously a couple of different ways that we could compute a hash/checksum of the dictionary.
Initially I tried using `MurmurHash3_64` to compute a hash of the keys/values in the dictionary. Unfortunately this approach turned out to be *way* too slow in practice, especially for PDF files with a huge number of inline images; in particular issue 2618 would regresses quite badly with this solution.

The solution that is instead implemented in this patch, is to compute a checksum of the dictionary contents. While this is a much simpler, not to mention a lot more efficient, solution there's one drawback associated with it:
If the contents of inline image dictionaries are ordered differently, they will not be considered equal with this approach which could thus lead to failures to cache repeated inline images. In practice this doesn't seem to be a problem in any of the PDF files I've tested, and generally I'd rather err on the side of *not* caching given that too aggressive caching can easily lead to rendering bugs.

One small, but somewhat annoying, complication is that by the time `Parser.makeInlineImage` is called, we no longer know the *exact* stream position where the inline image dictionary starts. Having access to that information is crucial here, and the easiest solution I could come up with is to track this in the current `Lexer` instance.[1]

With the patch, we're thus able to fix the referenced issues without incurring large regressions in problematic cases such as issue 2618.

Fixes 9398; also improves/fixes the `issue8823` reference test.

---

[1] Obviously I'd have preferred if this patch could be limited to `Parser.makeInlineImage`, without the need for this "hack", but I'm not sure what that'd look like here.
2018-02-12 16:43:47 +01:00
Jonas Jenwald
1cf116ab88 Enable the mozilla/use-includes-instead-of-indexOf ESLint rule globally
This rule is available from https://www.npmjs.com/package/eslint-plugin-mozilla, and is enforced in mozilla-central. Note that we have the necessary `Array`/`String` polyfills and that most cases have already been fixed, see PRs 9032 and 9434.
2018-02-10 23:24:50 +01:00
Jonas Jenwald
2eb29409bc Enable the mozilla/avoid-removeChild ESLint rule globally
This rule is available from https://www.npmjs.com/package/eslint-plugin-mozilla, and is enforced in mozilla-central. Note that we have a polyfill for `ChildNode.remove()` and that most cases have already been fixed, see PRs 8056 and 8138.
2018-02-10 23:24:50 +01:00
Tim van der Meij
7bb066494f
Merge pull request #9427 from Snuffleupagus/native-JPEG-decoding-fallback
Fallback to the built-in JPEG decoder when browser decoding fails, and attempt to handle JPEG images with DNL (Define Number of Lines) markers (issue 8614)
2018-02-09 21:36:08 +01:00
Jonas Jenwald
a18c65ae9f Use the correct stream position when reading maxSizeOfInstructions from the maxp table (issue 9458)
Please refer to the `maxp` table specification, found at https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6maxp.html.

Fixes 9458.
2018-02-07 21:57:43 +01:00
Jonas Jenwald
bf4166e6c9 Attempt to handle DNL (Define Number of Lines) markers when parsing JPEG images (issue 8614)
Please refer to the specification, found at https://www.w3.org/Graphics/JPEG/itu-t81.pdf#page=49

Given how the JPEG decoder is currently implemented, we need to know the value of the scanLines parameter (among others) *before* parsing of the SOS (Start of Scan) data begins.
Hence the best solution I could come up with here, is to re-parse the image in the *hopefully* rare case of JPEG images that include a DNL (Define Number of Lines) marker.

Fixes 8614.
2018-02-05 21:05:32 +01:00
Rob Wu
911659cd70 Add tests for file names with spaces and semicolons 2018-02-04 17:58:10 +01:00
Jonas Jenwald
f4a95de694 Attempt to find the next valid marker when encountering invalid image data in JpegImage.parse (issue 9425)
In the JPEG images in the referenced PDF file, the DHT (Define Huffman Tables) segments contain more data than expected based on the length parameter.

Fixes 9425.
2018-02-03 16:01:19 +01:00
Jonas Jenwald
56a8c934dd [api-major] Remove the PDFJS.disableWorker option
Despite this patch removing the `disableWorker` option itself, please note that we'll still fallback to loading the worker file(s) on the main-thread when running in environments without proper Web Worker support.

Furthermore it's still possible, even with this patch, to force the use of fake workers by manually loading the necessary file using a `<script>` tag on the main-thread.[1]
That way, the functionality of the now removed `SINGLE_FILE` build target and the resulting `build/pdf.combined.js` file can still be achieved simply by adding e.g. `<script src="build/pdf.worker.js"></script>` to the HTML (obviously with the path adjusted as needed).

Finally note that the `disableWorker` option is a performance footgun, and unfortunately many existing third-party examples actually use it without providing any sort of warning/justification.

---

[1] This approach is used in the default viewer, since certain kind of debugging may be easier if the code is running directly on the main-thread.
2018-01-31 12:52:10 +01:00
Jonas Jenwald
a5aaf62754 [api-minor] Add a (static) PDFWorker.getWorkerSrc method that returns the current workerSrc
This method returns the currently used `workerSrc`, which thus allows obtaining the fallback `workerSrc` value (e.g. when the option wasn't set by the user).
2018-01-31 12:52:07 +01:00
Jonas Jenwald
42c71cd99f Utilize PDFNodeStream to run more API unit-tests on Node.js/Travis 2018-01-28 17:14:08 +01:00
Jani Pehkonen
5593c970e0 Implement Huffman coding in JBIG2 2018-01-23 17:04:07 +02:00
Jonas Jenwald
f0216484bc
Merge pull request #9383 from Rob--W/better-content-disposition-parser
Better content disposition parser
2018-01-21 15:08:14 +01:00
Rob Wu
a4e907169e Improve correctness of Content-Disposition parser
Re-uses logic from 9f5fcae11c/extension/content-disposition.js
which is already covered by tests: 6f3bbb8bbf
2018-01-21 13:31:12 +01:00
Jonas Jenwald
fe5102a27f
Merge pull request #9363 from Rob--W/fetch-http/s-only
Limit PDFFetchStream to http(s) in the Chrome extension
2018-01-21 11:45:09 +01:00
Rob Wu
0ffe9b9289 Remove useless test from network_utils_spec.js
Remove "returns null when content disposition is form-data".
The name of the test is already misleading: It suggests that
the return value is null if the Content-Disposition starts with
"form-data". This is not the case, anything with the "filename"
parameter is accepted.

So, to correct this, one would have to rephrase the test description to
"returns null when content disposition has no filename".
But this is already tested by the test called
"gets the filename from the response header".

So, remove the test.
2018-01-19 17:28:47 +01:00
Jonas Jenwald
69a8336cf1 Address the final round of review comments for Content-Disposition filename extraction
This patch updates the `IPDFStreamReader` interface and ensures that the interface/implementation of `network.js`, `fetch_stream.js`, `node_stream.js`, and `transport_stream.js` all match properly.
The unit-tests are also adjusted, to more closely replicate the actual behaviour of the various actual `IPDFStreamReader` implementations.
Finally, this patch adjusts the use of the Content-Disposition filename when setting the title in the viewer, and adds `PDFDocumentProperties` support as well.
2018-01-18 17:39:22 +01:00
Juan Salvador Perez Garcia
eb1f6f4c24 Content disposition filename
File name is extracted from headers.
2018-01-18 17:38:44 +01:00
Jonas Jenwald
f774abc8d3
Merge pull request #9368 from acchou/9362
end() is the official way to release a Writable stream
2018-01-17 12:45:23 +01:00
Andy Chou
b867c3299b end() is the official way to release a Writable stream 2018-01-16 12:01:33 -08:00
Rob Wu
1c8cacd6b9 Limit PDFFetchStream to http(s) in the Chrome extension
The `fetch` API is only supported for http(s), even in Chrome extensions.
Because of this limitation, we should use the XMLHttpRequest API when the
requested URL is not a http(s) URL.

Fixes #9361
2018-01-14 00:34:46 +01:00
Jonas Jenwald
0e1b5589e7 Restore the btoa/atob polyfills for Node.js
These were removed in PR 9170, since they were unused in the browsers that we'll support in PDF.js version `2.0`.
However looking at the output of Travis, where a subset of the unit-tests are run using Node.js, there's warnings about `btoa` being undefined. This doesn't appear to cause any errors, which probably explains why we didn't notice this before (despite PR 9201).
2018-01-13 01:31:05 +01:00
Jonas Jenwald
d0c8992e8a Attempt to actually resolve ColourSpace names in accordance with the specification (issue 9285)
Please refer to the PDF specification, in particular http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G7.3801570

> A colour space shall be specified in one of two ways:
>  - Within a content stream, the CS or cs operator establishes the current colour space parameter in the graphics state. The operand shall always be name object, which either identifies one of the colour spaces that need no additional parameters (DeviceGray, DeviceRGB, DeviceCMYK, or some cases of Pattern) or shall be used as a key in the ColorSpace subdictionary of the current resource dictionary (see 7.8.3, "Resource Dictionaries"). In the latter case, the value of the dictionary entry in turn shall be a colour space array or name. A colour space array shall never be inline within a content stream.
>
> - Outside a content stream, certain objects, such as image XObjects, shall specify a colour space as an explicit parameter, often associated with the key ColorSpace. In this case, the colour space array or name shall always be defined directly as a PDF object, not by an entry in the ColorSpace resource subdictionary. This convention also applies when colour spaces are defined in terms of other colour spaces.
2018-01-10 20:20:43 +01:00
Brendan Dahl
3925aab010
Merge pull request #9282 from Snuffleupagus/TrueType-Collection
Add support for TrueType Collection fonts (issue 9262)
2018-01-09 11:22:52 -08:00
Jonas Jenwald
915e3f4c5f
Merge pull request #9099 from tiriana/allow-dontFlip-in-PDFPageProxy-getViewport
Allows 'dontFlip' as third arg in PDFPageProxy.getViewport
2018-01-09 18:27:26 +01:00