Commit Graph

492 Commits

Author SHA1 Message Date
Jonas Jenwald
db5dc14158 Move worker-thread only functions from src/shared/util.js and into a new src/core/core_utils.js file
The `src/shared/util.js` file is being bundled into both the `pdf.js` and `pdf.worker.js` files, meaning that its code is by definition duplicated.
Some main-thread only utility functions have already been moved to a separate `src/display/display_utils.js` file, and this patch simply extends that concept to utility functions which are used *only* on the worker-thread.

Note in particular the `getInheritableProperty` function, which expects a `Dict` as input and thus *cannot* possibly ever be used on the main-thread.
2019-02-24 00:35:39 +01:00
Jonas Jenwald
a1f7517996 Rename the src/display/dom_utils.js file to src/display/display_utils.js
This file (currently) contains not only DOM-specific helper functions/classes, but is used generally for various helper code relevant for main-thread functionality.
2019-02-23 16:30:16 +01:00
Jonas Jenwald
a0354494bd Re-factor the PDFDataRangeTransport unit-tests and enable them in Node.js/Travis
There doesn't appear to be any particular reason for only running these unit-tests in browsers, since the `PDFDataRangeTransport` functionality itself should be back-end agnostic.
2019-02-17 14:45:17 +01:00
Jonas Jenwald
507e0a4907 Add a new DOMFileReaderFactory helper to the unit-tests, and re-factor NodeFileReaderFactory to be asynchronous
This allows simplification of the 'creates pdf doc from URL and aborts loading after worker initialized' API unit-test.

Note that the `DOMFileReaderFactory` uses the Fetch API, for simplicity, since it should be available in all browsers where we're running tests.
2019-02-17 14:41:14 +01:00
Jonas Jenwald
60f6d49ff7 [api-minor] Expose the existence of a Collection dictionary via the getMetadata API method (issue 10555)
Given the complexity of this functionality, and the fact that it doesn't seem widely used, I highly doubt that it'd ever make sense to support Collections; see also https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#M11.9.39646.2Heading.824.Collections
2019-02-15 15:40:31 +01:00
Tim van der Meij
7c91e94b19
Implement the NodeCanvasFactory class to execute more unit tests in Node.js 2019-02-10 19:37:34 +01:00
Tim van der Meij
b6eddc40b5
Write unit tests for the string32 and toRomanNumerals utility functions 2019-02-10 18:58:52 +01:00
Jonas Jenwald
22468817e1 Add a settled property, tracking the fulfilled/rejected stated of the Promise, to createPromiseCapability
This allows cleaning-up code which is currently manually tracking the state of the Promise of a `createPromiseCapability` instance.
2019-02-02 15:18:56 +01:00
Jonas Jenwald
2b0b6178f7 Clean-up after the gets operatorList with JPEG image (issue 4888) unit-test
This unit-test wasn't destroying the `loadingTask` when complete, as it should have done.
2019-01-29 15:24:08 +01:00
Jonas Jenwald
6f94a05a29 Do the final text scaling correctly in flushTextContentItem (issue 8276)
It's necessary to take into account whether or not the text is vertical, to avoid either the textContent `width` or `height` becoming incorrect.
2019-01-29 15:24:04 +01:00
Jani Pehkonen
26121177ab Implement Decode entry in Indexed images 2019-01-22 22:51:04 +02:00
Tim van der Meij
c4fe4087d3
Implement a unit test for metadata parsing to ensure that it's not vulnerable to the billion laughs attack 2019-01-19 19:54:08 +01:00
Jonas Jenwald
24a688d6c6 Convert some usage of indexOf to startsWith/includes where applicable
In many cases in the code you don't actually care about the index itself, but rather just want to know if something exists in a String/Array or if a String starts in a particular way. With modern JavaScript functionality, it's thus possible to remove a number of existing `indexOf` cases.
2019-01-18 17:57:41 +01:00
Jonas Jenwald
9f45f8dfda When parsing Metadata, attempt to remove "junk" before the first tag (PR 10398 follow-up)
This will allow the Metadata to be successfully extracted from the PDF file in issue 10395.
Furthermore, this patch also fixes a bug in `Metadata.get` which causes the method to return `null` rather than an empty string or zero (since either ought to be allowed).
2019-01-16 12:44:27 +01:00
Jonas Jenwald
5d90224409 Add a unit-test for issue 10395 (PR 10398 follow-up) 2019-01-16 11:30:36 +01:00
Jonas Jenwald
b2235ec9c4 Add a unit-test to check that the sortByVisibility parameter, in getVisibleElements, works correctly 2019-01-13 11:34:38 +01:00
Jonas Jenwald
9743708a24 Prevent TypeError: views[index] is undefined being throw in getVisibleElements when the viewer, or all pages, are hidden
Previously a couple of different attempts at fixing this problem has been rejected, given how *crucial* this code is for the correct function of the viewer, since no one has thus far provided any evidence that the problem actually affects the default viewer[1] nor an example using the viewer components directly (without another library on top).
The fact that none of the prior patches contained even a *simple* unit-test probably contributed to the unwillingness of a reviewer to sign off on the suggested changes.

However, it turns out that it's possible to create a reduced test-case, using the default viewer, that demonstrates the error[2]. Since this utilizes a hidden `<iframe>`, please note that this error will thus affect Firefox as well.
Note that while errors are thrown when the hidden `<iframe>` loads, the default viewer doesn't break completely since rendering does start working once the `<iframe>` becomes visible (although the errors do break the initial Toolbar state).

Before making any changes here, I carefully read through not just the immediately relevant code but also the rendering code in the viewer (given it's dependence on `getVisibleElements`). After concluding that the changes should be safe in general, the default viewer was tested without any issues found. (The above being much easier with significant prior experience of working with the viewer code.)
Finally the patch also adds new unit-tests, one of which explicitly triggers the relevant code-path and will thus fail with the current `master` branch.

This patch also makes `PDFViewerApplication` slightly more robust against errors during document opening, to ensure that viewer/document initialization always completes as expected.
Please keep in mind that even though this patch prevents an error in `getVisibleElements`, it's still not possible to set the initial position/zoom level/sidebar view etc. when the viewer is hidden since rendering and scrolling is completely dependent[3] on being able to actually access the DOM elements.

---
[1] And hence the PDF Viewer that's built-in to Firefox.

[2] Copy the HTML code below and save it as `iframe.html`, and place the file in the `web/` folder. Then start the server, with `gulp server`, and navigate to http://localhost:8888/web/iframe.html

```html
<!DOCTYPE html>
<html>
  <head>
    <title>Iframe test</title>

    <script>
      window.onload = function() {
        const button = document.getElementById('button1');
        const frame = document.getElementById('frame1');

        button.addEventListener('click', function(evt) {
          frame.hidden = !frame.hidden;
        });
      };
    </script>
  </head>

  <body>
    <button id="button1">Toggle iframe</button>
    <br>
    <iframe id="frame1" width="800" height="600" src="http://localhost:8888/web/viewer.html" hidden="true"></iframe>
  </body>
</html>
```

[3] This is an old, pre-exisiting, issue that's not relevant to this patch as such (and it's already being tracked elsewhere).
2019-01-13 11:34:24 +01:00
Tim van der Meij
ed918bad21
Remove left-over console log from the find controller unit tests 2019-01-12 22:27:40 +01:00
Tim van der Meij
b1cef896f4
Write more unit tests for the find controller
Fixes #7356.
2019-01-12 22:17:46 +01:00
Jonas Jenwald
e8f4b47d59 Prevent errors, in SimpleXMLParser.onEndElement, when the stack has already been completely parsed (issue 10410)
The error was triggered for a particular set of metadata, where an end tag was encountered without the corresponding begin tag being present in the data.
(The patch also fixes a minor oversight, from a recent PR, in the `SimpleDOMNode.nextSibling` method.)
2019-01-05 11:15:34 +01:00
Tim van der Meij
b39ec7af96
Merge pull request #10408 from Snuffleupagus/issue-10407
Prevent errors, because of incorrect scope, in the `XMLParserBase._resolveEntities` method (issue 10407)
2019-01-04 23:45:26 +01:00
Jonas Jenwald
66fccd860b Adjust how AnnotationBorderStyle.setWidth handles the input being a Name (issue 10385)
In order to be consistent with the behaviour in Adobe Reader, the width will now always be set to zero when the input is a `Name`.
2019-01-04 10:38:10 +01:00
Jonas Jenwald
6cd9ff48f3 Prevent errors, because of incorrect scope, in the XMLParserBase._resolveEntities method (issue 10407) 2019-01-04 10:13:32 +01:00
Jonas Jenwald
76a9580aeb Ensure that AnnotationBorderStyle.setWidth is able to handle the input being a Name, to correctly deal with corrupt PDF documents (issue 10385) 2018-12-31 12:21:28 +01:00
Tim van der Meij
103f4616ac
Merge pull request #10334 from Snuffleupagus/OpenAction-dest
[api-minor] Add support for OpenAction destinations (issue 10332)
2018-12-23 20:49:50 +01:00
Jonas Jenwald
f0719ed565 [api-minor] Change the getViewport method, on PDFPageProxy, to take a parameter object rather than a bunch of (randomly) ordered parameters
If, as PR 10368 suggests, more parameters should be added to `getViewport` I think that it would be a mistake to not change the signature *first* to avoid needlessly unwieldy call-sites.

To not break any existing code and third-party use-cases, this is obviously implemented with a deprecation warning *and* with a working fallback[1] for the old method signature.

---
[1] This is limited to `GENERIC` builds, which should be sufficient.
2018-12-21 11:55:20 +01:00
Jonas Jenwald
b05f053287 [api-minor] Add support for OpenAction destinations (issue 10332)
Note that the OpenAction dictionary may contain other information besides just a destination array, e.g. instructions for auto-printing[1].
Given first of all that an arbitrary `Dict` cannot be sent from the Worker (since cloning would fail), and second of all that the data obviously needs to be validated, this patch purposely only adds support for fetching a destination from the OpenAction entry[2].

---
[1] This information is, currently in PDF.js, being included through the `getJavaScript` API method.

[2] This significantly reduces the complexity of the implementation, which seems fine for now. If there's ever need for other kinds of OpenAction to be fetched, additional API methods could/should be implemented as necessary (could e.g. follow the `getOpenActionWhatever` naming scheme).
2018-12-19 11:45:16 +01:00
Jonas Jenwald
ba2edeae18 [api-minor] Add support, in getMetadata, for custom information dictionary entries (issue 5970, issue 10344) (#10346)
The custom entries, provided that they exist *and* that their types are safe to include, are exposed through a new `Custom` infoDict entry to clearly separate them from the standard ones.

Fixes 5970.
Fixes 10344.
2018-12-18 23:26:02 +01:00
Jonas Jenwald
2c003a82d5 Convert RenderTask, in src/display/api.js, to an ES6 class
Also deprecates the `then` method, in favour of the `promise` getter.
2018-11-18 19:08:00 +01:00
Tim van der Meij
016c2761da
Resolve deprecation warnings for Jasmine
Jasmine recommends to use the `configure` method on the environment
during boot. This commit makes the code correspond to how it's done in
Jasmine's default boot file. The options dropdown in the HTML reporter
now works again after these changes, because this broke in the upgrade
to Jasmine 3, and the unit tests are executed in a random order by
default, which is important to make sure the unit tests are
self-contained and don't depend on the result of another unit test.
2018-11-17 23:31:22 +01:00
Thomas den Hollander
b157d8b478
Change splice to pop in annotation tests
This line in the annotation tests subtracts an array from a number. This is because `splice(-1, 1)` returns a one-element array, while `pop()` returns only the element itself.
2018-10-24 13:08:08 +02:00
Jonas Jenwald
327f2eb588 Ensure that onProgress is always called when the entire PDF file has been loaded, regardless of how it was fetched (issue 10160)
*Please note:* I'm totally fine with this patch being rejected, and the issue closed as WONTFIX; however these changes should address the issue if that's desired.

From a conceptual point of view, reporting loading progress doesn't really make a lot of sense for PDF files opened by passing raw binary data directly to `getDocument` (since obviously *all* data was loaded).
This is compared to PDF files loaded via e.g. `XMLHttpRequest` or the Fetch API, where the entire PDF file isn't available from the start and knowing the loading progress makes total sense.

However I can certainly see why the current API could be considered inconsistent, which isn't great, since a registered `onProgress` callback will never be called for certain `getDocument` calls.
The simplest solution to this inconsistency thus seem to be to ensure that `onProgress` is always called when handling the `DataLoaded` message, since that will *always* be dispatched[1] from the worker-thread.

---
[1] Note that this isn't guaranteed to happen, since setting `disableAutoFetch = true` often prevents the *entire* file from ever loading. However, this isn't relevant for the issue at hand, and is a well-known consequence of using `disableAutoFetch = true`; note how the default viewer even has a specialized code-path for hiding the loadingBar.
2018-10-16 13:51:12 +02:00
Kevin Lee Drum
4cf10ac79d set returnValues.suggestedLength to Content-Length if integer 2018-10-07 13:26:29 -04:00
Tim van der Meij
ff2df9c5b6
Merge pull request #10117 from leblanc-simon/ink-annotation-support
Add support of Ink annotation
2018-10-04 23:39:41 +02:00
Jonas Jenwald
2ed3591b22 Make PDFFindController less confusing to use, by allowing searching to start when setDocument is called
*This patch is based on something that I noticed while working on PR 10126.*

The recent re-factoring of `PDFFindController` brought many improvements, among those the fact that access to `BaseViewer` is no longer required. However, with these changes there's one thing which now strikes me as not particularly user-friendly[1]: The fact that in order for searching to actually work, `PDFFindController.setDocument` must be called *and* a 'pagesinit' event must be dispatched (from somewhere).

For all other viewer components, calling the `setDocument` method[2] is enough in order for the component to actually be usable.
The `PDFFindController` thus stands out quite a bit, and it also becomes difficult to work with in any sort of custom implementation. For example: Imagine someone trying to use `PDFFindController` separately from the viewer[3], which *should* now be relatively simple given the re-factoring, and thus having to (somehow) figure out that they'll also need to manually dispatch a 'pagesinit' event for searching to work.

Note that the above even affects the unit-tests, where an out-of-place 'pagesinit' event is being used.
To attempt to address these problems, I'm thus suggesting that *only* `setDocument` should be used to indicate that searching may start. For the default viewer and/or the viewer components, `BaseViewer.setDocument` will now call `PDFFindController.setDocument` when the document is ready, thus requiring no outside configuration anymore[4]. For custom implementation, and the unit-tests, it's now as simple as just calling `PDFFindController.setDocument` to allow searching to start.

---
[1] I should have caught this during review of PR 10099, but unfortunately it's sometimes not until you actually work with the code in question that things like these become clear.

[2] Assuming, obviously, that the viewer component in question actually implements such a method :-)

[3] There's even a very recent issue, filed by someone trying to do just that.

[4] Short of providing a `PDFFindController` instance when creating a `BaseViewer` instance, of course.
2018-10-04 10:28:50 +02:00
Simon Leblanc
b5806735d8 Add support of Ink annotation 2018-10-03 00:28:49 +02:00
Tim van der Meij
1b402996cf
Implement a basic unit test for the find controller
This commit shows that we can now unit test the find controller and
that executing regular queries works. Note that this is only a first
step and not a complete suite of unit tests for all possible options
of the find controller.

While writing this unit test, I found two smaller issues that I
addressed directly. The first one is that in the previous find
controller refactoring I forgot to rename some occurrences of a now
private member variable. Fortunately this did not cause any bugs since
we did have a public getter and the fetched value may be changed by
reference, but it's nevertheless good to fix. The second issue is that
some entries in the `test/unit/clitests.json` file were not correct,
resulting in these tests not being executed on e.g., Travis CI.
2018-09-30 18:32:34 +02:00
Jonas Jenwald
1c814e208e Prevent getPDFFileNameFromURL from breaking if the url parameter is not a string 2018-09-30 12:28:59 +02:00
Jonas Jenwald
842e9206c0 Replace String.prototype.substr() occurrences with String.prototype.substring()
As outlined in https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/substr, which refers to the ECMA-262 specification, using the `substr` function is advised against.

Hence this PR, which replaces all remaining `substr` occurrences with `substring` instead. Please refer to https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/substr#Syntax respectively https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/substring#Syntax for the differences between the two functions.

Note that in most cases in the code-base there's only one argument passed to `substr`, and those require no other changes except replacing "substr" with "substring". For the other cases, the `substr(start, length)` calls are changed to `substring(start, start + length)` instead.
2018-09-28 11:41:07 +02:00
Jonas Jenwald
39776168a2 Add EventBus unit-tests to ensure that the (optional) argument handling works correctly 2018-09-21 14:31:35 +02:00
Jonas Jenwald
f317a2cb40 Ensure that the DOM event listeners are removed at the end of the relevant EventBus unit-tests, to prevent the tests from interfering with each other 2018-09-20 23:12:01 +02:00
Tim van der Meij
99de25d6cc
Implement unit tests for the isSameOrigin and createValidAbsoluteUrl utility functions
Moreover, mark the `isValidProtocol` function as private since it's only
used in the utilities file and is not (meant to be) exported.
2018-09-11 16:17:45 +02:00
Jonas Jenwald
6d804d657f Add initial support for "Whole words" searching in the viewer
As outlined in https://bugzilla.mozilla.org/show_bug.cgi?id=1282759 the internal Firefox name for the feature is `entireWord`, hence that name is used here as well for consistency (with "Whole words" being limited to the UI).

Given existing limitations of the PDF.js search functionality, e.g. the existing problems of searching across "new lines", there's some edge-cases where "Whole words" searching will ignore (valid) results.
However, considering that this is a pre-existing issue related to the way that the find controller joins text-content together, that shouldn't have to block this new feature in my opionion.

*Please note:* In order to enable this feature in the `MOZCENTRAL` version, a small follow-up patch for [PdfjsChromeUtils.jsm](https://hg.mozilla.org/mozilla-central/file/tip/browser/extensions/pdfjs/content/PdfjsChromeUtils.jsm) will be required once this has landed in `mozilla-central`.
2018-09-10 11:59:29 +02:00
Tim van der Meij
66422eb83e
Merge pull request #9340 from brendandahl/private-use
Map all glyphs to the private use area and duplicate the first glyph.
2018-09-08 17:51:04 +02:00
Brendan Dahl
b76cf665ec Map all glyphs to the private use area and duplicate the first glyph.
There have been lots of problems with trying to map glyphs to their unicode
values. It's more reliable to just use the private use areas so the browser's
font renderer doesn't mess with the glyphs.

Using the private use area for all glyphs did highlight other issues that this
patch also had to fix:

  * small private use area - Previously, only the BMP private use area was used
    which can't map many glyphs. Now, the (much bigger) PUP 16 area can also be
    used.

  * glyph zero not shown - Browsers will not use the glyph from a font if it is
    glyph id = 0. This issue was less prevalent when we mapped to unicode values
    since the fallback font would be used. However, when using the private use
    area, the glyph would not be drawn at all. This is illustrated in one of the
    current test cases (issue #8234) where there's an "ä" glyph at position
    zero. The PDF looked like it rendered correctly, but it was actually not
    using the glyph from the font. To properly show the first glyph it is always
    duplicated and appended to the glyphs and the maps are adjusted.

  * supplementary characters - The private use area PUP 16 is 4 bytes, so
    String.fromCodePoint must be used where we previously used
    String.fromCharCode. This is actually an issue that should have been fixed
    regardless of this patch.

  * charset - Freetype fails to load fonts when the charset size doesn't match
    number of glyphs in the font. We now write out a fake charset with the
    correct length. This also brought up the issue that glyphs with seac/endchar
    should only ever write a standard charset, but we now write a custom one.
    To get around this the seac analysis is permanently enabled so those glyphs
    are instead always drawn as two glyphs.
2018-09-05 14:04:54 -07:00
Tim van der Meij
e812c6e7ac
Use shorter code for failing a test in test/unit/api_spec.js 2018-09-02 21:23:09 +02:00
Tim van der Meij
959ed3705b
Implement a permissions API 2018-09-02 21:23:09 +02:00
Jonas Jenwald
0b1f41c5b3 Add general support for re-dispatching events, on EventBus instances, to the DOM
This patch is the first step to be able to eventually get rid of the `attachDOMEventsToEventBus` function, by allowing `EventBus` instances to simply re-dispatch most[1] events to the DOM.
Note that the re-dispatching is purposely implemented to occur *after* all registered `EventBus` listeners have been serviced, to prevent the ordering issues that necessitated the duplicated page/scale-change events.

The DOM events are currently necessary for the `mozilla-central` tests, see https://hg.mozilla.org/mozilla-central/file/tip/browser/extensions/pdfjs/test, and perhaps also for custom deployments of the PDF.js default viewer.

Once this have landed, and been successfully uplifted to `mozilla-central`, I intent to submit a patch to update the test-code to utilize the new preference. This will thus, eventually, make it possible to remove the `attachDOMEventsToEventBus` functionality.

*Please note:* I've successfully ran all `mozilla-central` tests locally, with these patches applied.

---
[1] The exception being events that originated on the `window` or `document`, since those are already globally available anyway.
2018-08-30 17:28:12 +02:00
Tim van der Meij
1268aea2b6
Merge pull request #9975 from Snuffleupagus/getDestination-refactor
Re-factor `destinations`/`getDestination` to reduce unnecessary duplication, and reject non-string inputs
2018-08-12 15:51:58 +02:00
Tim van der Meij
af19ed6ee9
Merge pull request #9822 from timvandermeij/annotations
[api-minor] Refactor the annotation code to be asynchronous
2018-08-11 20:39:50 +02:00