Commit Graph

189 Commits

Author SHA1 Message Date
Jonas Jenwald
47f94235ab [api-minor] Re-factor the *internal* renderingIntent, and change the default intent value in the PDFPageProxy.getAnnotations method
With the changes made in PR 13746 the *internal* renderingIntent handling became somewhat "messy", since we're now having to do string-matching in various spots in order to handle the "oplist"-intent correctly.
Hence this patch, which implements the idea from PR 13746 to convert the `intent`-strings, used in various API-methods, into an *internal* renderingIntent that's implemented using a bit-field instead. *Please note:* This part of the patch, in itself, does *not* change the public API (but see below).

This patch is tagged `api-minor` for the following reasons:
 1. It changes the *default* value for the `intent` parameter, in the `PDFPageProxy.getAnnotations` method, to "display" in order to be consistent across the API.
 2. In order to get *all* annotations, with the `PDFPageProxy.getAnnotations` method, you now need to explicitly set "any" as the `intent` parameter.
 3. The `PDFPageProxy.getOperatorList` method will now also support the new "any" intent, to allow accessing the operatorList of all annotations (limited to those types that have one).
 4. Finally, for consistency across the API, the `PDFPageProxy.render` method also support the new "any" intent (although I'm not sure how useful that'll be).

Points 1 and 2 above are the significant, and thus breaking, changes in *default* behaviour here. However, unfortunately I cannot see a good way to improve the overall API while also keeping `PDFPageProxy.getAnnotations` unchanged.
2021-08-06 00:39:42 +02:00
Jonas Jenwald
16a09eaed8 Fix a broken regular expression in the docId unit-test (issue 13838, PR 13813 follow-up)
The current regular expression contains a typo, leading to intermittent test-failures for certain `docId`s; sorry about that!
2021-08-01 15:18:25 +02:00
Jonas Jenwald
b18620ac0f Remove the closure used with the PDFDocumentLoadingTask class
This patch utilizes the same approach as used in lots of other parts of the code-base, which thus *slightly* reduces the size of this code.

By removing some of the (current) indirection, we can also simplify the JSDocs a little bit. Looking at the `gulp jsdoc` output, this actually seem to *improve* the documentation for this class.
2021-07-30 11:34:47 +02:00
Jonas Jenwald
03cf28bf17 [api-minor] Add intent support to the PDFPageProxy.getOperatorList method (issue 13704)
With this patch, the `PDFPageProxy.getOperatorList` method will now return `PDFOperatorList`-instances that also include Annotation-operatorLists (when those exist). Hence this closes a small, but potentially confusing, gap between the `render` and `getOperatorList` methods.

Previously we've been somewhat reluctant to do this, as explained below, but given that there's actual use-cases where it's required probably means that we'll *have* to implement it now.
Since we still need the ability to separate "normal" rendering operations from direct `getOperatorList` calls in the worker-thread, this API-change unfortunately causes the *internal* renderingIntent to become a bit "messy" which is indeed unfortunate (note the `"oplist-"` strings in various spots). As-is I suppose that it's not all that bad, but we may want to consider changing the *internal* renderingIntent to e.g. a bitfield in the future.

Besides fixing issue 13704, this patch would also be necessary if someone ever tries to implement e.g. issue 10165 (since currently `PDFPageProxy.getOperatorList` doesn't include Annotation-operatorLists).

*Please note:* This patch is *also* tagged "api-minor" for a second reason, which is that we're now including the Annotation-id in the `beginAnnotation` argument. The reason for this is to allow correlating the Annotation-data returned by `PDFPageProxy.getAnnotations`, with its corresponding operatorList-data (for those Annotations that have it).
2021-07-16 17:16:30 +02:00
Jonas Jenwald
661c60ecc9 [api-minor] Support accessing both the original and modified PDF fingerprint
The PDF.js API has only ever supported accessing the original file ID, however the second one that (should) exist in *modified* documents have thus far been completely inaccessible through the API.
That seems like a simple oversight, caused e.g. by the viewer not needing it, since it really shouldn't hurt to provide API-users with the ability to check if a PDF document has been modified since its creation.[1]

Please refer to https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G13.2261661 for additional information.

For an example of how to update existing code to use the new API, please see the changes in the `web/app.js` file included in this patch.

*Please note:* While I'm not sure if we'll ever be able to remove the old `PDFDocumentProxy.fingerprint` getter, given that it's existed since "forever", that probably isn't a big deal given that it's now limited to only `GENERIC`-builds.

---
[1] Although this obviously depends on the PDF software following the specification, by updating the second file ID as intended.
2021-07-03 13:56:33 +02:00
Brendan Dahl
4c1dd47e65 Include and use the 14 standard fonts files. 2021-06-07 11:10:11 -07:00
Jonas Jenwald
8d5689387b Improve handling of named destinations in out-of-order NameTrees (PR 10274 follow-up)
According to the specification, see https://web.archive.org/web/20210404042322if_/https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.2384179, the keys of a NameTree/NumberTree should be ordered.
For corrupt PDF files, which violate this assumption, it's thus possible that trying to lookup a single entry fails.

Previously, in PR 10274, we implemented a fallback that only applies to the "bottom" node of a NameTree/NumberTree, which in general might not actually help for sufficiently corrupt NameTree/NumberTree data.
Instead we remove the current *limited* fallback from `NameOrNumberTree.get`, and defer to the call-site to handle this case explicitly e.g. by using `NameOrNumberTree.getAll` for data where that makes sense. For well-formed documents, these changes should *not* lead to any additional data fetching/parsing.

Finally, as part of these changes, the validation of named destination data is improved in the `Catalog` and a new unit-test is also added.
2021-05-21 15:48:37 +02:00
Jonas Jenwald
8943bcd3c3 Account for formatting changes in Prettier version 2.3.0
With the exception of one tweaked `eslint-disable` comment, in `web/generic_scripting.js`, this patch was generated automatically using `gulp lint --fix`.

Please find additional information at:
 - https://github.com/prettier/prettier/releases/tag/2.3.0
 - https://prettier.io/blog/2021/05/09/2.3.0.html
2021-05-16 11:44:05 +02:00
calixteman
af4dc55019
[api-minor] Fix the way to chunk the strings (#13257)
- Improve chunking in order to fix some bugs where the spaces aren't here:
    * track the last position where a glyph has been drawn;
    * when a new glyph (first glyph in a chunk) is added then compare its position with the last saved one and add a space or break:
      - there are multiple ways to move the glyphs and to avoid to have to deal with all the different possibilities it's a way easier to just compare positions;
      - and so there is now one function (i.e. "compareWithLastPosition") where all the job is done.
  - Add some breaks in order to get lines;
  - Remove the multiple whites spaces:
    * some spaces were filled with several whites spaces and so it makes harder to find some sequences of words using the search tool;
    * other pdf readers replace spaces by one white space.

Update src/core/evaluator.js

Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>

Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>
2021-04-30 14:41:13 +02:00
Tim van der Meij
c2f3a71eca
Convert done callbacks to async/await in test/unit/api_spec.js 2021-04-17 17:52:23 +02:00
Jonas Jenwald
2b2234fd5a [api-minor] Ensure that PDFDocumentProxy.hasJSActions won't fail if MissingDataExceptions are thrown during the associated worker-thread parsing
With the current implementation of `PDFDocument.hasJSActions`, in the worker-thread, we're not actually handling not-yet-loaded data correctly. This can thus fail in *two* different ways:
 - The `PDFDocument.fieldObjects` getter (and its helper method), while it may *return* a Promise, still fetches all of its data synchronously and it can thus throw a `MissingDataException` during parsing.
 - The `Catalog.jsActions` getter, which is completely synchronous, can obviously throw a `MissingDataException` during parsing.

If either of these cases occur currently, the `PDFDocumentProxy.hasJSActions` method in the API can either return a *rejected* Promise (which it never should) or possibly "hang" and never resolve.

*Please note:* While I've not *yet* seen this error in an actual PDF document, it can happen during loading if you're unlucky enough with e.g. the structure of the PDF document and/or the download speed offered by the server.
This patch is thus based on code-inspection *and* on manually throwing a `MissingDataException` on the first access of `Catalog.jsActions` to simulate this situation.

Finally, this patch adds a couple of *API* unit-tests for this (since none existed).
2021-04-13 14:33:56 +02:00
Jonas Jenwald
5adee0cdd1 [api-minor] Let PDFPageProxy.getStructTree return null, rather than an empty structTree, for documents without any accessibility data (PR 13171 follow-up)
This is first of all consistent with existing API-methods, where we return `null` when the data in question doesn't exist. Secondly, it should also be (slightly) more efficient since there's less dummy-data that we need to transfer between threads.
Finally, this prevents us from adding an empty/unnecessary span to *every* single page even in documents without any structure tree data.
2021-04-11 12:35:33 +02:00
Tim van der Meij
10574a0f8a
Remove obsolete done callbacks from the unit tests
The done callbacks are an outdated mechanism to signal Jasmine that a
unit test is done, mostly in cases where a unit test needed to wait for
an asynchronous operation to complete before doing its assertions.
Nowadays a much better mechanism is in place for that, namely simply
passing an asynchronous function to Jasmine, so we don't need callbacks
anymore (which require more code and may be more difficult to reason
about).

In these particular cases though the done callbacks never had any real
use since nothing asynchronous happens in these places. Synchronous
functions don't need to use done callbacks since Jasmine simply knows
it's done when the function reaches its normal end, so we can safely get
rid of these callbacks. The telltale sign is if the done callback is
used unconditionally at the end of the function.

This is all done in an effort to over time get rid of all callbacks in
the unit test code.
2021-04-10 20:29:39 +02:00
Calixte Denizet
5875ebb1ca Display widget signature
- but don't validate them for now;
  - Firefox will display a bar to warn that the signature validation is not supported (see https://bugzilla.mozilla.org/show_bug.cgi?id=854315)
  - almost all (all ?) pdf readers display signatures;
  - validation is done in edge but for now it's behind a pref.
2021-04-10 19:13:28 +02:00
Jonas Jenwald
232fbd28e1 Re-factor the PDFDocumentProxy.cleanup unit-tests to use async/await 2021-04-02 12:32:35 +02:00
Jonas Jenwald
db1e1612df [api-minor] Support proper URL-objects, in addition to URL-strings, in getDocument
Currently only URL-strings are officially supported by `getDocument`, however at this point in time I cannot really see any compelling reason to not support `URL`-objects as well.

Most likely the reason that we've don't already support `URL`-objects, in `getDocument`, is that historically `URL` wasn't fully implemented across browsers and our old polyfill wasn't perfect; see https://developer.mozilla.org/en-US/docs/Web/API/URL/URL#browser_compatibility

*Please note:* Because of how the `url` parameter is currently handled, there's actually *some* cases where passing a `URL`-object to `getDocument` already works. That, in my opinion, provides additional motivation for supporting `URL`-objects officially, since it makes the API more consistent.

The following is an attempt to summarize the *current* situation, based on the actual code rather than the JSDocs:
 - `getDocument("url string")` works and is documented.[1]
 - `getDocument({ url: "url string", })` works and is documented.[1]
 - `getDocument(new URL(...))` throws immediately, since no supported parameters are found.
 - `getDocument({ url: new URL(...), })` actually works even though it's not documented.[1] Originally, when data was fetched on the worker-thread, this would likely have thrown since `URL` isn't clonable.[2]
 - `getDocument({ url: { abc: 123, }, })`, or some similarily meaningless input, will be "accepted" by `getDocument` and then throw a `MissingPDFException` when attempting to fetch the bogus data.

With the changes in this patch, not only is `URL`-objects now officially supported and documented when calling `getDocument`, but we'll also do a much better job at actually validating any URL-data passed to `getDocument` (and instead fail early).

---
[1] In *browsers*, we create a valid URL thus indirectly validating the input. In Node.js environments, on the other hand, no validation is done since obtaining a baseUrl is more difficult (and PDF.js is primarily written for browsers anyway).

[2] https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm#supported_types
2021-03-31 16:21:41 +02:00
Jonas Jenwald
eeda2215d7 Remove redundant done-callback functions from unit-tests which are async
For unit-tests which are asynchronous, using a `done`-callback is redundant and future Jasmine versions will stop supporting that pattern.
2021-03-21 11:33:39 +01:00
Brendan Dahl
47a5550f10 Disable intermittent unit test "creates pdf doc from non-existent URL"
Disable this test so we don't have to manually review unit test
failure log all the time.
2021-03-11 11:40:40 -08:00
Jonas Jenwald
941b65f683 Remove unncessary CanvasFactory/CMapReaderFactory/FileReaderFactory duplication in unit-tests
Given that the API will now, after PR 12039, automatically pick the correct factories to use depending on the environment (browser vs. Node.js), we can utilize that in the unit-tests as well. This way we don't have to manually repeat the same initialization code in *multiple* unit-tests.
*Note:* The *official* PDF.js API is defined in `src/pdf.js`, hence the new exports in `src/display/api.js` will not affect that.

Also, updates the unit-test `FileReaderFactory` helpers similarily.

*Drive-by change:* Fix the `CMapReaderFactory` usage in the annotation unit-tests, since the cache should only contain raw data and not a Promise. While this obviously works as-is, having unit-tests that "abuse" the intended data format can easily lead to unnecessary failures if changes are made to the relevant `src/core/` code.
2021-01-08 17:33:59 +01:00
Calixte Denizet
ffd4bc790c JS -- Add tests for print/save actions
* change PDFDocument::hasJSActions to return true when there are JS actions in catalog.
2020-12-24 18:51:00 +01:00
Calixte Denizet
1e2173f038 JS - Collect and execute actions at doc and pages level
* the goal is to execute actions like Open or OpenAction
 * can be tested with issue6106.pdf (auto-print)
 * once #12701 is merged, we can add page actions
2020-12-18 20:03:59 +01:00
Jonas Jenwald
01d12b465c [api-minor] Add "contentLength" to the information returned by the getMetadata method
Given that we already include the "Content-Disposition"-header filename, when it exists, it shouldn't hurt to also include the information from the "Content-Length"-header.
For PDF documents opened via a URL, which should be a very common way for the PDF.js library to be used, this will[1] thus provide a way of getting the PDF filesize without having to wait for the `getDownloadInfo`-promise to resolve[2].

With these API improvements, we can also simplify the filesize handling in the `PDFDocumentProperties` class.

---
[1] Assuming that the server is correctly configured, of course.

[2] Since that's not *guaranteed* to happen in general, with e.g. `disableAutoFetch = true` set.
2020-11-20 15:30:36 +01:00
Tim van der Meij
e341e6e542
Merge pull request #12525 from brendandahl/mark-info
[api-minor] Implement API to get MarkInfo from the catalog.
2020-10-31 00:05:19 +01:00
Brendan Dahl
f5c821e9c3 [api-minor] Implement API to get MarkInfo from the catalog. 2020-10-30 10:59:45 -07:00
Tim van der Meij
3e2bfb5819
Convert var to const/let in the test/unit folder
This has been done automatically using ESLint's `--fix` argument.
2020-10-25 15:40:51 +01:00
Jonas Jenwald
13e44c0776 Attempt to reduce intermittent failures in the "multiple render() on the same canvas" unit-test
This patch should *hopefully* remove the intermittent unit-test failure, by using the *same* `optionalContentConfigPromise` for both `renderTask`s and thus get more predictable timing behaviour.
2020-08-04 22:31:24 +02:00
Brendan Dahl
ac494a2278 Add support for optional marked content.
Add a new method to the API to get the optional content configuration. Add
a new render task param that accepts the above configuration.
For now, the optional content is not controllable by the user in
the viewer, but renders with the default configuration in the PDF.

All of the test files added exhibit different uses of optional content.

Fixes #269.

Fix test to work with optional content.

- Change the stopAtErrors test to ensure the operator list has something,
  instead of asserting the exact number of operators.
2020-08-04 09:26:55 -07:00
Jonas Jenwald
86a8fd9810 Attempt to reduce intermittent failures in the "cleans up document resources during rendering of page" unit-test
This patch should *hopefully* remove the `Unhandled promise rejection: ...` errors, by returning the "final" promise. Also, by pausing/delaying of rendering slightly the likelihood of the test failing in the first place should thus be reduced.
2020-07-26 14:05:46 +02:00
Jonas Jenwald
4a7e29865d [api-minor] Use the NodeCanvasFactory/NodeCMapReaderFactory classes as defaults in Node.js environments (issue 11900)
This moves, and slightly simplifies, code that's currently residing in the unit-test utils into the actual library, such that it's bundled with `GENERIC`-builds and used in e.g. the API-code.

As an added bonus, this also brings out-of-the-box support for CMaps in e.g. the Node.js examples.
2020-07-02 04:44:23 +02:00
Jonas Jenwald
dda6626f40 Attempt to cache repeated images at the document, rather than the page, level (issue 11878)
Currently image resources, as opposed to e.g. font resources, are handled exclusively on a page-specific basis. Generally speaking this makes sense, since pages are separate from each other, however there's PDF documents where many (or even all) pages actually references exactly the same image resources (through the XRef table). Hence, in some cases, we're decoding the *same* images over and over for every page which is obviously slow and wasting both CPU and memory resources better used elsewhere.[1]

Obviously we cannot simply treat all image resources as-if they're used throughout the entire PDF document, since that would end up increasing memory usage too much.[2]
However, by introducing a `GlobalImageCache` in the worker we can track image resources that appear on more than one page. Hence we can switch image resources from being page-specific to being document-specific, once the image resource has been seen on more than a certain number of pages.

In many cases, such as e.g. the referenced issue, this patch will thus lead to reduced memory usage for image resources. Scrolling through all pages of the document, there's now only a few main-thread copies of the same image data, as opposed to one for each rendered page (i.e. there could theoretically be *twenty* copies of the image data).
While this obviously benefit both CPU and memory usage in this case, for *very* large image data this patch *may* possibly increase persistent main-thread memory usage a tiny bit. Thus to avoid negatively affecting memory usage too much in general, particularly on the main-thread, the `GlobalImageCache` will *only* cache a certain number of image resources at the document level and simply fallback to the default behaviour.

Unfortunately the asynchronous nature of the code, with ranged/streamed loading of data, actually makes all of this much more complicated than if all data could be assumed to be immediately available.[3]

*Please note:* The patch will lead to *small* movement in some existing test-cases, since we're now using the built-in PDF.js JPEG decoder more. This was done in order to simplify the overall implementation, especially on the main-thread, by limiting it to only the `OPS.paintImageXObject` operator.

---
[1] There's e.g. PDF documents that use the same image as background on all pages.

[2] Given that data stored in the `commonObjs`, on the main-thread, are only cleared manually through `PDFDocumentProxy.cleanup`. This as opposed to data stored in the `objs` of each page, which is automatically removed when the page is cleaned-up e.g. by being evicted from the cache in the default viewer.

[3] If the latter case were true, we could simply check for repeat images *before* parsing started and thus avoid handling *any* duplicate image resources.
2020-05-21 18:13:45 +02:00
Jonas Jenwald
cdc60402f6 [api-minor] Change PageViewport to throw when the rotation is not a multiple of 90 degrees
As evident from the code, `PageViewport` only supports[1] `rotation` values which are a multiple of 90 degrees. Besides it being somewhat difficult to imagine meaningful use-cases for a non-multiple of 90 degrees `rotation`, the code also becomes both simpler and more efficient by not having to consider arbitrary `rotation` values.

However, any invalid rotation will *silently* fallback to assume zero `rotation` which probably isn't great for e.g. `PDFPageProxy.getViewport` in the API. Hence this patch, which will now enforce that only valid `rotation` values are accepted.

---
[1] As far as I can tell, from looking through the history, nothing else has ever been supported either.
2020-04-22 15:19:13 +02:00
Jonas Jenwald
1cc3dbb694 Enable the dot-notation ESLint rule
*Please note:* These changes were done automatically, using the `gulp lint --fix` command.

This rule is already enabled in mozilla-central, see https://searchfox.org/mozilla-central/rev/567b68b8ff4b6d607ba34a6f1926873d21a7b4d7/tools/lint/eslint/eslint-plugin-mozilla/lib/configs/recommended.js#103-104

The main advantage, besides improved consistency, of this rule is that it reduces the size of the code (by 3 bytes for each case). In the PDF.js code-base there's close to 8000 instances being fixed by the `dot-notation` ESLint rule, which end up reducing the size of even the *built* files significantly; the total size of the `gulp mozcentral` build target changes from `3 247 456` to `3 224 278` bytes, which is a *reduction* of `23 178` bytes (or ~0.7%) for a completely mechanical change.

A large number of these changes affect the (large) lookup tables used on the worker-thread, but given that they are still initialized lazily I don't *think* that the new formatting this patch introduces should undo any of the improvements from PR 6915.

Please find additional details about the ESLint rule at https://eslint.org/docs/rules/dot-notation
2020-04-17 12:24:46 +02:00
Jonas Jenwald
746eaf3154 [api-minor] Fix the return value of PDFDocumentProxy.getViewerPreferences when no viewer preferences are present (PR 10738 follow-up)
This patch fixes yet another instalment in the never-ending series of "what the *bleep* was I thinking", by changing the `PDFDocumentProxy.getViewerPreferences` method to return `null` by default.
Not only is this method now consistent with many other API methods, for the data not present case, but it also avoids having to e.g. loop through an object to check if it's actually empty (note the old unit-test).
2020-04-14 23:25:50 +02:00
Jonas Jenwald
426945b480 Update Prettier to version 2.0
Please note that these changes were done automatically, using `gulp lint --fix`.

Given that the major version number was increased, there's a fair number of (primarily whitespace) changes; please see https://prettier.io/blog/2020/03/21/2.0.0.html
In order to reduce the size of these changes somewhat, this patch maintains the old "arrowParens" style for now (once mozilla-central updates Prettier we can simply choose the same formatting, assuming it will differ here).
2020-04-14 12:28:14 +02:00
Jonas Jenwald
66ee8f5acd Remove variable shadowing from the JavaScript files in the test/unit/ folder
*This is part of a series of patches that will try to split PR 11566 into smaller chunks, to make reviewing more feasible.*

Once all the code has been fixed, we'll be able to eventually enable the ESLint no-shadow rule; see https://eslint.org/docs/rules/no-shadow
2020-03-24 10:44:17 +01:00
Jonas Jenwald
ae2900e510 [api-minor] Change the pageIndex, on PDFPageProxy instances, to a private property
This property has never been documented and/or *intentionally* exposed through the API, instead the `PDFPageProxy.pageNumber` property is the documented/intended API to use here.
Hence pageIndex is changed to a "private" property on `PDFPageProxy` instances, and internal API functionality is also updated to *consistently* use `this._pageIndex` rather than a mix of formats.
2020-03-19 15:47:11 +01:00
Jonas Jenwald
01fb309a2a [api-minor] Add more general OpenAction support (PR 10334 follow-up, issue 11642)
This patch deprecates the existing `getOpenActionDestination` API method, in favor of a better and more general `getOpenAction` method instead. (For now JavaScript actions, related to printing, are still handled as before.)

By clearly separating "regular" Print actions from the JavaScript handling, it's thus possible to get rid of the somewhat annoying and strictly incorrect warning when the viewer loads.
2020-03-06 13:03:00 +01:00
Jonas Jenwald
3c7b7be100 Prevent circular references in the /Pages tree 2020-02-19 01:49:39 +01:00
Tim van der Meij
2fb4076e05 Merge pull request #11568 from Snuffleupagus/PDF-header-validation
Ensure that the PDF header contains an actual number (PR 11463 follow-up)
2020-02-09 17:16:25 +01:00
Jonas Jenwald
7117ee03d6 [api-minor] Change PDFDocumentProxy.cleanup/PDFPageProxy.cleanup to return data
This patch makes the following changes, to improve these API methods:

 - Let `PDFPageProxy.cleanup` return a boolean indicating if clean-up actually happened, since ongoing rendering will block clean-up.
   Besides being used in other parts of this patch, it seems that an API user may also be interested in the return value given that clean-up isn't *guaranteed* to happen.

 - Let `PDFDocumentProxy.cleanup` return the promise indicating when clean-up is finished.

 - Improve the JSDoc comment for `PDFDocumentProxy.cleanup` to mention that clean-up is triggered on *both* threads (without going into unnecessary specifics regarding what *exactly* said data actually is).
   Add a note in the JSDoc comment about not calling this method when rendering is ongoing.

 - Change `WorkerTransport.startCleanup` to throw an `Error` if it's called when rendering is ongoing, to prevent rendering from breaking.
   Please note that this won't stop *worker-thread* clean-up from happening (since there's no general "something is rendering"-flag), however I'm not sure if that's really a problem; but please don't quote me on that :-)
   All of the caches that's being cleared in `Catalog.cleanup`, on the worker-thread, *should* be re-filled automatically even if cleared *during* parsing/rendering, and the only thing that probably happens is that e.g. font data would have to be re-parsed.
  On the main-thread, on the other hand, clearing the caches is more-or-less guaranteed to cause rendering errors, since the rendering code in `src/display/canvas.js` isn't able to re-request any image/font data that's suddenly being pulled out from under it.

 - Last, but not least, add a couple of basic unit-tests for the clean-up functionality.
2020-02-07 17:00:29 +01:00
Jonas Jenwald
88c35d872f Ensure that the PDF header contains an actual number (PR 11463 follow-up)
While it would be nice to change the `PDFFormatVersion` property, as returned through `PDFDocumentProxy.getMetadata`, to a number (rather than a string) that would unfortunately be a breaking API change.
However, it does seem like a good idea to at least *validate* the PDF header version on the worker-thread, rather than potentially returning an arbitrary string.
2020-02-07 12:25:07 +01:00
Jonas Jenwald
9e262ae7fa Enable the ESLint prefer-const rule globally (PR 11450 follow-up)
Please find additional details about the ESLint rule at https://eslint.org/docs/rules/prefer-const

With the recent introduction of Prettier this sort of mass enabling of ESLint rules becomes a lot easier, since the code will be automatically reformatted as necessary to account for e.g. changed line lengths.

Note that this patch is generated automatically, by using the ESLint `--fix` argument, and will thus require some additional clean-up (which is done separately).
2020-01-25 00:20:22 +01:00
Jonas Jenwald
36881e3770 Ensure that all import and require statements, in the entire code-base, have a .js file extension
In order to eventually get rid of SystemJS and start using native `import`s instead, we'll need to provide "complete" file identifiers since otherwise there'll be MIME type errors when attempting to use `import`.
2020-01-04 13:01:43 +01:00
Jonas Jenwald
d9d856020f Move the regular expression, used with auto printing in the viewer, to web/ui_utils.js and also use it in the API unit-tests
Rather than having a copy of this regular expression in the `test/unit/api_spec.js` file, with a comment about keeping it up-to-date with the code in the viewer (note the incorrect file reference as well), we can just import it instead to simplify all of this.
2019-12-27 00:38:28 +01:00
Tim van der Meij
dfe42a5ca4
Include a unit test for OpenAction dictionaries without Type entries (PR 11443 follow-up)
The original issue did not contain a (reduced) test case that we could
include and linked test cases are not ideal for unit tests, so the
original PR could only be verified manually.

I found this a bit unfortunate considering that the print data is
exposed through the API, so I thought about how we could have an
automated test and managed to create a reduced test case with the
OpenAction dictionary from the file in the original issue.

Therefore, this commit includes a unit test for parsing OpenAction
dictionaries without `Type` entries. I verified that this PDF file
behaves the same as the original one, i.e., no print dialog is shown for
older viewers and the print dialog is shown for the most recent viewer.
2019-12-27 00:05:51 +01:00
Jonas Jenwald
de36b2aaba Enable auto-formatting of the entire code-base using Prettier (issue 11444)
Note that Prettier, purposely, has only limited [configuration options](https://prettier.io/docs/en/options.html). The configuration file is based on [the one in `mozilla central`](https://searchfox.org/mozilla-central/source/.prettierrc) with just a few additions (to avoid future breakage if the defaults ever changes).

Prettier is being used for a couple of reasons:

 - To be consistent with `mozilla-central`, where Prettier is already in use across the tree.

 - To ensure a *consistent* coding style everywhere, which is automatically enforced during linting (since Prettier is used as an ESLint plugin). This thus ends "all" formatting disussions once and for all, removing the need for review comments on most stylistic matters.

Many ESLint options are now redundant, and I've tried my best to remove all the now unnecessary options (but I may have missed some).
Note also that since Prettier considers the `printWidth` option as a guide, rather than a hard rule, this patch resorts to a small hack in the ESLint config to ensure that *comments* won't become too long.

*Please note:* This patch is generated automatically, by appending the `--fix` argument to the ESLint call used in the `gulp lint` task. It will thus require some additional clean-up, which will be done in a *separate* commit.

(On a more personal note, I'll readily admit that some of the changes Prettier makes are *extremely* ugly. However, in the name of consistency we'll probably have to live with that.)
2019-12-26 12:34:24 +01:00
Jonas Jenwald
b4d95f3763 Tweak the "gets page stats after rendering page, with pdfBug set" unit-test to remove an intermittent failure on Travis
I recently noticed a couple of intermittent failures on Travis, hence this patch which changes the expectation to be identical to the 'Page Request' check in the preceding test-case.
2019-12-23 23:07:02 +01:00
Jonas Jenwald
b00835f589 Attempt to improve the PDFDocument error message for empty files (issue 5887)
Given that the error in question is surfaced on the API-side, this patch makes the following changes:
 - Updates the wording such that it'll hopefully be slightly easier for users to understand.
 - Changes the plain `Error` to an `InvalidPDFException` instead, since that should work better with the existing Error handling.
 - Adds a unit-test which loads an empty PDF document (and also improves a pre-existing `InvalidPDFException` message and its test-case).
2019-12-09 15:45:50 +01:00
Jonas Jenwald
74e00ed93c Change isNodeJS from a function to a constant
Given that this shouldn't change after the `pdf.js`/`pdf.worker.js` files have been loaded, it doesn't seems necessary to keep this as a function.
2019-11-10 16:44:29 +01:00
Jonas Jenwald
2817121bc1 Convert globalScope and isNodeJS to proper modules
Slightly unrelated to the rest of the patch, but this also removes an out-of-place `globals` definition from the `web/viewer.js` file.
2019-11-10 16:44:29 +01:00