Commit Graph

64 Commits

Author SHA1 Message Date
Jonas Jenwald
59c4041a49 Update the examples/-folder to account for outputting of JavaScript modules (PR 17055 follow-up)
This patch also changes most examples to use "top level await", since that's now supported and slightly simplifies the code.
2023-10-28 10:26:25 +02:00
Jonas Jenwald
d53093045a Enable the import/no-commonjs ESLint plugin rule
Given the amount of work put into removing `require`-calls from the code-base, let's ensure that new ones aren't accidentally added in the future.

Note that we still have a couple of files where `require` is being used, in particular:
 - The Node.js examples, however those will be updated to use `import` in PR 17081.
 - The Webpack examples, and related support files, however I unfortunately don't know enough about Webpack to be able to update those. (Hopefully users of that code will help out here, once version `4` is released.)
 - The `statcmp`-tool, since *some* of those `require`-calls cannot be converted to `import` without other code changes (and that file is only used during benchmarking).

Please find additional details at https://github.com/import-js/eslint-plugin-import/blob/main/docs/rules/no-commonjs.md
2023-10-14 12:49:17 +02:00
Jonas Jenwald
3ced0dec1b [api-major] Remove the SVG back-end (PR 15173 follow-up)
This has been deprecated since version `2.15.349`, which is a year ago.
Removing this will also simplify some upcoming changes, specifically outputting of JavaScript modules in the builds.
2023-10-01 23:14:29 +02:00
Jonas Jenwald
f42a2e8451
[api-minor] Move the canvasFactory option into getDocument
Rather than repeatedly initializing a `canvasFactory`-instance for every page, move it to the document-level instead.

*Please note:* This patch is written using the GitHub UI, since I'm currently without a dev machine, so hopefully it works correctly.
2023-03-01 09:07:16 +01:00
Jonas Jenwald
8129815538 Enable the unicorn/prefer-dom-node-append ESLint plugin rule
This rule will help enforce slightly shorter code, especially since you can insert multiple elements at once, and according to MDN `Element.append()` is available in all browsers that we currently support.

Please find additional information here:
 - https://developer.mozilla.org/en-US/docs/Web/API/Element/append
 - https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-dom-node-append.md
2022-06-12 13:07:03 +02:00
Jonas Jenwald
dd58671589 Remove mention of gulp singlefile-command from examples/node/getinfo.js
This comment should've been removed in PR 9385, but better late than never I suppose.
2022-06-04 23:10:58 +02:00
Jonas Jenwald
487a7ddc7d Update (primarily) the Node.js examples to release page resources
Given that Node.js doesn't support Workers, general PDF.js performance will be worse when compared to browsers. In an attempt to improve at least memory usage a little bit, update the Node.js examples to release page resources once parsing is done for that page.
2021-11-30 13:11:50 +01:00
Jonas Jenwald
ff9d2b2ab1 Prevent run-time errors in Node.js versions with URL.createObjectURL support (issue 14170)
Apparently Node.js has added *global* `URL.createObjectURL` support, but not done the same thing for `Blob`. Hence we also need to check for the availability of `Blob` in the `createObjectURL` helper function, and it's probably a good idea to also update `examples/node/pdf2svg.js` to work-around this until these changes reach an official PDF.js release.
2021-10-21 10:32:44 +02:00
Benson Imoh,ST
0643ccb68b
Convert examples/node/pdf2svg.js to await/async #14125 2021-10-20 21:51:58 +01:00
Jane-Kotovich
37d90ec378 Convert examples/node/pdf2png/pdf2png.js to await/async 2021-10-14 20:35:18 +10:00
Jonas Jenwald
2ffc921163 Stop encoding the value in the DOMElement.setAttribute method (issue 8558)
This patch is an attempt at closing an old, and seemingly trivial, issue and the SVG-files created by the `pdf2svg.js` examples still appear to work just fine when opened in browsers (tested with Firefox Nightly and Google Chrome Beta).
2021-06-20 11:55:24 +02:00
Brendan Dahl
4c1dd47e65 Include and use the 14 standard fonts files. 2021-06-07 11:10:11 -07:00
Jonas Jenwald
5c712f2131 Enable the ESLint no-var rule in the examples/ folder
Updating the examples to use `let`/`const` should be fine, given that they are available in all browsers/platforms that the PDF.js library now supports; please note the following compatibility information:

 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/let#browser_compatibility
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/const#browser_compatibility
2021-03-12 17:52:52 +01:00
Jonas Jenwald
276fa4ad8f Replace *most* cases of var with let/const in the examples/ folder
These changes were done automatically, by using the `gulp lint --fix` command, in preparation for the next patch.
2021-03-12 17:16:59 +01:00
Jonas Jenwald
0068dba009 [api-minor] Rename -es5 to -legacy, to reduce confusion over what's actually supported (issue 12976)
*Please note that this will also require some edits of the Wiki.*
2021-02-10 16:01:59 +01:00
Jonas Jenwald
4db7330677 Enable ESLint rules that no longer need to be disabled on a directory/file-basis
Given that browsers/environments without native support for both arrow functions and object shorthand properties are no longer supported in PDF.js, please refer to the compatibility information below, we can now enable a fair number of ESLint rules and also simplify/remove some `.eslintrc` files.

With the exception of the `no-alert` cases, all code changes were made automatically by using `gulp lint --fix`.

 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/Arrow_functions#browser_compatibility
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Object_initializer#browser_compatibility
2021-01-22 17:47:03 +01:00
Jonas Jenwald
4a7e29865d [api-minor] Use the NodeCanvasFactory/NodeCMapReaderFactory classes as defaults in Node.js environments (issue 11900)
This moves, and slightly simplifies, code that's currently residing in the unit-test utils into the actual library, such that it's bundled with `GENERIC`-builds and used in e.g. the API-code.

As an added bonus, this also brings out-of-the-box support for CMaps in e.g. the Node.js examples.
2020-07-02 04:44:23 +02:00
Jonas Jenwald
0351852d74 [api-minor] Decode all JPEG images with the built-in PDF.js decoder in src/core/jpg.js
Currently some JPEG images are decoded by the built-in PDF.js decoder in `src/core/jpg.js`, while others attempt to use the browser JPEG decoder. This inconsistency seem unfortunate for a number of reasons:

 - It adds, compared to the other image formats supported in the PDF specification, a fair amount of code/complexity to the image handling in the PDF.js library.

 - The PDF specification support JPEG images with features, e.g. certain ColorSpaces, that browsers are unable to decode natively. Hence, determining if a JPEG image is possible to decode natively in the browser require a non-trivial amount of parsing. In particular, we're parsing (part of) the raw JPEG data to extract certain marker data and we also need to parse the ColorSpace for the JPEG image.

 - While some JPEG images may, for all intents and purposes, appear to be natively supported there's still cases where the browser may fail to decode some JPEG images. In order to support those cases, we've had to implement a fallback to the PDF.js JPEG decoder if there's any issues during the native decoding. This also means that it's no longer possible to simply send the JPEG image to the main-thread and continue parsing, but you now need to actually wait for the main-thread to indicate success/failure first.
   In practice this means that there's a code-path where the worker-thread is forced to wait for the main-thread, while the reverse should *always* be the case.

 - The native decoding, for anything except the *simplest* of JPEG images, result in increased peak memory usage because there's a handful of short-lived copies of the JPEG data (see PR 11707).
Furthermore this also leads to data being *parsed* on the main-thread, rather than the worker-thread, which you usually want to avoid for e.g. performance and UI-reponsiveness reasons.

 - Not all environments, e.g. Node.js, fully support native JPEG decoding. This has, historically, lead to some issues and support requests.

 - Different browsers may use different JPEG decoders, possibly leading to images being rendered slightly differently depending on the platform/browser where the PDF.js library is used.

Originally the implementation in `src/core/jpg.js` were unable to handle all of the JPEG images in the test-suite, but over the last couple of years I've fixed (hopefully) all of those issues.
At this point in time, there's two kinds of failure with this patch:

 - Changes which are basically imperceivable to the naked eye, where some pixels in the images are essentially off-by-one (in all components), which could probably be attributed to things such as different rounding behaviour in the browser/PDF.js JPEG decoder.
   This type of "failure" accounts for the *vast* majority of the total number of changes in the reference tests.

 - Changes where the JPEG images now looks *ever so slightly* blurrier than with the native browser decoder. For quite some time I've just assumed that this pointed to a general deficiency in the `src/core/jpg.js` implementation, however I've discovered when comparing two viewers side-by-side that the differences vanish at higher zoom levels (usually around 200% is enough).
   Basically if you disable [this downscaling in canvas.js](8fb82e939c/src/display/canvas.js (L2356-L2395)), which is what happens when zooming in, the differences simply vanish!
   Hence I'm pretty satisfied that there's no significant problems with the `src/core/jpg.js` implementation, and the problems are rather tied to the general quality of the downscaling algorithm used. It could even be seen as a positive that *all* images now share the same downscaling behaviour, since this actually fixes one old bug; see issue 7041.
2020-05-22 00:22:48 +02:00
Jonas Jenwald
c355f91d2e [api-minor] Immediately release the font.data property once the font been attached to the DOM (PR 11777 follow-up)
*This patch implements https://github.com/mozilla/pdf.js/pull/11777#issuecomment-609741348*

This extends the work from PR 11773 and 11777 further, by immediately releasing the `font.data` property once the font been attached to the DOM. By not unnecessarily holding onto this data on the main-thread, we'll thus reduce the memory usage of fonts even further (especially beneficial in longer documents with composite fonts).

The new behaviour is controlled by the recently added `fontExtraProperties` API option (adding a new option just for this patch didn't seem necessary), since there's one edge-case in the SVG renderer where the `font.data` property is necessary (see the `pdf2svg` example).

Note that while the default viewer does run clean-up with an idle timeout, that timeout will be reset whenever rendering occurs *or* when scrolling happens in the viewer. In practice this means that unless the user doesn't interact with the viewer in *any* way during an extended period of time, currently set to 30 seconds, the `PDFDocumentProxy.cleanup` method will never be called and font resources will thus not be cleaned-up.
2020-04-23 13:04:57 +02:00
Jonas Jenwald
426945b480 Update Prettier to version 2.0
Please note that these changes were done automatically, using `gulp lint --fix`.

Given that the major version number was increased, there's a fair number of (primarily whitespace) changes; please see https://prettier.io/blog/2020/03/21/2.0.0.html
In order to reduce the size of these changes somewhat, this patch maintains the old "arrowParens" style for now (once mozilla-central updates Prettier we can simply choose the same formatting, assuming it will differ here).
2020-04-14 12:28:14 +02:00
Jonas Jenwald
c97c778f8f [api-minor] Produce non-translated/non-polyfilled builds by default 2020-02-14 18:12:07 +01:00
Jonas Jenwald
de36b2aaba Enable auto-formatting of the entire code-base using Prettier (issue 11444)
Note that Prettier, purposely, has only limited [configuration options](https://prettier.io/docs/en/options.html). The configuration file is based on [the one in `mozilla central`](https://searchfox.org/mozilla-central/source/.prettierrc) with just a few additions (to avoid future breakage if the defaults ever changes).

Prettier is being used for a couple of reasons:

 - To be consistent with `mozilla-central`, where Prettier is already in use across the tree.

 - To ensure a *consistent* coding style everywhere, which is automatically enforced during linting (since Prettier is used as an ESLint plugin). This thus ends "all" formatting disussions once and for all, removing the need for review comments on most stylistic matters.

Many ESLint options are now redundant, and I've tried my best to remove all the now unnecessary options (but I may have missed some).
Note also that since Prettier considers the `printWidth` option as a guide, rather than a hard rule, this patch resorts to a small hack in the ESLint config to ensure that *comments* won't become too long.

*Please note:* This patch is generated automatically, by appending the `--fix` argument to the ESLint call used in the `gulp lint` task. It will thus require some additional clean-up, which will be done in a *separate* commit.

(On a more personal note, I'll readily admit that some of the changes Prettier makes are *extremely* ugly. However, in the name of consistency we'll probably have to live with that.)
2019-12-26 12:34:24 +01:00
Jonas Jenwald
3783eccfa4 Use the strict mode assert in the pdf2png Node.js example (issue 10768)
See https://nodejs.org/api/assert.html#assert_strict_mode
2019-12-21 13:24:13 +01:00
Jonas Jenwald
9c3024fe7e Add missing hasChildNodes polyfill to domstubs.js (PR 10022 follow-up) 2019-04-01 23:23:50 +02:00
Mohammed Essehemy
f0e9df745c
migrate to canvas 2.x api 2019-01-02 01:10:07 +02:00
Jonas Jenwald
f0719ed565 [api-minor] Change the getViewport method, on PDFPageProxy, to take a parameter object rather than a bunch of (randomly) ordered parameters
If, as PR 10368 suggests, more parameters should be added to `getViewport` I think that it would be a mistake to not change the signature *first* to avoid needlessly unwieldy call-sites.

To not break any existing code and third-party use-cases, this is obviously implemented with a deprecation warning *and* with a working fallback[1] for the old method signature.

---
[1] This is limited to `GENERIC` builds, which should be sufficient.
2018-12-21 11:55:20 +01:00
Wojciech Maj
9e3f7ac7fa Manually fix remaining ESLint errors 2018-12-11 15:23:26 +01:00
Wojciech Maj
ef1f255649 ESLint --fix 2018-12-11 15:23:26 +01:00
Wojciech Maj
80d7ff4912 Turn on ESLint in examples directory, apply examples-specific exceptions 2018-12-11 15:23:26 +01:00
Felipe augusto
1a75647a27
Remove unuseful variable
Variable is declared, but never used.
2018-12-01 01:44:18 -02:00
Jonas Jenwald
2c003a82d5 Convert RenderTask, in src/display/api.js, to an ES6 class
Also deprecates the `then` method, in favour of the `promise` getter.
2018-11-18 19:08:00 +01:00
Jonas Jenwald
ef8e5fd77c Convert PDFDocumentLoadingTask, in src/display/api.js, to an ES6 class
Also deprecates the `then` method, in favour of the `promise` getter.
2018-11-18 19:07:57 +01:00
Jonas Jenwald
0ecc22cb04 Attempt to provide better default values for the disableFontFace/nativeImageDecoderSupport API options in Node.js
This should provide a better out-of-the-box experience when using PDF.js in a Node.js environment, since it's missing native support for both `@font-face` and `Image`.
Please note that this change *only* affects the default values, hence it's still possible for an API consumer to override those values when calling `getDocument`.

Also, prevents "ReferenceError: document is not defined" errors, when running the unit-tests in Node.js/Travis.
2018-06-03 00:28:37 +02:00
Tim van der Meij
64b1315bb5
Improve the instructions and code for the pdf2png example
We need to pass `disableFontFace` and `nativeImageDecoderSupport`
because Node.js has no native support for `@font-face` and `Image`.
Doing so makes it possible to render e.g., the Tracemonkey paper, which
failed before. I made this PDF file the default because it's also the
default in other examples/demos and because it showcases the
possibilities better than the very simple hello world PDF file.

Building the library with `gulp dist-install` is easier and is already
recommended in the other examples.
2018-04-01 12:52:57 +02:00
Jonas Jenwald
0e1b5589e7 Restore the btoa/atob polyfills for Node.js
These were removed in PR 9170, since they were unused in the browsers that we'll support in PDF.js version `2.0`.
However looking at the output of Travis, where a subset of the unit-tests are run using Node.js, there's warnings about `btoa` being undefined. This doesn't appear to cause any errors, which probably explains why we didn't notice this before (despite PR 9201).
2018-01-13 01:31:05 +01:00
巴里切罗
27a619246f Add btoa back to domstubs.js 2017-11-28 16:34:53 +08:00
Tim van der Meij
d4309614f9
Replace DOMParser with SimpleXMLParser
The `DOMParser` is most likely overkill and may be less secure.
Moreover, it is not supported in Node.js environments.

This patch replaces the `DOMParser` with a simple XML parser. This
should be faster and gives us Node.js support for free. The simple XML
parser is a port of the one that existed in the examples folder with a
small regex fix to make the parsing work correctly.

The unit tests are extended for increased test coverage of the metadata
code. The new method `getAll` is provided so the example does not have
to access internal properties of the object anymore.
2017-09-19 23:09:07 +02:00
Tim van der Meij
cc654fd38d
Provide a stub for setAttribute in order to use the SVG back-end with
Node.js

This patch fixes a regression from PR #8691 where we switched to using
`setAttribute` instead of `setAttributeNS` if no namespace is provided.
2017-09-12 23:23:41 +02:00
Mukul Mishra
d16709f5e4 Adds tests for node_stream 2017-08-24 12:46:44 +05:30
Rob Wu
9b5086d649 pdf2svg.js: Serialize the SVG to a stream
Implement a serialization "generator" for `DOMElement` in domutils.js
that yields the serialization of the SVG element. This method is used by
a newly added `ReadableSVGStream` class, which can be used like any
other readable stream in Node.js.

This reduces the memory requirements. Now, it is not needed to require
the serialization to fully fit in memory.

Note: The implementation of the serializer is a state machine in ES5
since the rest of the file is also in ES5. Its functionality is
equivalent to:

```
function* serializeSVGElement(elem) {
  yield '<' + elem.nodeName;
  if (elem.nodeName === 'svg:svg') {
    yield ' xmlns:xlink="http://www.w3.org/1999/xlink"' +
          ' xmlns:svg="http://www.w3.org/2000/svg"';
  }
  for (let i in elem.attributes) {
    yield ' ' + i + '="' + xmlEncode(elem.attributes[i]) + '"';
  }

  yield '>';

  if (elem.nodeName === 'svg:tspan' || elem.nodeName === 'svg:style') {
    yield xmlEncode(elem.textContent);
  } else {
    for (let childNode of elem.childNodes) {
      yield* serializeSVGElement(childNode);
    }
  }
  yield '</' + elem.nodeName + '>';
}
```
2017-08-16 19:16:38 +02:00
Rob Wu
18566091aa Fix display_svg_spec tests.
- Mark the test as async, and don't swallow exceptions.
- Fix the DOMElement polyfill to behave closer to the actual getAttributeNS
  method, which excludes the namespace prefix.
2017-07-16 11:01:52 +02:00
Rob Wu
3479a19bf0 Remove btoa from domstubs.js
btoa is already defined by src/shared/compatibility.js,
which is unconditionally imported by src/shared/util.js.
2017-07-10 18:45:47 +02:00
Rob Wu
9caaaf3a91 Add setStubs/unsetStubs to domstubs to support testing
Do not directly export to global. Instead, export all stubs in domstubs.js and
add a method setStubs to assign all exported stubs to a namespace. Then replace
the import domstubs with an explicit call to this setStubs method.  Also added
unsetStubs for undoing the changes. This is done to allow unit testing of the
SVG backend without namespace pollution.
2017-07-10 18:45:47 +02:00
Yury Delendik
9bed695ebd Merge pull request #8540 from Rob--W/svg-oom
Reduce memory requirements of pdf2svg.js example to avoid OOM
2017-06-20 17:24:48 -05:00
Rob Wu
0cc1735809 Reduce concurrent memory footprint of pdf2svg.js
Wait for the completion of writing the generated SVG file before
processing the next page. This is to enable the garbage collector to
garbage-collect the (potentially large) SVG string before trying to
allocate memory again for the next page.

Note that since the PDF-to-SVG conversion is now sequential instead of
parallel, the time to generate all pages increases.

Test case:
node --max_old_space_size=200 examples/node/pdf2svg.js /tmp/FatalProcessOutOfMemory.pdf

Before this patch:
- Node.js crashes due to OOM after processing 20 pages.

After this patch:
- Node.js is able to convert all 203 PDFs to SVG without crashing.
2017-06-19 21:53:11 +02:00
Rob Wu
849d8cfa24 Improve memory-efficiency of DOMElement_toString in domstubs
Test case:
Using the PDF file from https://github.com/mozilla/pdf.js/issues/8534
node --max_old_space_size=200 examples/node/pdf2svg.js /tmp/FatalProcessOutOfMemory.pdf

Before this patch:
Node.js crashes due to OOM after processing 10 pages.

After this patch:
Node.js crashes due to OOM after processing 19 pages.
2017-06-19 21:52:39 +02:00
Rob Wu
4f22ba54bf Add getAttributeNS to domstubs for SVG example
The closePath method in src/display/svg.js relies on this.
2017-06-19 14:11:13 +02:00
Yury Delendik
a18caa730d Adds gulp dist-install command; using pdfjs-dist package in examples. 2017-06-12 10:22:16 -05:00
Jonas Jenwald
c5f73edcd2 Convert the DOMCanvasFactory to an ES6 class
For consistency, also updates the `pdf2png.js` example to use the slightly less verbose `canvasAndContext` parameter name.
2017-05-11 20:15:22 +02:00
巴里切罗
8d5d97264e fix(svg) adjust strategy for decoding JPEG images 2017-05-08 11:32:44 +08:00