Given the amount of work put into removing `require`-calls from the code-base, let's ensure that new ones aren't accidentally added in the future.
Note that we still have a couple of files where `require` is being used, in particular:
- The Node.js examples; however, those will be updated to use `import` in PR 17081.
- The Webpack examples, and related support files; unfortunately I don't know enough about Webpack to be able to update those. (Hopefully users of that code will help out here, once version `4` is released.)
- The `statcmp`-tool, since *some* of those `require`-calls cannot be converted to `import` without other code changes (and that file is only used during benchmarking).
Please find additional details at https://github.com/import-js/eslint-plugin-import/blob/main/docs/rules/no-commonjs.md
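Enforcement then boils down to enabling that rule, with exceptions for the files listed above. A rough sketch of the relevant ESLint configuration; the exact file patterns (including the `statcmp` path) are assumptions for illustration, not the actual configuration:
```
{
  "plugins": ["import"],
  "rules": {
    "import/no-commonjs": "error"
  },
  "overrides": [
    {
      // Exceptions matching the list above (file patterns are assumptions).
      "files": ["examples/webpack/**", "test/stats/statcmp.js"],
      "rules": {
        "import/no-commonjs": "off"
      }
    }
  ]
}
```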
This has been deprecated since version `2.15.349`, which was released a year ago.
Removing this will also simplify some upcoming changes, specifically the outputting of JavaScript modules in the builds.
Rather than repeatedly initializing a `canvasFactory`-instance for every page, move it to the document-level.
*Please note:* This patch is written using the GitHub UI, since I'm currently without a dev machine, so hopefully it works correctly.
Given that Node.js doesn't support Workers, general PDF.js performance will be worse when compared to browsers. In an attempt to improve at least memory usage a little bit, update the Node.js examples to release page resources once parsing is done for that page.
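For instance, in a loop over the document's pages the clean-up could look along these lines; a minimal sketch, where the import path and the text-dumping logic are assumptions, but `page.cleanup()` is the relevant call:
```
import { getDocument } from "pdfjs-dist/legacy/build/pdf.mjs";

async function dumpText(pdfPath) {
  const doc = await getDocument(pdfPath).promise;
  for (let pageNumber = 1; pageNumber <= doc.numPages; pageNumber++) {
    const page = await doc.getPage(pageNumber);
    const textContent = await page.getTextContent();
    console.log(textContent.items.map(item => item.str).join(" "));
    // Release the page resources once parsing is done for this page.
    page.cleanup();
  }
  await doc.destroy();
}
```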
Apparently Node.js has added *global* `URL.createObjectURL` support, but hasn't done the same for `Blob`. Hence we also need to check for the availability of `Blob` in the `createObjectURL` helper function, and it's probably a good idea to also update `examples/node/pdf2svg.js` to work around this until these changes reach an official PDF.js release.
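Conceptually, the check amounts to something along these lines; a simplified sketch of the kind of feature-detection involved, not the actual helper:
```
function createObjectURL(data, contentType) {
  if (typeof Blob !== "undefined" && typeof URL !== "undefined" &&
      URL.createObjectURL) {
    // Both `Blob` and `URL.createObjectURL` are available, e.g. in browsers.
    return URL.createObjectURL(new Blob([data], { type: contentType }));
  }
  // Fall back to a base64-encoded `data:` URL when `Blob` is unavailable,
  // e.g. in current Node.js versions (using the Node.js `Buffer` here).
  return `data:${contentType};base64,${Buffer.from(data).toString("base64")}`;
}
```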
This patch is an attempt at closing an old, and seemingly trivial, issue. The SVG files created by the `pdf2svg.js` example still appear to work just fine when opened in browsers (tested with Firefox Nightly and Google Chrome Beta).
This moves, and slightly simplifies, code that's currently residing in the unit-test utils into the actual library, such that it's bundled with `GENERIC`-builds and used in e.g. the API-code.
As an added bonus, this also brings out-of-the-box support for CMaps in e.g. the Node.js examples.
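In e.g. Node.js, this means that CMap loading now only requires pointing the API at the CMap files; a sketch, where the exact path is an assumption:
```
pdfjsLib.getDocument({
  url: pdfPath,
  // With the factory now bundled, CMap support only needs these options:
  cMapUrl: './node_modules/pdfjs-dist/cmaps/',
  cMapPacked: true,
}).promise.then(function (doc) {
  // ... documents relying on CMaps should now parse out-of-the-box ...
});
```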
Currently some JPEG images are decoded by the built-in PDF.js decoder in `src/core/jpg.js`, while others attempt to use the browser JPEG decoder. This inconsistency seems unfortunate for a number of reasons:
- It adds, compared to the other image formats supported in the PDF specification, a fair amount of code/complexity to the image handling in the PDF.js library.
- The PDF specification supports JPEG images with features, e.g. certain ColorSpaces, that browsers are unable to decode natively. Hence, determining whether a JPEG image can be decoded natively in the browser requires a non-trivial amount of parsing. In particular, we're parsing (part of) the raw JPEG data to extract certain marker data and we also need to parse the ColorSpace for the JPEG image.
- While some JPEG images may, for all intents and purposes, appear to be natively supported, there are still cases where the browser may fail to decode some JPEG images. In order to support those cases, we've had to implement a fallback to the PDF.js JPEG decoder if there are any issues during the native decoding. This also means that it's no longer possible to simply send the JPEG image to the main-thread and continue parsing, but we now need to actually wait for the main-thread to indicate success/failure first.
In practice this means that there's a code-path where the worker-thread is forced to wait for the main-thread, while the reverse should *always* be the case.
- The native decoding, for anything except the *simplest* of JPEG images, results in increased peak memory usage because there's a handful of short-lived copies of the JPEG data (see PR 11707).
Furthermore this also leads to data being *parsed* on the main-thread, rather than the worker-thread, which you usually want to avoid for e.g. performance and UI responsiveness reasons.
- Not all environments, e.g. Node.js, fully support native JPEG decoding. This has historically led to some issues and support requests.
- Different browsers may use different JPEG decoders, possibly leading to images being rendered slightly differently depending on the platform/browser where the PDF.js library is used.
Originally the implementation in `src/core/jpg.js` was unable to handle all of the JPEG images in the test-suite, but over the last couple of years I've fixed (hopefully) all of those issues.
At this point in time, there are two kinds of failures with this patch:
- Changes which are basically imperceptible to the naked eye, where some pixels in the images are essentially off-by-one (in all components), which could probably be attributed to things such as different rounding behaviour in the browser/PDF.js JPEG decoder.
This type of "failure" accounts for the *vast* majority of the total number of changes in the reference tests.
- Changes where the JPEG images now look *ever so slightly* blurrier than with the native browser decoder. For quite some time I've just assumed that this pointed to a general deficiency in the `src/core/jpg.js` implementation; however, when comparing two viewers side-by-side, I've discovered that the differences vanish at higher zoom levels (usually around 200% is enough).
Basically if you disable [this downscaling in canvas.js](8fb82e939c/src/display/canvas.js (L2356-L2395)), which is what happens when zooming in, the differences simply vanish!
Hence I'm pretty satisfied that there are no significant problems with the `src/core/jpg.js` implementation, and that the problems are rather tied to the general quality of the downscaling algorithm used. It could even be seen as a positive that *all* images now share the same downscaling behaviour, since this actually fixes one old bug; see issue 7041.
*This patch implements https://github.com/mozilla/pdf.js/pull/11777#issuecomment-609741348*
This extends the work from PR 11773 and 11777 further, by immediately releasing the `font.data` property once the font has been attached to the DOM. By not unnecessarily holding onto this data on the main-thread, we'll thus reduce the memory usage of fonts even further (especially beneficial in longer documents with composite fonts).
The new behaviour is controlled by the recently added `fontExtraProperties` API option (adding a new option just for this patch didn't seem necessary), since there's one edge-case in the SVG renderer where the `font.data` property is necessary (see the `pdf2svg` example).
Note that while the default viewer does run clean-up with an idle timeout, that timeout will be reset whenever rendering occurs *or* when scrolling happens in the viewer. In practice this means that unless the user leaves the viewer completely idle for an extended period of time, currently set to 30 seconds, the `PDFDocumentProxy.cleanup` method will never be called and font resources will thus not be cleaned up.
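API consumers that want deterministic clean-up can thus invoke it manually; a minimal sketch, assuming the current `PDFDocumentProxy.cleanup` behaviour:
```
const doc = await getDocument({ url }).promise;
// ... render one or more pages ...

// Explicitly release resources, e.g. font data, on both the main- and
// worker-threads, rather than waiting for the viewer's idle timeout:
doc.cleanup();
```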
Please note that these changes were done automatically, using `gulp lint --fix`.
Given that the major version number was increased, there are a fair number of (primarily whitespace) changes; please see https://prettier.io/blog/2020/03/21/2.0.0.html
In order to reduce the size of these changes somewhat, this patch maintains the old "arrowParens" style for now (once mozilla-central updates Prettier we can simply choose the same formatting, assuming it will differ here).
Note that Prettier, purposely, has only limited [configuration options](https://prettier.io/docs/en/options.html). The configuration file is based on [the one in `mozilla central`](https://searchfox.org/mozilla-central/source/.prettierrc) with just a few additions (to avoid future breakage if the defaults ever change).
Prettier is being used for a couple of reasons:
- To be consistent with `mozilla-central`, where Prettier is already in use across the tree.
- To ensure a *consistent* coding style everywhere, which is automatically enforced during linting (since Prettier is used as an ESLint plugin). This thus ends "all" formatting discussions once and for all, removing the need for review comments on most stylistic matters.
Many ESLint options are now redundant, and I've tried my best to remove all the now unnecessary options (but I may have missed some).
Note also that since Prettier considers the `printWidth` option as a guide, rather than a hard rule, this patch resorts to a small hack in the ESLint config to ensure that *comments* won't become too long.
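The hack in question boils down to configuring `max-len` such that only comment length is effectively enforced; roughly along these lines (a sketch; the exact values are assumptions):
```
// In the ESLint configuration; Prettier formats the code itself, so only
// comment length needs a hard limit here.
"max-len": ["error", {
  "code": 1000,
  "comments": 80,
  "ignoreUrls": true
}]
```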
*Please note:* This patch is generated automatically, by appending the `--fix` argument to the ESLint call used in the `gulp lint` task. It will thus require some additional clean-up, which will be done in a *separate* commit.
(On a more personal note, I'll readily admit that some of the changes Prettier makes are *extremely* ugly. However, in the name of consistency we'll probably have to live with that.)
If, as PR 10368 suggests, more parameters should be added to `getViewport`, I think it would be a mistake not to change the signature *first*, to avoid needlessly unwieldy call-sites; see the sketch below.
To not break any existing code and third-party use-cases, this is obviously implemented with a deprecation warning *and* with a working fallback[1] for the old method signature.
---
[1] This is limited to `GENERIC` builds, which should be sufficient.
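Concretely, the signature change looks along these lines (a sketch based on the documented API):
```
doc.getPage(1).then(function (page) {
  // Old signature, now deprecated (but still working in GENERIC builds):
  var oldViewport = page.getViewport(1.5, 90);

  // New signature, taking a parameter object instead:
  var viewport = page.getViewport({ scale: 1.5, rotation: 90 });
});
```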
This should provide a better out-of-the-box experience when using PDF.js in a Node.js environment, since it's missing native support for both `@font-face` and `Image`.
Please note that this change *only* affects the default values, hence it's still possible for an API consumer to override those values when calling `getDocument`.
Also, prevents "ReferenceError: document is not defined" errors, when running the unit-tests in Node.js/Travis.
We need to pass `disableFontFace` and `nativeImageDecoderSupport`
because Node.js has no native support for `@font-face` and `Image`.
Doing so makes it possible to render e.g., the Tracemonkey paper, which
failed before. I made this PDF file the default because it's also the
default in other examples/demos and because it showcases the
possibilities better than the very simple hello world PDF file.
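A sketch of the options just described; the surrounding code is an assumption, and note that `nativeImageDecoderSupport` has since been removed from the API entirely (per the JPEG decoder changes described earlier):
```
pdfjsLib.getDocument({
  url: pdfPath,
  disableFontFace: true,             // No `@font-face` support in Node.js.
  nativeImageDecoderSupport: 'none', // No native `Image` support either.
}).promise.then(function (doc) {
  // ... proceed with rendering ...
});
```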
Building the library with `gulp dist-install` is easier and is already
recommended in the other examples.
These were removed in PR 9170, since they were unused in the browsers that we'll support in PDF.js version `2.0`.
However, looking at the output of Travis, where a subset of the unit-tests are run using Node.js, there are warnings about `btoa` being undefined. This doesn't appear to cause any errors, which probably explains why we didn't notice this before (despite PR 9201).
The `DOMParser` is most likely overkill and may be less secure.
Moreover, it is not supported in Node.js environments.
This patch replaces the `DOMParser` with a simple XML parser. This
should be faster and gives us Node.js support for free. The simple XML
parser is a port of the one that existed in the examples folder with a
small regex fix to make the parsing work correctly.
The unit tests are extended for increased test coverage of the metadata
code. The new method `getAll` is provided so the example does not have
to access internal properties of the object anymore.
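Usage of the new method then looks along these lines:
```
doc.getMetadata().then(function (data) {
  if (data.metadata) {
    // No need to poke at the object's internal properties anymore:
    console.log(data.metadata.getAll());
  }
});
```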
Implement a serialization "generator" for `DOMElement` in domutils.js
that yields the serialization of the SVG element. This method is used by
a newly added `ReadableSVGStream` class, which can be used like any
other readable stream in Node.js.
This reduces the memory requirements, since the serialization no longer
needs to fully fit in memory.
Note: The implementation of the serializer is a state machine in ES5
since the rest of the file is also in ES5. Its functionality is
equivalent to:
```
function* serializeSVGElement(elem) {
  yield '<' + elem.nodeName;
  if (elem.nodeName === 'svg:svg') {
    yield ' xmlns:xlink="http://www.w3.org/1999/xlink"' +
          ' xmlns:svg="http://www.w3.org/2000/svg"';
  }
  for (let i in elem.attributes) {
    yield ' ' + i + '="' + xmlEncode(elem.attributes[i]) + '"';
  }
  yield '>';
  if (elem.nodeName === 'svg:tspan' || elem.nodeName === 'svg:style') {
    yield xmlEncode(elem.textContent);
  } else {
    for (let childNode of elem.childNodes) {
      yield* serializeSVGElement(childNode);
    }
  }
  yield '</' + elem.nodeName + '>';
}
```
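Hypothetical usage, assuming `ReadableSVGStream` takes the root SVG element via its options and otherwise behaves like any other Node.js readable stream:
```
var fs = require('fs');

var svgStream = new ReadableSVGStream({ svgElement: svgElement });
var out = fs.createWriteStream('page1.svg');
svgStream.pipe(out);
out.on('finish', function () {
  console.log('Wrote page1.svg');
});
```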
- Mark the test as async, and don't swallow exceptions.
- Fix the DOMElement polyfill to behave closer to the actual getAttributeNS
method, which excludes the namespace prefix.
Do not directly export to global. Instead, export all stubs in domstubs.js and
add a method setStubs to assign all exported stubs to a namespace. Then replace
the import domstubs with an explicit call to this setStubs method. Also added
unsetStubs for undoing the changes. This is done to allow unit testing of the
SVG backend without namespace pollution.
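A sketch of the intended usage pattern:
```
var domstubs = require('./domstubs.js');

domstubs.setStubs(global);   // Install the stubs before using the SVG backend.
// ... run the SVG rendering ...
domstubs.unsetStubs(global); // Undo the changes, e.g. after a unit test.
```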
Wait for the completion of writing the generated SVG file before
processing the next page. This is to enable the garbage collector to
garbage-collect the (potentially large) SVG string before trying to
allocate memory again for the next page.
Note that since the PDF-to-SVG conversion is now sequential instead of
parallel, the time to generate all pages increases.
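A sketch of the sequential flow; the helper names are assumptions for illustration:
```
var pagePromise = Promise.resolve();
for (var i = 1; i <= doc.numPages; i++) {
  pagePromise = pagePromise.then(convertPage.bind(null, i));
}

function convertPage(pageNumber) {
  return doc.getPage(pageNumber).then(function (page) {
    return renderPageToSVG(page); // Hypothetical helper; yields the SVG element.
  }).then(function (svgElement) {
    // Resolve only once the file write completes, so the next page does not
    // start until the (potentially large) SVG string can be collected.
    return writeSvgToFile(svgElement, 'page' + pageNumber + '.svg');
  });
}
```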
Test case:
node --max_old_space_size=200 examples/node/pdf2svg.js /tmp/FatalProcessOutOfMemory.pdf
Before this patch:
- Node.js crashes due to OOM after processing 20 pages.
After this patch:
- Node.js is able to convert all 203 pages to SVG without crashing.
Test case:
Using the PDF file from https://github.com/mozilla/pdf.js/issues/8534
node --max_old_space_size=200 examples/node/pdf2svg.js /tmp/FatalProcessOutOfMemory.pdf
Before this patch:
Node.js crashes due to OOM after processing 10 pages.
After this patch:
Node.js crashes due to OOM after processing 19 pages.
It doesn't really make sense to attempt to utilize the `NativeImageDecoder` in Node.js, since there's no native image support available. Hence, building on PR 8035, we can easily disable it in the example.
Fixes 7901.