pdf.js

Author	SHA1	Message	Date
Tim van der Meij	95e094c0bd	Merge pull request #12815 from Snuffleupagus/update-webpack-example Update webpack example	2021-01-07 22:24:24 +01:00
fabien	35b15cc0b5	1. Add `filename` option in `worker-loader` package require. Without this option, since version 3.0.0, it tells webpack to generate a worker file named `pdf.worker.worker.js` instead of the expected `pdf.worker.js`. 2. Update README of webpack example to mention that a version 3.0.0 or higher of the `worker-loader` package is now required.	2021-01-07 15:14:02 +01:00
Jonas Jenwald	67746ac1c0	Update the webpack-versions used in `examples/webpack` Once the next PDF.js release is made, the `webpack` example will no longer work since the non-translated builds now use ECMAScript features not supported by older `webpack`-versions.	2021-01-05 12:42:11 +01:00
Jonas Jenwald	ba079453bf	Enable the ESLint `no-debugger` and `no-alert` rules The `debugger`-statement would only, potentially, make sense during development and we thus want to prevent it from being accidentally included when landing code. The `alert`, `confirm`, and `prompt` functions should generally be avoided, with the few intended cases manually allowed. Please find additional details about the ESLint rules at: - https://eslint.org/docs/rules/no-debugger - https://eslint.org/docs/rules/no-alert	2020-10-05 13:41:06 +02:00
Jonas Jenwald	8aa2718d22	Re-format all `web/*.css` files using Stylelint/Prettier This was done automatically, using `gulp lint --fix`.	2020-08-30 21:49:08 +02:00
Jonas Jenwald	4a7e29865d	[api-minor] Use the `NodeCanvasFactory`/`NodeCMapReaderFactory` classes as defaults in Node.js environments (issue 11900) This moves, and slightly simplifies, code that's currently residing in the unit-test utils into the actual library, such that it's bundled with `GENERIC`-builds and used in e.g. the API-code. As an added bonus, this also brings out-of-the-box support for CMaps in e.g. the Node.js examples.	2020-07-02 04:44:23 +02:00
Alex Plumley	3b9031f6a3	Fix pdfjs-dist/webpack causing errors with certain configs Using `require.resolve("worker-loader")` to check if `worker-loader` is installed causes webpack to include `worker-loader` in the output bundle, which is not the intended effect. Aside from increasing the bundle size unnecessarily, it also causes errors for webpack configs with targets that don't have node's built-in modules. These errors can be fixed by configuring webpack `externals` to exclude `worker-loader`, but it's more difficult to figure out this solution than to figure out that `worker-loader` needs to be installed (even without this explicit error message). To solve this, the explicit check for `worker-loader` has been removed. An alternative solution would be to use webpack's `resolveWeak`. Documentation has also been added in `examples/webpack` to help users.	2020-06-03 14:50:41 -04:00
Jonas Jenwald	0351852d74	[api-minor] Decode all JPEG images with the built-in PDF.js decoder in `src/core/jpg.js` Currently some JPEG images are decoded by the built-in PDF.js decoder in `src/core/jpg.js`, while others attempt to use the browser JPEG decoder. This inconsistency seems unfortunate for a number of reasons: - It adds, compared to the other image formats supported in the PDF specification, a fair amount of code/complexity to the image handling in the PDF.js library. - The PDF specification supports JPEG images with features, e.g. certain ColorSpaces, that browsers are unable to decode natively. Hence, determining if a JPEG image is possible to decode natively in the browser requires a non-trivial amount of parsing. In particular, we're parsing (part of) the raw JPEG data to extract certain marker data and we also need to parse the ColorSpace for the JPEG image. - While some JPEG images may, for all intents and purposes, appear to be natively supported there are still cases where the browser may fail to decode some JPEG images. In order to support those cases, we've had to implement a fallback to the PDF.js JPEG decoder if there are any issues during the native decoding. This also means that it's no longer possible to simply send the JPEG image to the main-thread and continue parsing, but you now need to actually wait for the main-thread to indicate success/failure first. In practice this means that there's a code-path where the worker-thread is forced to wait for the main-thread, while the reverse should always be the case. - The native decoding, for anything except the simplest of JPEG images, results in increased peak memory usage because there's a handful of short-lived copies of the JPEG data (see PR 11707). Furthermore this also leads to data being parsed on the main-thread, rather than the worker-thread, which you usually want to avoid for e.g. performance and UI-responsiveness reasons. - Not all environments, e.g. Node.js, fully support native JPEG decoding. This has, historically, led to some issues and support requests. - Different browsers may use different JPEG decoders, possibly leading to images being rendered slightly differently depending on the platform/browser where the PDF.js library is used. Originally the implementation in `src/core/jpg.js` was unable to handle all of the JPEG images in the test-suite, but over the last couple of years I've fixed (hopefully) all of those issues. At this point in time, there are two kinds of failure with this patch: - Changes which are basically imperceivable to the naked eye, where some pixels in the images are essentially off-by-one (in all components), which could probably be attributed to things such as different rounding behaviour in the browser/PDF.js JPEG decoder. This type of "failure" accounts for the vast majority of the total number of changes in the reference tests. - Changes where the JPEG images now look ever so slightly blurrier than with the native browser decoder. For quite some time I've just assumed that this pointed to a general deficiency in the `src/core/jpg.js` implementation, however I've discovered when comparing two viewers side-by-side that the differences vanish at higher zoom levels (usually around 200% is enough). Basically if you disable [this downscaling in canvas.js](`8fb82e939c/src/display/canvas.js (L2356-L2395)`), which is what happens when zooming in, the differences simply vanish! Hence I'm pretty satisfied that there's no significant problems with the `src/core/jpg.js` implementation, and the problems are rather tied to the general quality of the downscaling algorithm used. It could even be seen as a positive that all images now share the same downscaling behaviour, since this actually fixes one old bug; see issue 7041.	2020-05-22 00:22:48 +02:00
Jonas Jenwald	744af9eeb8	Enable the ESLint `grouped-accessor-pairs` rule This rule complements the existing `accessor-pairs` nicely, and ensures that a getter/setter pair is always consistently ordered. Please find additional details about this rule at https://eslint.org/docs/rules/grouped-accessor-pairs	2020-05-07 11:43:19 +02:00
Jonas Jenwald	3dc0567a37	Remove the `create-react-app` example (issue 11729) Given that none of the PDF.js contributors know React, maintaining and/or providing support for the example isn't really feasible unfortunately. Even something as simple as running/testing the example becomes difficult for anyone completely unfamiliar with React, and furthermore: - It's very difficult to tell if the example demonstrates React best-practices, since the PDF.js contributors don't know React. - We also have no reasonable way of keeping the example up-to-date with changes in React. - The React example, in its current form, is even hard-coding the PDF.js version to a now unsupported version. - The example is currently triggering "fake worker" usage, see issue 11729, which is really really bad. Note that the "fake worker" functionality is only intended as a fallback, and it should absolutely not under any circumstances be advertised and certainly shouldn't be triggered in official PDF.js examples.	2020-05-01 12:42:35 +02:00
Jonas Jenwald	c355f91d2e	[api-minor] Immediately release the `font.data` property once the font has been attached to the DOM (PR 11777 follow-up) This patch implements https://github.com/mozilla/pdf.js/pull/11777#issuecomment-609741348 This extends the work from PR 11773 and 11777 further, by immediately releasing the `font.data` property once the font has been attached to the DOM. By not unnecessarily holding onto this data on the main-thread, we'll thus reduce the memory usage of fonts even further (especially beneficial in longer documents with composite fonts). The new behaviour is controlled by the recently added `fontExtraProperties` API option (adding a new option just for this patch didn't seem necessary), since there's one edge-case in the SVG renderer where the `font.data` property is necessary (see the `pdf2svg` example). Note that while the default viewer does run clean-up with an idle timeout, that timeout will be reset whenever rendering occurs or when scrolling happens in the viewer. In practice this means that unless the user doesn't interact with the viewer in any way during an extended period of time, currently set to 30 seconds, the `PDFDocumentProxy.cleanup` method will never be called and font resources will thus not be cleaned-up.	2020-04-23 13:04:57 +02:00
Jonas Jenwald	1cc3dbb694	Enable the `dot-notation` ESLint rule Please note: These changes were done automatically, using the `gulp lint --fix` command. This rule is already enabled in mozilla-central, see https://searchfox.org/mozilla-central/rev/567b68b8ff4b6d607ba34a6f1926873d21a7b4d7/tools/lint/eslint/eslint-plugin-mozilla/lib/configs/recommended.js#103-104 The main advantage, besides improved consistency, of this rule is that it reduces the size of the code (by 3 bytes for each case). In the PDF.js code-base there's close to 8000 instances being fixed by the `dot-notation` ESLint rule, which end up reducing the size of even the built files significantly; the total size of the `gulp mozcentral` build target changes from `3 247 456` to `3 224 278` bytes, which is a reduction of `23 178` bytes (or ~0.7%) for a completely mechanical change. A large number of these changes affect the (large) lookup tables used on the worker-thread, but given that they are still initialized lazily I don't think that the new formatting this patch introduces should undo any of the improvements from PR 6915. Please find additional details about the ESLint rule at https://eslint.org/docs/rules/dot-notation	2020-04-17 12:24:46 +02:00
Jonas Jenwald	426945b480	Update Prettier to version 2.0 Please note that these changes were done automatically, using `gulp lint --fix`. Given that the major version number was increased, there's a fair number of (primarily whitespace) changes; please see https://prettier.io/blog/2020/03/21/2.0.0.html In order to reduce the size of these changes somewhat, this patch maintains the old "arrowParens" style for now (once mozilla-central updates Prettier we can simply choose the same formatting, assuming it will differ here).	2020-04-14 12:28:14 +02:00
Jonas Jenwald	9a437a158f	[api-minor] Deprecate `getGlobalEventBus` and update the "viewer components" examples accordingly To avoid outright breaking third-party usages of the "viewer components" the `getGlobalEventBus` functionality is left intact, but a deprecation message is printed if the function is invoked. The various examples are updated to explicitly initialize an `EventBus` instance, and provide that when initializing the relevant viewer components.	2020-02-27 14:44:48 +01:00
Jonas Jenwald	c97c778f8f	[api-minor] Produce non-translated/non-polyfilled builds by default	2020-02-14 18:12:07 +01:00
Jonas Jenwald	2e5faa8edc	Add `direction: ltr;` to the canvases used in `examples/learning`, to ensure correct text rendering (issue 11457) This is currently the only possible way of addressing the issue, until https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/direction becomes generally available in browsers. Note: This will also require manually updating https://mozilla.github.io/pdf.js/examples/#interactive-examples	2020-01-12 12:25:23 +01:00
Tim van der Meij	e3c0181357	Convert all six-digit HEX colors to RGBA colors	2020-01-01 14:52:37 +01:00
Tim van der Meij	403a994556	Convert all three-digit HEX colors to RGBA colors	2020-01-01 14:52:37 +01:00
Tim van der Meij	d002637405	Convert all named colors to RGBA colors	2020-01-01 14:48:56 +01:00
Jonas Jenwald	a63f7ad486	Fix the linting errors, from the Prettier auto-formatting, that ESLint `--fix` couldn't handle This patch makes the following changes: - Remove no longer necessary inline `// eslint-disable-...` comments. - Fix `// eslint-disable-...` comments that Prettier moved down, thus causing new linting errors. - Concatenate strings which now fit on just one line. - Fix comments that are now too long. - Finally, and most importantly, adjust comments that Prettier moved down, since the new positions are often confusing or outright wrong.	2019-12-26 12:35:12 +01:00
Jonas Jenwald	de36b2aaba	Enable auto-formatting of the entire code-base using Prettier (issue 11444) Note that Prettier, purposely, has only limited [configuration options](https://prettier.io/docs/en/options.html). The configuration file is based on [the one in `mozilla central`](https://searchfox.org/mozilla-central/source/.prettierrc) with just a few additions (to avoid future breakage if the defaults ever change). Prettier is being used for a couple of reasons: - To be consistent with `mozilla-central`, where Prettier is already in use across the tree. - To ensure a consistent coding style everywhere, which is automatically enforced during linting (since Prettier is used as an ESLint plugin). This thus ends "all" formatting discussions once and for all, removing the need for review comments on most stylistic matters. Many ESLint options are now redundant, and I've tried my best to remove all the now unnecessary options (but I may have missed some). Note also that since Prettier considers the `printWidth` option as a guide, rather than a hard rule, this patch resorts to a small hack in the ESLint config to ensure that comments won't become too long. Please note: This patch is generated automatically, by appending the `--fix` argument to the ESLint call used in the `gulp lint` task. It will thus require some additional clean-up, which will be done in a separate commit. (On a more personal note, I'll readily admit that some of the changes Prettier makes are extremely ugly. However, in the name of consistency we'll probably have to live with that.)	2019-12-26 12:34:24 +01:00
Tim van der Meij	6316b2a195	Merge pull request #11422 from Snuffleupagus/issue-10768 Use the `strict` mode `assert` in the pdf2png Node.js example (issue 10768)	2019-12-21 13:37:22 +01:00
Jonas Jenwald	3783eccfa4	Use the `strict` mode `assert` in the pdf2png Node.js example (issue 10768) See https://nodejs.org/api/assert.html#assert_strict_mode	2019-12-21 13:24:13 +01:00
Jonas Jenwald	aab0f91740	[api-minor] Simplify the fallback fake worker loader code in `src/display/api.js` For performance reasons, and to avoid hanging the browser UI, the PDF.js library should always be used with web workers enabled. At this point in time all of the supported browsers should have proper worker support, and Node.js is thus the only environment where workers aren't supported. Hence it no longer seems relevant/necessary to provide, by default, fake worker loaders for various JS builders/bundlers/frameworks in the PDF.js code itself.[1] In order to simplify things, the fake worker loader code is thus simplified to now only support Node.js usage respectively "normal" browser usage out-of-the-box.[2] Please note: The officially intended way of using the PDF.js library is with workers enabled, which can be done by setting `GlobalWorkerOptions.workerSrc`, `GlobalWorkerOptions.workerPort`, or manually providing a `PDFWorker` instance when calling `getDocument`. --- [1] Note that it's still possible to manually disable workers, simply by manually loading the built `pdf.worker.js` file into the (current) global scope, however this is mostly intended for testing/debugging purposes. [2] Unfortunately some bundlers such as Webpack, when used with third-party deployments of the PDF.js library, will start to print `Critical dependency: ...` warnings when run against the built `pdf.js` file from this patch. The reason is that despite the `require` calls being protected by runtime `isNodeJS` checks, it's not possible to simply tell Webpack to just ignore the `require`; please see [Webpack issue 8826](https://github.com/webpack/webpack) and libraries such as [require-fool-webpack](https://github.com/sindresorhus/require-fool-webpack).	2019-12-20 17:36:08 +01:00
Jonas Jenwald	d621899d50	Add a `reset` method to the `PDFHistory` implementation This patch addresses a couple of smaller issues with the `PDFHistory` class: - Most, if not all, other viewer components can be reset in one way or another, and there's no good reason for the `PDFHistory` implementation to be different here. - Currently it's (technically) possible to keep adding entries to the browser history, via the `PDFHistory` instance, even after the document has been closed. That obviously makes no sense, and is caused by the lack of a `reset` method. - The internal `this._isPagesLoaded` property was never actually reset, which would lead to it being temporarily wrong when a new document was opened in the default viewer.	2019-12-13 10:38:39 +01:00
Luís Takahashi	00c3339520	Add Create React App example with TypeScript and basic usage	2019-10-10 23:25:41 +02:00
Tim van der Meij	215c546fd5	Upgrade to `eslint` version 6 This major version bump required two changes: - The global line in the mobile viewer example should be removed because the `.eslintrc` file already defines these globals and with the new `eslint` version we otherwise get an error saying "'pdfjsLib' is already defined as a built-in global variable". - The ECMA version for the examples must be set to 6 since we're using modules, otherwise we get an error saying "sourceType 'module' is not supported when ecmaVersion < 2015". It turns out that the previous version of `eslint` already used ECMA version 6 silently even though we set 5, see https://github.com/eslint/eslint/issues/9687#issuecomment-432413384, so in terms of our code nothing really changes.	2019-08-24 20:21:10 +02:00
dhuang612	d52d1e2d09	added in information about pdfjs/webpack updated readme with corrections	2019-08-20 10:20:32 -04:00
Jonas Jenwald	9c3024fe7e	Add missing `hasChildNodes` polyfill to `domstubs.js` (PR 10022 follow-up)	2019-04-01 23:23:50 +02:00
Jonas Jenwald	f06b2e4e9f	Update the `mobile-viewer` example to use the new `PDFHistory.initialize` format (PR 10423 follow-up)	2019-01-23 15:27:19 +01:00
Tim van der Meij	61dcc41a3c	Clarify that `gulp dist-install` should be used for the AcroForms example Fixes #10333.	2019-01-05 15:20:50 +01:00
Mohammed Essehemy	f0e9df745c	migrate to canvas 2.x api	2019-01-02 01:10:07 +02:00
Jonas Jenwald	9962ab66ab	Update remaining examples, and docs, to utilize current API functionality (issue 10377) This contains a couple of changes that I missed elsewhere, sorry about that!	2018-12-24 12:33:39 +01:00
Jonas Jenwald	f0719ed565	[api-minor] Change the `getViewport` method, on `PDFPageProxy`, to take a parameter object rather than a bunch of (randomly) ordered parameters If, as PR 10368 suggests, more parameters should be added to `getViewport` I think that it would be a mistake to not change the signature first to avoid needlessly unwieldy call-sites. To not break any existing code and third-party use-cases, this is obviously implemented with a deprecation warning and with a working fallback[1] for the old method signature. --- [1] This is limited to `GENERIC` builds, which should be sufficient.	2018-12-21 11:55:20 +01:00
Tim van der Meij	fa85f86298	Upgrade to Gulp 4 This required the following changes in the Gulpfile: - Defining a series of tasks is no longer done with arrays, but with the `gulp.series` function. The `web` target is refactored to use a smaller number of tasks to prevent tasks from running multiple times. - Getting all tasks must now be done through the task registry. - Tasks that don't return anything must call `done` upon completion. Moreover, this upgrade allows us to use the latest Node.js on Travis CI again.	2018-12-17 16:20:13 +01:00
Wojciech Maj	9e3f7ac7fa	Manually fix remaining ESLint errors	2018-12-11 15:23:26 +01:00
Wojciech Maj	ef1f255649	ESLint --fix	2018-12-11 15:23:26 +01:00
Wojciech Maj	80d7ff4912	Turn on ESLint in examples directory, apply examples-specific exceptions	2018-12-11 15:23:26 +01:00
Felipe augusto	1a75647a27	Remove unuseful variable Variable is declared, but never used.	2018-12-01 01:44:18 -02:00
Jonas Jenwald	2c003a82d5	Convert `RenderTask`, in `src/display/api.js`, to an ES6 class Also deprecates the `then` method, in favour of the `promise` getter.	2018-11-18 19:08:00 +01:00
Jonas Jenwald	ef8e5fd77c	Convert `PDFDocumentLoadingTask`, in `src/display/api.js`, to an ES6 class Also deprecates the `then` method, in favour of the `promise` getter.	2018-11-18 19:07:57 +01:00
Alexis Dardinier	2011345315	Update versions in webpack example Fix package.json after review	2018-11-12 11:15:17 +01:00
Jonas Jenwald	e2e9657ed0	Remove the `attachDOMEventsToEventBus` functionality, since `EventBus` instances are able to re-dispatch events to the DOM (PR 10019, bug 1492849 follow-up) This also removes the old 'pagechange'/'scalechange'/'documentload' events.	2018-10-31 23:32:39 +01:00
Jonas Jenwald	2ed3591b22	Make `PDFFindController` less confusing to use, by allowing searching to start when `setDocument` is called This patch is based on something that I noticed while working on PR 10126. The recent re-factoring of `PDFFindController` brought many improvements, among those the fact that access to `BaseViewer` is no longer required. However, with these changes there's one thing which now strikes me as not particularly user-friendly[1]: The fact that in order for searching to actually work, `PDFFindController.setDocument` must be called and a 'pagesinit' event must be dispatched (from somewhere). For all other viewer components, calling the `setDocument` method[2] is enough in order for the component to actually be usable. The `PDFFindController` thus stands out quite a bit, and it also becomes difficult to work with in any sort of custom implementation. For example: Imagine someone trying to use `PDFFindController` separately from the viewer[3], which should now be relatively simple given the re-factoring, and thus having to (somehow) figure out that they'll also need to manually dispatch a 'pagesinit' event for searching to work. Note that the above even affects the unit-tests, where an out-of-place 'pagesinit' event is being used. To attempt to address these problems, I'm thus suggesting that only `setDocument` should be used to indicate that searching may start. For the default viewer and/or the viewer components, `BaseViewer.setDocument` will now call `PDFFindController.setDocument` when the document is ready, thus requiring no outside configuration anymore[4]. For custom implementation, and the unit-tests, it's now as simple as just calling `PDFFindController.setDocument` to allow searching to start. --- [1] I should have caught this during review of PR 10099, but unfortunately it's sometimes not until you actually work with the code in question that things like these become clear. [2] Assuming, obviously, that the viewer component in question actually implements such a method :-) [3] There's even a very recent issue, filed by someone trying to do just that. [4] Short of providing a `PDFFindController` instance when creating a `BaseViewer` instance, of course.	2018-10-04 10:28:50 +02:00
Tim van der Meij	f79fb88864	Remove the find controller setter in `web/base_viewer.js` With `PDFFindController` instances no longer (directly) depending on `BaseViewer` instances, we can pass a single `findController` when initializing a viewer, similar to other components.	2018-09-30 16:59:58 +02:00
Tim van der Meij	e0c811f2ed	Use the link service for getting and setting page information This removes the dependency on a `PDFViewer` instance from the find controller, which makes it more similar to other components and makes it easier to unit test with a mock link service. Finally, we remove the search capabilities from the SVG example since it doesn't work there because there is no separate text layer.	2018-09-30 16:59:46 +02:00
Tim van der Meij	e293c12afc	Implement the `setDocument` method for the find controller Now it follows the same pattern as e.g., the document properties component, which allows us to have one instance of the find controller and set a new document to search upon switching documents. Moreover, this allows us to get rid of the dependency on `pdfViewer` in order to fetch the text content for a page. This is working towards getting rid of the `pdfViewer` dependency upon initializing the component entirely in future commits. Finally, we make the `reset` method private since it's not supposed to be used from the outside anymore now that `setDocument` takes care of this, similar to other components.	2018-09-30 16:57:40 +02:00
Jonas Jenwald	663922f93f	Add a new parameter to `JpegImage.getData` to indicate the source of the image data (issue 9513) The purpose of this patch is to provide a better default behaviour when `JpegImage` is used to parse standalone JPEG images with CMYK colour spaces. Since the issue that the patch concerns is somewhat of a special-case, the implementation utilizes the already existing decode support in an attempt to minimize the impact w.r.t. code size. Please note: It's always possible for the user of `JpegImage` to control image inversion, and thus override the new behaviour, by simply passing a custom `decodeTransform` array upon initialization.	2018-09-02 14:15:22 +02:00
RonLek	8afc4ce258	Modified Examples to work without systemjs	2018-07-21 16:56:06 +05:30
Tim van der Meij	1024615ecb	Correct the instructions in the README file for `examples/mobile-viewer`	2018-07-08 15:32:06 +02:00


190 Commits