pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	88fdb482b0	Move the `isEmptyObj` helper function from `src/shared/util.js` to `test/unit/test_utils.js` Since this helper function is no longer used anywhere in the main code-base, but only in a couple of unit-tests, it's thus being moved to a more appropriate spot. Finally, the implementation of `isEmptyObj` is also tweaked slightly by removing the manual loop.	2020-06-09 17:50:16 +02:00
Jonas Jenwald	159e13c4e4	Convert the `ChunkedStreamManager.promisesByRequest` property to a `Map` Compared to regular `Object`s, `Map`s have a number of advantageous properties: Of particular importance in this case is the built-in iteration support, and that determining if the structure is empty is easy.	2020-06-09 17:50:14 +02:00
Jonas Jenwald	dda7a5d1b7	Convert the `ChunkedStreamManager.requestsByChunk` property to a `Map` Compared to regular `Object`s, `Map`s have a number of advantageous properties: Of particular importance in this case is the built-in iteration support, and that determining if the structure is empty is easy.	2020-06-09 17:50:11 +02:00
Jonas Jenwald	17e23ffb33	Convert the `ChunkedStreamManager.chunksNeededByRequest` property to a `Map` (containing `Set`s) Compared to regular `Object`s, `Map`s (and `Set`s) have a number of advantageous properties: Of particular importance in this case is the built-in iteration support, and that determining if the structure is empty is easy.	2020-06-09 17:49:53 +02:00
Tim van der Meij	a4fa4554d6	Merge pull request #11977 from timvandermeij/refset Convert the `RefSet` primitive to a proper class and use a `Set` internally	2020-06-07 23:15:35 +02:00
Tim van der Meij	4c2e056796	Convert the `RefSet` primitive to a proper class and use a `Set` internally The `RefSet` primitive predates ES6, so that most likely explains why an object is used internally to track the entries. However, nowadays we can use built-in JavaScript sets for this purpose. Built-in types are often more efficient/optimized and using it makes the code a bit more clear since we don't have to assign `true` to keys anymore just to indicate their presence.	2020-06-07 19:01:29 +02:00
Tim van der Meij	4c36dadfe2	Merge pull request #11978 from timvandermeij/unit-test-primitives Improve unit test coverage for primitives	2020-06-07 18:58:17 +02:00
Tim van der Meij	550a38f1ba	Improve unit test coverage for primitives This commit includes unit tests for: - `isEOF` - `isStream` - `Ref`'s string representation and caching - `Dict`'s XRef assignment	2020-06-07 17:31:40 +02:00
Tim van der Meij	4cfeda31fa	Merge pull request #11976 from Snuffleupagus/rm-dead-network-code Remove unused methods from `NetworkManager`, in `src/display/network.js`	2020-06-07 17:27:06 +02:00
Jonas Jenwald	466d10f6fc	Remove unused methods from `NetworkManager`, in `src/display/network.js` Both of the removed methods were added in PR 2719, however they are no longer used: - It appears that `hasPendingRequests` was never used at all, even from the beginning. - The only general PDF.js library usage of `abortAllRequests` was removed in PR 6879, which is now four years ago. (Originally the Firefox-specific network implementation, see https://searchfox.org/mozilla-central/source/browser/extensions/pdfjs/content/PdfJsNetwork.jsm, was shared with the `src/display/network.js` file and there this method is used. However, since all of the Firefox-specific code now lives directly in mozilla-central, that's not relevant for the removal in this patch.)	2020-06-07 16:03:32 +02:00
Tim van der Meij	2bd0690fdd	Convert `var` to `const`/`let` in `test/unit_primitives_spec.js`	2020-06-07 15:04:24 +02:00
Tim van der Meij	c97200ff59	Merge pull request #11974 from Snuffleupagus/sendImgData A couple of small image caching/sending improvements	2020-06-07 13:53:26 +02:00
Tim van der Meij	b779507370	Merge pull request #11963 from tamuratak/srgb_conv Avoid calling Math.pow if possible.	2020-06-07 13:09:41 +02:00
Jonas Jenwald	df7d8c74ca	Extract the actual sending of image data from the `PartialEvaluator.buildPaintImageXObject` method After PRs 10727 and 11912, the code responsible for sending the decoded image data to the main-thread has now become a fair bit more involved than previously. To reduce the amount of duplication here, the actual code responsible for sending the data is thus extracted into a new helper method instead.	2020-06-07 12:01:51 +02:00
Jonas Jenwald	aff0d56326	Remove an unnecessary `RefSetCache.prototype.has()` call from `GlobalImageCache.getData` We can simply attempt to get the data directly, and instead check the result, rather than first checking if it exists.	2020-06-07 11:56:04 +02:00
Takashi Tamura	7acb112ca9	Optimization: Avoid calling Math.pow if possible when calculating the transfer function of the CalRGB color space since calling Math.pow is expensive. If the value of color is larger than the threshold, 0.99554525, the final result of the transform is larger than 254.5 since ((1 + 0.055) * 0.99554525 ** (1 / 2.4) - 0.055) * 255 === 254.50000003134699	2020-06-07 13:17:18 +09:00
Tim van der Meij	039307f88c	Merge pull request #11972 from Snuffleupagus/ChunkedStream-loadedChunks-Set Change the `loadedChunks` property, on `ChunkedStream` instances, from an Array to a Set	2020-06-06 00:12:14 +02:00
Tim van der Meij	891c706aa8	Merge pull request #11953 from emalysz/11838-fallback-after-click For #11838: trigger fallback bar after user clicks in pdf	2020-06-06 00:03:19 +02:00
Jonas Jenwald	b7272a34eb	Change the `loadedChunks` property, on `ChunkedStream` instances, from an Array to a Set In the old code the use of an Array meant that we had to manually track the `numChunksLoaded` property, given that simply using the Array `length` wouldn't have worked since there's no guarantee that the data is loaded in order when e.g. range requests are in use. Tracking closely related state separately in this manner never seems like a good idea, and we can now instead utilize a Set to avoid that.	2020-06-05 15:03:06 +02:00
Tim van der Meij	7aa1b2d418	Merge pull request #11964 from aplum/fix-webpack-import Fix pdfjs-dist/webpack causing errors with certain configs	2020-06-04 23:56:49 +02:00
Tim van der Meij	ad261a2da4	Merge pull request #11967 from havocbcn/jpg-rgb Do not transform jpeg RGB components	2020-06-04 23:53:45 +02:00
Carlos Rodríguez	802aa14a99	JPEG images encoded as RGB, instead of YCbCr, write the component indexes as "RGB" in ASCII to indicate this. In ISO/IEC 10918-6:2013 (E), section 6.1 (http://www.itu.int/rec/T-REC-T.872-201206-I/en): "Images encoded with three components are assumed to be RGB data encoded as YCbCr unless the image contains an APP14 marker segment as specified in 6.5.3, in which case the colour encoding is considered either RGB or YCbCr according to the application data of the APP14 marker segment" However, common JPEG libraries also treat the data as RGB if the component indexes are ASCII R (0x52), G (0x47) and B (0x42): https://stackoverflow.com/questions/50798014/determining-color-space-for-jpeg/50861048 Issue #11931	2020-06-04 15:08:47 +02:00
Emma Malysz	6e9d158a98	For #11838 : trigger fallback bar after user clicks in pdf	2020-06-03 14:03:46 -07:00
Alex Plumley	3b9031f6a3	Fix pdfjs-dist/webpack causing errors with certain configs Using `require.resolve("worker-loader")` to check if `worker-loader` is installed causes webpack to include `worker-loader` in the output bundle, which is not the intended effect. Aside from increasing the bundle size unnecessarily, it also causes errors for webpack configs with targets that don't have node's built-in modules. These errors can be fixed by configuring webpack `externals` to exclude `worker-loader`, but it's more difficult to figure out this solution than to figure out that `worker-loader` needs to be installed (even without this explicit error message). To solve this, the explicit check for `worker-loader` has been removed. An alternative solution would be to use webpack's `resolveWeak`. Documentation has also been added in `examples/webpack` to help users.	2020-06-03 14:50:41 -04:00
Tim van der Meij	96ad60f116	Merge pull request #11958 from Snuffleupagus/rm-getOpenActionDestination [api-minor] Remove the deprecated `PDFDocumentProxy.getOpenActionDestination` method (PR 11644 follow-up)	2020-06-02 23:51:55 +02:00
Jonas Jenwald	64378fc366	[api-minor] Remove the deprecated `PDFDocumentProxy.getOpenActionDestination` method (PR 11644 follow-up) This method has been printing a `deprecated` warning for two releases, hence it should hopefully be safe to remove now.	2020-06-02 12:28:00 +02:00
Tim van der Meij	8fc1126b5a	Merge pull request #11948 from timvandermeij/bump Bump versions in `pdfjs.config` and update the getting started page of the website for the new release	2020-06-01 12:51:06 +02:00
Tim van der Meij	a98b81f8ae	Bump versions in `pdfjs.config` and update the getting started page of the website for the new release	2020-06-01 12:45:04 +02:00
Tim van der Meij	0974d60523	Merge pull request #11947 from Snuffleupagus/GlobalImageCache-assert-not-inline Ensure that we don't attempt to cache inline images in the `GlobalImageCache` (PR 11912 follow-up) v2.5.207	2020-06-01 11:39:40 +02:00
Jonas Jenwald	af815e417d	Ensure that we don't attempt to cache inline images in the `GlobalImageCache` (PR 11912 follow-up) Since inline images, i.e. those defined inside of `/Contents` streams, are by their very definition page-specific it thus seems like a good idea to actually enforce that they won't accidentally end up in the `GlobalImageCache`.	2020-06-01 01:00:30 +02:00
Tim van der Meij	5879710327	Merge pull request #11945 from Snuffleupagus/update-packages Update packages and translations	2020-05-30 14:24:25 +02:00
Jonas Jenwald	f2cbd5de42	Update l10n files	2020-05-30 11:01:34 +02:00
Jonas Jenwald	da482310ee	Update `npm` packages	2020-05-30 10:58:10 +02:00
Tim van der Meij	878619956b	Merge pull request #11943 from Snuffleupagus/cleanup-preprocessCSS Remove unused code from the `external/builder/builder.js` file	2020-05-29 23:52:25 +02:00
Jonas Jenwald	d7dee0ea1c	Remove the `hasPrefixedFirefox` functionality from the `external/builder/builder.js` file This functionality has been completely unused ever since PR 9566 (two years ago).	2020-05-29 17:18:16 +02:00
Jonas Jenwald	ce234ab3c7	Remove the `deprecatedInMozcentral` functionality from the `external/builder/builder.js` file This functionality has been completely unused ever since PR 9629 (two years ago).	2020-05-29 17:14:38 +02:00
Tim van der Meij	fe5689705d	Merge pull request #11930 from Snuffleupagus/LocalImageCache Improve the local image caching in `PartialEvaluator.getOperatorList`	2020-05-28 00:12:37 +02:00
Tim van der Meij	efc2588d12	Merge pull request #11940 from Snuffleupagus/pdf.js-export-comments Add comments to the `export` list in the `src/pdf.js` file (PR 11914 follow-up)	2020-05-27 23:58:53 +02:00
Tim van der Meij	15493ebdc3	Merge pull request #11939 from Snuffleupagus/acorn-7 Update Acorn to version 7	2020-05-27 23:57:58 +02:00
Jonas Jenwald	4d60430b1c	Add comments to the `export` list in the `src/pdf.js` file (PR 11914 follow-up) When converting this file to use standard `import`/`export` statements, I sorted the exports in the same order as the imports to simplify things. However, looking at the list of `export`ed properties it probably doesn't hurt to add a couple of comments to clarify from where specifically the `export`s originated.	2020-05-27 13:57:25 +02:00
Jonas Jenwald	6a1490faa7	Update Acorn to version 7 By updating to the new major version of Acorn, we'll get support for newer ECMAScript features as they become available (although some features are currently also blocked by ESLint support and/or SystemJS usage). Please see https://github.com/acornjs/acorn/releases/tag/7.2.0 for details.	2020-05-27 11:54:27 +02:00
Jonas Jenwald	4ef547f400	Improve caching of empty `/XObject`s in the `PartialEvaluator.getTextContent` method It turns out that `getTextContent` suffers from similar problems with repeated images as `getOperatorList`; please see the previous patch. While only `/XObject` resources of the `Form`-type will actually be parsed in `PartialEvaluator.getTextContent`, since those are the only ones that may contain text, we're still forced to fetch repeated image resources where the name differs (but not the reference). Obviously it's less bad in this case, since we're not actually parsing `/XObject`s of e.g. the `Image`-type. However, you still want to avoid even fetching the data whenever possible, since `Stream`s are not cached on the `XRef` instance (given their potential size) and the lookup can thus be somewhat expensive in general. To address these issues, we can simply replace the existing name-only caching in `PartialEvaluator.getTextContent` with a new cache backed by `LocalImageCache` instead.	2020-05-26 09:49:01 +02:00
Jonas Jenwald	d62c9181bd	Improve the local image caching in `PartialEvaluator.getOperatorList` Currently the local `imageCache`, as used in `PartialEvaluator.getOperatorList`, will miss certain cases of repeated images because the caching is only done by name (usually using a format such as e.g. "Im0", "Im1", ...). However, in some PDF documents the `/XObject` dictionaries may contain hundreds (or even thousands) of distinctly named images, despite them referring to only a handful of actual image objects (via the XRef table). With these changes we'll now cache local images using both name and (where applicable) reference, thus improving re-usage of image resources even further. This patch was tested using the PDF file from [bug 857031](https://bugzilla.mozilla.org/show_bug.cgi?id=857031), i.e. https://bug857031.bmoattachments.org/attachment.cgi?id=732270, with the following manifest file: ``` [ { "id": "bug857031", "file": "../web/pdfs/bug857031.pdf", "md5": "", "rounds": 250, "lastPage": 1, "type": "eq" } ] ``` which gave the following results when comparing this patch against the `master` branch: ``` -- Grouped By browser, page, stat -- browser \| page \| stat \| Count \| Baseline(ms) \| Current(ms) \| +/- \| % \| Result(P<.05) ------- \| ---- \| ------------ \| ----- \| ------------ \| ----------- \| --- \| ----- \| ------------- firefox \| 0 \| Overall \| 250 \| 2749 \| 2656 \| -93 \| -3.38 \| faster firefox \| 0 \| Page Request \| 250 \| 3 \| 4 \| 1 \| 50.14 \| slower firefox \| 0 \| Rendering \| 250 \| 2746 \| 2652 \| -94 \| -3.44 \| faster ``` While this is certainly an improvement, since we now avoid re-parsing ~1000 images on the first page, all of the image resources are small enough that the total rendering time doesn't improve that much in this particular case. In pathological cases, such as e.g. the PDF document in issue 4958, the improvements with this patch can be very significant. 
Looking for example at page 2, from issue 4958, the rendering time drops from ~60 seconds with `master` to ~30 seconds with this patch (obviously still slow, but it really showcases the potential of this patch nicely). Finally, note that there's also potential for additional improvements by re-using `LocalImageCache` instances for e.g. /XObject data of the `Form`-type. However, given the recent changes in this area I purposely didn't want to complicate this patch more than necessary.	2020-05-25 15:14:14 +02:00
Tim van der Meij	9d38dd4e8b	Merge pull request #11927 from timvandermeij/svg-fill-opacity-shading Implement fill opacity for shading patterns in the SVG back-end	2020-05-24 14:29:22 +02:00
Tim van der Meij	f14215da37	Implement fill opacity for shading patterns in the SVG back-end In the PDF file from the issue below, the fill alpha (`ca`) is set before drawing the circles using the `setGState` operator. Doing so causes the global alpha to be set on the canvas' context for the canvas back-end, but this was not handled in the SVG back-end. This patch fixes that by taking the fill opacity into account when drawing shading patterns in the same way as done elsewhere so it is only included if the value is non-default. Fixes #11812.	2020-05-24 14:25:40 +02:00
Tim van der Meij	3b615e4ca3	Merge pull request #11601 from Snuffleupagus/rm-nativeImageDecoderSupport [api-minor] Decode all JPEG images with the built-in PDF.js decoder in `src/core/jpg.js`	2020-05-23 15:33:46 +02:00
Tim van der Meij	cd6d089489	Merge pull request #11926 from Snuffleupagus/GlobalImageCache-clear-onlyData Allow `GlobalImageCache.clear` to, optionally, only remove the actual data (PR 11912 follow-up)	2020-05-23 12:21:38 +02:00
Jonas Jenwald	8af70d75aa	Allow `GlobalImageCache.clear` to, optionally, only remove the actual data (PR 11912 follow-up) When "Cleanup" is triggered, you obviously need to remove all globally cached data on both the main- and worker-threads. However, the current implementation of the `GlobalImageCache.clear` method also means that we lose all information about which images were cached and not just their data. This thus has the somewhat unfortunate side-effect of requiring images, which were previously known to be "global", to again have to reach `NUM_PAGES_THRESHOLD` before being cached again. To avoid doing unnecessary parsing after "Cleanup", we can thus let `GlobalImageCache.clear` keep track of which images were cached while still removing their actual data. This should not have any significant impact on memory usage, since the only extra thing being kept is a `RefSetCache` (essentially an Object) with a couple of `Set`s containing only integers.	2020-05-23 11:30:24 +02:00
Tim van der Meij	973f39b558	Merge pull request #11924 from Snuffleupagus/issue-11922 Avoid hanging the worker-thread for CMap data with ridiculously large ranges (issue 11922)	2020-05-23 00:32:12 +02:00
Jonas Jenwald	56ebf01ae0	Avoid hanging the worker-thread for CMap data with ridiculously large ranges (issue 11922) This patch was inspired by `ad2b64f124/xpdf/CharCodeToUnicode.cc (L480-L484)`	2020-05-22 15:23:17 +02:00
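
Several commits in this listing (17e23ffb33, dda7a5d1b7, 159e13c4e4, b7272a34eb) replace plain `Object`s and Arrays with `Map`s and `Set`s, citing built-in iteration support and easy emptiness checks. The sketch below illustrates that pattern only. The functions `requestChunks` and `onChunkLoaded` and the sample data are hypothetical, and this is not the actual `ChunkedStreamManager` or `ChunkedStream` code.

```javascript
// Hypothetical illustration only; not the actual PDF.js implementation.
// Maps a request id to the Set of chunk numbers it is still waiting for.
const chunksNeededByRequest = new Map();
// Chunks may arrive out of order, so a Set records which ones are loaded.
const loadedChunks = new Set();

function requestChunks(requestId, chunks) {
  const needed = new Set(chunks.filter(chunk => !loadedChunks.has(chunk)));
  if (needed.size > 0) {
    chunksNeededByRequest.set(requestId, needed);
  }
}

// Returns the ids of the requests that this chunk completes.
function onChunkLoaded(chunk) {
  loadedChunks.add(chunk);
  const completed = [];
  // Maps are directly iterable; no `for...in` or `Object.keys` needed.
  for (const [requestId, needed] of chunksNeededByRequest) {
    needed.delete(chunk);
    if (needed.size === 0) {
      chunksNeededByRequest.delete(requestId);
      completed.push(requestId);
    }
  }
  return completed;
}

requestChunks(1, [0, 2]);
requestChunks(2, [2]);
const afterChunk2 = onChunkLoaded(2); // chunks arrive out of order
const afterChunk0 = onChunkLoaded(0);
// The Set's size replaces a manually tracked counter.
const numChunksLoaded = loadedChunks.size;
// Emptiness is a single property read instead of a helper like `isEmptyObj`.
const allDone = chunksNeededByRequest.size === 0;
```

With `size` available on both structures, a separately maintained count such as `numChunksLoaded` becomes redundant, which is the motivation stated in b7272a34eb.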


12852 Commits
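
Commit 7acb112ca9 skips `Math.pow` whenever the input is at or above 0.99554525, because the scaled result then already exceeds 254.5 and rounds to 255. The sketch below checks that shortcut against the full computation. The names `transferScaled` and `toByte` are hypothetical. The linear segment below 0.0031308 comes from the standard sRGB definition and is an assumption here, since the commit message does not quote it.

```javascript
// Hypothetical sketch; these function names are not PDF.js APIs.
// Threshold quoted in the commit message of 7acb112ca9.
const THRESHOLD = 0.99554525;

// sRGB-style transfer function for a linear value in [0, 1],
// scaled to the [0, 255] range (not yet rounded).
function transferScaled(color) {
  if (color <= 0.0031308) {
    // Assumed linear segment from the standard sRGB definition.
    return 12.92 * color * 255;
  }
  return ((1 + 0.055) * Math.pow(color, 1 / 2.4) - 0.055) * 255;
}

// Fast path: at or above the threshold the scaled value is above 254.5,
// so it rounds to 255 and the Math.pow call can be skipped.
function toByte(color) {
  if (color >= THRESHOLD) {
    return 255;
  }
  return Math.round(transferScaled(color));
}

// Compare the shortcut with the full computation on a grid of inputs.
let mismatches = 0;
for (let i = 0; i <= 10000; i++) {
  const color = i / 10000;
  const reference = Math.min(255, Math.round(transferScaled(color)));
  if (toByte(color) !== reference) {
    mismatches++;
  }
}
```

The shortcut is safe because the transfer function is monotonically increasing: once the threshold input maps above 254.5, every larger input does too.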