pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	74585c7c59	Remove the unused `PDF20.hash` method This method was added in PR 4938, almost nine years ago, however it doesn't appear to ever have been used. Given the similarities between the `PDF17` and `PDF20` classes, and how they're used, if the `PDF20.hash` method was actually necessary you'd also expect a similar method in the `PDF17` class.	2023-04-23 10:13:46 +02:00
Jonas Jenwald	5e0722e4c2	Remove the `PDF20` closure, in the `src/core/crypto.js` file To allow doing this the existing helper function was changed into a "private" method instead.	2023-04-23 10:08:17 +02:00
Jonas Jenwald	9cb3236ac0	Remove the remaining unnecessary closures in the `src/core/primitives.js` file	2023-04-22 15:33:04 +02:00
Tim van der Meij	e304423ba1	Merge pull request #16331 from Snuffleupagus/cmap-rm-closure Remove unnecessary closures in the CMap code	2023-04-22 14:58:13 +02:00
Tim van der Meij	c9359957e6	Merge pull request #16305 from Snuffleupagus/PDFJSDev-skip-PRODUCTION Remove the `PRODUCTION` build-target	2023-04-22 14:53:30 +02:00
Jonas Jenwald	bc7aa8a585	Re-factor some `String.fromCharCode` usage in the `src/core/binary_cmap.js` file We can replace one case of `apply` with rest parameters, and avoid doing repeated `String.fromCharCode` calls within a loop.	2023-04-21 12:21:31 +02:00
Jonas Jenwald	cabc98f310	Remove the remaining closure in the `src/core/cmap.js` file With modern JavaScript we (usually) no longer need to keep old closures, which slightly reduces the size of the code.	2023-04-21 12:21:31 +02:00
Jonas Jenwald	244002502b	Move the `BinaryCMapReader` into its own file The "binary" CMap-format is specific to the PDF.js library, and is used to reduce the size of the built-in CMap data-files. By moving this code to its own file we can remove the nowadays unnecessary closures, which helps to slightly reduce the size of this code.	2023-04-21 12:21:20 +02:00
Calixte Denizet	19ca41896e	Correctly clip the text in the text layer (fixes #16316)	2023-04-18 17:00:42 +02:00
Calixte Denizet	117bbf7cd9	[api-minor] Don't normalize the text used in the text layer. Some Arabic chars like \ufe94 could be searched for in a pdf, hence the text must be normalized when creating the search query. So to avoid duplicating the normalization code, everything is moved into the find controller. The previous code to normalize text was using NFKC but with a hardcoded map, hence it has been replaced by the use of normalize("NFKC") (it helps to reduce the bundle size by 30kb). In playing with this \ufe94 char, I noticed that the bidi algorithm wasn't taking into account some RTL unicode ranges, the generated font wasn't embedding the mapping for this char and the unicode ranges in the OS/2 table weren't up-to-date. When normalized, some chars can be replaced by several ones, which resulted in some extra chars in the text layer. To avoid any regression, when copying some text from the text layer, the copied string is normalized (NFKC) before being put in the clipboard (it works like this in both Acrobat and Chrome).	2023-04-17 14:31:23 +02:00
Jonas Jenwald	804aa896a7	Stop using the `PRODUCTION` build-target in the JavaScript code This special build-target is very old, and was introduced with the first pre-processor that only uses comments to enable/disable code. When the new pre-processor was added `PRODUCTION` effectively became redundant, at least in JavaScript code, since `typeof PDFJSDev === "undefined"` checks now do the same thing. This patch proposes that we remove `PRODUCTION` from the JavaScript code, since that simplifies the conditions and thus improves readability in many cases. Please note: There's not, nor has there ever been, any gulp-task that set `PRODUCTION = false` during building.	2023-04-17 12:04:34 +02:00
Jonas Jenwald	c79bdd6ae6	Simplify the `CFFCompiler.compileTypedArray` method Rather than manually creating the Array, we can use the now existing `Array.from` method instead.	2023-04-15 11:13:34 +02:00
Jonas Jenwald	0ce568e789	Remove `CFFCompiler.compileGlobalSubrIndex` since it's completely unused This method was originally added in PR 1320, eleven years ago, however it doesn't appear to ever have been used (not even from the start). Furthermore, this method also tries to access a property that doesn't exist (`this.out`) and then call a method that also doesn't exist (`writeByteArray`).	2023-04-15 11:13:21 +02:00
Jonas Jenwald	ab2773416b	Merge pull request #16291 from Snuffleupagus/issue-16289 Limit the `Path2D`-checks in the worker-thread to Node.js (PR 16238 follow-up, issue 16289)	2023-04-14 21:26:12 +02:00
Calixte Denizet	5eab8ec610	Avoid using Array.concat, when possible, when compiling a CFF font In looking at https://bugs.ghostscript.com/show_bug.cgi?id=706451 I noticed that bug2.pdf was pretty slow to load for such a basic file. In profiling I noticed that a lot of time is spent in Array.concat, hence this patch uses Array.push when it's possible (it's now ~3 times faster).	2023-04-14 19:01:01 +02:00
Jonas Jenwald	edd13895dd	Limit the `Path2D`-checks in the worker-thread to Node.js (PR 16238 follow-up, issue 16289) The changes in PR 16238 were intended specifically for Node.js environments, however they accidentally applied to older browsers as well. Please note: In up-to-date browsers `Path2D` is available in Workers, which should be connected to the introduction of `OffscreenCanvas`.	2023-04-14 11:51:11 +02:00
Jonas Jenwald	3a36a9d337	Merge pull request #16268 from Snuffleupagus/RegionalImageCache Attempt to also cache images at the "page"-level (issue 16263)	2023-04-11 12:06:29 +02:00
calixteman	c1c372c320	Merge pull request #16225 from calixteman/16224 Thin whitespaces must have their own span	2023-04-11 11:13:16 +02:00
Jonas Jenwald	9881dbf927	Attempt to also cache images at the "page"-level (issue 16263) Currently we have two separate image-caches on the worker-thread: - A local one, which is unique to each `PartialEvaluator.getOperatorList` invocation. This one caches both names and references, since image-resources may be accessed in either way. - A global one, which applies to the entire PDF documents and all its pages. This one only caches references, since nothing else would work. This patch introduces a third image-cache, which essentially sits "between" the two existing ones. The new `RegionalImageCache`[1] will be usable throughout a `PartialEvaluator` instance, and consequently it only caches references, which thus allows us to keep track of repeated image-resources found in e.g. different /Form and /SMask objects. --- [1] For lack of a better word, since naming things is hard...	2023-04-10 11:34:41 +02:00
Tim van der Meij	13f2426aab	Merge pull request #16238 from Snuffleupagus/update-Node-compat-check Update the Node.js compatibility-check in the worker-thread	2023-04-01 14:20:33 +02:00
Jonas Jenwald	57a307d0cd	Update the Node.js compatibility-check in the worker-thread Please note: In Node.js environments a `legacy`-build must be used since only those versions include any polyfills. Previously we'd only check if `ReadableStream` is natively supported, however since Node.js version 18 that's now been implemented; please see https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream#browser_compatibility Hence we'll also check for the availability of `Path2D`, since that's browser-specific functionality not expected to be available in Node.js environments; please see https://developer.mozilla.org/en-US/docs/Web/API/Path2D#browser_compatibility	2023-03-30 18:36:15 +02:00
Jonas Jenwald	5063a6f2a9	[api-minor] Remove the `disableCombineTextItems` option Please note: This parameter has never been used within the PDF.js library/viewer itself, and it was only ever added for backwards compatibility reasons. This parameter was added in PR 7475, over six years ago, to try and optionally maintain the previous default text-extraction behaviour. However as part of the general text-extraction improvements in PR 13257, almost two years ago, the `disableCombineTextItems` functionality was accidentally "broken" in various ways. Note how the only (very basic) unit-test was updated in a way that doesn't really make sense, since generally speaking you'd expect that using the option should result in more (or at least the same number of) text-items. Furthermore there's also the recent issue 16209, where the option causes almost all textContent to be concatenated together. Hence this patch proposes that we simply remove the `disableCombineTextItems` option since it's essentially unused/untested functionality, as evident from the fact that it took almost two years for someone to notice that it's broken.	2023-03-30 14:23:38 +02:00
Calixte Denizet	4b7eb1436d	Thin whitespaces must have their own span	2023-03-29 11:23:58 +02:00
calixteman	622465dc20	Merge pull request #16223 from calixteman/16221 Create a new chunk when the char is too rised compared to the previous one	2023-03-28 15:30:14 +02:00
Calixte Denizet	a96f10e55d	Create a new chunk when the char is too rised compared to the previous one	2023-03-28 13:56:46 +02:00
Jonas Jenwald	d584513cb2	Merge pull request #16213 from Snuffleupagus/validateCSSFont-quotes Reduce duplication in the `validateCSSFont` helper function	2023-03-28 12:40:23 +02:00
Jonas Jenwald	20cbb89412	Simplify the `isPDFFunction` helper function Originally we used helper functions for checking if something was a Dictionary or Stream, and then having an initial `typeof` check probably made sense. However, given that we're using `instanceof` nowadays the additional check no longer seems necessary.	2023-03-27 11:34:20 +02:00
Jonas Jenwald	ef70988027	Reduce duplication in the `validateCSSFont` helper function Currently we're virtually duplicating the same code, for validating quotation marks, twice in this helper function. The size decrease is quite small (107 bytes) and this makes the code slightly harder to read, hence I completely understand if this patch is rejected.	2023-03-26 12:12:49 +02:00
Jonas Jenwald	035a273d30	Use `replaceAll` in the `recoverJsURL` helper function We can just do direct replacement when building the regular expression, rather than splitting the string into an Array and then re-joining it.	2023-03-25 12:31:39 +01:00
Jonas Jenwald	96e34fbb7d	Enable the `unicorn/prefer-negative-index` ESLint plugin rule Please see https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-negative-index.md	2023-03-24 10:18:32 +01:00
Jonas Jenwald	1fc09f0235	Enable the `unicorn/prefer-string-replace-all` ESLint plugin rule Note that the `replaceAll` method still requires that a global regular expression is used, however by using this method it's immediately obvious when looking at the code that all occurrences will be replaced; please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replaceAll#parameters Please find additional details at https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-string-replace-all.md	2023-03-23 12:57:10 +01:00
Jonas Jenwald	5f64621d46	Use `String.prototype.replaceAll()` where appropriate This fairly new method allows replacing multiple occurrences within a string without having to use regular expressions. Please refer to: - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replaceAll - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replaceAll#browser_compatibility	2023-03-22 15:31:10 +01:00
Jonas Jenwald	137a2d6e30	Add even more non-standard ligatures (PR 15517 follow-up) Given that we already create multi-byte ToUnicode entries in other cases, see e.g. the `getNormalizedUnicodes` table, this is hopefully fine.	2023-03-22 10:42:52 +01:00
Jonas Jenwald	122d5e549a	Track previous "XRefStm"s in a `Set`, rather than an `Object` Having just reviewed a patch touching this code, I couldn't help noticing that an `Object` isn't really the optimal data-structure for this and nowadays we can do better by using a `Set` instead.	2023-03-22 09:41:19 +01:00
Jonas Jenwald	9321758d91	Merge pull request #16186 from Snuffleupagus/issue-16176 Support multi-byte ToUnicode entries, when using predefined CMaps (issue 16176)	2023-03-21 22:17:18 +01:00
Jonas Jenwald	d4bcfe8c16	Support multi-byte ToUnicode entries, when using predefined CMaps (issue 16176) Hopefully this makes sense, since we already "create" multi-byte ToUnicode entries in other cases (see e.g. the `getNormalizedUnicodes` table).	2023-03-21 21:35:57 +01:00
Calixte Denizet	2d0f30a67c	Use the position of the previous xref stream if any when saving a pdf (bug 1823296)	2023-03-21 19:27:24 +01:00
Jonas Jenwald	50c844c5b8	Stop including `isOffscreenCanvasSupported` in the "StartRenderPage" message With the previous commit this is now completely unused in API, hence it can be removed. This is done in a separate commit to make it easier to re-instate it, would the need ever arise.	2023-03-14 13:09:20 +01:00
Tim van der Meij	9819f1cc6b	Merge pull request #16108 from Snuffleupagus/delay-cleanup Slightly delay cleanup, after rendering, in documents with large images	2023-03-11 15:52:12 +01:00
calixteman	b2a86350fc	Merge pull request #16096 from bungeman/fix_trig_functions Correct PostScript trigonometric operators	2023-03-11 14:32:23 +01:00
Ben Wagner	5fad91a680	Better approximate gradient color stops PDF gradients do not have color stops but an arbitrary PDF function of the type f(t) -> color. CSS gradients are only based on color stops. Most PDF gradient functions are produced from color stop oriented gradients. Take advantage of this by sampling the PDF function at a higher frequency but not converting any samples which could be interpolated to color stops. The sampling frequency is chosen to be the least common multiple of as many values as practical to exactly re-create the common case of the PDF function implementing equally spaced linearly interpolated stops in RGB color space. This also allows for better approximation of other smooth PDF functions (non-linear, or non-equally spaced, or in different color space). Fixes: #10572, #14165	2023-03-09 08:49:50 -05:00
Jonas Jenwald	c0671ac133	Slightly increase the maximum image sizes that we'll cache The current value originated in PR 2317, and in the decade that has passed the amount of RAM available in (most) devices should have increased a fair bit. Nowadays we also do a much better job of detecting repeated images at both the page- and document-level, which helps reduce overall memory-usage in many documents. Finally the constant is also moved into the `src/shared/util.js` file, since it was implicitly used on both the main- and worker-thread previously.	2023-03-08 17:06:10 +01:00
Jonas Jenwald	6839f15a32	Merge pull request #16128 from Snuffleupagus/issue-16127 Support (rare) Type3 fonts with Pattern resources (issue 16127)	2023-03-08 12:21:53 +01:00
Jonas Jenwald	e5427ab11b	Merge pull request #16122 from Snuffleupagus/rm-onUnsupportedFeature [api-minor] Remove the deprecated `onUnsupportedFeature` functionality (PR 15758 follow-up)	2023-03-08 12:16:27 +01:00
calixteman	cc555a389b	Merge pull request #16117 from calixteman/workaround_bug1820511 Avoid to have a factor too close to 2 when downscaling image	2023-03-08 11:12:56 +01:00
Calixte Denizet	1617ee6c3f	Avoid to have a factor too close to 2 when downscaling image It's a workaround for bug 1820511: it only affects Firefox on Windows using the D2D backend.	2023-03-08 11:05:46 +01:00
Calixte Denizet	e9474f1c84	[api-minor] Add an option to set the max canvas area	2023-03-08 10:37:06 +01:00
Jonas Jenwald	471aef5fc6	Support (rare) Type3 fonts with Pattern resources (issue 16127) This simply extends the approach in PR 10727 to also cover Patterns, which shouldn't be a common occurrence in Type3 fonts (since this is the first issue we've seen).	2023-03-08 09:20:52 +01:00
Calixte Denizet	b8dda089e2	Slightly modify the max width of a tracking space	2023-03-07 19:38:49 +01:00
Calixte Denizet	8db77cc361	Use appearance stream to render locked annotations (bug 1723568)	2023-03-07 15:01:31 +01:00


2872 Commits
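
Several of the commits listed above swap older JavaScript idioms for newer built-ins (`String.prototype.replaceAll`, rest/spread with `String.fromCharCode`, a `Set` for tracking visited entries, and `normalize("NFKC")`). The snippet below is only a rough sketch of those idioms in isolation. The function names and inputs are hypothetical and are not taken from the PDF.js source, so it should not be read as the code those commits actually changed.

```javascript
// Illustrative sketch only: every function name here is hypothetical
// and none of this is copied from the PDF.js code base.

// `replaceAll` with a plain-string pattern replaces every occurrence
// without needing a global regular expression or split/join.
function escapeParens(str) {
  return str.replaceAll("(", "\\(").replaceAll(")", "\\)");
}

// Spread syntax instead of `String.fromCharCode.apply(null, codes)`.
// Note: very large arrays can exceed the engine's argument limit,
// so this form is only suitable for small inputs.
function codesToString(codes) {
  return String.fromCharCode(...codes);
}

// A `Set` (rather than a plain `Object`) to remember which numeric
// offsets have already been visited, skipping repeats.
function uniqueOffsets(offsets) {
  const seen = new Set();
  const result = [];
  for (const offset of offsets) {
    if (seen.has(offset)) {
      continue;
    }
    seen.add(offset);
    result.push(offset);
  }
  return result;
}

// The built-in NFKC normalization, instead of a hardcoded mapping table.
// For example, the ligature U+FB01 is expanded to the two letters "fi".
function normalizeQuery(query) {
  return query.normalize("NFKC");
}
```

Because NFKC can replace one character with several, a normalized string may be longer than its input, which is the effect the `117bbf7cd9` message describes for the text layer.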