pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	e723da7261	Ignore invalid /Encoding-entries when parsing fonts (issue 14821) In the referenced PDF document the fonts have /Encoding-entries that are Streams (containing completely bogus data), which are thus obviously not valid here. Hence, only when `ignoreErrors` is set, we'll now ignore these corrupt /Encoding-entries and fall back to the existing code to try and infer a usable encoding. Given that this is clearly a case of corrupt PDF documents, there's no guarantee that this will "fix" all such cases, however it's the best that we can do here and shouldn't really be worse than ignoring an entire font.	2022-04-22 11:49:03 +02:00
Tim van der Meij	f39219cd45	Merge pull request #14815 from Snuffleupagus/issue-14814 Ignore non-Stream /SMask-entries when parsing images (issue 14814)	2022-04-22 11:39:13 +02:00
Sean Wei	6bf978404e	Use correct case for JavaScript	2022-04-21 23:56:28 +08:00
Jonas Jenwald	39d1bdde09	Ignore non-Stream /SMask-entries when parsing images (issue 14814) This is similar to the pre-existing check used in the /Mask-case below, to handle corrupt PDF documents that include non-Stream /SMask-entries in images; please refer to the PDF specification: https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=216 Please note: Adobe Reader also fails to render the image on the second page, and displays an error message.	2022-04-21 12:14:08 +02:00
Jonas Jenwald	5bc7339c1b	Add support for the /Catalog Base-URI when resolving URLs (issue 14802) As far as I can tell, this is actually the very first time that we've seen a PDF document with a Base-URI specified in the /Catalog; please refer to the specification: https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2097122 To simplify the overall implementation, this new parameter is accessed via the existing `BasePdfManager.docBaseUrl`-getter and will thus override any user-specified `docBaseUrl` API-parameter.	2022-04-19 17:14:52 +02:00
Calixte Denizet	c2aa03e194	Fix clipping issue with pattern (follow-up of #14797)	2022-04-18 12:41:14 +02:00
Jonas Jenwald	5bbed400f2	Merge pull request #14797 from calixteman/12306 Don't clip when the clip path is empty (issue #12306)	2022-04-18 11:18:32 +02:00
Calixte Denizet	3d74d2c6cb	Don't clip when the clip path is empty (issue #12306)	2022-04-18 10:33:44 +02:00
Calixte Denizet	4b7691baf6	Simplify min/max computations in constructPath (bug 1135277) - most of the time the current transform is a scaling one (modulo translation), hence it's possible to avoid applying the transform to each bbox and instead apply it a posteriori; - compute the bbox in the worker when possible.	2022-04-17 17:25:54 +02:00
Calixte Denizet	f62d961dfe	Improve performance with image masks (bug 857031) - it's the second part of the fix for https://bugzilla.mozilla.org/show_bug.cgi?id=857031; - some image masks can be used several times but at different positions; - an image needs to be pre-processed before being rendered: * rescale it; * use the fill color/pattern. - the two operations above are time-consuming so we can cache the generated canvas; - the cache key is based on the current transform matrix (without the translation part) and the current fill color when it isn't a pattern. - with this patch, the pdf in the above bug renders much faster than before.	2022-04-16 20:48:39 +02:00
Tim van der Meij	b73a6cc213	Merge pull request #14785 from Snuffleupagus/core-js-structuredClone-transfers Update `core-js` to allow removing a `structuredClone` work-around	2022-04-16 12:36:44 +02:00
calixteman	681a9b8927	Merge pull request #14784 from calixteman/intersect Improve performance of shared/utils.js::intersect (bug 1135277)	2022-04-15 22:38:19 +02:00
Calixte Denizet	7501fe6f30	Improve performance of shared/utils.js::intersect - avoid calling normalizeRect, which clones the rectangles: it's useless and time-consuming; - when profiling the pdf in bug 1135277, the time spent in intersect drops from ~1s to ~30ms.	2022-04-15 22:24:26 +02:00
Jonas Jenwald	b996e107c3	Update `core-js` to allow removing a `structuredClone` work-around Because of a bug in previous `core-js` versions, which caused an Error to be thrown if its `structuredClone` polyfill was called with an explicit `null`/`undefined` transfer-parameter, the `LoopbackPort`-class contained a work-around. In the latest `core-js` version this has been fixed, and we can thus simplify our code ever so slightly; please see https://github.com/zloirock/core-js/releases/tag/v3.22.0	2022-04-15 22:12:02 +02:00
Jonas Jenwald	e67cd7fae0	Replace the `--viewport-scale-factor` CSS variable This CSS variable is only used together with the `annotationCanvasMap`-functionality in the canvas-code, however its value can be trivially computed by using the older `--zoom-factor` CSS variable together with the `PixelsPerInch`-structure. Rather than having two different CSS variables that are this closely linked, it seems better to simplify things by using just one CSS variable instead.	2022-04-14 12:43:57 +02:00
Tim van der Meij	cdb3481d6c	Merge pull request #14764 from apeltop/correct-typos Correct typos	2022-04-10 14:55:08 +02:00
Calixte Denizet	687c9a8710	Improve performance of applyMaskImageData - write uint32 values instead of uint8 to avoid the check before clamping; - unroll the loop that writes data into the buffer, but keep a loop for the last element of a line: it likely doesn't hurt much since it's executed only once per line; - I tested on a MacBook with an Apple chip, and on Firefox Nightly the new code is almost 3.5x faster than before (~1.8x with Chrome).	2022-04-09 22:19:02 +02:00
Calixte Denizet	040fcae5ab	Improve performance with image masks (bug 857031) - it aims to partially fix the reported performance issue: https://bugzilla.mozilla.org/show_bug.cgi?id=857031; - the idea is to avoid using byte arrays and instead use ImageBitmap, which is much faster to draw: * an ImageBitmap is Transferable, which means that it can be built in the worker instead of in the main thread: - this is achieved by using an OffscreenCanvas when it's available, there is a bug to enable them for pdf.js: https://bugzilla.mozilla.org/show_bug.cgi?id=1763330; - or by using createImageBitmap: in Firefox a task is sent to the main thread to build the bitmap so it's slightly slower than using an OffscreenCanvas. * it's transferred from the worker to the main thread by "reference"; * the byte buffers used to create the image data have a very short lifetime and ergo the memory used is globally less than before. - Use the localImageCache for the mask; - Fix the pdf issue4436r.pdf: it was expected to have a binary stream for the image; - Move the singlePixel trick from operator_list to image: this way we can use this trick even if it isn't in a set as defined in operator_list.	2022-04-09 18:26:26 +02:00
apeltop	a97dd26389	Correct typos	2022-04-09 09:43:18 +09:00
Jonas Jenwald	a919959d83	Slightly simplify the `Catalog._readMarkInfo` method We don't need to first check if the Dictionary contains the key, since trying to get a non-existent key simply returns `undefined` and we're already ensuring that the value is a boolean. Furthermore, we shouldn't need to worry about the `Object.prototype` containing enumerable properties since the checks (in `src/core/worker.js`) done for `Array.prototype` indirectly also cover `Object`s. (Keep in mind that an `Array` is just a special kind of `Object` in JavaScript.)	2022-04-05 16:37:51 +02:00
Jonas Jenwald	1dc4713a0b	Re-factor the `isLittleEndian`/`isEvalSupported` caching This functionality is very old, hence we should be able to improve the caching a little bit with modern JavaScript features.	2022-04-05 16:01:01 +02:00
Calixte Denizet	f4fcb59a5e	Refactor some xfa*** getters in document.js - it's a follow-up of PR #14735.	2022-04-03 20:38:12 +02:00
Jonas Jenwald	f33ce5fc2d	Decode non-ASCII values found in the xfa:datasets (PR 14735 follow-up) Please note: This is possibly bad/wrong in general, but I figured that submitting it for review wouldn't hurt. It seems that even Adobe Reader doesn't handle the non-ASCII characters that appear in some of the fields correctly, however it should be pretty easy to improve things on the PDF.js side.	2022-04-01 11:54:34 +02:00
Jonas Jenwald	36a289d747	Merge pull request #14735 from calixteman/14685 [Annotations] Some annotations can have their values stored in the xfa:datasets	2022-04-01 11:30:16 +02:00
Calixte Denizet	0b597304c1	[Annotations] Some annotations can have their values stored in the xfa:datasets - it aims to fix #14685; - add a basic object to get values from the parsed datasets; - these annotations don't have an appearance so we must create one when printing or saving.	2022-04-01 10:28:04 +02:00
Jonas Jenwald	addb4cb12b	Use `String.prototype.repeat()` in a couple of spots Rather than using a temporary Array to manually create repeated strings, we can use `String.prototype.repeat()` instead. The reason that we didn't use this from the start is most likely because some browsers, notably IE, didn't support this; note https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/repeat#browser_compatibility	2022-03-30 15:42:40 +02:00
Calixte Denizet	ad3fb71a02	[Annotations] Add support for printing/saving choice list with multiple selections - it aims to fix issue #12189.	2022-03-29 18:59:44 +02:00
Jonas Jenwald	0dd6bc9a85	Merge pull request #14703 from calixteman/14627 [text selection] Add the whitespaces present in the pdf in the text chunk	2022-03-27 15:20:19 +02:00
Calixte Denizet	18e79e3c0b	[text selection] Add the whitespaces present in the pdf in the text chunk - it aims to fix issue #14627; - the basic idea of the recent text refactoring was to only consider the rendered visible whitespaces. But sometimes, the heuristics aren't correct and although some whitespaces are in the text stream they weren't in the text chunks because they were too small. Hence we added some exceptions, for example, we always add a whitespace when it is between two non-whitespace chars but only when in the same Tj. So basically, this patch removes the constraint to have the chars in the same Tj (by using a circular buffer to save the last two chars) but doesn't add a space when the visible space is really too small (hence `NOT_A_SPACE_FACTOR`).	2022-03-27 14:34:56 +02:00
Jonas Jenwald	7f0589c74a	Change the type of the `container` property, in the `TextLayerRenderParameters` typedef (issue 14716) Given that the textLayer-code has been using a `DocumentFragment` ever since PR 3356 (back in 2013), simply updating the type of the `container` property should be fine. This patch also tries to, ever so slightly, improve the grammar of a couple of other properties in the typedef.	2022-03-24 22:42:37 +01:00
Jonas Jenwald	849de5a508	Slightly improve validation of (some) parameters in `getDocument` There's a couple of `getDocument` parameters that should be numbers, but which are currently not fully validated to prevent issues elsewhere in the code-base. Also, improves validation of the `ownerDocument` parameter since we currently accept more-or-less anything here.	2022-03-21 13:32:17 +01:00
Jonas Jenwald	73d2ddac0d	Update npm packages Note that the Prettier update made it possible to move a couple of comments after `default:`-cases back to their original/intended positions, please see https://prettier.io/blog/2022/03/16/2.6.0.html	2022-03-20 10:59:13 +01:00
Calixte Denizet	f0b549c2a2	[JS] - Parse a date using the given format first and then try the default date parser - it aims to fix #14672.	2022-03-19 16:07:43 +01:00
Tim van der Meij	5de6af4e64	Merge pull request #14683 from Snuffleupagus/sendTest-cleanup [src/display/api.js] Simplify the `sendTest` function, used with Worker initialization (PR 14291 follow-up)	2022-03-19 13:38:05 +01:00
Jonas Jenwald	c0736647f9	Add general iteration support in the `RefSet` and `RefSetCache` classes This patch removes the existing `forEach` methods, in favor of making the classes properly iterable instead. Given that the classes are using a `Set` respectively a `Map` internally, implementing this is very easy/efficient and allows us to simplify some existing code.	2022-03-18 14:27:34 +01:00
Jonas Jenwald	be2b1d5d2a	[src/display/api.js] Simplify the `sendTest` function, used with Worker initialization (PR 14291 follow-up) Given that we now only use Workers when `postMessage` transfers are supported, there's really no point in trying to send a "test" message without transfers present. Hence, if `postMessage` transfers are not supported by the browser, we'll now fall back to "fake" Workers immediately instead. The comment about Opera is also removed, since it was originally added back in PR 983 and mentions Opera `11.60` [which was released in 2011](https://en.wikipedia.org/wiki/History_of_the_Opera_web_browser#Version_11).	2022-03-16 13:25:41 +01:00
Jonas Jenwald	d5c9be341d	[src/display/api.js] Use private static class fields, rather than `shadow`ed getter work-arounds (PR 13813, 13882 follow-up) At the time private static class fields were too new, however that's no longer an issue and we can thus (ever so slightly) simplify the code.	2022-03-16 13:02:34 +01:00
Jonas Jenwald	0c349c701f	Remove the `addLinkAttributes` warnings in the Annotation/XFA-layers (PR 14092 follow-up) These warnings have now been present in three releases, see PR 14092, hence it should (hopefully) be fine to remove them now.	2022-03-13 11:38:56 +01:00
Tim van der Meij	790735eaf1	Merge pull request #14658 from Snuffleupagus/api-validate-cMapUrl-standardFontDataUrl Validate the `cMapUrl`/`standardFontDataUrl` parameters in `getDocument`	2022-03-11 21:09:58 +01:00
Jonas Jenwald	a60b98412f	Validate the `cMapUrl`/`standardFontDataUrl` parameters in `getDocument` These changes make sense for two reasons: - Given that the parameters are potentially passed to the worker-thread, depending on the `useWorkerFetch` parameter, we need to prevent errors if the user provides values that aren't clonable. - By ensuring that the default values are indeed `null`, we'll trigger main-thread fetching (of CMaps and Standard fonts) as intended in the `PartialEvaluator` and thus potentially provide better Error messages.	2022-03-10 16:33:10 +01:00
Jonas Jenwald	537ed37835	Move the `isSameOrigin` helper function This function is currently placed in the `src/shared/util.js` file, which means that the code is duplicated in both of the built `pdf.js` and `pdf.worker.js` files. Furthermore, it only has a single call-site which is also specific to the `GENERIC`-build of the PDF.js library. Hence this helper function is instead moved into the `src/display/api.js` file, in such a way that it's conditionally defined but still can be unit-tested.	2022-03-10 13:51:09 +01:00
Tim van der Meij	e85bb0b599	Merge pull request #14645 from Snuffleupagus/Node-DOMMatrix-polyfill [api-minor] Remove the, in `legacy` builds, bundled `DOMMatrix` polyfill	2022-03-09 20:38:26 +01:00
Tim van der Meij	55a931e454	Merge pull request #14648 from Snuffleupagus/PDFDocument-stream Simplify the `PDFDocument` constructor	2022-03-09 20:36:49 +01:00
Jonas Jenwald	6a78f20b17	Simplify the `PDFDocument` constructor Originally the code in the `src/`-folder was shared between the main/worker-threads, and back then it probably made sense that the `PDFDocument` constructor accepted different arguments. However, for many years we've not been passing anything except Streams to `PDFDocument` and we should thus be able to slightly simplify that code. Note that for e.g. unit-tests of this code, using either a `NullStream` or a `StringStream` works just fine.	2022-03-08 17:13:47 +01:00
Jonas Jenwald	157a71d404	[api-minor] Remove the, in `legacy` builds, bundled `DOMMatrix` polyfill According to the MDN compatibility data, see https://developer.mozilla.org/en-US/docs/Web/API/DOMMatrix/DOMMatrix#browser_compatibility, all browsers that we support have native `DOMMatrix` implementations (since quite some time too). Hence Node.js is the only environment that lacks `DOMMatrix` support, which probably isn't that surprising given that it's browser functionality. While the `DOMMatrix` polyfill isn't that large, it nonetheless seems completely unnecessary to bundle it in the `legacy` builds when it's not needed in browsers. However, we can avoid that by simply listing `dommatrix` as a dependency for the `pdfjs-dist` library.	2022-03-08 10:29:11 +01:00
Jonas Jenwald	6f600befdd	Update TypeScript to version `4.6.2` and work-around stricter type checks I'm guessing that we're now running into the class-related improvements mentioned in https://devblogs.microsoft.com/typescript/announcing-typescript-4-6/#target-es2022 To unblock this update, and any future ones, this patch simply tweaks the JSDocs to get `gulp typestest` to run without errors.	2022-03-07 11:55:17 +01:00
Tim van der Meij	5242c38af5	Merge pull request #14628 from Snuffleupagus/issue-14626 When `stopAtErrors` is set, throw rather than warn when exceeding `maxImageSize` (issue 14626)	2022-03-05 13:09:36 +01:00
Tim van der Meij	5d12ac576b	Merge pull request #14631 from Snuffleupagus/typedef-fixes Fix a couple of small typos in JSDoc `typedef` comments	2022-03-05 13:06:53 +01:00
Jonas Jenwald	939e6f0c4c	Fix a couple of small typos in JSDoc `typedef` comments While this doesn't affect the official API documentation, these cases should nonetheless be fixed.	2022-03-04 12:11:52 +01:00
Jonas Jenwald	1a7921dbf0	Compute the loca table `endOffset`, of the "first" glyph, correctly (issue 14618) When there are multiple empty glyphs at the start of the data, ensure that the "first" glyph gets a correct `endOffset` to avoid skipping it during parsing in the `sanitizeGlyph` function.	2022-03-03 14:22:45 +01:00

6178 Commits