pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	05edd91bdb	Remove the `isNum` helper function The call-sites are replaced by direct `typeof`-checks instead, which removes unnecessary function calls. Note that in the `src/`-folder we already had more `typeof`-cases than `isNum`-calls. These changes were mostly done using regular expression search-and-replace, with two exceptions: - In `Font._charToGlyph` we no longer unconditionally update the `width`, since that seems completely unnecessary. - In `PDFDocument.documentInfo`, when parsing custom entries, we now do the `typeof`-check once.	2022-02-22 11:55:34 +01:00
Jonas Jenwald	b282814e38	Prefer `instanceof Name` rather than calling `isName()` with one argument Unless you actually need to check that something is both a `Name` and also of the correct type, using `instanceof Name` directly should be a tiny bit more efficient since it avoids one function call and an unnecessary `undefined` check. This patch uses ESLint to enforce this, since we obviously still want to keep the `isName` helper function for where it makes sense.	2022-02-21 12:45:00 +01:00
Jonas Jenwald	4df82ad31e	Prefer `instanceof Dict` rather than calling `isDict()` with one argument Unless you actually need to check that something is both a `Dict` and also of the correct type, using `instanceof Dict` directly should be a tiny bit more efficient since it avoids one function call and an unnecessary `undefined` check. This patch uses ESLint to enforce this, since we obviously still want to keep the `isDict` helper function for where it makes sense.	2022-02-21 12:44:56 +01:00
Jonas Jenwald	67b658e8d5	Prefer `instanceof Cmd` rather than calling `isCmd()` with one argument Unless you actually need to check that something is both a `Cmd` and also of the correct type, using `instanceof Cmd` directly should be a tiny bit more efficient since it avoids one function call and an unnecessary `undefined` check. This patch uses ESLint to enforce this, since we obviously still want to keep the `isCmd` helper function for where it makes sense.	2022-02-21 12:44:51 +01:00
Jonas Jenwald	bad15894fc	Improve the JSDocs for the `PDFObjects` class Given that we expose `PDFObjects`-instances, via the `commonObjs` and `objs` properties, on the `PDFPageProxy`-instances this ought to help provide slightly better TypeScript definitions.	2022-02-20 13:02:14 +01:00
Jonas Jenwald	f4712bc0ad	Simplify the data stored on `PDFObjects`-instances The manually tracked `resolved`-property is no longer necessary, since the same information is now directly available on all `PromiseCapability`-instances. Furthermore, since the `PDFObjects.resolve` method is not documented as accepting e.g. only Object-data, we probably shouldn't resolve the `PromiseCapability` with the `data` and instead only store it on the `PDFObjects`-instance.[1] --- [1] While Objects are passed by reference in JavaScript, other primitives such as e.g. strings are passed by value and the current implementation could thus lead to increased memory usage. Given how we're using `PDFObjects` in the PDF.js code-base none of this should be an issue, but it still cannot hurt to change this.	2022-02-20 12:33:33 +01:00
Jonas Jenwald	beecde3229	Introduce (some) private properties/methods in the `PDFObjects` class This ensures that the underlying data cannot be accessed directly, from the outside, since that's definitely not intended here. Note that we expose `PDFObjects`-instances, via the `commonObjs` and `objs` properties, on the `PDFPageProxy`-instances hence these changes really cannot hurt.	2022-02-20 12:23:30 +01:00
Jonas Jenwald	2cb2f633ac	Remove the `isRef` helper function This helper function is not really needed, since it's just a wrapper around a simple `instanceof` check, and it only adds unnecessary indirection in the code.	2022-02-19 15:33:42 +01:00
Tim van der Meij	df0aa1a9c4	Merge pull request #14575 from Snuffleupagus/rm-isStream Remove the `isStream` helper function	2022-02-19 14:59:19 +01:00
Jonas Jenwald	05efe3017b	Change `PixelsPerInch` to a class with `static` properties (issue 14579) Please note: I'm completely fine with this patch being rejected, and the issue instead closed as WONTFIX, since this is unfortunately a case where the TypeScript definitions dictate how we can/cannot write JavaScript code. Apparently the TypeScript definitions generation converts the existing `PixelsPerInch` code into a `namespace` and simply ignores the getter; please see `a7fc0d33a1/types/src/display/display_utils.d.ts (L223-L226)` Initially I tried tagging `PixelsPerInch` as an `@enum`, see https://jsdoc.app/tags-enum.html, however that unfortunately didn't help. Hence the only good/simple solution, as far as I'm concerned, is to convert `PixelsPerInch` into a class with `static` properties. This patch results in the following diff, for the `gulp types` build target: ```diff @@ -195,9 +195,10 @@ */ static toDateObject(input: string): Date \| null; } -export namespace PixelsPerInch { - const CSS: number; - const PDF: number; +export class PixelsPerInch { + static CSS: number; + static PDF: number; + static PDF_TO_CSS_UNITS: number; } declare const RenderingCancelledException_base: any; export class RenderingCancelledException extends RenderingCancelledException_base { ```	2022-02-19 09:05:40 +01:00
Jonas Jenwald	530af48b8e	Merge pull request #14569 from brendandahl/smask-state Fix canvas state getting out of sync from smasks. (bug 1755507)	2022-02-18 19:35:58 +01:00
Brendan Dahl	7def6d12c8	Fix canvas state getting out of sync from smasks. (bug 1755507) Soft masks can be enabled/disabled at any time and at different points in the save/restore stack. This can lead to the number of save/restores becoming unbalanced across the two canvases. Instead of save/restoring on the temporary canvas, change it so we only track state on the main (suspended) canvas. I was also getting an out-of-balance stack from patterns, so I've also fixed that and added a warning that will at least show up on Chrome. It would be nice to add this to Firefox at some point too. Fixes #11328, #14297 and bug 1755507	2022-02-17 17:38:32 -08:00
Jonas Jenwald	1a31855977	Remove the `isStream` helper function At this point all the various Stream-classes extend an abstract base-class, hence this helper function is no longer necessary and only adds unnecessary indirection in the code.	2022-02-17 13:51:36 +01:00
Jonas Jenwald	fd319e94b3	Add a missing string-check in the `_collectJS` helper function Unfortunately I don't have a test-case that breaks without this change, however the `stringToPDFString` helper function will fail if anything other than a string is passed to it. The changes in this patch thus make this code more-or-less identical to that found in the `Catalog.{_collectJavaScript, parseDestDictionary}` methods.	2022-02-16 13:43:42 +01:00
Calixte Denizet	18e3a98c2b	[api-minor] Don't add in the text content the chars which are out-of-page (bug 1755201) - it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1755201; - if the glyph position is not within the view then skip it.	2022-02-13 21:07:11 +01:00
Tim van der Meij	c37d785b2a	Merge pull request #14560 from Snuffleupagus/Node-ReadableStream-polyfill [api-minor] Remove the, in `legacy` builds, bundled `ReadableStream` polyfill	2022-02-13 14:08:22 +01:00
Jonas Jenwald	b89595fd20	[api-minor] Remove the, in `legacy` builds, bundled `ReadableStream` polyfill According to the MDN compatibility data, see https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream#browser_compatibility, all browsers that we support have native `ReadableStream` implementations (since quite some time too). Hence only Node.js is now lagging behind w.r.t. `ReadableStream` support, and its experimental implementation doesn't really help us given the life-span of the LTS releases (see https://en.wikipedia.org/wiki/Node.js#Releases). It seems quite unfortunate to bundle a `ReadableStream` polyfill in the `legacy` builds when it's unnecessary in browsers, given its overall size, but fortunately we can avoid that by simply listing `web-streams-polyfill` as a dependency for the `pdfjs-dist` library.	2022-02-13 10:15:58 +01:00
Jonas Jenwald	d642d34500	Remove the UTF-8 fallback, when `TextDecoder` is missing, from the Content-Disposition parser Given that `TextDecoder` is now supported by all modern browsers/environments, please see https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder#browser_compatibility, there's no longer any good reason to keep a UTF-8 fallback in the Content-Disposition parser.	2022-02-12 10:30:25 +01:00
Jonas Jenwald	b87a243222	[api-minor] Stop exposing the `createObjectURL` helper function in the API With recent changes, specifically PR 14515 and the previous patch, the `createObjectURL` helper function is now only used with the SVG back-end. All other call-sites, throughout the code-base, are now using `URL.createObjectURL(...)` directly and it no longer seems necessary to keep exposing the helper function in the API. Finally, the `createObjectURL` helper function is moved into the `src/display/svg.js` file to avoid unnecessarily duplicating this code on both the main- and worker-threads.	2022-02-10 12:01:35 +01:00
Brendan Dahl	f8b2a99ddc	Merge pull request #14543 from Snuffleupagus/bug-1753983 Let `Lexer.getNumber` treat a single minus sign as zero (bug 1753983)	2022-02-09 14:06:35 -08:00
Jonas Jenwald	1f0fb270b1	[api-minor] Ensure that the `PDFDocumentLoadingTask`-promise is rejected when cancelling the PasswordPrompt (bug 1754421) This is essentially a continuation of PR 7926, where we added support for rejecting the current `PDFDocumentLoadingTask`-promise by throwing inside of the `onPassword`-callback. Hence the naive way to address [bug 1754421](https://bugzilla.mozilla.org/show_bug.cgi?id=1754421) would be to simply throw in the `onPassword`-callback used in the default viewer. However it unfortunately turns out to not work, since the password input/validation is asynchronous, and we thus need another approach. The simplest solution that I can come up with here, is thus to extend the `onPassword`-callback to also reject the current `PDFDocumentLoadingTask`-instance if an `Error` is explicitly passed as the input to the callback function. (This doesn't feel great, but I cannot see a better solution that isn't really complicated.)	2022-02-09 15:09:20 +01:00
Jonas Jenwald	64f3dbeb48	Let `Lexer.getNumber` treat a single minus sign as zero (bug 1753983) This appears to be consistent with the behaviour in both Adobe Reader and PDFium (in Google Chrome); this is essentially the same approach as used for a single decimal point in PR 9827.	2022-02-07 17:09:47 +01:00
Jonas Jenwald	03f5f6a421	[api-minor] Update the minimum supported browser versions Please note that while we "support" some (by now) fairly old browsers, that essentially means that the library (and viewer) will load and that the basic functionality will work as intended.[1] However, in older browsers, some functionality may not be available and generally we'll ask users to update to a modern browser when bugs (specific to old browsers) are reported.[2] There's always a question of just how old browsers the PDF.js contributors can realistically support, and here I'm suggesting that we place the cut-off point at approximately three years. With that in mind, this patch updates the minimum supported browsers (and environments) as follows: - Chrome 73, which was released on 2019-03-12; see https://en.wikipedia.org/wiki/Google_Chrome_version_history - Firefox ESR (as before); see https://wiki.mozilla.org/Release_Management/Calendar - Safari 12.1, which was released on 2019-03-25; see https://en.wikipedia.org/wiki/Safari_version_history#Safari_12 - Node.js 12, which was released on 2019-04-23 (and will soon reach EOL); see https://en.wikipedia.org/wiki/Node.js#Releases --- [1] Assuming a `legacy`-build is being used, of course. [2] In general it's never a good idea to use an old/outdated browser, since those may contain known security vulnerabilities.	2022-02-06 13:06:43 +01:00
Jonas Jenwald	403baa7bba	[api-minor] Remove the `normalizeWhitespace` option in the `PDFPageProxy.{getTextContent, streamTextContent}` methods (issue 14519, PR 14428 follow-up) With these changes, we'll now always replace all whitespaces with standard spaces (0x20). This behaviour is already, since many years, the default in both the viewer and the browser-tests.	2022-02-03 09:17:22 +01:00
calixteman	7a034706ba	Merge pull request #14510 from calixteman/14502 [api-minor] Annotations - Adjust the font size in text field in considering the total width (bug 1721335)	2022-01-30 15:58:51 +01:00
Calixte Denizet	ae842e1c3a	[api-minor] Annotations - Adjust the font size in text field in considering the total width (bug 1721335) - it aims to fix #14502 and bug 1721335; - Acrobat and Pdfium do the same; - it'll avoid to have truncated data when printed; - change the factor to compute font size in using field height: lineHeight = 1.35*fontSize - this is the value used by Acrobat. - in order to not have truncated strings on the bottom, add few basic metrics for standard fonts.	2022-01-30 15:53:31 +01:00
Jonas Jenwald	7cc761a8c0	Polyfill `structuredClone` with core-js (PR 13948 follow-up) This allows us to remove the manually implemented `structuredClone` polyfill, thus reducing the maintenance burden for the `LoopbackPort` class; refer to https://github.com/zloirock/core-js#structuredclone Please note: While `structuredClone` support landed already in Firefox 94, Google Chrome only added it in version 98 (currently in Beta). However, given that the `LoopbackPort` will only be used together with fake workers in browsers this shouldn't be too much of a problem.[1] For Node.js environments, where fake workers are unfortunately necessary, using a `legacy/`-build is already required which thus guarantees that the `structuredClone` polyfill is available. Also, the patch updates core-js to the latest version since that one includes `structuredClone` improvements; please see https://github.com/zloirock/core-js/releases/tag/v3.20.3 --- [1] Given that we only support browsers with proper worker support, if fake workers are being used that essentially indicates a configuration problem/error.	2022-01-27 21:11:42 +01:00
Jonas Jenwald	8f6965b197	Merge pull request #14506 from Snuffleupagus/license_header_2022 Update the year in the `license_header` files	2022-01-27 19:34:56 +01:00
Jonas Jenwald	00bd549e82	Update the year in the `license_header` files This also includes a couple of files that are included as-is in the `pdfjs-dist` library.	2022-01-27 19:24:31 +01:00
calixteman	838909f8c1	Merge pull request #14491 from quaoaris/lines-rendered-too-thick fix for lines (stroke) are rendered too thick (Bug 1743245)	2022-01-27 18:46:26 +01:00
Calixte Denizet	3a7004ca25	Take into account all rotations before comparing glyph positions - it aims to fix #14497; - previously, only rotations with an angle 0, 90, 180 or 270 were taken into account; - so generalize to any angle but keep the fast path for 0, 90, ... because they're likely more common than anything else.	2022-01-26 17:19:00 +01:00
quaoaris	3f77d80f31	fix for lines (stroke) are rendered too thick (Bug 1743245) This commit fixes Bug 1743245 (Grided PDF file lines rendered too thick), which was created by a fix for #12868. The lineWidth was set to round(1 * this._combinedScaleFactor) when the pixel is drawn as a parallelogram with a height <1. This fix changes this to floor(1 * this._combinedScaleFactor). This change shows a visual result comparable to Chrome and Acrobat. Regarding the last PR, 3 statements in canvas.js are affected and will change with this commit (stroke and paintChar). Renaming the reference files to follow the naming convention.	2022-01-25 10:27:30 +01:00
Jonas Jenwald	8836593b9e	Add a (global) cache to the `getCharUnicodeCategory` function Given that the regular expression has already become more complex (after the initial patch adding it), it seems to me that it probably cannot hurt to add a global cache to reduce unnecessary re-parsing. Obviously the `Glyph`-instances are being cached per font, however in most documents multiple fonts are being used and in practice there's very often a fair amount of overlap between the /ToUnicode-data in different fonts[1]. Consider for example loading and rendering the entire `tracemonkey.pdf` document (from the test-suite), which isn't a particularly large document. In that case the `getCharUnicodeCategory` function is being called a total of `601` times, however there's only `106` unique unicode-chars being checked. Please note: In practice I suppose that this won't have a huge effect on overall performance, however given the relative simplicity of this patch I figured that it'd not hurt to submit it for review. --- [1] Consider e.g. how there's usually different fonts used for regular, bold, respectively italic text.	2022-01-25 09:59:34 +01:00
Calixte Denizet	e1d3a3b414	Remove the invisible format marks from the text chunks - it aims to fix issue #9186.	2022-01-24 13:47:24 +01:00
calixteman	88236e1163	Merge pull request #14430 from calixteman/beforeinput [JS] Use beforeinput event to trigger a keystroke event in the sandbox	2022-01-23 20:42:33 +01:00
Calixte Denizet	6ac296e48e	[JS] Use beforeinput event to trigger a keystroke event in the sandbox - it aims to fix issue #14307; - this event has been added recently in Firefox and we can now use it; - fix few bugs in aform.js or in annotation_layer.js; - add some integration tests to test keystroke events (see `AFSpecial_Keystroke`); - make dispatchEvent in the quickjs sandbox async.	2022-01-23 19:53:01 +01:00
Tim van der Meij	23b6fde9fc	Merge pull request #14464 from Snuffleupagus/issue-14462 Support Type1 font files with incomplete /CharStrings definitions (issue 14462)	2022-01-19 20:38:46 +01:00
calixteman	b0231cc887	Merge pull request #14456 from calixteman/1749563 Font renderer - get int8 instead of uint8 in composite glyphs (bug 1749563)	2022-01-19 01:20:49 -08:00
Calixte Denizet	74f25d2755	Font renderer - get int8 instead of uint8 in composite glyphs (bug 1749563) - it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1749563; - use some helper functions to get (u\|i)int** values in buffer: it helps to have clearer code; - in composite glyphs the translation values of a transformation are signed, so consequently read int8 instead of uint8; - add a few TODOs.	2022-01-18 22:06:23 +01:00
Jonas Jenwald	a13ae5d97d	Support Type1 font files with incomplete /CharStrings definitions (issue 14462) Please refer to https://www.pdfa.org/norm-refs/Type1Fonts.pdf#page=15 for the expected format for the /CharStrings entries. In the referenced PDF document the /CharStrings are missing the expected end-token, which causes us to swallow the start of the next glyph name.	2022-01-17 18:55:22 +01:00
Jonas Jenwald	ba37d600d7	Make the `normalizeWhitespace` handling, in the `PartialEvaluator`, more efficient (PR 14428 follow-up) After the changes in PR 14428 we can directly, and more efficiently, handle whitespace conversion in `PartialEvaluator.getTextContent` when the `normalizeWhitespace` option is being used. This way we no longer need a separate helper function for this, and can avoid having to (again) iterate through the text and checking each character. Finally, this also removes the need for using a regular expression on e.g. all non-ASCII text.	2022-01-16 08:29:21 +01:00
calixteman	da953f4b64	Merge pull request #14428 from calixteman/typo Use the correct dimension to know if we have to add an EOL in vertical mode	2022-01-15 12:47:10 -08:00
Calixte Denizet	9dae421a0d	Handle all the whitespaces the same way when creating text chunks	2022-01-15 21:44:00 +01:00
Tim van der Meij	922dac035c	Merge pull request #14448 from Snuffleupagus/Type3-circular-refs Prevent circular references in Type3 fonts	2022-01-15 14:11:47 +01:00
Tim van der Meij	a72d188599	Merge pull request #14439 from Snuffleupagus/issue-14438 Ignore Annotations with empty /Rect-entries in the display-layer (issue 14438)	2022-01-15 14:11:25 +01:00
Tim van der Meij	c0d2932faf	Merge pull request #14454 from Snuffleupagus/util-more-unreachable Replace some `assert` usage with `unreachable` in the `src/shared/util.js` file	2022-01-15 13:52:10 +01:00
Tim van der Meij	625f829842	Merge pull request #14446 from Snuffleupagus/issue-14435 Expose even more API-functionality in the TypeScript definitions (issue 14435, PR 14013 follow-up)	2022-01-15 13:46:11 +01:00
Jonas Jenwald	0e1b93bf20	Replace some `assert` usage with `unreachable` in the `src/shared/util.js` file Inlining the checks should be a tiny bit more efficient, since it avoids having to make unconditional function calls in these fairly commonly used helper functions.	2022-01-15 13:01:25 +01:00
Jonas Jenwald	12d8f0b64d	Re-factor the `stringToPDFString` helper function for UTF-16 strings This patch changes the function to instead utilize the `TextDecoder` for both kinds of UTF-16 BOM strings.	2022-01-14 20:38:40 +01:00
Jonas Jenwald	76444888fb	Add (basic) UTF-8 support in the `stringToPDFString` helper function (issue 14449) This patch implements this by looking for the UTF-8 BOM, i.e. `\xEF\xBB\xBF`, in order to determine the encoding.[1] The actual conversion is done using the `TextDecoder` interface, which should be available in all environments/browsers that we support; please see https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder#browser_compatibility --- [1] Assuming that everything lacking a UTF-16 BOM would have to be UTF-8 encoded really doesn't seem correct.	2022-01-14 18:57:07 +01:00
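
The BOM-based encoding detection described in commits 76444888fb and 12d8f0b64d can be illustrated roughly as follows. This is only a minimal sketch, not the actual `stringToPDFString` code: the function name `decodeBomString` is hypothetical, the input is assumed to be a "byte string" (one character per byte), and the fallback for strings without a BOM is simplified to returning the input unchanged.

```javascript
// Hypothetical helper (not part of PDF.js): picks an encoding from the
// leading byte-order mark and decodes the rest with `TextDecoder`.
// Assumption: `str` holds one byte per character (char codes 0-255).
function decodeBomString(str) {
  let encoding = null;
  let bomLength = 0;

  if (str.startsWith("\xFE\xFF")) {
    encoding = "utf-16be";
    bomLength = 2;
  } else if (str.startsWith("\xFF\xFE")) {
    encoding = "utf-16le";
    bomLength = 2;
  } else if (str.startsWith("\xEF\xBB\xBF")) {
    encoding = "utf-8";
    bomLength = 3;
  }

  if (!encoding) {
    // Simplification: without a BOM the string is returned as-is here.
    return str;
  }

  // Copy the bytes that follow the BOM into a typed array for decoding.
  const bytes = new Uint8Array(str.length - bomLength);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = str.charCodeAt(i + bomLength) & 0xff;
  }
  return new TextDecoder(encoding).decode(bytes);
}
```

For instance, `decodeBomString("\xEF\xBB\xBF\xC3\xA9")` decodes the two UTF-8 bytes after the BOM into a single character, whereas a plain ASCII string without any BOM passes through untouched.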

5157 Commits
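
The global cache mentioned in commit 8836593b9e (`getCharUnicodeCategory`) follows a common memoization pattern, which might look roughly like the sketch below. The function name `getCategoryCached`, the result fields, and the regular expressions are assumptions made for illustration; they are not taken from the PDF.js source.

```javascript
// Module-level cache shared by all callers (i.e. across all fonts).
const categoryCache = new Map();

// Hypothetical classifier: the real regular expression in PDF.js differs.
function getCategoryCached(char) {
  const cached = categoryCache.get(char);
  if (cached) {
    return cached; // Skip re-running the regular expressions.
  }
  const category = {
    isWhitespace: /^\s$/u.test(char),
    isZeroWidthDiacritic: /^\p{Mn}$/u.test(char),
    isInvisibleFormatMark: /^\p{Cf}$/u.test(char),
  };
  categoryCache.set(char, category);
  return category;
}
```

Since the same characters tend to recur across the fonts of a document, repeated lookups return the stored object instead of classifying the character again.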