pdf.js

Author	SHA1	Message	Date
Brendan Dahl	12aba0f91a	Merge pull request #11789 from Snuffleupagus/bug-792816 Add a new `pdfjs.enablePermissions` preference, off by default, to allow the PDF documents to disable copying in the viewer (bug 792816)	2020-04-09 13:28:04 -07:00
Jonas Jenwald	8521f70157	Add a new `pdfjs.enablePermissions` preference, off by default, to allow the PDF documents to disable copying in the viewer (bug 792816) Please note: Most of the necessary API work was done in PR 10033, and the only remaining thing to do here was to implement it in the viewer. The new preference should thus allow e.g. enterprise users to disable copying in the viewer, for PDF documents whose permissions specify that. In order to simplify things the "copy"-permission was implemented using CSS, as suggested in https://bugzilla.mozilla.org/show_bug.cgi?id=792816#c55, which should hopefully suffice.[1] The advantage of this approach, as opposed to e.g. disabling the `textLayer` completely, is first of all that it ensures that searching still works correctly even in copy-protected documents. Secondly this also greatly simplifies the overall implementation, since it doesn't require a lot of code for something that's disabled by default. --- [1] As the discussion in the bug shows, this kind of copy-protection is not very strong and is also generally easy to remove/circumvent in various ways. Hence a simple solution, targeting "regular"-users rather than "power"-users is hopefully deemed acceptable here.	2020-04-08 18:20:47 +02:00
Tim van der Meij	4fe92605b7	Merge pull request #11727 from Snuffleupagus/issue-11713 Add a heuristic to scale even single-char text, when the horizontal/vertical scaling differs significantly (issue 11713)	2020-04-07 23:13:02 +02:00
Jonas Jenwald	91efde5246	Add a heuristic to scale even single-char text, when the horizontal/vertical scaling differs significantly (issue 11713) At this point in time, compared to when the "ignore single-char" code was added, we should generally be doing a much better job of combining text into as few chunks as possible. However, there are still bad cases where we're not able to combine text as much as one would like, which is why I'm not proposing to simply measure/scale all text. Instead this patch will only measure/scale single-char text in cases where the horizontal/vertical scale is off significantly, since that's where you'd expect bad text-selection behaviour otherwise. Note that most of the movement caused by this patch is with Type3 fonts, which is a somewhat special font type and one where our current text-selection behaviour is probably the least good.	2020-04-07 00:36:23 +02:00
Tim van der Meij	70c54ab9d9	Merge pull request #11746 from Snuffleupagus/issue-11740 Create the glyph mapping correctly for composite Type1, i.e. CIDFontType0, fonts (issue 11740)	2020-04-07 00:10:12 +02:00
Tim van der Meij	9871ccc69f	Merge pull request #11777 from Snuffleupagus/Font-exportData-2 [api-minor] Change `Font.exportData` to, by default, stop exporting properties which are completely unused on the main-thread and/or in the API (PR 11773 follow-up)	2020-04-06 22:54:14 +02:00
Jonas Jenwald	2d46230d23	[api-minor] Change `Font.exportData` to, by default, stop exporting properties which are completely unused on the main-thread and/or in the API (PR 11773 follow-up) For years now, the `Font.exportData` method has (because of its previous implementation) been exporting many properties despite them being completely unused on the main-thread and/or in the API. This is unfortunate, since among those properties there are a number of potentially very large data-structures, containing e.g. Arrays and Objects, which thus have to be first structured cloned and then stored on the main-thread. With the changes in this patch, we'll thus by default save memory for every `Font` instance created (there can be a lot in longer documents). The memory savings obviously depend a lot on the actual font data, but some approximate figures are: For non-embedded fonts it can save a couple of kilobytes, for simple embedded fonts a handful of kilobytes, and for composite fonts the size of this auxiliary data can even be larger than the actual font program itself. All-in-all, there's no good reason to keep exporting these properties by default when they're unused. However, since we cannot be sure that every property is unused in custom implementations of the PDF.js library, this patch adds a new `getDocument` option (named `fontExtraProperties`) that still allows access to the following properties: - "cMap": An internal data structure, only used with composite fonts and never really intended to be exposed on the main-thread and/or in the API. Note also that the `CMap`/`IdentityCMap` classes are a lot more complex than simple Objects, but only their "internal" properties survive the structured cloning used to send data to the main-thread. Given that CMaps can often be very large, not exporting them can also save a fair bit of memory. - "defaultEncoding": An internal property used with simple fonts, and used when building the glyph mapping on the worker-thread.
Considering how complex that topic is, and given that not all font types are handled identically, exposing this on the main-thread and/or in the API most likely isn't useful. - "differences": An internal property used with simple fonts, and used when building the glyph mapping on the worker-thread. Considering how complex that topic is, and given that not all font types are handled identically, exposing this on the main-thread and/or in the API most likely isn't useful. - "isSymbolicFont": An internal property, used during font parsing and building of the glyph mapping on the worker-thread. - "seacMap": An internal map, only potentially used with some Type1/CFF fonts and never intended to be exposed in the API. The existing `Font.{charToGlyph, charToGlyphs}` functionality already takes this data into account when handling text. - "toFontChar": The glyph map, necessary for mapping characters to glyphs in the font, which is built upon the various encoding information contained in the font dictionary and/or font program. This is not directly used on the main-thread and/or in the API. - "toUnicode": The unicode map, necessary for text-extraction to work correctly, which is built upon the ToUnicode/CMap information contained in the font dictionary, but not directly used on the main-thread and/or in the API. - "vmetrics": An array of width data used with fonts which are composite and vertical, but not directly used on the main-thread and/or in the API. - "widths": An array of width data used with most fonts, but not directly used on the main-thread and/or in the API.	2020-04-06 11:47:09 +02:00
Jonas Jenwald	8770ca3014	Make the `decryptAscii` helper function, in `src/core/type1_parser.js`, slightly more efficient By slicing the Uint8Array directly, rather than using the prototype and a `call` invocation, the runtime of `decryptAscii` is decreased slightly (~30% based on quick logging). The `decryptAscii` function is still less efficient than `decrypt`, however ASCII encoded Type1 font programs are sufficiently rare that it probably doesn't matter much (we've only seen two examples, issue 4630 and 11740).	2020-04-06 11:21:02 +02:00
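The difference described in the commit above can be illustrated with a minimal sketch. It is not the actual `decryptAscii` code: the sample bytes and the `discardNumber` value are made up, and it assumes that "using the prototype and a `call` invocation" refers to borrowing `Array.prototype.slice`.

```javascript
// Illustrative only; not the real `decryptAscii` implementation.
const data = new Uint8Array([10, 20, 30, 40, 50, 60]);
const discardNumber = 4; // hypothetical number of leading bytes to drop

// Borrowing `Array.prototype.slice` copies the bytes into a plain Array.
const viaPrototype = Array.prototype.slice.call(data, discardNumber);

// Calling `slice` on the typed array itself returns a new Uint8Array.
const direct = data.slice(discardNumber);

console.log(Array.isArray(viaPrototype)); // true
console.log(direct instanceof Uint8Array); // true
```

Both calls keep the same bytes; only the container type and the cost of producing it differ.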
Jonas Jenwald	938d519192	Create the glyph mapping correctly for composite Type1, i.e. CIDFontType0, fonts (issue 11740) This updates `Type1Font.getGlyphMapping` with a code-path "borrowed" from `CFFFont.getGlyphMapping`.	2020-04-06 11:21:02 +02:00
Jonas Jenwald	6a8c591301	Improve detection of binary/ASCII `eexec` encrypted Type1 font programs in `Type1Parser` (issue 11740) The PDF document, in the referenced issue, actually contains ASCII-encoded Type1 data which we currently incorrectly identify as binary. According to the specification, see https://www-cdf.fnal.gov/offline/PostScript/T1_SPEC.PDF#[{%22num%22%3A203%2C%22gen%22%3A0}%2C{%22name%22%3A%22XYZ%22}%2C87%2C452%2Cnull], the current checks are insufficient to decide between binary/ASCII encoded Type1 font programs.	2020-04-06 11:21:02 +02:00
Jonas Jenwald	2619272d73	Change the signature of `TranslatedFont`, and convert it to a proper class In preparation for the next patch, this changes the signature of `TranslatedFont` to take an object rather than individual parameters. This also, in my opinion, makes the call-sites easier to read since it essentially provides a small bit of documentation of the arguments. Finally, since it was necessary to touch `TranslatedFont` anyway it seemed like a good idea to also convert it to a proper `class`.	2020-04-05 20:53:48 +02:00
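As a rough sketch of why an object parameter can make call-sites easier to read, consider the class below. The class name `TranslatedFontSketch` and the property names `loadedName`, `font` and `dict` are assumptions made for illustration; the commit message does not list the actual parameters.

```javascript
// Hypothetical stand-in; the real `TranslatedFont` parameters may differ.
class TranslatedFontSketch {
  // Destructuring the single argument names every value at the call-site.
  constructor({ loadedName, font, dict }) {
    this.loadedName = loadedName;
    this.font = font;
    this.dict = dict;
  }
}

// With positional parameters this would read `new X("g_d0_f1", {...}, null)`,
// which says nothing about what each argument means.
const translated = new TranslatedFontSketch({
  loadedName: "g_d0_f1",
  font: { name: "ExampleFont" },
  dict: null,
});
console.log(translated.loadedName);
```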
Tim van der Meij	0400109b87	Merge pull request #11773 from Snuffleupagus/Font-exportData-1 [api-minor] Change `Font.exportData` to use an explicit white-list of exportable properties, and stop exporting internal/unused properties	2020-04-05 20:50:33 +02:00
Jonas Jenwald	59f54b946d	Ensure that all `Font` instances have the `vertical` property set to a boolean Given that the `vertical` property is always accessed on the main-thread, ensuring that the property is explicitly defined seems like the correct thing to do since it also avoids boolean casting elsewhere in the code-base.	2020-04-05 16:27:50 +02:00
Jonas Jenwald	c5e1fd3fde	Use "standard" shadowing in the `Font.spaceWidth` method With `Font.exportData` now only exporting white-listed properties, there should no longer be any reason to not use standard shadowing in the `Font.spaceWidth` method. Furthermore, considering the amount of other changes to the code-base over the years it's not even clear to me that the special-case was necessary any more (regardless of the preceding patches).	2020-04-05 16:27:50 +02:00
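"Shadowing" here means replacing a lazily computed getter with a plain data property after its first use. A minimal sketch follows, assuming a helper built on `Object.defineProperty`; the `shadow` function is written from scratch for illustration and may differ from the PDF.js helper, and `FontSketch`, `computeCount` and the fallback width of 250 are hypothetical.

```javascript
// Illustrative helper: define `prop` directly on `obj`, hiding any getter
// of the same name on the prototype, and return the value.
function shadow(obj, prop, value) {
  Object.defineProperty(obj, prop, {
    value,
    enumerable: true,
    configurable: true,
    writable: false,
  });
  return value;
}

class FontSketch {
  constructor(widths) {
    this.widths = widths;
    this.computeCount = 0; // only here to show the getter runs once
  }

  get spaceWidth() {
    this.computeCount++;
    // Hypothetical computation: width of char code 32, else a default.
    const width = this.widths[32] || 250;
    return shadow(this, "spaceWidth", width);
  }
}

const font = new FontSketch({ 32: 278 });
console.log(font.spaceWidth, font.spaceWidth, font.computeCount);
```

The second access reads the own property created by `shadow`, so the getter body is not executed again.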
Jonas Jenwald	a5e4cccf13	[api-minor] Prevent `Font.exportData` from exporting internal/unused properties A number of internal font properties, which only make sense on the worker-thread, were previously exported. Some of these properties could also contain potentially large Arrays/Objects, which thus unnecessarily increases memory usage since we're forced to copy these to the main-thread and also store them there. This patch stops exporting the following font properties: - "_shadowWidth": An internal property, which was never intended to be exported. - "charsCache": An internal cache, which was never intended to be exported and doesn't make any sense on the main-thread. Furthermore, by the time `Font.exportData` is called it's usually `undefined` or a mostly empty Object as well. - "cidEncoding": An internal property used with (some) composite fonts. As can be seen in the `PartialEvaluator.translateFont` method, `cidEncoding` will only be assigned a value when the font dictionary has an "Encoding" entry which is a `Name` (and not in the `Stream` case, since those obviously cannot be cloned). All-in-all this property doesn't really make sense on the main-thread and/or in the API, and note also that the resulting `cMap` property is (partially) available already. - "fallbackToUnicode": An internal map, part of the heuristics used to improve text-selection in (some) badly generated PDF documents with simple fonts. This was never intended to be exposed on the main-thread and/or in the API. - "glyphCache": An internal cache, which was never intended to be exported and which doesn't make any sense on the main-thread. Furthermore, by the time `Font.exportData` is called it's usually a mostly empty Object as well. - "isOpenType": An internal property, used only during font parsing on the worker-thread. In the very unlikely event that an API consumer actually needs that information, then `fontType` should be a (generally) much better property to use. 
Finally, in the (hopefully) unlikely event that any of these properties become necessary on the main-thread, re-adding them to the white-list is easy to do.	2020-04-05 16:27:50 +02:00
Jonas Jenwald	664f7de540	Change `Font.exportData` to use an explicit white-list of exportable properties This patch addresses an existing, and very long standing, TODO in the code such that it's no longer possible to send arbitrary/unnecessary font properties to the main-thread. Furthermore, by having a white-list it's also very easy to see exactly which font properties are being exported. Please note that in its current form, the list of exported properties contains every possible enumerable property that may exist in a `Font` instance. In practice no single font will contain all of these properties, and e.g. embedded/non-embedded/Type3 fonts will all differ slightly with respect to what properties are being defined. Hence only explicitly set properties are included in the exported data, to avoid half of them being `undefined`, which however should not be a problem for any existing consumer (since they'd already need to handle those cases). Since a fair number of these font properties are completely internal functionality, and don't make any sense to expose on the main-thread and/or in the API, follow-up patch(es) will be required to trim down the list. (I purposely included all properties here for brevity and future documentation purposes.)	2020-04-05 16:27:48 +02:00
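The white-list approach described above can be sketched as follows. This is not the PDF.js implementation: the `FontSketch` class and the contents of `EXPORT_DATA_PROPERTIES` are assumptions chosen for illustration (the real list is much longer).

```javascript
// Hypothetical white-list; the real one covers many more properties.
const EXPORT_DATA_PROPERTIES = ["name", "fontType", "vertical", "widths"];

class FontSketch {
  constructor(properties) {
    Object.assign(this, properties);
  }

  // Copy only white-listed properties, skipping those never set, so that
  // internal state (e.g. caches) cannot leak to the main-thread.
  exportData() {
    const data = Object.create(null);
    for (const property of EXPORT_DATA_PROPERTIES) {
      const value = this[property];
      if (value !== undefined) {
        data[property] = value;
      }
    }
    return data;
  }
}

const font = new FontSketch({ name: "Example", vertical: false, glyphCache: {} });
const exported = font.exportData();
console.log(Object.keys(exported)); // ["name", "vertical"]
```

Note that `glyphCache` is dropped because it is not on the list, while `fontType` and `widths` are omitted because they were never set.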
Tim van der Meij	09cccd8ecc	Merge pull request #11780 from Snuffleupagus/refactor-PDFViewerApplication-load Move the initialization of "page labels"/"metadata"/"auto print" out of `PDFViewerApplication.load`	2020-04-05 15:46:36 +02:00
Jonas Jenwald	9ef58347ed	A couple of small improvements of the `PDFViewerApplication.{_initializeMetadata, _initializePdfHistory}` methods - Use template strings when printing document/viewer information in `_initializeMetadata`, since the old format feels overly verbose. Also, get the WebGL state from the `BaseViewer` instance[1] rather than the `AppOptions`. Since the `AppOptions` value could theoretically have been changed (by the user) after the viewer components were initialized, it seems much more useful to print the actual value that'll be used during rendering. - Change `_initializePdfHistory` to actually do the "is embedded"-check first, in accordance with the comment and given that the "disableHistory" option usually shouldn't be set. --- [1] Admittedly reaching into the `BaseViewer` instance and just grabbing the value perhaps isn't a great approach overall, but given that the WebGL-backend isn't even on by default this probably doesn't matter too much.	2020-04-05 15:41:00 +02:00
Jonas Jenwald	b9add65099	Move the initialization of "auto print" out of `PDFViewerApplication.load` Over time, with more and more API-functionality added, the `PDFViewerApplication.load` method has become quite large and complex. In an attempt to improve the current situation somewhat, this patch moves the fetching and initialization of "auto print" out into its own (private) helper method instead.	2020-04-05 15:41:00 +02:00
Jonas Jenwald	d07be1a89b	Move the initialization of "metadata" out of `PDFViewerApplication.load` Over time, with more and more API-functionality added, the `PDFViewerApplication.load` method has become quite large and complex. In an attempt to improve the current situation somewhat, this patch moves the fetching and initialization of "metadata" out into its own (private) helper method instead.	2020-04-05 15:41:00 +02:00
Jonas Jenwald	32f1d0de76	Move the initialization of "page labels" out of `PDFViewerApplication.load` Over time, with more and more API-functionality added, the `PDFViewerApplication.load` method has become quite large and complex. In an attempt to improve the current situation somewhat, this patch moves the fetching and initialization of "page labels" out into its own (private) helper method instead.	2020-04-05 15:41:00 +02:00
Tim van der Meij	9dedaa5eb9	Merge pull request #11781 from Snuffleupagus/fix-gulp-jsdoc Update the "gulp jsdoc" task to account for API changes in the `mkdirp` package (PR 11772 follow-up)	2020-04-05 15:34:11 +02:00
Jonas Jenwald	f53e1409f6	Update the "gulp jsdoc" task to account for API changes in the `mkdirp` package (PR 11772 follow-up) I completely overlooked the fact that we had one occurrence of an asynchronous `mkdirp` call in the gulpfile, which thus breaks since the package now uses Promises rather than a callback function; sorry about that!	2020-04-05 12:20:10 +02:00
Tim van der Meij	702fec534d	Merge pull request #11769 from Snuffleupagus/charToGlyph-fontCharCode-range Ensure that `Font.charToGlyph` won't fail because `String.fromCodePoint` is given an invalid code point (issue 11768)	2020-04-04 14:36:52 +02:00
Jonas Jenwald	87142a635e	Ensure that `Font.charToGlyph` won't fail because `String.fromCodePoint` is given an invalid code point (issue 11768) Please note: This patch on its own is not sufficient to address the underlying problem in the referenced issue, hence why no test-case is included since the actual bug still needs to be fixed. As can be seen in the specification, https://tc39.es/ecma262/#sec-string.fromcodepoint, `String.fromCodePoint` will throw a RangeError for invalid code points. In the event that a CMap, in a composite font, contains invalid data and/or we fail to parse it correctly, it's thus possible that the glyph mapping that we build ends up with entries that cause `String.fromCodePoint` to throw and thus `Font.charToGlyph` to break. If that happens, as is the case in issue 11768, significant portions of a page/document may fail to render which seems very unfortunate. While this patch doesn't fix the underlying problem, it's hopefully deemed useful not only for the referenced issue but also to prevent similar bugs in the future.	2020-04-03 09:49:50 +02:00
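One way to guard against the RangeError mentioned above is to validate the code point before converting it. The sketch below is an assumption about how such a guard could look, not the code from the patch; `safeFromCodePoint` and its `fallback` parameter are hypothetical names.

```javascript
// Return the character for `codePoint`, or `fallback` when the value is
// outside the range accepted by `String.fromCodePoint` (integers from
// 0 to 0x10FFFF), instead of letting a RangeError propagate.
function safeFromCodePoint(codePoint, fallback = "") {
  if (
    Number.isInteger(codePoint) &&
    codePoint >= 0 &&
    codePoint <= 0x10ffff
  ) {
    return String.fromCodePoint(codePoint);
  }
  return fallback;
}

console.log(safeFromCodePoint(0x41)); // "A"
console.log(safeFromCodePoint(0x110000) === ""); // true
```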
Tim van der Meij	79a99737a0	Merge pull request #11772 from Snuffleupagus/update-packages Update packages and translations	2020-04-02 23:44:38 +02:00
Jonas Jenwald	9a3b52f52b	Update l10n files	2020-04-02 12:22:18 +02:00
Jonas Jenwald	7b7fe60210	Update the `mkdirp` package, since its major version was increased	2020-04-02 12:22:13 +02:00
Jonas Jenwald	412fec1545	Update `npm` packages	2020-04-02 12:13:14 +02:00
Tim van der Meij	7ed71a0d7c	Merge pull request #11771 from Snuffleupagus/issue-11762 Fail early, in modern `GENERIC` builds, if certain required browser functionality is missing (issue 11762)	2020-04-01 22:05:19 +02:00
Jonas Jenwald	710704508c	Fail early, in modern `GENERIC` builds, if certain required browser functionality is missing (issue 11762) With two kinds of builds now being produced, with/without transpilation/polyfills, it's unfortunately somewhat easy for users to accidentally pick the wrong one. In the case where a user would attempt to use a modern build of PDF.js in an older browser, such as e.g. IE11, the failure would be immediate when the code is loaded (given the use of unsupported ECMAScript features). However in some browsers/environments, in particular Node.js, a modern PDF.js build may load correctly and thus appear to function, only to fail for e.g. certain API calls. To hopefully lessen the support burden, and to try and improve things overall, this patch adds checks to ensure that a modern build of PDF.js cannot be used in browsers/environments which lack native support for critical functionality (such as e.g. `ReadableStream`). Hence we'll fail early, with an error message telling users to pick an ES5-compatible build instead. To ensure that we actually test things better especially w.r.t. usage of the PDF.js library in Node.js environments, the `gulp npm-test` task as used by Node.js/Travis was changed (back) to test an ES5-compatible build. (Since the bots still test the code as-is, without transpilation/polyfills, this shouldn't really be a problem as far as I can tell.) As part of these changes there's now both `gulp lib` and `gulp lib-es5` build targets, similar to e.g. the generic builds, which thanks to some re-factoring only required adding a small amount of code. Please note: While it's probably too early to tell if this will be a widespread issue, it's possible that this is the sort of patch that may warrant being `git cherry-pick`ed onto the current beta version (v2.4.456).	2020-04-01 19:42:48 +02:00
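A fail-early check of the kind described above could look roughly like the sketch below. It is an illustration under stated assumptions rather than the code from the patch: the function name, the `REQUIRED_GLOBALS` list (only `ReadableStream` is named in the commit message) and the error text are all hypothetical.

```javascript
// Hypothetical list of globals a "modern" build relies on.
const REQUIRED_GLOBALS = ["ReadableStream"];

// Throw immediately if `globalObject` lacks a required feature, rather
// than letting an arbitrary API call fail later on.
function assertModernEnvironment(globalObject) {
  const missing = REQUIRED_GLOBALS.filter(
    name => typeof globalObject[name] === "undefined"
  );
  if (missing.length > 0) {
    throw new Error(
      "Missing required functionality (" + missing.join(", ") +
      "); please use an ES5-compatible build instead."
    );
  }
}

// Simulated environments, so that the result doesn't depend on the host.
assertModernEnvironment({ ReadableStream: function () {} }); // passes
try {
  assertModernEnvironment({}); // simulates an older environment
} catch (ex) {
  console.log(ex.message);
}
```

In real use the check would receive the actual global object (e.g. `globalThis`) once, when the library is loaded.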
Tim van der Meij	ce1727626c	Merge pull request #11655 from Snuffleupagus/rm-getGlobalEventBus [api-minor] Remove the `getGlobalEventBus` viewer functionality, and the `eventBusDispatchToDOM` option/preference (PR 11631 follow-up)	2020-03-31 00:17:30 +02:00
Tim van der Meij	35c9f8de38	Merge pull request #11767 from Snuffleupagus/issue-11766 Replace the RTL images with CSS transforms of the standard images (issue 11766)	2020-03-30 23:53:49 +02:00
Jonas Jenwald	63efe61245	Replace the RTL images with CSS transforms of the standard images (issue 11766) This avoids unnecessary duplication of many images, thus reducing the size of PDF.js image resources slightly. Note that since the images should only be flipped horizontally, this required specifying the horizontal/vertical scaling separately for the hiDPI-images.	2020-03-30 22:47:49 +02:00
Jonas Jenwald	664b79abe0	[api-minor] Remove the `eventBusDispatchToDOM` option/preference, and thus the general ability to dispatch "viewer components" events to the DOM This functionality was only added to the default viewer for backwards compatibility and to support the various PDF viewer tests in mozilla-central, with the intention to eventually remove it completely. While the different mozilla-central tests cannot be easily converted from DOM events, it's however possible to limit that functionality to only MOZCENTRAL builds and when tests are running. Rather than depending on the re-dispatching of internal events to the DOM, the default viewer can instead be used in e.g. the following way:
```javascript
document.addEventListener("webviewerloaded", function() {
  PDFViewerApplication.initializedPromise.then(function() {
    // The viewer has now been initialized, and its properties can be accessed.
    PDFViewerApplication.eventBus.on("pagerendered", function(event) {
      console.log("Has rendered page number: " + event.pageNumber);
    });
  });
});
```
	2020-03-29 12:24:46 +02:00
Jonas Jenwald	7fd5f2dd61	[api-minor] Remove the `getGlobalEventBus` viewer functionality (PR 11631 follow-up) The correct/intended way of working with the "viewer components" is by providing an `EventBus` instance upon initialization, and the `getGlobalEventBus` was only added for backwards compatibility. Note, for example, that using `getGlobalEventBus` doesn't really work at all well with a use-case where there's multiple `PDFViewer` instances on one page, since it may then be difficult/impossible to tell which viewer a particular event originated from. All of the "viewer components" examples have been previously updated, such that there's no longer any code/examples which relies on the now removed `getGlobalEventBus` functionality.	2020-03-29 12:20:23 +02:00
Tim van der Meij	c12ea21c14	Merge pull request #11755 from Snuffleupagus/rm-fonts-sizes-encoding Remove the unused `sizes` and `encoding` properties on `Font` instances	2020-03-27 21:44:16 +01:00
Jonas Jenwald	14c999e3ee	Remove the unused `sizes` and `encoding` properties on `Font` instances The `sizes` property doesn't appear to have been used ever since the code was first split into main/worker-threads, which is so many years ago that I wasn't able to easily find exactly in which PR/commit it became unused. The `encoding` property is always assigned the `properties.baseEncoding` value, however the `PartialEvaluator` doesn't actually compute/set that value any more. Again it was difficult to determine when it became unused, but it's been that way for years.	2020-03-27 10:12:01 +01:00
Tim van der Meij	fa4b431091	Merge pull request #11745 from Snuffleupagus/eslint-no-shadow Enable the ESLint `no-shadow` rule	2020-03-25 22:48:07 +01:00
Tim van der Meij	ff0f9fd018	Merge pull request #11747 from gdh1995/fix-removing-wheel Add `passive: false` when removing wheel listeners	2020-03-25 22:37:41 +01:00
Tim van der Meij	8745286dc1	Merge pull request #11646 from Snuffleupagus/_setDocumentAllowFetchPages Ensure that automatic printing still works when the viewer and/or its pages are hidden (bug 1618621, bug 1618955)	2020-03-25 22:27:19 +01:00
gdh1995	a527eb8c92	Add `passive: false` when removing wheel listeners The code that listens for `wheel` events uses `{passive: false}`, while this argument will be treated as `true` before Firefox 49, according to https://developer.mozilla.org/en-US/docs/Web/API/EventTarget/addEventListener#Browser_compatibility . This commit adds it when removing wheel listeners, so that such listeners can actually be removed.	2020-03-25 22:42:27 +08:00
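A simple way to keep the add/remove calls symmetric is to share one options object between them. The sketch below is only an illustration: it uses a bare `EventTarget` in place of a DOM element, and the `WHEEL_OPTIONS` and `onWheel` names are hypothetical.

```javascript
// Shared options, so that add and remove always receive the same argument.
const WHEEL_OPTIONS = { passive: false };

let calls = 0;
function onWheel() {
  calls++;
}

// A plain EventTarget stands in for the viewer element here.
const target = new EventTarget();

target.addEventListener("wheel", onWheel, WHEEL_OPTIONS);
target.dispatchEvent(new Event("wheel")); // listener runs

target.removeEventListener("wheel", onWheel, WHEEL_OPTIONS);
target.dispatchEvent(new Event("wheel")); // listener no longer runs

console.log(calls);
```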
Jonas Jenwald	fdfcde2b40	Remove a spurious `console.log` from the `ChromiumBrowser` function in `test/webbrowser.js` file This looks entirely like something which was left-over from debugging, and that line hasn't been touched since PR 4515, especially considering that the corresponding branch in `FirefoxBrowser` doesn't print anything.	2020-03-25 11:57:12 +01:00
Jonas Jenwald	dcb16af968	Whitelist closure related cases to address the remaining `no-shadow` linting errors Given the way that "classes" were previously implemented in PDF.js, using regular functions and closures, there's a fair number of false positives when the `no-shadow` ESLint rule was enabled. Note that while some of these `eslint-disable` statements can be removed if/when the relevant code is converted to proper `class`es, we'll probably never be able to get rid of all of them given our naming/coding conventions (however I don't really see this being a problem).	2020-03-25 11:57:12 +01:00
Jonas Jenwald	1d2f787d6a	Enable the ESLint `no-shadow` rule This rule is not currently enabled in mozilla-central, but it appears commented out[1] in the ESLint definition file; see https://searchfox.org/mozilla-central/rev/c80fa7258c935223fe319c5345b58eae85d4c6ae/tools/lint/eslint/eslint-plugin-mozilla/lib/configs/recommended.js#238-239 Unfortunately this rule is, for fairly obvious reasons, impossible to `--fix` automatically (even partially) and each case thus required careful manual analysis. Hence this ESLint rule is, by some margin, probably the most difficult one that we've enabled thus far. However, using this rule does seem like a good idea in general since allowing variable shadowing could lead to subtle (and difficult to find) bugs or at the very least confusing code. Please find additional details about the ESLint rule at https://eslint.org/docs/rules/no-shadow --- [1] Most likely, a very large number of lint errors have prevented this rule from being enabled thus far.	2020-03-25 11:56:05 +01:00
Tim van der Meij	475fa1f97f	Merge pull request #11744 from janpe2/cff-glyph-zero The first glyph in CFF CIDFonts must be named 0 instead of ".notdef"	2020-03-24 23:52:21 +01:00
Tim van der Meij	292b77fe7b	Merge pull request #11707 from Snuffleupagus/issue-11694 Always prefer the PDF.js JPEG decoder for very large images, in order to reduce peak memory usage (issue 11694)	2020-03-24 23:51:31 +01:00
Tim van der Meij	f85105379e	Merge pull request #11738 from Snuffleupagus/no-shadow-src-core Remove variable shadowing from the JavaScript files in the `src/core/` folder	2020-03-24 23:10:37 +01:00
Tim van der Meij	c54e773637	Merge pull request #11742 from Snuffleupagus/no-shadow-test-unit Remove variable shadowing from the JavaScript files in the `test/unit/` folder	2020-03-24 22:44:23 +01:00
Jonas Jenwald	a24ad28d75	Rename `BaseViewer._setDocumentViewerElement` to `BaseViewer._viewerElement` It was pointed out that the old name felt confusing, so let's just rename the getter since it's an internal property anyway.	2020-03-24 16:54:37 +01:00


12595 Commits