Sakurai/pdf.js - pdf.js - Gitea on kemo

Sakurai/pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	de628cec59	Some `hasJSActions`, and general annotation-code, related cleanup in the viewer and API - Add support for logical assignment operators, i.e. `&&=`, `\|\|=`, and `??=`, with a Babel-plugin. Given that these required incrementing the ECMAScript version in the ESLint and Acorn configurations, and that platform/browser support is still fairly limited, always transpiling them seems appropriate for now. - Cache the `hasJSActions` promise in the API, similar to the existing `getAnnotations` caching. With this implemented, the lookup should now be cheap enough that it can be called unconditionally in the viewer. - Slightly improve cleanup of resources when destroying the `WorkerTransport`. - Remove the `annotationStorage`-property from the `PDFPageView` constructor, since it's not necessary and also brings it more inline with the `BaseViewer`. - Update the `BaseViewer.createAnnotationLayerBuilder` method to actaually agree with the `IPDFAnnotationLayerFactory` interface.[1] - Slightly tweak a couple of JSDoc comments. --- [1] We probably ought to re-factor both the `IPDFTextLayerFactory` and `IPDFAnnotationLayerFactory` interfaces to take parameter objects instead, since especially the `IPDFAnnotationLayerFactory` one is becoming quite unwieldy. Given that that would likely be a breaking change for any custom viewer-components implementation, this probably requires careful deprecation.	2020-11-14 13:58:35 +01:00
Calixte Denizet	a5279897a7	JS -- Add listener for sandbox events only if there are some actions * When no actions then set it to null instead of empty object * Even if a field has no actions, it needs to listen to events from the sandbox in order to be updated if an action changes something in it.	2020-11-09 18:37:59 +01:00
Jonas Jenwald	1dad255784	Convert files in the `src/display/`-folder to use optional chaining where possible By using optional chaining, see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Optional_chaining, it's possible to reduce unnecessary code-repetition in many cases. Note that these changes also reduce the size of the built `pdf.js` file, when `SKIP_BABEL == true` is set, and for the `MOZCENTRAL` build-target that result in a `0.1%` filesize reduction from a simple and mostly mechanical code change.	2020-11-07 13:22:06 +01:00
Tim van der Meij	e341e6e542	Merge pull request #12525 from brendandahl/mark-info [api-minor] Implement API to get MarkInfo from the catalog.	2020-10-31 00:05:19 +01:00
Brendan Dahl	f5c821e9c3	[api-minor] Implement API to get MarkInfo from the catalog.	2020-10-30 10:59:45 -07:00
Jonas Jenwald	c293fc2b8f	Add (some) optional chaining usage in `src/display/api.js` Since we no longer use SystemJS to load the unit-tests, there's now nothing that prevents us from using optional chaining and nullish coalescing in the `src/display/` directory.	2020-10-26 11:11:48 +01:00
Jonas Jenwald	d9084c0be2	Load the fake worker, in non-`PRODUCTION` mode, with native async `import` This removes the last SystemJS usage from both the API and the default viewer.	2020-10-26 11:11:48 +01:00
Calixte Denizet	c30a3a94f0	JS - Add a function in api to get the fields ids in AcroForm::CO	2020-10-17 12:56:40 +02:00
Jonas Jenwald	3351d3476d	Don't store complex data in `PDFDocument.formInfo`, and replace the `fields` object with a `hasFields` boolean instead This patch is based on a couple of smaller things that I noticed when working on PR 12479. - Don't store the /Fields on the `formInfo` getter, since that feels like overloading it with unintended (and too complex) data, and utilize a `hasFields` boolean instead. This functionality was originally added in PR 12271, to help determine what kind of form data a PDF document contains, and I think that we should ensure that the return value of `formInfo` only consists of "simple" data. With these changes the `fieldObjects` getter instead has to look-up the /Fields manually, however that shouldn't be a problem since the access is guarded by a `formInfo.hasFields` check which ensures that the data both exists and is valid. Furthermore, most documents doesn't even have any /AcroForm data anyway. - Determine the `hasFields` property first, to ensure that it's always correct even if there's errors when checking e.g. the /XFA or /SigFlags entires, since the `fieldObjects` getter depends on it. - Simplify a loop in `fieldObjects`, since the object being accessed is a `Map` and those have built-in iteration support. - Use a higher logging level for errors in the `formInfo` getter, and include the actual error message, since that'd have helped with fixing PR 12479 a lot quicker. - Update the JSDoc comment in `src/display/api.js` to list the return values correctly, and also slightly extend/improve the description.	2020-10-16 12:47:27 +02:00
Calixte Denizet	71ecc3129b	Add the possibility to collect Javascript actions	2020-10-14 10:44:16 +02:00
Jonas Jenwald	2a8983d76b	Enable the ESLint `no-var` rule in the `src/display/` folder Previously this rule has been enabled in the `web/` folder, and in select files in the `src/` sub-folders. Note that a number of the files in the `src/display/` folder were already enforcing the `no-var` rule, and thanks to Prettier the necessary re-writing will be (mostly) handled automatically. Please find additional details about the ESLint rule at https://eslint.org/docs/rules/no-var	2020-10-02 16:16:23 +02:00
Jonas Jenwald	2393443e73	Include the `/Order` array, if available, when parsing the Optional Content configuration The `/Order` array is used to improve the display of Optional Content groups in PDF viewers, and it allows a PDF document to e.g. specify that Optional Content groups should be displayed as a (collapsable) tree-structure rather than as just a list. Note that not all available Optional Content groups must be present in the `/Order` array, and PDF viewers will often (by default) hide those toggles in the UI. To allow us to improve the UX around toggling of Optional Content groups, in the default viewer, these hidden-by-default groups are thus appended to the parsed `/Order` array under a custom nesting level (with `name == null`). Finally, the patch also slightly tweaks an `OptionalContentConfig` related JSDoc-comment in the API.	2020-08-30 16:28:40 +02:00
Jonas Jenwald	1f5021d76a	Prevent errors if `PDFDocumentProxy.saveDocument` is called without the `annotationStorage` parameter (PR 12241 follow-up) Obviously it doesn't make sense to call that method without providing an `AnnotationStorage`-instance, however we should ensure that doing so won't cause errors. Hence we need to check that `annotationStorage` is actually defined, before attempting to call its `resetModified` method.	2020-08-22 18:09:17 +02:00
Brendan Dahl	8023175103	Support file save triggered from the Firefox integrated version. Related to https://bugzilla.mozilla.org/show_bug.cgi?id=1659753 This allows Firefox trigger a "save" event from ctrl/cmd+s or the "Save Page As" context menu, which in turn lets pdf.js generate a new PDF if there is form data to save. I also now use `sourceEventType` on downloads so Firefox can determine if it should launch the "open with" dialog or "save as" dialog.	2020-08-20 18:05:08 -07:00
Aki Sasaki	83365a3756	confirm if leaving a modified form without saving	2020-08-20 17:23:06 -07:00
Jonas Jenwald	b26d736809	Ensure that the "DocException" message handler, in the API, will always either error or warn (depending on the build) if a valid `Error` isn't found Having this present would have made debugging issues 11941 and 12209 so much quicker and easier.	2020-08-13 13:17:30 +02:00
Calixte Denizet	1a6816ba98	Add support for saving forms	2020-08-12 10:32:59 +02:00
Jonas Jenwald	4d351eab93	A couple of (small) tweaks of the `AnnotationStorage` (PR 12173 follow-up) - Initialize the `AnnotationStorage`-instance, on `PDFDocumentProxy`, lazily. - Change the `AnnotationStorage` to use a `Map` internally, rather than a regular Object (simplifies the following points). - Let `AnnotationStorage.getAll` return `null` when there's no data stored, to avoid unnecessary parsing on the worker-thread. This ought to "just work", since the worker-thread code should already handle the `!annotationStorage` case everywhere. - Add a new `AnnotationStorage.size` getter, to be able to easily tell if there's any data stored.	2020-08-10 17:07:24 +02:00
Jonathan Grimes	ac723a1760	Allow loading pdf fonts into another document.	2020-08-08 02:52:32 +00:00
Takashi Tamura	4ac62d8787	Fix the type of PDFDocumentLoadingTask.destroy.	2020-08-07 16:10:19 +09:00
Jonas Jenwald	5e44b241b2	[api-minor] Fix the `annotationStorage` parameter in `PDFPageProxy.render` While the parameter name (clearly) suggests that an `AnnotationStorage`-instance is expected, looking at the only call-sites that include the parameter (i.e. the `PDFPrintServiceFactory` instances) it actually contains just a normal Object. Hence it seems much more reasonable to actually pass a valid `AnnotationStorage`-instance, as the name suggests, and simply have `PDFPageProxy.render` do the `annotationStorage.getAll()` call. (Since we cannot send an `AnnotationStorage`-instance as-is to the worker-thread, given the "structured clone algorithm".)	2020-08-05 23:02:30 +02:00
Takashi Tamura	a0f0ab78f3	Fix the type definition of TypedArray.	2020-08-05 17:01:08 +09:00
Tim van der Meij	56ca027c08	Improve consistency for the API documentation comments Over time we used multiple different formats for JSDoc comments. This commit standardizes those formats to the one we used most often. Moreover, this removes the example in the outline endpoint documentation since it now has a proper type definition and it didn't render correctly in JSDoc.	2020-08-04 23:27:22 +02:00
Tim van der Meij	ba4a07ce07	Fix incorrect types in the API documentation	2020-08-04 23:19:59 +02:00
Tim van der Meij	3116216e1d	Improve the API documentation for `PDFDocumentLoadingTask` This commit: - formats the documentation block according to the standards; - replaces the callback definitions with the `function` type (we have that for other definitions already and the callback type was not rendered correctly by JSDoc); - synchronizes the type documentation and the class documentation; - fixes the documentation by making it easier to read and making sure that all optional properties are indicated as such; - uses the `@link` tag to indicate links to other code. The `typestest` still passes and JSDoc now renders this class correctly.	2020-08-04 23:17:24 +02:00
Brendan Dahl	ac494a2278	Add support for optional marked content. Add a new method to the API to get the optional content configuration. Add a new render task param that accepts the above configuration. For now, the optional content is not controllable by the user in the viewer, but renders with the default configuration in the PDF. All of the test files added exhibit different uses of optional content. Fixes #269. Fix test to work with optional content. - Change the stopAtErrors test to ensure the operator list has something, instead of asserting the exact number of operators.	2020-08-04 09:26:55 -07:00
Tim van der Meij	00a8b42e67	Merge pull request #12102 from ineiti/add_types_annotations Add types annotations	2020-08-02 16:45:37 +02:00
Jonas Jenwald	05baa4c89f	Revert "[api-minor] Allow loading pdf fonts into another document."	2020-08-01 12:52:39 +02:00
Tim van der Meij	173b92a873	Merge pull request #12131 from jsg2021/issue-8271 [api-minor] Allow loading pdf fonts into another document.	2020-08-01 01:13:41 +02:00
Jonas Jenwald	6d192f987e	Prevent `Uncaught (in promise) AbortException` when running the unit-tests These errors can/will occur if data is still loading when the document is destroyed, which is the case in the API unit-tests that load the `tracemonkey.pdf` file. While this patch prevents these kind of problems, and thus allows us to update Jasmine again, I cannot help but thinking that it's slightly "hacky". Basically, we'll simply catch and ignore (some) rejected promises once the document is destroyed and/or its data loading is aborted. However, I don't think that these changes should cause issues in general, since we don't really care about errors once document destruction has started (note e.g. the fair number of `catch` handlers ignoring `AbortException`s already).	2020-07-31 23:29:05 +02:00
Jonathan Grimes	9b16b8ef71	Allow loading pdf fonts into another document.	2020-07-31 11:41:48 -05:00
Linus Gasser	f1bbfdc16d	Add typescript definitions This PR adds typescript definitions from the JSDoc already present. It adds a new gulp-target 'types' that calls 'tsc', the typescript compiler, to create the definitions. To use the definitions, users can simply do the following: ``` import {getDocument, GlobalWorkerOptions} from "pdfjs-dist"; import pdfjsWorker from "pdfjs-dist/build/pdf.worker.entry"; GlobalWorkerOptions.workerSrc = pdfjsWorker; const pdf = await getDocument("file:///some.pdf").promise; ``` Co-authored-by: @oBusk Co-authored-by: @tamuratak	2020-07-30 11:10:37 +02:00
Jonas Jenwald	f3ff526019	Send/receive Type3 images the same way as other globally-cached images There's quite frankly no particular reason to special-case Type3-fonts with image resources, which are very rare anyway, now that we have a general mechanism for sending/receiving images globally.	2020-07-27 13:20:15 +02:00
Calixte Denizet	584902dbf8	Add an annotation storage in order to save annotation data in acroforms	2020-07-24 10:50:11 +02:00
Jonas Jenwald	4a7e29865d	[api-minor] Use the `NodeCanvasFactory`/`NodeCMapReaderFactory` classes as defaults in Node.js environments (issue 11900) This moves, and slightly simplifies, code that's currently residing in the unit-test utils into the actual library, such that it's bundled with `GENERIC`-builds and used in e.g. the API-code. As an added bonus, this also brings out-of-the-box support for CMaps in e.g. the Node.js examples.	2020-07-02 04:44:23 +02:00
Jonas Jenwald	4cb0c032f3	Convert the `PDFPageProxy.intentStates` property from an `Object` to a `Map` As can be seen in the code there's a handful of places where this structure needs to be iterated, something that becomes cumbersome when dealing with `Object`s. Hence, by changing this to a `Map` instead we can both simplify the code and avoid creating unnecessary closures. Particularily the `PDFPageProxy._tryCleanup` method becomes a lot more readable, at least in my opinion. Finally, since this property is intended to be "private" the name is adjusted to reflect that.	2020-06-21 17:02:42 +02:00
Jonas Jenwald	cabc2cc4fc	Add a `InternalRenderTask.completed` getter and use it to simplify `PDFPageProxy._destroy` This patch aims to simplify the `PDFPageProxy._destroy` method, by: - Replacing the unnecessary `forEach` with a "regular" `for`-loop instead. - Use a more appropriate variable name, since `intentState.renderTasks` contain instances of `InternalRenderTask`. - Move the "is rendering completed"-handling to a new `InternalRenderTask.completed` getter, to abstract away some (mostly) internal `InternalRenderTask` state.	2020-06-21 15:56:14 +02:00
Jonas Jenwald	64378fc366	[api-minor] Remove the deprecated `PDFDocumentProxy.getOpenActionDestination` method (PR 11644 follow-up) This method has been printing a `deprecated` warning in two releases, hence it should hopefully be safe to remove now.	2020-06-02 12:28:00 +02:00
Jonas Jenwald	18e0b10d3c	[api-minor] Remove the `disableCreateObjectURL` option from the `getDocument` parameters, since it's now unused in the API With the changes in previous patches, the `disableCreateObjectURL` option/functionality is no longer used for anything in the API and/or in the Worker code. Note however that there's some functionality, mainly related to file loading/downloading, in the GENERIC version of the default viewer which still depends on this option. Hence the `disableCreateObjectURL` option (and related compatibility code) is moved into the viewer, see e.g. `web/app_options.js`, such that it's still available in the default viewer.	2020-05-22 00:22:48 +02:00
Jonas Jenwald	cc4cc8b11b	Remove the, now unused, `releaseImageResources` helper function With the changes in the previous patch, this is now dead code which should thus be removed.	2020-05-22 00:22:48 +02:00
Jonas Jenwald	0351852d74	[api-minor] Decode all JPEG images with the built-in PDF.js decoder in `src/core/jpg.js` Currently some JPEG images are decoded by the built-in PDF.js decoder in `src/core/jpg.js`, while others attempt to use the browser JPEG decoder. This inconsistency seem unfortunate for a number of reasons: - It adds, compared to the other image formats supported in the PDF specification, a fair amount of code/complexity to the image handling in the PDF.js library. - The PDF specification support JPEG images with features, e.g. certain ColorSpaces, that browsers are unable to decode natively. Hence, determining if a JPEG image is possible to decode natively in the browser require a non-trivial amount of parsing. In particular, we're parsing (part of) the raw JPEG data to extract certain marker data and we also need to parse the ColorSpace for the JPEG image. - While some JPEG images may, for all intents and purposes, appear to be natively supported there's still cases where the browser may fail to decode some JPEG images. In order to support those cases, we've had to implement a fallback to the PDF.js JPEG decoder if there's any issues during the native decoding. This also means that it's no longer possible to simply send the JPEG image to the main-thread and continue parsing, but you now need to actually wait for the main-thread to indicate success/failure first. In practice this means that there's a code-path where the worker-thread is forced to wait for the main-thread, while the reverse should always be the case. - The native decoding, for anything except the simplest of JPEG images, result in increased peak memory usage because there's a handful of short-lived copies of the JPEG data (see PR 11707). Furthermore this also leads to data being parsed on the main-thread, rather than the worker-thread, which you usually want to avoid for e.g. performance and UI-reponsiveness reasons. - Not all environments, e.g. Node.js, fully support native JPEG decoding. This has, historically, lead to some issues and support requests. - Different browsers may use different JPEG decoders, possibly leading to images being rendered slightly differently depending on the platform/browser where the PDF.js library is used. Originally the implementation in `src/core/jpg.js` were unable to handle all of the JPEG images in the test-suite, but over the last couple of years I've fixed (hopefully) all of those issues. At this point in time, there's two kinds of failure with this patch: - Changes which are basically imperceivable to the naked eye, where some pixels in the images are essentially off-by-one (in all components), which could probably be attributed to things such as different rounding behaviour in the browser/PDF.js JPEG decoder. This type of "failure" accounts for the vast majority of the total number of changes in the reference tests. - Changes where the JPEG images now looks ever so slightly blurrier than with the native browser decoder. For quite some time I've just assumed that this pointed to a general deficiency in the `src/core/jpg.js` implementation, however I've discovered when comparing two viewers side-by-side that the differences vanish at higher zoom levels (usually around 200% is enough). Basically if you disable [this downscaling in canvas.js](`8fb82e939c/src/display/canvas.js (L2356-L2395)`), which is what happens when zooming in, the differences simply vanish! Hence I'm pretty satisfied that there's no significant problems with the `src/core/jpg.js` implementation, and the problems are rather tied to the general quality of the downscaling algorithm used. It could even be seen as a positive that all images now share the same downscaling behaviour, since this actually fixes one old bug; see issue 7041.	2020-05-22 00:22:48 +02:00
Jonas Jenwald	dda6626f40	Attempt to cache repeated images at the document, rather than the page, level (issue 11878) Currently image resources, as opposed to e.g. font resources, are handled exclusively on a page-specific basis. Generally speaking this makes sense, since pages are separate from each other, however there's PDF documents where many (or even all) pages actually references exactly the same image resources (through the XRef table). Hence, in some cases, we're decoding the same images over and over for every page which is obviously slow and wasting both CPU and memory resources better used elsewhere.[1] Obviously we cannot simply treat all image resources as-if they're used throughout the entire PDF document, since that would end up increasing memory usage too much.[2] However, by introducing a `GlobalImageCache` in the worker we can track image resources that appear on more than one page. Hence we can switch image resources from being page-specific to being document-specific, once the image resource has been seen on more than a certain number of pages. In many cases, such as e.g. the referenced issue, this patch will thus lead to reduced memory usage for image resources. Scrolling through all pages of the document, there's now only a few main-thread copies of the same image data, as opposed to one for each rendered page (i.e. there could theoretically be twenty copies of the image data). While this obviously benefit both CPU and memory usage in this case, for very large image data this patch may possibly increase persistent main-thread memory usage a tiny bit. Thus to avoid negatively affecting memory usage too much in general, particularly on the main-thread, the `GlobalImageCache` will only cache a certain number of image resources at the document level and simply fallback to the default behaviour. Unfortunately the asynchronous nature of the code, with ranged/streamed loading of data, actually makes all of this much more complicated than if all data could be assumed to be immediately available.[3] Please note: The patch will lead to small movement in some existing test-cases, since we're now using the built-in PDF.js JPEG decoder more. This was done in order to simplify the overall implementation, especially on the main-thread, by limiting it to only the `OPS.paintImageXObject` operator. --- [1] There's e.g. PDF documents that use the same image as background on all pages. [2] Given that data stored in the `commonObjs`, on the main-thread, are only cleared manually through `PDFDocumentProxy.cleanup`. This as opposed to data stored in the `objs` of each page, which is automatically removed when the page is cleaned-up e.g. by being evicted from the cache in the default viewer. [3] If the latter case were true, we could simply check for repeat images before parsing started and thus avoid handling any duplicate image resources.	2020-05-21 18:13:45 +02:00
Jonas Jenwald	8d56a69e74	Reduce usage of SystemJS, in the development viewer, even further With these changes SystemJS is now only used, during development, on the worker-thread and in the unit/font-tests, since Firefox is currently missing support for worker modules; please see https://bugzilla.mozilla.org/show_bug.cgi?id=1247687 Hence all the JavaScript files in the `web/` and `src/display/` folders are now loaded natively by the browser (during development) using standard `import` statements/calls, thanks to a nice `import-maps` polyfill. Please note: As soon as https://bugzilla.mozilla.org/show_bug.cgi?id=1247687 is fixed in Firefox, we should be able to remove all traces of SystemJS and thus finally be able to use every possible modern JavaScript feature.	2020-05-20 13:36:52 +02:00
Jonas Jenwald	d4d933538b	Re-factor `setPDFNetworkStreamFactory`, in src/display/api.js, to also accept an asynchronous function As part of trying to reduce the usage of SystemJS in the development viewer, this patch is a necessary step that will allow removal of some `require` statements. Currently this uses `SystemJS.import` in non-PRODUCTION mode, but it should be possible to replace those with standard dynamic `import` calls in the future.	2020-05-20 13:18:18 +02:00
Jonas Jenwald	e1f340a0c2	Use the ESLint `no-restricted-syntax` rule to ensure that `assert` is always called with two arguments Having `assert` calls without a message string isn't very helpful when debugging, and it turns out that it's easy enough to make use of ESLint to enforce better `assert` call-sites. In a couple of cases the `assert` calls were changed to "regular" throwing of errors instead, since that seemed more appropriate. Please find additional details about the ESLint rule at https://eslint.org/docs/rules/no-restricted-syntax	2020-05-05 13:40:05 +02:00
Jonas Jenwald	c355f91d2e	[api-minor] Immediately release the `font.data` property once the font been attached to the DOM (PR 11777 follow-up) This patch implements https://github.com/mozilla/pdf.js/pull/11777#issuecomment-609741348 This extends the work from PR 11773 and 11777 further, by immediately releasing the `font.data` property once the font been attached to the DOM. By not unnecessarily holding onto this data on the main-thread, we'll thus reduce the memory usage of fonts even further (especially beneficial in longer documents with composite fonts). The new behaviour is controlled by the recently added `fontExtraProperties` API option (adding a new option just for this patch didn't seem necessary), since there's one edge-case in the SVG renderer where the `font.data` property is necessary (see the `pdf2svg` example). Note that while the default viewer does run clean-up with an idle timeout, that timeout will be reset whenever rendering occurs or when scrolling happens in the viewer. In practice this means that unless the user doesn't interact with the viewer in any way during an extended period of time, currently set to 30 seconds, the `PDFDocumentProxy.cleanup` method will never be called and font resources will thus not be cleaned-up.	2020-04-23 13:04:57 +02:00
Jonas Jenwald	746eaf3154	[api-minor] Fix the return value of `PDFDocumentProxy.getViewerPreferences` when no viewer preferences are present (PR 10738 follow-up) This patch fixes yet another instalment in the never-ending series of "what the bleep was I thinking", by changing the `PDFDocumentProxy.getViewerPreferences` method to return `null` by default. Not only is this method now consistent with many other API methods, for the data not present case, but it also avoids having to e.g. loop through an object to check if it's actually empty (note the old unit-test).	2020-04-14 23:25:50 +02:00
Jonas Jenwald	426945b480	Update Prettier to version 2.0 Please note that these changes were done automatically, using `gulp lint --fix`. Given that the major version number was increased, there's a fair number of (primarily whitespace) changes; please see https://prettier.io/blog/2020/03/21/2.0.0.html In order to reduce the size of these changes somewhat, this patch maintains the old "arrowParens" style for now (once mozilla-central updates Prettier we can simply choose the same formatting, assuming it will differ here).	2020-04-14 12:28:14 +02:00
Jonas Jenwald	2d46230d23	[api-minor] Change `Font.exportData` to, by default, stop exporting properties which are completely unused on the main-thread and/or in the API (PR 11773 follow-up) For years now, the `Font.exportData` method has (because of its previous implementation) been exporting many properties despite them being completely unused on the main-thread and/or in the API. This is unfortunate, since among those properties there's a number of potentially very large data-structures, containing e.g. Arrays and Objects, which thus have to be first structured cloned and then stored on the main-thread. With the changes in this patch, we'll thus by default save memory for every `Font` instance created (there can be a lot in longer documents). The memory savings obviously depends a lot on the actual font data, but some approximate figures are: For non-embedded fonts it can save a couple of kilobytes, for simple embedded fonts a handful of kilobytes, and for composite fonts the size of this auxiliary can even be larger than the actual font program itself. All-in-all, there's no good reason to keep exporting these properties by default when they're unused. However, since we cannot be sure that every property is unused in custom implementations of the PDF.js library, this patch adds a new `getDocument` option (named `fontExtraProperties`) that still allows access to the following properties: - "cMap": An internal data structure, only used with composite fonts and never really intended to be exposed on the main-thread and/or in the API. Note also that the `CMap`/`IdentityCMap` classes are a lot more complex than simple Objects, but only their "internal" properties survive the structured cloning used to send data to the main-thread. Given that CMaps can often be very large, not exporting them can also save a fair bit of memory. - "defaultEncoding": An internal property used with simple fonts, and used when building the glyph mapping on the worker-thread. Considering how complex that topic is, and given that not all font types are handled identically, exposing this on the main-thread and/or in the API most likely isn't useful. - "differences": An internal property used with simple fonts, and used when building the glyph mapping on the worker-thread. Considering how complex that topic is, and given that not all font types are handled identically, exposing this on the main-thread and/or in the API most likely isn't useful. - "isSymbolicFont": An internal property, used during font parsing and building of the glyph mapping on the worker-thread. - "seacMap": An internal map, only potentially used with some Type1/CFF fonts and never intended to be exposed in the API. The existing `Font.{charToGlyph, charToGlyphs}` functionality already takes this data into account when handling text. - "toFontChar": The glyph map, necessary for mapping characters to glyphs in the font, which is built upon the various encoding information contained in the font dictionary and/or font program. This is not directly used on the main-thread and/or in the API. - "toUnicode": The unicode map, necessary for text-extraction to work correctly, which is built upon the ToUnicode/CMap information contained in the font dictionary, but not directly used on the main-thread and/or in the API. - "vmetrics": An array of width data used with fonts which are composite and vertical, but not directly used on the main-thread and/or in the API. - "widths": An array of width data used with most fonts, but not directly used on the main-thread and/or in the API.	2020-04-06 11:47:09 +02:00
Jonas Jenwald	dcb16af968	Whitelist closure related cases to address the remaining `no-shadow` linting errors Given the way that "classes" were previously implemented in PDF.js, using regular functions and closures, there's a fair number of false positives when the `no-shadow` ESLint rule was enabled. Note that while some of these `eslint-disable` statements can be removed if/when the relevant code is converted to proper `class`es, we'll probably never be able to get rid of all of them given our naming/coding conventions (however I don't really see this being a problem).	2020-03-25 11:57:12 +01:00

1 2 3 4 5 ...