pdf.js

Author	SHA1	Message	Date
Tim van der Meij	24f80f1e38	Enable the `no-var` linting rule in `src/core/primitives.js`	2021-02-27 12:51:01 +01:00
Tim van der Meij	ed33727419	Enable the `no-var` linting rule in `src/core/glyphlist.js`	2021-02-27 12:46:57 +01:00
Tim van der Meij	e051d4d029	Enable the `no-var` linting rule in `src/core/ccitt_stream.js`	2021-02-27 12:44:55 +01:00
Tim van der Meij	0897dddbbe	Enable the `no-var` linting rule in `src/core/unicode.js`	2021-02-27 12:44:50 +01:00
Tim van der Meij	cb82dda755	Enable the `no-var` linting rule in `src/core/metrics.js`	2021-02-27 12:44:45 +01:00
Tim van der Meij	55786a4880	Merge pull request #13026 from Snuffleupagus/crypto-classes Convert code in `src/core/crypto.js` to use "normal" classes	2021-02-26 22:39:30 +01:00
Jonas Jenwald	6b4c4f80e4	Convert code in `src/core/crypto.js` to use "normal" classes All of this code predates the existence of native JS classes, however we can now clean this up a bit. This patch thus let us remove some variable "shadowing" from the code.	2021-02-26 15:51:45 +01:00
Jonas Jenwald	b884757873	Inline the `concatArrays` function in `calculatePDF20Hash` This helper function is first of all only called twice, and secondly it also leads to unnecessary intermediate allocations given how the `TypedArray`s are handled. Hence we can simply inline this small function, and thus directly allocate the combined `TypedArray` instead.	2021-02-26 15:51:39 +01:00
Jonas Jenwald	9a9a5b2365	Replace the `compareByteArrays` functions, in `src/core/crypto.js`, with the `isArrayEqual` helper function The `compareByteArrays` is first of all duplicated in multiple closures in the `src/core/crypto.js` file. Secondly, despite its name, it's also functionally equivalent to the now existing `isArrayEqual` helper function. The `isArrayEqual` helper function is changed to use a standard `for`-loop, rather than `Array.prototype.every`, since that ought to be slightly more efficient given that we're now using it with (potentially) larger data.	2021-02-26 15:51:32 +01:00
Jonas Jenwald	e69e8622a9	Convert code in `src/core/function.js` to use "normal" classes All of this code predates the existence of native JS classes, however we can now clean this up a bit. This patch thus let us remove some variable "shadowing" from the code.	2021-02-26 13:20:59 +01:00
calixteman	45329af926	XFA -- Add support for SOM expressions (#12983 ) - specifications: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=87; - add a parser for SOM expressions; - add search functions to resolve those expressions; - search functions will be used to bind data into template.	2021-02-24 10:13:02 +01:00
Jonas Jenwald	e9038cc3d1	Send the `AnnotationStorage`-data to the worker-thread as a `Map` Rather than converting the `AnnotationStorage`-data to an Object, before sending it to the worker-thread, we should be able to simply send the internal `Map` directly. The "structured clone algorithm" doesn't have a problem with `Map`s, however the `LoopbackPort` used when workers are disabled (e.g. in Node.js environments) didn't use to support them. With PR 12997 having lifted that restriction, we should now be able to simply send the `AnnotationStorage`-data as-is rather than having to iterate through it to first create an Object. Please note: The changes in `src/core/annotation.js` could have been a lot more compact if we were able to use optional chaining in the `src/core` folder. Unfortunately that's still not possible, since SystemJS is being used in the development viewer (i.g. `gulp server`) and fixing that is still blocked by [bug 1247687](https://bugzilla.mozilla.org/show_bug.cgi?id=1247687).	2021-02-18 17:13:43 +01:00
calixteman	0fa9976268	XFA - Add support for prototypes (#12979 ) - specifications: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=225&zoom=auto,-207,784 - add a clone method on nodes in order to be able to clone a proto; - support ids in template namespace; - prevent from cycle when applying protos.	2021-02-18 10:32:25 +01:00
Tim van der Meij	4619b1b568	Merge pull request #12997 from Snuffleupagus/metadata-worker Move the Metadata parsing to the worker-thread	2021-02-17 20:57:46 +01:00
calixteman	b5be515375	XFA - Add a lexer/parser for FormCalc language (#12936 ) - the language specifications are: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=1049 - it can be used to: * as a scripting language for calculation, validations, ... * in SOM expressions to select nodes: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=101	2021-02-17 20:28:06 +01:00
Jonas Jenwald	d366bbdf51	Move the `encodeToXmlString` helper function to `src/core/core_utils.js` With the previous patch this function is now only accessed on the worker-thread, hence it's no longer necessary to include it in the built `pdf.js` file.	2021-02-17 13:12:01 +01:00
Jonas Jenwald	b66f294f64	Move the XML-parser to the `src/core/`-folder With the previous patch this functionality is now only accessed on the worker-thread, hence it's no longer necessary to include it in the built `pdf.js` file.	2021-02-17 13:12:01 +01:00
Jonas Jenwald	cc3a6563ee	Move the Metadata parsing to the worker-thread The only reason, as far as I can tell, for parsing the Metadata on the main-thread is how it was originally implemented. When Metadata support was first implemented, it utilized the [`DOMParser`](https://developer.mozilla.org/en-US/docs/Web/API/DOMParser) which isn't available in workers. Today, with the custom XML-parser being used, that's no longer an issue and it seems reasonable to move the Metadata parsing to the worker-thread[1], since that's where all parsing should happen (for performance reasons). Based on these changes, we'll be able to reduce the now unnecessary duplication of the XML-parser (and related code) in both of the built `pdf.js`/`pdf.worker.js` files. Finally, this patch changes the `_repair` method to use "Array + join" rather than string concatenation. --- [1] This needed the previous patch, to enable sending of `Map`s between threads with workers disabled.	2021-02-17 13:12:01 +01:00
Calixte Denizet	0fc8267576	Avoid infinite loop when getting annotation field name - aims to fix issue #12963; - use a Set to track already visited objects; - remove the loop limit in getInheritableProperty and use a RefSet too.	2021-02-14 19:58:19 +01:00
Jonas Jenwald	1ee747a620	Remove unneeded `instanceof MissingDataException` checks The following checks are all unneeded, and could easily cause confusion when reading the code. (All of them are my fault as well, since I've sometimes added those checks without really thinking about the surrounding code.) - In `PartialEvaluator.hasBlendModes` there cannot be any `MissingDataException`s thrown, given that the `Page.getOperatorList` method waits for all the necessary /Resources to load first. Furthermore, note also that if an error is thrown from `PartialEvaluator.hasBlendModes` then it'd completely break rendering of that page, since any errors thrown from `Page.getOperatorList` are simply sent to the main-thread. - In `PartialEvaluator.handleColorN` there cannot be any `MissingDataException`s thrown, given that again the `Page.getOperatorList` method waits for all the necessary /Resources to load before operatorList parsing starts. - In `XRef.readXRef` there cannot be any `MissingDataException`s thrown, given that we're explicitly requesting (and waiting for) the entire document in `pdfManagerReady` (in `src/core/worker.js`) before re-parsing of a corrupt document starts.	2021-02-13 12:26:05 +01:00
Calixte Denizet	ea06bb0e36	[api-minor] Annotation -- Don't compute appearance when nothing has changed * don't set a value in annotationStorage by default: - having an undefined when the annotation is rendered for saving/printing means nothing has changed so use normal appearance - aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1681687 * change the way to compute font size when this one is null in DA: - make fontSize proportional to line height - in multiline case, take into account the number of lines for text entered to adapt the font size	2021-02-12 19:27:21 +01:00
calixteman	0479deef4e	XFA -- Add other objects (#12949 ) - connectionSet: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=969 - datasets: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=1038 - signature: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=1040 - stylesheet: the same - xhtml: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=1187	2021-02-11 12:30:37 +01:00
calixteman	3787bd41ef	XFA -- Add localset object (#12948 ) - Specifications: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=943	2021-02-10 18:04:43 +01:00
Jonas Jenwald	0068dba009	[api-minor] Rename `-es5` to `-legacy`, to reduce confusion over what's actually supported (issue 12976) Please note that this will also require some edits of the Wiki.	2021-02-10 16:01:59 +01:00
Jonas Jenwald	31098c404d	Use `Math.hypot`, instead of `Math.sqrt` with manual squaring (#12973 ) When the PDF.js project started `Math.hypot` didn't exist yet, and until recently we still supported browsers (IE 11) without a native `Math.hypot` implementation; please see this compatibility information: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/hypot#browser_compatibility Furthermore, somewhat recently there were performance improvements of `Math.hypot` in Firefox; see https://bugzilla.mozilla.org/show_bug.cgi?id=1648820 Finally, this patch also replaces a couple of multiplications with the exponentiation operator.	2021-02-10 12:28:49 +01:00
Jonas Jenwald	e6fe8a7d53	Handle errors gracefully, in `PartialEvaluator.translateFont`, when fetching the font file (issue 9462) The third page of the referenced PDF document currently fails to render completely, since one of its font files fail to load. Since that error isn't handled, a large part of the text is thus missing which looks quite bad. By "replacing" the font data with an empty stream, we'll thus be able to fallback to rendering the text with a standard font (instead of using `ErrorFont`). While there's obviously no guarantee that things will look perfect, actually rendering the text at all should be an improvement in general. Also, print a warning in `PartialEvaluator.loadFont` when the `PartialEvaluator.translateFont` method rejects, since that'd have helped debug/fix the issue faster.	2021-02-06 19:44:53 +01:00
Jonas Jenwald	d3e65f24e3	Request all data, rather than throwing, when encountering general errors in `ObjectLoader._walk` (issue 9462, PR 3289 follow-up) As far as I can tell, this has been broken ever since PR 3289 (back in 2013) without anyone noticing. For any non-`MissingDataException` errors encountered in `ObjectLoader._walk`, we're simply throwing immediately which thus has the potential to completely break rendering of an entire page. In practice this is obviously only an issue for PDF documents which are in one way or another corrupt, since that's the only way that `XRef.fetch` will throw non-`MissingDataException` errors. To make matters worse these errors are intermittent, since they can only occur if the document is still loading when the `ObjectLoader`-code runs (note the early return in `ObjectLoader.load`). Please note that we cannot simply catch the error and let "normal" parsing continue in `ObjectLoader._walk`, since that could lead to errors elsewhere given that resources "below" the current one (in the graph) might not be checked as intended then. All-in-all, the only way to make absolutely sure that we won't cause unexpected `MissingDataException`s somewhere else in the code-base is to fallback to fetching the entire document in this edge-case.	2021-02-06 14:33:50 +01:00
Brendan Dahl	a392082e30	Merge pull request #12944 from calixteman/xfa_config XFA -- Update config object	2021-02-05 15:06:09 -08:00
Calixte Denizet	9d47e69771	XFA -- Update config object	2021-02-05 19:22:51 +01:00
Calixte Denizet	652ff57897	XFA -- Add template object - Specifications: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=596	2021-02-03 21:05:10 +01:00
Calixte Denizet	7e0554afe2	XFA -- Add attributes and children in XFAObject - in order to evaluate SOM expressions nodes and their attributes must be checked in the same order as in the xml; - add an object XFAObjectArray with a parameter max to handle multiple children with the same name.	2021-02-03 18:56:00 +01:00
Calixte Denizet	0ff5cd7eb5	XFA - Add a parser for XFA files - the parser is base on a class extending XMLParserBase - it handle xml namespaces: * each namespace is assocated with a builder * builder builds nodes belonging to the namespace * when a node is inserted in the parent namespace compatibility is checked (if required) - to avoid name collision between xml names and object properties, use Symbol.	2021-02-01 13:45:31 +01:00
Tim van der Meij	e4e92d10e8	Merge pull request #12922 from Snuffleupagus/getTextContent-globalImageCache Ignore globally cached images in `PartialEvaluator.getTextContent` (PR 11930 follow-up)	2021-01-28 23:44:10 +01:00
Tim van der Meij	8805614a03	Merge pull request #12924 from brendandahl/fix-clone Fix font data clone error when pdfBug is enabled.	2021-01-28 23:42:12 +01:00
Jonas Jenwald	72da2aa166	Ignore globally cached images in `PartialEvaluator.getTextContent` (PR 11930 follow-up) Given that we'll only cache `/XObject`s of the `Image`-type globally, we can utilize that in `PartialEvaluator.getTextContent` as well. This way, in cases such as e.g. issue 12098, we can avoid having to fetch/parse `/XObject`s that we already know to be `Image`s. This is helpful, since `Stream`s are not cached on the `XRef` instance (given their potential size) and the lookup can thus be somewhat expensive in general. Also, skip a redundant `RefSetCache.has` check in the `GlobalImageCache.getData` method.	2021-01-28 10:19:26 +01:00
Brendan Dahl	52fb5abb0b	Fix font data clone error when pdfBug is enabled. The widths property should be an object to match what metrics returns. In ZapfDingbats.pdf I was getting a data clone error with pdfBug enabled. In buildCharCodeToWidth() there was an encoding with the name "at" which is also the name of a method on an array. buildCharCodeToWidth assumes an object is passed in, so when it checked for the "at" property, it found the method and copied it over. This only seemed to affect Firefox.	2021-01-27 14:38:43 -08:00
Jonas Jenwald	1ab6d2c604	Improve global image caching for small images (PR 11912 follow-up, issue 12098) When implementing the `GlobalImageCache` functionality I was mostly worried about the effect of very large images, hence the maximum number of cached images were purposely kept quite low[1]. However, there's one fairly obvious problem with that approach: In documents with hundreds, or even thousands, of small images the `GlobalImageCache` as implemented becomes essentially pointless. Hence this patch, where the `GlobalImageCache`-implementation is changed in the following ways: - We're still guaranteed to be able to cache a minimum number of images, set to `10` (similar as before). - If the total size of all the cached image data is below a threshold[2], we're allowed to cache additional images. This patch thus improve, but doesn't completely fix, issue 12098. Note that that document is created by a very poor PDF generator, since every single page contains the entire document (with all of its /Resources) and to create the individual pages clipping is used.[3] --- [1] Currently set to `10` images; imagine what would happen to overall memory usage if we encountered e.g. 50 images each 10 MB in size. [2] This value was chosen, somewhat randomly, to be `40` megabytes; basically five times the [maximum individual image size per page](`6249ef517d/src/display/api.js (L2483-L2484)`). [3] This surely has to be some kind of record w.r.t. how badly PDF generators can mess things up...	2021-01-26 12:00:12 +01:00
calixteman	a3f6882b06	JS -- add support for choice widget (#12826 )	2021-01-25 23:40:57 +01:00
Tim van der Meij	25b84ce84c	Merge pull request #12828 from dhufnagel/feature/annotation_layer_display_fontsize [api-minor] Set font size and color for text widget annotations	2021-01-23 16:08:07 +01:00
Jonas Jenwald	6bcb4e3ad9	Ensure that `parseDefaultAppearance` won't attempt to access a not yet defined variable (PR 12831 follow-up) Note how, in the `if (this.stateManager.stateStack.length !== 0) {` branch, we're attempting to access the not yet defined variable[1] `args`. If this code-path is ever hit, an Error will be thrown and parsing will thus be aborted immediately (likely leading to e.g. rendering bugs). Note that I found this purely by accident, since I happened to glance at the LGTM report. However, I've since found that the error is also present during the unit-test[2] and with this patch we're actually testing the intended thing here. As part of fixing this, and to avoid re-introducing a similar bug in the future, we'll now instead always reset `args.length` before attempting to read the next operator. Also, we can use the existing `EvaluatorPreprocessor.savedStatesDepth` getter to simplify the save/restore detection a tiny bit. --- [1] The ESLint rule `no-use-before-define` would have helped catch this problem, but unfortunately we cannot enable that without quite a bit of refactoring all over the code-base. [2] The unit-test was updated such that it would fail in the `master`-branch.	2021-01-23 15:33:28 +01:00
Dominik Hufnagel	c5083cda02	set font size and color on annotation layer use the default appearance to set the font size and color of a text annotation widget	2021-01-22 23:12:14 +01:00
Tim van der Meij	6ffb6b1c0c	Merge pull request #12885 from Snuffleupagus/worker-tweak-caching Simplify the `PDFFunctionFactory._localFunctionCache` initialization (PR 12034 follow-up); Fix the `gStateObj` lookup in `TranslatedFont._removeType3ColorOperators` (PR 12718 follow-up)	2021-01-22 20:24:33 +01:00
Jonas Jenwald	ca1f58ea42	Use `_defaultAppearanceData` directly in `WidgetAnnotation._getSaveFieldResources` (PR 12831 follow-up) With the changes in PR 12831, it's no longer necessary to keep track of the `fontName`-string separately since it's available through the `_defaultAppearanceData`-property as well.	2021-01-22 13:23:04 +01:00
Jonas Jenwald	8137c0547d	Fix the `gStateObj` lookup in `TranslatedFont._removeType3ColorOperators` (PR 12718 follow-up) As can be seen in `2cba290361/src/core/evaluator.js (L986)` the `gStateObj` (which is actually an Array despite its name), is wrapped in Array when it's inserted into the OperatorList. Hence we obviously need to take this into account when accessing it in `TranslatedFont._removeType3ColorOperators`; this mistake happened because we don't have any test-cases for this particular code-path as far as I know.	2021-01-22 12:27:38 +01:00
Jonas Jenwald	cfaf23dee2	Simplify the `PDFFunctionFactory._localFunctionCache` initialization (PR 12034 follow-up) By changing this a `shadow`ed getter, we can simply access it directly and not worry about it being initialized. I have no idea why I didn't just implement it this way in the first place.	2021-01-22 12:25:05 +01:00
Brendan Dahl	2cba290361	Merge pull request #12836 from calixteman/update_buttons JS -- update radio/checkbox values even if there are no actions	2021-01-21 14:00:26 -08:00
calixteman	1039698697	Add a parser to get font data from the default appearance (#12831 ) * Add a parser to get font data from the default appearance - pdfium & poppler use a special parser too to get these info. * Update src/core/default_appearance.js Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com> Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>	2021-01-21 20:15:31 +01:00
Jonas Jenwald	b4eb55250e	Remove redundant compatibility checks, for modern `generic` builds, in `src/core/worker.js` With the recent additions of optional chaining and nullish coalescing to the PDF.js code-base, a couple of the checks in `src/core/worker.js` are now redundant; please see this compatibility information: - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Optional_chaining#browser_compatibility - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Nullish_coalescing_operator#browser_compatibility In practice, for the non-translated/non-polyfilled PDF.js builds, browsers without support for optional chaining and nullish coalescing will simply throw immediately upon loading of the code. Hence both the `globalThis` and `Promise.allSettled` checks are now unnecessary, given this compatibility information: - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/globalThis#browser_compatibility - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/allSettled#browser_compatibility Please note: The `ReadableStream` check is however still necessary, since Node.js doesn't support that.	2021-01-20 13:09:56 +01:00
Jonas Jenwald	2600e59acb	Always re-measure non-embedded ArialNarrow fonts (bug 1671312, PR 12725 follow-up) While PR 12725 fixed bug 1671312 as reported, i.e. the "In the upper right corner "Purposes' has bad kerning."-part, it however broke other parts of the text rendering. Note in particular the tables, e.g. on page 2 and beyond, where the glyphs are now rendered too close together. The reason for this is that the fonts in question are non-embedded ArialNarrow, which we just replace with Helvetica which obviously is not narrow. Given that the font replacement isn't a perfect fit for non-embedded ArialNarrow, we still need to re-measure the glyph widths in this case.	2021-01-14 15:51:48 +01:00
calixteman	1de1ae0be6	Merge pull request #12838 from calixteman/authors [api-minor] Change the "dc:creator" Metadata field to an Array	2021-01-12 02:44:58 -08:00

1 2 3 4 5 ...

1968 Commits