pdf.js

Author	SHA1	Message	Date
Calixte Denizet	ea06bb0e36	[api-minor] Annotation -- Don't compute appearance when nothing has changed * don't set a value in annotationStorage by default: - having an undefined when the annotation is rendered for saving/printing means nothing has changed so use normal appearance - aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1681687 * change the way to compute font size when this one is null in DA: - make fontSize proportional to line height - in multiline case, take into account the number of lines for text entered to adapt the font size	2021-02-12 19:27:21 +01:00
calixteman	0479deef4e	XFA -- Add other objects (#12949) - connectionSet: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=969 - datasets: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=1038 - signature: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=1040 - stylesheet: the same - xhtml: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=1187	2021-02-11 12:30:37 +01:00
calixteman	3787bd41ef	XFA -- Add localset object (#12948) - Specifications: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=943	2021-02-10 18:04:43 +01:00
Jonas Jenwald	0068dba009	[api-minor] Rename `-es5` to `-legacy`, to reduce confusion over what's actually supported (issue 12976) Please note that this will also require some edits of the Wiki.	2021-02-10 16:01:59 +01:00
Jonas Jenwald	31098c404d	Use `Math.hypot`, instead of `Math.sqrt` with manual squaring (#12973) When the PDF.js project started `Math.hypot` didn't exist yet, and until recently we still supported browsers (IE 11) without a native `Math.hypot` implementation; please see this compatibility information: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/hypot#browser_compatibility Furthermore, somewhat recently there were performance improvements of `Math.hypot` in Firefox; see https://bugzilla.mozilla.org/show_bug.cgi?id=1648820 Finally, this patch also replaces a couple of multiplications with the exponentiation operator.	2021-02-10 12:28:49 +01:00
Jonas Jenwald	e6fe8a7d53	Handle errors gracefully, in `PartialEvaluator.translateFont`, when fetching the font file (issue 9462) The third page of the referenced PDF document currently fails to render completely, since one of its font files fails to load. Since that error isn't handled, a large part of the text is thus missing which looks quite bad. By "replacing" the font data with an empty stream, we'll thus be able to fall back to rendering the text with a standard font (instead of using `ErrorFont`). While there's obviously no guarantee that things will look perfect, actually rendering the text at all should be an improvement in general. Also, print a warning in `PartialEvaluator.loadFont` when the `PartialEvaluator.translateFont` method rejects, since that'd have helped debug/fix the issue faster.	2021-02-06 19:44:53 +01:00
Jonas Jenwald	d3e65f24e3	Request all data, rather than throwing, when encountering general errors in `ObjectLoader._walk` (issue 9462, PR 3289 follow-up) As far as I can tell, this has been broken ever since PR 3289 (back in 2013) without anyone noticing. For any non-`MissingDataException` errors encountered in `ObjectLoader._walk`, we're simply throwing immediately which thus has the potential to completely break rendering of an entire page. In practice this is obviously only an issue for PDF documents which are in one way or another corrupt, since that's the only way that `XRef.fetch` will throw non-`MissingDataException` errors. To make matters worse these errors are intermittent, since they can only occur if the document is still loading when the `ObjectLoader`-code runs (note the early return in `ObjectLoader.load`). Please note that we cannot simply catch the error and let "normal" parsing continue in `ObjectLoader._walk`, since that could lead to errors elsewhere given that resources "below" the current one (in the graph) might not be checked as intended then. All-in-all, the only way to make absolutely sure that we won't cause unexpected `MissingDataException`s somewhere else in the code-base is to fall back to fetching the entire document in this edge-case.	2021-02-06 14:33:50 +01:00
Brendan Dahl	a392082e30	Merge pull request #12944 from calixteman/xfa_config XFA -- Update config object	2021-02-05 15:06:09 -08:00
Calixte Denizet	9d47e69771	XFA -- Update config object	2021-02-05 19:22:51 +01:00
Calixte Denizet	652ff57897	XFA -- Add template object - Specifications: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=596	2021-02-03 21:05:10 +01:00
Calixte Denizet	7e0554afe2	XFA -- Add attributes and children in XFAObject - in order to evaluate SOM expressions nodes and their attributes must be checked in the same order as in the xml; - add an object XFAObjectArray with a parameter max to handle multiple children with the same name.	2021-02-03 18:56:00 +01:00
Calixte Denizet	0ff5cd7eb5	XFA - Add a parser for XFA files - the parser is based on a class extending XMLParserBase - it handles xml namespaces: * each namespace is associated with a builder * builder builds nodes belonging to the namespace * when a node is inserted in the parent namespace compatibility is checked (if required) - to avoid name collision between xml names and object properties, use Symbol.	2021-02-01 13:45:31 +01:00
Tim van der Meij	e4e92d10e8	Merge pull request #12922 from Snuffleupagus/getTextContent-globalImageCache Ignore globally cached images in `PartialEvaluator.getTextContent` (PR 11930 follow-up)	2021-01-28 23:44:10 +01:00
Tim van der Meij	8805614a03	Merge pull request #12924 from brendandahl/fix-clone Fix font data clone error when pdfBug is enabled.	2021-01-28 23:42:12 +01:00
Jonas Jenwald	72da2aa166	Ignore globally cached images in `PartialEvaluator.getTextContent` (PR 11930 follow-up) Given that we'll only cache `/XObject`s of the `Image`-type globally, we can utilize that in `PartialEvaluator.getTextContent` as well. This way, in cases such as e.g. issue 12098, we can avoid having to fetch/parse `/XObject`s that we already know to be `Image`s. This is helpful, since `Stream`s are not cached on the `XRef` instance (given their potential size) and the lookup can thus be somewhat expensive in general. Also, skip a redundant `RefSetCache.has` check in the `GlobalImageCache.getData` method.	2021-01-28 10:19:26 +01:00
Brendan Dahl	52fb5abb0b	Fix font data clone error when pdfBug is enabled. The widths property should be an object to match what metrics returns. In ZapfDingbats.pdf I was getting a data clone error with pdfBug enabled. In buildCharCodeToWidth() there was an encoding with the name "at" which is also the name of a method on an array. buildCharCodeToWidth assumes an object is passed in, so when it checked for the "at" property, it found the method and copied it over. This only seemed to affect Firefox.	2021-01-27 14:38:43 -08:00
Jonas Jenwald	1ab6d2c604	Improve global image caching for small images (PR 11912 follow-up, issue 12098) When implementing the `GlobalImageCache` functionality I was mostly worried about the effect of very large images, hence the maximum number of cached images was purposely kept quite low[1]. However, there's one fairly obvious problem with that approach: In documents with hundreds, or even thousands, of small images the `GlobalImageCache` as implemented becomes essentially pointless. Hence this patch, where the `GlobalImageCache`-implementation is changed in the following ways: - We're still guaranteed to be able to cache a minimum number of images, set to `10` (similar to before). - If the total size of all the cached image data is below a threshold[2], we're allowed to cache additional images. This patch thus improves, but doesn't completely fix, issue 12098. Note that that document is created by a very poor PDF generator, since every single page contains the entire document (with all of its /Resources) and to create the individual pages clipping is used.[3] --- [1] Currently set to `10` images; imagine what would happen to overall memory usage if we encountered e.g. 50 images each 10 MB in size. [2] This value was chosen, somewhat randomly, to be `40` megabytes; basically five times the [maximum individual image size per page](`6249ef517d/src/display/api.js (L2483-L2484)`). [3] This surely has to be some kind of record w.r.t. how badly PDF generators can mess things up...	2021-01-26 12:00:12 +01:00
calixteman	a3f6882b06	JS -- add support for choice widget (#12826)	2021-01-25 23:40:57 +01:00
Tim van der Meij	25b84ce84c	Merge pull request #12828 from dhufnagel/feature/annotation_layer_display_fontsize [api-minor] Set font size and color for text widget annotations	2021-01-23 16:08:07 +01:00
Jonas Jenwald	6bcb4e3ad9	Ensure that `parseDefaultAppearance` won't attempt to access a not yet defined variable (PR 12831 follow-up) Note how, in the `if (this.stateManager.stateStack.length !== 0) {` branch, we're attempting to access the not yet defined variable[1] `args`. If this code-path is ever hit, an Error will be thrown and parsing will thus be aborted immediately (likely leading to e.g. rendering bugs). Note that I found this purely by accident, since I happened to glance at the LGTM report. However, I've since found that the error is also present during the unit-test[2] and with this patch we're actually testing the intended thing here. As part of fixing this, and to avoid re-introducing a similar bug in the future, we'll now instead always reset `args.length` before attempting to read the next operator. Also, we can use the existing `EvaluatorPreprocessor.savedStatesDepth` getter to simplify the save/restore detection a tiny bit. --- [1] The ESLint rule `no-use-before-define` would have helped catch this problem, but unfortunately we cannot enable that without quite a bit of refactoring all over the code-base. [2] The unit-test was updated such that it would fail in the `master`-branch.	2021-01-23 15:33:28 +01:00
Dominik Hufnagel	c5083cda02	set font size and color on annotation layer use the default appearance to set the font size and color of a text annotation widget	2021-01-22 23:12:14 +01:00
Tim van der Meij	6ffb6b1c0c	Merge pull request #12885 from Snuffleupagus/worker-tweak-caching Simplify the `PDFFunctionFactory._localFunctionCache` initialization (PR 12034 follow-up); Fix the `gStateObj` lookup in `TranslatedFont._removeType3ColorOperators` (PR 12718 follow-up)	2021-01-22 20:24:33 +01:00
Jonas Jenwald	ca1f58ea42	Use `_defaultAppearanceData` directly in `WidgetAnnotation._getSaveFieldResources` (PR 12831 follow-up) With the changes in PR 12831, it's no longer necessary to keep track of the `fontName`-string separately since it's available through the `_defaultAppearanceData`-property as well.	2021-01-22 13:23:04 +01:00
Jonas Jenwald	8137c0547d	Fix the `gStateObj` lookup in `TranslatedFont._removeType3ColorOperators` (PR 12718 follow-up) As can be seen in `2cba290361/src/core/evaluator.js (L986)` the `gStateObj` (which is actually an Array despite its name), is wrapped in an Array when it's inserted into the OperatorList. Hence we obviously need to take this into account when accessing it in `TranslatedFont._removeType3ColorOperators`; this mistake happened because we don't have any test-cases for this particular code-path as far as I know.	2021-01-22 12:27:38 +01:00
Jonas Jenwald	cfaf23dee2	Simplify the `PDFFunctionFactory._localFunctionCache` initialization (PR 12034 follow-up) By changing this to a `shadow`ed getter, we can simply access it directly and not worry about it being initialized. I have no idea why I didn't just implement it this way in the first place.	2021-01-22 12:25:05 +01:00
Brendan Dahl	2cba290361	Merge pull request #12836 from calixteman/update_buttons JS -- update radio/checkbox values even if there are no actions	2021-01-21 14:00:26 -08:00
calixteman	1039698697	Add a parser to get font data from the default appearance (#12831) * Add a parser to get font data from the default appearance - pdfium & poppler use a special parser too to get these info. * Update src/core/default_appearance.js Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>	2021-01-21 20:15:31 +01:00
Jonas Jenwald	b4eb55250e	Remove redundant compatibility checks, for modern `generic` builds, in `src/core/worker.js` With the recent additions of optional chaining and nullish coalescing to the PDF.js code-base, a couple of the checks in `src/core/worker.js` are now redundant; please see this compatibility information: - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Optional_chaining#browser_compatibility - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Nullish_coalescing_operator#browser_compatibility In practice, for the non-translated/non-polyfilled PDF.js builds, browsers without support for optional chaining and nullish coalescing will simply throw immediately upon loading of the code. Hence both the `globalThis` and `Promise.allSettled` checks are now unnecessary, given this compatibility information: - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/globalThis#browser_compatibility - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/allSettled#browser_compatibility Please note: The `ReadableStream` check is however still necessary, since Node.js doesn't support that.	2021-01-20 13:09:56 +01:00
Jonas Jenwald	2600e59acb	Always re-measure non-embedded ArialNarrow fonts (bug 1671312, PR 12725 follow-up) While PR 12725 fixed bug 1671312 as reported, i.e. the "In the upper right corner "Purposes' has bad kerning."-part, it however broke other parts of the text rendering. Note in particular the tables, e.g. on page 2 and beyond, where the glyphs are now rendered too close together. The reason for this is that the fonts in question are non-embedded ArialNarrow, which we just replace with Helvetica which obviously is not narrow. Given that the font replacement isn't a perfect fit for non-embedded ArialNarrow, we still need to re-measure the glyph widths in this case.	2021-01-14 15:51:48 +01:00
calixteman	1de1ae0be6	Merge pull request #12838 from calixteman/authors [api-minor] Change the "dc:creator" Metadata field to an Array	2021-01-12 02:44:58 -08:00
Calixte Denizet	43d5512f5c	[api-minor] Change the "dc:creator" Metadata field to an Array - add scripting support for doc.info.authors - doc.info.metadata is the raw string with xml code	2021-01-11 21:34:07 +01:00
Tim van der Meij	f85b8721d1	Merge pull request #12842 from Snuffleupagus/issue-12841 Improve handling of JPEG images without an EOI marker (issue 12841)	2021-01-10 13:21:28 +01:00
Jonas Jenwald	81525fd446	Use ESLint to ensure that `export`s are sorted alphabetically There's a built-in ESLint rule, see `sort-imports`, to ensure that all `import`-statements are sorted alphabetically, since that often helps with readability. Unfortunately there's no corresponding rule to sort `export`-statements alphabetically, however there's an ESLint plugin which does this; please see https://www.npmjs.com/package/eslint-plugin-sort-exports The only downside here is that it's not automatically fixable, but the re-ordering is a one-time "cost" and the plugin will help maintain a consistent ordering of `export`-statements in the future. Note: To reduce the possibility of introducing any errors here, the re-ordering was done by simply selecting the relevant lines and then using the built-in sort-functionality of my editor.	2021-01-09 20:37:51 +01:00
Jonas Jenwald	cd9422a075	Improve handling of JPEG images without an EOI marker (issue 12841) Given that the PDF document in the issue contains the same very large JPEG image three times, this patch includes a test-case where only the first page has been extracted from it.	2021-01-09 20:19:39 +01:00
Calixte Denizet	7172f0a928	JS -- update radio/checkbox values even if there are no actions	2021-01-08 16:43:16 +01:00
Calixte Denizet	83119b9000	In a text widget, Font resources can be in the appearance	2021-01-08 10:13:47 +01:00
Tim van der Meij	048081fb69	Merge pull request #12824 from Snuffleupagus/preEvaluateFont-errors Improve the handling of errors, in `PartialEvaluator.loadFont`, occurring in `PartialEvaluator.preEvaluateFont` (issue 12823)	2021-01-07 23:15:41 +01:00
Tim van der Meij	5bde4b71f8	Merge pull request #12292 from calixteman/encoding Fix encoding issues when printing/saving a form with non-ascii characters	2021-01-07 22:56:42 +01:00
Jonas Jenwald	78c32c2697	Improve the handling of errors, in `PartialEvaluator.loadFont`, occurring in `PartialEvaluator.preEvaluateFont` (issue 12823) Currently any errors thrown in `preEvaluateFont`, which is a synchronous method, will not be handled at all in the `loadFont` method and we were thus failing to return an `ErrorFont`-instance as intended here. Also, add an explicit check in `PartialEvaluator.preEvaluateFont` to ensure that Type0-fonts always have a valid dictionary.	2021-01-07 11:38:38 +01:00
Calixte Denizet	56424967f2	Fix encoding issues when printing/saving a form with non-ascii characters	2021-01-05 17:23:18 +01:00
Tim van der Meij	ca18af6af3	Merge pull request #12774 from calixteman/doc_action_test JS -- Add tests for print/save actions	2021-01-03 18:46:37 +01:00
Tim van der Meij	50303fc8f4	Merge pull request #12766 from Snuffleupagus/issue-11004 Ignore, rather than throwing on, unsupported Coding style default (COD) options in JPEG 2000 images (issue 11004)	2020-12-28 20:26:10 +01:00
Calixte Denizet	ffd4bc790c	JS -- Add tests for print/save actions * change PDFDocument::hasJSActions to return true when there are JS actions in catalog.	2020-12-24 18:51:00 +01:00
Calixte Denizet	7c3facb174	JS -- Add support for buttons * radio buttons * checkboxes	2020-12-22 16:41:51 +01:00
Jonas Jenwald	cffb7af3b0	Ignore, rather than throwing on, unsupported Coding style default (COD) options in JPEG 2000 images (issue 11004) Similar to other markers that we currently skip, by ignoring unsupported Coding style default (COD) options we'll at least render something here (although some JPEG 2000 images may look slightly wrong). Note that if the unsupported COD options lead to additional errors, during parsing, we'll still abort parsing of the JPEG 2000 image.	2020-12-21 20:35:52 +01:00
Brendan Dahl	3ea1c43b15	Merge pull request #12751 from calixteman/da_not_a_string Add a default DA for textfield to avoid issues when printing or saving	2020-12-21 09:44:08 -08:00
Calixte Denizet	a7c682c600	Add a default DA for textfield to avoid issues when printing or saving * it aims to fix issue #12750	2020-12-19 23:38:45 +01:00
calixteman	e6e2809825	Merge pull request #12702 from calixteman/doc_actions JS - Collect and execute actions at doc level	2020-12-18 21:33:32 +01:00
Calixte Denizet	1e2173f038	JS - Collect and execute actions at doc and pages level * the goal is to execute actions like Open or OpenAction * can be tested with issue6106.pdf (auto-print) * once #12701 is merged, we can add page actions	2020-12-18 20:03:59 +01:00
Jonas Jenwald	48a76aea2b	Ignore, rather than throwing on, Coding style component (COC) markers in JPEG 2000 images (issue 12752) Similar to other markers that we currently skip, by ignoring the Coding style component (COC) marker we'll at least prevent outright errors (although some JPEG 2000 images may look slightly wrong).	2020-12-18 18:18:32 +01:00

1948 Commits