pdf.js

Author	SHA1	Message	Date
Calixte Denizet	0ff5cd7eb5	XFA - Add a parser for XFA files - the parser is based on a class extending XMLParserBase - it handles xml namespaces: * each namespace is associated with a builder * builder builds nodes belonging to the namespace * when a node is inserted in the parent namespace compatibility is checked (if required) - to avoid name collision between xml names and object properties, use Symbol.	2021-02-01 13:45:31 +01:00
Tim van der Meij	e4e92d10e8	Merge pull request #12922 from Snuffleupagus/getTextContent-globalImageCache Ignore globally cached images in `PartialEvaluator.getTextContent` (PR 11930 follow-up)	2021-01-28 23:44:10 +01:00
Tim van der Meij	8805614a03	Merge pull request #12924 from brendandahl/fix-clone Fix font data clone error when pdfBug is enabled.	2021-01-28 23:42:12 +01:00
Jonas Jenwald	72da2aa166	Ignore globally cached images in `PartialEvaluator.getTextContent` (PR 11930 follow-up) Given that we'll only cache `/XObject`s of the `Image`-type globally, we can utilize that in `PartialEvaluator.getTextContent` as well. This way, in cases such as e.g. issue 12098, we can avoid having to fetch/parse `/XObject`s that we already know to be `Image`s. This is helpful, since `Stream`s are not cached on the `XRef` instance (given their potential size) and the lookup can thus be somewhat expensive in general. Also, skip a redundant `RefSetCache.has` check in the `GlobalImageCache.getData` method.	2021-01-28 10:19:26 +01:00
Brendan Dahl	52fb5abb0b	Fix font data clone error when pdfBug is enabled. The widths property should be an object to match what metrics returns. In ZapfDingbats.pdf I was getting a data clone error with pdfBug enabled. In buildCharCodeToWidth() there was an encoding with the name "at" which is also the name of a method on an array. buildCharCodeToWidth assumes an object is passed in, so when it checked for the "at" property, it found the method and copied it over. This only seemed to affect Firefox.	2021-01-27 14:38:43 -08:00
Jonas Jenwald	1ab6d2c604	Improve global image caching for small images (PR 11912 follow-up, issue 12098) When implementing the `GlobalImageCache` functionality I was mostly worried about the effect of very large images, hence the maximum number of cached images was purposely kept quite low[1]. However, there's one fairly obvious problem with that approach: In documents with hundreds, or even thousands, of small images the `GlobalImageCache` as implemented becomes essentially pointless. Hence this patch, where the `GlobalImageCache`-implementation is changed in the following ways: - We're still guaranteed to be able to cache a minimum number of images, set to `10` (similar to before). - If the total size of all the cached image data is below a threshold[2], we're allowed to cache additional images. This patch thus improves, but doesn't completely fix, issue 12098. Note that that document is created by a very poor PDF generator, since every single page contains the entire document (with all of its /Resources) and to create the individual pages clipping is used.[3] --- [1] Currently set to `10` images; imagine what would happen to overall memory usage if we encountered e.g. 50 images each 10 MB in size. [2] This value was chosen, somewhat randomly, to be `40` megabytes; basically five times the [maximum individual image size per page](`6249ef517d/src/display/api.js (L2483-L2484)`). [3] This surely has to be some kind of record w.r.t. how badly PDF generators can mess things up...	2021-01-26 12:00:12 +01:00
calixteman	a3f6882b06	JS -- add support for choice widget (#12826)	2021-01-25 23:40:57 +01:00
Tim van der Meij	25b84ce84c	Merge pull request #12828 from dhufnagel/feature/annotation_layer_display_fontsize [api-minor] Set font size and color for text widget annotations	2021-01-23 16:08:07 +01:00
Jonas Jenwald	6bcb4e3ad9	Ensure that `parseDefaultAppearance` won't attempt to access a not yet defined variable (PR 12831 follow-up) Note how, in the `if (this.stateManager.stateStack.length !== 0) {` branch, we're attempting to access the not yet defined variable[1] `args`. If this code-path is ever hit, an Error will be thrown and parsing will thus be aborted immediately (likely leading to e.g. rendering bugs). Note that I found this purely by accident, since I happened to glance at the LGTM report. However, I've since found that the error is also present during the unit-test[2] and with this patch we're actually testing the intended thing here. As part of fixing this, and to avoid re-introducing a similar bug in the future, we'll now instead always reset `args.length` before attempting to read the next operator. Also, we can use the existing `EvaluatorPreprocessor.savedStatesDepth` getter to simplify the save/restore detection a tiny bit. --- [1] The ESLint rule `no-use-before-define` would have helped catch this problem, but unfortunately we cannot enable that without quite a bit of refactoring all over the code-base. [2] The unit-test was updated such that it would fail in the `master`-branch.	2021-01-23 15:33:28 +01:00
Dominik Hufnagel	c5083cda02	set font size and color on annotation layer use the default appearance to set the font size and color of a text annotation widget	2021-01-22 23:12:14 +01:00
Tim van der Meij	6ffb6b1c0c	Merge pull request #12885 from Snuffleupagus/worker-tweak-caching Simplify the `PDFFunctionFactory._localFunctionCache` initialization (PR 12034 follow-up); Fix the `gStateObj` lookup in `TranslatedFont._removeType3ColorOperators` (PR 12718 follow-up)	2021-01-22 20:24:33 +01:00
Jonas Jenwald	ca1f58ea42	Use `_defaultAppearanceData` directly in `WidgetAnnotation._getSaveFieldResources` (PR 12831 follow-up) With the changes in PR 12831, it's no longer necessary to keep track of the `fontName`-string separately since it's available through the `_defaultAppearanceData`-property as well.	2021-01-22 13:23:04 +01:00
Jonas Jenwald	8137c0547d	Fix the `gStateObj` lookup in `TranslatedFont._removeType3ColorOperators` (PR 12718 follow-up) As can be seen in `2cba290361/src/core/evaluator.js (L986)` the `gStateObj` (which is actually an Array despite its name), is wrapped in an Array when it's inserted into the OperatorList. Hence we obviously need to take this into account when accessing it in `TranslatedFont._removeType3ColorOperators`; this mistake happened because we don't have any test-cases for this particular code-path as far as I know.	2021-01-22 12:27:38 +01:00
Jonas Jenwald	cfaf23dee2	Simplify the `PDFFunctionFactory._localFunctionCache` initialization (PR 12034 follow-up) By changing this to a `shadow`ed getter, we can simply access it directly and not worry about it being initialized. I have no idea why I didn't just implement it this way in the first place.	2021-01-22 12:25:05 +01:00
Brendan Dahl	2cba290361	Merge pull request #12836 from calixteman/update_buttons JS -- update radio/checkbox values even if there are no actions	2021-01-21 14:00:26 -08:00
calixteman	1039698697	Add a parser to get font data from the default appearance (#12831) * Add a parser to get font data from the default appearance - pdfium & poppler use a special parser too to get these info. * Update src/core/default_appearance.js Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com> Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>	2021-01-21 20:15:31 +01:00
Jonas Jenwald	b4eb55250e	Remove redundant compatibility checks, for modern `generic` builds, in `src/core/worker.js` With the recent additions of optional chaining and nullish coalescing to the PDF.js code-base, a couple of the checks in `src/core/worker.js` are now redundant; please see this compatibility information: - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Optional_chaining#browser_compatibility - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Nullish_coalescing_operator#browser_compatibility In practice, for the non-translated/non-polyfilled PDF.js builds, browsers without support for optional chaining and nullish coalescing will simply throw immediately upon loading of the code. Hence both the `globalThis` and `Promise.allSettled` checks are now unnecessary, given this compatibility information: - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/globalThis#browser_compatibility - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/allSettled#browser_compatibility Please note: The `ReadableStream` check is however still necessary, since Node.js doesn't support that.	2021-01-20 13:09:56 +01:00
Jonas Jenwald	2600e59acb	Always re-measure non-embedded ArialNarrow fonts (bug 1671312, PR 12725 follow-up) While PR 12725 fixed bug 1671312 as reported, i.e. the "In the upper right corner 'Purposes' has bad kerning."-part, it however broke other parts of the text rendering. Note in particular the tables, e.g. on page 2 and beyond, where the glyphs are now rendered too close together. The reason for this is that the fonts in question are non-embedded ArialNarrow, which we just replace with Helvetica which obviously is not narrow. Given that the font replacement isn't a perfect fit for non-embedded ArialNarrow, we still need to re-measure the glyph widths in this case.	2021-01-14 15:51:48 +01:00
calixteman	1de1ae0be6	Merge pull request #12838 from calixteman/authors [api-minor] Change the "dc:creator" Metadata field to an Array	2021-01-12 02:44:58 -08:00
Calixte Denizet	43d5512f5c	[api-minor] Change the "dc:creator" Metadata field to an Array - add scripting support for doc.info.authors - doc.info.metadata is the raw string with xml code	2021-01-11 21:34:07 +01:00
Tim van der Meij	f85b8721d1	Merge pull request #12842 from Snuffleupagus/issue-12841 Improve handling of JPEG images without an EOI marker (issue 12841)	2021-01-10 13:21:28 +01:00
Jonas Jenwald	81525fd446	Use ESLint to ensure that `export`s are sorted alphabetically There's a built-in ESLint rule, see `sort-imports`, to ensure that all `import`-statements are sorted alphabetically, since that often helps with readability. Unfortunately there's no corresponding rule to sort `export`-statements alphabetically, however there's an ESLint plugin which does this; please see https://www.npmjs.com/package/eslint-plugin-sort-exports The only downside here is that it's not automatically fixable, but the re-ordering is a one-time "cost" and the plugin will help maintain a consistent ordering of `export`-statements in the future. Note: To reduce the possibility of introducing any errors here, the re-ordering was done by simply selecting the relevant lines and then using the built-in sort-functionality of my editor.	2021-01-09 20:37:51 +01:00
Jonas Jenwald	cd9422a075	Improve handling of JPEG images without an EOI marker (issue 12841) Given that the PDF document in the issue contains the same very large JPEG image three times, this patch includes a test-case where only the first page has been extracted from it.	2021-01-09 20:19:39 +01:00
Calixte Denizet	7172f0a928	JS -- update radio/checkbox values even if there are no actions	2021-01-08 16:43:16 +01:00
Calixte Denizet	83119b9000	In a text widget, Font resources can be in the appearance	2021-01-08 10:13:47 +01:00
Tim van der Meij	048081fb69	Merge pull request #12824 from Snuffleupagus/preEvaluateFont-errors Improve the handling of errors, in `PartialEvaluator.loadFont`, occurring in `PartialEvaluator.preEvaluateFont` (issue 12823)	2021-01-07 23:15:41 +01:00
Tim van der Meij	5bde4b71f8	Merge pull request #12292 from calixteman/encoding Fix encoding issues when printing/saving a form with non-ascii characters	2021-01-07 22:56:42 +01:00
Jonas Jenwald	78c32c2697	Improve the handling of errors, in `PartialEvaluator.loadFont`, occurring in `PartialEvaluator.preEvaluateFont` (issue 12823) Currently any errors thrown in `preEvaluateFont`, which is a synchronous method, will not be handled at all in the `loadFont` method and we were thus failing to return an `ErrorFont`-instance as intended here. Also, add an explicit check in `PartialEvaluator.preEvaluateFont` to ensure that Type0-fonts always have a valid dictionary.	2021-01-07 11:38:38 +01:00
Calixte Denizet	56424967f2	Fix encoding issues when printing/saving a form with non-ascii characters	2021-01-05 17:23:18 +01:00
Tim van der Meij	ca18af6af3	Merge pull request #12774 from calixteman/doc_action_test JS -- Add tests for print/save actions	2021-01-03 18:46:37 +01:00
Tim van der Meij	50303fc8f4	Merge pull request #12766 from Snuffleupagus/issue-11004 Ignore, rather than throwing on, unsupported Coding style default (COD) options in JPEG 2000 images (issue 11004)	2020-12-28 20:26:10 +01:00
Calixte Denizet	ffd4bc790c	JS -- Add tests for print/save actions * change PDFDocument::hasJSActions to return true when there are JS actions in catalog.	2020-12-24 18:51:00 +01:00
Calixte Denizet	7c3facb174	JS -- Add support for buttons * radio buttons * checkboxes	2020-12-22 16:41:51 +01:00
Jonas Jenwald	cffb7af3b0	Ignore, rather than throwing on, unsupported Coding style default (COD) options in JPEG 2000 images (issue 11004) Similar to other markers that we currently skip, by ignoring unsupported Coding style default (COD) options we'll at least render something here (although some JPEG 2000 images may look slightly wrong). Note that if the unsupported COD options lead to additional errors, during parsing, we'll still abort parsing of the JPEG 2000 image.	2020-12-21 20:35:52 +01:00
Brendan Dahl	3ea1c43b15	Merge pull request #12751 from calixteman/da_not_a_string Add a default DA for textfield to avoid issues when printing or saving	2020-12-21 09:44:08 -08:00
Calixte Denizet	a7c682c600	Add a default DA for textfield to avoid issues when printing or saving * it aims to fix issue #12750	2020-12-19 23:38:45 +01:00
calixteman	e6e2809825	Merge pull request #12702 from calixteman/doc_actions JS - Collect and execute actions at doc level	2020-12-18 21:33:32 +01:00
Calixte Denizet	1e2173f038	JS - Collect and execute actions at doc and pages level * the goal is to execute actions like Open or OpenAction * can be tested with issue6106.pdf (auto-print) * once #12701 is merged, we can add page actions	2020-12-18 20:03:59 +01:00
Jonas Jenwald	48a76aea2b	Ignore, rather than throwing on, Coding style component (COC) markers in JPEG 2000 images (issue 12752) Similar to other markers that we currently skip, by ignoring the Coding style component (COC) marker we'll at least prevent outright errors (although some JPEG 2000 images may look slightly wrong).	2020-12-18 18:18:32 +01:00
Calixte Denizet	03814bd6a2	Don't use 'in' operator to check if key is in a Map	2020-12-16 16:00:12 +01:00
Tim van der Meij	d1848f5022	Merge pull request #12725 from brendandahl/remeasure-std Use widths defined by font for standard fonts.	2020-12-11 20:36:19 +01:00
Jonas Jenwald	67e5db75d8	Ignore color-operators in Type3 glyphs beginning with a `d1` operator (issue 12705) Please refer to the PDF specification at https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G8.1977497 and https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G7.3998470 This patch removes the color-operators in the evaluator, since that should be more efficient than doing it repeatedly in the main-thread when rendering the Type3 glyphs.	2020-12-11 15:49:13 +01:00
Brendan Dahl	45d9ab6e45	Use widths defined by font for standard fonts. There doesn't seem to be anything definitive about this in the spec, but from experimenting, it seems acrobat lets PDFs override the widths of the standard fonts.	2020-12-10 15:30:39 -08:00
Tim van der Meij	00b4f86db3	Merge pull request #12717 from Snuffleupagus/issue-12714 Ensure that the /Annots-entry, on /Page-instances, is actually an Array (issue 12714)	2020-12-10 23:06:59 +01:00
Calixte Denizet	25bf504ff5	Be sure that CalculationOrder is either null or a non-empty array	2020-12-10 16:02:11 +01:00
Jonas Jenwald	796a0d3155	Ensure that the /Annots-entry, on /Page-instances, is actually an Array (issue 12714) In the referenced PDF document, the second and third pages have corrupt /Annots-entries which contain /Dict-data rather than the intended Arrays.	2020-12-10 11:42:00 +01:00
Tim van der Meij	012e15f7a3	Fix non-standard quadpoints orders for annotations This change requires us to use valid quadpoints arrays in the existing unit tests too due to the normalization.	2020-12-06 16:02:41 +01:00
Jonas Jenwald	c42029489e	Run `gulp lint --fix`, to account for changes in Prettier version `2.2.1` Please refer to https://github.com/prettier/prettier/blob/master/CHANGELOG.md#221 for additional details.	2020-11-29 10:01:46 +01:00
Tim van der Meij	256068556d	Merge pull request #12662 from Snuffleupagus/issue-12402 Check the top-level /Pages dictionary when finding the trailer in `XRef.indexObjects` (issue 12402)	2020-11-25 21:54:41 +01:00
Jonas Jenwald	8a132f584d	Check the top-level /Pages dictionary when finding the trailer in `XRef.indexObjects` (issue 12402) In addition to the existing /Root and /Pages validation, also check that the /Pages-entry actually is a dictionary and that it has a valid /Count-entry. This way we can avoid picking a trailer candidate which e.g. the `Catalog.numPages` getter will just end up rejecting, thus breaking PDF document loading completely.	2020-11-25 15:14:53 +01:00
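
The caching policy described in commit 1ab6d2c604 above (a guaranteed minimum of `10` images, plus additional images while the total cached size stays below a `40` megabyte threshold) can be illustrated with a minimal sketch. This is not the PDF.js source: the class name `ImageCacheSketch`, its fields, and the reading of "40 megabytes" as 40 * 10^6 bytes are all assumptions made for illustration.

```javascript
// Hypothetical sketch of the policy in the commit message; names are invented
// for illustration and do not come from the PDF.js code-base.
const MIN_IMAGES_TO_CACHE = 10; // guaranteed minimum, per the commit message
const MAX_BYTE_SIZE = 40e6; // "40 megabytes", assumed to mean 40 * 10^6 bytes

class ImageCacheSketch {
  constructor() {
    this.images = new Map(); // key -> { byteSize }
    this.byteSize = 0; // running total of all cached image data
  }

  // The limit is reached only when the minimum count has been met AND
  // the total cached size is at or above the threshold.
  get limitReached() {
    if (this.images.size < MIN_IMAGES_TO_CACHE) {
      return false;
    }
    if (this.byteSize < MAX_BYTE_SIZE) {
      return false;
    }
    return true;
  }

  // Returns true if the image is (now) cached, false if it was skipped.
  add(key, byteSize) {
    if (this.images.has(key)) {
      return true;
    }
    if (this.limitReached) {
      return false;
    }
    this.images.set(key, { byteSize });
    this.byteSize += byteSize;
    return true;
  }
}
```

Under these assumptions, a document with 50 images of 10 MB each would have only the first ten cached, while a document with a thousand 1 kB images would have all of them cached, which is the behaviour the commit message argues for.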

1887 Commits