* Move display/xml_parser.js into shared so that it can be used in the worker
* Save form data in XFA datasets when the PDF is a mix of AcroForms and XFA
Co-authored-by: Brendan Dahl <brendan.dahl@gmail.com>
This allows merging of dictionaries one level deeper than previously. This could be useful e.g. for /Resources dictionaries, where you want to merge their respective /Font (and other) sub-dictionaries together rather than just picking the first one.
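For illustration, a rough sketch of the idea, using plain objects in place of actual PDF dictionaries (the helper name and keys below are made up for this example):

```js
// Rough sketch only: plain objects stand in for PDF dictionaries, and the
// helper name is made up for this example.
function mergeSubDictionaries(dicts, subDictKeys = ["Font", "XObject"]) {
  // Shallow-merge the top level, with earlier dictionaries taking precedence.
  const merged = Object.assign({}, ...dicts.slice().reverse());

  for (const key of subDictKeys) {
    const subDicts = dicts.map(dict => dict[key]).filter(Boolean);
    if (subDicts.length > 1) {
      // One level deeper: combine the entries of all /Font (or /XObject)
      // sub-dictionaries, rather than picking just the first one.
      merged[key] = Object.assign({}, ...subDicts.slice().reverse());
    }
  }
  return merged;
}

// Example: both /Font entries survive the merge.
mergeSubDictionaries([{ Font: { F1: "Helvetica" } }, { Font: { F2: "Times" } }]);
// -> the resulting /Font sub-dictionary contains both F1 and F2
```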
The `/Order` array is used to improve the display of Optional Content groups in PDF viewers, and it allows a PDF document to specify e.g. that Optional Content groups should be displayed as a (collapsible) tree structure rather than just as a list.
Note that not all available Optional Content groups need to be present in the `/Order` array, and PDF viewers will often (by default) hide the toggles for such groups in the UI.
To allow us to improve the UX around toggling Optional Content groups in the default viewer, these hidden-by-default groups are thus appended to the parsed `/Order` array under a *custom* nesting level (with `name == null`).
Finally, the patch also slightly tweaks an `OptionalContentConfig` related JSDoc-comment in the API.
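For illustration, the parsed `/Order` data might then look roughly as follows (a simplified shape; treat the exact field names as assumptions, not the literal pdf.js output):

```js
// Simplified illustration of a parsed /Order array; the field names are
// assumptions, not the exact pdf.js output.
const order = [
  "7R",                                       // a top-level group id
  { name: "Drawings", order: ["8R", "9R"] },  // a (collapsible) sub-tree
  // Groups that are missing from /Order are appended under a *custom*
  // nesting level, marked with `name == null`, so the default viewer can
  // hide their toggles by default.
  { name: null, order: ["10R"] },
];
```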
In issue 12120, the font has a (1, 0) cmap and is marked symbolic, which
according to the spec means we should use that cmap directly instead of
following the extra steps defined in section 9.6.6.4.
However, just fixing that caused bug 1057544 to break. The font in bug
1057544 has a (0, 1) cmap (Unicode 1.1) that we were not using, but which
is easy to support. We're also easily able to support some of the other
Unicode cmaps, so I added those as well.
There was also a second issue with bug 1057544: the cmap doesn't have a
mapping for the "quoteright" glyph, but it is defined in the post table.
To handle this, I've made the post table a fallback for any font that
has an encoding.
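Roughly, the cmap selection then works along these lines (a simplified sketch; the preference order shown here is illustrative rather than the exact pdf.js logic):

```js
// Simplified sketch of the cmap-table selection described above; the
// preference order is illustrative, not the exact pdf.js implementation.
function pickCmapTable(tables, isSymbolicFont) {
  const find = (platformId, encodingId) =>
    tables.find(
      table => table.platformId === platformId && table.encodingId === encodingId
    );

  // A symbolic font with a (1, 0) cmap uses that table directly, instead of
  // following the extra lookup steps from section 9.6.6.4.
  if (isSymbolicFont && find(1, 0)) {
    return find(1, 0);
  }
  // Otherwise prefer a Unicode cmap, including the (0, 1) Unicode 1.1 table.
  return find(3, 1) || find(0, 3) || find(0, 1) || tables[0] || null;
}
```

Glyphs that the chosen cmap does not map (e.g. "quoteright") can then still be resolved through the post table, which acts as the fallback described above.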
Now that the `parse` method is simplified, we can inline the `setup`
method into it since it's only two lines of code. This avoids some
indirection.
Good form type detection is important to get reliable telemetry and to
only show the fallback bar if a form cannot be filled out by the user.
PDF.js only supports AcroForm data, so XFA data is explicitly unsupported
(tracked in issue #2373). However, the previous form type detection
couldn't separate AcroForm and XFA well enough, causing form type
telemetry to be incorrect sometimes and the fallback bar to be shown for
forms that could in fact be filled out by the user.
The solution in this commit is found by studying the specification and
the form documents that are available to us. In a nutshell, the rules are:
- There is XFA data if the `XFA` entry is a non-empty array or stream.
- There is AcroForm data if the `Fields` entry is a non-empty array and
it doesn't consist of only document signatures.
The document signatures part was not handled in the old code, causing a
document with only XFA data to also be marked as having AcroForm data.
Moreover, the old code didn't check all the data types.
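For illustration, the two rules (including the document-signature exception) can be sketched as follows, with plain JavaScript values standing in for parsed PDF objects; the helper and its inputs are hypothetical:

```js
// Hypothetical sketch of the detection rules; plain values stand in for
// parsed PDF objects (e.g. `xfa.isStream` marks a stream here).
function detectFormTypes({ xfa, fields }) {
  const hasXfa =
    (Array.isArray(xfa) && xfa.length > 0) ||
    (!!xfa && typeof xfa === "object" && xfa.isStream === true);

  const hasFields = Array.isArray(fields) && fields.length > 0;
  // Fields consisting of only document signatures don't count as AcroForm data.
  const onlySignatures =
    hasFields && fields.every(field => field.fieldType === "Sig");
  const hasAcroForm = hasFields && !onlySignatures;

  return { hasXfa, hasAcroForm };
}
```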
Now that AcroForm and XFA can be distinguished, the viewer is configured
to only show the fallback bar for documents that only have XFA data. If
a document also has AcroForm data, the viewer can use that to render the
form. We have not found documents where the XFA data was necessary in
that case.
Finally, we include unit tests to ensure that all cases are covered and
move the form type detection out of the `parse` function so that it's
only executed if the document information is actually requested
(potentially making initial parsing a tiny bit faster).
Not only is `catDict` no longer accessed outside of this file, such
accesses should also never happen since it's internal to the catalog. If
data from it is needed elsewhere, the catalog should provide a getter
for it that can do basic data integrity checks and abstract away any
unnecessary details.
The `AcroForm` entry is part of the catalog, not of the document, so its
logic should be placed there instead. The document should look in the
catalog to fetch it, and not have knowledge of `catDict`, which is a
member internal to the catalog.
Moreover, make the AcroForm member private on the document instance. It's
only used internally and was also never intended to be public. For users
it's exposed by the `getMetadata` API endpoint as `IsAcroFormPresent`.
Only a boolean is exposed, so we now also only store the boolean on the
document instance.
Finally, the annotation code needs access to the full AcroForm
dictionary, so it's updated to fetch the data from the catalog instead
of the document that now only holds the boolean.
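The getter pattern looks roughly like this (a simplified sketch, not the literal pdf.js code; the `catDict` access and the memoization are reduced to plain JavaScript here):

```js
// Simplified sketch, not the literal pdf.js code.
class Catalog {
  constructor(catDict) {
    this._catDict = catDict; // internal to the catalog
  }

  get acroForm() {
    let acroForm = null;
    try {
      acroForm = this._catDict.get("AcroForm") || null;
    } catch {
      // Basic integrity check: a corrupt entry shouldn't break callers.
    }
    // Memoize the result on the instance.
    Object.defineProperty(this, "acroForm", { value: acroForm });
    return acroForm;
  }
}

// The document only records whether AcroForm data is present, which is what
// the `getMetadata` API endpoint exposes as `IsAcroFormPresent`; the
// annotation code fetches the full dictionary from the catalog instead.
```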
The `Collection` entry is part of the catalog, not of the document, so
its logic should be placed there instead. The document should look in the
catalog to fetch it, and not have knowledge of `catDict`, which is a
member internal to the catalog.
Moreover, remove the collection member from the document instance. It's
only used internally and was also never intended to be public. For users
it's exposed by the `getMetadata` API endpoint as `IsCollectionPresent`.
Moving this out of the `parse` function makes sure that the getter is
only executed if the document information is actually requested
(potentially making initial parsing a tiny bit faster).
The `Version` entry is part of the catalog, not of the document, so its
logic should be placed there instead. The document should look in the
catalog to fetch it, and not have knowledge of `catDict`, which is a
member internal to the catalog.
Moreover, make the version member private on the document instance. It's
only used internally and was also never intended to be public. For users
it's exposed by the `getMetadata` API endpoint as `PDFFormatVersion`.
Finally, clarify in a comment how the version from the header and the
version from the catalog are treated.
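Sketched, the intended behaviour is roughly the following (an assumption for illustration, not the literal code):

```js
// Illustration only (an assumption, not the literal pdf.js code): the header
// version is the base value, and a well-formed catalog /Version entry takes
// precedence over it.
const PDF_VERSION_REGEXP = /^[1-9]\.\d$/;

function resolveVersion(headerVersion, catalogVersion) {
  if (typeof catalogVersion === "string" &&
      PDF_VERSION_REGEXP.test(catalogVersion)) {
    return catalogVersion;
  }
  return typeof headerVersion === "string" &&
         PDF_VERSION_REGEXP.test(headerVersion)
    ? headerVersion
    : null;
}

// resolveVersion("1.4", "1.7") === "1.7"   // catalog entry overrides header
// resolveVersion("1.4", undefined) === "1.4"
```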
Even though the code obviously works as-is, given that we have unit-tests for it, it still feels incorrect to just *assume* that the `Catalog`-instance has all of its properties immediately available. This is especially true when (almost) all of the other handlers in `src/core/worker.js` protect their data accesses with appropriate `pdfManager.ensure` calls.
Even though the code obviously works as-is, given that we have unit-tests for it, it still feels incorrect to just *assume* that the `XRef`-instance has all of its properties immediately available. This is especially true when (almost) all of the other handlers in `src/core/worker.js` protect their data accesses with appropriate `pdfManager.ensure` calls.
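The pattern referred to looks roughly like this (the message name and the property being fetched are illustrative; `handler` and `pdfManager` are assumed to be in scope, as in `src/core/worker.js`):

```js
// Illustrative only: the message name and the property being fetched are made
// up, and `handler`/`pdfManager` are assumed to be in scope as in the worker.
handler.on("GetSomeCatalogProperty", function () {
  // Instead of reading `pdfManager.pdfDocument.catalog.someProperty` directly
  // and *assuming* the data is already available, wrap the access so that a
  // still-loading document goes through the usual "fetch more data and retry"
  // machinery.
  return pdfManager.ensureCatalog("someProperty");
});
```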
In `display/canvas.js` the accent offsets must be multiplied by `fontSize` to make the offsets large enough. Another problem is in `core/type1_parser.js` when the Type1 command `seac` is handled: there is an error in the Adobe Type1 spec (see chapter 6 of the Type 1 Font Format Supplement, which provides an erratum). The arguments of `seac` specify the offset of the left side bearing (LSB) points, not the offset of the origins. This can be fixed in `core/type1_parser.js` by adding the difference of the LSB values.
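A sketch of the corrected offset computation (the variable names are illustrative and this is not the literal patch):

```js
// Illustrative sketch, not the literal patch. Per the erratum, (adx, ady) is
// measured from the base glyph's left side bearing (LSB) point rather than
// from its origin, so the accent's horizontal offset must be adjusted by the
// difference of the LSB values.
function accentOffset({ asb, adx, ady }, baseGlyphLsb) {
  return {
    dx: adx + (baseGlyphLsb - asb), // add the difference of the LSB values
    dy: ady,
  };
}
```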
The down appearance (`D`) is optional and not available in the document
from #12233, so the checkboxes are never saved/printed as checked: the
checked appearance is based on the export value, which is missing since
the `D` entry is not available.
Instead, we should use the normal appearance (`N`) since that one is
required and therefore always available.
Finally, the /Off appearance is optional according to section 12.7.4.2.3
of the specification, so that needs to be taken into account to match
the specification and to fix reference test failures for the
`annotation-button-widget-print` test, which uses a file that doesn't
specify an /Off appearance in the normal appearance dictionary.
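For illustration, deriving the export ("on") value from the normal appearance dictionary then looks roughly like this (plain objects stand in for parsed appearance dictionaries):

```js
// Rough sketch; plain objects stand in for parsed appearance dictionaries.
function getExportValue(normalAppearance) {
  if (!normalAppearance) {
    return null;
  }
  // The /N dictionary maps appearance-state names to appearance streams,
  // e.g. { Off: ..., Yes: ... } -- or just { Yes: ... }, since the /Off
  // appearance is optional.
  const exportValues = Object.keys(normalAppearance).filter(
    key => key !== "Off"
  );
  return exportValues.length === 1 ? exportValues[0] : null;
}

// getExportValue({ Off: {}, Yes: {} }) === "Yes"
// getExportValue({ Checked: {} }) === "Checked"   // no /Off entry present
```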
The helper method `_decodeFormValue` is used to ensure that this
decoding happens in one place. Note that form values include field
values, display values and export values.
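A simplified sketch of such a helper (not the actual implementation; the string-decoding step is injected as a parameter here):

```js
// Simplified sketch, not the actual implementation; `decodeString` stands in
// for the PDF string decoding step.
function decodeFormValue(value, decodeString = s => s) {
  if (Array.isArray(value)) {
    return value
      .filter(item => typeof item === "string")
      .map(item => decodeString(item));
  }
  if (value && typeof value === "object" && typeof value.name === "string") {
    return decodeString(value.name); // a PDF /Name object
  }
  if (typeof value === "string") {
    return decodeString(value);
  }
  return null;
}
```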
The specification states that the field value is `null` if no item is
selected, and we didn't handle this case properly. Even though this did
not break the rendering, because we always convert the value to an array
and the `includes` check in the display layer would simply not match,
the field value would be `[null]`, which is unexpected and strange from
an API perspective.
This commit fixes that by ensuring that we return an empty array in
case the field value is `null`. The API therefore still always gives an
array for the field value, but now the code is more specific so that the
value is either an empty array or an array of strings.
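The normalization can be sketched as follows (illustrative, not the literal code):

```js
// Illustrative sketch, not the literal code.
function normalizeFieldValue(fieldValue) {
  if (fieldValue === null || fieldValue === undefined) {
    return []; // no item selected
  }
  return Array.isArray(fieldValue) ? fieldValue : [fieldValue];
}

// normalizeFieldValue(null)       -> []
// normalizeFieldValue("Export1")  -> ["Export1"]
// normalizeFieldValue(["A", "B"]) -> ["A", "B"]
```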
This is *similar* to the existing transfer function support for SMasks, but extended to simple image data.
Please note that the extra amount of data now being sent to the worker-thread, for affected /ExtGState entries, is limited to *at most* 4 `Uint8Array`s each with a length of 256 elements.
Refer to https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G9.1658137 for additional details.
I completely overlooked the fact that `PartialEvaluator.handleSetFont` also updates the current `state`, which means that currently we're not actually handling font data correctly for cached /ExtGState data. (Thankfully, using /ExtGState to set a font is somewhat rare in practice.)
Some fonts have loca tables that aren't sorted or that use 0 as an
offset to signal a missing glyph. This fixes such bad loca tables by
sorting the entries, rewriting the loca table and potentially
re-ordering the glyf table to match.
Fixes #11131 and bug 1650302.
Add a new method to the API to get the optional content configuration. Add
a new render task param that accepts the above configuration.
For now, the optional content is not controllable by the user in
the viewer, but renders with the default configuration in the PDF.
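Usage then looks roughly as follows (the method and parameter names match the current API; treat the exact names as assumptions if you are on a different version):

```js
// Rough usage sketch; method/parameter names are as in the current API, but
// treat the exact names as assumptions for other versions.
import { getDocument } from "pdfjs-dist";

async function renderFirstPage(url, canvas) {
  const pdfDocument = await getDocument(url).promise;
  const page = await pdfDocument.getPage(1);

  // New API method: resolves to the document's default optional content
  // configuration.
  const optionalContentConfigPromise = pdfDocument.getOptionalContentConfig();

  const viewport = page.getViewport({ scale: 1.0 });
  canvas.width = viewport.width;
  canvas.height = viewport.height;

  // New render-task parameter that accepts the configuration.
  await page.render({
    canvasContext: canvas.getContext("2d"),
    viewport,
    optionalContentConfigPromise,
  }).promise;
}
```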
All of the test files added exhibit different uses of optional content.
Fixes #269.
Fix test to work with optional content.
- Change the stopAtErrors test to ensure that the operator list is
non-empty, instead of asserting the exact number of operators.
These errors can/will occur if data is still loading when the document is destroyed, which is the case in the API unit-tests that load the `tracemonkey.pdf` file.
While this patch prevents these kinds of problems, and thus allows us to update Jasmine again, I cannot help thinking that it's slightly "hacky". Basically, we'll simply catch and ignore (some) rejected promises once the document is destroyed and/or its data loading is aborted. However, I don't *think* that these changes should cause issues in general, since we don't really care about errors once document destruction has started (note e.g. the fair number of `catch` handlers that already ignore `AbortException`s).
Looking carefully at this code, you'll notice that the `loadDocument` function has no less than *three* Promise handling functions. This obviously makes no sense, since a Promise can only have one resolve and one reject handler.
Hence the final `onFailure`-case is unreachable, which only serves to add confusion when reading the code. Note that this code has been re-factored more than once over the years, but it seems as if this may even have been incorrect already in PR 3310 (and no one has noticed for seven years :-).
There's quite frankly no particular reason to special-case Type3-fonts with image resources, which are very rare anyway, now that we have a general mechanism for sending/receiving images globally.
While the `CharProcs` streams of Type3-fonts *usually* don't rely on dependencies, such as e.g. images, it does happen in some cases.
Currently any dependencies are simply appended to the parent operatorList, which in practice means *only* the operatorList of the *first* page where the Type3-font is being used.
However, there's one thing that's slightly unfortunate with that approach: Since fonts are global to the PDF document, we really ought to ensure that any Type3 dependencies are appended to the operatorList of *all* pages where the Type3-font is being used. Otherwise there's a theoretical risk that, if one page has its rendering paused, another page may try to use a Type3-font whose dependencies are not yet fully resolved. In that case there would be errors, since Type3 operatorLists are executed synchronously.
Hence this patch, which ensures that all relevant pages will have Type3 dependencies appended to the main operatorList. (Note here that the `OperatorList.addDependencies` method, via `OperatorList.addDependency`, ensures that a dependency is only added *once* to any operatorList.)
Finally, these changes also remove the need for the "waiting for the main-thread"-hack that was added to `PartialEvaluator.buildPaintImageXObject` as part of fixing issue 10717.
When the old `Dict.getAll()` method was removed, it was replaced with a `Dict.getKeys()` call and `Dict.get(...)` calls (in a loop).
While this pattern obviously makes a lot of sense in many cases, there are some instances where we actually want the *raw* `Dict` values (i.e. `Ref`s where applicable). In those cases, `Dict.getRaw(...)` calls are instead used within the loop. However, by introducing a new `Dict.getRawValues()` method we can reduce the number of (strictly unnecessary) function calls by simply getting the *raw* `Dict` values directly.
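The difference can be illustrated like this (`dict` is assumed to be a `Dict` instance, and the loop body is a placeholder):

```js
// Illustration; `dict` is assumed to be a `Dict` instance.
function collectRawValuesBefore(dict) {
  const values = [];
  for (const key of dict.getKeys()) {
    values.push(dict.getRaw(key)); // may be a `Ref`, which is what we want
  }
  return values;
}

function collectRawValuesAfter(dict) {
  // A single call, instead of one `getRaw` call per key.
  return dict.getRawValues();
}
```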
Using a `Map` instead of an `Object` provides some advantages such as
cheaper ways to get the size of the cache, to find out if an entry is
contained in the cache and to iterate over the cache. Moreover, we can
clear and re-use the same `Map` object now instead of creating a new
one.
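For illustration (generic JavaScript, not the pdf.js code itself):

```js
// Generic illustration of the advantages listed above.
const cache = new Map();
cache.set("fontRef_1", { loaded: true });

console.log(cache.size);              // cheap size lookup
console.log(cache.has("fontRef_1"));  // cheap containment check
for (const [key, value] of cache) {
  console.log(key, value);            // direct iteration over entries
}
cache.clear();                        // the same Map can be cleared and re-used
```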
Since this method calls `Dict.get` to fetch data, there could thus be `Error`s thrown in corrupt PDF documents when attempting to resolve an indirect object.
To ensure that this won't ever become a problem, we change the method to be `async` such that a rejected Promise would be returned and general OperatorList parsing won't break.