pdf.js

Author	SHA1	Message	Date
calixteman	98e893b84f	Merge pull request #13880 from eltociear/patch-5 Fix typo in cff_parser_spec.js	2021-08-06 19:31:52 +02:00
Ikko Ashimine	23236f1b0b	Fix typo in cff_parser_spec.js shoudn't -> shouldn't	2021-08-06 19:30:36 +09:00
Brendan Dahl	a38d1122d8	XFA - Support aria heading and table structure. (bug 1723421) (bug 1723425) https://bugzilla.mozilla.org/show_bug.cgi?id=1723421 https://bugzilla.mozilla.org/show_bug.cgi?id=1723425	2021-08-05 15:25:04 -07:00
Brendan Dahl	3e003245b1	[XFA] Add alt text for images. (bug 1723418) Not many XFA PDFs have alt text. Some examples: bug1723422.pdf xfa_bug1718670_1.pdf xfa_issue13611.pdf xfa_issue13633.pdf xfa_issue13634.pdf	2021-08-03 17:18:58 -07:00
Brendan Dahl	6cf1ee3251	Merge pull request #13858 from brendandahl/xfa-aria-label Add aria-labels to XFA form elements. (bug 1723422)	2021-08-03 17:18:08 -07:00
Brendan Dahl	6ea56f35ab	Add aria-labels to XFA form elements. (bug 1723422)	2021-08-03 15:58:33 -07:00
Jonas Jenwald	766299016f	Remove the `isEOF` helper function and slightly re-factor `EOF` Given how trivial the `isEOF` function is, we can simply inline the check at the various call-sites and remove the function (which ought to be ever so slightly more efficient as well). Furthermore, this patch also changes the `EOF` primitive itself to a `Symbol` instead of an Object since that has the nice benefit of making it unclonable (thus preventing accidentally trying to send `EOF` from the worker-thread).	2021-08-03 20:19:32 +02:00
Jonas Jenwald	16a09eaed8	Fix a broken regular expression in the `docId` unit-test (issue 13838, PR 13813 follow-up) The current regular expression contains a typo, leading to intermittent test-failures for certain `docId`s; sorry about that!	2021-08-01 15:18:25 +02:00
Tim van der Meij	d1c0f8f91c	Implement unit tests for the `parseQueryString` utility function Now that these unit tests are in place, we also take the opportunity to slightly modernize the code itself by using a `for ... of` loop.	2021-08-01 14:14:33 +02:00
Jonas Jenwald	b18620ac0f	Remove the closure used with the `PDFDocumentLoadingTask` class This patch utilizes the same approach as used in lots of other parts of the code-base, which thus slightly reduces the size of this code. By removing some of the (current) indirection, we can also simplify the JSDocs a little bit. Looking at the `gulp jsdoc` output, this actually seems to improve the documentation for this class.	2021-07-30 11:34:47 +02:00
Calixte Denizet	1d07ef597e	XFA - Must use bindItems element even if there is no direct binding (bug 1720907)	2021-07-20 17:07:32 +02:00
Tim van der Meij	07955fa1d3	Merge pull request #13735 from Snuffleupagus/bug-1720411 Ensure that the field value, for checkboxes, refers to an existing appearance state (bug 1720411)	2021-07-18 13:48:34 +02:00
Jonas Jenwald	03cf28bf17	[api-minor] Add `intent` support to the `PDFPageProxy.getOperatorList` method (issue 13704) With this patch, the `PDFPageProxy.getOperatorList` method will now return `PDFOperatorList`-instances that also include Annotation-operatorLists (when those exist). Hence this closes a small, but potentially confusing, gap between the `render` and `getOperatorList` methods. Previously we've been somewhat reluctant to do this, as explained below, but given that there's actual use-cases where it's required probably means that we'll have to implement it now. Since we still need the ability to separate "normal" rendering operations from direct `getOperatorList` calls in the worker-thread, this API-change unfortunately causes the internal renderingIntent to become a bit "messy" which is indeed unfortunate (note the `"oplist-"` strings in various spots). As-is I suppose that it's not all that bad, but we may want to consider changing the internal renderingIntent to e.g. a bitfield in the future. Besides fixing issue 13704, this patch would also be necessary if someone ever tries to implement e.g. issue 10165 (since currently `PDFPageProxy.getOperatorList` doesn't include Annotation-operatorLists). Please note: This patch is also tagged "api-minor" for a second reason, which is that we're now including the Annotation-id in the `beginAnnotation` argument. The reason for this is to allow correlating the Annotation-data returned by `PDFPageProxy.getAnnotations`, with its corresponding operatorList-data (for those Annotations that have it).	2021-07-16 17:16:30 +02:00
Jonas Jenwald	da808aeab3	Ensure that the field value, for checkboxes, refers to an existing appearance state (bug 1720411) Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1720411	2021-07-16 13:11:48 +02:00
Jonas Jenwald	3838c4e27c	Re-factor the handling of empty `Name`-instances (PR 13612 follow-up) When working on PR 13612, I mostly prioritized a simple solution that didn't require touching a lot of code. However, while working on PR 13735 I started to realize that the static `Name.empty` construction really wasn't a good idea. In particular, having a special `Name`-instance where the `name`-property isn't actually a String is confusing (to put it mildly) and can easily lead to issues elsewhere. The only reason for not simply allowing the `name`-property to be an empty string, in PR 13612, was to avoid having to touch a lot of existing code. However, it turns out that this is only limited to a few methods in the `PartialEvaluator` and a few of the `BaseLocalCache`-implementations, all of which can be easily re-factored to handle empty `Name`-instances. All-in-all, I think that this patch is even an overall improvement since we're now validating (what should always be) `Name`-data better in the `PartialEvaluator`. This is what I ought to have done from the start, sorry about the code churn here!	2021-07-15 12:00:42 +02:00
Calixte Denizet	9bbc194846	XFA - Support assist element	2021-07-11 21:01:18 +02:00
Calixte Denizet	58e1f51688	XFA - Fix text positions (bug 1718741) - font line height is taken into account by Acrobat but not by masterpdfeditor: I extracted a font from a pdf, modified some ascent/descent properties with ttx and then reinjected the font into the pdf: only Acrobat takes it into account. So in this patch, line heights for some substituted fonts are added. - it seems that Acrobat is using a line height of 1.2 when the line height in the font is not enough (it's the only way I found to correctly fix bug 1718741). - don't use flex in wrapper container (which was causing a horizontal overflow in the above bug). - consequently, the above fixes introduced a lot of small regressions, so in order to see real improvements on reftests, I fixed the regressions in this patch: - replace margin by padding in some cases where padding is part of a container's dimensions; - remove some flex display: some containers are wrongly sized when rendered; - set letter-spacing to 0.01px: it helps to be sure that text is not broken because of insufficient width in Firefox.	2021-07-09 18:11:12 +02:00
Jonas Jenwald	661c60ecc9	[api-minor] Support accessing both the original and modified PDF fingerprint The PDF.js API has only ever supported accessing the original file ID, however the second one that (should) exist in modified documents has thus far been completely inaccessible through the API. That seems like a simple oversight, caused e.g. by the viewer not needing it, since it really shouldn't hurt to provide API-users with the ability to check if a PDF document has been modified since its creation.[1] Please refer to https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G13.2261661 for additional information. For an example of how to update existing code to use the new API, please see the changes in the `web/app.js` file included in this patch. Please note: While I'm not sure if we'll ever be able to remove the old `PDFDocumentProxy.fingerprint` getter, given that it's existed since "forever", that probably isn't a big deal given that it's now limited to only `GENERIC`-builds. --- [1] Although this obviously depends on the PDF software following the specification, by updating the second file ID as intended.	2021-07-03 13:56:33 +02:00
Calixte Denizet	ff440d13e7	XFA - Remove empty pages - it aims to fix #13583; - fix the switch to breakBefore target; - force the layout of an unsplittable element on an empty page; - don't fail when there is horizontal overflow (except in lr-tb); - handle correctly overflow in the same content area (bug 1717805, bug 1717668); - fix a typo in radial gradient first argument.	2021-06-30 16:32:27 +02:00
Calixte Denizet	429ffdcd2f	XFA - Save filled data in the pdf when downloading the file (Bug 1716288) - when binding (after parsing) we get a map between some template nodes and some data nodes; - so set user data in input handlers in using data node uids in the annotation storage; - to save the form, just put the value we have in the storage in the correct data nodes, serialize the xml as a string and then write the string at the end of the pdf using src/core/writer.js; - fix few bugs around data bindings: - the "Off" issue in Bug 1716980.	2021-06-25 18:57:01 +02:00
Brendan Dahl	f4f00a9bc6	Merge pull request #13618 from calixteman/bind_root XFA - Always bind root subform on root data	2021-06-23 13:14:12 -07:00
Calixte Denizet	b836616667	XFA - Always bind root subform on root data - it partially fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1717805 (on the data side at least but there is still a layout issue).	2021-06-23 20:46:41 +02:00
Jonas Jenwald	6467907318	Support corrupt documents with empty `Name`-entries (issue 13610) Apparently some really bad PDF software can create documents with empty `Name`-entries, which we thus need to somehow deal with. While I don't know if this patch is necessarily the best solution, it should at least ensure that the empty `Name`-instance cannot accidentally match a proper `Name`-instance (and it doesn't require changes to a lot of existing code).[1] --- [1] I briefly considered using a `Symbol` rather than an Object, but quickly decided against that since the former one [is not clonable](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm#supported_types) and `Name`-instances may be sent to the API.	2021-06-22 16:55:44 +02:00
calixteman	56a75f8b26	Revert "Revert "XFA - Fix the way to select page on breaking"" - and fix the error which caused the backout: add an $extra property when creating html. - switch to next content area when breaking on page area.	2021-06-21 17:07:31 +02:00
calixteman	a9385bbb52	Revert "XFA - Fix the way to select page on breaking"	2021-06-21 15:45:04 +02:00
Calixte Denizet	7aea8faa34	XFA - Fix the way to select page on breaking - it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1716838. - some fonts in the pdf in the bug were bold when they shouldn't be, so write the font properties in the html to avoid using wrong inherited ones.	2021-06-21 12:45:23 +02:00
Calixte Denizet	7cdbc98716	XFA - Match font family correctly - partial fix for https://bugzilla.mozilla.org/show_bug.cgi?id=1716980; - some pdf can contain an invalid font family (e.g. 'Windings 3') so in this case remove the space; - the font family in typeface attribute doesn't always match the one defined in the FontDescriptor dictionary.	2021-06-20 15:16:28 +02:00
Calixte Denizet	df08b1548b	XFA - Fix layout issues - PR #13554 is buggy, so this patch aims to fix its bugs. - check if a component fits into its parent, taking into account the parent layout. - introduce method isSplittable for template nodes to know if a component can be split in case of overflow.	2021-06-17 16:09:22 +02:00
Calixte Denizet	8eeb7ab4a3	XFA - Add the possibility to layout and measure text - some containers don't always have their 2 dimensions and those dimensions are based on contents; - so in order to measure text, we must get the glyph widths (for the xfa fonts) before starting the layout; - implement a word-wrap algorithm; - handle font change during text layout.	2021-06-17 14:17:02 +02:00
Calixte Denizet	793a0156ce	XFA - By default a text ui has only one line when in a field element - it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1716809.	2021-06-16 20:18:29 +02:00
Calixte Denizet	d89c429d78	XFA - Handle maxChars property for text fields - it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1716294.	2021-06-14 13:07:06 +02:00
Brendan Dahl	d333af7848	Merge pull request #13527 from calixteman/bind_inf_loop XFA - Avoid infinite loop when creating some nodes in data	2021-06-09 12:37:29 -07:00
Brendan Dahl	aa2712744d	Merge pull request #13502 from calixteman/contentarea XFA - contentarea must be on top of the other containers in a pageArea	2021-06-09 12:36:21 -07:00
Calixte Denizet	cddc1d869d	XFA - Avoid infinite loop when creating some nodes in data	2021-06-09 19:07:59 +02:00
Jonas Jenwald	a01c599247	Cache the "raw" standard font data in the worker-thread (PR 12726 follow-up) This implementation is basically a copy of the pre-existing `builtInCMapCache` implementation. For some, badly generated, PDF documents it's possible that we'll end up having to fetch the same standard font data over and over (which is obviously inefficient). While not common, it's certainly possible that a PDF document uses custom font names where the actual font then references one of the standard fonts; see e.g. issue 11399 for one such example. Note that I did suggest adding worker-thread caching of standard font data in PR 12726, however it wasn't deemed necessary at the time. Now that we have a real-world example that benefits from caching, I think that we should simply implement this now.	2021-06-09 18:27:51 +02:00
Calixte Denizet	34a2fa72c7	XFA - Add Liberation-Sans font as a substitution for some missing fonts - Some js files contain scale factors for each glyph in order to rescale Liberation to have a final font with the correct width. - A lot of XFA have some containers where their dimensions are based on their text content, so using default font from browser can lead to an almost unreadable pdf.	2021-06-09 16:55:45 +02:00
Calixte Denizet	1486608f32	XFA - contentarea must be on top of the other containers in a pageArea	2021-06-09 15:29:29 +02:00
Calixte Denizet	cfa727474e	XFA - Fix layout issues (again) - some elements weren't displayed because their rotation angle was not taken into account; - fix box model (XFA concept): - remove use of outline; - position correctly border which isn't part of box dimensions; - fix margins issues (see issue #13474). - move border on button instead of having it on wrapping div;	2021-06-08 17:42:53 +02:00
Jonas Jenwald	e7dc822e74	Merge pull request #12726 from brendandahl/standard-fonts [api-minor] Include and use the 14 standard font files.	2021-06-08 10:09:40 +02:00
Brendan Dahl	4c1dd47e65	Include and use the 14 standard fonts files.	2021-06-07 11:10:11 -07:00
Calixte Denizet	5dc7f4ade8	XFA - CDATA can be xml so parse it when required	2021-06-07 10:38:39 +02:00
Calixte Denizet	112645ea3d	XFA - Don't bind a form node with an empty value when the data node doesn't exist	2021-06-06 17:59:01 +02:00
Calixte Denizet	11573ddd16	XFA - Implement usehref support - attribute 'use' was already implemented but not usehref - in general, usehref should make reference to current document - add support for SOM expressions in use and usehref to search a node. - get prototype for all nodes if any.	2021-06-04 14:57:05 +02:00
Jonas Jenwald	af78ba64bd	Don't change options of the globally used `PartialEvaluator` in the "should render checkbox with fallback font for printing" unit-test Given that the same `PartialEvaluator`-instance is used for a lot of these unit-tests, manually changing the options in any one test-case could lead to intermittently failing unit-tests since they're run in a random order. To fix this, we simply have to use the existing method to clone the `PartialEvaluator`-instance but with the custom options.	2021-05-31 12:14:58 +02:00
Calixte Denizet	45c3f00a27	XFA - Move the fake HTML representation of XFA from the worker to the main thread - the only goal of this patch is to be able to get synchronously the fake html when printing from firefox: - in order to print we need to inject some html in beforeprint callback but we cannot block in waiting for all the pages. - from a memory point of view: it doesn't change anything since the fake HTML is deleted in the worker; - this way we don't break any assumptions.	2021-05-25 19:33:07 +02:00
Calixte Denizet	7cebdbd58c	XFA - Fix a lot of layout issues - I thought it was possible to rely on the browser layout engine to handle layout stuff but it isn't possible - mainly because when a contentArea overflows, we must continue to layout in the next contentArea - when no more contentArea is available then we must go to the next page... - we must handle breakBefore and breakAfter which allow to "break" the layout to go to the next container - Sometimes some containers don't provide their dimensions so we must compute them in order to know where to put them in their parents but to compute those dimensions we need to layout the container itself... - See top of file layout.js for more explanations about layout. - fix a few bugs I encountered in other places during my work on layout.	2021-05-25 17:51:36 +02:00
Tim van der Meij	d1d9b9043d	Merge pull request #13415 from Snuffleupagus/getDestination-out-of-order Improve handling of named destinations in out-of-order NameTrees (PR 10274 follow-up)	2021-05-21 20:15:09 +02:00
Jonas Jenwald	8d5689387b	Improve handling of named destinations in out-of-order NameTrees (PR 10274 follow-up) According to the specification, see https://web.archive.org/web/20210404042322if_/https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.2384179, the keys of a NameTree/NumberTree should be ordered. For corrupt PDF files, which violate this assumption, it's thus possible that trying to lookup a single entry fails. Previously, in PR 10274, we implemented a fallback that only applies to the "bottom" node of a NameTree/NumberTree, which in general might not actually help for sufficiently corrupt NameTree/NumberTree data. Instead we remove the current limited fallback from `NameOrNumberTree.get`, and defer to the call-site to handle this case explicitly e.g. by using `NameOrNumberTree.getAll` for data where that makes sense. For well-formed documents, these changes should not lead to any additional data fetching/parsing. Finally, as part of these changes, the validation of named destination data is improved in the `Catalog` and a new unit-test is also added.	2021-05-21 15:48:37 +02:00
Jonas Jenwald	1a8d05fdcf	Remove some, with Prettier `2.3.0`, unnecessary `// prettier-ignore` comments To get the maximum benefit from something like Prettier, you obviously don't want to disable the automatic formatting unless absolutely necessary. When we added Prettier there were a number of cases, mostly involving larger Arrays, which required disabling of the automatic formatting for overall readability and/or to not break inline comments. With changes in Prettier version `2.3.0`, see [the release notes](https://prettier.io/blog/2021/05/09/2.3.0.html#concise-formatting-of-number-only-arrays-10106httpsgithubcomprettierprettierpull10106-10160httpsgithubcomprettierprettierpull10160-by-thorn0httpsgithubcomthorn0), there's now better formatting support for Arrays containing only numbers. Hence we can now remove a number of `// prettier-ignore` comments, and thus get the benefit of automatic formatting in (slightly) more of the code-base.	2021-05-19 11:36:03 +02:00
Calixte Denizet	4544ebf38a	Handle PI with no value in xml parser - an XML PI contains a target and optionally some content (see https://en.wikipedia.org/wiki/Processing_Instruction) - the parser expected to always have some content and so it could lead to wrong parsing.	2021-05-18 10:22:18 +02:00

832 Commits