pdf.js

Author	SHA1	Message	Date
calixteman	bbb64369f1	Merge pull request #13424 from calixteman/chunks2 [api-minor] Fix issues in text selection	2021-10-18 06:14:15 -07:00
Calixte Denizet	61d1063276	Fix issues in text selection - PR #13257 fixed a lot of issues but not all and this patch aims to fix almost all remaining issues. - the idea in this new patch is to compare position of new glyph with the last position where a glyph has been drawn; - no spaces are "drawn": it just moves the cursor but they aren't added in the chunk; - so this way a space followed by a cursor move can be treated as only one space: it helps to merge all spaces into one. - to distinguish between real spaces and tracking ones, we used a factor of the space width (from the font) - it was a pretty good idea in general but it fails with some fonts where space was too big: - in Poppler, they're using a factor of the font size: this is an excellent idea (<= 0.1 * fontSize implies tracking space).	2021-10-17 16:27:05 +02:00
Jonas Jenwald	00720d059a	[api-minor] Include the /Lang-property in the `documentInfo`, and use it in the viewer (issue 14110) Please note: This is a tentative patch, since I don't have the necessary a11y-software to actually test it. To avoid having to add a new API-method just for a single string, I figured that adding the new property to the existing `documentInfo`-data (accessed via `PDFDocumentProxy.getMetadata` in the API) will hopefully be deemed acceptable.	2021-10-16 14:27:47 +02:00
Jonas Jenwald	0041230072	Re-name the `XFAFactory.numberPages` getter to `XFAFactory.numPages` for consistency All other similar getters are called `numPages` throughout the code-base, and improved consistency should always be a good thing.	2021-10-16 12:56:21 +02:00
Jonas Jenwald	0e5348180e	Fix the inconsistent return type of the `PDFDocument.isPureXfa` getter Also (slightly) simplifies a couple of small getters/methods related to the `XFAFactory`-instance.	2021-10-16 12:56:20 +02:00
Jonas Jenwald	cd94a44ca1	Remove some duplication in simple shadowed getters in `src/core/`-code In these cases there's no good reason, in my opinion, to duplicate the `shadow`-lines since that unnecessarily increases the risk of simple typos (see the previous patch).	2021-10-16 12:56:17 +02:00
Jonas Jenwald	1450da4168	Fix a `xfaFaxtory` typo in the shadowing in the `PDFDocument.xfaFactory` getter With this typo the shadowing doesn't actually work, which causes these checks to be unnecessarily repeated. In this particular case it didn't have a significant performance impact, however we should definitely fix this nonetheless.	2021-10-16 11:54:12 +02:00
Jane-Kotovich	c2af309917	XFA - Embedded image is missing	2021-10-15 21:12:29 +10:00
Jay Berkenbilt	586295fad6	Implement TrueType character map "format 2" (fixes #14117) If a PDF included an embedded TrueType font whose preferred character map (cmap) was in "format 2", the code would select that character map and then refuse to read it because of an unsupported format, thus causing the characters not to be rendered. This commit implements support for format 2 as described at the link below. https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6cmap.html	2021-10-13 07:37:14 -04:00
Tim van der Meij	56e3ef68d4	Merge pull request #14106 from calixteman/names Empty name is allowed in ISO 32000	2021-10-09 14:29:10 +02:00
Jonas Jenwald	69a97bcba7	Take the /CIDToGIDMap data into account when computing the hash, in `PartialEvaluator.preEvaluateFont`, for composite fonts (bug 1734802) This is unfortunately yet another bug in the `preEvaluateFont`-implementation, and I've lost count of the number of times I've had to tweak this code over the years :-( I really cannot help thinking that PR 4423 was way too simplistic, since it missed a bunch of cases that lead to broken font rendering in many PDF documents. Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1734802	2021-10-08 13:15:21 +02:00
Calixte Denizet	f384ad2356	Empty name is allowed in ISO 32000 - the exact sentence from the spec: "The token SOLIDUS (a slash followed by no regular characters) introduces a unique valid name defined by the empty sequence of characters." - so just remove the warning.	2021-10-06 20:50:39 +02:00
Jonas Jenwald	bb9c905c5d	Ensure that various URL-related options are applied in the `xfaLayer` too Note how both the annotationLayer and the document outline will apply various URL-related options when creating the link-elements. For consistency the `xfaLayer`-rendering should obviously use the same options, to ensure that the existing options are indeed applied to all URLs regardless of where they originate.	2021-10-02 09:32:23 +02:00
Jonas Jenwald	284d259054	Merge pull request #14057 from Snuffleupagus/bug-920426 Support CMap-data with only strings, when parsing TrueType composite fonts (bug 920426)	2021-10-01 23:22:25 +02:00
Calixte Denizet	aecbd7cd89	AcroForm: Add support for ResetForm action - it aims to fix #12721. - Thanks to PR #14023, we've now the fieldObjects in the annotation layer so we can easily map fields names on their id if needed. - Reset values in the storage, in the JS sandbox and in the visible html elements.	2021-09-30 22:02:33 +02:00
Jonas Jenwald	d3ca28bc34	Support CMap-data with only strings, when parsing TrueType composite fonts (bug 920426) In the referenced bug, the embedded fonts contain custom CMap-data that only include strings. Note how for embedded composite TrueType fonts we're using the CMap-data when building the glyph mapping, and currently we end up with a completely empty map because the code expects only CID numbers. Furthermore, just fixing the glyph mapping alone isn't sufficient to fully address the bug, since we also need to consider this "special" kind of CMap-data when looking up glyph widths.	2021-09-30 18:10:47 +02:00
Tim van der Meij	9a74f3e6e0	Merge pull request #14049 from calixteman/bg_from_mk Annotation - Use border and background colors from MK dictionary	2021-09-29 21:13:20 +02:00
Calixte Denizet	0776cd9b90	Annotation - Use border and background colors from MK dictionary - it aims to fix #13003; - set the bg and fg colors as they're in the pdf; - put a transparent overlay to help to see the fields.	2021-09-26 20:49:26 +02:00
Jonas Jenwald	e6e04694f4	[api-minor] Move the `addDefaultProtocolToUrl`/`tryConvertUrlEncoding` functionality into the `createValidAbsoluteUrl` function Having recently worked with, and reviewed patches touching, this code it seemed that it's probably not a bad idea to move that functionality into `createValidAbsoluteUrl` as new options instead. For the `addDefaultProtocolToUrl` functionality in particular, the existing helper function was not only moved but slightly improved as well. Looking at the code, I realized that there's a small risk that it would incorrectly match a relative URL-string too. With these changes, the `createValidAbsoluteUrl` call-sites in the `src/core/`-code can be simplified a little bit. Please note: This patch may, indirectly, change the format of the `unsafeUrl`-property returned with relevant Annotations and OutlineItems; hence the `api-minor` tag. However, I'd argue that it's actually more correct this way since the whole purpose of `unsafeUrl` is/was to return the URL data as-is without any parsing done.	2021-09-26 14:29:54 +02:00
Calixte Denizet	558e58f354	XFA - Add <a> element in button when an url is detected (bug 1716758) - it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1716758; - some buttons have a JS action with the pattern `app.launchURL(...)` (or similar) so extract the url when possible and generate an <a> element with the href equal to the found url; - pdf.js already had some code to handle that so this patch slightly refactors that.	2021-09-25 21:59:39 +02:00
Calixte Denizet	c0e9108d00	Annotation - Some checkboxes have an empty N dictionary - it aims to fix #14021; - the N dict is empty here so just create a default one; - it implies that the checked checkbox has no appearance so create a default one too in order to print it; - in the pdf in the issue, a checked box is not printed because it has no default appearance so we need to guess its appearance from its state.	2021-09-25 16:00:47 +02:00
Tim van der Meij	cc110b8542	Merge pull request #14064 from Snuffleupagus/issue-13845 Fallback to font name matching, when checking for serif fonts (issue 13845)	2021-09-25 12:41:57 +02:00
Jonas Jenwald	b23b8d8a5d	Merge pull request #14074 from Snuffleupagus/issue-14046 [api-minor] Add basic support for RTL text-content in PopupAnnotations (issue 14046)	2021-09-25 12:37:44 +02:00
Tim van der Meij	36dc93fe5d	Merge pull request #14065 from Snuffleupagus/fewer-EXPORT_DATA_PROPERTIES [api-minor] Stop exporting, by default, a few additional Font properties (PR 11777 follow-up)	2021-09-25 12:25:56 +02:00
Jonas Jenwald	1dcd2f0cd3	[api-minor] Add basic support for RTL text-content in PopupAnnotations (issue 14046) In order to implement this, we utilize the existing `bidi` function to infer the text-direction of /T and /Contents entries. While this may not be perfect in cases where one PopupAnnotation mixes LTR and RTL languages, it should work well enough in most cases. To avoid having to add two new properties in lots of annotations, supplementing the existing `title`/`contents`-properties, this patch instead re-factors the existing code such that the properties are replaced by Objects (containing `str` and `dir`). Please note: In order to avoid breaking existing third-party implementations, `GENERIC`-builds of the PDF.js library will still provide the old `title`/`contents`-properties on annotations returned by `PDFPageProxy.getAnnotations`.	2021-09-25 09:18:58 +02:00
calixteman	104e049338	Merge pull request #14073 from calixteman/bindItems XFA - Bind items when there's a bindItems entry	2021-09-24 09:01:52 -07:00
Calixte Denizet	97c1e076a1	XFA - Bind items when there's a bindItems entry - In the pdf in issue #14071, some select fields don't contain any values; - the corresponding node has a bindItems and a bind element and the _bindItems function was just not called.	2021-09-24 16:08:58 +02:00
Calixte Denizet	cd73e282eb	XFA - Create a new page in case of overflow - it aims to fix #14071; - a subform is overflowing and the target in case of overflow is itself. In this case we must create a new page.	2021-09-24 14:57:55 +02:00
Calixte Denizet	4b0538d07a	Don't save anything in XFA entry if no XFA! (bug 1732344) - it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1732344 - rename some variables to make the code clearer; - and last but not least, add a unit test to test saving.	2021-09-23 19:51:23 +02:00
Jonas Jenwald	9acfe486d4	Fallback to font name matching, when checking for serif fonts (issue 13845) In order to handle fonts that specify completely bogus /Flags-entries, fallback to font name matching to determine if the font is a serif one.	2021-09-23 01:11:57 +02:00
Jonas Jenwald	e027748627	[api-minor] Stop exporting, by default, a few additional Font properties (PR 11777 follow-up) This is similar to the "isSymbolicFont"-property, which is no longer exported by default after PR 11777. Both "isMonospace" and "isSerifFont" are internal properties, used during font parsing and building of the glyph mapping on the worker-thread. However both of these properties are completely unused on the main-thread and/or in the API, and accessing them will now require setting the `fontExtraProperties`-option when calling `getDocument`.	2021-09-23 00:44:43 +02:00
Jonas Jenwald	81a1c1cef7	Correctly validate URLs in XFA documents (bug 1731240) With this patch we'll ensure that only valid absolute URLs can be used in XFA documents, similar to the existing validation done for "regular" PDF documents. Furthermore, we'll also attempt to add a default protocol (i.e. `http`) to URLs beginning with "www." in XFA documents as well; this on its own is enough to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1731240	2021-09-21 21:21:01 +02:00
Jonas Jenwald	8ea27ce157	Tweak how fonts with an /Encoding are handled in `adjustToUnicode` (issue 14048, PR 13277 follow-up) Currently we only exclude /Encoding entries that also contain a /Differences array, which is the cause of the text-selection problem in the referenced issue. In order to address this we'll now also exclude /Encoding entries that contain one of the predefined named encodings, and no longer require that it also contains a /Differences array. Please note: This patch causes a small "regression" in the `bug1130815-text` test-case, however this is actually an improvement when compared with Adobe Reader and PDFium (in Google Chrome).	2021-09-18 22:44:25 +02:00
Tim van der Meij	83d3bb43f4	Merge pull request #14041 from Snuffleupagus/issue-9367 Support cmaps with only CID characters, when building the ToUnicode-map (issue 9367)	2021-09-18 16:47:06 +02:00
Calixte Denizet	2fc10727c5	XFA - Only warn about the wrong xfa type when there is an xfa thing	2021-09-18 15:44:05 +02:00
Jonas Jenwald	e3223b68fc	Extract some of the glyphMap handling, for non-embedded composite standard fonts, into a helper function This reduces some unnecessary duplication, since we currently have essentially the same code in a handful of places in the `Font.fallbackToSystemFont`-method.	2021-09-18 12:39:48 +02:00
Jonas Jenwald	ed73cf6d50	Support cmaps with only CID characters, when building the ToUnicode-map (issue 9367) In this particular case the `CMap`-data that we create contains only numbers, but no strings, which causes `PartialEvaluator.readToUnicode` to create a ToUnicode-map with only empty strings. Please note: This is yet another case where I don't know if it's necessarily the best and most correct solution, but it does fix the referenced issue.	2021-09-18 00:26:15 +02:00
Calixte Denizet	5bef8120e7	Annotation - For checkboxes, get field value from AS (if any) instead of V (bug 1722036) - it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1722036. - AS and V should share the same value for checkbox: it's at least what the specs say; - the pdf in the above bug opens correctly in Acrobat so it likely means that AS is chosen over V.	2021-09-17 13:04:16 +02:00
Jonas Jenwald	a11343e9af	Improve glyph mapping for non-embedded composite standard fonts with a /CIDToGIDMap (issue 11915) Please note: All of this feels very handwavy, but at least it passes all tests locally. Hopefully we have enough tests for this part of the font code. For non-embedded composite standard fonts with an "incomplete" /CIDToGIDMap, we'll now fallback to an explicitly defined /ToUnicode map even when that one happens to be an /Identity-H or /Identity-V map. The `Font.fallbackToSystemFont` method is unfortunately getting more and more special-cases, however that might be unavoidable given all the weird non-embedded fonts found in the wild :-(	2021-09-15 11:30:40 +02:00
Calixte Denizet	9812e35916	XFA - Don't create images for unsupported mime types	2021-09-14 10:55:25 +02:00
Jonas Jenwald	7025b9f859	[src/core/writer.js] Support `null` values in the `writeValue` function This fixes something that I noticed, having recently looked at both the `Lexer.getObj` and `writeValue` code. Please note that I unfortunately don't have an example of a form where saving fails without this patch. However, given its overall simplicity and that unit-tests are added, it's hopefully deemed useful to fix this potential issue pro-actively rather than waiting for a bug report. At this point one might, and rightly so, wonder if there's actually any real-world PDF documents where a `null` value is being used? Unfortunately the answer is yes, and we have a couple of examples in the test-suite (although none of those are related to forms); please see: `issue1015`, `issue2642`, `issue10402`, `issue12823`, `issue13823`, and `pr12564`.	2021-09-12 18:24:37 +02:00
Jonas Jenwald	5d578ea36a	[src/core/writer.js] Remove unnecessary string-wrapping for boolean values in `writeValue` (PR 13998 follow-up)	2021-09-12 15:45:45 +02:00
Jonas Jenwald	761519ef3f	Merge pull request #13998 from calixteman/bug1729971 Write boolean value when saving a form (bug 1729971)	2021-09-12 15:38:10 +02:00
Jonas Jenwald	a47844d1fc	Let `Lexer.getObj` return a dummy-`Cmd` for commands that start with a non-visible ASCII character (issue 13999) This way we avoid breaking badly generated PDF documents where a non-visible ASCII character is "glued" to a valid command.	2021-09-11 19:54:13 +02:00
Tim van der Meij	e97f01b17c	Merge pull request #13977 from Snuffleupagus/enqueueChunk-batch [api-minor] Reduce `postMessage` overhead, in `PartialEvaluator.getTextContent`, by sending text chunks in batches (issue 13962)	2021-09-11 13:34:07 +02:00
Jonas Jenwald	9ce63a6dc6	Merge pull request #13991 from brendandahl/interpolate Enable/disable image smoothing based on image interpolate value. (bug 1722191)	2021-09-11 10:02:53 +02:00
Brendan Dahl	f38fb42b42	Enable/disable image smoothing based on image interpolate value. (bug 1722191) While some of the output looks worse to my eye, this behavior more closely matches what I see when I open the PDFs in Adobe Acrobat. Fixes: #4706, #9713, #8245, #1344	2021-09-10 14:23:35 -07:00
Calixte Denizet	474ab7c86d	Write boolean value when saving a form (bug 1729971) - it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1729971#c4.	2021-09-10 14:10:25 +02:00
Calixte Denizet	c5841b3794	XFA - Handle shortcut in SOM expression (issue #13994)	2021-09-09 19:54:45 +02:00
Jonas Jenwald	45ddb12f61	Remove no-op `onPull`/`onCancel` streamSink callbacks from the "GetTextContent"-handler The `MessageHandler`-implementation already handles either of these callbacks being undefined, hence there's no particular reason (as far as I can tell) to add no-op functions here. Also, in a couple of `MessageHandler`-methods, utilize an already existing local variable more.	2021-09-09 00:01:10 +02:00
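
Commit 61d1063276 describes comparing the position of each new glyph with the last position where a glyph was drawn, and treating a gap of at most 0.1 * fontSize as tracking rather than a real space. The sketch below illustrates that idea only; it is not the pdf.js implementation. The names `classifyGap` and `buildChunk`, and the `{ char, x, width }` glyph shape, are hypothetical and chosen for this example.

```javascript
// Threshold quoted in the commit message (borrowed from Poppler):
// a gap of at most 0.1 * fontSize is treated as tracking, not a space.
const TRACKING_SPACE_FACTOR = 0.1;

// Hypothetical helper: classify the horizontal gap between the end of the
// last drawn glyph and the start of the new one.
function classifyGap(lastGlyphEnd, newGlyphStart, fontSize) {
  const gap = newGlyphStart - lastGlyphEnd;
  if (gap <= TRACKING_SPACE_FACTOR * fontSize) {
    return "tracking";
  }
  return "space";
}

// Hypothetical helper: build a text chunk from positioned glyphs.
// Space glyphs are never appended directly; they only move the cursor, so
// several spaces (or a space plus a cursor move) collapse into one space
// that is emitted when the next visible glyph arrives.
function buildChunk(glyphs, fontSize) {
  let text = "";
  let lastEnd = null;
  for (const { char, x, width } of glyphs) {
    if (char === " ") {
      continue; // not "drawn": the gap is measured at the next glyph
    }
    if (lastEnd !== null && classifyGap(lastEnd, x, fontSize) === "space") {
      text += " ";
    }
    text += char;
    lastEnd = x + width;
  }
  return text;
}

const exampleGlyphs = [
  { char: "H", x: 0, width: 6 },
  { char: "i", x: 6.5, width: 3 }, // gap 0.5: tracking only
  { char: " ", x: 9.5, width: 3 },
  { char: " ", x: 12.5, width: 3 },
  { char: "y", x: 18, width: 6 }, // gap 8.5: one real space
];
console.log(buildChunk(exampleGlyphs, 12)); // "Hi y"
```

Measuring against the font size rather than the font's own space width avoids the failure mode the commit mentions, where a font declares an unusually wide space glyph.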

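Commits 1450da4168 and cd94a44ca1 concern "shadowed" getters: a getter computes its value once and then replaces itself with a plain data property of the same name. The sketch below uses a simplified stand-in for the `shadow` helper (an assumption, not a copy of the pdf.js source) and a hypothetical `ExampleDocument` class to show why a misspelled property name, like `xfaFaxtory`, silently disables the caching.

```javascript
// Simplified stand-in for a `shadow` helper: define an own data property
// on the instance, which then hides the prototype getter of the same name.
function shadow(obj, prop, value) {
  Object.defineProperty(obj, prop, {
    value,
    enumerable: true,
    configurable: true,
    writable: false,
  });
  return value;
}

// Hypothetical class used only for this illustration.
class ExampleDocument {
  constructor() {
    this.computeCount = 0;
  }

  // Correct: the shadowed name matches the getter name, so the getter
  // body runs only on the first access.
  get goodFactory() {
    this.computeCount++;
    return shadow(this, "goodFactory", { numPages: 3 });
  }

  // Typo: the shadowed name differs from the getter name, so the getter
  // is never replaced and its body runs on every access.
  get typoFactory() {
    this.computeCount++;
    return shadow(this, "typoFaxtory", { numPages: 3 });
  }
}

const good = new ExampleDocument();
good.goodFactory;
good.goodFactory;
console.log(good.computeCount); // 1

const typo = new ExampleDocument();
typo.typoFactory;
typo.typoFactory;
console.log(typo.computeCount); // 2
```

The result is still correct with the typo, only recomputed each time, which matches the commit's note that the bug had no significant performance impact in that case.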

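Commits e6e04694f4 and 81a1c1cef7 describe accepting only valid absolute URLs and optionally prefixing a default protocol (`http`) to URLs beginning with "www.". The sketch below shows that behaviour with the standard `URL` constructor. The function names, the option shape, and the protocol allow-list are assumptions made for illustration; they are not the actual `createValidAbsoluteUrl` signature.

```javascript
// Assumed allow-list for this sketch; the real list may differ.
const ALLOWED_PROTOCOLS = ["http:", "https:", "ftp:", "mailto:", "tel:"];

// Hypothetical helper: only strings starting with "www." get a default
// protocol, so relative URL strings are not matched by accident.
function prependDefaultProtocol(url) {
  return url.startsWith("www.") ? `http://${url}` : url;
}

// Hypothetical function: return a URL object for valid absolute URLs with
// an allowed protocol, otherwise null.
function toValidAbsoluteUrl(url, { addDefaultProtocol = false } = {}) {
  if (typeof url !== "string" || url.length === 0) {
    return null;
  }
  const candidate = addDefaultProtocol ? prependDefaultProtocol(url) : url;
  try {
    // Without a base URL, the constructor throws for relative input.
    const absolute = new URL(candidate);
    return ALLOWED_PROTOCOLS.includes(absolute.protocol) ? absolute : null;
  } catch {
    return null;
  }
}

console.log(toValidAbsoluteUrl("www.example.com")); // null
console.log(
  toValidAbsoluteUrl("www.example.com", { addDefaultProtocol: true }).href
); // "http://www.example.com/"
console.log(toValidAbsoluteUrl("javascript:alert(1)")); // null
```

Keeping the protocol prefixing behind an option lets callers that must return URL data as-is, such as the `unsafeUrl` case mentioned in e6e04694f4, skip it.
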
2509 Commits