Commit Graph

5832 Commits

Author SHA1 Message Date
Calixte Denizet
ae842e1c3a [api-minor] Annotations - Adjust the font size in text field in considering the total width (bug 1721335)
- it aims to fix #14502 and bug 1721335;
 - Acrobat and Pdfium do the same;
 - it avoids truncated data when printing;
 - change the factor used to compute the font size from the field height: lineHeight = 1.35*fontSize
  - this is the value used by Acrobat.
 - in order to avoid strings being truncated at the bottom, add a few basic metrics for the standard fonts.
2022-01-30 15:53:31 +01:00
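The relation quoted in commit ae842e1c3a above, lineHeight = 1.35*fontSize, can be inverted to derive a font size from the height of a field. The following is only a minimal sketch: the helper name is hypothetical, and padding, borders and the width-based adjustment mentioned in the commit title are ignored.

```javascript
// Factor taken from the commit message: lineHeight = 1.35 * fontSize.
const LINE_FACTOR = 1.35;

// Hypothetical helper (not the pdf.js implementation): derive a font size
// from the height available for a single line of text.
function fontSizeFromFieldHeight(fieldHeight) {
  return fieldHeight / LINE_FACTOR;
}

// Example: a field that is 27 units high yields a font size of about 20.
const exampleFontSize = fontSizeFromFieldHeight(27);
```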
Jonas Jenwald
7cc761a8c0 Polyfill structuredClone with core-js (PR 13948 follow-up)
This allows us to remove the manually implemented `structuredClone` polyfill, thus reducing the maintenance burden for the `LoopbackPort` class; refer to https://github.com/zloirock/core-js#structuredclone

*Please note:* While `structuredClone` support landed already in Firefox 94, Google Chrome only added it in version 98 (currently in Beta). However, given that the `LoopbackPort` will only be used together with *fake workers* in browsers this shouldn't be too much of a problem.[1]
For Node.js environments, where *fake workers* are unfortunately necessary, using a `legacy/`-build is already required which thus guarantees that the `structuredClone` polyfill is available.

Also, the patch updates core-js to the latest version since that one includes `structuredClone` improvements; please see https://github.com/zloirock/core-js/releases/tag/v3.20.3

---
[1] Given that we only support browsers with proper worker support, if *fake workers* are being used that essentially indicates a configuration problem/error.
2022-01-27 21:11:42 +01:00
Jonas Jenwald
8f6965b197
Merge pull request #14506 from Snuffleupagus/license_header_2022
Update the year in the `license_header` files
2022-01-27 19:34:56 +01:00
Jonas Jenwald
00bd549e82 Update the year in the license_header files
This also includes a couple of files that are included as-is in the `pdfjs-dist` library.
2022-01-27 19:24:31 +01:00
calixteman
838909f8c1
Merge pull request #14491 from quaoaris/lines-rendered-too-thick
fix for lines (stroke) are rendered too thick  (Bug 1743245)
2022-01-27 18:46:26 +01:00
Calixte Denizet
3a7004ca25 Take into account all rotations before comparing glyph positions
- it aims to fix #14497;
 - previously, only rotations with an angle 0, 90, 180 or 270 were taken into account;
 - so generalize to any angle but keep the fast path for 0, 90, ... because they're likely more common than anything else.
2022-01-26 17:19:00 +01:00
quaoaris
3f77d80f31 fix for lines (stroke) are rendered too thick (Bug 1743245)
This commit fixes Bug 1743245 (Grided PDF file lines rendered too thick), which was caused by a fix for #12868.
The lineWidth was set to round(1 * this._combinedScaleFactor) when the pixel is drawn as a parallelogram with a height <1. This fix changes this to floor(1 * this._combinedScaleFactor).

This change shows a visual result comparable to Chrome and Acrobat.
Regarding the last PR, 3 statements in canvas.js are affected and will change with this commit (stroke and paintChar).

renaming the reference files to naming convention
2022-01-25 10:27:30 +01:00
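The difference between the two rounding modes named in commit 3f77d80f31 above is easy to see with a concrete number. This is only an illustration: the scale factor below is a made-up value standing in for `this._combinedScaleFactor`.

```javascript
// Hypothetical scale factor, chosen only for illustration.
const combinedScaleFactor = 1.6;

// Previous behaviour described in the commit: rounds up to 2 here.
const roundedWidth = Math.round(1 * combinedScaleFactor);

// Behaviour after the fix: stays at 1 here, i.e. a thinner line.
const flooredWidth = Math.floor(1 * combinedScaleFactor);
```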
Jonas Jenwald
8836593b9e Add a (global) cache to the getCharUnicodeCategory function
Given that the regular expression has already become more complex (after the initial patch adding it), it seems to me that it probably cannot hurt to add a global cache to reduce unnecessary re-parsing.
Obviously the `Glyph`-instances are being cached *per* font, however in most documents multiple fonts are being used and in practice there's very often a fair amount of overlap between the /ToUnicode-data in different fonts[1].

Consider for example loading and rendering the entire `tracemonkey.pdf` document (from the test-suite), which isn't a particularly large document. In that case the `getCharUnicodeCategory` function is being called a total of `601` times, however there's only `106` *unique* unicode-chars being checked.

*Please note:* In practice I suppose that this won't have a *huge* effect on overall performance, however given the relative simplicity of this patch I figured that it'd not hurt to submit it for review.

---
[1] Consider e.g. how there's usually different fonts used for regular, bold, respectively italic text.
2022-01-25 09:59:34 +01:00
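The global cache described in commit 8836593b9e above amounts to memoizing the lookup in a `Map` that is shared by all fonts. A minimal sketch, assuming a simplified stand-in for the real regular-expression check (the function names and the returned fields are illustrative, not the pdf.js code):

```javascript
// Hypothetical stand-in for the real, more complex regular-expression check.
let regexRuns = 0;
function computeCategory(char) {
  regexRuns++;
  return {
    isWhitespace: /^\s$/u.test(char),
    isZeroWidthDiacritic: /^\p{Mn}$/u.test(char),
  };
}

// Global cache, shared across all fonts and keyed by the unicode character.
const categoryCache = new Map();

function getCharUnicodeCategorySketch(char) {
  let category = categoryCache.get(char);
  if (category === undefined) {
    category = computeCategory(char);
    categoryCache.set(char, category);
  }
  return category;
}

// Six lookups, but only three unique characters ("b", "a", "n"),
// so the regular expressions run three times.
for (const ch of "banana") {
  getCharUnicodeCategorySketch(ch);
}
```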
Calixte Denizet
e1d3a3b414 Remove the invisible format marks from the text chunks
- it aims to fix issue #9186.
2022-01-24 13:47:24 +01:00
calixteman
88236e1163
Merge pull request #14430 from calixteman/beforeinput
[JS] Use beforeinput event to trigger a keystroke event in the sandbox
2022-01-23 20:42:33 +01:00
Calixte Denizet
6ac296e48e [JS] Use beforeinput event to trigger a keystroke event in the sandbox
- it aims to fix issue #14307;
 - this event has been added recently in Firefox and we can now use it;
 - fix a few bugs in aform.js or in annotation_layer.js;
 - add some integration tests to test keystroke events (see `AFSpecial_Keystroke`);
 - make dispatchEvent in the quickjs sandbox async.
2022-01-23 19:53:01 +01:00
Tim van der Meij
23b6fde9fc
Merge pull request #14464 from Snuffleupagus/issue-14462
Support Type1 font files with incomplete /CharStrings definitions (issue 14462)
2022-01-19 20:38:46 +01:00
calixteman
b0231cc887
Merge pull request #14456 from calixteman/1749563
Font renderer - get int8 instead of uint8 in composite glyphes (bug 1749563)
2022-01-19 01:20:49 -08:00
Calixte Denizet
74f25d2755 Font renderer - get int8 instead of uint8 in composite glyphes (bug 1749563)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1749563;
 - use some helper functions to get (u|i)int** values from the buffer: it makes the code clearer;
 - in composite glyphs the translation values used with a transformation are signed, so consequently get int8 instead of uint8;
 - add a few TODOs.
2022-01-18 22:06:23 +01:00
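Why the signedness matters in commit 74f25d2755 above: the same byte yields very different values depending on how it is read. The sketch below uses the standard `DataView` accessors rather than the helper functions the commit mentions, which are not shown in this log.

```javascript
// One byte with all bits set, as it might appear in a font buffer.
const fontBytes = new Uint8Array([0xff]);
const fontView = new DataView(fontBytes.buffer);

// Read as unsigned: 255, i.e. a large positive translation.
const asUnsigned = fontView.getUint8(0);

// Read as signed: -1, i.e. a small negative translation.
const asSigned = fontView.getInt8(0);

// The same conversion without a DataView, using sign extension.
const manualSigned = (fontBytes[0] << 24) >> 24;
```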
Jonas Jenwald
a13ae5d97d Support Type1 font files with incomplete /CharStrings definitions (issue 14462)
Please refer to https://www.pdfa.org/norm-refs/Type1Fonts.pdf#page=15 for the expected format for the /CharStrings entries.
In the referenced PDF document the /CharStrings are missing the expected end-token, which causes us to swallow the start of the next glyph name.
2022-01-17 18:55:22 +01:00
Jonas Jenwald
ba37d600d7 Make the normalizeWhitespace handling, in the PartialEvaluator, more efficient (PR 14428 follow-up)
After the changes in PR 14428 we can *directly*, and more efficiently, handle whitespace conversion in `PartialEvaluator.getTextContent` when the `normalizeWhitespace` option is being used.
This way we no longer need a separate helper function for this, and can avoid having to (again) iterate through the text and check each character. Finally, this also removes the need for using a regular expression on e.g. all non-ASCII text.
2022-01-16 08:29:21 +01:00
calixteman
da953f4b64
Merge pull request #14428 from calixteman/typo
Use the correct dimension to know if we have to add an EOL in vertical mode
2022-01-15 12:47:10 -08:00
Calixte Denizet
9dae421a0d Handle all the whitespaces the same way when creating text chunks
2022-01-15 21:44:00 +01:00
Tim van der Meij
922dac035c
Merge pull request #14448 from Snuffleupagus/Type3-circular-refs
Prevent circular references in Type3 fonts
2022-01-15 14:11:47 +01:00
Tim van der Meij
a72d188599
Merge pull request #14439 from Snuffleupagus/issue-14438
Ignore Annotations with empty /Rect-entries in the display-layer (issue 14438)
2022-01-15 14:11:25 +01:00
Tim van der Meij
c0d2932faf
Merge pull request #14454 from Snuffleupagus/util-more-unreachable
Replace some `assert` usage with `unreachable` in the `src/shared/util.js` file
2022-01-15 13:52:10 +01:00
Tim van der Meij
625f829842
Merge pull request #14446 from Snuffleupagus/issue-14435
Expose even more API-functionality in the TypeScript definitions (issue 14435, PR 14013 follow-up)
2022-01-15 13:46:11 +01:00
Jonas Jenwald
0e1b93bf20 Replace some assert usage with unreachable in the src/shared/util.js file
Inlining the checks should be a *tiny bit* more efficient, since it avoids having to make *unconditional* function calls in these fairly commonly used helper functions.
2022-01-15 13:01:25 +01:00
Jonas Jenwald
12d8f0b64d Re-factor the stringToPDFString helper function for UTF-16 strings
This patch changes the function to instead utilize the `TextDecoder` for both kinds of UTF-16 BOM strings.
2022-01-14 20:38:40 +01:00
Jonas Jenwald
76444888fb Add (basic) UTF-8 support in the stringToPDFString helper function (issue 14449)
This patch implements this by looking for the UTF-8 BOM, i.e. `\xEF\xBB\xBF`, in order to determine the encoding.[1]
The actual conversion is done using the `TextDecoder` interface, which should be available in all environments/browsers that we support; please see https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder#browser_compatibility

---
[1] Assuming that everything lacking a UTF-16 BOM would have to be UTF-8 encoded really doesn't seem correct.
2022-01-14 18:57:07 +01:00
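The BOM-based detection described in commit 76444888fb above can be sketched as follows. The helper name and the `null` return for unmarked strings are assumptions made for this example; the actual `stringToPDFString` function handles more cases than shown here.

```javascript
const UTF8_BOM = "\xEF\xBB\xBF";

// Hypothetical helper, not the actual pdf.js function. The input is a
// "binary string", i.e. every char code represents one byte.
function decodeIfUtf8(str) {
  if (!str.startsWith(UTF8_BOM)) {
    // No UTF-8 BOM: do not assume UTF-8, let the caller decide.
    return null;
  }
  const bytes = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i++) {
    bytes[i] = str.charCodeAt(i) & 0xff;
  }
  // With its default options, TextDecoder drops the leading BOM.
  return new TextDecoder("utf-8").decode(bytes);
}

// "\xC3\xA9" is the UTF-8 encoding of U+00E9 (e with acute accent).
const decodedExample = decodeIfUtf8("\xEF\xBB\xBF\xC3\xA9");
```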
Jonas Jenwald
53d4ee7990 Prevent circular references in Type3 fonts
In corrupt PDF documents Type3 fonts may introduce circular dependencies, thus resulting in the affected font(s) never loading and parsing/rendering never completing.
Note that I've not seen any real-world examples of this kind of font corruption, but the attached PDF document was rather found in https://github.com/pdf-association/safedocs/tree/main/Miscellaneous%20Targeted%20Test%20PDFs

*Please note:* That repository contains a number of reduced test-cases that are specifically intended to test interoperability (between PDF viewers) and parsing/rendering for various kinds of strange/corrupt PDF documents.
Some of the test-cases found there may thus not make sense to try and "fix" upfront, in my opinion, unless the problems are also found in real-world PDF documents.
2022-01-13 17:58:37 +01:00
Jonas Jenwald
b9849e38b8 Expose even more API-functionality in the TypeScript definitions (issue 14435, PR 14013 follow-up)
While `PageViewport` apparently makes sense in TypeScript environments, given that it's being returned by the `PDFPageProxy.getViewport`-method in the API, we really don't want to extend the *public* API by simply exporting the class directly in `src/pdf.js` since it should never be called/initialized manually.
Hence we follow the same pattern as in PR 14013, and also extend the API unit-tests to ensure that `PDFPageProxy.getViewport` always returns a `PageViewport`-instance as expected.
2022-01-13 12:05:40 +01:00
Jonas Jenwald
08d88a0235 Ignore Annotations with empty /Rect-entries in the display-layer (issue 14438)
This prevents the `BaseSVGFactory.create`-method from throwing, and thus preventing any remaining Annotations (on the page) from rendering in corrupt documents.
2022-01-11 13:54:35 +01:00
Tim van der Meij
8ac0ccc227
Merge pull request #14424 from Snuffleupagus/mv-addLinkAttributes
[api-minor] Move `addLinkAttributes`, `LinkTarget`, and `removeNullCharacters` into the viewer (PR 14092 follow-up)
2022-01-08 13:19:11 +01:00
Calixte Denizet
6369617e6f [JS] Fix few errors around AFSpecial_Keystroke
- @cincodenada found some errors which are fixed in this patch;
 - it partially fixes issue #14306;
 - add some tests.
2022-01-08 12:34:56 +01:00
Calixte Denizet
9bb636402a Use the correct dimension to know if we have to add an EOL in vertical mode
2022-01-07 15:19:03 +01:00
Jonas Jenwald
7b8794b37e [api-minor] Move removeNullCharacters into the viewer
This helper function has never been used in e.g. the worker-thread, hence its placement in `src/shared/util.js` led to a *small* amount of unnecessary duplication.
After the previous patches this helper function is now *only* used in the viewer, hence it no longer seems necessary to expose it through the official API.

*Please note:* It seems somewhat unlikely that third-party users were relying *directly* on this helper function, which is why it's not being exported as part of the viewer components. (If necessary, we can always change this later on.)
2022-01-06 12:25:33 +01:00
Jonas Jenwald
2d2b6463b8 [api-minor] Move addLinkAttributes and LinkTarget into the viewer
As part of the changes/improvement in PR 14092, we're no longer using the `addLinkAttributes` directly in e.g. the AnnotationLayer-code.
Given that the helper function is now *only* used in the viewer, it no longer seems necessary to expose it through the official API.

*Please note:* It seems somewhat unlikely that third-party users were relying *directly* on the helper function, which is why it's not being exported as part of the viewer components. (If necessary, we can always change this later on.)
2022-01-06 12:25:33 +01:00
Calixte Denizet
6cdae5ac4d Use positive dimensions for text chunks in the text layer (issue #14415).
2022-01-05 10:49:56 +01:00
Jonas Jenwald
b0e774d9c5 Convert Catalog.getAllPageDicts to an async method
The patch in PR 14335 *essentially* re-introduced the old code from before PR 3848, however looking at this code a bit closer it should be possible to simplify it by making the method asynchronous.

While this method is currently only used as a *fallback* in corrupt documents, the way that `MissingDataException`s are handled is less than ideal. Note that if a `MissingDataException` is thrown, we're forced to re-parse the *entire* /Pages tree[1].
With this method now being asynchronous, we're able to handle fetching of References in a *much* easier/nicer way than before without having to throw `MissingDataException`s and re-parse anything.
These changes also let us simplify the call-site slightly, by calling the method *directly* instead of using the `PDFManager`-instance (since again it will no longer throw `MissingDataException`s).

Furthermore, this patch contains the following other changes:
 - Reduce unnecessary duplication in the various `catch` handlers throughout the method, by simply moving the `XRefEntryException` handling into the `addPageError` helper function instead.
 - Move the "circular references"-check to occur slightly earlier, since there's obviously no point in asynchronously fetching data just to then throw an Error *immediately* afterwards.

---
[1] Imagine e.g. a thousand page document, where there's a `MissingDataException` thrown when fetching/parsing page 900.
2021-12-31 22:03:10 +01:00
Jonas Jenwald
1491459dea Improve caching for the Catalog.getPageIndex method (PR 13319 follow-up)
This method is now being used a lot more, compared to when it was added, since it's now used together with scripting as part of the `PDFDocument.fieldObjects` parsing (called during viewer initialization).
For /Page Dictionaries that we've already parsed, the `pageIndex` corresponding to a particular Reference is already known and we're thus able to skip *all* parsing in the `Catalog.getPageIndex` method for those cases.
2021-12-29 20:29:14 +01:00
Jonas Jenwald
a20393e6e4 Update PDFDocument._getLinearizationPage to do the /Type-check correctly (PR 14400 follow-up)
I forgot about this in PR 14400, since we should obviously be consistent *and* given that the existing check is actually wrong; sorry about this!
2021-12-29 13:26:58 +01:00
Tim van der Meij
e42d54e1b5
Merge pull request #14400 from Snuffleupagus/getPageDict-async
[api-minor] Convert `Catalog.getPageDict` to an asynchronous method
2021-12-28 19:40:34 +01:00
Jonas Jenwald
b513c64d9d [api-minor] Convert Catalog.getPageDict to an asynchronous method
Besides converting `Catalog.getPageDict` to an `async` method, thus simplifying the code, this patch also allows us to pro-actively fix an existing issue.
Note how we're looking up References in such a way that `MissingDataException`s won't cause trouble, however it's *technically possible* that the entries (i.e. /Count, /Kids, and /Type) in a /Pages Dictionary could actually be indirect objects as well. In the existing code this could lead to *some*, or even all, pages failing to load/render as intended.
In practice that doesn't *appear* to happen in real-world PDF documents, but given all the weird things that PDF software does I'd prefer to fix this pro-actively (rather than waiting for a bug report).
With `Catalog.getPageDict` being `async` this is now really simple to address, however I didn't want to introduce a bunch more *unconditional* asynchronicity in this method if it could be avoided (since that could slow things down). Hence we'll *synchronously* lookup the *raw* data in a /Pages Dictionary, and only fallback to asynchronous data lookup when a Reference was encountered.

In addition to the above, this patch also makes the following notable changes:
 - Let `Catalog.getPageDict` *consistently* reject with the actual error, regardless of what data we're fetching. Previously we'd "swallow" the actual errors except when looking up Dictionary entries, which is inconsistent and thus seems unfortunate. As can be seen from the updated unit-tests this change is API-observable, hence why the patch is tagged `[api-minor]`.

 - Improve the consistency of the Dictionary /Type-checks in both the `Catalog.getPageDict` and `Catalog.getAllPageDicts` methods.
   In `Catalog.getPageDict` there's a fallback code-path where we're *incorrectly* checking the /Page Dictionary for a /Contents-entry, which is wrong since a /Page Dictionary doesn't need to have a /Contents-entry in order to be valid.
   For consistency the `Catalog.getAllPageDicts` method is also updated to handle errors in the /Type-lookup correctly.

 - Reduce the `PagesCountLimit.PAUSE_EAGER_PAGE_INIT` viewer constant, to further improve loading/rendering performance of the *second* page during initialization of very long documents; PR 14359 follow-up.
2021-12-25 15:22:48 +01:00
KouWakai
98158b67a3 Handle non-integer Annotation border widths correctly (issue 14203)
The existing code appears to be wrong, since according to the PDF specification the border width of an Annotation only has to be a number and not specifically an integer. Please see:
 - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=392
 - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2096210
 - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G6.1965562
2021-12-24 22:10:19 +09:00
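The point of commit 98158b67a3 above is that a border width such as `0.5` is a valid number even though it is not an integer. A minimal sketch of the two kinds of check, assuming a hypothetical validator name and a fallback of 1 for invalid input:

```javascript
// Hypothetical validator (not the pdf.js code): accept any non-negative
// finite number as a border width, not only integers.
function parseBorderWidth(width) {
  if (typeof width === "number" && Number.isFinite(width) && width >= 0) {
    return width;
  }
  return 1; // Assumed fallback for invalid input.
}

// An integer-only check rejects a perfectly valid width.
const rejectedByIntegerCheck = !Number.isInteger(0.5);

// The number-based check keeps it.
const acceptedWidth = parseBorderWidth(0.5);
```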
Jonas Jenwald
e0dba504d2 Fix broken/missing JSDocs and typedefs, to allow updating TypeScript to the latest version (issue 14342)
This patch circumvents the issues seen when trying to update TypeScript to version `4.5`, by "simply" fixing the broken/missing JSDocs and `typedef`s such that `gulp typestest` now passes.
As always, given that I don't really know anything about TypeScript, I cannot tell if this is a "correct" and/or proper way of doing things; we'll need TypeScript users to help out with testing!

*Please note:* I'm sorry about the size of this patch, but given how intertwined all of this unfortunately is it just didn't seem easy to split this into smaller parts.
However, one good thing about this TypeScript update is that it helped uncover a number of pre-existing bugs in our JSDocs comments.
2021-12-15 23:14:25 +01:00
Tim van der Meij
d3e1d7090a
Merge pull request #14370 from Snuffleupagus/getPageDict-sync-Pages
Slightly reduce asynchronicity in the `Catalog.getPageDict` method (PR 14338 follow-up)
2021-12-15 19:40:39 +01:00
Jonas Jenwald
760f765e56 Move the /Lang handling into the BaseViewer (PR 14114 follow-up)
In PR 14114 this was only added to the default viewer, which means that in the viewer components the user would need to *manually* implement /Lang handling. This was (obviously) a bad choice, since the viewer components already support e.g. structTrees by default; sorry about overlooking this!

To avoid having to make *two* `getMetadata` API-calls[1] very early during initialization, in the default viewer, the API will now cache its result. This will also come in handy elsewhere in the default viewer, e.g. by reducing parsing when opening the "document properties" dialog.

---
[1] This not only includes a round-trip to the worker-thread, but also having to re-parse the /Metadata-entry when it exists.
2021-12-14 13:19:05 +01:00
Jonas Jenwald
fa51fd9428 Slightly reduce asynchronicity in the Catalog.getPageDict method (PR 14338 follow-up)
After the changes in PR 14338, specifically in the `XRef.parse`-method, the /Pages-entry will now always have been fetched/validated when the `Catalog`-instance is created.
Hence we can directly access the /Pages-entry in `Catalog.getPageDict` and thus avoid *one* asynchronous data-lookup per page in the document. (In practice this is unlikely to show up in e.g. benchmarks, but it really cannot hurt.)

Finally, make sure that the `getPageDict`/`getAllPageDicts`-methods track the /Pages-tree reference correctly to prevent circular references in corrupt documents.
2021-12-13 21:18:06 +01:00
Tim van der Meij
a6dd39b645
Merge pull request #14358 from Snuffleupagus/checkLastPage-improvements
Improve `PDFDocument.checkLastPage`/`Catalog.getAllPageDicts` for documents with corrupt XRef tables (PR 14311, 14335 follow-up)
2021-12-11 13:07:54 +01:00
Tim van der Meij
70809a80ce
Merge pull request #14355 from Snuffleupagus/api-page-caches-Map
Change `WorkerTransport.{pageCache, pagePromises}` from an Array to a Map
2021-12-11 13:00:11 +01:00
Jonas Jenwald
70ac6b1694 Update Catalog.getAllPageDicts to always propagate the actual Errors (PR 14335 follow-up)
Rather than "swallowing" the actual Errors, when data fetching fails, ensure that they're always being propagated as intended to the call-site instead.
Note that we purposely handle `XRefEntryException` specially, to make it possible to fallback to indexing all XRef objects.
2021-12-10 15:22:36 +01:00
Jonas Jenwald
47f9eef584 Improve PDFDocument.checkLastPage for documents with corrupt XRef tables (PR 14311, 14335 follow-up)
Rather than trying, and failing, to fetch the entire /Pages-tree for documents with corrupt XRef tables, let's fallback to indexing all objects *before* trying to invoke the `Catalog.getAllPageDicts` method.
2021-12-10 11:45:09 +01:00
Jonas Jenwald
f39536a30b Change WorkerTransport.pagePromises from an Array to a Map
Given that not all pages necessarily are being accessed, or that the pages may be accessed out of order, using a `Map` seems like a more appropriate data-structure here.

Finally, this also changes the `pagePromises` to a *private* property since it's not supposed to be accessed from the "outside".
2021-12-09 15:30:10 +01:00
Jonas Jenwald
c5525dcb69 Change WorkerTransport.pageCache from an Array to a Map
Given that not all pages necessarily are being accessed, or that the pages may be accessed out of order, using a `Map` seems like a more appropriate data-structure here.
For one thing, this simplifies iteration since we no longer have to worry about/check if `pageCache`-entries are undefined (which will happen for *sparse* `Array`s).

Of particular note is that we're no longer attempting to "null" the `pageCache`-entry from within the `PDFPageProxy._destroy`-method. Given that *synchronous* JavaScript will always run to completion[1] and that we're looping through all pages in `WorkerTransport.destroy` and immediately clear the cache afterwards, that code did/does not really make a lot of sense (as far as I can tell).

Finally, this also changes the `pageCache` to a *private* property since it's not supposed to be accessed from the "outside".

---
[1] Unless there are errors, of course.
2021-12-09 15:29:47 +01:00
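The iteration argument made in commit c5525dcb69 above can be illustrated with plain JavaScript. The variable names are chosen for this example only and do not mirror the `WorkerTransport` internals.

```javascript
// Only the sixth page (index 5) has been accessed so far.
const sparseCache = [];
sparseCache[5] = "page 6";

// An index-based loop over a sparse Array also visits the five holes,
// so every entry has to be checked for undefined.
let undefinedEntries = 0;
for (let i = 0; i < sparseCache.length; i++) {
  if (sparseCache[i] === undefined) {
    undefinedEntries++;
  }
}

// A Map only contains the entries that were actually added.
const mapCache = new Map();
mapCache.set(5, "page 6");
const cachedPages = [...mapCache.values()];
```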
Jonas Jenwald
8a05db230e Further improve caching in Catalog.getPageDict, for disableAutoFetch mode (PR 8207 follow-up)
PR 8207 added caching to improve the performance of `Catalog.getPageDict`, by not having to repeatedly fetch the same data and also reducing the asynchronicity of that method.
However, because of *another* oversight on my part, we're only caching /Page references once we've found the correct page. As long as all pages are loaded *in order* this doesn't really matter (happens by default in the viewer), but when `disableAutoFetch` is used the pages may be fetched in a more random order (this patch reduces the asynchronicity of `Catalog.getPageDict` slightly in that case).
2021-12-09 12:54:49 +01:00
Tim van der Meij
97dc048e56
Merge pull request #14350 from Snuffleupagus/ccitt-infinite-loop
Prevent an infinite loop when parsing corrupt /CCITTFaxDecode data (issue 14305)
2021-12-08 20:01:21 +01:00
Jonas Jenwald
e8562173b8 Prevent an infinite loop when parsing corrupt /CCITTFaxDecode data (issue 14305)
Fixes one of the documents in issue 14305.
2021-12-07 13:57:25 +01:00
Jonas Jenwald
5f295ba280 Improve caching in Catalog.getPageDict (PR 8207 follow-up)
PR 8207 added caching to improve the performance of `Catalog.getPageDict`, by not having to repeatedly fetch the same data and also reducing the asynchronicity of that method.
However, because of annoying off-by-one errors[1] the caching became less efficient than it could/should be.[2] Note here that the /Pages-tree is zero-indexed, and that e.g. `pageIndex = 5` thus corresponds to the *sixth* page of the document.

---
[1] In particular the `currentPageIndex + count < pageIndex` part.

[2] For example, even when loading a relatively small/simple document such as `tracemonkey.pdf` in the viewer, the number of `xref.fetchAsync(currentNode)` calls is reduced from `56` to `44` with this patch.
2021-12-06 11:49:31 +01:00
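The off-by-one issue from commit 5f295ba280 above is easiest to see on a toy tree. The sketch below is not the `Catalog.getPageDict` code: it uses a hypothetical in-memory tree where inner nodes carry `kids` and `count` and leaves carry a `name`. A subtree that starts at `currentPageIndex` covers the zero-based indices up to `currentPageIndex + count - 1`, so it can be skipped whenever `currentPageIndex + count <= pageIndex`.

```javascript
// Hypothetical lookup in a simplified page tree (zero-based pageIndex).
function findPage(root, pageIndex) {
  const stack = [root];
  let currentPageIndex = 0;

  while (stack.length > 0) {
    const node = stack.pop();
    if (!node.kids) {
      // Leaf, i.e. a page.
      if (currentPageIndex === pageIndex) {
        return node;
      }
      currentPageIndex++;
      continue;
    }
    // Skip the whole subtree if the wanted page lies beyond it.
    if (currentPageIndex + node.count <= pageIndex) {
      currentPageIndex += node.count;
      continue;
    }
    // Push the kids in reverse so that they are visited in document order.
    for (let i = node.kids.length - 1; i >= 0; i--) {
      stack.push(node.kids[i]);
    }
  }
  return null;
}

const leaf = name => ({ name });
const pageTree = {
  count: 6,
  kids: [
    { count: 3, kids: [leaf("p0"), leaf("p1"), leaf("p2")] },
    { count: 3, kids: [leaf("p3"), leaf("p4"), leaf("p5")] },
  ],
};

// pageIndex = 3 is the fourth page; the first subtree is skipped entirely.
const fourthPage = findPage(pageTree, 3);
```

In this sketch, writing `<` instead of `<=` would still return the right page, but the first subtree would be walked leaf by leaf instead of being skipped.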
Tim van der Meij
335c4c8a43
Merge pull request #14338 from Snuffleupagus/XRef-more-Pages-validation
[api-minor] Clear all caches in `XRef.indexObjects`, and improve /Root dictionary validation in `XRef.parse` (issue 14303)
2021-12-04 13:23:40 +01:00
Tim van der Meij
3117985c55
Merge pull request #14340 from Snuffleupagus/Metadata-fetch-error
Handle errors when fetching the raw /Metadata (issue 14305)
2021-12-04 13:19:37 +01:00
Jonas Jenwald
d9fac34596 Ensure that the shadow helper function is passed a valid property (PR 14152 follow-up)
Trying to shadow a non-existent property is always an implementation mistake, since it leads to the `shadow`-call not having any effect.

In PR 14152 I overlooked the fact that it's fairly easy to enforce this during development/testing, since that can help catch e.g. simple spelling bugs.
2021-12-04 10:07:21 +01:00
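The check described in commit d9fac34596 above can be sketched as follows. This is an illustrative re-implementation under assumptions, not the pdf.js source: the function name is hypothetical and the check here runs unconditionally, whereas the commit describes enforcing it during development/testing.

```javascript
// Hypothetical sketch of a "shadow" helper: replace a (lazy) getter with a
// plain, already computed value on the instance.
function shadowSketch(obj, prop, value) {
  if (!(prop in obj)) {
    // Shadowing a non-existent property would silently have no effect.
    throw new Error(`shadow: Property "${prop}" not found in object.`);
  }
  Object.defineProperty(obj, prop, {
    value,
    enumerable: true,
    configurable: true,
    writable: false,
  });
  return value;
}

let computations = 0;
class LazyDoc {
  get numPages() {
    computations++; // Stand-in for an expensive computation.
    return shadowSketch(this, "numPages", 42);
  }
}

const lazyDoc = new LazyDoc();
const firstRead = lazyDoc.numPages; // Runs the getter once.
const secondRead = lazyDoc.numPages; // Served from the shadowed value.
```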
Jonas Jenwald
40291d1943 Handle errors when fetching the raw /Metadata (issue 14305)
Currently the `Catalog.metadata` getter only handles errors during parsing, however in a *corrupt* PDF document fetching of the raw /Metadata can obviously fail as well.
Without this patch the `PDFDocumentProxy.getMetadata` method, in the API, can thus fail which it *never* should and this will cause the viewer to not initialize all state as expected.

Fixes one of the documents in issue 14305.
2021-12-04 09:41:42 +01:00
Jonas Jenwald
ad3a271fc4 [api-minor] Clear all caches in XRef.indexObjects, and improve /Root dictionary validation in XRef.parse (issue 14303)
*This patch improves handling of a couple of PDF documents from issue 14303.*

 - Update `XRef.indexObjects` to actually clear *all* XRef-caches. Invalid XRef tables *usually* cause issues early enough during parsing that we've not populated the XRef-cache, however to prevent any issues we obviously need to clear that one as well.

 - Improve the /Root dictionary validation in `XRef.parse` (PR 9827 follow-up). In addition to checking that a /Pages entry exists, we'll now also check that it can be successfully fetched *and* that it's of the correct type. There's really no point trying to use a /Root dictionary that e.g. `Catalog.toplevelPagesDict` will reject, and this way we'll be able to fallback to indexing the objects in corrupt documents.

 - Throw an `InvalidPDFException`, rather than a general `FormatError`, in `XRef.parse` when no usable /Root dictionary could be found. That really seems more appropriate overall, since all attempts at parsing/recovery have failed. (This part of the patch is API-observable, hence the tag.)

With these changes, two existing test-cases are improved and the unit-tests are updated/re-factored to highlight that. In particular `GHOSTSCRIPT-698804-1-fuzzed.pdf` will now both load and "render" correctly, whereas `poppler-395-0-fuzzed.pdf` will now fail immediately upon loading (rather than *appearing* to work).
2021-12-03 11:57:38 +01:00
Jonas Jenwald
1fac6371d3 [Regression] Eagerly fetch/parse the entire /Pages-tree in corrupt documents (issue 14303, PR 14311 follow-up)
*Please note:* This is similar to the method that existed prior to PR 3848, but the new method will *only* be used as a fallback when parsing of corrupt PDF documents.

The implementation in PR 14311 unfortunately turned out to be *way* too simplistic, as evidenced by the recently added test-files in issue 14303, since it may *cause* infinite loops in `PDFDocument.checkLastPage` for some corrupt PDF documents.[1]
To avoid this, the easiest solution that I could come up with was to fallback to eagerly parsing the *entire* /Pages-tree when the /Count-entry validation fails during document initialization.

Fixes *at least* two of the issues listed in issue 14303, namely the `poppler-395-0.pdf...` and `GHOSTSCRIPT-698804-1.pdf...` documents.

---
[1] The whole point of PR 14311 was obviously to *get rid of* infinite loops during document initialization, not to introduce any more of those.
2021-12-02 14:31:04 +01:00
Jonas Jenwald
e045cd4520 Remove the unused skipCount parameter from Catalog.getPageDict (PR 14311 follow-up)
This was added in PR 14311, but given that I completely forgot to update the `PDFDocument.getPage` signature accordingly it's completely unused.
Given that things work just as fine as-is, let's simply remove that optional parameter for now; sorry about the churn here!
2021-12-02 11:51:38 +01:00
Jonas Jenwald
63be23f05b Handle errors correctly when data lookup fails during /Pages-tree parsing (issue 14303)
This only applies to severely corrupt documents, where it's possible that the `Parser` throws when we try to access e.g. a /Kids-entry in the /Pages-tree.

Fixes two of the issues listed in issue 14303, namely the `poppler-742-0.pdf...` and `poppler-937-0.pdf...` documents.
2021-12-02 10:54:40 +01:00
Jonas Jenwald
a807ffe907 Prevent circular references in XRef tables from hanging the worker-thread (issue 14303)
*Please note:* This patch on its own is sufficient to prevent the worker-thread from hanging; in combination with PR 14311, these PDF documents will both load *and* render correctly.

Rather than focusing on the particular structure of these PDF documents, it seemed (at least to me) to make sense to try and prevent all circular references when fetching/looking-up data using the XRef table.
To avoid a solution that required tracking the references manually everywhere, the implementation settled on here instead handles that internally in the `XRef.fetch`-method. This should work, since that method *and* the `Parser`/`Lexer`-implementations are completely synchronous.

Note also that the existing `XRef`-caching, used for all data-types *except* Streams, should hopefully help to lessen the performance impact of these changes.
One *potential* problem with these changes could be certain *browser* exceptions, since those are generally not catchable in JavaScript code, however those would most likely "stop" worker-thread parsing anyway (at least I hope so).

Finally, note that I settled on returning dummy-data rather than throwing an exception. This was done to allow parsing, for the rest of the document, to continue such that *one* bad reference doesn't prevent an entire document from loading.

Fixes two of the issues listed in issue 14303, namely the `poppler-91414-0.zip-2.gz-53.pdf` and `poppler-91414-0.zip-2.gz-54.pdf` documents.
2021-11-27 23:50:26 +01:00
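The approach outlined in commit a807ffe907 above, tracking in-flight references inside the fetch method and returning dummy data for a cycle, can be sketched with a toy class. `TinyXRef` and its entry format are hypothetical and only stand in for the real `XRef`; as the commit notes, the idea relies on fetching and parsing being completely synchronous.

```javascript
// Hypothetical miniature "XRef": maps object numbers to a parse function
// that may itself (synchronously) fetch other objects.
class TinyXRef {
  constructor(entries) {
    this.entries = entries;
    this.pending = new Set(); // Object numbers currently being fetched.
  }

  fetch(num) {
    if (this.pending.has(num)) {
      // Circular reference: return dummy data rather than throwing, so
      // that parsing of the rest of the document can continue.
      return null;
    }
    this.pending.add(num);
    try {
      return this.entries.get(num)(this);
    } finally {
      this.pending.delete(num);
    }
  }
}

// Object 1 refers to object 2, which refers back to object 1.
const tinyXRef = new TinyXRef(
  new Map([
    [1, x => ({ kid: x.fetch(2) })],
    [2, x => ({ parent: x.fetch(1) })],
  ])
);

// Terminates instead of recursing forever; the cycle is cut with null.
const fetched = tinyXRef.fetch(1);
```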
Jonas Jenwald
a669fce762 Inline the isDict, isRef, and isStream checks in the src/core/xref.js file
2021-11-27 23:49:17 +01:00
Jonas Jenwald
680e0efb9d Use Array-destructuring in the XRef.readXRefStream-method
2021-11-27 23:49:17 +01:00
Jonas Jenwald
d0c4bbd828 [api-minor] Validate the /Pages-tree /Count entry during document initialization (issue 14303)
*This patch basically extends the approach from PR 10392, by also checking the last page.*

Currently, in e.g. the `Catalog.numPages`-getter, we're simply assuming that if the /Pages-tree has an *integer* /Count entry it must also be correct/valid.
As can be seen in the referenced PDF documents, that entry may be completely bogus, which causes general parsing to break down elsewhere in the worker-thread (and hangs the browser).

Rather than hoping that the /Count entry is correct, similar to all other data found in PDF documents, we obviously need to validate it. This turns out to be a little less straightforward than one would like, since the only way to do this (as far as I know) is to parse the *entire* /Pages-tree and essentially count the pages.
To avoid doing that for all documents, this patch tries to take a short-cut by checking if the last page (based on the /Count entry) can be successfully fetched. If so, we assume that the /Count entry is correct and use it as-is, otherwise we'll iterate through (potentially) the *entire* /Pages-tree to determine the number of pages.
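A rough sketch of that short-cut, assuming a hypothetical in-memory page tree where inner nodes have `kids` and a `count`, and leaves are plain page objects. The helper names (`getPage`, `countPages`, `resolveNumPages`) are illustrative only and do not mirror the real `Catalog` code.

```javascript
// Count the leaves (pages) by walking the whole tree; the slow path.
function countPages(node) {
  if (!node.kids) {
    return 1;
  }
  let total = 0;
  for (const kid of node.kids) {
    total += countPages(kid);
  }
  return total;
}

// Fetch the page at `index`, trusting intermediate `count` values to skip
// over subtrees. Throws when the index cannot be reached.
function getPage(node, index) {
  if (!node.kids) {
    if (index === 0) {
      return node;
    }
    throw new RangeError("Page index out of range.");
  }
  for (const kid of node.kids) {
    const size = kid.kids ? kid.count : 1;
    if (index < size) {
      return getPage(kid, index);
    }
    index -= size;
  }
  throw new RangeError("Page index out of range.");
}

function resolveNumPages(root) {
  const claimed = root.count;
  if (Number.isInteger(claimed) && claimed > 0) {
    try {
      // Short-cut: if the last page can be fetched, accept the claimed count.
      getPage(root, claimed - 1);
      return claimed;
    } catch {
      // Fall through to the full traversal below.
    }
  }
  return countPages(root);
}
```

Note that, as in the description, a successful fetch of the last page is only taken as evidence that the count is usable; it is not a full proof of correctness.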

Unfortunately these changes will have a number of *somewhat* negative side-effects, please see a possibly incomplete list below, however I cannot see a better way to address this bug.
 - This will slow down initial loading/rendering of all documents, at least by some amount, since we now need to fetch/parse more of the /Pages-tree in order to be able to access the *last* page of the PDF documents.
 - For poorly generated PDF documents, where the entire /Pages-tree only has *one* level, we'll unfortunately need to fetch/parse the *entire* /Pages-tree to get to the last page. While there's a cache to help reduce repeated data lookups, this will affect initial loading/rendering of *some* long PDF documents.
 - This will affect the `disableAutoFetch = true` mode negatively, since we now need to fetch/parse more data during document initialization. While the `disableAutoFetch = true` mode should still be helpful in larger/longer PDF documents, for smaller ones the effect/usefulness may unfortunately be lost.

As one *small* additional bonus, we should now also be able to support opening PDF documents where the /Pages-tree /Count entry is completely invalid (e.g. contains a non-integer value).

Fixes two of the issues listed in issue 14303, namely the `poppler-67295-0.pdf` and `poppler-85140-0.pdf` documents.
2021-11-27 21:57:35 +01:00
Tim van der Meij
9a1e27efc5
Merge pull request #14313 from Snuffleupagus/PDFDocument_pagePromises-map
Change the `_pagePromises` cache, in the worker, from an Array to a Map
2021-11-27 20:58:23 +01:00
calixteman
bbd8b5ce9f
Merge pull request #14319 from calixteman/xfa_arc
XFA - Draw arcs correctly
2021-11-27 11:32:32 -08:00
Calixte Denizet
31e13515f5 XFA - Draw arcs correctly
- it aims to fix #14315;
- take into account the startAngle to compute the coordinates of the final point.
2021-11-27 19:30:12 +01:00
Calixte Denizet
cfdaa57353 Handle sub/super-scripts in rich text
- it aims to fix #14317;
 - change the fontSize and the verticalAlign properties according to the position of the text.
2021-11-27 16:06:09 +01:00
Jonas Jenwald
4c56214ab4 Convert PDFDocument._getLinearizationPage to an async method
This, ever so slightly, simplifies the code and reduces overall indentation.
2021-11-26 19:57:47 +01:00
Jonas Jenwald
080996ac68 Change the _pagePromises cache, in the worker, from an Array to a Map
Given that not all pages necessarily are being accessed, or that the pages may be accessed out of order, using a `Map` seems like a more appropriate data-structure here.
Furthermore, this patch also adds (currently missing) caching for XFA-documents. Loading a couple of such documents in the viewer, with logging added, shows that we're currently re-creating `Page`-instances unnecessarily for XFA-documents.
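To illustrate why a `Map` fits sparse, out-of-order access, here is a small sketch. `PageCache` and its `loadPage` callback are hypothetical names, not the actual `PDFDocument` implementation.

```javascript
// Hypothetical cache keyed by page index; only requested pages get an entry.
class PageCache {
  constructor(loadPage) {
    this._loadPage = loadPage;
    this._pagePromises = new Map();
  }

  getPage(pageIndex) {
    const cached = this._pagePromises.get(pageIndex);
    if (cached) {
      return cached;
    }
    // Wrap in a promise so that sync and async loaders behave the same.
    const promise = Promise.resolve(this._loadPage(pageIndex));
    this._pagePromises.set(pageIndex, promise);
    return promise;
  }
}
```

Requesting only page index 500 creates a single entry, whereas assigning to index 500 of an empty Array would leave a sparse array with 500 holes.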
2021-11-26 19:53:57 +01:00
Jonas Jenwald
ca8d2bdce4 Abort parsing when the XRef /W-array contain bogus entries (issue 14303)
For this particular PDF document, we have `/W [1 2 166666666666666666666666666]` which obviously makes no sense.

While this patch makes no attempt at actually validating the entries in the /W-array, we'll now simply abort all processing when the end of the PDF document has been reached (thus preventing hanging the browser).
Please note that this patch doesn't enable the PDF document to be loaded/rendered, but at least it fails "correctly" now.

Fixes one of the issues listed in issue 14303, namely the `REDHAT-1531897-0.pdf` document.
2021-11-25 18:35:08 +01:00
Jonas Jenwald
ae4f1ae3e7 Ensure that ChunkedStream won't attempt to request data *beyond* the document size (issue 14303)
This bug was surprisingly difficult to track down, since it didn't just depend on range-requests being used but also on how quickly the document was loaded. To even be able to reproduce this locally, I had to use a very small `rangeChunkSize`-value (note the unit-test).

The cause of this bug is a bogus entry in the XRef-table, causing us to attempt to request data from *beyond* the actual document size and thus getting into an infinite loop.
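The general idea of clamping a requested range to the document size can be sketched as follows. `getChunksForRange` and its parameters are hypothetical, and the real `ChunkedStream` logic differs in its details.

```javascript
// Return the chunk indices needed to cover [begin, end), never going beyond
// the document `length`.
function getChunksForRange(begin, end, length, chunkSize) {
  // A bogus XRef entry may ask for data past the end of the document.
  const clampedEnd = Math.min(end, length);
  if (begin >= clampedEnd) {
    return []; // Nothing valid to request.
  }
  const beginChunk = Math.floor(begin / chunkSize);
  const endChunk = Math.ceil(clampedEnd / chunkSize);
  const chunks = [];
  for (let chunk = beginChunk; chunk < endChunk; chunk++) {
    chunks.push(chunk);
  }
  return chunks;
}
```

With a 250-byte document and 100-byte chunks, a request for bytes 150 to 1000000 is reduced to chunks 1 and 2 instead of producing requests that can never be fulfilled.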

Fixes *one* of the issues listed in issue 14303, namely the `PDFBOX-4352-0.pdf` document.
2021-11-24 19:19:43 +01:00
Jonas Jenwald
6da0944fc7 [api-minor] Replace PDFDocumentProxy.getStats with a synchronous PDFDocumentProxy.stats getter
*Please note:* These changes will primarily benefit longer documents, somewhat at the expense of e.g. one-page documents.

The existing `PDFDocumentProxy.getStats` function, which in the default viewer is called for each rendered page, requires a round-trip to the worker-thread in order to obtain the current document stats. In the default viewer, we currently make one such API-call for *every rendered* page.
This patch proposes replacing that method with a *synchronous* `PDFDocumentProxy.stats` getter instead, combined with re-factoring the worker-thread code by adding a `DocStats`-class to track Stream/Font-types and *only send* them to the main-thread *the first time* that a type is encountered.

Note that in practice most PDF documents only use a fairly limited number of Stream/Font-types, which means that in longer documents most of the `PDFDocumentProxy.getStats`-calls will return the same data.[1]
This re-factoring will obviously benefit longer documents the most[2], and could actually be seen as a regression for one-page documents, since in practice there'll usually be a couple of "DocStats" messages sent during the parsing of the first page. However, if the user zooms/rotates the document (which causes re-rendering), note that even a one-page document would start to benefit from these changes.

Another benefit of having the data available/cached in the API is that unless the document stats change during parsing, repeated `PDFDocumentProxy.stats`-calls will return *the same identical* object.
This is something that we can easily take advantage of in the default viewer, by now *only* reporting "documentStats" telemetry[3] when the data actually have changed rather than once per rendered page (again beneficial in longer documents).
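A minimal sketch of the "only send when a new type is first encountered" idea. The class below is a hypothetical simplification: the `send` callback and the message shape are assumptions and do not reflect the actual worker-thread messaging code.

```javascript
// Hypothetical tracker that reports stats only when they actually change.
class DocStats {
  constructor(send) {
    this._send = send;
    this._streamTypes = new Set();
    this._fontTypes = new Set();
  }

  _report() {
    this._send({
      streamTypes: [...this._streamTypes],
      fontTypes: [...this._fontTypes],
    });
  }

  addStreamType(type) {
    if (this._streamTypes.has(type)) {
      return; // Already reported; no message needed.
    }
    this._streamTypes.add(type);
    this._report();
  }

  addFontType(type) {
    if (this._fontTypes.has(type)) {
      return;
    }
    this._fontTypes.add(type);
    this._report();
  }
}
```

Since the number of distinct types is small and bounded, the number of messages stays bounded too, no matter how many pages are parsed.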

---
[1] Furthermore, the maximum number of `StreamType`/`FontType` values is `10` and `12` respectively, which means that regardless of the complexity and page count in a PDF document there'll never be more than twenty-two "DocStats" messages sent; see 41ac3f0c07/src/shared/util.js (L206-L232)

[2] One example is the `pdf.pdf` document in the test-suite, where rendering all of its 1310 pages only results in a total of seven "DocStats" messages being sent from the worker-thread.

[3] Reporting telemetry, in Firefox, includes using `JSON.stringify` on the data and then sending an event to the `PdfStreamConverter.jsm`-code.
In that code the event is handled and `JSON.parse` is used to retrieve the data, and in the "documentStats"-case we'll then iterate through the data to avoid double-reporting telemetry; see https://searchfox.org/mozilla-central/rev/8f4c180b87e52f3345ef8a3432d6e54bd1eb18dc/toolkit/components/pdfjs/content/PdfStreamConverter.jsm#515-549
2021-11-20 12:20:55 +01:00
Tim van der Meij
41ac3f0c07
Merge pull request #14291 from Snuffleupagus/force-postMessageTransfers
[api-minor] Only use Workers when `postMessage` transfers are supported (PR 11123 follow-up)
2021-11-19 20:02:51 +01:00
Brendan Dahl
c6cb39ef30
Merge pull request #14262 from Snuffleupagus/issue-14261
Include the /Lang-property, when it exists, in the StructTree-data (issue 14261)
2021-11-19 07:51:21 -08:00
Jonas Jenwald
6f22327e61 [api-minor] Only use Workers when postMessage transfers are supported (PR 11123 follow-up)
Given that all modern browsers now support `postMessage` transfers, and have for years, it no longer seems necessary for the PDF.js library to support using Workers unless the `postMessage` transfers functionality is available.
This patch is a follow-up to PR 11123, which made it impossible to *manually* disable `postMessage` transfers for performance reasons (since it increases memory usage), which hasn't caused any bug reports as far as I know.[1]

Hence we'll now only support *proper* Worker implementations, with fully working `postMessage` transfers, and fall back to using "fake" Workers otherwise.

---
[1] At the time of that PR we still "supported" IE, which is why this code was left intact.
2021-11-19 16:47:58 +01:00
Tim van der Meij
3dccaccbb4
Merge pull request #14278 from Snuffleupagus/rm-removeChild
Replace the remaining `Node.removeChild()` instances with `Element.remove()`
2021-11-17 20:17:55 +01:00
Jonas Jenwald
4ef1a129fa Replace the remaining Node.removeChild() instances with Element.remove()
Using `Element.remove()` is a slightly more compact way of removing an element, since you no longer need to explicitly find/use its parent element.
Furthermore, the patch also replaces a couple of loops that're used to delete all elements under a node with simply overwriting the contents directly (a pattern already used throughout the viewer).

See also:
 - https://developer.mozilla.org/en-US/docs/Web/API/Node/removeChild
 - https://developer.mozilla.org/en-US/docs/Web/API/Element/remove
2021-11-16 17:52:50 +01:00
Brendan Dahl
3209c013c4
Merge pull request #14247 from calixteman/button
[api-minor] Render pushbuttons on their own canvas (bug 1737260)
2021-11-16 08:10:40 -08:00
Jonas Jenwald
971ac8e993 Include the /Lang-property, when it exists, in the StructTree-data (issue 14261)
*Please note:* This is a tentative patch, since I don't have the necessary a11y-software to actually test it.
2021-11-14 12:37:41 +01:00
Jonas Jenwald
a54bed4963 Enable the ESLint no-loss-of-precision rule
Please refer to https://eslint.org/docs/rules/no-loss-of-precision
2021-11-14 10:48:50 +01:00
calixteman
85c6dd59ce
Merge pull request #14268 from calixteman/outline
Remove non-displayable chars from outline title (#14267)
2021-11-13 08:12:56 -08:00
Calixte Denizet
7041c62ccf Remove non-displayable chars from outline title (#14267)
- it aims to fix #14267;
 - there is nothing about chars in range [0-1F] in the specs but Acrobat doesn't display them in any way.
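As a rough illustration of stripping characters in the range [0-1F] from a title, assuming a hypothetical helper name (the actual patch may be implemented differently):

```javascript
// Remove the control characters U+0000 to U+001F from an outline title.
function removeNonDisplayableChars(title) {
  return title.replace(/[\x00-\x1F]/g, "");
}
```

Note that this range also covers tab and newline characters, while a regular space (U+0020) is kept.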
2021-11-13 16:56:08 +01:00
Jonas Jenwald
afcc99a86d When parsing corrupt documents without any trailer-dictionary, fallback to the "top"-dictionary (issue 14269)
There's obviously no guarantee that this will work in general, if the document is sufficiently corrupt, but it should hopefully be better than just throwing `InvalidPDFException` as currently happens.

Please note that, as is often the case with corrupt documents, it's somewhat difficult to know if we're rendering the document "correctly" with this patch[1]. In this case even Adobe Reader cannot open the document, which is always a good sign that it's *really* corrupt, however we're at least able to render *something* with this patch.

---
[1] Whatever "correct" even means when dealing with corrupt PDF documents, where often times different PDF viewers won't agree completely.
2021-11-13 13:21:38 +01:00
Jonas Jenwald
28fb3975eb
Merge pull request #14266 from calixteman/bug931481
Don't consider space as real space when there is an extra spacing (bug 931481)
2021-11-12 21:42:32 +01:00
Calixte Denizet
a88ff34eb7 Don't consider space as real space when there is an extra spacing (bug 931481)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=931481;
 - real space chars are pushed in the chunk but when there is an extra spacing, the next char position must be compared with the previous one;
 - for example, an extra spacing can cancel a space so visually there is no space.
2021-11-12 18:53:48 +01:00
Calixte Denizet
5b7e1f5232 XFA - Avoid an exception when looking for a font in a parent node
- it aims to fix issue https://github.com/mozilla/pdf.js/issues/14150;
  - a parent can be null in case the root has been reached, so just add a check.
2021-11-12 16:27:08 +01:00
Calixte Denizet
33ea817b20 [api-minor] Render pushbuttons on their own canvas (bug 1737260)
- First step to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1737260;
 - several interactive pdfs use the possibility to hide/show buttons to show different icons;
 - render pushbuttons on their own canvas and then insert it in the annotation_layer;
 - update test/driver.js in order to convert canvases for pushbuttons into images.
2021-11-12 15:37:33 +01:00
Jonas Jenwald
ea1c348c67 Always prefer abbreviated keys, over full ones, when doing any dictionary lookups (issue 14256)
Note that issue 14256 was specifically about *inline* images, please refer to:
 - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G7.1852045
 - https://www.pdfa.org/safedocs-unearths-pdf-inline-image-issue/
 - https://pdf-issues.pdfa.org/32000-2-2020/clause08.html#H8.9.7

However, during review of the initial PR in https://github.com/mozilla/pdf.js/pull/14257#issuecomment-964469710, it was suggested that we instead do this *unconditionally for all* dictionary lookups.
In addition to re-ordering the existing call-sites in the `src/core`-code, and adding non-PRODUCTION/TESTING asserts to catch future errors, for consistency a number of existing `if`/`switch`-blocks were re-factored to also check the abbreviated keys first.
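A small sketch of what "abbreviated key first" means for a lookup. `SimpleDict` is a hypothetical stand-in that only loosely mirrors a multi-key `get`; it is not the `Dict` class from `src/core`.

```javascript
// Hypothetical dictionary whose `get` tries the first key, then the second.
class SimpleDict {
  constructor(obj) {
    this._map = new Map(Object.entries(obj));
  }

  get(key1, key2) {
    let value = this._map.get(key1);
    if (value === undefined && key2 !== undefined) {
      value = this._map.get(key2);
    }
    return value;
  }
}

// A dictionary containing both forms of the width key.
const image = new SimpleDict({ W: 10, Width: 999, Height: 20 });

// Abbreviated key first, full key as the fallback.
const width = image.get("W", "Width");
const height = image.get("H", "Height");
```

Here `width` is taken from the abbreviated `W` entry even though a conflicting `Width` entry exists, while `height` falls back to the full key.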
2021-11-10 11:56:18 +01:00
calixteman
4bb9de4b00
Merge pull request #14239 from calixteman/1739502
XFA - Fix a breakBefore issue when target is a contentArea and startNew is 1 (bug 1739502)
2021-11-08 03:14:42 -08:00
Calixte Denizet
13ae6d493a XFA - Encode tag names in UTF-8 when saving (fix #14249) 2021-11-07 21:41:37 +01:00
calixteman
efb4455749
Merge pull request #14240 from calixteman/14014
XFA - Get each page asynchronously in order to avoid blocking the event loop (#14014)
2021-11-06 13:21:43 -07:00
Calixte Denizet
1681e25008 XFA - Get each page asynchronously in order to avoid blocking the event loop (#14014) 2021-11-06 13:25:03 +01:00
Brendan Dahl
b56cca0324 Create shading patterns the size of the current path. (bug 1722807)
Previously, when we created a shading pattern canvas we created it
as the same size as the page. This was good for caching if the same
pattern was used over and over again, but when lots of different
shadings are created that caused us to create many full page
canvases.

Instead of creating the full page canvases, create the canvas
as the same size as the current path bounding box. This reduces memory
consumption by a lot since most paths are pretty small. Also, in real world
PDFs it's rare for a shading (non shading fill) to be reused over and over again.
Bug 1721949 is an example where the same pattern is reused and it will be slightly
slower than before.
2021-11-05 20:44:18 -07:00
Brendan Dahl
8161d3f29d Don't double apply a group xobject's bbox.
In `beginGroup` we create a new canvas that is the size of the
bounding box and we translate it to the offset. This means we don't need to
also apply the bounding box during `paintFormXObjectBegin`.

This improves #6961 quite a bit, but it still is missing the indentation
in the ruler.
2021-11-05 15:40:58 -07:00
Calixte Denizet
a08763f4aa XFA - Fix a breakBefore issue when target is a contentArea and startNew is 1 (bug 1739502)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1739502;
 - when the target area was the current content area, everything was pushed in it instead of creating a new one (and consequently a new pageArea is created).
 - the pdf shows an alignment issue on page 4:
   - the hAlign is "center" but the subform was the width of its parent, so compute the real width of the subform with tb layout;
 - there is an extra empty page at the end of the pdf:
   - there is a subform with some hidden elements which are not rendered for now (since there is no plugged JS engine it isn't possible to draw them in changing their visibility).
   - so in case a subform is empty and has no real dimensions (at least one is 0), we just consider it as empty.
2021-11-05 18:59:55 +01:00
calixteman
e136afbabc
Merge pull request #14218 from janekotovich/subform_min_0
XFA subform with occur min=0 and no bound data displaying.
2021-11-05 04:12:34 -07:00
Jonas Jenwald
8222d6530b
Merge pull request #14232 from brendandahl/show-text-pattern
Use correct matrix for patterns with showText.
2021-11-05 10:04:56 +01:00
Brendan Dahl
1c7048399b Use correct matrix for patterns with showText.
We were incorrectly using the transform in the pattern before it had been
adjusted causing the pattern to be misplaced relative to the page.

Fixes: ShowText-ShadingPattern.pdf (already in corpus)
Fixes: #8111
Fixes: #9243
2021-11-04 16:57:36 -07:00
Jane-Kotovich
56b502391c XFA subform with occur min=0 and no bound data displaying
Subform nomin displays even though its subform is set to <occur max=-1 min=0>
If we look through specs of XFA 3.3 : https://www.pdfa.org/norm-refs/XFA-3_3.pdf
- The min attribute is used when processing a form that contains data. Regardless of the data at least this number of instances is included. It is permissible to set this value to zero, in which case the container is entirely excluded if there is no data for it.

However, in our case it doesn't happen, because we let our empty dataNode get through. Though by setting a clause:
- eliminate unmatched data with occur min=0
we are checking our empty data and sending it to uselessNode array where at the end it gets removed;
2021-11-04 20:22:05 +10:00
Jonas Jenwald
e1a35e7bb6
Merge pull request #14213 from Snuffleupagus/issue-11656
Tweak the Bidi-detection heuristics for very short RTL strings (issue 11656)
2021-11-03 22:09:14 +01:00
Jonas Jenwald
5f77d3719b Tweak the Bidi-detection heuristics for very short RTL strings (issue 11656)
Very short strings can narrowly miss the existing Bidi-detection threshold, leading to incorrect text-selection and copying behaviour.

In my testing, neither Adobe Reader nor PDFium seems to handle copying "correctly" for this document. Hence it's not entirely clear to me that we actually want to fix this, since tweaking these heuristics can *obviously* cause regressions elsewhere (and our test coverage for RTL-text isn't exactly great).
2021-11-03 20:31:57 +01:00
Brendan Dahl
039a7a670f Reset path bounding box tracking when starting a new path.
Starting a new path will wipe out any of the current subpaths in the
current graphics state, so we should reset the min/maxes.
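The effect of resetting the min/max values can be shown with a small sketch. `PathBoundsTracker` is a hypothetical class for illustration only; the real tracking lives in the canvas graphics state.

```javascript
// Hypothetical tracker for the bounding box of the current path.
class PathBoundsTracker {
  constructor() {
    this.startNewPath();
  }

  // Starting a new path discards the previous subpaths, so reset the bounds.
  startNewPath() {
    this.minX = Infinity;
    this.minY = Infinity;
    this.maxX = -Infinity;
    this.maxY = -Infinity;
  }

  addPoint(x, y) {
    this.minX = Math.min(this.minX, x);
    this.minY = Math.min(this.minY, y);
    this.maxX = Math.max(this.maxX, x);
    this.maxY = Math.max(this.maxY, y);
  }

  getBounds() {
    return [this.minX, this.minY, this.maxX, this.maxY];
  }
}
```

Without the reset, the box of a small second path would still include every point of the first one, which makes the composed region larger than necessary.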

This makes a number of the bounding boxes smaller and reduces the number
of composed pixels. For the smask tests in the corpus, the number of
composed pixels goes from 19,872,109 to 19,676,905. The difference is much
larger on other PDFs though.
2021-11-03 11:46:52 -07:00
Jonas Jenwald
8c70258065
Merge pull request #14182 from calixteman/richtext
Support rich content in markup annotation
2021-10-31 14:41:56 +01:00
Calixte Denizet
cf8dc750d6 Support rich content in markup annotation
- use the xfa parser but in the xhtml namespace.
2021-10-31 13:44:51 +01:00
calixteman
2d8b6fda8f
Merge pull request #14207 from janekotovich/forms_version_popup
JS - Avoid a popup to ask for specific version of Acrobat
2021-10-30 05:45:31 -07:00
Tim van der Meij
ec1633c33c
Merge pull request #14201 from Snuffleupagus/bug-1219400
Use the correct border-style for Annotations, when a dash array is specified (bug 1219400)
2021-10-30 12:39:46 +02:00
Jane-Kotovich
12f89d2ab1 JS - Avoid a popup to ask for specific version of Acrobat
Embedded JS in the PDF keeps throwing an alert regarding a specific version of Acrobat (Spanish and version 5.0 or greater).
This happens because:
- JS in pdf is enabled
- PDF contains some unsupported features (e.g. XFA)
The alert is shown when app.formVersion = undefined || app.formVersion < 5.0
In pdf.js we were using FORMS_VERSION = undefined. After researching based on https://opensource.adobe.com/dc-acrobat-sdk-docs/acrobatsdk/pdfs/acrobatsdk_jsapiref.pdf#G4.1993509 and Acrobat DC we decided to go with the larger number to avoid unnecessary popups.
Through investigation we realised that VIEWER_VERSION should have the same value - a number.
Due to all that, we implemented 21.00720099 as a value for both FORMS_VERSION and VIEWER_VERSION
2021-10-29 23:09:59 +10:00
Tim van der Meij
0e7614df7f
Merge pull request #14180 from Snuffleupagus/bug-1627427
Handle ranges that "overflow" the last byte in `CMap.mapBfRange` (bug 1627427)
2021-10-27 20:06:09 +02:00
Jonas Jenwald
884caf602e Use the correct border-style for Annotations, when a dash array is specified (bug 1219400)
Even though we cannot use the dash array in the display layer, at least ensure that we use the correct border-style.
2021-10-27 13:20:21 +02:00
Jane-Kotovich
91fc643ff9 [api-minor] Implement securityHandler in the scripting API (bug 1731578) 2021-10-26 23:42:04 +10:00
Jonas Jenwald
aa1b78684f Handle ranges that "overflow" the last byte in CMap.mapBfRange (bug 1627427) 2021-10-24 13:48:38 +02:00
Tim van der Meij
0aaa4e3dbe
Merge pull request #14156 from Snuffleupagus/escodegen-fork
Add support for modern ECMAScript `class` features
2021-10-23 19:12:44 +02:00
Jonas Jenwald
52372b9378
Merge pull request #14175 from brendandahl/smask-v2
Use a new method for handling soft masks.
2021-10-23 09:27:18 +02:00
Brendan Dahl
82681ea20c Track the clipping box and bounding box of the path.
This allows us to compose much smaller regions of soft
mask making them much faster. This should also allow
for further optimizations in the pattern code.

For example locally I see issue #6573 go from 55s
to 5s with this change.

Fixes #6573
2021-10-22 13:41:29 -07:00
Brendan Dahl
2d1f9ff7a3 Use a new method for handling soft masks.
The old method of handling soft masks had a number of issues where the temporary
drawing canvas and the suspended main canvas could get out of sync
(e.g. mismatched save/restores or clip state) or we could end up compositing at
the wrong time. A good example of things getting out of sync is the reduced test
case in #9017.

To fix this I've changed two big things:

1) Duplicate all the needed graphics state from the temporary canvas to the
suspended main canvas. This ensures the canvases stay in sync so that when we
switch back to the main canvas the graphics state stack is the same
(e.g. transforms, clip paths).

2) Immediately composite after each drawing operation. This ensures that if
there's an active clip region that we'll still be able to composite the correct
portions of the canvas. Note: This solution could be avoided by using
getImageData and putImageData since those ignore clipping region, but this is
very very slow. Note2: I also think the old way of only compositing at the end
of the soft mask is incorrect and can lead to wrong colors if drawing over the
same region, but in practice this doesn't seem to matter much.

Fixes: #5781
Fixes: #5853
Fixes: #7267
Fixes: #7891
Fixes: #8403
Fixes: #8624
Fixes: #12798
Fixes: #13891
Fixes: #9017 (reduced test case)
Fixes: https://bugzilla.mozilla.org/show_bug.cgi?id=1703683
2021-10-22 13:41:21 -07:00
Jonas Jenwald
89785a23f3 Convert Metadata to use private class fields
Please refer to https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Classes/Private_class_fields
2021-10-22 22:01:19 +02:00
Tim van der Meij
11f030d301
Merge pull request #14171 from Snuffleupagus/issue-14170
Prevent run-time errors in Node.js versions with `URL.createObjectURL` support (issue 14170)
2021-10-22 21:07:19 +02:00
Jonas Jenwald
044197808a Prevent double-rendering borders for PushButton-annotations (PR 14083 follow-up)
With ResetForm-action support added in PR 14083, there's a regression in the `issue12716` test-case. More specifically the border around the "Clear Form"-link is now rendered *twice*, once in the canvas via the appearance-stream and once in the annotationLayer via the border-data.
This looks slightly weird, and was most likely not intended, which is why this patch suggests that we ignore the border in the annotationLayer when an appearance-stream exists.
2021-10-21 13:31:16 +02:00
Jonas Jenwald
ff9d2b2ab1 Prevent run-time errors in Node.js versions with URL.createObjectURL support (issue 14170)
Apparently Node.js has added *global* `URL.createObjectURL` support, but not done the same thing for `Blob`. Hence we also need to check for the availability of `Blob` in the `createObjectURL` helper function, and it's probably a good idea to also update `examples/node/pdf2svg.js` to work-around this until these changes reach an official PDF.js release.
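A sketch of the kind of feature check described here. The function name and the injectable `globalObj` parameter are assumptions made so that the check can be exercised without a browser; this is not the actual `createObjectURL` helper.

```javascript
// Only rely on `URL.createObjectURL` when `Blob` is available as well.
function canCreateObjectURL(globalObj = globalThis) {
  return (
    typeof globalObj.Blob !== "undefined" &&
    typeof globalObj.URL !== "undefined" &&
    typeof globalObj.URL.createObjectURL === "function"
  );
}
```

Checking only for `URL.createObjectURL` would report support in an environment that has no `Blob`, which is the situation described in the message.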
2021-10-21 10:32:44 +02:00
Tim van der Meij
382be22c11
Merge pull request #14160 from Snuffleupagus/pr-13770-followup
Fix pattern handling regression in `SVGGraphics` (PR 13770 follow-up)
2021-10-19 19:31:18 +02:00
Brendan Dahl
b66239d6dc
Merge pull request #14114 from Snuffleupagus/issue-14110
[api-minor] Include the /Lang-property in the `documentInfo`, and use it in the viewer (issue 14110)
2021-10-19 08:08:08 -07:00
Jonas Jenwald
68e6622c57 Ignore Square/Circle-annnotations with a zero borderWidth when creating a fallback appearance stream (issue 14164)
Trying to render these Annotation-types, when the borderWidth is `0`, causes a "hairline" border to appear. If these Annotations included an appearance stream, as they are supposed to, this wouldn't have happened and the simplest solution here seems to be to just ignore these particular Annotations.
2021-10-19 15:27:42 +02:00
Jonas Jenwald
8c6f1e45c7 Fix pattern handling regression in SVGGraphics (PR 13770 follow-up)
While the FAQ clearly lists the SVG back-end as unsupported, see https://github.com/mozilla/pdf.js/wiki/Frequently-Asked-Questions#backends, I suppose that small/simple regressions still make sense to fix.
2021-10-18 21:40:10 +02:00
calixteman
bbb64369f1
Merge pull request #13424 from calixteman/chunks2
[api-minor] Fix issues in text selection
2021-10-18 06:14:15 -07:00
Calixte Denizet
61d1063276 Fix issues in text selection
- PR #13257 fixed a lot of issues but not all and this patch aims to fix almost all remaining issues.
  - the idea in this new patch is to compare position of new glyph with the last position where a glyph has been drawn;
    - no spaces are "drawn": it just moves the cursor but they aren't added in the chunk;
    - so this way a space followed by a cursor move can be treated as only one space: it helps to merge all spaces into one.
  - to distinguish real spaces from tracking ones, we used a factor of the space width (from the font)
    - it was a pretty good idea in general but it fails with some fonts where space was too big:
    - in Poppler, they're using a factor of the font size: this is an excellent idea (<= 0.1 * fontSize implies tracking space).
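The font-size based threshold mentioned in the last item can be sketched like this. The function and constant names are hypothetical; only the `<= 0.1 * fontSize` rule comes from the description above.

```javascript
// A gap at or below this fraction of the font size counts as tracking.
const TRACKING_SPACE_FACTOR = 0.1;

// Classify the horizontal gap between the last drawn glyph and the next one.
function classifyGap(gapWidth, fontSize) {
  return gapWidth <= TRACKING_SPACE_FACTOR * fontSize ? "tracking" : "space";
}
```

Using the font size rather than the width of the font's space glyph avoids the failure mode with fonts whose space glyph is unusually wide.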
2021-10-17 16:27:05 +02:00
Jonas Jenwald
00720d059a [api-minor] Include the /Lang-property in the documentInfo, and use it in the viewer (issue 14110)
*Please note:* This is a tentative patch, since I don't have the necessary a11y-software to actually test it.

To avoid having to add a new API-method just for a single string, I figured that adding the new property to the existing `documentInfo`-data (accessed via `PDFDocumentProxy.getMetadata` in the API) will hopefully be deemed acceptable.
2021-10-16 14:27:47 +02:00
Jonas Jenwald
0041230072 Re-name the XFAFactory.numberPages getter to XFAFactory.numPages for consistency
All other similar getters are called `numPages` throughout the code-base, and improved consistency should always be a good thing.
2021-10-16 12:56:21 +02:00
Jonas Jenwald
0e5348180e Fix the inconsistent return type of the PDFDocument.isPureXfa getter
Also (slightly) simplifies a couple of small getters/methods related to the `XFAFactory`-instance.
2021-10-16 12:56:20 +02:00
Jonas Jenwald
cd94a44ca1 Remove some duplication in *simple* shadowed getters in src/core/-code
In these cases there's no good reason, in my opinion, to duplicate the `shadow`-lines since that unnecessarily increases the risk of simple typos (see the previous patch).
2021-10-16 12:56:17 +02:00
Jonas Jenwald
1450da4168 Fix a xfaFaxtory typo in the shadowing in the PDFDocument.xfaFactory getter
With this typo the shadowing doesn't actually work, which causes these checks to be unnecessarily repeated. In this particular case it didn't have a significant performance impact, however we should definitely fix this nonetheless.
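To see why a misspelled property name silently defeats shadowing, consider the following sketch. The `shadow` helper is written from memory of the general pattern and should be treated as an assumption, and `Doc`, `typoFactory`, and the counters are purely illustrative.

```javascript
// Replace a getter with a plain data property on the instance, so that the
// getter body only runs once.
function shadow(obj, prop, value) {
  Object.defineProperty(obj, prop, {
    value,
    enumerable: true,
    configurable: true,
    writable: false,
  });
  return value;
}

let goodCalls = 0;
let typoCalls = 0;

class Doc {
  get xfaFactory() {
    goodCalls++;
    return shadow(this, "xfaFactory", { numPages: 3 });
  }

  get typoFactory() {
    typoCalls++;
    // Typo: a *different* property is shadowed, so this getter keeps running.
    return shadow(this, "typoFaxtory", { numPages: 3 });
  }
}
```

Accessing `xfaFactory` twice runs its getter once, while accessing `typoFactory` twice runs its getter twice. Nothing throws in either case, which is why such a typo is easy to miss.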
2021-10-16 11:54:12 +02:00
Jane-Kotovich
c2af309917 XFA - Embedded image is missing 2021-10-15 21:12:29 +10:00
Tim van der Meij
f6d9d91965
Merge pull request #14116 from Snuffleupagus/api-more-optional-chaining
Use even more optional chaining in the `src/display/api.js` file
2021-10-13 19:38:03 +02:00
Jay Berkenbilt
586295fad6 Implement TrueType character map "format 2" (fixes #14117)
If a PDF included an embedded TrueType font whose preferred character
map (cmap) was in "format 2", the code would select that character map
and then refuse to read it because of an unsupported format, thus
causing the characters not to be rendered. This commit implements
support for format 2 as described at the link below.

https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6cmap.html
2021-10-13 07:37:14 -04:00
Jonas Jenwald
8fc9c7e41c Use even more optional chaining in the src/display/api.js file
This patch (slightly) simplifies a couple of `onProgress` and `onUnsupportedFeature` call-sites.
Finally, while unrelated, also removes some unnecessary `return undefined;` statements (PR 11601 follow-up).
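A minimal sketch of the kind of call-site simplification that optional chaining allows. `reportProgress` and the `loadingTask` shape are hypothetical and only illustrate the syntax.

```javascript
// Before: `if (loadingTask.onProgress) { loadingTask.onProgress({ ... }); }`
// After: a single expression that is skipped when no callback is set.
function reportProgress(loadingTask, loaded, total) {
  loadingTask.onProgress?.({ loaded, total });
}
```

When `onProgress` is `undefined` or `null` the call is simply not made, so no explicit guard is needed.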
2021-10-12 12:05:59 +02:00
Jonas Jenwald
8721557a08 For Annotations that define a closed area, make all of it toggle the PopupAnnotation (issue 14107)
For Circle, Square, and Polygon Annotations it's currently only possible to toggle the associated PopupAnnotation by clicking on its border. Depending on the border width, and also the current zoom-level in the viewer, that can make interacting with certain Annotations *practically* impossible (which is the case in issue 14107).
Hence, in order to improve this, change the "fill"-property of the SVG element in the annotationLayer to make the *entire* element part of the click/mouse-over target.

*Please note:* Given that this is a viewer-related issue, there's no simple way to test this as far as I can tell.
2021-10-09 15:55:15 +02:00
Tim van der Meij
56e3ef68d4
Merge pull request #14106 from calixteman/names
Empty name is allowed in ISO 32000
2021-10-09 14:29:10 +02:00
Jonas Jenwald
69a97bcba7 Take the /CIDToGIDMap data into account when computing the hash, in PartialEvaluator.preEvaluateFont, for composite fonts (bug 1734802)
This is unfortunately *yet another* bug in the `preEvaluateFont`-implementation, and I've lost count of the number of times I've had to tweak this code over the years :-(
I really cannot help thinking that PR 4423 was way too simplistic, since it missed a bunch of cases that lead to broken font rendering in many PDF documents.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1734802
2021-10-08 13:15:21 +02:00
Calixte Denizet
f384ad2356 Empty name is allowed in ISO 32000
- the exact sentence from the spec:
    "The token SOLIDUS (a slash followed by no regular characters) introduces a unique valid name defined by the empty sequence of characters."
  - so just remove the warning.
2021-10-06 20:50:39 +02:00
Jonas Jenwald
d49b1bf2ee Use the native structuredClone implementation when it's available
With a recent addition to the HTML specification, the internal structured clone algorithm used in browsers is (or will be, once it's implemented) *directly* accessible to JavaScript; please see https://developer.mozilla.org/en-US/docs/Web/API/WindowOrWorkerGlobalScope/structuredClone

Hence we'll *eventually* not need to maintain our own structured clone functionality in the `LoopbackPort`-class in the API, however for the time being we'll feature detect `structuredClone` and fallback to the existing PDF.js implementation.

Given that https://bugzilla.mozilla.org/show_bug.cgi?id=1722576 has landed in Firefox 94, we should no longer need the manually implemented `cloneValue`-functionality in MOZCENTRAL builds. Note also that in the Firefox built-in PDF Viewer it's not possible for users to *easily* disable workers, which should further reduce the risk of these changes.
2021-10-03 10:55:33 +02:00
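The feature detection described in d49b1bf2ee could look roughly like the sketch below. The names `fallbackClone` and `cloneValue` are placeholders here, and the JSON-based fallback is an assumption made for brevity; it is not the PDF.js implementation and only handles JSON-compatible data.

```javascript
// Hypothetical fallback, standing in for a hand-written clone routine.
// Unlike a real structured clone it only supports JSON-compatible data.
function fallbackClone(value) {
  return JSON.parse(JSON.stringify(value));
}

// Prefer the native implementation when the environment provides it.
const cloneValue =
  typeof structuredClone === "function"
    ? value => structuredClone(value)
    : fallbackClone;

const original = { page: 1, rect: [0, 0, 10, 10] };
const copy = cloneValue(original);
```

Either branch returns a deep copy for this input, so callers do not need to know which implementation was selected.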
Jonas Jenwald
8cb6efec2d [api-minor] Add a wrapper around the addLinkAttributes-function, in the API, to the PDFLinkService implementations
This patch helps reduce some duplication, given that we now have a few essentially identical `addLinkAttributes` call-sites in the code-base.
To prevent runtime errors in the Annotation/XFA-layer code, we'll warn if a custom/incomplete `PDFLinkService` is being used (limited to GENERIC builds).
2021-10-02 12:28:00 +02:00
Jonas Jenwald
bb9c905c5d Ensure that various URL-related options are applied in the xfaLayer too
Note how both the annotationLayer and the document outline will apply various URL-related options when creating the link-elements.
For consistency the `xfaLayer`-rendering should obviously use the same options, to ensure that the existing options are indeed applied to all URLs regardless of where they originate.
2021-10-02 09:32:23 +02:00
Jonas Jenwald
284d259054
Merge pull request #14057 from Snuffleupagus/bug-920426
Support CMap-data with only strings, when parsing TrueType composite fonts (bug 920426)
2021-10-01 23:22:25 +02:00
Jonas Jenwald
67a642c826 Replace a couple of Array.prototype.forEach-invocations with for..of instead
Given that `NodeList`s can be iterated using `for..of` we can use that instead, since it's a little bit nicer and easier to read than the `Array.prototype.forEach` format.
2021-10-01 09:06:17 +02:00
Calixte Denizet
aecbd7cd89 AcroForm: Add support for ResetForm action
- it aims to fix #12721.
  - Thanks to PR #14023, we now have the fieldObjects in the annotation layer, so we can easily map field names to their ids if needed.
  - Reset values in the storage, in the JS sandbox and in the visible html elements.
2021-09-30 22:02:33 +02:00
Jonas Jenwald
d3ca28bc34 Support CMap-data with only strings, when parsing TrueType composite fonts (bug 920426)
In the referenced bug, the embedded fonts contain custom CMap-data that only include strings. Note how for embedded composite TrueType fonts we're using the CMap-data when building the glyph mapping, and currently we end up with a completely empty map because the code expects only CID *numbers*.
Furthermore, just fixing the glyph mapping alone isn't sufficient to fully address the bug, since we also need to consider this "special" kind of CMap-data when looking up glyph widths.
2021-09-30 18:10:47 +02:00
Tim van der Meij
9a74f3e6e0
Merge pull request #14049 from calixteman/bg_from_mk
Annotation - Use border and background colors from MK dictionary
2021-09-29 21:13:20 +02:00
Calixte Denizet
0776cd9b90 Annotation - Use border and background colors from MK dictionary
- it aims to fix #13003;
  - set the bg and fg colors as they're in the pdf;
  - put a transparent overlay to help to see the fields.
2021-09-26 20:49:26 +02:00
Jonas Jenwald
e6e04694f4 [api-minor] Move the addDefaultProtocolToUrl/tryConvertUrlEncoding functionality into the createValidAbsoluteUrl function
Having recently worked with, and reviewed patches touching, this code it seemed that it's probably not a bad idea to move that functionality into `createValidAbsoluteUrl` as new options instead.

For the `addDefaultProtocolToUrl` functionality in particular, the existing helper function was not only moved but slightly improved as well. Looking at the code, I realized that there's a small risk that it would incorrectly match a *relative* URL-string too.

With these changes, the `createValidAbsoluteUrl` call-sites in the `src/core/`-code can be simplified a little bit.

*Please note:* This patch may, indirectly, change the format of the `unsafeUrl`-property returned with relevant Annotations and OutlineItems; hence the `api-minor` tag.
However, I'd argue that it's actually more correct this way since the whole purpose of `unsafeUrl` is/was to return the URL data as-is without any parsing done.
2021-09-26 14:29:54 +02:00
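As a rough illustration of the idea in e6e04694f4, the sketch below folds a "default protocol" step into a single URL-validation helper. `toValidAbsoluteUrl`, its option name, the "www." rule, and the protocol allow-list are assumptions for this example, not the actual `createValidAbsoluteUrl` code.

```javascript
// Hypothetical helper, not the PDF.js function itself: returns a URL object
// for an allowed absolute URL, or null otherwise.
function toValidAbsoluteUrl(url, baseUrl = null, options = null) {
  if (typeof url !== "string" || url.length === 0) {
    return null;
  }
  // Assumed rule: only strings that start with "www." get a default protocol,
  // so relative URLs such as "docs/www.html" are left untouched.
  if (options?.addDefaultProtocol && url.startsWith("www.")) {
    url = `http://${url}`;
  }
  try {
    const absoluteUrl = baseUrl ? new URL(url, baseUrl) : new URL(url);
    // Assumed allow-list of protocols considered safe to link to.
    const allowed = ["http:", "https:", "ftp:", "mailto:", "tel:"];
    return allowed.includes(absoluteUrl.protocol) ? absoluteUrl : null;
  } catch {
    return null; // Not parseable as an absolute URL.
  }
}
```

Keeping the protocol step behind an option means call-sites that want the URL data untouched can simply omit it.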
Calixte Denizet
558e58f354 XFA - Add <a> element in button when an url is detected (bug 1716758)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1716758;
  - some buttons have a JS action with the pattern `app.launchURL(...)` (or similar) so extract when it's possible the url and generate a <a> element with the href equals to the found url;
  - pdf.js already had some code to handle that, so this patch slightly refactors it.
2021-09-25 21:59:39 +02:00
Calixte Denizet
c0e9108d00 Annotation - Some checkboxes have an empty N dictionary
- it aims to fix #14021;
  - the N dict is empty here so just create a default one;
  - it implies that the checked checkbox has no appearance so create a default one too in order to print it;
  - in the pdf in the issue, a checked box is not printed because it has no default appearance so we need to guess its appearance from its state.
2021-09-25 16:00:47 +02:00
Tim van der Meij
cc110b8542
Merge pull request #14064 from Snuffleupagus/issue-13845
Fallback to font name matching, when checking for serif fonts (issue 13845)
2021-09-25 12:41:57 +02:00
Jonas Jenwald
b23b8d8a5d
Merge pull request #14074 from Snuffleupagus/issue-14046
[api-minor] Add basic support for RTL text-content in PopupAnnotations (issue 14046)
2021-09-25 12:37:44 +02:00
Tim van der Meij
36dc93fe5d
Merge pull request #14065 from Snuffleupagus/fewer-EXPORT_DATA_PROPERTIES
[api-minor] Stop exporting, by default, a few additional Font properties (PR 11777 follow-up)
2021-09-25 12:25:56 +02:00
Tim van der Meij
ee34572fd0
Merge pull request #14070 from Snuffleupagus/MessageHandler-local-vars
Some small readability improvements in the `MessageHandler` code
2021-09-25 12:22:17 +02:00
Tim van der Meij
07558c158d
Merge pull request #14069 from Snuffleupagus/deprecate-OPS-paintJpegXObject
Mark the `paintJpegXObject` operator as deprecated (PR 11601 follow-up)
2021-09-25 12:15:33 +02:00
Jonas Jenwald
1dcd2f0cd3 [api-minor] Add basic support for RTL text-content in PopupAnnotations (issue 14046)
In order to implement this, we utilize the existing `bidi` function to infer the text-direction of /T and /Contents entries. While this may not be perfect in cases where one PopupAnnotation mixes LTR and RTL languages, it should work well enough in most cases.
To avoid having to add *two new* properties in lots of annotations, supplementing the existing `title`/`contents`-properties, this patch instead re-factors the existing code such that the properties are replaced by Objects (containing `str` and `dir`).

*Please note:* In order to avoid breaking existing third-party implementations, `GENERIC`-builds of the PDF.js library will still provide the old `title`/`contents`-properties on annotations returned by `PDFPageProxy.getAnnotations`.
2021-09-25 09:18:58 +02:00
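The shape change described in 1dcd2f0cd3, where a plain string becomes an object with `str` and `dir`, can be sketched as below. `inferDirection` is a deliberately simplified stand-in for a real bidi pass (it only looks at Hebrew and Arabic code points), and both function names are hypothetical.

```javascript
// Simplified stand-in for a bidi pass: counts strong RTL characters
// (Hebrew and Arabic blocks only) against basic Latin letters.
function inferDirection(str) {
  let rtl = 0;
  let ltr = 0;
  for (const ch of str) {
    if (/[\u0590-\u05FF\u0600-\u06FF]/.test(ch)) {
      rtl++;
    } else if (/[A-Za-z]/.test(ch)) {
      ltr++;
    }
  }
  return rtl > ltr ? "rtl" : "ltr";
}

// Replace a plain string property with an object carrying `str` and `dir`.
function toDirectionalText(str) {
  return { str, dir: inferDirection(str) };
}
```

As the commit message notes for the real implementation, a single direction per string cannot be perfect when one annotation mixes LTR and RTL text.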
calixteman
104e049338
Merge pull request #14073 from calixteman/bindItems
XFA - Bind items when there's a bindItems entry
2021-09-24 09:01:52 -07:00
Calixte Denizet
97c1e076a1 XFA - Bind items when there's a bindItems entry
- In the pdf in issue #14071, some select fields don't contain any values;
  - the corresponding node has a bindItems and a bind elements and _bindItems function was just not called.
2021-09-24 16:08:58 +02:00
Calixte Denizet
cd73e282eb XFA - Create a new page in case of overflow
- it aims to fix #14071;
  - a subform is overflowing and the target in case of overflow is itself. In this case we must create a new page.
2021-09-24 14:57:55 +02:00
Jonas Jenwald
890a6c1108 Some small readability improvements in the MessageHandler code
In particular the `_processStreamMessage`-method is a bit cumbersome to read, given the way that the current streamController/streamSink is accessed, which we can improve with a couple of local variables.
2021-09-24 13:07:20 +02:00
Jonas Jenwald
7d56fb4cbf Mark the paintJpegXObject operator as deprecated (PR 11601 follow-up)
After PR 11601, the `paintJpegXObject` operator is no longer used for anything. While I don't think we can just remove it, and essentially leave a "hole" in the `OPS` structure, we should at least mark it as explicitly unused to aid readability/maintainability of the code.
2021-09-24 12:47:28 +02:00
Brendan Dahl
d370a281c4
Merge pull request #14067 from calixteman/1732344
Don't save anything in XFA entry if no XFA! (bug 1732344)
2021-09-23 15:07:00 -07:00
Calixte Denizet
4b0538d07a Don't save anything in XFA entry if no XFA! (bug 1732344)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1732344
  - rename some variables to make the code clearer;
  - and last but not least, add a unit test to test saving.
2021-09-23 19:51:23 +02:00
Jonas Jenwald
fd1f0f647f Print a special warning message, in the viewer, for XFA Foreground documents
Currently XFAF documents use the same warning message as in the XFA *disabled* case, which is neither helpful nor correct.
2021-09-23 15:02:24 +02:00
Jonas Jenwald
6cba5509f2 Re-factor document.getElementsByName lookups in the AnnotationLayer (issue 14003)
This replaces direct `document.getElementsByName` lookups with a helper method which:
 - Lets the AnnotationLayer use the data returned by the `PDFDocumentProxy.getFieldObjects` API-method, such that we can directly lookup only the necessary DOM elements.
 - Falls back to using `document.getElementsByName` as before, such that e.g. the standalone viewer components still work.

Finally, to fix the problems reported in issue 14003, regardless of the code-path we now also enforce that the DOM elements found were actually created by the AnnotationLayer code.
With these changes we'll thus be able to update form elements on all visible pages just as before, but we'll additionally update the AnnotationStorage for not-yet-rendered elements thus fixing a pre-existing bug.
2021-09-23 13:05:18 +02:00
Jonas Jenwald
9acfe486d4 Fallback to font name matching, when checking for serif fonts (issue 13845)
In order to handle fonts that specify completely bogus /Flags-entries, fallback to font name matching to determine if the font is a serif one.
2021-09-23 01:11:57 +02:00
Jonas Jenwald
e027748627 [api-minor] Stop exporting, by default, a few additional Font properties (PR 11777 follow-up)
*This is similar to the "isSymbolicFont"-property, which is no longer exported by default after PR 11777.*

Both "isMonospace" and "isSerifFont" are internal properties, used during font parsing and building of the glyph mapping on the worker-thread.
However, both of these properties are completely unused on the main-thread and/or in the API, and accessing them will now require setting the `fontExtraProperties`-option when calling `getDocument`.
2021-09-23 00:44:43 +02:00
Tim van der Meij
5254676ef3
Merge pull request #14055 from Snuffleupagus/PDF_TO_CSS_UNITS
Add `PDF_TO_CSS_UNITS` to the `PixelsPerInch`-structure
2021-09-22 22:24:51 +02:00
Jonas Jenwald
81a1c1cef7 Correctly validate URLs in XFA documents (bug 1731240)
With this patch we'll ensure that only valid absolute URLs can be used in XFA documents, similar to the existing validation done for "regular" PDF documents.
Furthermore, we'll also attempt to add a default protocol (i.e. `http`) to URLs beginning with "www." in XFA documents as well; this on its own is enough to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1731240
2021-09-21 21:21:01 +02:00
Jonas Jenwald
3e550f392a Add PDF_TO_CSS_UNITS to the PixelsPerInch-structure
Rather than re-computing this value in a number of different places throughout the code-base[1], we can expose this in the API via the existing `PixelsPerInch`-structure instead.
There's also been feature requests asking for the old `CSS_UNITS` viewer constant to be made accessible, such that it could be used in third-party implementations.

I suppose that it could be argued that it's somewhat confusing to place a unitless property in `PixelsPerInch`, however given that the `PDF_TO_CSS_UNITS`-property is defined strictly in terms of the existing properties this is hopefully deemed reasonable.

---
[1] These include:
 - The viewer, with the `CSS_UNITS` name.
 - The reference-tests.
 - The display-layer, when rendering images; see PR 13991.
2021-09-20 13:20:09 +02:00
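To illustrate what "defined strictly in terms of the existing properties" can mean for 3e550f392a, here is a minimal sketch. `PixelsPerInchSketch` is a stand-in name, and the getter is just one possible way to derive the value; 96 and 72 are the conventional CSS and PDF resolutions.

```javascript
// Sketch of a constants object; 96 and 72 are the conventional CSS and PDF
// resolutions per inch.
const PixelsPerInchSketch = {
  CSS: 96.0,
  PDF: 72.0,
  // Unitless scale factor, derived from the two values above.
  get PDF_TO_CSS_UNITS() {
    return this.CSS / this.PDF;
  },
};

// Example: convert the width of a US Letter page (612 PDF points)
// to CSS pixels.
const letterWidthCss = 612 * PixelsPerInchSketch.PDF_TO_CSS_UNITS;
```

Deriving the factor from `CSS` and `PDF` keeps the three values consistent if either base constant ever changes.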
Jonas Jenwald
8ea27ce157 Tweak how fonts with an /Encoding are handled in adjustToUnicode (issue 14048, PR 13277 follow-up)
Currently we only exclude /Encoding entries that also contain a /Differences array, which is the cause of the text-selection problem in the referenced issue.
In order to address this we'll now also exclude /Encoding entries that contain one of the predefined *named* encodings, and no longer require that it also contains a /Differences array.

*Please note:* This patch causes a small "regression" in the `bug1130815-text` test-case, however this is actually an improvement when compared with Adobe Reader and PDFium (in Google Chrome).
2021-09-18 22:44:25 +02:00
Tim van der Meij
83d3bb43f4
Merge pull request #14041 from Snuffleupagus/issue-9367
Support cmaps with only CID characters, when building the ToUnicode-map (issue 9367)
2021-09-18 16:47:06 +02:00
Jonas Jenwald
20eb6ca2ec
Merge pull request #14044 from calixteman/bug1719148
Annotations - Avoid empty value in text field when storage contains something for it (bug 1719148)
2021-09-18 16:31:45 +02:00
Jonas Jenwald
6634afd646
Merge pull request #14045 from calixteman/noise
XFA - Only warn about the wrong xfa type when there is an xfa thing
2021-09-18 16:13:20 +02:00
Tim van der Meij
c870fb489e
Merge pull request #14013 from Snuffleupagus/api-unittest-instanceof
Improve the API unit-tests, and try to expose more API-functionality in the TypeScript definitions
2021-09-18 16:08:19 +02:00
Calixte Denizet
2fc10727c5 XFA - Only warn about the wrong xfa type when there is an xfa thing
2021-09-18 15:44:05 +02:00
calixteman
ffa2572bdf
Merge pull request #14038 from calixteman/saveas
JS - Implement few possibilities with app.execMenuItem (bug 1724399)
2021-09-18 15:33:03 +02:00
Calixte Denizet
eb762ad624 Annotations - Avoid empty value in text field when storage contains something for it (bug 1719148)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1719148;
  - JS can set a property for a non-rendered annotation using the annotationStorage but the other unset default properties must be used when the annotation is finally rendered;
  - so this patch just adds the properties already set in the annotationStorage to the default value.
2021-09-18 15:08:22 +02:00
Calixte Denizet
bfd570038d JS - Implement few possibilities with app.execMenuItem (bug 1724399)
- it aims to fix: https://bugzilla.mozilla.org/show_bug.cgi?id=1724399.
2021-09-18 13:52:32 +02:00
Jonas Jenwald
e3223b68fc Extract some of the glyphMap handling, for non-embedded composite standard fonts, into a helper function
This reduces some unnecessary duplication, since we currently have essentially the same code in a handful of places in the `Font.fallbackToSystemFont`-method.
2021-09-18 12:39:48 +02:00
Jonas Jenwald
ed73cf6d50 Support cmaps with only CID characters, when building the ToUnicode-map (issue 9367)
In this particular case the `CMap`-data that we create contains only numbers, but no strings, which causes `PartialEvaluator.readToUnicode` to create a ToUnicode-map with only empty strings.

*Please note:* This is yet another case where I don't know if it's necessarily the best and most correct solution, but it does fix the referenced issue.
2021-09-18 00:26:15 +02:00
Calixte Denizet
e87c12bf34 JS - Avoid the Stay/Leave popup when clicking on a button with a JS action
- it aims to fix #14039.
2021-09-17 21:04:07 +02:00
Calixte Denizet
5bef8120e7 Annotation - For checkboxes, get field value from AS (if any) instead of V (bug 1722036)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1722036.
  - AS and V should share the same value for checkbox: it's at least what the specs say;
  - the pdf in the above bug opens correctly in Acrobat so it likely means that AS is chosen over V.
2021-09-17 13:04:16 +02:00
Brendan Dahl
d6a27860e3
Merge pull request #14025 from Snuffleupagus/issue-11915
Improve glyph mapping for non-embedded composite standard fonts with a /CIDToGIDMap (issue 11915)
2021-09-16 08:06:35 -07:00
Calixte Denizet
a3aa6dd6ab Annotation - Checkboxes with the same name and export values must be in unison
- it aims to fix #14024.
  - this patch adds an `acroformExportValue` attribute to the HTML input, so that the checked attribute can be set taking the exportValue into account for checkboxes with the same name.
2021-09-15 15:30:24 +02:00
Jonas Jenwald
a11343e9af Improve glyph mapping for non-embedded composite standard fonts with a /CIDToGIDMap (issue 11915)
*Please note:* All of this feels very handwavy, but at least it passes all tests locally. Hopefully we have enough tests for this part of the font code.

For non-embedded composite standard fonts with an "incomplete" /CIDToGIDMap, we'll now fallback to an *explicitly defined* /ToUnicode map even when that one happens to be an /Identity-H or /Identity-V map.

The `Font.fallbackToSystemFont` method is unfortunately getting more and more special-cases, however that might be unavoidable given all the weird non-embedded fonts found in the wild :-(
2021-09-15 11:30:40 +02:00
Calixte Denizet
9812e35916 XFA - Don't create images for unsupported mime types
2021-09-14 10:55:25 +02:00
Jonas Jenwald
95057a4e56 Try to expose more API-functionality in the TypeScript definitions
While these types apparently make sense in TypeScript environments, we really don't want to extend the *public* API by simply exporting the relevant classes directly in `src/pdf.js` (since they should never be called/initialized manually).

Please see e.g. issue 12384 where this was first requested, and note that a possible work-around was also provided there. This patch simply implements that work-around[1], which will hopefully be helpful to TypeScript users.

---
[1] Based on the discussion in PR 13957, the two previous patches appear to be necessary for this to actually work.
2021-09-13 13:57:56 +02:00
Jonas Jenwald
d854352cd5 Improve the API unit-tests by checking that PDFPageProxy.render returns a RenderTask-instance
This is similar to existing unit-tests, which check for `PDFDocumentProxy`- and `PDFPageProxy`-instances.
2021-09-13 13:34:37 +02:00
Jonas Jenwald
fa7a607d33 Improve the API unit-tests by checking that getDocument returns a PDFDocumentLoadingTask-instance
This is similar to existing unit-tests, which check for `PDFDocumentProxy`- and `PDFPageProxy`-instances.
2021-09-13 13:34:28 +02:00
Jonas Jenwald
7025b9f859 [src/core/writer.js] Support null values in the writeValue function
*This fixes something that I noticed, having recently looked at both the `Lexer.getObj` and `writeValue` code.*

Please note that I unfortunately don't have an example of a form where saving fails without this patch. However, given its overall simplicity and that unit-tests are added, it's hopefully deemed useful to fix this potential issue pro-actively rather than waiting for a bug report.

At this point one might, and rightly so, wonder if there are actually any real-world PDF documents where a `null` value is being used?
Unfortunately the answer is *yes*, and we have a couple of examples in the test-suite (although none of those are related to forms); please see: `issue1015`, `issue2642`, `issue10402`, `issue12823`, `issue13823`, and `pr12564`.
2021-09-12 18:24:37 +02:00
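A heavily reduced sketch of a value serializer that handles `null` and boolean values, in the spirit of 7025b9f859, might look like this. `writeValueSketch` is a hypothetical function covering only a few types; it is not the `writeValue` code in `src/core/writer.js`.

```javascript
// Hypothetical, heavily reduced serializer: appends the PDF syntax for a
// JavaScript value to `buffer` (an array of string fragments).
function writeValueSketch(value, buffer) {
  if (value === null) {
    buffer.push("null");
  } else if (typeof value === "boolean" || typeof value === "number") {
    // Booleans and numbers serialize directly via toString().
    buffer.push(value.toString());
  } else if (Array.isArray(value)) {
    buffer.push("[");
    value.forEach((item, i) => {
      if (i > 0) {
        buffer.push(" ");
      }
      writeValueSketch(item, buffer);
    });
    buffer.push("]");
  } else {
    throw new Error(`Unsupported value type: ${typeof value}`);
  }
}

const buffer = [];
writeValueSketch([true, null, 42], buffer);
const serialized = buffer.join("");
```

The `null` check has to come first, since `typeof null` is `"object"` and would otherwise end up in the unsupported branch.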
Jonas Jenwald
5d578ea36a [src/core/writer.js] Remove unnecessary string-wrapping for boolean values in writeValue (PR 13998 follow-up)
2021-09-12 15:45:45 +02:00
Jonas Jenwald
761519ef3f
Merge pull request #13998 from calixteman/bug1729971
Write boolean value when saving a form (bug 1729971)
2021-09-12 15:38:10 +02:00
Jonas Jenwald
a47844d1fc Let Lexer.getObj return a dummy-Cmd for commands that start with a non-visible ASCII character (issue 13999)
This way we avoid breaking badly generated PDF documents where a non-visible ASCII character is "glued" to a valid command.
2021-09-11 19:54:13 +02:00
Tim van der Meij
e97f01b17c
Merge pull request #13977 from Snuffleupagus/enqueueChunk-batch
[api-minor] Reduce `postMessage` overhead, in `PartialEvaluator.getTextContent`, by sending text chunks in batches (issue 13962)
2021-09-11 13:34:07 +02:00
Jonas Jenwald
0e54f568fb Re-factor the CSS_PIXELS_PER_INCH/PDF_PIXELS_PER_INCH exports (PR 13991 follow-up)
For improved maintainability, since these constants are being exposed in the official API, this patch moves them into an Object instead.
2021-09-11 11:15:25 +02:00
Jonas Jenwald
bd51bbfd16 Remove mozImageSmoothingEnabled fallback in CanvasGraphics.endGroup
This was added all the way back in PR 2936, however it's been unnecessary ever since Firefox 51 (released on 2017-01-24); please see the MDN compatibility data:
https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/imageSmoothingEnabled#browser_compatibility
2021-09-11 10:30:39 +02:00
Jonas Jenwald
9ce63a6dc6
Merge pull request #13991 from brendandahl/interpolate
Enable/disable image smoothing based on image interpolate value. (bug 1722191)
2021-09-11 10:02:53 +02:00
Brendan Dahl
f38fb42b42 Enable/disable image smoothing based on image interpolate value. (bug 1722191)
While some of the output looks worse to my eye, this behavior more
closely matches what I see when I open the PDFs in Adobe Acrobat.

Fixes: #4706, #9713, #8245, #1344
2021-09-10 14:23:35 -07:00
Calixte Denizet
474ab7c86d Write boolean value when saving a form (bug 1729971)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1729971#c4.
2021-09-10 14:10:25 +02:00
calixteman
57b80074a2
Merge pull request #13995 from calixteman/xfa_record
XFA - Handle $record shortcut in SOM expression (issue #13994)
2021-09-10 13:57:50 +02:00
Calixte Denizet
c5841b3794 XFA - Handle shortcut in SOM expression (issue #13994)
2021-09-09 19:54:45 +02:00
Calixte Denizet
623860bf8f XFA - Remove the checked attribute from the checkbox when unchecked (bug 1729877)
- it aims to fix: https://bugzilla.mozilla.org/show_bug.cgi?id=1729877.
2021-09-09 19:14:16 +02:00
Jonas Jenwald
45ddb12f61 Remove no-op onPull/onCancel streamSink callbacks from the "GetTextContent"-handler
The `MessageHandler`-implementation already handles either of these callbacks being undefined, hence there's no particular reason (as far as I can tell) to add no-op functions here.

Also, in a couple of `MessageHandler`-methods, utilize an already existing local variable more.
2021-09-09 00:01:10 +02:00
Jonas Jenwald
f90f9466e3 [api-minor] Reduce postMessage overhead, in PartialEvaluator.getTextContent, by sending text chunks in batches (issue 13962)
Following the STR in the issue, this patch reduces the number of `PartialEvaluator.getTextContent`-related `postMessage`-calls by approximately 78 percent.[1]
Note that by enforcing a relatively low value when batching text chunks, we should thus improve worst-case scenarios while not negatively affecting all `textLayer` building.

While working on these changes I noticed, thanks to our unit-tests, that the implementation of the `appendEOL` function unfortunately means that the number and content of the textItems could actually be affected by the particular chunking used.
That seems *extremely* unfortunate, since in practice this means that the particular chunking used is thus observable through the API. Obviously that should be a completely internal implementation detail, which is why this patch also modifies `appendEOL` to mitigate that.[2]

Given that this patch adds a *minimum* batch size in `enqueueChunk`, there's obviously nothing preventing it from becoming a lot larger than the limit (depending e.g. on the PDF structure and the CPU load/speed).
While sending more text chunks at once isn't an issue in itself, it could become problematic at the main-thread during `textLayer` building. Note how both the `PartialEvaluator` and `CanvasGraphics` implementations utilize `Date.now()`-checks, to prevent long-running parsing/rendering from "hanging" the respective thread. In the `textLayer` building we don't utilize such a construction[3], and streaming of textContent is thus essentially acting as a *simple* stand-in for that functionality.
Hence why we want to avoid choosing a too large minimum batch size, since that could thus indirectly affect main-thread performance negatively.

---
[1] While it'd be possible to go even lower, that'd likely require more invasive re-factoring/changes to the `PartialEvaluator.getTextContent`-code to ensure that the batches don't become too large.

[2] This should also, as far as I can tell, explain some of the regressions observed in the "enhance" text-selection tests back in PR 13257.
    Looking closer at the `appendEOL` function it should potentially be changed even more, however that should probably not be done here.

[3] I'd really like to avoid implementing something like that for the `textLayer` building as well, given that it'd require adding a fair bit of complexity.
2021-09-09 00:01:07 +02:00
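The minimum-batch-size idea from f90f9466e3 can be sketched as follows. Everything here is illustrative: `createBatcher`, `sink.enqueue`, and the threshold value are assumptions, since the commit message does not state the actual number used.

```javascript
// Assumed threshold; the commit message does not state the value it uses.
const MIN_BATCH_SIZE = 10;

// `sink.enqueue` stands in for whatever transfers a chunk to the main thread.
function createBatcher(sink, minBatchSize = MIN_BATCH_SIZE) {
  let items = [];
  return {
    add(item) {
      items.push(item);
    },
    // Send only once enough items have accumulated, unless forced
    // (e.g. when the end of the page's content is reached).
    enqueueChunk(force = false) {
      if (items.length === 0 || (!force && items.length < minBatchSize)) {
        return;
      }
      sink.enqueue(items);
      items = [];
    },
  };
}
```

Because the threshold is a minimum rather than a maximum, a batch can still grow past it when many items are added between two `enqueueChunk` calls, which matches the caveat in the commit message.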
Jonas Jenwald
69034ab8dc Improve glyph mapping for non-embedded composite standard fonts (issue 11088)
For non-embedded CIDFontType2 fonts with a non-/Identity encoding, use the /ToUnicode data to improve the glyph mapping.
2021-09-08 15:15:33 +02:00
Jonas Jenwald
4c1b586dd2 Reduce the size of TextLayerRenderTask._textDivProperties in "regular" text-selection mode
While these changes will obviously not have a significant effect on overall memory usage, it cannot hurt as far as I'm concerned. This patch makes the following changes:
 - Clear out `_textDivProperties` once rendering is done, since those properties are only necessary to keep alive when *enhanced* text-selection is being used.

 - Reduce the size of the `_textDivProperties`-entries by default, since a majority of the properties are only relevant when *enhanced* text-selection is being used.
2021-09-05 12:12:34 +02:00
Tim van der Meij
1b20f61b56
Merge pull request #13972 from Snuffleupagus/issue-13971
Treat all content as visible when no optional content groups are defined (issue 13971)
2021-09-04 15:53:44 +02:00
Tim van der Meij
680f33c31c
Merge pull request #13961 from Snuffleupagus/simpler-regexp
Simplify some regular expressions
2021-09-04 15:39:30 +02:00
Jonas Jenwald
6318ccf6d2 Treat all content as visible when no optional content groups are defined (issue 13971)
In the referenced PDF document the /Contents stream contains MarkedContent-operators, however no optional content dictionary exists; according to [the specification](https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G7.3883825):

> Null values or references to deleted objects shall be ignored. If this entry is
  not present, is an empty array, or contains references only to null or deleted
  objects, the membership dictionary shall have no effect on the visibility of
  any content.
2021-09-04 08:13:37 +02:00
Jonas Jenwald
3ccf277f58 Fallback to the /ToUnicode map for TrueType fonts with (3, 1) and (1, 0) cmap-tables (issue 13316)
In the PDF document some of the glyphs have bogus `differences`-entries[1] that cannot be resolved to valid glyph names, thus causing the glyph mapping to fail.
My initial idea was to use a similar approach as in the `PartialEvaluator._simpleFontToUnicode`-method, to extract the charCodes from those entries, however it turned out that that didn't actually help in this case (the mapping was still wrong).

To fix this I'm thus proposing that we fallback to the /ToUnicode map when no other useable data exists (e.g. no post-table), since it *hopefully* shouldn't make things any worse than leaving parts of the glyph map empty (which currently happens).

---
[1] As can be seen below, some of the entries are completely normal while others are non-standard:
```
Differences (array)
    0 = 65
    1 = /g5167
    2 = /space
    3 = /g11927
    4 = /g17737
    5 = /g11540
    6 = /g2180
    7 = /K
    8 = /P
    9 = /two
    10 = /zero
    11 = /one
    12 = /five
    13 = /four
    14 = /g6932
    15 = /g7246
    16 = /g1691
    17 = /g2343
    18 = /g14792
    19 = /g3325
    20 = /g4280
    21 = /g20383
    22 = /g18166
    23 = /g16988
    24 = /g17943
    25 = /g19223
    26 = /g10830
    27 = 97
    28 = /g982
    29 = /g1226
    30 = /g5059
    31 = /g2677
    32 = /g1042
    33 = /g11568
    34 = /L
    35 = /three
    36 = /seven
    37 = /g2364
    38 = /g12063
    39 = /g5356
    40 = /g2173
    41 = /g17877
    42 = /g7273
    43 = /g7647
    44 = /g7224
    45 = /g19327
    46 = /g5054
    47 = /g2342
    48 = /g10136
    49 = /g6856
    50 = /g13381
    51 = /g7257
    52 = /g12093
    53 = /g2359
```
2021-09-04 07:38:22 +02:00
Brendan Dahl
da15dbf962
Merge pull request #13698 from linfangrong/master
[FIX] fix jpx tag tree decode (issue 11957)
2021-09-03 10:00:19 -07:00
Brendan Dahl
a8ce15a2d7
Merge pull request #13966 from calixteman/no_ns
XFA - Created data node mustn't belong to datasets namespace
2021-09-03 09:59:40 -07:00
Calixte Denizet
77b9657e57 XFA - Overwrite AcroForm dictionary when saving if no datasets in XFA (bug 1720179)
- aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1720179
  - in some pdfs the XFA array in the AcroForm dictionary doesn't contain an entry for 'datasets' (which contains saved data), so this patch allows overwriting the AcroForm dictionary with an updated XFA array when doing an incremental update.
2021-09-03 17:04:03 +02:00
Calixte Denizet
57ae3a5a76 XFA - Created data node mustn't belong to datasets namespace
- when some named nodes in the template don't have their counterpart in datasets we create some nodes: the main node mustn't belong to the datasets namespace because it doesn't make sense and Acrobat Reader isn't able to read pdf with such nodes.
  - so created nodes under a datasets node have a namespaceId set to -1 and consequently when serialized no namespace prefix will appear.
2021-09-03 15:43:25 +02:00
Brendan Dahl
804abb3786
Merge pull request #13959 from calixteman/encrypt
Correctly pad strings when saving an encrypted pdf (bug 1726789)
2021-09-02 11:41:02 -07:00
Jonas Jenwald
c42887221a Simplify some regular expressions
There's a fair number of regular expressions throughout the code-base which are slightly more verbose than strictly necessary, in particular:
 - We have a lot of regular expressions that use `[0-9]` explicitly, and those can be simplified to use `\d` instead.
 - We have one instance of a regular expression containing a `A-Za-z0-9_` sequence, which can be simplified to use `\w` instead.
2021-09-02 11:50:42 +02:00
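The two simplifications listed in c42887221a rest on equivalences that hold for JavaScript regular expressions without the `u`/`i` flags. A small sketch, with made-up patterns rather than ones from the code-base:

```javascript
// Verbose character classes and their shorthand equivalents.
const verboseDigits = /^[0-9]+$/;
const conciseDigits = /^\d+$/;

const verboseWord = /^[A-Za-z0-9_]+$/;
const conciseWord = /^\w+$/;

// Both variants of each pair accept and reject the same sample inputs.
const samples = ["2021", "page_12", "12a", "", "a-b", "١٢٣"];
const digitsAgree = samples.every(
  s => verboseDigits.test(s) === conciseDigits.test(s)
);
const wordsAgree = samples.every(
  s => verboseWord.test(s) === conciseWord.test(s)
);
```

Note that in JavaScript `\d` matches only ASCII digits, which is why the last sample (Arabic-Indic digits) is rejected by both digit patterns.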
Calixte Denizet
9619bf92be Correctly pad strings when saving an encrypted pdf (bug 1726789)
2021-09-02 10:37:21 +02:00
Tim van der Meij
0a366dda6a
Merge pull request #13955 from Snuffleupagus/issue-13433
Always prefer the post-table for TrueType fonts with (0, x) cmap-tables (issue 13433)
2021-09-01 21:46:34 +02:00
Tim van der Meij
19ce2de6f7
Merge pull request #13952 from Snuffleupagus/ItcSymbol
Extend `getNonStdFontMap` for non-embedded versions of the ItcSymbol font (issue 11532)
2021-09-01 21:38:59 +02:00
Jonas Jenwald
b7b6076294 Always prefer the post-table for TrueType fonts with (0, x) cmap-tables (issue 13433)
While I don't know if this is necessarily the "correct" solution, it does fix issue 13433 without breaking any of the existing reference-tests.
2021-09-01 12:35:49 +02:00
Jonas Jenwald
ba9f004097 Extend getNonStdFontMap for non-embedded versions of the ItcSymbol font (issue 11532)
Despite its name, the fonts in ItcSymbol-family are "regular" fonts and not Symbol ones. However, given that the font name contains the word "Symbol" we ended up picking the wrong code-path in the `Font.fallbackToSystemFont`-method.

*Please note:* While this patch ensures that the text becomes readable, by falling back to a standard font, the rendering will obviously not be perfect. However, that's the PDF generator's "fault" since non-embedded fonts cannot be guaranteed to render correctly in all environments.
2021-08-31 23:21:16 +02:00
Jonas Jenwald
1f56451d56 Implement PDFNetworkStreamRangeRequestReader._onError, to handle range request errors with XMLHttpRequest (issue 9883)
Given that the Fetch API is normally being used now, these changes are probably less important now than they used to be. However, given that it's simple enough to implement this I figured why not just fix issue 9883 (better late than never I suppose).
2021-08-31 10:23:57 +02:00
Jonas Jenwald
bd9a92a161 Use optional chaining more in the src/display/network.js file
Also changes the different `_onDone`/`_onProgress` methods to use consistent parameter names, and some other small improvements.
2021-08-31 10:23:54 +02:00
linfangrong
369f1899c6 [FIX] fix jpx tag tree decode (issue 11957) 2021-08-31 11:44:26 +08:00
Brendan Dahl
a7f807b059 Only use base encoding if it's populated. (bug 1727053)
The font dict in this file has an encoding entry, but only specifies a
differences map. The base encoding is empty in this case and shouldn't
be used.
2021-08-30 12:51:59 -07:00
Brendan Dahl
306119b12a
Merge pull request #13932 from Snuffleupagus/oc-images
Support Optional Content in Image-/XObjects (issue 13931)
2021-08-30 10:10:14 -07:00
Jonas Jenwald
cf0ccc4bab
Merge pull request #13937 from overleaf/jpa-fix-error-handling
Fix handling of fetch errors
2021-08-30 15:50:03 +02:00
Jakob Ackermann
291ffd3059
Fix handling of fetch errors
Testing:
- delete the pdf file while the initial request is inflight
- delete the pdf file after the initial request has finished

Repeat for a small file and large file, exercising both one-off and
 chunked transports.
2021-08-30 12:43:28 +01:00
Tim van der Meij
954e1a1694
Merge pull request #13943 from Snuffleupagus/api-more-async
Use `async` a bit more in the API
2021-08-29 14:34:14 +02:00
Jonas Jenwald
ce3f5ea2bf Use async a bit more in the API
This patch changes the `PDFDocumentLoadingTask.destroy`-method and the `_fetchDocument`-function to be `async`, which slightly simplifies the relevant code.

Furthermore, remove the catch-handler from the `WorkerTransport.getPageIndex`-method since it's no longer needed. Given that the `MessageHandler` is nowadays wrapping every possible Exception, it's no longer necessary to try and re-wrap the reason here.
2021-08-29 12:31:28 +02:00
Jonas Jenwald
9ea3fa0747 Ensure that PasswordException is handled correctly in the wrapReason function
While running the unit-tests with some logging statements added to this code, I noticed that `PasswordException` was missing from the list of potential Errors that could be passed to the `wrapReason` function.
2021-08-28 12:24:12 +02:00
Tim van der Meij
153d058b3a
Merge pull request #13933 from brendandahl/xfa-checkbox2
Fix saving of XFA checkboxes. (bug 1726381)
2021-08-27 22:45:44 +02:00
Jonas Jenwald
b34d2cdc42 Ensure that beginMarkedContentProps/endMarkedContent-operators, for /XObjects, are balanced in corrupt documents (PR 13854 follow-up)
Something that I *just* realized is that while PR 13854 fixed an issue as reported, it could still cause bugs in other similarly broken documents since we'll not insert a matching endMarkedContent-operator in the operatorList.
2021-08-26 17:05:30 +02:00
Jonas Jenwald
853b1172a1 Support Optional Content in Image-/XObjects (issue 13931)
Currently, in the `PartialEvaluator`, we only support Optional Content in Form-/XObjects. Hence this patch adds support for Image-/XObjects as well, which looks like a simple oversight in PR 12095 since the canvas-implementation already contains the necessary code to support this.
2021-08-26 16:54:15 +02:00
Brendan Dahl
6d2193a812 Fix saving of XFA checkboxes. (bug 1726381)
Previously we were always setting the storage value to the on value.
2021-08-24 15:53:55 -07:00
Jonas Jenwald
2a0ad8e696 Add deprecation warnings for the renderInteractiveForms and includeAnnotationStorage options, in PDFPageProxy.render
*This is done separately from the previous patch, to make it easier to revert these changes once they've been included in a couple of releases.*

Please note that because these two options are mutually exclusive, which is a large part of the reason for the previous patch, it's not guaranteed that the fallback-values will always be correct in every situation (but it's the best that we can do).
2021-08-24 01:40:12 +02:00
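A rough sketch of how the fallback described in 2a0ad8e696 might map the deprecated options onto a single mode value. The `AnnotationMode` names and numbers, as well as the `resolveAnnotationMode` helper, are assumptions made for illustration; they are not quoted from the patch.

```javascript
// Assumed enumeration; names and values are illustrative only.
const AnnotationMode = {
  DISABLE: 0,
  ENABLE: 1,
  ENABLE_FORMS: 2,
  ENABLE_STORAGE: 3,
};

// Hypothetical helper: derive one mode from the new and the deprecated options.
function resolveAnnotationMode({
  annotationMode = AnnotationMode.ENABLE,
  renderInteractiveForms,
  includeAnnotationStorage,
} = {}) {
  // The legacy flags only apply while the mode still has its default value.
  if (renderInteractiveForms === true && annotationMode === AnnotationMode.ENABLE) {
    annotationMode = AnnotationMode.ENABLE_FORMS;
  }
  if (includeAnnotationStorage === true && annotationMode === AnnotationMode.ENABLE) {
    annotationMode = AnnotationMode.ENABLE_STORAGE;
  }
  return annotationMode;
}
```

Because the two legacy options are mutually exclusive, this sketch simply lets `renderInteractiveForms` win when both are set, which matches the note that the fallback cannot be correct in every situation.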
Jonas Jenwald
41efa3c071 [api-minor] Introduce a new annotationMode-option, in PDFPageProxy.{render, getOperatorList}
*This is a follow-up to PRs 13867 and 13899.*

This patch is tagged `api-minor` for the following reasons:
 - It replaces the `renderInteractiveForms`/`includeAnnotationStorage`-options, in the `PDFPageProxy.render`-method, with the single `annotationMode`-option that controls which annotations are being rendered and how. Note that the old options were mutually exclusive, and setting both to `true` would result in undefined behaviour.

 - For improved consistency in the API, the `annotationMode`-option will also work together with the `PDFPageProxy.getOperatorList`-method.

 - It's now also possible to disable *all* annotation rendering in both the API and the Viewer, since the other changes meant that this could now be supported with a single added line on the worker-thread[1]; fixes 7282.

---
[1] Please note that in order to simplify the overall implementation, we'll purposely only support disabling of *all* annotations and that the option is being shared between the API and the Viewer. For any more "specialized" use-cases, where e.g. only some annotation-types are being rendered and/or the API and Viewer render different sets of annotations, that'll have to be handled in third-party implementations/forks of the PDF.js code-base.
2021-08-24 01:13:02 +02:00
Brendan Dahl
56e7bb626c
Merge pull request #13660 from calixteman/no_xfaf
XFA - Disable xfa rendering for XFAF pdfs
2021-08-23 12:30:29 -07:00
Calixte Denizet
04573d2dc8 XFA - Disable xfa rendering for XFAF pdfs
- we'll implement XFAF support later.
2021-08-23 12:18:20 -07:00
Brendan Dahl
bf5a45ce6d
Merge pull request #13908 from brendandahl/xfa-find
[api-minor] XFA - Support text search in XFA documents.
2021-08-23 08:53:02 -07:00
Brendan Dahl
bb47128864 XFA - Support text search in XFA documents.
Moves the logic out of TextLayerBuilder to handle
highlighting matches into a new separate class `TextHighlighter`
that can be used with regular PDFs and XFA PDFs.

To mimic the current find functionality in XFA, two arrays
from the XFA rendering are created to get the text content
and map those to DOM nodes.

Fixes #13878
2021-08-23 08:44:20 -07:00
Tim van der Meij
83e1064360
Merge pull request #13920 from Snuffleupagus/issue-13916
Extend the glyph maps for standard respectively Calibri fonts (issue 13916)
2021-08-21 15:05:08 +02:00
Tim van der Meij
db11ba024d
Merge pull request #13899 from Snuffleupagus/includeAnnotationStorage-fix-caching
[Regression] Re-factor the *internal* `includeAnnotationStorage` handling, since it's currently subtly wrong
2021-08-21 15:04:28 +02:00
Jonas Jenwald
ac27f96987 Extend the glyph maps for standard respectively Calibri fonts (issue 13916) 2021-08-21 00:48:38 +02:00
Jonas Jenwald
5f25fea0fe Re-factor the LocalTilingPatternCache to cache by Ref rather than Name (PR 12458 follow-up, issue 13780)
This way there cannot be any *incorrect* cache hits, since Refs are guaranteed to be unique.
Please note that the reason for caching by Ref rather than doing something along the lines of the `localShadingPatternCache` (which uses a `Map` directly), is that TilingPatterns are streams and those cannot be cached on the `XRef`-instance (this way we avoid unnecessary parsing).
2021-08-18 12:49:01 +02:00
Jonas Jenwald
8ee5acd85d Tweak handling of the onlyRefs-option in the BaseLocalCache class 2021-08-18 12:24:51 +02:00
Jonas Jenwald
a7f0301f21 [Regression] Re-factor the *internal* includeAnnotationStorage handling, since it's currently subtly wrong
*This patch is very similar to the recently fixed `renderInteractiveForms`-options, see PR 13867.*
As far as I can tell, this *subtle* bug has existed ever since `AnnotationStorage`-support was first added in PR 12106 (a little over a year ago).

The value of the `includeAnnotationStorage`-option, as passed to the `PDFPageProxy.render` method, will (potentially) affect the size/content of the operatorList that's returned from the worker (for documents with forms).
Given that operatorLists will generally, unless they contain huge images, be cached in the API, repeated `PDFPageProxy.render` calls where the form-data has been changed by the user in between, can thus *wrongly* return a cached operatorList.

In the viewer we're only using the `includeAnnotationStorage`-option when printing, which is probably why this has gone unnoticed for so long. Note that we, for performance reasons, don't cache printing-operatorLists in the API.
However, there's nothing stopping an API-user from using the `includeAnnotationStorage`-option during "normal" rendering, which could thus result in *subtle* (and difficult to understand) rendering bugs.

In order to handle this, we need to know if the `AnnotationStorage`-instance has been updated since the last `PDFPageProxy.render` call. The most "correct" solution would obviously be to create a hash of the `AnnotationStorage` contents, however that would require adding a bunch of code, complexity, and runtime overhead.
Given that operatorList caching in the API doesn't have to be perfect[1], but only has to avoid *false* cache-hits, we can simplify things significantly by only keeping track of the last time that the `AnnotationStorage`-data was modified.

*Please note:* While working on this patch, I also noticed that the `renderInteractiveForms`- and `includeAnnotationStorage`-options in the `PDFPageProxy.render` method are mutually exclusive.[2]
Given that the various Annotation-related options in `PDFPageProxy.render` have been added at different times, this has unfortunately led to the current "messy" situation.[3]

---
[1] Note how we're already not caching operatorLists for pages with *huge* images, in order to save memory, hence there's no guarantee that operatorLists will always be cached.

[2] Setting both to `true` will result in undefined behaviour, since trying to insert `AnnotationStorage`-values into fields that are being excluded from the operatorList-building will obviously not work, which isn't at all clear from the documentation.

[3] My intention is to try and fix this in a follow-up PR, and I've got a WIP patch locally, however it will result in a number of API-observable changes.
2021-08-18 10:09:03 +02:00
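The idea in a7f0301f21 of avoiding false cache hits by tracking the last modification, rather than hashing the contents, can be sketched as follows. `SketchAnnotationStorage` and `getCacheKey` are hypothetical names, and a simple counter stands in for the modification timestamp.

```javascript
// Hypothetical, simplified stand-in for an annotation storage.
class SketchAnnotationStorage {
  #values = new Map();
  #modifications = 0;

  setValue(key, value) {
    // Writing an unchanged value does not count as a modification.
    if (this.#values.has(key) && this.#values.get(key) === value) {
      return;
    }
    this.#values.set(key, value);
    this.#modifications++;
  }

  // Empty string while untouched, otherwise a marker that changes on each edit.
  get lastModified() {
    return this.#modifications === 0 ? "" : String(this.#modifications);
  }
}

// Hypothetical cache key: any edit to the storage yields a different key.
function getCacheKey(intent, storage) {
  return `${intent}_${storage.lastModified}`;
}
```

A key built this way may cause some unnecessary cache misses, but it should never return an operatorList that was built from stale form data.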
Jonas Jenwald
1465b1670f [src/display/api.js] Move the getRenderingIntent helper function into WorkerTransport
By doing this re-factoring *separately*, since it's mostly a mechanical change, the size/scope of the next patch will be reduced somewhat.
2021-08-18 09:58:26 +02:00
Tim van der Meij
e9146b19e6
Merge pull request #13892 from Snuffleupagus/Dict-merge-refactor-2
Move some validation, in `Dict.merge`, used during merging of sub-dictionaries (PR 13775 follow-up)
2021-08-14 12:26:19 +02:00
Jonas Jenwald
e2aa067603 Simplify the ReadableStream polyfill
At this point in time, all of the supported browsers (in the PDF.js project) have native `ReadableStream` implementations; see https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream#browser_compatibility

Hence the polyfill is *only* necessary in Node.js environments now, and we shouldn't need to do any detailed feature detection either (since that was only done for the non-Chromium versions of the MS Edge browser).
Finally, we can slightly reduce the size of the Chromium-extension since the polyfill shouldn't be needed there either.
2021-08-13 12:28:55 +02:00
Jonas Jenwald
3369f9a783 Move some validation, in Dict.merge, used during merging of sub-dictionaries (PR 13775 follow-up)
By not adding any additional non-`Dict` entries to the list of candidates for merging of sub-dictionaries, we can very slightly reduce the amount of parsing required by not having to *again* iterate through unmergeable data.
2021-08-12 11:32:11 +02:00
Jonas Jenwald
6167566f1b Re-factor the BaseException.name handling, and clean-up some code
Once we're finally able to get rid of SystemJS, which is unfortunately still blocked on [bug 1247687](https://bugzilla.mozilla.org/show_bug.cgi?id=1247687), we might also want to clean-up (or even completely remove) the `BaseException` abstraction and simply extend `Error` directly instead.

At that point we'd need to (explicitly) set the `name` on each class anyway, so this patch is essentially preparing for future clean-up. Furthermore, after the `BaseException` abstraction was added there's been *multiple* issues filed about third-party minification breaking our code since `this.constructor.name` is not guaranteed to always do what you intended.

While hard-coding the strings indeed feels quite unfortunate, it's likely the "best" solution to avoid the problem described above.
2021-08-10 11:27:47 +02:00
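The minification problem mentioned in 6167566f1b can be reproduced in a few lines. The class names below are hypothetical; the single-letter class names imitate what a minifier might emit.

```javascript
// Imitates minified output: the class has been renamed to "a".
const FragileException = (() => {
  class a extends Error {
    constructor(message) {
      super(message);
      // Depends on the class name surviving the build step.
      this.name = this.constructor.name;
    }
  }
  return a;
})();

// Same renaming, but the name is hard-coded and therefore unaffected.
const RobustException = (() => {
  class b extends Error {
    constructor(message) {
      super(message);
      this.name = "SketchPasswordException"; // Hypothetical name.
    }
  }
  return b;
})();

const fragileName = new FragileException("oops").name;
const robustName = new RobustException("oops").name;
```

Code that checks `error.name` against a known string keeps working with the hard-coded variant, whereas the first variant reports whatever name the minifier chose.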
Jonas Jenwald
7f2d524df5 Improve caching of Annotations-data, by using a Map, in the API
Rather than caching only the *last* `PDFPageProxy.getAnnotations` call, and having to handle the intent separately, we can instead implement the caching in exactly the same way as done in the `PDFPageProxy.{render, getOperatorList}` methods.
2021-08-08 08:14:51 +02:00
Tim van der Meij
036b81496e
Merge pull request #13882 from Snuffleupagus/PDFWorker-rm-closure
[api-minor] Remove the closure from the `PDFWorker` class, in the `src/display/api.js` file
2021-08-07 19:52:39 +02:00
Tim van der Meij
952f6366bf
Merge pull request #13867 from Snuffleupagus/RenderingIntentFlag
[api-minor] Re-factor the *internal* renderingIntent, and change the default `intent` value in the `PDFPageProxy.getAnnotations` method
2021-08-07 19:25:51 +02:00
Jonas Jenwald
1cf9405281 [api-minor] Remove the closure from the PDFWorker class, in the src/display/api.js file
This patch removes the only remaining closure in the `src/display/api.js` file, utilizing a similar approach as used in lots of other parts of the code-base, which results in a small decrease in the size of the *build* `pdf.js` file.

Given that `PDFWorker` is exposed through the *public* API, this complicates things somewhat since there's a couple of worker-related properties that really should stay *private*. Initially, while working on PR 13813, I believed that we'd need support for private (static) class fields in order to get rid of this closure, however I've managed to come up with what's hopefully deemed an acceptable work-around here.
Furthermore, some helper functions were simply moved into the `PDFWorker` class as static methods, thus simplifying the overall implementation (e.g. we don't need to manually cache the Promise in the `PDFWorker._setupFakeWorkerGlobal`-method).

Finally, as part of this re-factoring a number of missing JSDoc-comments were added which *together* with the removal of the closure significantly improves the `gulp jsdoc` output for the `PDFWorker` class.

*Please note:* This patch is tagged with `api-minor` since it deprecates `PDFWorker.getWorkerSrc()` in favor of the shorter `PDFWorker.workerSrc`, with the fallback limited to `GENERIC` builds.
2021-08-07 10:43:39 +02:00
Brendan Dahl
3d18c76a53
Merge pull request #13881 from calixteman/bug_1723734
XFA - Elements under an area must be bound (bug 1723734)
2021-08-06 11:56:58 -07:00
Calixte Denizet
328383ea7a XFA - Elements under an area must be bound (bug 1723734)
- aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1723734.
2021-08-06 20:20:19 +02:00
Jonas Jenwald
107efdb178 [Regression] Re-factor the *internal* renderInteractiveForms handling, since it's currently subtly wrong
The value of the `renderInteractiveForms` parameter, as passed to the `PDFPageProxy.render` method, will (potentially) affect the size/content of the operatorList that's returned from the worker (for documents with forms).
Given that operatorLists will generally, unless they contain huge images, be cached in the API, repeated `PDFPageProxy.render` calls that *only* change the `renderInteractiveForms` parameter can thus return an incorrect operatorList.

As far as I can tell, this *subtle* bug has existed ever since `renderInteractiveForms`-support was first added in PR 7633 (which is almost five years ago).
With the previous patch, fixing this is now really simple by "encoding" the `renderInteractiveForms` parameter in the *internal* renderingIntent handling.
2021-08-06 00:40:43 +02:00
Jonas Jenwald
47f94235ab [api-minor] Re-factor the *internal* renderingIntent, and change the default intent value in the PDFPageProxy.getAnnotations method
With the changes made in PR 13746 the *internal* renderingIntent handling became somewhat "messy", since we're now having to do string-matching in various spots in order to handle the "oplist"-intent correctly.
Hence this patch, which implements the idea from PR 13746 to convert the `intent`-strings, used in various API-methods, into an *internal* renderingIntent that's implemented using a bit-field instead. *Please note:* This part of the patch, in itself, does *not* change the public API (but see below).

This patch is tagged `api-minor` for the following reasons:
 1. It changes the *default* value for the `intent` parameter, in the `PDFPageProxy.getAnnotations` method, to "display" in order to be consistent across the API.
 2. In order to get *all* annotations, with the `PDFPageProxy.getAnnotations` method, you now need to explicitly set "any" as the `intent` parameter.
 3. The `PDFPageProxy.getOperatorList` method will now also support the new "any" intent, to allow accessing the operatorList of all annotations (limited to those types that have one).
 4. Finally, for consistency across the API, the `PDFPageProxy.render` method also supports the new "any" intent (although I'm not sure how useful that'll be).

Points 1 and 2 above are the significant, and thus breaking, changes in *default* behaviour here. However, unfortunately I cannot see a good way to improve the overall API while also keeping `PDFPageProxy.getAnnotations` unchanged.
2021-08-06 00:39:42 +02:00
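A minimal sketch of converting `intent` strings into a bit-field, as 47f94235ab describes. The flag values and the exact signature of `getRenderingIntent` are assumptions; only the general approach comes from the commit message.

```javascript
// Assumed flag values; the real internal constants may differ.
const RenderingIntentFlag = {
  ANY: 0x01,
  DISPLAY: 0x02,
  PRINT: 0x04,
  OPLIST: 0x100,
};

// Hypothetical conversion from the public intent string to internal flags.
function getRenderingIntent(intent = "display", isOpList = false) {
  let renderingIntent;
  switch (intent) {
    case "any":
      renderingIntent = RenderingIntentFlag.ANY;
      break;
    case "display":
      renderingIntent = RenderingIntentFlag.DISPLAY;
      break;
    case "print":
      renderingIntent = RenderingIntentFlag.PRINT;
      break;
    default:
      console.warn(`Invalid intent "${intent}", falling back to "display".`);
      renderingIntent = RenderingIntentFlag.DISPLAY;
  }
  if (isOpList) {
    // Extra state is OR-ed in instead of being encoded in a string.
    renderingIntent |= RenderingIntentFlag.OPLIST;
  }
  return renderingIntent;
}

// Checks become bit tests rather than string matching.
const printOpList = getRenderingIntent("print", true);
const isPrint = (printOpList & RenderingIntentFlag.PRINT) !== 0;
```

With a bit-field, further state such as the form-rendering option can be added as another flag without changing the public `intent` strings.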
Brendan Dahl
a38d1122d8 XFA - Support aria heading and table structure. (bug 1723421) (bug 1723425)
https://bugzilla.mozilla.org/show_bug.cgi?id=1723421
https://bugzilla.mozilla.org/show_bug.cgi?id=1723425
2021-08-05 15:25:04 -07:00
Calixte Denizet
fef939d347 Annotation & XFA: Add focus outlines on different fields (bug 1723615, bug 1718528)
- set a default tabindex to be sure they'll be taken into account in the TAB cycle (https://bugzilla.mozilla.org/show_bug.cgi?id=1723615).
  - show default outline when fields are focused (it was an a11y bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1718528).
2021-08-05 13:33:46 +02:00
Calixte Denizet
71a100a4d0 Annotation & XFA: Scale the font size in choicelist using zoom factor (bug 1715996)
- this is an accessibility issue which could be painful for some people with visual disabilities.
2021-08-04 20:36:04 +02:00
calixteman
52ef63f1fe
Merge pull request #13856 from calixteman/xfa_layout_rounding
XFA - Avoid to put something in very small areas
2021-08-04 10:09:13 +02:00
Brendan Dahl
3e003245b1 [XFA] Add alt text for images. (bug 1723418)
Not many XFA PDFs have alt text.

Some examples:
bug1723422.pdf
xfa_bug1718670_1.pdf
xfa_issue13611.pdf
xfa_issue13633.pdf
xfa_issue13634.pdf
2021-08-03 17:18:58 -07:00
Brendan Dahl
6cf1ee3251
Merge pull request #13858 from brendandahl/xfa-aria-label
Add aria-labels to XFA form elements. (bug 1723422)
2021-08-03 17:18:08 -07:00
Brendan Dahl
6ea56f35ab Add aria-labels to XFA form elements. (bug 1723422) 2021-08-03 15:58:33 -07:00
Tim van der Meij
85be62c684
Merge pull request #13854 from Snuffleupagus/issue-13851
Prevent breaking errors when an optional content group is undefined (issue 13851)
2021-08-03 23:34:34 +02:00
Tim van der Meij
ad90fe90ed
Merge pull request #13848 from Snuffleupagus/rm-lgtm
Remove the LGTM configuration and inline disable comments (issue 13829)
2021-08-03 23:13:05 +02:00
Jonas Jenwald
766299016f Remove the isEOF helper function and slightly re-factor EOF
Given how trivial the `isEOF` function is, we can simply inline the check at the various call-sites and remove the function (which ought to be ever so slightly more efficient as well).
Furthermore, this patch also changes the `EOF` primitive itself to a `Symbol` instead of an Object since that has the nice benefit of making it unclonable (thus preventing *accidentally* trying to send `EOF` from the worker-thread).
2021-08-03 20:19:32 +02:00
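The "unclonable" property mentioned in 766299016f follows from the structured clone algorithm, which rejects Symbols. The sketch below assumes a runtime with a global `structuredClone`; the variable names are hypothetical.

```javascript
// Hypothetical stand-ins for an Object-based and a Symbol-based EOF primitive.
const objectEOF = {};
const symbolEOF = Symbol("EOF");

// Returns true when the value survives the structured clone algorithm.
function isCloneable(value) {
  try {
    structuredClone(value);
    return true;
  } catch {
    return false;
  }
}

// With the helper function gone, call-sites compare against the primitive directly.
function isEndOfFile(token) {
  return token === symbolEOF;
}
```

An empty Object clones without complaint (and the clone is no longer identical to the original), whereas an attempt to send the Symbol across a thread boundary fails immediately.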
Calixte Denizet
be1ee155d1 XFA - Avoid to put something in very small areas
- it aims to fix #13855.
2021-08-03 17:05:29 +02:00
Jonas Jenwald
d5e14d3dc3 Prevent breaking errors when an optional content group is undefined (issue 13851)
In the referenced PDF document *most* of the form `/Form` XObjects don't have an `/OC` entry, which thus causes the runtime failure during rendering.
2021-08-03 15:59:29 +02:00
Jonas Jenwald
8fef8630fe Remove the LGTM configuration and inline disable comments (issue 13829)
Given that the GitHub Advanced Security workflow now covers everything that LGTM does, but generally faster and with better GitHub-integration, there's no longer much point in also running LGTM separately.
As a follow-up to this patch, we should also disable/remove the LGTM-integration from the PDF.js repository.
2021-08-03 11:14:49 +02:00
Jonas Jenwald
705d1cfad3 Remove useless assignment of availableSpace in the src/core/xfa/template.js file (issue 13829, 13835) 2021-08-03 10:58:57 +02:00
Tim van der Meij
10a1db6980
Merge pull request #13824 from Snuffleupagus/issue-13823
When no "V" entry exists, let the fieldValue fallback to the "DV" entry (issue 13823)
2021-07-30 22:48:38 +02:00
Tim van der Meij
67f4c34f63
Merge pull request #13822 from Snuffleupagus/ReadableStreams-cancel-no-Uncaught_promise
Prevent "Uncaught promise" messages in the console when cancelling (some) `ReadableStream`s
2021-07-30 22:09:29 +02:00
Tim van der Meij
99b14a9da0
Merge pull request #13813 from Snuffleupagus/rm-closure-API
Remove a couple of closures in the `src/display/api.js` file
2021-07-30 21:55:45 +02:00
Jonas Jenwald
ff71be793d When no "V" entry exists, let the fieldValue fallback to the "DV" entry (issue 13823) 2021-07-30 16:17:42 +02:00
Calixte Denizet
7bb5331087 XFA - Avoid an error when an exdata is a string (bug 1723114) 2021-07-30 14:43:53 +02:00
Jonas Jenwald
1df9da949e Prevent "Uncaught promise" messages in the console when cancelling (some) ReadableStreams
While fixing issue 13794, I noticed that cancelling the `ReadableStream` returned by the `PDFPageProxy.streamTextContent`-method could lead to "Uncaught promise" messages in the console.[1]
Generally speaking, we don't really care about errors when *cancelling* a `ReadableStream` and it thus seems reasonable to simply suppress any output in those cases.

---
[1] Although, after that issue was fixed you'd now need to set the API-option `stopAtErrors = true` to actually trigger this.
2021-07-30 14:27:38 +02:00
Jonas Jenwald
5fac0a4350 Simplify some code related to fallbackWorkerSrc and getMainThreadWorkerMessageHandler 2021-07-30 11:34:47 +02:00
Jonas Jenwald
4c679d80ac Remove the closure used with the InternalRenderTask class
This patch utilizes the same approach as used in lots of other parts of the code-base, which thus *slightly* reduces the size of this code.
2021-07-30 11:34:47 +02:00
Jonas Jenwald
b18620ac0f Remove the closure used with the PDFDocumentLoadingTask class
This patch utilizes the same approach as used in lots of other parts of the code-base, which thus *slightly* reduces the size of this code.

By removing some of the (current) indirection, we can also simplify the JSDocs a little bit. Looking at the `gulp jsdoc` output, this actually seems to *improve* the documentation for this class.
2021-07-30 11:34:47 +02:00
Brendan Dahl
4ad5c5d52a
Merge pull request #13808 from brendandahl/pattern-cache-v2
Improve caching of shading patterns. (bug 1721949)
2021-07-28 11:17:16 -07:00
Brendan Dahl
c836e1f0fb Improve caching of shading patterns. (bug 1721949)
The PDF in bug 1721949 uses many unique pattern objects
that references the same shading many times. This caused
a new canvas pattern to be created and cached many times
driving up memory use.

To fix, I've changed the cache in the worker to key off the
shading object and instead send the shading and matrix
separately. While that worked well to fix the above bug,
there could be PDFs that use many shading that could
cause memory issues, so I've also added a LRU cache
on the main thread for canvas patterns. This should prevent
memory use from getting too high.
2021-07-28 10:29:20 -07:00
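An LRU cache like the one mentioned in c836e1f0fb can be built on top of a `Map`, which iterates in insertion order. This is a generic sketch with hypothetical names, not the implementation from the patch.

```javascript
// Generic LRU cache; the least recently used entry is evicted first.
class LRUCache {
  constructor(maxSize) {
    this._map = new Map();
    this._maxSize = maxSize;
  }

  has(key) {
    return this._map.has(key);
  }

  get(key) {
    if (!this._map.has(key)) {
      return undefined;
    }
    // Re-insert the entry so that it becomes the most recently used one.
    const value = this._map.get(key);
    this._map.delete(key);
    this._map.set(key, value);
    return value;
  }

  set(key, value) {
    if (this._map.has(key)) {
      this._map.delete(key);
    } else if (this._map.size >= this._maxSize) {
      // The first key in iteration order is the least recently used one.
      const oldestKey = this._map.keys().next().value;
      this._map.delete(oldestKey);
    }
    this._map.set(key, value);
  }
}
```

Bounding the number of cached canvas patterns this way trades an occasional re-creation of a pattern for a predictable upper limit on memory use.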
Calixte Denizet
4a4591bd2c XFA - Fix font scale factors (bug 1720888)
- All the scale factors for the substitution font were wrong because of different glyph positions between Liberation and the other ones:
    - regenerate all the factors
  - Text may contain Polish characters, for example, and in this case the glyph widths were wrong:
    - treat substitution font as a composite one
    - add a map glyphIndex to unicode for Liberation in order to generate width array for cid font
2021-07-28 19:10:42 +02:00
Calixte Denizet
92f4cc52a6 XFA - Add a transparent blue background on all text fields for consistency 2021-07-28 14:47:29 +02:00
Calixte Denizet
76d882b560 XFA - Fix auto-sized fields (bug 1722030)
- In order to better compute text fields size, use line height with no gaps (and consequently guessed height for text are slightly better in general).
  - Fix default background color in fields.
2021-07-28 09:43:15 +02:00
Tim van der Meij
336a74a0e5
Merge pull request #13796 from Snuffleupagus/issue-13794
Allow `StreamsSequenceStream.readBlock` to skip sub-streams with errors (issue 13794)
2021-07-27 22:25:58 +02:00
calixteman
45f3804737
Merge pull request #13807 from calixteman/fulltext
XFA - Get the full value when binding and not only the 1st line (bug 1718725)
2021-07-27 22:22:37 +02:00
Tim van der Meij
e51cbe63bf
Merge pull request #13801 from Snuffleupagus/AnnotationLayer-check-navigator
Access `navigator` safely in the `src/display/annotation_layer.js` file
2021-07-27 22:10:27 +02:00
Calixte Denizet
bd6f55186d XFA - Get the full value when binding and not only the 1st line (bug 1718725) 2021-07-27 20:25:33 +02:00
Jonas Jenwald
4b3ab1472c Access navigator safely in the src/display/annotation_layer.js file
For code that's part of the core library, rather than e.g. the `web/`-folder, we should always be careful about *directly* accessing any DOM methods.
The `navigator` is one such structure, which shouldn't be assumed to always be available and we should thus check that it's actually present.[1]

Hence this patch re-factors the `navigator.platform` access, in the `AnnotationLayer`-code, to ensure that it's generally safe. Furthermore, to reduce unnecessary repeated string-matching to determine the current platform, we're now using a shadowed getter which is evaluated only once instead (at first access).

---
[1] Note e.g. the `isSyncFontLoadingSupported` getter, in the `src/display/font_loader.js` file.
2021-07-27 09:40:42 +02:00
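The combination of a guarded `navigator` access and a getter that is evaluated only once, as described in 4b3ab1472c, might look roughly like this. The `shadow` helper and the `platformInfo` object are written for this sketch and should be read as assumptions about the general technique.

```javascript
// Hypothetical helper: replaces a getter with a plain data property,
// so that later accesses no longer run the getter body.
function shadow(obj, prop, value) {
  Object.defineProperty(obj, prop, {
    value,
    enumerable: true,
    configurable: true,
    writable: false,
  });
  return value;
}

let evaluations = 0; // Only used to show how often the getter body runs.

const platformInfo = {
  get platform() {
    evaluations++;
    // `navigator` may be missing outside of browsers, hence the guard.
    const platform =
      typeof navigator !== "undefined" && typeof navigator.platform === "string"
        ? navigator.platform
        : "";
    return shadow(this, "platform", {
      isWin: platform.includes("Win"),
      isMac: platform.includes("Mac"),
    });
  },
};
```

The string matching runs on the first access only; every later read of `platformInfo.platform` returns the stored object.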
Calixte Denizet
959120e6c9 XFA - Elements created outside of XML must have all their properties (bug 1722029)
- an Image element was created, attached to its parent but the $globalData property was not set and that led to an error.
  - the pdf in bug 1722029 has 27 rendered rows (checked in Acrobat) when only one was displayed: this patch fixes some binding issues around the occur element.
2021-07-26 19:38:52 +02:00
Jonas Jenwald
885e7a8aa4 Allow StreamsSequenceStream.readBlock to skip sub-streams with errors (issue 13794)
This patch makes use of the existing `ignoreErrors` option, thus allowing a page to continue parsing/rendering even if (some of) its sub-streams are corrupt. Obviously this may cause *part* of a page to be broken/missing, however it should be better than (potentially) rendering nothing.
Also, to the best of my knowledge, this is the first bug of its kind that we've encountered.

To avoid having to pass in a bunch of, for a `BaseStream`-instance, mostly unrelated parameters when initializing a `StreamsSequenceStream`-instance, I settled on utilizing a callback function instead to allow conditional Error-suppression.
Note that the `StreamsSequenceStream`-class is a *special* stream-implementation that we only use when the `/Contents`-entry, in the `/Page`-dictionary, consists of an Array with streams.
2021-07-26 16:42:50 +02:00
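The callback-based error suppression from 885e7a8aa4 can be sketched with a simplified stream that concatenates sub-streams. All class, function, and parameter names below are hypothetical; only the use of a callback and an `ignoreErrors`-style option comes from the commit message.

```javascript
// Hypothetical, simplified stand-in for a stream that concatenates sub-streams.
class SketchSequenceStream {
  constructor(subStreams, onError = null) {
    this._subStreams = subStreams; // Each entry exposes a getBytes() method.
    this._onError = onError;
  }

  getBytes() {
    const bytes = [];
    for (const stream of this._subStreams) {
      try {
        bytes.push(...stream.getBytes());
      } catch (reason) {
        if (this._onError) {
          this._onError(reason); // The callback may rethrow the error.
          continue; // Otherwise the corrupt sub-stream is skipped.
        }
        throw reason;
      }
    }
    return bytes;
  }
}

// The caller owns the policy, so the stream needs no extra options.
function makeErrorHandler(ignoreErrors, log) {
  return reason => {
    if (!ignoreErrors) {
      throw reason;
    }
    log.push(reason.message);
  };
}

const good = data => ({ getBytes: () => data });
const corrupt = () => ({
  getBytes() {
    throw new Error("corrupt sub-stream");
  },
});
```

Passing a callback keeps evaluator-specific options out of the stream class, at the cost of one extra function per stream instance.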
Jonas Jenwald
e1fa845293 Only define *existing* methods, when converting the OPS format to method-names on the CanvasGraphics.prototype
There's no good reason, as far as I can tell, to explicitly define a bunch of methods to be `undefined`, which the current unconditional "copying" of methods will do.
Note that of the `OPS` ~23 percent don't, for various reasons, have an associated method on the `CanvasGraphics.prototype`.
2021-07-25 13:28:28 +02:00
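The change in e1fa845293 amounts to guarding the copy loop. The operator table and class below are hypothetical miniatures, used only to show the difference between conditional and unconditional copying.

```javascript
// Hypothetical, tiny subset of an operator table (name -> numeric id).
const OPS = { dependency: 1, save: 10, restore: 11 };

class SketchGraphics {
  save() {
    return "save";
  }
  restore() {
    return "restore";
  }
  // Note: there is deliberately no `dependency` method.
}

for (const op in OPS) {
  // Only alias methods that actually exist, so that no numeric property
  // ends up being explicitly defined as `undefined`.
  if (SketchGraphics.prototype[op] !== undefined) {
    SketchGraphics.prototype[OPS[op]] = SketchGraphics.prototype[op];
  }
}

const graphics = new SketchGraphics();
const savedViaId = graphics[OPS.save]();
```

Without the guard, `1 in SketchGraphics.prototype` would be true even though the value behind it is `undefined`.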
Jonas Jenwald
fbaafdc4e8 Remove the remaining closure in the src/display/canvas.js file
For e.g. the `gulp mozcentral` command, the *built* `pdf.js` file decreases from `304 607` to `301 295` bytes with this patch. The improvement comes mostly from having less overall indentation in the code.
2021-07-25 13:14:58 +02:00
Jonas Jenwald
833f27c677 Disable a LGTM warning, again (PR 13787 follow-up)
Apparently I didn't put one of the disable comments on the *correct* line, since I didn't read the instructions carefully enough, so let's try again.

Note that, most unfortunately, disabling of warnings isn't applied until *after* a patch has been merged.
2021-07-25 10:32:40 +02:00
Tim van der Meij
41a2b5c809
Merge pull request #13787 from Snuffleupagus/lgtm-fix-warnings
Fix (most) LGTM warnings
2021-07-24 15:20:07 +02:00
Tim van der Meij
7b6767d415
Merge pull request #13784 from Snuffleupagus/issue-13783
When parsing corrupt documents, avoid inserting obviously broken data in the XRef-table (issue 13783)
2021-07-24 14:37:39 +02:00
Tim van der Meij
687cfcecd4
Merge pull request #13786 from Snuffleupagus/rm-more-src-core-closures
Remove a couple of small closures in `src/core/` code
2021-07-24 14:26:57 +02:00
Jonas Jenwald
70bac87fed Fix (most) LGTM warnings
Most of the warnings we don't really care about, and those are simply white-listed using inline comments; however two cases prompted actual code changes:

 - In `src/display/pattern_helper.js` the branch in question is indeed unreachable, and should thus be safe to remove. (This code originated in PR 4192, which is now over seven years ago.)

 - In `test/test.js`, the function in question indeed doesn't accept any arguments. (The patch also re-formats a string just above, which didn't seem worthy of a separated patch.)

This now leaves only *one* warning in the LGTM report, however that one is a false positive that we'll need to report upstream.
2021-07-24 14:23:59 +02:00
Tim van der Meij
9854b85dc1
Merge pull request #13775 from Snuffleupagus/Dict-merge-refactor
Remove some duplication in the `Dict.merge` method
2021-07-24 14:21:41 +02:00
Jonas Jenwald
ebbbc973a5 Remove the closure used with the PostScriptToken class
This patch uses the same approach as used in lots of other parts of the code-base, which thus *slightly* reduces the size of this code.
2021-07-24 13:05:46 +02:00
Jonas Jenwald
81009d42cf Remove the closure used with the PostScriptStack class
This patch uses the same approach as used in lots of other parts of the code-base, which thus *slightly* reduces the size of this code.
2021-07-24 12:59:53 +02:00
Jonas Jenwald
b82c802dff When parsing corrupt documents, avoid inserting obviously broken data in the XRef-table (issue 13783)
In cases where even the very *first* attempt at reading from an object will throw, simply ignoring such objects will help improve rendering of *some* corrupt documents.
Note that this will lead to more parsing in some cases, but considering that this only applies to *corrupt* documents that shouldn't be a big deal.
2021-07-23 18:10:53 +02:00
Jonas Jenwald
51f0a81085
Merge pull request #13770 from brendandahl/cache-pattern
Improve performance of reused patterns.
2021-07-23 10:43:23 +02:00
Brendan Dahl
da1af02ac8 Improve performance of reused patterns.
Bug 1721218 has a shading pattern that was used thousands of times.
To improve performance of this PDF:
 - add a cache for patterns in the evaluator and only send the IR form once
   to the main thread (this also makes caching in canvas easier)
 - cache the created canvas radial/axial patterns
 - for shading fill radial/axial use the pattern directly instead of creating temporary
   canvas
2021-07-22 16:47:40 -07:00
Calixte Denizet
a51c4a3a0f XFA - A field without an ui must provide a default one (bug 1718245) 2021-07-22 20:31:25 +02:00
Jonas Jenwald
e1ee3835cd Remove some duplication in the Dict.merge method
Currently the `!mergeSubDicts` code-path is essentially just duplicated code, which we can easily avoid by simply moving that check. (This may lead to ever so slightly more parsing for this case, but the difference ought to be negligible in practice.)
2021-07-22 14:01:43 +02:00
Jonas Jenwald
2cf90cd9ad
Merge pull request #13766 from Snuffleupagus/issue-13751
XFA - Handle `startIndex` correctly in the `Template.$toHTML` method (issue 13751)
2021-07-21 18:58:29 +02:00
Calixte Denizet
5555114bb3 XFA - Remove namespace from nodes under xfa:data node
- in real life some xfa contains xml like <xfa:data><xfa:Foo><xfa:Bar>...</xfa:data>
    since there are no Foo or Bar in the xfa namespace the JS representation are empty
    and that leads to errors.
  - so the idea is to make all nodes under xfa:data namespace agnostic which means
    that ns are removed from nodes in the parser but only xfa:data descendants.
2021-07-21 17:11:31 +02:00
Jonas Jenwald
7d1c19f8bd XFA - Handle startIndex correctly in the Template.$toHTML method (issue 13751)
*Please note:* The PDF document in issue 13751 is *dynamically* created (in e.g. Adobe Reader), with pages added when certain buttons are clicked, hence this patch simply fixes the breaking error and nothing more.

It looks like the current code contains a little bit too much copy-and-paste from the *similar* `index` branch above, since we cannot set the `startIndex` to a negative value. Note how it's being used to initialize the loop-variable, which is then used to lookup values in an Array and accessing the `-1`th element of an Array obviously makes no sense.
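
To illustrate why a negative `startIndex` is a problem, here is a small sketch; the `collectFrom` helper is hypothetical and is not the code from `Template.$toHTML`. The start value is kept non-negative before it is used as the loop variable for Array lookups:

```javascript
// Hypothetical helper: iterate `items` starting at `startIndex`.
function collectFrom(items, startIndex) {
  const result = [];
  // A negative start would make the first lookup `items[-1]`, which is
  // `undefined` for Arrays, so never start below zero.
  for (let i = Math.max(0, startIndex); i < items.length; i++) {
    result.push(items[i]);
  }
  return result;
}
```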
2021-07-21 16:17:13 +02:00
Jonas Jenwald
6c9b6bc599
Merge pull request #13764 from Snuffleupagus/issue-13748
XFA - Add a missing method to `XFAAttribute`, to prevent breaking errors (issue 13748)
2021-07-20 18:55:23 +02:00
Jonas Jenwald
c2fe493abe XFA - Add a missing method to XFAAttribute, to prevent breaking errors (issue 13748)
*This is yet another case where I've got no idea if the patch is correct, but it does at least fix a breaking error :-)*

Note how in the [`Binder._bindValue` method](683ce66a48/src/core/xfa/bind.js (L92-L93)), we're assuming that if a `data`-value exists then it'll also be possible to actually access it. For the `XFAAttribute`-implementation however, the second method is missing and that's what causes the breaking errors in issue 13748.

Please note that another possible way of "fixing" the error would've been to simply change the exists-check to return `false`, and I could see that being a preferred solution.
However, the reason for submitting the current patch is that we get *fewer* warnings about Nodes with mis-matched types this way.
2021-07-20 17:41:05 +02:00
Calixte Denizet
1d07ef597e XFA - Must use bindItems element even if there is no direct binding (bug 1720907) 2021-07-20 17:07:32 +02:00
Jonas Jenwald
cf7978d507 XFA - Prevent breaking errors in Binder, when searchNode doesn't return data (issue 13756)
As can be seen in the code (see below), the `searchNode` helper function will return `null` in some cases and all of its call-sites should protect against that before attempting to access the returned data.
While only one of these changes was necessary to fix the breaking errors in issue 13756, in order to prevent future bugs I've added similar defensive code throughout this file.

 - 07955fa1d3/src/core/xfa/som.js (L169)
 - 07955fa1d3/src/core/xfa/som.js (L239)
 - 07955fa1d3/src/core/xfa/som.js (L254)
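
The defensive pattern at the call-sites can be sketched roughly as follows. Both functions are simplified, hypothetical stand-ins (the real `searchNode` takes different arguments); the only point is that a `null` result is checked before the returned data is accessed:

```javascript
// Hypothetical stand-in: returns an Array of matches, or `null` when
// nothing is found.
function searchNode(root, name) {
  const node = root[name];
  return node === undefined ? null : [node];
}

// Hypothetical call-site: protect against `null` before indexing.
function findFirstMatch(root, name) {
  const nodes = searchNode(root, name);
  if (!nodes) {
    return null; // Nothing found; avoid throwing on `nodes[0]`.
  }
  return nodes[0];
}
```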
2021-07-19 18:07:07 +02:00
Tim van der Meij
07955fa1d3
Merge pull request #13735 from Snuffleupagus/bug-1720411
Ensure that the field value, for checkboxes, refers to an existing appearance state (bug 1720411)
2021-07-18 13:48:34 +02:00
Tim van der Meij
668c58d68d
Merge pull request #13746 from Snuffleupagus/getOperatorList-intent
[api-minor] Add `intent` support to the `PDFPageProxy.getOperatorList` method (issue 13704)
2021-07-18 13:28:08 +02:00
Jonas Jenwald
481af097b4 Convert PDFFunction to a standard class with static methods
For e.g. `gulp mozcentral`, the *built* `pdf.worker.js` file decreases from `1 837 608` to `1 834 907` bytes with this patch-series.
The improvement comes first of all from less overall indentation in `PDFFunction`, and secondly from the removal of (now) unnecessary indirection in the code.
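
For readers unfamiliar with the pattern, the following sketch contrasts the two styles. `ScalerBefore` and `ScalerAfter` are made-up examples, not code from `PDFFunction`; they only show how a closure-wrapped object becomes a class with static methods and one less level of indentation:

```javascript
// Before: an immediately-invoked closure that hides a helper and returns
// the public object.
const ScalerBefore = (function ScalerClosure() {
  function clamp(value, min, max) {
    return Math.min(Math.max(value, min), max);
  }

  return {
    scale(value, factor) {
      return clamp(value * factor, 0, 1);
    },
  };
})();

// After: a standard class with static methods and no wrapping function.
class ScalerAfter {
  static _clamp(value, min, max) {
    return Math.min(Math.max(value, min), max);
  }

  static scale(value, factor) {
    return ScalerAfter._clamp(value * factor, 0, 1);
  }
}
```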
2021-07-17 16:46:57 +02:00
Jonas Jenwald
d35fe3e796 Remove the IR (internal representation) part of the PDFFunction parsing
*This follows the exact same principle as PR 12083, but for the `PDFFunction` parsing instead.*

Given that the IR format is completely unused now, all that the current code does is add a bunch of unnecessary indirection/overhead to the handling of PDF-functions.
2021-07-17 16:44:58 +02:00
Jonas Jenwald
03cf28bf17 [api-minor] Add intent support to the PDFPageProxy.getOperatorList method (issue 13704)
With this patch, the `PDFPageProxy.getOperatorList` method will now return `PDFOperatorList`-instances that also include Annotation-operatorLists (when those exist). Hence this closes a small, but potentially confusing, gap between the `render` and `getOperatorList` methods.

Previously we've been somewhat reluctant to do this, as explained below, but given that there's actual use-cases where it's required probably means that we'll *have* to implement it now.
Since we still need the ability to separate "normal" rendering operations from direct `getOperatorList` calls in the worker-thread, this API-change unfortunately causes the *internal* renderingIntent to become a bit "messy" (note the `"oplist-"` strings in various spots). As-is I suppose that it's not all that bad, but we may want to consider changing the *internal* renderingIntent to e.g. a bitfield in the future.

Besides fixing issue 13704, this patch would also be necessary if someone ever tries to implement e.g. issue 10165 (since currently `PDFPageProxy.getOperatorList` doesn't include Annotation-operatorLists).

*Please note:* This patch is *also* tagged "api-minor" for a second reason, which is that we're now including the Annotation-id in the `beginAnnotation` argument. The reason for this is to allow correlating the Annotation-data returned by `PDFPageProxy.getAnnotations`, with its corresponding operatorList-data (for those Annotations that have it).
2021-07-16 17:16:30 +02:00
Jonas Jenwald
da808aeab3 Ensure that the field value, for checkboxes, refers to an existing appearance state (bug 1720411)
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1720411
2021-07-16 13:11:48 +02:00
Calixte Denizet
3fb30ddde5 XFA - Checkboxes must be printed (bug 1720182)
- to avoid future regressions, annotationStorage is passed to the xfa render in reftests.
2021-07-16 11:32:03 +02:00
calixteman
4b2e0d0d01
Merge pull request #13732 from calixteman/rect
XFA - A rectangle must have the width of its parent but without inner margins
2021-07-15 22:30:25 +02:00
Jonas Jenwald
3838c4e27c Re-factor the handling of *empty* Name-instances (PR 13612 follow-up)
When working on PR 13612, I mostly prioritized a simple solution that didn't require touching a lot of code. However, while working on PR 13735 I started to realize that the static `Name.empty` construction really wasn't a good idea.

In particular, having a special `Name`-instance where the `name`-property isn't actually a String is confusing (to put it mildly) and can easily lead to issues elsewhere. The only reason for not simply allowing the `name`-property to be an *empty* string, in PR 13612, was to avoid having to touch a lot of existing code. However, it turns out that this is only limited to a few methods in the `PartialEvaluator` and a few of the `BaseLocalCache`-implementations, all of which can be easily re-factored to handle *empty* `Name`-instances.

All-in-all, I think that this patch is even an *overall* improvement since we're now validating (what should always be) `Name`-data better in the `PartialEvaluator`.
This is what I ought to have done from the start, sorry about the code churn here!
2021-07-15 12:00:42 +02:00
calixteman
64f86de5cb
Merge pull request #13734 from calixteman/print_issue
XFA - Cannot print fields with no names
2021-07-14 19:45:53 +02:00
Calixte Denizet
019699acfb XFA - Cannot print fields with no names
- it was not possible to print pdf file in issue #13500.
2021-07-14 17:38:35 +02:00
Calixte Denizet
5081167e7f XFA - A rectangle must have the width of its parent but without inner margins
- it aims to fix #13584;
  - to avoid bad rendering because of clipping just set overflow to visible on SVG element.
2021-07-14 16:46:13 +02:00
Jonas Jenwald
b6c6a0cb7c Avoid all rendering breaking completely when CanvasPattern.setTransform() is unsupported
*Please note:* This patch doesn't fix rendering of (various) patterns in browsers/environments without full `CanvasPattern.setTransform()` support, but it at least prevents outright failures and thus allows the rest of the page to render.

This patch provides a temporary work-around for Firefox 78 ESR[1], and for Node.js environments (see issue 13724), where rendering is currently completely broken.

---
[1] Please note that the `createMatrix` helper function doesn't actually work as intended. The reason is that it's not `DOMMatrix` itself which is unsupported in older Firefox versions, but rather calling `CanvasPattern.setTransform(...)` with a `DOMMatrix`-argument.
Furthermore, the `createSVGMatrix` fallback won't actually help either since that method doesn't accept any parameters and would thus require *manually* specifying the matrix-state; see e.g. https://developer.mozilla.org/en-US/docs/Web/API/CanvasPattern/setTransform#examples
Finally, given that it's less than a month to the [Firefox 91 ESR release](https://wiki.mozilla.org/RapidRelease/Calendar) and that as-is all patterns are completely broken e.g. when using the latest viewer in Firefox 78 ESR, I'm just not convinced that it's worth the "hassle" of providing a more proper work-around.
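
One way to prevent an unsupported `setTransform` call from aborting the whole rendering is to catch the failure and carry on, which could look roughly like the sketch below. The `trySetPatternTransform` helper is hypothetical and may differ from what the patch actually does; `pattern` is assumed to be a `CanvasPattern`-like object.

```javascript
// Hypothetical helper: apply `matrix` to `pattern` when possible, and
// report (rather than throw) when the environment doesn't support it.
function trySetPatternTransform(pattern, matrix) {
  try {
    pattern.setTransform(matrix);
    return true;
  } catch (ex) {
    // The pattern may render incorrectly, but the rest of the page can
    // still be drawn.
    console.warn(`Unable to transform the pattern: "${ex && ex.message}".`);
    return false;
  }
}
```

A caller can use the boolean result to decide whether to draw the untransformed pattern or skip it.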
2021-07-13 19:36:06 +02:00
Calixte Denizet
dd55e76f5d XFA - Avoid to have containers not pushed in the html
- it aims to fix issue #13668.
2021-07-12 21:34:58 +02:00
calixteman
140c2bc563
Revert "XFA - Avoid to have containers not pushed in the html" 2021-07-12 09:46:38 +02:00
calixteman
b6445ddc08
Merge pull request #13716 from calixteman/layout7
XFA - Avoid to have containers not pushed in the html
2021-07-12 09:31:27 +02:00
Calixte Denizet
9bbc194846 XFA - Support assist element 2021-07-11 21:01:18 +02:00
Calixte Denizet
fccc6c2242 XFA - Avoid to have containers not pushed in the html
- it aims to fix issue #13668.
2021-07-11 19:14:44 +02:00
Calixte Denizet
690b5d1941 XFA - Use fake MyriadPro as a fallback for missing fonts
- aims to fix #13597.
2021-07-11 13:52:13 +02:00
calixteman
d416b23898
Merge pull request #13705 from calixteman/lineheight3
XFA - Fix text positions (bug 1718741)
2021-07-10 14:19:03 +02:00
Jonas Jenwald
700b79a305 XFA - Always compute the transformed BBox values in checkDimensions (PR 13691 follow-up)
This way we ensure that these BBox values are *always* defined as expected for every `case`-block, and we also don't need to duplicate the lookup in multiple places. (Also, the patch removes a couple of unnecessary line-breaks in existing comments.)

Fixes https://github.com/mozilla/pdf.js/pull/13691#pullrequestreview-702356627, which was flagged by LGTM.
2021-07-10 11:24:05 +02:00
calixteman
a4f60fc417
Merge pull request #13708 from calixteman/xfa_tab
XFA - Add support for traversal and traverse element
2021-07-09 21:59:50 +02:00
Calixte Denizet
ccac125623 XFA - Add support for traversal and traverse element
- For now, just implement the "next" target in using tabindex attribute of html elements.
2021-07-09 20:50:25 +02:00
calixteman
caaf77375f
Merge pull request #13703 from eltociear/patch-4
XFA - Fix typo in factory.js
2021-07-09 20:12:55 +02:00
Calixte Denizet
58e1f51688 XFA - Fix text positions (bug 1718741)
- font line height is taken into account by Acrobat but not by masterpdfeditor: I extracted a font from a pdf, modified some ascent/descent properties thanks to ttx and then reinjected the font in the pdf: only Acrobat takes it into account. So in this patch, line heights for some substituted fonts are added.
  - it seems that Acrobat is using a line height of 1.2 when the line height in the font is not enough (it's the only way I found to fix correctly bug 1718741).
   - don't use flex in wrapper container (which was causing a horizontal overflow in the above bug).
   - consequently, the above fixes introduced a lot of small regressions, so in order to see real improvements on reftests, I fixed the regressions in this patch:
     - replace margin by padding in some case where padding is a part of a container dimensions;
     - remove some flex display: some containers are wrongly sized when rendered;
     - set letter-spacing to 0.01px: it helps to be sure that text is not broken because of not enough width in Firefox.
2021-07-09 18:11:12 +02:00
Ikko Ashimine
dba30eac7b
XFA - Fix typo in factory.js
occured -> occurred
2021-07-09 21:51:37 +09:00
Calixte Denizet
ad195e0f05 XFA - Scale correctly images 2021-07-08 20:28:49 +02:00
calixteman
c33bf0b5e8
Merge pull request #13692 from calixteman/bind_global
XFA - Correctly bind global data (bug 1718725)
2021-07-08 12:42:22 +02:00
Jonas Jenwald
df6107714a XFA - remove unnecessary check in the handleBreak function (PR 13687 follow-up)
With the changes in PR 13687 we're now checking if `target` is defined *twice* in a row, which shouldn't be necessary :-)

(I noticed this when glancing at the unofficial LGTM results; maybe we should re-evaluate the decision to not integrate that into the CI.)
2021-07-07 23:22:29 +02:00
calixteman
fbd2f28618
Merge pull request #13693 from calixteman/textfields
XFA - Enable disabled fields (bug 1719464)
2021-07-07 21:43:21 +02:00
Calixte Denizet
d9a776caf8 XFA - Enable disabled fields (bug 1719464)
- it's a workaround, while waiting for the JS implementation, to let the user fill some fields manually.
2021-07-07 19:11:36 +02:00
Calixte Denizet
8a06df9253 XFA - Handle correctly nested containers with lr-tb layout (bug 1718670)
- and avoid to push a field with no dimensions when we have some available space in width in a parent.
2021-07-07 18:54:32 +02:00
Calixte Denizet
778800a53a XFA - Correctly bind global data (bug 1718725) 2021-07-07 17:36:56 +02:00
Jonas Jenwald
37d2808977 Merge the supplemental font data files used with XFA documents
When XFA support was added, the size of the *built* `pdf.worker.js` file increased quite a bit. Hence I think that it makes sense to, where easily possible, do what we can to (slightly) reduce the size of the PDF.js library.

The supplemental font data files (added for XFA rendering), containing rescale-factors respectively widths, seem like an excellent candidate here since they're not particularly large in either line-count or file sizes.
In this patch these files are instead merged into a *single* file per font, rather than four different ones, and even with these changes the resulting source files don't become all that large.[1]

For e.g. the `gulp mozcentral` build, this reduces the size of the *built* `pdf.worker.js` file by more than `3 kB`. Given the overall simplicity of the patch, that kind of size decrease definitely seems worthwhile to me.

---
[1] Especially when compared to truly large files such as e.g. `glyphlist.js`, `metrics.js`, and `unicode.js`.
2021-07-07 11:56:34 +02:00
calixteman
1eb9a3e9eb
Merge pull request #13687 from calixteman/failing_som
XFA - Don't fail xfa loading because of a JS subexpression in SOM expressions
2021-07-07 11:50:26 +02:00
Calixte Denizet
0486d24e36 XFA - Don't fail xfa loading because of a JS subexpression in SOM expressions
- Fix for one pdf in bug 1717668 (PDFIUM-292-0.pdf).
2021-07-07 10:47:53 +02:00
Jonas Jenwald
05ebb6329b
Merge pull request #13683 from brendandahl/mask-fixes
Fix transformations when painting image masks and tiling patterns.
2021-07-07 10:24:01 +02:00
Brendan Dahl
a52c0c6988 Fix transformations when painting image masks and tiling patterns.
Previously, when we filled image masks we didn't copy over the current transformation,
this caused patterns to be misaligned when painted. Now we create a temporary
canvas with the mask and have the transform copied over and offset it relative to
where the mask would be painted. We also weren't properly offsetting tiling patterns.
This isn't usually noticeable since patterns repeat, but in the case of #13561 the pattern
is only drawn once and has to be in the correct position to line up with the mask image.

These fixes broke #11473, but highlighted that we were drawing that correctly by
accident and not correctly handling negative bounding boxes on tiling patterns.

Fixes #6297, #13561, #13441

Partially fixes #1344 (still blurry but boxes are in correct position now)
2021-07-06 17:29:32 -07:00
Calixte Denizet
c47f0f0f40 XFA - Default background in rectangle is white
- Fix a typo in order to open the pdf in issue #13679
  - After fixing the fill default color there were some regressions because of z-index
    and when fixing z-index there were some regressions because of borders
  - So fix the borders rendering.
2021-07-06 21:17:20 +02:00
Calixte Denizet
5f76b6370c XFA - Layout correctly a subform with row layout (bug 1718740)
- Fix issues with subformSet elements which are not a real container.
2021-07-06 14:11:25 +02:00
calixteman
ba2d685166
Merge pull request #13673 from calixteman/images2
XFA - An image can be a stream in the pdf (bug 1718521)
2021-07-06 09:53:29 +02:00
calixteman
b9e84ba70e
Merge pull request #13665 from calixteman/reserve
XFA - Fix indentation for justified paragraph
2021-07-05 15:45:59 +02:00
Calixte Denizet
5cdee80c8e XFA - An image can be a stream in the pdf (bug 1718521)
- hrefs can be found in catalog > Names > XFAImages
2021-07-05 14:06:23 +02:00
calixteman
02c069481e
Merge pull request #13653 from calixteman/lineheight
XFA - Improve text layout
2021-07-05 13:32:42 +02:00
calixteman
783cbc1793
Revert "XFA - An image can be a stream in the pdf (bug 1718521)" 2021-07-05 12:47:14 +02:00
calixteman
b370d4714f
Merge pull request #13654 from calixteman/images
XFA - An image can be a stream in the pdf (bug 1718521)
2021-07-05 12:04:34 +02:00
Jonas Jenwald
819be0e78b Fix the remaining ESLint operator-assignment errors 2021-07-04 15:23:56 +02:00
Jonas Jenwald
901b24e8af Enable the ESLint operator-assignment rule
This patch was generated automatically, using the `gulp lint --fix` command.

Please find additional details about the ESLint rule at https://eslint.org/docs/rules/operator-assignment
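
As a small illustration of what the rule changes (the function names are made up for this example), the two functions below behave identically; the rule flags the first form and `--fix` rewrites it into the second:

```javascript
// Flagged by `operator-assignment`: the variable is repeated on both sides.
function sumVerbose(values) {
  let total = 0;
  for (const value of values) {
    total = total + value;
  }
  return total;
}

// Preferred form: the shorthand assignment operator.
function sumCompact(values) {
  let total = 0;
  for (const value of values) {
    total += value;
  }
  return total;
}
```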
2021-07-04 12:57:45 +02:00
Jonas Jenwald
de80590157
Merge pull request #13662 from calixteman/no_arial
XFA - Don't use system font when a font is not embeded but there is a substitution
2021-07-04 11:35:50 +02:00
Calixte Denizet
9b5574d3ef XFA - Fix indentation for justified paragraph
- and ceil the reserve for a caption to avoid to split it;
  - both issues are present in the pdf in issue #13633.
2021-07-03 18:07:01 +02:00
Calixte Denizet
5744dd773d XFA - Don't use system font when a font is not embeded but there is a substitution
- always use a font coming from pdf.js when there is one: this way we don't use a system font which could look wrong.
2021-07-03 15:13:56 +02:00
Jonas Jenwald
661c60ecc9 [api-minor] Support accessing both the original and modified PDF fingerprint
The PDF.js API has only ever supported accessing the original file ID, however the second one that (should) exist in *modified* documents has thus far been completely inaccessible through the API.
That seems like a simple oversight, caused e.g. by the viewer not needing it, since it really shouldn't hurt to provide API-users with the ability to check if a PDF document has been modified since its creation.[1]

Please refer to https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G13.2261661 for additional information.

For an example of how to update existing code to use the new API, please see the changes in the `web/app.js` file included in this patch.

*Please note:* While I'm not sure if we'll ever be able to remove the old `PDFDocumentProxy.fingerprint` getter, given that it's existed since "forever", that probably isn't a big deal given that it's now limited to only `GENERIC`-builds.

---
[1] Although this obviously depends on the PDF software following the specification, by updating the second file ID as intended.
2021-07-03 13:56:33 +02:00
Tim van der Meij
f9d506cf50
Merge pull request #13658 from Snuffleupagus/cloneValue
Don't attempt to structure clone unsupported types with workers disabled
2021-07-03 13:01:46 +02:00
Jonas Jenwald
bdf6f733bf Don't attempt to structure clone unsupported types with workers disabled
Please refer to https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm

Based on that information, and manually testing our code, the implementation in `cloneValue` has the following shortcomings:
 - Attempting to clone `function`s is only prevented when they're part of an Object, but is currently allowed when they occur standalone.
 - Cloning of `Symbol`s is currently not prevented, which it should be since the native structured clone algorithm throws.
 - Any disallowed types should be checked first, to reduce the risk of future changes accidentally allowing something that shouldn't be supported.
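
The three points above can be sketched as follows. This is a deliberately simplified illustration, assuming only plain Objects, Arrays and primitives (no cyclic references, typed arrays, or transferables), and is not the actual `cloneValue` implementation:

```javascript
// Simplified sketch: disallowed types are rejected *first*, so the check
// applies both to standalone values and to values nested in Objects/Arrays.
function cloneValue(value) {
  if (typeof value === "function" || typeof value === "symbol") {
    throw new Error(`cloneValue: cannot clone a ${typeof value}.`);
  }
  if (typeof value !== "object" || value === null) {
    return value; // Primitives are returned as-is.
  }
  if (Array.isArray(value)) {
    return value.map(cloneValue);
  }
  const result = {};
  for (const key of Object.keys(value)) {
    result[key] = cloneValue(value[key]);
  }
  return result;
}
```

Putting the rejection at the very top means that supporting a new container type later cannot accidentally let a `function` or `Symbol` through.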
2021-07-03 11:56:33 +02:00
Jonas Jenwald
909ff8e29f Replace instanceof Object with typeof checks
Using `instanceof Object` is generally problematic, since it's not guaranteed to always do the right thing for all Objects.
(I stumbled upon this while working on another patch, when I noticed that the `outlineView` was broken with workers disabled.)
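
One well-known case where `instanceof Object` gives a surprising answer is an Object created without a prototype; whether that is the exact case that broke the `outlineView` is not stated here, so treat this only as an illustration. The `isObject` helper name is made up for the example:

```javascript
// An Object without a prototype, e.g. used as a plain dictionary.
const bare = Object.create(null);
bare.title = "Outline item";

// `instanceof` walks the prototype chain, which is empty here.
const viaInstanceof = bare instanceof Object; // false

// Hypothetical helper using a `typeof` check instead.
function isObject(value) {
  return typeof value === "object" && value !== null;
}

const viaTypeof = isObject(bare); // true
```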
2021-07-03 11:30:46 +02:00
Calixte Denizet
f16828be49 XFA - An image can be a stream in the pdf (bug 1718521)
- hrefs can be found in catalog > Names > XFAImages
2021-07-02 20:34:10 +02:00
Calixte Denizet
f7d3b22480 XFA - Improve text layout
- support paragraph margins, line height, letter spacing, ...
  - compute missing dimensions from fields based almost on the dimensions of caption contents.
2021-07-02 17:53:32 +02:00
calixteman
d80651e572
Merge pull request #13598 from calixteman/dhl
XFA - Remove empty pages
2021-06-30 20:43:07 +02:00
calixteman
a8a5c5f10b
Merge pull request #13648 from calixteman/xfa_bg
XFA - Don't fill when the fill element is not visible (bug 1718735)
2021-06-30 18:12:13 +02:00
Calixte Denizet
08e08d5852 XFA - Don't fill when the fill element is not visible (bug 1718735) 2021-06-30 17:14:08 +02:00
Calixte Denizet
ff440d13e7 XFA - Remove empty pages
- it aims to fix #13583;
  - fix the switch to breakBefore target;
  - force the layout of an unsplittable element on an empty page;
  - don't fail when there is horizontal overflow (except in lr-tb);
  - handle correctly overflow in the same content area (bug 1717805, bug 1717668);
  - fix a typo in radial gradient first argument.
2021-06-30 16:32:27 +02:00
Tim van der Meij
6307349e31
Merge pull request #13640 from Snuffleupagus/issue-6759
Add non-PRODUCTION/TESTING overflow `assert`s to various string helper-functions (issue 6759)
2021-06-29 21:22:34 +02:00
calixteman
f35e4cc9ab
Merge pull request #13645 from calixteman/bug1718241
XFA - Choice list has no selected value by default (bug 1718241)
2021-06-28 23:46:59 +02:00
calixteman
04dc902933
Merge pull request #13644 from calixteman/xfa_missing_fonts
XFA - Support non-embedded fonts without a Widths entry
2021-06-28 23:46:09 +02:00
Calixte Denizet
70bb672dcd XFA - Support non-embedded fonts without a Widths entry
- some pdfs use fonts which are not embedded, or which don't have any width array or any css info (e.g. for standard fonts or Arial).
  - so add widths arrays for Liberation fonts in order to compute the ones for other fonts using an array of scale factors.
2021-06-28 23:05:08 +02:00
Calixte Denizet
1de133a7c9 XFA - Choice list has no selected value by default 2021-06-28 22:10:26 +02:00
Calixte Denizet
71d17b0cc4 XFA - Implement aspect property on image element
- it aims to fix issue #13634;
  - move some img-related functions in test/drivers.js in order to have images in xfa reftests.
2021-06-28 20:43:39 +02:00
Calixte Denizet
b261446981 XFA - Fix width of a container with lr-tb layout (bug 1718037) 2021-06-28 17:47:04 +02:00
calixteman
03dff1c5f5
Merge pull request #13639 from calixteman/old_break
XFA - Replace deprecated break element (bug 1718053)
2021-06-28 17:44:03 +02:00
calixteman
191db4145e
Merge pull request #13642 from calixteman/quotes
XFA - Remove quotes of font name in xhtml
2021-06-28 13:25:47 +02:00
Calixte Denizet
677332aa7b XFA - Remove quotes of font name in xhtml 2021-06-27 18:05:12 +02:00
Jonas Jenwald
273d8cb746 Add non-PRODUCTION/TESTING overflow asserts to various string helper-functions (issue 6759) 2021-06-27 16:06:30 +02:00
Calixte Denizet
257de0e8c5 XFA - Replace deprecated break element (bug 1718053)
- the break element has been deprecated in XFA 2.4 but some old documents can use it, so replace it with one (or more) of its possible substitutions:
    - breakBefore;
    - breakAfter;
    - overflow.
2021-06-27 15:03:00 +02:00
Jonas Jenwald
d02146b13b Add a OnProgressParameters typedef to reduce (some) duplication in src/display/api.js 2021-06-27 11:55:53 +02:00
Jonas Jenwald
ea4b162328 Use the RefProxy typedef in more JSDoc comments in src/display/api.js 2021-06-27 11:34:59 +02:00
Tim van der Meij
d7f8a0e9b9
Merge pull request #13628 from Snuffleupagus/issue-13626
Check that TrueType (3, 0) cmap tables, for symbolic fonts, are sorted correctly (issue 13626)
2021-06-26 14:17:11 +02:00
Calixte Denizet
429ffdcd2f XFA - Save filled data in the pdf when downloading the file (Bug 1716288)
- when binding (after parsing) we get a map between some template nodes and some data nodes;
  - so set user data in input handlers in using data node uids in the annotation storage;
  - to save the form, just put the value we have in the storage in the correct data nodes, serialize the xml as a string and then write the string at the end of the pdf using src/core/writer.js;
  - fix few bugs around data bindings:
    - the "Off" issue in Bug 1716980.
2021-06-25 18:57:01 +02:00
Jonas Jenwald
50edd5da63 Suppress OTS warnings about the caretOffset in the hhea-table
- https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6hhea.html
 - https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6head.html
2021-06-25 17:02:02 +02:00
Jonas Jenwald
185be678ec Check that TrueType (3, 0) cmap tables, for symbolic fonts, are sorted correctly (issue 13626)
According to a comment in `readCmapTable`, we're assuming that the cmap tables (when more than one exist) are sorted in ascending order. If that's not the case, keep checking the following cmap tables in order to fix the referenced issue.
2021-06-25 16:56:00 +02:00
Brendan Dahl
d7fdb72a3f
Merge pull request #13619 from calixteman/bg
XFA - Add back empty subforms (which can have a background)
2021-06-23 16:21:28 -07:00
Brendan Dahl
f4f00a9bc6
Merge pull request #13618 from calixteman/bind_root
XFA - Always bind root subform on root data
2021-06-23 13:14:12 -07:00
Tim van der Meij
f74562b19c
Merge pull request #13613 from Snuffleupagus/xfa-printing-tweaks
[api-minor] Slightly tweak/improve various code related to XFA-printing
2021-06-23 21:56:29 +02:00
Tim van der Meij
ad4b2ce021
Merge pull request #13612 from Snuffleupagus/issue-13610
Support corrupt documents with *empty* `Name`-entries (issue 13610)
2021-06-23 21:49:02 +02:00
Calixte Denizet
b836616667 XFA - Always bind root subform on root data
- it partially fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1717805 (on the data side at least but there is still a layout issue).
2021-06-23 20:46:41 +02:00
Calixte Denizet
f168998d93 XFA - Add back empty subforms (which can have a background) 2021-06-23 19:42:36 +02:00
Calixte Denizet
e82446fa5a XFA - Get line height from the font
- when the CSS line-height property is set to 'normal' then the value depends on the user agent. So use a line height based on the font itself and if for any reason this value is not available use 1.2 as default.
  - it's a partial fix for https://bugzilla.mozilla.org/show_bug.cgi?id=1717681.
2021-06-23 14:11:10 +02:00
Jonas Jenwald
87be43c193 [api-minor] Add a new getXfaPageViewport helper function to support printing
This patch provides an overall simpler *and* more consistent way of handling the `viewport` parameter during printing of XFA forms, since it's now again guaranteed to always be an instance of `PageViewport`.
Furthermore, for anyone attempting to e.g. implement custom printing of XFA forms this probably cannot hurt either.
2021-06-23 08:17:58 +02:00
Jonas Jenwald
6467907318 Support corrupt documents with *empty* Name-entries (issue 13610)
Apparently some really bad PDF software can create documents with *empty* `Name`-entries, which we thus need to somehow deal with.
While I don't know if this patch is necessarily the best solution, it should at least ensure that the *empty* `Name`-instance cannot accidentally match a proper `Name`-instance (and it doesn't require changes to a lot of existing code).[1]

---
[1] I briefly considered using a `Symbol` rather than an Object, but quickly decided against that since the former one [is not clonable](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm#supported_types) and `Name`-instances may be sent to the API.
2021-06-22 16:55:44 +02:00
Calixte Denizet
aca102a35e XFA - Add margins if needed after having layout some text 2021-06-22 10:36:01 +02:00
calixteman
b886b6c995
Merge pull request #13604 from calixteman/xfa_proto_propr
XFA - A prototype can have a property which needs itself to resolve a proto
2021-06-22 09:58:29 +02:00
calixteman
e84b3bbf6e
Merge pull request #13592 from calixteman/xfa_print_only
XFA - Don't display print-only elements
2021-06-21 21:02:53 +02:00
Calixte Denizet
72c32b3498 XFA - A prototype can have a property which needs itself to resolve a proto 2021-06-21 17:26:29 +02:00
calixteman
56a75f8b26 Revert "Revert "XFA - Fix the way to select page on breaking"" - and fix the error which caused the backout: add an $extra property when creating html. - switch to next content area when breaking on page area. 2021-06-21 17:07:31 +02:00
calixteman
a9385bbb52
Revert "XFA - Fix the way to select page on breaking" 2021-06-21 15:45:04 +02:00
calixteman
da19997781
Merge pull request #13573 from calixteman/bug1716838
XFA - Fix the way to select page on breaking
2021-06-21 15:06:03 +02:00
Calixte Denizet
7aea8faa34 XFA - Fix the way to select page on breaking
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1716838.
  - some fonts in the PDF from the bug were bold when they shouldn't be, so write the font properties in the HTML to avoid using wrong inherited ones.
2021-06-21 12:45:23 +02:00
calixteman
34acf29403
Merge pull request #13593 from calixteman/xfa_rect_border
XFA - Don't display invisible rectangle borders
2021-06-21 12:24:23 +02:00
Calixte Denizet
d99a7c070f XFA - Don't display print-only elements
- partial fix for https://bugzilla.mozilla.org/show_bug.cgi?id=1716980.
2021-06-21 10:08:10 +02:00
Calixte Denizet
7cb92a64b1 XFA - Add support for access property
- it's a partial fix for https://bugzilla.mozilla.org/show_bug.cgi?id=1716816.
2021-06-21 09:56:28 +02:00
calixteman
2e6d3d6b00
Merge pull request #13591 from calixteman/xfa_default_font
XFA - Match font family correctly
2021-06-21 09:28:59 +02:00
Calixte Denizet
d76f11a0ce XFA - Don't display invisible rectangle borders
- partial fix for https://bugzilla.mozilla.org/show_bug.cgi?id=1716980.
2021-06-20 15:45:58 +02:00
Calixte Denizet
7cdbc98716 XFA - Match font family correctly
- partial fix for https://bugzilla.mozilla.org/show_bug.cgi?id=1716980;
  - some PDFs can contain an invalid font family (e.g. 'Windings 3'), so in this case remove the space;
  - the font family in typeface attribute doesn't always match the one defined in the FontDescriptor dictionary.
2021-06-20 15:16:28 +02:00
Jonas Jenwald
c4334dcfe7 Allow using the standard font data for non-Type1 fonts (issue 13585, PR 12726 follow-up)
Given that we're not imposing any font-type restrictions[1] in the non-/FontDescriptor case, it's not really clear to me why we'd actually need to do that in the general case.
Please note that there's some *expected* movement, all of which should be improvements, in the `fips197.pdf` file with this patch.

---
[1] With the exception of Type3-fonts, of course.
2021-06-20 11:13:49 +02:00
Jonas Jenwald
d9ed14a2f5 Set the default value of useSystemFonts correctly, depending on disableFontFace, in the API (PR 13516 follow-up)
*Sorry about the churn here, since the change that I made in PR 13516 was not very smart.*

With the current code, it's now *impossible* for a user to actually control the `useSystemFonts` option manually. To prevent outright breakage we obviously still need to default to setting `useSystemFonts = false` when `disableFontFace === true`, however that should be possible for an API consumer to override.
2021-06-19 13:53:13 +02:00
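
As an illustration of the intended behaviour, the default can be derived from `disableFontFace` only when the caller did not pass an explicit boolean. The function name below is hypothetical, and the sketch deliberately ignores other factors (such as Node.js detection) that the real API may take into account.

```javascript
// Hypothetical sketch of the option handling; not the pdf.js source.
function resolveFontOptions(params = {}) {
  const disableFontFace = params.disableFontFace === true;
  // An explicit boolean from the API consumer always wins; otherwise
  // default to `false` when font faces are disabled.
  const useSystemFonts =
    typeof params.useSystemFonts === "boolean"
      ? params.useSystemFonts
      : !disableFontFace;
  return { disableFontFace, useSystemFonts };
}
```

With this shape, `resolveFontOptions({ disableFontFace: true })` yields `useSystemFonts: false`, while `resolveFontOptions({ disableFontFace: true, useSystemFonts: true })` keeps the caller's choice.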
Brendan Dahl
5d251a3a3e
Merge pull request #13566 from calixteman/layout4
XFA - Fix layout issues
2021-06-17 13:23:28 -07:00
Calixte Denizet
e65b41f891 XFA - When no fonts in the pdf just use font size as width when measuring text 2021-06-17 16:50:56 +02:00
Calixte Denizet
df08b1548b XFA - Fix layout issues
- PR #13554 is buggy, so this patch aims to fix bugs.
  - check if a component fits into its parent, taking into account the parent layout.
  - introduce method isSplittable for template nodes to know if a component can be split in case of overflow.
2021-06-17 16:09:22 +02:00
Calixte Denizet
226c228c2a XFA - Fix reftest for xfa_issue13500 2021-06-17 15:48:13 +02:00
Calixte Denizet
8eeb7ab4a3 XFA - Add the possibility to layout and measure text
- some containers don't always have both of their dimensions, and those dimensions are based on contents;
  - so in order to measure text, we must get the glyph widths (for the xfa fonts) before starting the layout;
  - implement a word-wrap algorithm;
  - handle font change during text layout.
2021-06-17 14:17:02 +02:00
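
To make the idea of measuring text from glyph widths and then wrapping it more concrete, here is a minimal greedy word-wrap sketch. It is not the algorithm from the patch: the function names, the shape of `glyphWidths` (character to width in em units) and the fallback width are all assumptions made for illustration, and font changes within a run are not handled.

```javascript
// Hypothetical sketch of a greedy word-wrap; not the pdf.js implementation.

// Sum the per-glyph widths (em units) and scale by the font size.
function measureText(text, glyphWidths, fontSize, fallbackWidth = 0.5) {
  let width = 0;
  for (const ch of text) {
    width += (glyphWidths[ch] ?? fallbackWidth) * fontSize;
  }
  return width;
}

// Add words to the current line until the next one would exceed maxWidth.
function wrapText(text, maxWidth, glyphWidths, fontSize) {
  const lines = [];
  let current = "";
  for (const word of text.split(/\s+/).filter(Boolean)) {
    const candidate = current ? `${current} ${word}` : word;
    if (current && measureText(candidate, glyphWidths, fontSize) > maxWidth) {
      lines.push(current);
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current) {
    lines.push(current);
  }
  return lines;
}
```

A word that is wider than `maxWidth` on its own is kept on a single line in this sketch; a real layout engine would need a policy for that case.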
calixteman
335d4cb2fc
Merge pull request #13570 from calixteman/xfa_field
XFA - By default a text ui has only one line when in a field element
2021-06-17 09:09:59 +02:00
Brendan Dahl
d6deb95f11
Merge pull request #13565 from brendandahl/fix-pattern-mask
Fix how patterns are applied to image mask objects.
2021-06-16 20:07:57 -07:00
Brendan Dahl
5efaaa0fea Fix how patterns are applied to image mask objects.
Note, this only really fixes Radial/Axial shading patterns with masks.
I'm guessing tiling patterns and mesh patterns would also be broken
if applied like the test pdf. Hopefully I'll have some time to make
test cases for the other shadings.

Fixes #13372
2021-06-16 20:06:41 -07:00
Calixte Denizet
793a0156ce XFA - By default a text ui has only one line when in a field element
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1716809.
2021-06-16 20:18:29 +02:00
calixteman
473322298f
Merge pull request #13569 from calixteman/visibility
XFA - Container wrapper must take the visibility of the wrapped content
2021-06-16 18:55:44 +02:00
Calixte Denizet
143d190a61 XFA - Container wrapper must take the visibility of the wrapped content 2021-06-16 17:29:02 +02:00
Jonas Jenwald
7fa61c062c
Merge pull request #13393 from Snuffleupagus/adjustToUnicode-hasIncludedToUnicodeMap
Tweak `adjustToUnicode` to allow extending a built-in /ToUnicode map
2021-06-16 17:06:17 +02:00
Calixte Denizet
0ea5792c86 XFA - Add support for overflow element
- and fix few bugs:
    - avoid infinite loop when laying out the document;
    - avoid confusion between break and layout failure;
    - don't add margin width in tb layout when getting available space.
2021-06-15 12:32:01 +02:00
Jonas Jenwald
229a49b9b9 Re-factor the fallbackToUnicode functionality (PR 9192 follow-up)
Rather than having to create and check a *separate* `ToUnicodeMap` to handle these cases, we can simply use the `fallbackToUnicode`-data (when it exists) to directly supplement *missing* /ToUnicode entries in the regular `ToUnicodeMap` instead.
2021-06-14 15:05:14 +02:00
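
The "supplement only the missing entries" idea can be sketched with plain `Map`s. The function name is hypothetical and the real `ToUnicodeMap` class has its own interface, so this only shows the merge rule: existing entries are never overwritten.

```javascript
// Hypothetical sketch using plain Maps (charCode -> unicode string);
// the real ToUnicodeMap class is not modelled here.
function supplementToUnicode(toUnicode, fallbackToUnicode) {
  const merged = new Map(toUnicode);
  for (const [charCode, unicode] of fallbackToUnicode) {
    // Only fill gaps; entries already present take precedence.
    if (!merged.has(charCode)) {
      merged.set(charCode, unicode);
    }
  }
  return merged;
}
```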
Jonas Jenwald
7190bc23a8 Remove unnecessary in checks of Arrays, when building the charCodeToGlyphId for TrueType fonts
Note that all standard Encodings have the same length (i.e. `256` elements) and that missing entries are always represented by empty strings, hence why a separate exists-check isn't necessary in the `baseEncoding` case.
2021-06-14 15:05:14 +02:00
Jonas Jenwald
edc38de37a Convert PartialEvaluator.buildToUnicode to an async method
This removes the need to *manually* wrap all return values in a Promise.
2021-06-14 15:05:14 +02:00
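
The effect of such a conversion can be shown with two toy functions (both hypothetical, unrelated to the actual `buildToUnicode` body): an `async` function wraps every returned value in a Promise automatically, so the manual `Promise.resolve` calls become unnecessary.

```javascript
// Toy illustration only; these functions are not from pdf.js.

// Before: every return value is wrapped by hand.
function buildManually(hasToUnicode) {
  if (hasToUnicode) {
    return Promise.resolve("built-in map");
  }
  return Promise.resolve("identity map");
}

// After: `async` does the wrapping, and a thrown error becomes a rejection.
async function buildAsync(hasToUnicode) {
  if (hasToUnicode) {
    return "built-in map";
  }
  return "identity map";
}
```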
Jonas Jenwald
3660aaac85 Tweak adjustToUnicode to allow extending a built-in /ToUnicode map
*This is somewhat similar to the recent changes, in PR 13277, for fonts with an /Encoding entry.*

Currently we're *completely* ignoring the `builtInEncoding`, from the font data itself, for fonts which have a built-in /ToUnicode map.
While it (obviously) doesn't seem like a good idea in general to simply overwrite existing built-in /ToUnicode entries, it should however not hurt to use the `builtInEncoding` to supplement *missing* /ToUnicode entries.
2021-06-14 15:05:14 +02:00
Calixte Denizet
d89c429d78 XFA - Handle maxChars property for text fields
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1716294.
2021-06-14 13:07:06 +02:00
Calixte Denizet
150fa3d96e XFA - Fix error when creating a new data node
- fix for issue #13556;
  - value in a field can be empty.
2021-06-14 11:33:08 +02:00
Calixte Denizet
088db47849 XFA - Value in field can be html 2021-06-13 19:50:28 +02:00
calixteman
96c103462a
Merge pull request #13548 from calixteman/default_fill
XFA - Default fill color for rectangle is transparent
2021-06-13 14:11:22 +02:00
Jonas Jenwald
ddea90b8f6 Remove the isFetchSupported function since the Fetch API is available in all supported browsers
The currently supported browsers, note the minimum versions [listed here](5a4e06af2d/gulpfile.js (L78-L88)), should now have native support for all of the features checked in the `isFetchSupported` function:

 - https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API#browser_compatibility
 - https://developer.mozilla.org/en-US/docs/Web/API/Response#browser_compatibility
 - https://developer.mozilla.org/en-US/docs/Web/API/Body/body#browser_compatibility
 - https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream#browser_compatibility

Hence this function can now be removed, and the code can thus be simplified a little bit.
2021-06-12 08:01:54 +02:00
Brendan Dahl
5a4e06af2d
Merge pull request #13547 from calixteman/cerfa
XFA - Handle correctly subformSet
2021-06-11 13:09:43 -07:00
Tim van der Meij
c7c59feeaf
Merge pull request #13546 from Snuffleupagus/base_factory-tweaks
Re-factor the `DOMCanvasFactory` and `DOMSVGFactory` implementations slightly
2021-06-11 21:19:11 +02:00
Calixte Denizet
694b14c047 XFA - Default fill color for rectangle is transparent 2021-06-11 18:03:19 +02:00
Calixte Denizet
d1e945998b XFA - Handle correctly subformSet
- it aims to avoid looping forever when opening the PDF in #13213;
  - the idea is to treat subformSet as nonexistent when walking the tree. So if we have subformA > subformSet > subformB, then subformB will be visited as a direct child of subformA.
2021-06-11 17:49:17 +02:00
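
The traversal rule from this commit can be illustrated with a small generator. The node shape (`name`, `children`) and the function name are assumptions for the sake of the example; the XFA template nodes in pdf.js are more involved.

```javascript
// Hypothetical sketch: nodes are plain objects with `name` and `children`.
function* effectiveChildren(node) {
  for (const child of node.children ?? []) {
    if (child.name === "subformSet") {
      // Skip the subformSet itself and surface its children instead.
      yield* effectiveChildren(child);
    } else {
      yield child;
    }
  }
}
```

For a tree `subformA > subformSet > subformB`, iterating `effectiveChildren(subformA)` yields `subformB` directly, which matches the behaviour described above.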
Jonas Jenwald
7b17dc8bfd Re-factor the fetchData helper function, in src/display/display_utils.js to be asynchronous 2021-06-11 17:16:00 +02:00
Jonas Jenwald
b05a22d01b Re-factor the DOMSVGFactory to extend an abstract base class
This is first of all consistent with all of the other (similar) factories, and secondly it will also simplify a future addition of a corresponding `NodeSVGFactory` (if that's ever deemed necessary).
2021-06-11 17:15:49 +02:00
Jonas Jenwald
d10b850916 Move most functionality in the create methods into the BaseCanvasFactory
This *slightly* reduces the amount of code duplication in the `DOMCanvasFactory.create` and `NodeCanvasFactory.create` methods.
2021-06-11 17:15:47 +02:00
Calixte Denizet
da4916e3c1 XFA - Center vertically radio without caption
- and fix intent value which is used to name the radio button group.
2021-06-11 13:24:45 +02:00
Calixte Denizet
367d1ad137 XFA - Return html element for the different possible value
- it aims to fix #13536.
2021-06-11 11:51:54 +02:00
Brendan Dahl
a3b0596cf2
Merge pull request #13534 from calixteman/missing_page
XFA - Flush contents when breakBefore target is 'auto'
2021-06-10 13:17:20 -07:00
Brendan Dahl
02c03795f3
Merge pull request #13532 from calixteman/radio
XFA - Give all the available space to the caption in case of checkButton
2021-06-10 12:30:49 -07:00
Jonas Jenwald
3bd24d8d5a Throw errors directly, rather than using assert, in the DOMSVGFactory
This is similar to all of the other factories in this file, since they *directly* throw errors.
2021-06-10 21:08:23 +02:00
Jonas Jenwald
26011c65f4 Add a DOMMatrix polyfill for Node.js environments (PR 13361 follow-up)
Given that `DOMMatrix` is, unsurprisingly, not supported in Node.js the `createMatrix` helper function in `src/display/pattern_helper.js` is most likely broken in Node.js environments. It will obviously try to fallback to the `DOMSVGFactory`, however that isn't intended for Node.js usage and errors will be thrown.

Rather than trying to implement a `NodeSVGFactory`, this patch takes the easier route of just adding a `DOMMatrix` polyfill using: https://www.npmjs.com/package/dommatrix
This isn't done only for simplicity, but it'll become necessary anyway since the `createMatrix` helper function is only temporary and will be removed in the future.
2021-06-10 21:08:23 +02:00
Calixte Denizet
d7d53e7c6c XFA - Flush contents when breakBefore target is 'auto'
- some pages can be missing from the final document because of that (see the PDF in the test case, which has 4 pages while only 3 are rendered right now).
2021-06-10 17:15:08 +02:00
Calixte Denizet
58633ab9fd XFA - Give all the available space to the caption in case of checkButton
- a checkbox or radio doesn't have to be rescaled when the container is large so give the extra space to the caption to avoid some word wrapping.
  - when the caption is on the right, then put ui on the left as first element and so remove flex:row-reverse stuff.
2021-06-10 15:30:23 +02:00
Calixte Denizet
3bd936709c XFA - Handle caption with inline placement as left one
- it's just a temporary workaround to unblock release in Firefox.
2021-06-09 22:13:48 +02:00
Brendan Dahl
d333af7848
Merge pull request #13527 from calixteman/bind_inf_loop
XFA - Avoid infinite loop when creating some nodes in data
2021-06-09 12:37:29 -07:00
Brendan Dahl
aa2712744d
Merge pull request #13502 from calixteman/contentarea
XFA - contentarea must be on top of the other containers in a pageArea
2021-06-09 12:36:21 -07:00
Jonas Jenwald
69477bfb06 Always use standard font data, with disableFontFace set in the API (PR 12726 follow-up)
We must force-fetch standard font data, when `disableFontFace = true` is set in the API, since otherwise rendering in e.g. the viewer is still broken (same as before PR 12726 landed).

*Please note:* We still need to also load standard font data for patterns and/or some text-rendering modes, however that will require larger changes so I figured that it cannot hurt to submit *this* patch right now.
2021-06-09 21:21:02 +02:00
Tim van der Meij
2a65455c71
Merge pull request #13525 from Snuffleupagus/api-conditional-Factory
[api-minor] Re-factor the `disableFontFace` fallback value, and skip initializing factories with `useWorkerFetch` set
2021-06-09 21:15:39 +02:00
Calixte Denizet
cddc1d869d XFA - Avoid infinite loop when creating some nodes in data 2021-06-09 19:07:59 +02:00
Jonas Jenwald
a01c599247 Cache the "raw" standard font data in the worker-thread (PR 12726 follow-up)
*This implementation is basically a copy of the pre-existing `builtInCMapCache` implementation.*

For some, badly generated, PDF documents it's possible that we'll end up having to fetch the *same* standard font data over and over (which is obviously inefficient).
While not common, it's certainly possible that a PDF document uses *custom* font names where the actual font then references one of the standard fonts; see e.g. issue 11399 for one such example.

Note that I did suggest adding worker-thread caching of standard font data in PR 12726, however it wasn't deemed necessary at the time. Now that we have a real-world example that benefits from caching, I think that we should simply implement this now.
2021-06-09 18:27:51 +02:00
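
A cache of this kind is commonly keyed by name and stores the pending Promise, so that repeated requests for the same font share one fetch. The class below is a generic sketch of that pattern; its name and the injected `fetchFn` are hypothetical, and it does not reproduce the `builtInCMapCache` code.

```javascript
// Hypothetical sketch of a name-keyed cache; not the pdf.js implementation.
class StandardFontDataCache {
  constructor(fetchFn) {
    this._fetchFn = fetchFn; // (name) => Promise<Uint8Array>, supplied by the caller
    this._cache = new Map();
  }

  get(name) {
    let promise = this._cache.get(name);
    if (!promise) {
      // Store the Promise itself so concurrent callers share one request.
      promise = this._fetchFn(name);
      this._cache.set(name, promise);
    }
    return promise;
  }
}
```

Caching the Promise rather than the resolved data means that a second request arriving before the first fetch completes does not trigger another fetch.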
calixteman
6d88d8cdaa
Merge pull request #13517 from calixteman/liberation
XFA - Add Liberation-Sans font as a substitution for some missing fonts
2021-06-09 18:19:07 +02:00
Calixte Denizet
34a2fa72c7 XFA - Add Liberation-Sans font as a substitution for some missing fonts
- Some js files contain scale factors for each glyph in order to rescale Liberation to have a final font with the correct width.
  - A lot of XFA have some containers where their dimensions are based on their text content, so using default font from browser can lead to an almost unreadable pdf.
2021-06-09 16:55:45 +02:00
Calixte Denizet
fd1110adb4 Add the possibility to rescale each glyph in a font
- a lot of xfa files are using Myriad pro or Arial fonts without embedding them and some containers have some dimensions based on those font metrics. So not having the exact same font leads to a wrong display.
  - since it's pretty hard to find a replacement font with the exact same metrics, this patch gives the possibility to read glyf table, rescale each glyph and then write a new table.
  - so once PR #12726 is merged we could rescale for example Helvetica to replace Myriad Pro.
2021-06-09 16:01:13 +02:00
Jonas Jenwald
2f8e2548f2 Don't initialize CMapReaderFactory/StandardFontDataFactory when the useWorkerFetch API option is set
Given that there's no fallback on the worker-thread, it shouldn't be necessary to initialize `CMapReaderFactory`/`StandardFontDataFactory` when `useWorkerFetch = true` is set.

Slightly unrelated, but this patch also ensures that the `useSystemFonts` default value only does the `isNodeJS` check in builds where that's actually necessary.
2021-06-09 15:35:23 +02:00
Jonas Jenwald
312326991f [api-minor] Set the disableFontFace fallback value directly in the API
At this point in time, the `apiCompatibilityParams` is essentially unused with the sole exception of the `disableFontFace` handling for Node.js environments.
Given that `isNodeJS` is a constant now (originally it was a function), we can simply set the correct fallback value for `disableFontFace` directly in the API and clean-up the code a bit here.
2021-06-09 15:35:23 +02:00
Calixte Denizet
1f6345b6c2 XFA - Display rectangle, line and arc 2021-06-09 15:34:31 +02:00
Calixte Denizet
1486608f32 XFA - contentarea must be on top of the other containers in a pageArea 2021-06-09 15:29:29 +02:00
Jonas Jenwald
d995f90183 Fetch binary CMap data in the worker-thread, when useWorkerFetch is set
This patch uses the new option added in PR 12726 to *also* allow fetching binary CMap data directly in the worker-thread in browsers.
Given that these changes remove the need to transfer data between threads for the default (browser) use-case, we can also revert the changes in PR 11118 since that simplifies the overall implementation.
2021-06-08 21:51:07 +02:00
Jonas Jenwald
248113bbf0 Move BaseCanvasFactory, BaseCMapReaderFactory, and BaseStandardFontDataFactory to their own file
Given that these factories are being used in *different* files, for Browser respectively Node.js implementations, it seems reasonable to move them into their own file instead.
2021-06-08 21:48:49 +02:00
Calixte Denizet
cfa727474e XFA - Fix layout issues (again)
- some elements weren't displayed because their rotation angle was not taken into account;
  - fix box model (XFA concept):
    - remove use of outline;
    - position correctly border which isn't part of box dimensions;
    - fix margins issues (see issue #13474).
  - move border on button instead of having it on wrapping div;
2021-06-08 17:42:53 +02:00
Calixte Denizet
63caa101f8 XFA - Add support for reftests 2021-06-08 10:37:26 +02:00
Jonas Jenwald
e7dc822e74
Merge pull request #12726 from brendandahl/standard-fonts
[api-minor] Include and use the 14 standard font files.
2021-06-08 10:09:40 +02:00
Brendan Dahl
4c1dd47e65 Include and use the 14 standard fonts files. 2021-06-07 11:10:11 -07:00
calixteman
8b4acb4e36
Merge pull request #13501 from calixteman/13500
XFA - CDATA can be xml so parse it when required
2021-06-07 11:27:49 +02:00
Jonas Jenwald
e0abf87bc3
Merge pull request #13505 from Snuffleupagus/createMatrix-DOMSVGFactory
Use the `DOMSVGFactory`, rather than manually creating the SVG-element, in `createMatrix` (PR 13361 follow-up)
2021-06-07 11:16:07 +02:00
Calixte Denizet
5dc7f4ade8 XFA - CDATA can be xml so parse it when required 2021-06-07 10:38:39 +02:00
Jonas Jenwald
9e632ee323 Use the DOMSVGFactory, rather than manually creating the SVG-element, in createMatrix (PR 13361 follow-up)
Generally, in the `src/display/` folder, we utilize `DOMSVGFactory` rather than manually creating an SVG-element; hence let's do the same thing in `src/display/pattern_helper.js` as well.
2021-06-07 10:15:20 +02:00
Calixte Denizet
112645ea3d XFA - Don't bind a form node with an empty value when the data node doesn't exist 2021-06-06 17:59:01 +02:00
Tim van der Meij
2b63d97b9d
Merge pull request #13461 from Snuffleupagus/issue-6605
Improve text-selection for Type3 fonts with empty /FontBBox-entries (issue 6605)
2021-06-06 14:37:52 +02:00
Jonas Jenwald
04ab4bd406 Normalize the coordinates used in SVGGraphics._makeTilingPattern (issue 12996)
While this prevents the error which is currently thrown by the `assert` in the `DOMSVGFactory.create` method, the pattern still doesn't actually render (visibly). However, in the interest of getting rid of some open issues, this patch should make (some) sense and there are already other issues about patterns in the SVG-backend.

Given that, as clearly [outlined in the FAQ](https://github.com/mozilla/pdf.js/wiki/Frequently-Asked-Questions#backends), the SVG-backend is *not* officially supported and that there's currently no development of it; this is probably the most that is reasonable to do here.
2021-06-05 09:15:23 +02:00
Jonas Jenwald
1dd01b8506
Merge pull request #13494 from brendandahl/stepper-show-text
Add more info for showText operator in stepper.
2021-06-05 08:37:11 +02:00
Jonas Jenwald
eefc94ceb7 Ensure that we fully load Type3 fonts in PartialEvaluator.getTextContent
This is necessary now, since with the previous patch the /FontBBox potentially depends on the contents of the /CharProcs-streams.
Note that if `getOperatorList` is called *before* `getTextContent`, this patch doesn't matter since the font is already fully loaded/parsed. However, for e.g. the `text` test-cases this is necessary to ensure correct reference images.
2021-06-05 08:09:29 +02:00
Jonas Jenwald
20770cb06a Improve text-selection for Type3 fonts with empty /FontBBox-entries (issue 6605)
For Type3 fonts where the /CharProcs-streams of the individual glyph starts with a `d1` operator, we can use that to build a fallback bounding box for the font and thus improve text-selection in some cases.
2021-06-05 08:09:29 +02:00
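
In a Type3 glyph procedure, the `d1` operator takes six operands (`wx wy llx lly urx ury`), the last four being the glyph's bounding box. One plausible way to build a fallback font bounding box from them is to take the union of the per-glyph boxes, as sketched below; the function name and the input format are assumptions, and the actual patch may combine the boxes differently.

```javascript
// Hypothetical sketch: each entry is the [llx, lly, urx, ury] part of a
// glyph's `d1` operands. Returns the union box, or null without input.
function buildFallbackFontBBox(glyphBBoxes) {
  let bbox = null;
  for (const [llx, lly, urx, ury] of glyphBBoxes) {
    if (!bbox) {
      bbox = [llx, lly, urx, ury];
      continue;
    }
    bbox[0] = Math.min(bbox[0], llx);
    bbox[1] = Math.min(bbox[1], lly);
    bbox[2] = Math.max(bbox[2], urx);
    bbox[3] = Math.max(bbox[3], ury);
  }
  return bbox;
}
```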
Brendan Dahl
6255c2a8f3
Merge pull request #13376 from calixteman/6132
Replace command with not enough args by an endchar in CFF font
2021-06-04 14:00:51 -07:00
Brendan Dahl
17f1857556 Add more info for showText operator in stepper.
Adds a table that shows original char code, font char code, and unicode.
2021-06-04 13:58:05 -07:00
Jonas Jenwald
75113e4517 Initialize HTMLResult.{FAILURE, EMPTY} lazily
While these objects aren't exactly that big and/or complex, they are nonetheless *only* necessary for XFA documents.
However, currently these objects are initialized *eagerly* for all PDF documents. By using the same pattern as elsewhere in the code-base, it's very easy to make these lazily initialized; so let's just do that instead :-)
2021-06-04 21:01:14 +02:00
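
The lazy-initialization pattern referred to here is typically a static getter that replaces itself with a plain data property on first access. The sketch below defines its own small `shadow` helper for that purpose; it is meant to resemble the pattern, not to copy the pdf.js helper, and the constructor arguments are made up.

```javascript
// Hypothetical sketch of lazily initialized static values.
let instancesCreated = 0; // only here to make the laziness observable

// Replace the accessor `prop` on `obj` with a fixed value and return it.
function shadow(obj, prop, value) {
  Object.defineProperty(obj, prop, {
    value,
    enumerable: true,
    configurable: true,
    writable: false,
  });
  return value;
}

class HTMLResult {
  constructor(success, html) {
    this.success = success;
    this.html = html;
    instancesCreated++;
  }

  // Nothing is created until one of these getters is first read.
  static get FAILURE() {
    return shadow(this, "FAILURE", new HTMLResult(false, null));
  }

  static get EMPTY() {
    return shadow(this, "EMPTY", new HTMLResult(true, null));
  }
}
```

Documents that never touch `HTMLResult.FAILURE` or `HTMLResult.EMPTY` never pay for creating them, and later reads return the same cached object.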
calixteman
e0676ec298
Merge pull request #13473 from calixteman/usehref
XFA - Implement usehref support
2021-06-04 20:13:22 +02:00
Calixte Denizet
11573ddd16 XFA - Implement usehref support
- attribute 'use' was already implemented but not usehref
  - in general, usehref should make reference to current document
  - add support for SOM expressions in use and usehref to search a node.
  - get prototype for all nodes if any.
2021-06-04 14:57:05 +02:00
Jonas Jenwald
4b1c4d2bd9 Add hasEOL to the TextItem typedef in the API (PR 13257 follow-up) 2021-06-04 10:22:43 +02:00
Calixte Denizet
f61f80a5a3 XFA - Use native radio and checkbox buttons
- Remove current stuff which relies on some system fonts to avoid bad rendering.
2021-06-01 21:25:38 +02:00
calixteman
f2ade671ec
Merge pull request #13411 from calixteman/xfa_print
XFA - Add support to print XFA forms
2021-06-01 19:06:49 +02:00
Jonas Jenwald
3456ed271b
Merge pull request #13378 from calixteman/10544
Replace terminal null char by a endchar command in CFF charstrings to make OTS happy
2021-06-01 16:04:09 +02:00
Jonas Jenwald
e3bde56311 Ensure that the old/new options are correctly combined in PartialEvaluator.clone 2021-05-31 12:14:53 +02:00
Calixte Denizet
a434011517 XFA - Add support to print XFA forms 2021-05-31 10:26:30 +02:00
calixteman
8c53bf8647
Merge pull request #13437 from calixteman/xfa_mv_root
XFA - Move the fake HTML representation of XFA from the worker to the main thread
2021-05-31 10:14:15 +02:00
Tim van der Meij
dd0014ef2e
Merge pull request #13465 from Snuffleupagus/misc-legacy-cleanup
Some `-es5`/`-legacy` renaming clean-up, and `deprecated` API options removal (PR 12978, PR 13207 follow-up)
2021-05-30 21:13:42 +02:00
Ikko Ashimine
c66289f1fc
Fix typo in template.js
refering -> referring
2021-05-31 01:02:07 +09:00
Jonas Jenwald
d8a7c75b4a Revert "Add deprecated handling of the now removed AnnotationStorage API-parameters" (PR 13207 follow-up)
This reverts commit 737a8e846d, since it's included in the latest beta version `2.9.359`.
2021-05-30 16:38:33 +02:00
Tim van der Meij
a0ce3cb3b4
Merge pull request #13448 from Snuffleupagus/_setDefaultAppearance-alpha
Support strokeAlpha/fillAlpha when creating a fallback appearance stream (issue 6810)
2021-05-28 23:39:36 +02:00
Tim van der Meij
5e5641b147
Merge pull request #13457 from Snuffleupagus/issue-13242
Work-around for HighlightAnnotations without a top-level /ExtGState-entry (issue 13242)
2021-05-28 23:38:39 +02:00
Tim van der Meij
0d56b1c365
Merge pull request #13443 from Snuffleupagus/charsCache
Re-factor the `charsCache` on `Font`-instances
2021-05-28 21:29:57 +02:00
Brendan Dahl
a6484c9861
Merge pull request #13427 from calixteman/xfa_storage
XFA - Add a storage to save fields values
2021-05-28 12:10:08 -07:00
Jonas Jenwald
707a9e3b02 Work-around for HighlightAnnotations without a top-level /ExtGState-entry (issue 13242)
For HighlightAnnotations with a built-in appearance stream, we still rely on it to specify the opacity correctly via a suitable blend mode. However, if the Annotation-drawing operators are placed *within* a /XObject of the /Form-type, the /ExtGState won't apply to the final rendering and the result is that the highlighting obscures the underlying text.

The more *correct* and general solution would likely be to somehow modify the implementation in `src/display/canvas.js`, to special-case handling of /Form-type /XObjects when rendering Annotations. Since we can very easily work-around this problem for now by using the "no appearance stream" code-path, doing *something* here ought to be preferable.

This patch is (obviously) merely a work-around, but given that the referenced issue is (as far as I know) the first case we've seen of this problem a simple solution will hopefully suffice for now.
2021-05-28 13:49:27 +02:00
calixteman
e499521b78
Merge pull request #13456 from calixteman/clazz
Replace clazz by classNames
2021-05-28 12:18:27 +02:00
Jonas Jenwald
2cc3b96351
Merge pull request #13455 from calixteman/italic
Italic angle is defined clockwise in CSS when it's counterclockwise in PDF
2021-05-28 11:49:48 +02:00
Calixte Denizet
f35176a32e Replace clazz by classNames 2021-05-28 11:17:38 +02:00
Calixte Denizet
1b0006093d Italic angle is defined clockwise in CSS when it's counterclockwise in PDF 2021-05-28 11:06:11 +02:00
Jonas Jenwald
70c79c6f69 Fix the JSDocs for PDFDocumentProxy.getPageIndex (issue 13449) 2021-05-27 16:41:08 +02:00
Jonas Jenwald
52c13326cd Support Annotations, without appearance streams, with bogus /Rect-entries (issue 13447)
This extends PR 13106 to apply not only to empty /Rect-entries, but also to bogus /Rect-entries for various Annotation-types.
2021-05-27 16:23:21 +02:00
Jonas Jenwald
a6447f2ca2 Support strokeAlpha/fillAlpha when creating a fallback appearance stream (issue 6810)
This fixes the colours, by respecting the strokeAlpha/fillAlpha-values, for a couple of Annotations in the PDF document from issue 13447.[1]

---
[1] Some of the annotations still won't render at all, when compared with Adobe Reader, but that could/should probably be handled separately.
2021-05-27 16:23:18 +02:00
calixteman
f587d5998e
Merge pull request #13445 from calixteman/ps_name
Fix Postscript name in font to avoid bug when saving in pdf
2021-05-27 13:52:47 +02:00
Calixte Denizet
0c698346b8 Fix Postscript name in font to avoid bug when saving in pdf
- for xfa rendering, fonts are loaded and used in html;
  - when printed and saved in pdf, on linux, Firefox uses cairo backend
  - when subsetting a font, cairo uses the font postscript name and when this one is empty that leads to a bug
    (the append at 63f0d62684/src/cairo-cff-subset.c (L2049) is failing because of null length)
  - so this patch adds a postscript name to the font to make cairo happy.
2021-05-27 12:45:40 +02:00
Jonas Jenwald
8b1d01816b Re-factor the charsCache on Font-instances
Currently `charsCache` is initialized *lazily*, which considering that it just contains a simple `Object` doesn't seem entirely necessary. This first of all forces us to do repeated exists-checks in the `Font.charsToGlyphs` method, and secondly the similar/related `glyphCache` is already initialized eagerly.

Furthermore, this patch also does a bit of clean-up in the `Font.charsToGlyphs` method since this code is quite old.
2021-05-26 13:13:44 +02:00
Tim van der Meij
3da9f077be
Merge pull request #13435 from Snuffleupagus/eslint-no-array-push-push
Enable the `unicorn/no-array-push-push` ESLint plugin rule
2021-05-25 21:10:01 +02:00
Tim van der Meij
6e92b56efa
Merge pull request #13436 from Snuffleupagus/getPathGenerator-buf
Re-factor FontFaceObject.getPathGenerator to use Arrays instead of strings
2021-05-25 20:35:01 +02:00
Calixte Denizet
45c3f00a27 XFA - Move the fake HTML representation of XFA from the worker to the main thread
- the only goal of this patch is to be able to get synchronously the fake html when printing from firefox:
    - in order to print we need to inject some html in beforeprint callback but we cannot block while waiting for all the pages.
  - from a memory point of view: it doesn't change anything since the fake HTML is deleted in the worker;
  - this way we don't break any assumptions.
2021-05-25 19:33:07 +02:00
Calixte Denizet
9478d2f064 XFA - Add a storage to save fields values - this is required to be able to print (or save) a document. Some pages can be unloaded (because pdf.js is lazy) and this storage will help to save their data in order to reuse them when printing or just when displaying a page again. 2021-05-25 19:25:09 +02:00
Calixte Denizet
7cebdbd58c XFA - Fix lot of layout issues
- I thought it was possible to rely on browser layout engine to handle layout stuff but it isn't possible
    - mainly because when a contentArea overflows, we must continue to layout in the next contentArea
    - when no more contentArea is available then we must go to the next page...
    - we must handle breakBefore and breakAfter which allows to "break" the layout to go to the next container
  - Sometimes some containers don't provide their dimensions so we must compute them in order to know where to put
    them in their parents but to compute those dimensions we need to layout the container itself...
  - See top of file layout.js for more explanations about layout.
  - fix a few bugs found in other places while working on layout.
2021-05-25 17:51:36 +02:00
Jonas Jenwald
9ad7746118 Replace a couple of standard for-loops with for...of in src/display/font_loader.js 2021-05-25 14:11:57 +02:00
Jonas Jenwald
dcbb23d7fa Re-factor FontFaceObject.getPathGenerator to use Arrays instead of strings
This is similar to a lot of other code, where we use "Array + join" rather than repeated string concatenation.
2021-05-25 14:11:54 +02:00
Jonas Jenwald
ec3bcadf56 Enable the unicorn/no-array-push-push ESLint plugin rule
There's generally speaking no need to use multiple consecutive `Array.prototype.push()` calls, since that method accepts multiple arguments, and this ESLint rule helps enforce that pattern.

Please see https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/no-array-push-push.md for additional information.
2021-05-25 13:54:46 +02:00
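
For illustration, this is the kind of code the rule targets and the equivalent single call. The variable names and values are made up; only the fact that `Array.prototype.push()` accepts multiple arguments is relied upon.

```javascript
// Flagged by the rule: consecutive push() calls on the same array.
const before = [];
before.push("M", 0, 0);
before.push("L", 10, 0);

// Preferred: one push() call with all of the arguments.
const after = [];
after.push("M", 0, 0, "L", 10, 0);
```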
Calixte Denizet
209ac5ca57 XFA - Don't display images with a href 2021-05-22 15:09:43 +02:00
calixteman
0df1a56619
Merge pull request #13417 from Snuffleupagus/xfa-URL-clone
[XFA] Send URLs as strings, rather than objects (issue 1773)
2021-05-22 14:31:59 +02:00
Tim van der Meij
de680d7777
Merge pull request #13381 from Snuffleupagus/buildFontPaths-ignoreErrors
Handle errors gracefully, in PartialEvaluator.buildFontPaths, when glyph path building fails
2021-05-22 13:06:31 +02:00
Jonas Jenwald
53a70244d0 Use the stringToBytes helper function in more places
Rather than manually reimplementing, more-or-less, this functionality in a few spots we can simply use the existing helper function instead.
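For readers unfamiliar with this kind of helper, the following is a minimal sketch of what a string-to-bytes conversion might look like. It is an assumption about the general shape, not a copy of the actual `stringToBytes` implementation, and the name `stringToBytesSketch` is hypothetical.

```javascript
// Sketch: converts each character of a (binary) string into one byte,
// keeping only the low 8 bits of every char code.
function stringToBytesSketch(str) {
  const length = str.length;
  const bytes = new Uint8Array(length);
  for (let i = 0; i < length; ++i) {
    bytes[i] = str.charCodeAt(i) & 0xff;
  }
  return bytes;
}
```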
2021-05-22 12:23:09 +02:00
Jonas Jenwald
ba13bd8c2d [XFA] Send URLs as strings, rather than objects (issue 1773)
Given that `URL`s aren't supported by the structured clone algorithm, see https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm, the document in issue 1773 will cause the browser to throw `DataCloneError: The object could not be cloned.`-errors and nothing will render.
To fix this, we'll instead simply send the stringified version of the `URL` to prevent these errors from occurring.
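The idea can be sketched as follows. The `toCloneable` helper and the flat attribute object are hypothetical, and the real XFA code may structure its data differently.

```javascript
// Hypothetical helper: replaces any URL value with its string form, so that
// the resulting object only holds data that can be passed to postMessage.
function toCloneable(attrs) {
  const out = {};
  for (const [key, value] of Object.entries(attrs)) {
    out[key] = value instanceof URL ? value.href : value;
  }
  return out;
}
```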
2021-05-22 11:58:53 +02:00
Jonas Jenwald
c4429bc3f2 Do the isType3Font-check *once*, rather than repeating it, in PartialEvaluator.translateFont
*This is a small piece of clean-up that I happened to notice while browsing the code.*
2021-05-22 11:46:37 +02:00
Jonas Jenwald
68350378c0 Handle errors gracefully, in PartialEvaluator.buildFontPaths, when glyph path building fails
The building of glyph paths, in the `FontRendererFactory`, can fail in various ways for corrupt font data. However, we're currently not attempting to handle any such errors in the evaluator, which means that a single broken glyph *can* prevent an entire page from rendering.

To address this we simply have to pass along, and check, the existing `ignoreErrors` option in `PartialEvaluator.buildFontPaths` similar to the rest of the `PartialEvaluator` code.
2021-05-22 11:46:31 +02:00
Jonas Jenwald
0dba468e60 Don't allow the LoopbackPort to "clone" a URL
Note that `URL`s aren't supported by the structured clone algorithm, see https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm, and any attempt to send a `URL` using `postMessage` is rejected by the browser. Hence, for consistency when workers are disabled, the `LoopbackPort` should obviously also reject any `URL`s.
2021-05-22 10:11:31 +02:00
Tim van der Meij
b2ffebe978
Merge pull request #13416 from calixteman/xfa_config
XFA - Fix wrong function name
2021-05-21 20:33:35 +02:00
Calixte Denizet
8a8879aed2 XFA - Fix wrong function name 2021-05-21 20:25:26 +02:00
Tim van der Meij
d1d9b9043d
Merge pull request #13415 from Snuffleupagus/getDestination-out-of-order
Improve handling of named destinations in out-of-order NameTrees (PR 10274 follow-up)
2021-05-21 20:15:09 +02:00
Jonas Jenwald
8d5689387b Improve handling of named destinations in out-of-order NameTrees (PR 10274 follow-up)
According to the specification, see https://web.archive.org/web/20210404042322if_/https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.2384179, the keys of a NameTree/NumberTree should be ordered.
For corrupt PDF files, which violate this assumption, it's thus possible that trying to look up a single entry fails.

Previously, in PR 10274, we implemented a fallback that only applies to the "bottom" node of a NameTree/NumberTree, which in general might not actually help for sufficiently corrupt NameTree/NumberTree data.
Instead we remove the current *limited* fallback from `NameOrNumberTree.get`, and defer to the call-site to handle this case explicitly e.g. by using `NameOrNumberTree.getAll` for data where that makes sense. For well-formed documents, these changes should *not* lead to any additional data fetching/parsing.

Finally, as part of these changes, the validation of named destination data is improved in the `Catalog` and a new unit-test is also added.
2021-05-21 15:48:37 +02:00
Jonas Jenwald
1a8d05fdcf Remove some, with Prettier 2.3.0, unnecessary // prettier-ignore comments
To get the maximum benefit from something like Prettier, you obviously don't want to disable the automatic formatting unless absolutely necessary. When we added Prettier there were a number of cases, mostly involving larger Arrays, which required disabling of the automatic formatting for overall readability and/or to not break inline comments.

With changes in Prettier version `2.3.0`, see [the release notes](https://prettier.io/blog/2021/05/09/2.3.0.html#concise-formatting-of-number-only-arrays-10106httpsgithubcomprettierprettierpull10106-10160httpsgithubcomprettierprettierpull10160-by-thorn0httpsgithubcomthorn0), there's now better formatting support for Arrays containing only numbers. Hence we can now remove a number of `// prettier-ignore` comments, and thus get the benefit of automatic formatting in (slightly) more of the code-base.
2021-05-19 11:36:03 +02:00
calixteman
faf6b10939
Merge pull request #13394 from calixteman/xml_parser
Handle PI with no value in xml parser
2021-05-18 11:14:48 +02:00
Calixte Denizet
4544ebf38a Handle PI with no value in xml parser
- an XML PI contains a target and optionally some content (see https://en.wikipedia.org/wiki/Processing_Instruction)
  - the parser expected to always have some content and so it could lead to wrong parsing.
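A minimal sketch of parsing a processing instruction whose content is optional. The function name and the returned shape are assumptions for illustration, not the actual parser code.

```javascript
// Sketch: parses "<?target content?>" where the content part may be missing.
// Returns null when the input isn't a processing instruction.
function parseProcessingInstruction(s) {
  if (!s.startsWith("<?") || !s.endsWith("?>")) {
    return null;
  }
  const inner = s.slice(2, -2);
  // The target is mandatory; whitespace followed by content is optional.
  const match = /^(\S+)(?:\s+([\s\S]*))?$/.exec(inner);
  if (!match) {
    return null;
  }
  return { name: match[1], value: match[2] || "" };
}
```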
2021-05-18 10:22:18 +02:00
Brendan Dahl
239d0097fa
Merge pull request #13390 from calixteman/opentype_and_xfa
XFA - Don't move glyphs in private area with non-truetype fonts
2021-05-17 12:39:10 -07:00
Brendan Dahl
46c2eeb19a
Merge pull request #13389 from calixteman/width_in_cff
Get any width (if one is present) in CFF parser
2021-05-17 09:13:45 -07:00
Brendan Dahl
17e9cfcd2a
Merge pull request #13328 from calixteman/js_display1
JS - Add support for display property
2021-05-17 08:47:13 -07:00
Calixte Denizet
a74d19262a XFA - Don't move glyphs in private area with non-truetype fonts
- it has been done in PR #13146 but only for truetype fonts.
2021-05-17 16:52:39 +02:00
Calixte Denizet
d394188835 Get any width (if one is present) in CFF parser
- in charstring specs at page 21 (section 4.2): "Also, it may appear in the charstring as the difference from nominalWidthX" so the number we've on the stack doesn't have to be positive.
  - currently this bug has probably no visible effect
  - but when the font is loaded to be used with XFA, then the rendering is incorrect.
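The rule quoted from the charstring specification can be sketched like this. The function and parameter names are hypothetical, and the sketch only shows why a negative operand is still a valid width.

```javascript
// Sketch: when a width operand is present in the charstring it is stored as a
// difference from nominalWidthX, so it may be negative; when it is absent the
// glyph uses defaultWidthX.
function getGlyphWidth(widthOperand, defaultWidthX, nominalWidthX) {
  if (widthOperand === undefined) {
    return defaultWidthX;
  }
  return nominalWidthX + widthOperand;
}
```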
2021-05-17 14:17:08 +02:00
Jonas Jenwald
718f7bf7e1 Fix a few *safe* ESLint no-var failures in src/core/evaluator.js (13371 follow-up)
As can be seen in PR 13371, some of the `no-var` changes in the `PartialEvaluator.{getOperatorList, getTextContent}` methods caused errors in `gulp server`-mode.
However, there's a handful of instances of `var` in other methods which should be completely *safe* to convert since there's no strange scope-issues present in that code.
2021-05-16 15:22:43 +02:00
Tim van der Meij
a5c74f53c1
Merge pull request #13386 from timvandermeij/src-core-bidi-no-var
Enable the `no-var` linting rule in `src/core/bidi.js`
2021-05-16 15:02:18 +02:00
Tim van der Meij
b8a5e797c5
Enable the no-var linting rule in src/core/bidi.js
This is done automatically with `gulp lint --fix` and the following
manual changes:

```diff
diff --git a/src/core/bidi.js b/src/core/bidi.js
index e9e0a7217..32691c0c6 100644
--- a/src/core/bidi.js
+++ b/src/core/bidi.js
@@ -82,7 +82,8 @@ function isEven(i) {
 }

 function findUnequal(arr, start, value) {
-  for (var j = start, jj = arr.length; j < jj; ++j) {
+  let j, jj;
+  for (j = start, jj = arr.length; j < jj; ++j) {
     if (arr[j] !== value) {
       return j;
     }
@@ -251,15 +252,14 @@ function bidi(str, startLevel, vertical) {
   for (i = 0; i < strLength; ++i) {
     if (types[i] === "EN") {
       // do before
-      var j;
-      for (j = i - 1; j >= 0; --j) {
+      for (let j = i - 1; j >= 0; --j) {
         if (types[j] !== "ET") {
           break;
         }
         types[j] = "EN";
       }
       // do after
-      for (j = i + 1; j < strLength; ++j) {
+      for (let j = i + 1; j < strLength; ++j) {
         if (types[j] !== "ET") {
           break;
         }
```
2021-05-16 14:14:26 +02:00
Jonas Jenwald
3cfa316d40 Convert src/core/operator_list.js to use standard classes
With modern JavaScript modules, where only *explicitly* exported properties are visible to the outside, the `QueueOptimizerClosure` should no longer be necessary.

Furthermore, to reduce the possibility of `NullOptimizer` and `QueueOptimizer` getting out of sync (note e.g. the inconsistency fixed in PR 10784), we now let the latter extend the former one.
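To illustrate why inheritance helps keep two implementations in sync, here is a simplified sketch. The class names and the `seen` counter are invented for the example; they do not reproduce `NullOptimizer` or `QueueOptimizer`.

```javascript
// Base class: defines the full interface with no-op optimization hooks.
class BaseOptimizerSketch {
  constructor(queue) {
    this.queue = queue;
  }

  _optimize() {}

  push(fn, args) {
    this.queue.fnArray.push(fn);
    this.queue.argsArray.push(args);
    this._optimize();
  }

  reset() {}
}

// Subclass: only overrides what differs, so `push` cannot drift apart.
class CountingOptimizerSketch extends BaseOptimizerSketch {
  constructor(queue) {
    super(queue);
    this.seen = 0;
  }

  _optimize() {
    this.seen++;
  }

  reset() {
    this.seen = 0;
  }
}
```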
2021-05-16 13:39:54 +02:00
Tim van der Meij
8a8a67de3b
Merge pull request #13380 from Snuffleupagus/pattern_helper-class
Re-factor and convert the code in `src/display/pattern_helper.js` to use standard classes
2021-05-16 13:11:04 +02:00
Jonas Jenwald
8943bcd3c3 Account for formatting changes in Prettier version 2.3.0
With the exception of one tweaked `eslint-disable` comment, in `web/generic_scripting.js`, this patch was generated automatically using `gulp lint --fix`.

Please find additional information at:
 - https://github.com/prettier/prettier/releases/tag/2.3.0
 - https://prettier.io/blog/2021/05/09/2.3.0.html
2021-05-16 11:44:05 +02:00
Jonas Jenwald
a984431046 Modernize the ShadingIRs structure, in src/display/pattern_helper.js, to use standard classes
This patch replaces the old structure with an abstract base-class, which the new ShadingPattern classes then inherit from.
The old `createMeshCanvasClosure` can now be removed, since it's not necessary any more with modern JavaScript, and the `createMeshCanvas` function is now instead a method on the new `MeshShadingPattern` class (avoids unnecessary parameter passing).
2021-05-15 16:00:00 +02:00
Jonas Jenwald
40939d5955 Convert src/display/pattern_helper.js to use standard classes
Note that this patch only covers `TilingPattern`, since the `ShadingIRs`-implementation required additional re-factoring.
2021-05-15 13:03:07 +02:00
Jonas Jenwald
bb8e15c971 [api-minor] Update minimum supported browser versions (PR 13361 follow-up)
With the changes in PR 13361, we're now using the `CanvasPattern.setTransform()` method when rendering certain Shadings/Patterns.
Note that while `CanvasPattern` itself has been supported since basically "forever", its `setTransform` method is a slightly newer addition to the specification; please refer to https://developer.mozilla.org/en-US/docs/Web/API/CanvasPattern#browser_compatibility

Rather than trying to re-write PR 13361 to not use, or possibly spending time/effort (if possible) polyfilling, `CanvasPattern.setTransform()` this patch thus suggests that we simply update the *minimum* supported browser versions instead.

According to the compatibility data linked above, the *minimum* supported browser versions in the PDF.js library are now as follows:
 - Chrome >= 68, which was released on 2018-07-24.[1]
 - Firefox ESR, see https://wiki.mozilla.org/Release_Management/Calendar.
 - Safari >= 11.1, which was released on 2018-03-29.[2]

(Given that the PDF.js contributors cannot realistically test a bunch of old browsers, it's not unimaginable that some older browser versions are already not working with the PDF.js library.)

Based on these changes, which we should ensure are reflected in the Wiki as well, we can also remove a number of now redundant polyfills. Furthermore we'll no longer "claim" to support Windows XP, note the `gulpfile.js` changes, which should definitely *not* be an issue given that it's no longer officially supported.[3]

---
[1] According to https://en.wikipedia.org/wiki/Google_Chrome_version_history

[2] According to https://en.wikipedia.org/wiki/Safari_version_history#Safari_11

[3] According to https://en.wikipedia.org/wiki/Windows_XP#End_of_support
2021-05-15 09:57:34 +02:00
Tim van der Meij
d2e7161f2c
Merge pull request #13377 from Snuffleupagus/pattern-class
Re-factor and convert the code in `src/core/pattern.js` to use standard classes
2021-05-14 22:23:44 +02:00
Jonas Jenwald
ebe3ee4f25 Modernize the Shadings structure, in src/core/pattern.js, to use standard classes
This patch replaces the old structure with an abstract base-class, which the new RadialAxial/Mesh-shading classes then inherit from.[1]
The old `MeshClosure` can now be removed, since it's not necessary any more, and most of the functions inside of it are now instead methods on the new `MeshShading` class. This is particularly nice, in my opinion, since we previously were *manually* passing around a reference to the current `Mesh`-instance.

---
[1] If we want/need to, in the future, split e.g. the Mesh-handling into multiple classes that should now be easy to do.
2021-05-14 21:44:41 +02:00
Jonas Jenwald
6acb2db4be Convert src/core/pattern.js to use standard classes
Note that this patch only covers `Pattern` and `MeshStreamReader`, since the `Shadings`-implementation required additional re-factoring.
2021-05-14 21:42:21 +02:00
Calixte Denizet
f92e1fa160 Replace terminal null char with an endchar command in CFF charstrings to make OTS happy 2021-05-14 18:34:51 +02:00
Jonas Jenwald
612b43852b Remove unused properties from the Shadings-implementations in src/core/pattern.js
Neither the `type` nor the `cs` properties are used outside of the "constructors", and we can thus remove them.[1]
Note that a lot of this code is very old, and that it actually predates the main/worker-thread split before which the *same* file was used on both the main- *and* worker-threads.

---
[1] On the main-thread, a similar `type` property was removed in PR 12591.
2021-05-14 16:11:48 +02:00
Calixte Denizet
1a2cea21a5 Replace command with not enough args by an endchar in CFF font
- Right now, a glyph with an erroneous outline is replaced by an empty glyph.
    If the error is far enough from the start, there's likely something to render,
    so the idea is to replace a command that needs args with an endchar when no args are
    on the stack: this way OTS is likely happy (no remaining args on the stack) and we
    can draw something, which is likely better than nothing.
2021-05-14 13:45:45 +02:00
Jonas Jenwald
4248f0745c Improve the Page.content and Page.getContentStream methods
First of all, by using `Dict.getArray` in the `Page.content` getter we remove the need to manually iterate through and fetch the sub-streams (when they exist) in the `Page.getContentStream` method.
Secondly, we can simplify the code in `Page.{getOperatorList, extractTextContent}` by letting `Page.getContentStream` ensure that `content` is available and returning a Promise instead.
2021-05-14 11:47:34 +02:00
Jonas Jenwald
70113131de Inline the data lookup in the Dict.getArray method
Similar to the `get`/`getAsync` methods, this should be a *tiny* bit more efficient which cannot hurt considering that `getArray` is now used a lot more than when initially added.
2021-05-14 11:24:27 +02:00
Tim van der Meij
e394da5861
Merge pull request #13369 from brendandahl/smask-pattern
Fix tiling pattern with smask.
2021-05-13 13:26:38 +02:00
Jonas Jenwald
75208d36c2 Revert "Fix the remaining no-var failures, which couldn't be handled automatically, in the src/core/evaluator.js file" (PR 13344 follow-up)
This reverts commit 0ef9b5aafc, since it causes a lot of warnings (see below) *locally* with e.g. the document from issue 9627.
Strangely enough, this only occurs with `gulp server`-mode and the actual builds are apparently fine. It seems that this *may* be some unfortunate interaction with the old Babel-plugin that's used together with SystemJS.

```
Warning: getTextContent - ignoring ExtGState: "FormatError: ExtGState should be a dictionary.".
```

Rather than taking the risk that this could actually cover a more serious bug, and since I cannot immediately figure out what's wrong, it thus seems safest to revert this for now and we can (carefully) revisit this once SystemJS has been removed (see PR 12563).
2021-05-13 11:19:46 +02:00
Brendan Dahl
53991d0924 Fix tiling pattern with smask.
After drawing a tiling pattern we were not calling
endDrawing, which handles compositing any
active smasks.

Fixes #8565.
2021-05-12 11:42:08 -07:00
Tim van der Meij
ba99e54c66
Merge pull request #13361 from brendandahl/patterns-fixes
Fix several issues with radial/axial shadings and tiling patterns.
2021-05-12 20:27:37 +02:00
Tim van der Meij
1cf9f42ca2
Merge pull request #13366 from Snuffleupagus/primitives-class
Convert the remaining functions in `src/core/primitives.js` to use standard classes
2021-05-12 20:20:35 +02:00
Tim van der Meij
0a3e483c7f
Merge pull request #13360 from Snuffleupagus/renderer-conditional-pref
Only include the `renderer`-preference in builds where `SVGGraphics` is defined
2021-05-12 20:16:53 +02:00
Jonas Jenwald
64c55d381d Fix the Jbig2Image export for the gulp image_decoders build (PR 9729 follow-up, issue 13367) 2021-05-12 19:41:29 +02:00
Jonas Jenwald
757636d519 Convert the remaining functions in src/core/primitives.js to use standard classes
This patch was tested using the PDF file from issue 2618, i.e. https://bug570667.bugzilla-attachments.gnome.org/attachment.cgi?id=226471, with the following manifest file:
```
[
    {  "id": "issue2618",
       "file": "../web/pdfs/issue2618.pdf",
       "md5": "",
       "rounds": 50,
       "type": "eq"
    }
]
```

which gave the following results when comparing this patch against the `master` branch:
```
-- Grouped By browser, stat --
browser | stat         | Count | Baseline(ms) | Current(ms) | +/- |   %  | Result(P<.05)
------- | ------------ | ----- | ------------ | ----------- | --- | ---- | -------------
firefox | Overall      |    50 |         3417 |        3426 |   9 | 0.27 |
firefox | Page Request |    50 |            1 |           1 |   0 | 5.41 |
firefox | Rendering    |    50 |         3416 |        3426 |   9 | 0.27 |
```

Based on these results, there's no significant performance regression from using standard classes and this patch should thus be OK.
2021-05-12 09:36:28 +02:00
Brendan Dahl
ac44afa70e Fix several issues with radial/axial shadings and tiling patterns.
Previously, we set the base transformation and pattern matrix
directly to the main rendering ctx of the page, however doing this
caused the current transform to be lost. This would cause issues
with things like shear missing so the pattern was misaligned or when
stroke was used the scale of the line width or dash would be wrong.
Instead we should leave the current transform and use setTransform
on the pattern so it is applied correctly. For axial and radial shadings I had
to create a temporary canvas to draw the shading so I could in turn
use setTransform.

Fixes: #13325, #6769, #7847, #11018, #11597, #11473

The following already in the corpus are improved:
issue8078-page1
issue1877-page1
2021-05-11 16:32:24 -07:00
Jonas Jenwald
b068882bd0 Clean-up usage of the TESTING-define in src/pdf.sandbox.js
This patch moves the `PDFJSDev`-checks *inline*, similar to the rest of the code-base, such that the code in question is actually being removed from the *built* files in e.g. the official releases.
2021-05-11 12:39:33 +02:00
Jonas Jenwald
7548dc5ea2 Only include the renderer-preference in builds where SVGGraphics is defined
After PR 13117 it's now (finally) possible for *different* build targets to specify individual options/preferences, and we can utilize that to only expose the `renderer`-preference in builds where `SVGGraphics` is actually defined.
Note that for e.g. `MOZCENTRAL`-builds, trying to enable SVG-rendering will throw immediately and the preference thus doesn't make sense to include there.

Also, update the dummy `SVGGraphics` to use a class, tweak the `PDFJSDev`-check in `src/display/svg.js` to agree fully with the option/preference, and remove an unnecessary `eslint-disable`.
2021-05-10 12:03:53 +02:00
Jonas Jenwald
2ba4b65ca8 [api-minor] Remove the WebGL implementation
Reasons for the removal include:
 - This functionality was always somewhat experimental and has never been enabled by default, partly because of worries about rendering bugs caused by e.g. bad/outdated graphics drivers.

 - After the initial implementation, in PR 4286 (back in 2014), no additional functionality has been added to the WebGL implementation.

 - The vast majority of all documents do not benefit from WebGL rendering, since only a couple of *specific* features are supported (e.g. some Soft Masks and Patterns).

 - There is, and has always been, *zero* test-coverage for the WebGL implementation.

 - Overall performance, in the PDF.js library, has improved since the experimental WebGL implementation was added.

Rather than shipping unused *and* untested code, it seems reasonable to simply remove the WebGL implementation for now; thanks to version control it's always possible to bring back the code should the need ever arise.
2021-05-09 16:38:44 +02:00
Jonas Jenwald
6eef69de22 Export the "raw" toUnicode-data from PartialEvaluator.preEvaluateFont
Compared to other data-structures, such as e.g. `Dict`s, we're purposely *not* caching Streams on the `XRef`-instance.[1]
The, somewhat unfortunate, effect of Streams not being cached is that repeatedly getting the *same* Stream-data requires re-parsing/re-initializing of a bunch of data; see `XRef.fetch` and related methods.

For the font-parsing in particular we're currently fetching the `toUnicode`-data, which is very often a Stream, in `PartialEvaluator.preEvaluateFont` and then *again* in `PartialEvaluator.extractDataStructures` soon afterwards.
By instead letting `PartialEvaluator.preEvaluateFont` export the "raw" `toUnicode`-data, we can avoid *some* unnecessary re-parsing/re-initializing when handling fonts.
*Please note:* In this particular case, given that `PartialEvaluator.preEvaluateFont` only accesses the "raw" `toUnicode` data, exporting a Stream should be safe.

---
[1] The reasons for this include:
 - Streams, especially `DecodeStream`-instances, can become *very* large once read. Hence caching them really isn't a good idea simply because of the (potential) memory impact of doing so.

 - Attempting to read from the *same* Stream-instance more than once won't work, unless it's `reset` in between, since using any method such as e.g. `getBytes` always starts at the current data position.

 - Given that parsing, even in the worker-thread, is now fairly asynchronous it's generally impossible to assert that any one Stream-instance isn't being accessed "concurrently" by e.g. different `getOperatorList` calls. Hence `reset`-ing a cached Stream-instance isn't going to work in the general case.
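The second point in the footnote, that reading twice from the same Stream-instance needs a `reset` in between, can be shown with a toy class. `TinyStream` is hypothetical and far simpler than the real Stream classes.

```javascript
// Toy stream: getBytes() returns everything from the current position onwards
// and advances the position to the end of the data.
class TinyStream {
  constructor(bytes) {
    this.bytes = bytes;
    this.pos = 0;
  }

  getBytes() {
    const out = this.bytes.subarray(this.pos);
    this.pos = this.bytes.length;
    return out;
  }

  reset() {
    this.pos = 0;
  }
}
```

A second `getBytes()` call returns no data until `reset()` is called, which is why sharing one cached instance between asynchronous readers is fragile.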
2021-05-08 12:04:13 +02:00
Jonas Jenwald
13fb1654dc Export the firstChar/lastChar-data from PartialEvaluator.preEvaluateFont
Rather than re-fetching/re-parsing these properties immediately in `PartialEvaluator.translateFont`, we can simply export them instead. (Obviously the effect will be really tiny, but there is less parsing overall this way.)
2021-05-08 12:02:49 +02:00
Jonas Jenwald
8a1cb82aee Ensure that the Widths array is parsed correctly in PartialEvaluator.preEvaluateFont
*Please note:* While I don't have a document that this patch fixes, the current code is however not entirely correct as far as I can tell.

Looking at how the `Widths` array is parsed in `PartialEvaluator.extractWidths`, it's clear that the implementation in `PartialEvaluator.preEvaluateFont` is a bit too simplistic. In particular, by only wrapping the data into a TypedArray, there's no attempt to handle *indirect* objects which could potentially lead to colliding `hash`es being computed.
2021-05-07 21:23:44 +02:00
Jonas Jenwald
30b2739adf Ensure that composite/non-composite fonts won't get the same hash in PartialEvaluator.preEvaluateFont
To hopefully help prevent any future bugs, make sure that composite/non-composite fonts cannot accidentally get matching `hash`es. Given the differences between those font types, that's very unlikely to be useful or even correct in general.
2021-05-07 21:22:37 +02:00
Jonas Jenwald
fc59a5f709 Take the W array into account when computing the hash, in PartialEvaluator.preEvaluateFont, for composite fonts (issue 13343)
Without this some *composite* fonts may incorrectly end up with matching `hash`es, thus breaking rendering since we'll not actually try to load/parse some of the fonts.

*Please note:* Given that the document, in the referenced issue, doesn't embed *any* of its fonts there's no guarantee that it renders correctly in all configurations even with this patch.
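As a simplified illustration of why the width data and the font kind must feed into the hash, consider the sketch below. It uses a plain string key instead of a real hash function, and `fontHashKey` with its input shape is entirely hypothetical.

```javascript
// Sketch: builds a cache key from the font kind, the base font name and the
// width data. Leaving out `widths` would make fonts that differ only in their
// W array collide.
function fontHashKey({ composite, baseFont, widths }) {
  const tag = composite ? "composite" : "simple";
  return [tag, baseFont, widths.join(",")].join("|");
}
```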
2021-05-07 21:22:36 +02:00
Tim van der Meij
a3632c0f38
Merge pull request #13344 from Snuffleupagus/evaluator-no-var
Enable the `no-var` rule in the `src/core/evaluator.js` file
2021-05-07 21:02:46 +02:00
Tim van der Meij
5248d0a77d
Merge pull request #13338 from Snuffleupagus/images-class
Convert the `src/core/{jbig2, jpg, jpx}.js` files to use standard classes
2021-05-07 20:59:58 +02:00
Calixte Denizet
af125cd299 JS - Add support for display property
- in annotation_layer, move the handling of common properties into a shared method instead of having duplicated code in each widget.
2021-05-06 11:15:38 +02:00
Jonas Jenwald
0ef9b5aafc Fix the remaining no-var failures, which couldn't be handled automatically, in the src/core/evaluator.js file
The only *slight* complication here were some of the `switch`-cases, in `getOperatorList`/`getTextContent`, where the parsing is done asynchronously.
However, those cases are easy to deal with by wrapping the code within its own block; please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/switch#block-scope_variables_within_switch_statements
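A generic example of the block-scoping technique described on the linked MDN page; the function and its cases are made up and unrelated to the evaluator code.

```javascript
// Each case body is wrapped in its own block, so both cases can declare a
// `const name` without a "has already been declared" SyntaxError.
function describeOp(op) {
  switch (op) {
    case "a": {
      const name = "alpha";
      return name;
    }
    case "b": {
      const name = "beta";
      return name;
    }
    default:
      return "unknown";
  }
}
```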
2021-05-06 10:21:05 +02:00
Jonas Jenwald
f93c3b9aa7 Enable the no-var rule in the src/core/evaluator.js file
These changes were made automatically, using `gulp lint --fix`.
2021-05-06 09:39:21 +02:00
Jonas Jenwald
0a32ad3e42 Remove unnecessary closure in the src/core/font_renderer.js file
With modern JavaScript modules, where you explicitly list the properties that should be exported, it's no longer necessary to wrap *all* of the code within one file into a top-level closure.[1]

This patch reduces the size, of even the *built* `pdf.worker.js` file, since there's now a lot less unnecessary whitespace.

---
[1] For files which contain *different* functionality, some closures may however still make sense in order to separate the code.
It might be possible to remove some of those cases later, e.g. once private class fields become generally available/usable in browsers.
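For comparison, a small sketch of the two styles. Both definitions are invented for the example and behave the same; only the second matches the "standard classes" approach used in these patches.

```javascript
// Old style: a closure wrapping a constructor function and its prototype.
const OldStyle = (function OldStyleClosure() {
  // eslint-disable-next-line no-shadow
  function OldStyle(x) {
    this.x = x;
  }
  OldStyle.prototype.double = function () {
    return this.x * 2;
  };
  return OldStyle;
})();

// New style: a plain class; in a module, anything not exported stays private.
class NewStyle {
  constructor(x) {
    this.x = x;
  }

  double() {
    return this.x * 2;
  }
}
```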
2021-05-05 22:35:52 +02:00
Tim van der Meij
afb8c4fd25
Merge pull request #13327 from Snuffleupagus/split-fonts
Split the functionality in `src/core/fonts.js` into multiple files, and use standard classes
2021-05-05 20:16:24 +02:00
Jonas Jenwald
9a1758c6b8 Remove unnecessary closure in src/display/text_layer.js, and use standard classes
With modern JavaScript modules, where you explicitly list the properties that should be exported, it's no longer necessary to wrap all of the code in a closure.[1]

This patch also tries to clean-up/improve a couple of the existing JSDoc-comments.

---
[1] This reduces the size, even of the *built* `pdf.js` file, since there's now a lot less unnecessary whitespace.
2021-05-05 18:44:56 +02:00
Jonas Jenwald
ce14171cf0 Convert src/core/jpx.js to use standard classes
*Please note:* Ignoring whitespace-only changes is probably necessary in order to review this.
2021-05-05 14:02:21 +02:00
Jonas Jenwald
cb65b762eb Fix the remaining no-var failures, which couldn't be handled automatically, in the src/core/jpx.js file 2021-05-05 14:02:21 +02:00
Jonas Jenwald
a273599a12 Enable the no-var rule in the src/core/jpx.js file
These changes were made automatically, using `gulp lint --fix`.
2021-05-05 14:02:21 +02:00
Jonas Jenwald
69dea39a42 Convert src/core/jpg.js to use standard classes
*Please note:* Ignoring whitespace-only changes is probably necessary in order to review this.
2021-05-05 14:02:21 +02:00
Jonas Jenwald
d0a299713c Fix the remaining no-var failures, which couldn't be handled automatically, in the src/core/jpg.js file 2021-05-05 14:02:21 +02:00
Jonas Jenwald
1e5a179600 Enable the no-var rule in the src/core/jpg.js file
These changes were made automatically, using `gulp lint --fix`.
2021-05-05 14:02:21 +02:00
Jonas Jenwald
0addf3a0d4 Convert src/core/jbig2.js to use standard classes
*Please note:* Ignoring whitespace-only changes is probably necessary in order to review this.
2021-05-05 14:02:21 +02:00
Jonas Jenwald
d59c9ab3ab Fix the remaining no-var failures, which couldn't be handled automatically, in the src/core/jbig2.js file 2021-05-05 14:02:21 +02:00
Jonas Jenwald
7ca3a34e1f Enable the no-var rule in the src/core/jbig2.js file
These changes were made automatically, using `gulp lint --fix`.
2021-05-05 14:02:21 +02:00
Jonas Jenwald
99fae47c8e [Regression] Move the super-call in the PredictorStream-constructor to prevent errors (PR 13303)
*My apologies for breaking this; thankfully PR 13303 hasn't reached mozilla-central yet.*

It's (obviously) necessary to initialize a `PredictorStream`-instance fully, since otherwise breakage may occur if there are errors during the actual stream parsing.
To reproduce this issue, try opening the PDF document from issue 13051 locally and observe the following message in the console:
```
Warning: Invalid stream: "ReferenceError: this hasn't been initialised - super() hasn't been called"
```
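The underlying language rule can be reproduced with a generic example. The classes below are hypothetical and do not mirror the actual `PredictorStream` constructor.

```javascript
class BaseSketch {
  constructor(n) {
    this.n = n;
  }
}

// Touching `this` before super() throws a ReferenceError at runtime.
class BrokenSketch extends BaseSketch {
  constructor(n) {
    this.extra = 1;
    super(n);
  }
}

// Calling super() first leaves the instance fully initialized.
class FixedSketch extends BaseSketch {
  constructor(n) {
    super(n);
    this.extra = 1;
  }
}
```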
2021-05-05 13:24:12 +02:00
Calixte Denizet
3f29892d63 [JS] Fix several issues found in pdf in #13269
- app.alert and a few other functions can use an object as parameter ({cMsg: ...});
  - support app.alert with a question and a yes/no answer;
  - update field siblings when one is changed in an action;
  - stop calculation if calculate is set to false in the middle of calculations;
  - get a boolean for checkboxes when they've been set through annotationStorage instead of a string.
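The first item, accepting either positional arguments or a single object, might be handled along these lines. `normalizeAlertArgs` is a hypothetical helper, and only the `cMsg`, `nIcon` and `nType` parameters are considered in this sketch.

```javascript
// Sketch: accepts either alert(cMsg, nIcon, nType) or alert({cMsg, nIcon, nType})
// and returns one normalized object.
function normalizeAlertArgs(cMsg, nIcon = 0, nType = 0) {
  if (cMsg !== null && typeof cMsg === "object") {
    const obj = cMsg;
    cMsg = obj.cMsg;
    nIcon = obj.nIcon === undefined ? 0 : obj.nIcon;
    nType = obj.nType === undefined ? 0 : obj.nType;
  }
  return { cMsg: String(cMsg), nIcon, nType };
}
```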
2021-05-04 19:21:51 +02:00
Calixte Denizet
549aae6c3d JS -- add support for page property in field 2021-05-03 15:46:29 +02:00
Jonas Jenwald
5e5daca407 Remove unnecessary MissingDataException check from getHeaderBlock
It shouldn't be possible for the `getBytes`-call to throw a `MissingDataException`, since all resources are loaded *before* e.g. font-parsing ever starts; see f0817015bd/src/core/object_loader.js (L111-L126)

Furthermore, even if we'd *somehow* re-throw a `MissingDataException` here that still won't help considering where the `Type1Font`-instance is created. Note how in the `Font`-constructor we simply catch any errors and fallback to a standard font, which means that a `MissingDataException` would just lead to rendering errors anyway; see f0817015bd/src/core/fonts.js (L648-L691)

All-in-all, it's not possible for a `MissingDataException` to be thrown in `getHeaderBlock` and this code-path can thus be removed.
2021-05-03 13:57:30 +02:00
Jonas Jenwald
b487edd05d Convert src/core/fonts.js to use standard classes
Obviously the `Font`-class is still *very* large, given particularly how TrueType fonts are handled, however this patch-series at least improves things by moving a number of functions/classes into their own files.
As a follow-up it might make sense to try and re-factor/extract the TrueType parsing into its own file, since all of this code is quite old, however that's probably best left for another time.

For e.g. `gulp mozcentral`, the *built* `pdf.worker.js` file decreases from `1 620 332` to `1 617 466` bytes with this patch-series.
2021-05-03 13:57:25 +02:00
Jonas Jenwald
cadc20d8b9 Fix the remaining no-var failures, which couldn't be handled automatically, in the src/core/fonts.js file 2021-05-02 21:00:29 +02:00
Jonas Jenwald
b9cd080c01 Enable the no-var rule in the src/core/fonts.js file
These changes were made automatically, using `gulp lint --fix`.
Given the large size of this patch, the manual fixes are done separately in the next commit.
2021-05-02 21:00:29 +02:00
Jonas Jenwald
f64b7922b3 Convert src/core/type1_font.js to use standard classes 2021-05-02 21:00:29 +02:00
Jonas Jenwald
4bd69556ab Enable the no-var rule in the src/core/type1_font.js file
These changes were made *mostly* automatically, using `gulp lint --fix`, with the following manual changes:

```diff
diff --git a/src/core/type1_font.js b/src/core/type1_font.js
index 50a3e49e6..55a2005fb 100644
--- a/src/core/type1_font.js
+++ b/src/core/type1_font.js
@@ -38,10 +38,9 @@ const Type1Font = (function Type1FontClosure() {
     const scanLength = streamBytesLength - signatureLength;

     let i = startIndex,
-      j,
       found = false;
     while (i < scanLength) {
-      j = 0;
+      let j = 0;
       while (j < signatureLength && streamBytes[i + j] === signature[j]) {
         j++;
       }
@@ -248,14 +247,14 @@ const Type1Font = (function Type1FontClosure() {
         return charCodeToGlyphId;
       }

-      let glyphNames = [".notdef"],
-        glyphId;
+      const glyphNames = [".notdef"];
+      let builtInEncoding, glyphId;
       for (glyphId = 0; glyphId < charstrings.length; glyphId++) {
         glyphNames.push(charstrings[glyphId].glyphName);
       }
       const encoding = properties.builtInEncoding;
       if (encoding) {
-        var builtInEncoding = Object.create(null);
+        builtInEncoding = Object.create(null);
         for (const charCode in encoding) {
           glyphId = glyphNames.indexOf(encoding[charCode]);
           if (glyphId >= 0
```
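The `builtInEncoding` change in the diff above shows why some `var` declarations cannot be converted mechanically. A minimal sketch of the scoping difference follows; the function names and data are hypothetical and are not taken from the real `Type1Font` code:

```javascript
// `var` is function-scoped, so a declaration inside an `if` block is still
// visible after the block ends.
function withVar(encoding) {
  if (encoding) {
    var builtInEncoding = Object.create(null);
    builtInEncoding.source = encoding;
  }
  return builtInEncoding; // undefined when `encoding` is falsy
}

// `let` is block-scoped, so the declaration has to be moved manually to the
// enclosing scope in order to keep the same behavior.
function withLet(encoding) {
  let builtInEncoding;
  if (encoding) {
    builtInEncoding = Object.create(null);
    builtInEncoding.source = encoding;
  }
  return builtInEncoding;
}
```

An automatic fixer that only swapped the keyword inside the `if` block would leave the later use referring to an undeclared name, which is presumably why this case was handled by hand.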
2021-05-02 21:00:29 +02:00
Jonas Jenwald
ff85bcfc0e Move the Type1Font from src/core/fonts.js and into its own file 2021-05-02 21:00:29 +02:00
Jonas Jenwald
e803584fe7 Convert src/core/cff_font.js to use standard classes 2021-05-02 21:00:29 +02:00
Jonas Jenwald
542ee0d798 Enable the no-var rule in the src/core/cff_font.js file
These changes were made automatically, using `gulp lint --fix`.
2021-05-02 21:00:29 +02:00
Jonas Jenwald
d5d73e3168 Move the CFFFont from src/core/fonts.js and into its own file 2021-05-02 21:00:29 +02:00
Jonas Jenwald
d4606712f2 Enable the no-var rule in the src/core/fonts_utils.js file
These changes were made *mostly* automatically, using `gulp lint --fix`, with the following manual changes:

```diff
diff --git a/src/core/fonts_utils.js b/src/core/fonts_utils.js
index f88ce4a8c..c4b3f3808 100644
--- a/src/core/fonts_utils.js
+++ b/src/core/fonts_utils.js
@@ -167,8 +167,8 @@ function type1FontGlyphMapping(properties, builtInEncoding, glyphNames) {
   }

   // Lastly, merge in the differences.
-  let differences = properties.differences,
-    glyphsUnicodeMap;
+  const differences = properties.differences;
+  let glyphsUnicodeMap;
   if (differences) {
     for (charCode in differences) {
       const glyphName = differences[charCode];
```
2021-05-02 21:00:29 +02:00
Jonas Jenwald
77b258440b Move some constants and helper functions from src/core/fonts.js and into their own file
- `FontFlags`, is used in both `src/core/fonts.js` and `src/core/evaluator.js`.
 - `getFontType`, same as the above.
 - `MacStandardGlyphOrdering`, is a fairly large data-structure and `src/core/fonts.js` is already a *very* large file.
 - `recoverGlyphName`, a dependency of `type1FontGlyphMapping`; please see below.
 - `SEAC_ANALYSIS_ENABLED`, is used by `Type1Font`, `CFFFont`, and unit-tests; please see below.
 - `type1FontGlyphMapping`, is used by both `Type1Font` and `CFFFont` which a later patch will move to their own files.
2021-05-02 21:00:29 +02:00
Jonas Jenwald
22539b52fa Convert src/core/to_unicode_map.js to use standard classes 2021-05-02 21:00:29 +02:00
Jonas Jenwald
33ea6b1131 Enable the no-var rule in the src/core/to_unicode_map.js file
These changes were made automatically, using `gulp lint --fix`.
2021-05-02 21:00:29 +02:00
Jonas Jenwald
6912bb5e0a Move the IdentityToUnicodeMap/ToUnicodeMap from src/core/fonts.js and into its own file 2021-05-02 21:00:29 +02:00
Jonas Jenwald
8c1d1a58f7 Convert src/core/opentype_file_builder.js to use standard classes 2021-05-02 21:00:28 +02:00
Jonas Jenwald
1808b2dc96 Enable the no-var rule in the src/core/opentype_file_builder.js file
These changes were made automatically, using `gulp lint --fix`.
2021-05-02 21:00:28 +02:00
Jonas Jenwald
a783c7ca79 Move the OpenTypeFileBuilder from src/core/fonts.js and into its own file 2021-05-02 21:00:28 +02:00
Tim van der Meij
af9feb1307
Merge pull request #13321 from timvandermeij/src-core-no-var
Enable the `no-var` linting rule in `src/core/{crypto,function}.js`
2021-05-02 13:45:33 +02:00
Tim van der Meij
b661cf2b80
Fix no-var linting rule violations in src/core/crypto.js that couldn't be changed automatically by ESLint
This is done in a separate commit due to the required number of changes
so that reviewing is easier than in a plain-text diff in the commit
message.
2021-05-02 13:32:34 +02:00
Tim van der Meij
1f8b452354
Enable the no-var linting rule in src/core/crypto.js
This is done automatically with `gulp lint --fix`.
2021-05-01 20:34:35 +02:00
Tim van der Meij
58e568fe62
Enable the no-var linting rule in src/core/function.js
This is done automatically with `gulp lint --fix` and the following
manual changes:

```diff
diff --git a/src/core/function.js b/src/core/function.js
index 878001057..b7e3e6ccf 100644
--- a/src/core/function.js
+++ b/src/core/function.js
@@ -131,7 +131,7 @@ function toNumberArray(arr) {
   return arr;
 }

-var PDFFunction = (function PDFFunctionClosure() {
+const PDFFunction = (function PDFFunctionClosure() {
   const CONSTRUCT_SAMPLED = 0;
   const CONSTRUCT_INTERPOLATED = 2;
   const CONSTRUCT_STICHED = 3;
@@ -484,7 +484,9 @@ var PDFFunction = (function PDFFunctionClosure() {
         // clip to domain
         const v = clip(src[srcOffset], domain[0], domain[1]);
         // calculate which bound the value is in
-        for (var i = 0, ii = bounds.length; i < ii; ++i) {
+        const length = bounds.length;
+        let i;
+        for (i = 0; i < length; ++i) {
           if (v < bounds[i]) {
             break;
           }
@@ -673,23 +675,21 @@ const PostScriptStack = (function PostScriptStackClosure() {
     roll(n, p) {
       const stack = this.stack;
       const l = stack.length - n;
-      let r = stack.length - 1,
-        c = l + (p - Math.floor(p / n) * n),
-        i,
-        j,
-        t;
-      for (i = l, j = r; i < j; i++, j--) {
-        t = stack[i];
+      const r = stack.length - 1;
+      const c = l + (p - Math.floor(p / n) * n);
+
+      for (let i = l, j = r; i < j; i++, j--) {
+        const t = stack[i];
         stack[i] = stack[j];
         stack[j] = t;
       }
-      for (i = l, j = c - 1; i < j; i++, j--) {
-        t = stack[i];
+      for (let i = l, j = c - 1; i < j; i++, j--) {
+        const t = stack[i];
         stack[i] = stack[j];
         stack[j] = t;
       }
-      for (i = c, j = r; i < j; i++, j--) {
-        t = stack[i];
+      for (let i = c, j = r; i < j; i++, j--) {
+        const t = stack[i];
         stack[i] = stack[j];
         stack[j] = t;
       }
@@ -939,7 +939,7 @@ class PostScriptEvaluator {
 // We can compile most of such programs, and at the same moment, we can
 // optimize some expressions using basic math properties. Keeping track of
 // min/max values will allow us to avoid extra Math.min/Math.max calls.
-var PostScriptCompiler = (function PostScriptCompilerClosure() {
+const PostScriptCompiler = (function PostScriptCompilerClosure() {
   class AstNode {
     constructor(type) {
       this.type = type;
```
2021-05-01 20:04:58 +02:00
Jonas Jenwald
90b5fcb8e0 Remove unnecessary TypedArray re-initialization in FontFaceObject.createFontFaceRule
The `this.data` property is, when defined, sent from the worker-thread as a `Uint8Array` and there's thus no reason to re-initialize the TypedArray here.
Note also the `FontFaceObject.createNativeFontFace` method just above, where we simply use `this.data` as-is.

The explanation for this code looking like it does is, as is often the case, for historical reasons. Originally we only supported `@font-face`, before the Font Loading API existed, and back then we also polyfilled TypedArrays (using regular Arrays) which should explain this particular line of code.
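
To illustrate what the removed line cost: passing an existing `Uint8Array` to the `Uint8Array` constructor allocates a new buffer and copies every byte. A small sketch with made-up data (this is not the actual `FontFaceObject` code):

```javascript
const data = new Uint8Array([0x4f, 0x54, 0x54, 0x4f]);

// Re-initializing creates an independent copy of the bytes...
const copy = new Uint8Array(data);
copy[0] = 0;

// ...whereas using the array as-is involves no allocation at all.
const same = data;
```

Since the data already arrives as a `Uint8Array`, the copy adds work without changing the result.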
2021-05-01 19:20:36 +02:00
Jonas Jenwald
3624f9eac7 Add a new BaseStream.getString(...) method to replace manual bytesToString(BaseStream.getBytes(...)) calls
Given that the `bytesToString(BaseStream.getBytes(...))` pattern is somewhat common throughout the `src/core/` code, it cannot hurt to add a new `BaseStream`-method which handles that case internally.
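
A minimal sketch of the idea, assuming a simplified stream; `SimpleStream` and the local `bytesToString` stand-in are hypothetical and only mirror the shape of the real helpers:

```javascript
// Stand-in for the real `bytesToString` helper (assumption: it maps each
// byte to the character with the same code).
function bytesToString(bytes) {
  let str = "";
  for (const b of bytes) {
    str += String.fromCharCode(b);
  }
  return str;
}

// Hypothetical minimal stream, only meant to show the shape of the API.
class SimpleStream {
  constructor(bytes) {
    this.bytes = bytes;
    this.pos = 0;
  }

  getBytes(length) {
    const end = Math.min(this.pos + length, this.bytes.length);
    const chunk = this.bytes.subarray(this.pos, end);
    this.pos = end;
    return chunk;
  }

  // The convenience method: callers no longer combine the two steps.
  getString(length) {
    return bytesToString(this.getBytes(length));
  }
}

const stream = new SimpleStream(new Uint8Array([37, 80, 68, 70, 45])); // "%PDF-"
const header = stream.getString(4);
```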
2021-05-01 19:20:36 +02:00
Tim van der Meij
f6f335173d
Merge pull request #13303 from Snuffleupagus/BaseStream
Add an abstract base-class, which all the various Stream implementations inherit from
2021-05-01 19:13:36 +02:00
calixteman
af4dc55019
[api-minor] Fix the way to chunk the strings (#13257)
- Improve chunking in order to fix some bugs where spaces are missing:
    * track the last position where a glyph has been drawn;
    * when a new glyph (first glyph in a chunk) is added, compare its position with the last saved one and add a space or a break:
      - there are multiple ways to move the glyphs, and to avoid having to deal with all the different possibilities it's much easier to just compare positions;
      - and so there is now one function (i.e. "compareWithLastPosition") where all of the work is done.
  - Add some breaks in order to get lines;
  - Remove the multiple white spaces:
    * some spaces were filled with several white spaces, which makes it harder to find some sequences of words using the search tool;
    * other PDF readers replace spaces with one white space.
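
The position-comparison idea described in the list above can be sketched roughly as follows. This is a simplified illustration, not the actual `compareWithLastPosition` code; the function name, the thresholds, and the input format are all assumptions:

```javascript
// Hypothetical thresholds, expressed as fractions of the font size.
const SPACE_FACTOR = 0.3;
const LINE_FACTOR = 0.5;

function joinGlyphRuns(runs, fontSize) {
  let text = "";
  let last = null; // position just after the last drawn glyph

  for (const { str, x, y, width } of runs) {
    if (last !== null) {
      if (Math.abs(y - last.y) > LINE_FACTOR * fontSize) {
        text += "\n"; // moved to a different line
      } else if (x - last.x > SPACE_FACTOR * fontSize) {
        text += " "; // horizontal gap wide enough to count as a space
      }
    }
    text += str;
    last = { x: x + width, y };
  }
  // Collapse runs of spaces into a single one.
  return text.replace(/ {2,}/g, " ");
}

const result = joinGlyphRuns(
  [
    { str: "Hello", x: 0, y: 100, width: 50 },
    { str: "world", x: 55, y: 100, width: 50 },
    { str: "next  line", x: 0, y: 88, width: 80 },
  ],
  10
);
```

Comparing positions keeps the logic independent of which operator moved the text cursor, which is the motivation given in the list.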

Update src/core/evaluator.js

Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>
2021-04-30 14:41:13 +02:00
Jonas Jenwald
2ac4ad3111 Let ChunkedStream extend Stream, rather than BaseStream directly
Looking at the `ChunkedStream` implementation, it's basically a "regular" `Stream` but with added functionality in order to deal with fetching/loading of missing data.
Hence, by letting `ChunkedStream` extend `Stream`, we can remove some duplicate methods from the `ChunkedStream` class.
2021-04-28 14:05:25 +02:00
Jonas Jenwald
fb0775525e Stop special-casing the dict parameter in the Jbig2Stream/JpegStream/JpxStream constructors
For all of the other `DecodeStream`s we're not passing in a `Dict`-instance manually, but instead get it from the `stream`-parameter. Hence there's no particularly good reason, as far as I can tell, to not do the same thing in `Jbig2Stream`/`JpegStream`/`JpxStream` as well.
2021-04-28 13:44:47 +02:00
Jonas Jenwald
67a1cfc1b1 Improve the handling of getBaseStreams, on the various Stream implementations
The way that `getBaseStreams` is currently handled has bothered me from time to time, especially how we're checking if the method exists before calling it.
By adding a dummy `BaseStream.getBaseStreams` method, and having the call-sites simply check the return value, we can improve some of the relevant code.

Note in particular how the `ObjectLoader._walk` method didn't actually check that the data in question is a Stream instance, and instead only checked the `currentNode` (which could be anything) for the existence of a `getBaseStreams` property.
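
A rough sketch of the pattern described above, with hypothetical `LeafStream`/`WrapperStream` classes and a hypothetical `collectBaseStreams` call-site (the real classes have far more functionality):

```javascript
class BaseStream {
  // Default for streams that do not wrap other streams.
  getBaseStreams() {
    return null;
  }
}

class LeafStream extends BaseStream {}

// Hypothetical wrapper that is built on top of other streams.
class WrapperStream extends BaseStream {
  constructor(streams) {
    super();
    this.streams = streams;
  }

  getBaseStreams() {
    return this.streams;
  }
}

// Call-site: check the type first, then the return value; there is no need
// to probe for the existence of the method.
function collectBaseStreams(node) {
  if (!(node instanceof BaseStream)) {
    return [];
  }
  return node.getBaseStreams() || [];
}
```

With the type check in place, an arbitrary object that merely happens to have a `getBaseStreams` property is no longer mistaken for a stream.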
2021-04-28 13:44:47 +02:00
Jonas Jenwald
67415bfabe Add an abstract base-class, which all the various Stream implementations inherit from
By having an abstract base-class, it becomes a lot clearer exactly which methods/getters are expected to exist on all Stream instances.
Furthermore, since a number of the methods are *identical* for all Stream implementations, this reduces unnecessary code duplication in the `Stream`, `DecodeStream`, and `ChunkedStream` classes.

For e.g. `gulp mozcentral`, the *built* `pdf.worker.js` file decreases from `1 619 329` to `1 616 115` bytes with this patch-series.
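
As an illustration of the abstract base-class approach, here is a minimal sketch. The `ArrayStream` subclass and the exact error messages are assumptions made for this example and do not reproduce the real implementation:

```javascript
class BaseStream {
  constructor() {
    // Guard against direct instantiation of the abstract class.
    if (this.constructor === BaseStream) {
      throw new Error("Cannot initialize BaseStream.");
    }
  }

  // Must be provided by every subclass.
  getByte() {
    throw new Error("Abstract method `getByte` called");
  }

  // Shared implementation, written once instead of in every stream class.
  getUint16() {
    const b0 = this.getByte();
    const b1 = this.getByte();
    if (b0 === -1 || b1 === -1) {
      return -1;
    }
    return (b0 << 8) + b1;
  }
}

// Hypothetical concrete stream backed by a plain array of bytes.
class ArrayStream extends BaseStream {
  constructor(bytes) {
    super();
    this.bytes = bytes;
    this.pos = 0;
  }

  getByte() {
    return this.pos < this.bytes.length ? this.bytes[this.pos++] : -1;
  }
}

const value = new ArrayStream([0x01, 0x02]).getUint16();
```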
2021-04-28 13:44:45 +02:00
Jonas Jenwald
6151b4ecac Convert src/core/stream.js to use standard classes 2021-04-28 13:44:10 +02:00
Jonas Jenwald
29cf415a69 Enable the no-var rule in the src/core/stream.js file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
b11f012e52 Convert src/core/decode_stream.js to use standard classes 2021-04-28 10:16:51 +02:00
Jonas Jenwald
8ce2cae4a7 Enable the no-var rule in the src/core/decode_stream.js file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
30a22a168d Move the DecodeStream and StreamsSequenceStream from src/core/stream.js and into its own file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
213e1c389c Convert src/core/flate_stream.js to use standard classes 2021-04-28 10:16:51 +02:00
Jonas Jenwald
aa1deaf93c Enable the no-var rule in the src/core/flate_stream.js file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
1e5bf352a5 Move the FlateStream from src/core/stream.js and into its own file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
40c342ec6c Convert src/core/predictor_stream.js to use standard classes 2021-04-28 10:16:51 +02:00
Jonas Jenwald
b08f9a8182 Enable the no-var rule in the src/core/predictor_stream.js file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
66d9d83dcb Move the PredictorStream from src/core/stream.js and into its own file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
e938c05edb Convert src/core/decrypt_stream.js to use standard classes 2021-04-28 10:16:51 +02:00
Jonas Jenwald
a9476e7dd0 Enable the no-var rule in the src/core/decrypt_stream.js file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
28b0809e60 Move the DecryptStream from src/core/stream.js and into its own file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
cdb583b764 Convert src/core/ascii_85_stream.js to use standard classes 2021-04-28 10:16:51 +02:00
Jonas Jenwald
f6c7a65202 Enable the no-var rule in the src/core/ascii_85_stream.js file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
3294d4d5a3 Move the Ascii85Stream from src/core/stream.js and into its own file 2021-04-28 10:16:51 +02:00
Jonas Jenwald
d2227a7d10 Convert src/core/ascii_hex_stream.js to use standard classes 2021-04-28 10:16:50 +02:00
Jonas Jenwald
59591f8788 Enable the no-var rule in the src/core/ascii_hex_stream.js file 2021-04-28 10:16:50 +02:00
Jonas Jenwald
d63df04854 Move the AsciiHexStream from src/core/stream.js and into its own file 2021-04-28 10:16:50 +02:00
Jonas Jenwald
704514c7cd Convert src/core/run_length_stream.js to use standard classes 2021-04-28 10:16:50 +02:00
Jonas Jenwald
66b898eb58 Enable the no-var rule in the src/core/run_length_stream.js file 2021-04-28 10:16:50 +02:00
Jonas Jenwald
342b0c1bbc Move the RunLengthStream from src/core/stream.js and into its own file 2021-04-28 10:16:50 +02:00
Jonas Jenwald
1f0685cee6 Convert src/core/lzw_stream.js to use standard classes 2021-04-28 10:16:50 +02:00
Jonas Jenwald
1f9b134c6a Enable the no-var rule in the src/core/lzw_stream.js file 2021-04-28 10:16:50 +02:00
Jonas Jenwald
6c1a321500 Move the LZWStream from src/core/stream.js and into its own file 2021-04-28 10:16:50 +02:00
Tim van der Meij
0acd801b1e
Merge pull request #13305 from timvandermeij/annotation-polygon-polyline-no-appearance-stream
Implement rendering polyline/polygon annotations without appearance stream
2021-04-27 20:03:35 +02:00
Tim van der Meij
60ab15427f
Implement rendering polyline/polygon annotations without appearance stream 2021-04-27 19:02:20 +02:00
Jonas Jenwald
0ecb42f4d7 Convert src/core/jpx_stream.js to use standard classes 2021-04-27 13:29:09 +02:00
Jonas Jenwald
c51ef1f21f Convert src/core/jbig2_stream.js to use standard classes 2021-04-27 13:29:09 +02:00
Jonas Jenwald
d9c1bf96b6 Convert src/core/jpeg_stream.js to use standard classes 2021-04-27 13:29:09 +02:00
Jonas Jenwald
0ca63f94b4 Convert src/core/ccitt_stream.js to use standard classes 2021-04-27 13:29:09 +02:00
Jonas Jenwald
8ff213871b Convert src/core/ccitt.js to use standard classes
Given that we're using modules, meaning that only explicitly `export`ed things are visible to the outside, it's no longer necessary to wrap all of the code in a closure.
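
The difference between the two styles can be sketched as follows; `OldDecoder` and `NewDecoder` are hypothetical names used only for this comparison:

```javascript
// Old pattern: a closure simulates a class and hides private helpers.
const OldDecoder = (function OldDecoderClosure() {
  const EOF = -1; // private to the closure

  function OldDecoder(bytes) {
    this.bytes = bytes;
    this.pos = 0;
  }
  OldDecoder.prototype = {
    readNext() {
      return this.pos < this.bytes.length ? this.bytes[this.pos++] : EOF;
    },
  };
  return OldDecoder;
})();

// With modules, top-level declarations are private unless exported, so the
// wrapper is unnecessary and a standard class can be used directly.
const EOF = -1;

class NewDecoder {
  constructor(bytes) {
    this.bytes = bytes;
    this.pos = 0;
  }

  readNext() {
    return this.pos < this.bytes.length ? this.bytes[this.pos++] : EOF;
  }
}
```

Both variants behave the same for callers; the class form simply drops the wrapper function and the manual `prototype` assignment.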
2021-04-27 13:29:09 +02:00
Tim van der Meij
ca668587c6
Merge pull request #13300 from Snuffleupagus/canvas-class
Convert the code in `src/display/canvas.js` to use standard classes
2021-04-27 13:19:36 +02:00
Jonas Jenwald
6f4394fcd8
Support InkAnnotations without appearance streams (issue 13298) (#13301)
For now, we keep things purposely simple by using straight lines (rather than curves); please see https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G11.2096579
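
A minimal sketch of the straight-line approach, assuming each ink path is given as a flat array of coordinates; the helper name is hypothetical, while `m`, `l`, and `S` are the standard PDF moveto, lineto, and stroke operators:

```javascript
// Hypothetical helper: turns one ink path, given as a flat
// [x0, y0, x1, y1, ...] array, into PDF path operators using straight
// line segments only.
function inkPathToOperators(points) {
  const ops = [];
  for (let i = 0; i + 1 < points.length; i += 2) {
    const op = i === 0 ? "m" : "l"; // moveto for the first point, lineto after
    ops.push(`${points[i]} ${points[i + 1]} ${op}`);
  }
  ops.push("S"); // stroke the path
  return ops.join("\n");
}

const content = inkPathToOperators([10, 20, 30, 40, 50, 20]);
```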
2021-04-27 11:49:03 +02:00
Jonas Jenwald
e6601f4582 Convert the code in src/display/canvas.js to use standard classes
This gets rid of *a lot* of boilerplate that stems from our old way of simulating classes, and it actually reduces the filesize noticeably.
For e.g. `gulp mozcentral`, the *built* `pdf.js` file decreases from `318 404` to `314 722` bytes (~1 percent) with this patch.
2021-04-26 22:10:38 +02:00
Tim van der Meij
270e56dae8
Enable the no-var linting rule in src/core/image.js
This is done automatically with `gulp lint --fix` and the following
manual changes:

```diff
diff --git a/src/core/image.js b/src/core/image.js
index 35c06b8ab..e718b9937 100644
--- a/src/core/image.js
+++ b/src/core/image.js
@@ -97,7 +97,7 @@ class PDFImage {
     if (isName(filter)) {
       switch (filter.name) {
         case "JPXDecode":
-          var jpxImage = new JpxImage();
+          const jpxImage = new JpxImage();
           jpxImage.parseImageProperties(image.stream);
           image.stream.reset();
```
2021-04-25 17:40:00 +02:00
Tim van der Meij
16efd09c9f
Enable the no-var linting rule in src/core/worker.js
This is done automatically with `gulp lint --fix` and the following
manual changes:

```diff
diff --git a/src/core/worker.js b/src/core/worker.js
index aec9c1d39..f88691622 100644
--- a/src/core/worker.js
+++ b/src/core/worker.js
@@ -300,7 +300,7 @@ class WorkerMessageHandler {
         cachedChunks = [];
       };
       const readPromise = new Promise(function (resolve, reject) {
-        var readChunk = function ({ value, done }) {
+        const readChunk = function ({ value, done }) {
           try {
             ensureNotTerminated();
             if (done) {
```
2021-04-25 17:40:00 +02:00
Tim van der Meij
85659b4cf0
Enable the no-var linting rule in src/core/cmap.js
This is done automatically with `gulp lint --fix` and the following
manual changes:

```diff
diff --git a/src/core/cmap.js b/src/core/cmap.js
index 850275a19..8794726dd 100644
--- a/src/core/cmap.js
+++ b/src/core/cmap.js
@@ -519,8 +519,8 @@ const BinaryCMapReader = (function BinaryCMapReaderClosure() {

     readHexNumber(num, size) {
       let last;
-      let stack = this.tmpBuf,
-        sp = 0;
+      const stack = this.tmpBuf;
+      let sp = 0;
       do {
         const b = this.readByte();
         if (b < 0) {
@@ -603,7 +603,6 @@ const BinaryCMapReader = (function BinaryCMapReaderClosure() {

         const ucs2DataSize = 1;
         const subitemsCount = stream.readNumber();
-        var i;
         switch (type) {
           case 0: // codespacerange
             stream.readHex(start, dataSize);
@@ -614,7 +613,7 @@ const BinaryCMapReader = (function BinaryCMapReaderClosure() {
               hexToInt(start, dataSize),
               hexToInt(end, dataSize)
             );
-            for (i = 1; i < subitemsCount; i++) {
+            for (let i = 1; i < subitemsCount; i++) {
               incHex(end, dataSize);
               stream.readHexNumber(start, dataSize);
               addHex(start, end, dataSize);
@@ -633,7 +632,7 @@ const BinaryCMapReader = (function BinaryCMapReaderClosure() {
             addHex(end, start, dataSize);
             stream.readNumber(); // code
             // undefined range, skipping
-            for (i = 1; i < subitemsCount; i++) {
+            for (let i = 1; i < subitemsCount; i++) {
               incHex(end, dataSize);
               stream.readHexNumber(start, dataSize);
               addHex(start, end, dataSize);
@@ -647,7 +646,7 @@ const BinaryCMapReader = (function BinaryCMapReaderClosure() {
             stream.readHex(char, dataSize);
             code = stream.readNumber();
             cMap.mapOne(hexToInt(char, dataSize), code);
-            for (i = 1; i < subitemsCount; i++) {
+            for (let i = 1; i < subitemsCount; i++) {
               incHex(char, dataSize);
               if (!sequence) {
                 stream.readHexNumber(tmp, dataSize);
@@ -667,7 +666,7 @@ const BinaryCMapReader = (function BinaryCMapReaderClosure() {
               hexToInt(end, dataSize),
               code
             );
-            for (i = 1; i < subitemsCount; i++) {
+            for (let i = 1; i < subitemsCount; i++) {
               incHex(end, dataSize);
               if (!sequence) {
                 stream.readHexNumber(start, dataSize);
@@ -692,7 +691,7 @@ const BinaryCMapReader = (function BinaryCMapReaderClosure() {
               hexToInt(char, ucs2DataSize),
               hexToStr(charCode, dataSize)
             );
-            for (i = 1; i < subitemsCount; i++) {
+            for (let i = 1; i < subitemsCount; i++) {
               incHex(char, ucs2DataSize);
               if (!sequence) {
                 stream.readHexNumber(tmp, ucs2DataSize);
@@ -717,7 +716,7 @@ const BinaryCMapReader = (function BinaryCMapReaderClosure() {
               hexToInt(end, ucs2DataSize),
               hexToStr(charCode, dataSize)
             );
-            for (i = 1; i < subitemsCount; i++) {
+            for (let i = 1; i < subitemsCount; i++) {
               incHex(end, ucs2DataSize);
               if (!sequence) {
                 stream.readHexNumber(start, ucs2DataSize);
```
2021-04-25 17:40:00 +02:00
Jonas Jenwald
4078dd856c Clear some Arrays, rather than re-initialize them, in src/display/-code
It's generally better to re-use the same Array, by clearing out all of its elements, rather than creating a new Array.
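
For example, setting `length` to zero empties an Array in place, so every existing reference keeps pointing at the same (now empty) object. The variable names below are illustrative only:

```javascript
const queue = [1, 2, 3];
const alias = queue; // another reference to the same Array

// Clearing in place keeps the existing Array object (and every reference
// to it) instead of allocating a new one.
queue.length = 0;
```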
2021-04-24 13:00:53 +02:00
Jonas Jenwald
da22146b95 Replace a bunch of Array.prototype.forEach() cases with for...of loops instead
Using `for...of` is a modern and generally much nicer pattern, since it gets rid of unnecessary callback-functions. (In a couple of spots, a "regular" `for` loop had to be used.)
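
A small before/after sketch with made-up data:

```javascript
const pages = ["a", "b", "c"];

// Before: one callback invocation per element, and no way to stop early.
const visitedBefore = [];
pages.forEach(function (page) {
  visitedBefore.push(page);
});

// After: a plain loop, which also supports `break`/`continue`/`return`.
const visitedAfter = [];
for (const page of pages) {
  if (page === "c") {
    break;
  }
  visitedAfter.push(page);
}
```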
2021-04-24 13:00:19 +02:00
Tim van der Meij
da0e7ea969
Merge pull request #13272 from calixteman/issue13271
Update all the text widgets having the same name with the same value
2021-04-23 21:08:54 +02:00
Tim van der Meij
a6e3ad4c72
Merge pull request #13283 from Snuffleupagus/NameOrNumberTree-getAll-map
Change `NameOrNumberTree.getAll` to return a `Map` rather than an Object
2021-04-23 20:53:52 +02:00
calixteman
762cfd2d1b
[JS] Use heap allocation when initializing quickjs sandbox (#13286)
- In case of large string the sandbox initialization failed because of an OOM
    * so allocate a new string in the heap
    * and free it after use.
  - it requires a quickjs update since we need to export some symbols (stringToNewUTF8 and free).
2021-04-23 12:04:14 +02:00
Jonas Jenwald
4ec0a4fb43 Re-factor the Catalog._collectJavaScript method slightly
This patch first of all moves all checking/validation into the `appendIfJavaScriptDict` function, to avoid duplicating it in multiple places. Secondly, also removes what's now an outdated/incorrect comment since we have implemented scripting support.
2021-04-23 09:42:32 +02:00
Jonas Jenwald
83f7009e4b Change NameOrNumberTree.getAll to return a Map rather than an Object
Given that we're (almost) always iterating through the result of the `getAll`-calls, using a `Map` seems nicer overall since it's more suited to iteration compared to a regular Object.

Also, add a couple of `Dict`-checks in existing code touched by this patch, since it really cannot hurt to prevent *potential* errors in a corrupt PDF document.
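
To show why a `Map` is convenient for this use, here is a sketch with a made-up `getAll`-style result (the keys and values are hypothetical):

```javascript
// Hypothetical result of a `getAll` call, with made-up destination names.
const asObject = { chapter1: [0, "Fit"], chapter2: [3, "Fit"] };
const asMap = new Map(Object.entries(asObject));

// Object: iteration needs a helper such as `Object.entries` (or `for...in`).
const fromObject = [];
for (const [key, value] of Object.entries(asObject)) {
  fromObject.push(`${key}:${value[0]}`);
}

// Map: directly iterable, in insertion order, and with a `size` property.
const fromMap = [];
for (const [key, value] of asMap) {
  fromMap.push(`${key}:${value[0]}`);
}
```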
2021-04-22 13:15:50 +02:00
Jonas Jenwald
57a1ea840f
Ensure that saveDocument works if there's no /ID-entry in the PDF document (issue 13279) (#13280)
First of all, while it should be very unlikely that the /ID-entry is an *indirect* object, note how we're using `Dict.get` when parsing it e.g. in `PDFDocument.fingerprint`. Hence we definitely should be consistent here, since if the /ID-entry is an *indirect* object the existing code in `src/core/writer.js` would already fail.
Secondly, to fix the referenced issue, we also need to check that the /ID-entry actually is an Array before attempting to access its contents in `src/core/writer.js`.

*Drive-by change:* In the `xrefInfo` object passed to the `incrementalUpdate` function, re-name the `encrypt` property to `encryptRef` since its data is fetched using `Dict.getRaw` (given the names of the other properties fetched similarly).
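
The defensive check can be sketched like this. `getFileIds` is a hypothetical helper, and a plain `Map` stands in for a `Dict`-like object whose `get` method would resolve indirect objects; the real code in `src/core/writer.js` is structured differently:

```javascript
// Hypothetical helper; `trailerDict` only needs a `get(key)` method here.
function getFileIds(trailerDict) {
  const idArray = trailerDict.get("ID");
  if (!Array.isArray(idArray) || idArray.length === 0) {
    return []; // no usable /ID-entry, so the caller has to cope without one
  }
  return idArray.filter(id => typeof id === "string");
}

const withId = new Map([["ID", ["abc", "def"]]]);
const withoutId = new Map();
```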
2021-04-22 12:08:56 +02:00
Brendan Dahl
066cbcfb27
Merge pull request #13277 from Snuffleupagus/adjustToUnicode-cff
For CFF fonts without proper `ToUnicode`/`Encoding` data, utilize the "charset"/"Encoding"-data from the font file to improve text-selection (issue 13260)
2021-04-21 10:41:36 -07:00
Brendan Dahl
5231d922ec
Add presentation role to text layer spans. (#13278)
Keeps screen readers from pausing on every span so
paragraphs are read more naturally. Note: this only seems
to affect Firefox, Chrome automatically combines the spans.
2021-04-21 10:47:51 +02:00
Jonas Jenwald
7fab73ed23 For CFF fonts without proper ToUnicode/Encoding data, utilize the "charset"/"Encoding"-data from the font file to improve text-selection (issue 13260)
This patch extends the approach, implemented in PR 7550, to also apply to CFF fonts.
2021-04-20 20:48:44 +02:00
Jonas Jenwald
8f6543c218 Ensure that the /Properties, used with optional content, is actually loaded *before* parsing the operatorList/textContent (PR 12095 follow-up)
By not waiting for the /Properties to load, before parsing of the operatorList/textContent starts, there's a very real risk that a `MissingDataException` will be thrown when trying to access the data in the `PartialEvaluator.parseMarkedContentProps` method.
If this ever happens it will thus lead to incomplete and/or outright broken rendering, and with e.g. `disableAutoFetch=true` set the likelihood of this occurring would increase quite a bit.

*Please note:* While I've not yet seen this error in an actual PDF document, it can happen during loading if you're unlucky enough with e.g. the structure of the PDF document and/or the download speed offered by the server.
2021-04-20 20:22:44 +02:00
Calixte Denizet
e868ab0051 Update all the text widgets having the same name with the same value 2021-04-20 20:03:19 +02:00
Jonas Jenwald
f560fe6875 A couple of small scripting/XFA-related tweaks in the worker-code
- Use `PDFManager.ensureDoc`, rather than `PDFManager.ensure`, in a couple of spots in the code. If there exists a short-hand format, we should obviously use it whenever possible.

 - Fix a unit-test helper, to account for the previous changes. (Also, converts a function to be `async` instead.)

 - Add one more exists-check in `PDFDocument.loadXfaFonts`, which I neglected to suggest in PR 13146, to prevent any possible errors if the method is ever called in a situation where it shouldn't be.
   Also, print a warning if the actual font-loading fails since that could help future debugging. (Finally, reduce overall indentation in the loop.)

 - Slightly unrelated, but make a small tweak of a comment in `src/core/fonts.js` to reduce possible confusion.
2021-04-17 10:34:22 +02:00
Brendan Dahl
ac3fa1e3d7
Merge pull request #13146 from calixteman/xfa_fonts
XFA -- Load fonts permanently from the pdf
2021-04-16 12:55:12 -07:00
Calixte Denizet
7e9579045f XFA -- Load fonts permanently from the pdf
- Different fonts can be used in xfa and some of them are embedded in the pdf.
  - Load all the fonts in window.document.

Update src/core/document.js

Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>

Update src/core/worker.js

Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>
2021-04-15 17:57:42 +02:00
Jani Pehkonen
3a96977ea8 Implement visibility expressions for optional content 2021-04-14 17:39:41 +03:00
Jonas Jenwald
1d6d476cab Rename the src/core/obj.js file to src/core/catalog.js
Now that only the `Catalog` remains in this file, after the previous patches, it makes sense to rename the file to reduce confusion.
2021-04-13 21:00:30 +02:00
Jonas Jenwald
088a55f80d Enable the no-var rule in the src/core/xref.js file 2021-04-13 21:00:30 +02:00
Jonas Jenwald
bc828cd41f Convert the XRef to a "normal" class 2021-04-13 21:00:30 +02:00
Jonas Jenwald
e8750cfe95 Move the XRef from src/core/obj.js and into its own file
The size of the `src/core/obj.js` file has increased slowly over the years, and it also contains a fair amount of *distinct* functionality.
In order to improve readability and make it easier to navigate through the code, this patch moves the `XRef` into its own file.
2021-04-13 21:00:30 +02:00
Jonas Jenwald
24e5ecdf76 Move NameTree/NumberTree from src/core/obj.js and into its own file
The size of the `src/core/obj.js` file has increased slowly over the years, and it also contains a fair amount of *distinct* functionality.
In order to improve readability and make it easier to navigate through the code, this patch moves `NameTree`/`NumberTree` into its own file.
2021-04-13 21:00:30 +02:00
Jonas Jenwald
92141e0468 Enable the no-var rule in the src/core/file_spec.js file 2021-04-13 21:00:30 +02:00
Jonas Jenwald
22a066e657 Convert the FileSpec to a "normal" class 2021-04-13 21:00:30 +02:00
Jonas Jenwald
e02d17da93 Move the FileSpec from src/core/obj.js and into its own file
The size of the `src/core/obj.js` file has increased slowly over the years, and it also contains a fair amount of *distinct* functionality.
In order to improve readability and make it easier to navigate through the code, this patch moves the `FileSpec` into its own file.
2021-04-13 21:00:30 +02:00
Jonas Jenwald
6a935682fd Convert the ObjectLoader to a "normal" class 2021-04-13 21:00:30 +02:00
Jonas Jenwald
604cd6d600 Move the ObjectLoader from src/core/obj.js and into its own file
The size of the `src/core/obj.js` file has increased slowly over the years, and it also contains a fair amount of *distinct* functionality.
In order to improve readability and make it easier to navigate through the code, this patch moves the `ObjectLoader` into its own file.
2021-04-13 21:00:30 +02:00
Tim van der Meij
ebeb3f7999
Merge pull request #13234 from Snuffleupagus/hasJSActions-MissingDataException
[api-minor] Ensure that `PDFDocumentProxy.hasJSActions` won't fail if `MissingDataException`s are thrown during the associated worker-thread parsing
2021-04-13 20:44:58 +02:00
Cetin Sert
d498897ab5
Fix annotation input focus trap regression in Safari (#13232)
`setSelectionRange(0, 0)` added in 44b24fcc29 for #12359, required only by Firefox ([bug](https://bugzilla.mozilla.org/show_bug.cgi?id=860329)), causes issues mozilla#13191, mozilla#12592 in Safari.
`scrollLeft = 0` is a fix that breaks the focus trap in Safari while **keeping Firefox behavior the same for #12359**.
2021-04-13 20:40:52 +02:00
Tim van der Meij
3d2d8002b0
Merge pull request #13223 from Snuffleupagus/worker-xfa-structTree-tweaks
Remove the unused "GetIsPureXfa" message handler; and avoid unnecessary parsing when no structTree is available (PR 13069 follow-up, PR 13221 follow-up)
2021-04-13 20:39:52 +02:00
Jonas Jenwald
2b2234fd5a [api-minor] Ensure that PDFDocumentProxy.hasJSActions won't fail if MissingDataExceptions are thrown during the associated worker-thread parsing
With the current implementation of `PDFDocument.hasJSActions`, in the worker-thread, we're not actually handling not-yet-loaded data correctly. This can thus fail in *two* different ways:
 - The `PDFDocument.fieldObjects` getter (and its helper method), while it may *return* a Promise, still fetches all of its data synchronously and it can thus throw a `MissingDataException` during parsing.
 - The `Catalog.jsActions` getter, which is completely synchronous, can obviously throw a `MissingDataException` during parsing.

If either of these cases occurs currently, the `PDFDocumentProxy.hasJSActions` method in the API can either return a *rejected* Promise (which it never should) or possibly "hang" and never resolve.

*Please note:* While I've not *yet* seen this error in an actual PDF document, it can happen during loading if you're unlucky enough with e.g. the structure of the PDF document and/or the download speed offered by the server.
This patch is thus based on code-inspection *and* on manually throwing a `MissingDataException` on the first access of `Catalog.jsActions` to simulate this situation.

Finally, this patch adds a couple of *API* unit-tests for this (since none existed).
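
The failure mode described above, a synchronous getter throwing because data isn't loaded yet, can be sketched as follows. This is only an illustration under stated assumptions: `MissingDataError`, `ensureLoaded` and `requestRange` are hypothetical names, not the actual PDF.js worker code.

```javascript
// Hypothetical exception type, standing in for a "data not yet loaded" error.
class MissingDataError extends Error {
  constructor(begin, end) {
    super(`Missing data [${begin}, ${end})`);
    this.begin = begin;
    this.end = end;
  }
}

// Retry a synchronous getter until it stops throwing MissingDataError,
// loading the requested byte range between attempts. Any other error is
// re-thrown unchanged.
async function ensureLoaded(getter, requestRange) {
  while (true) {
    try {
      return getter();
    } catch (ex) {
      if (!(ex instanceof MissingDataError)) {
        throw ex;
      }
      await requestRange(ex.begin, ex.end);
    }
  }
}

// Simulated document whose data only becomes available after one request.
let loaded = false;
const hasActions = ensureLoaded(
  () => {
    if (!loaded) {
      throw new MissingDataError(0, 10);
    }
    return true;
  },
  async () => {
    loaded = true;
  }
);
```

With a wrapper like this the caller always receives a Promise that eventually resolves, rather than a rejected or never-settling one.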
2021-04-13 14:33:56 +02:00
Jonas Jenwald
4aa27cc645 Re-factor Catalog._collectJavaScript to use a Map rather than an Object
Given that this is only an internal helper method, used by the `Catalog.{javaScript, jsActions}` getters, this change simplifies iteration of the returned data.
We can also (slightly) re-factor the code of the `jsActions` getter, and remove an obsolete[1] JSDoc-comment from the `openAction` getter.

---
[1] Not really relevant now that we've got proper scripting support.
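
As a rough illustration of why a Map simplifies iteration, consider the sketch below. `collectScripts` and the sample entries are hypothetical and do not reflect the actual `Catalog._collectJavaScript` implementation.

```javascript
// Hypothetical stand-in for an internal helper that gathers named scripts.
function collectScripts(entries) {
  const scripts = new Map();
  for (const [name, code] of entries) {
    scripts.set(name, code);
  }
  return scripts;
}

const collected = collectScripts([
  ["OpenAction", "app.alert('hi');"],
  ["WillPrint", "console.println('printing');"],
]);

// A Map is directly iterable (in insertion order), so callers don't need an
// intermediate Object.keys/Object.entries step.
const names = [];
for (const [name] of collected) {
  names.push(name);
}
```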
2021-04-13 14:16:17 +02:00
Calixte Denizet
a4c986515f XFA -- Display text content
- display xhtml;
  - allow spaces in xhtml (xfa-spacerun:yes);
  - support column layout;
  - fix some border issues.
2021-04-12 14:13:49 +02:00
Jonas Jenwald
54ef4370a2 Ensure that the data is loaded, in the "GetPageJSActions" message handler
Similar to all other data accesses, note e.g. the "GetDocJSActions" handler just above, we need to ensure that a `MissingDataException` isn't propagated to the main-thread if this data is accessed while the PDF document is still loading.
2021-04-12 13:54:37 +02:00
Jonas Jenwald
9360c7cbdc Avoid unnecessary parsing, in Page.getStructTree, when no structTree is available (PR 13221 follow-up)
It's obviously (a bit) more efficient to return early in `Page.getStructTree`, rather than trying to first "parse" an *empty* structTree-root.

*Somehow I didn't think of this yesterday, but this feels like a much better solution overall; sorry about the churn here!*
2021-04-12 08:54:21 +02:00
Jonas Jenwald
0d2dd6c2fe Remove the unused "GetIsPureXfa" message handler in the worker (PR 13069 follow-up)
Looking at the API, there's no code which actually sends this message. Most likely it's a left-over from a previous version of PR 13069, since the `isPureXfa` parameter is being included in the "GetDoc" message.
2021-04-12 08:52:27 +02:00
Jonas Jenwald
5adee0cdd1 [api-minor] Let PDFPageProxy.getStructTree return null, rather than an empty structTree, for documents without any accessibility data (PR 13171 follow-up)
This is first of all consistent with existing API-methods, where we return `null` when the data in question doesn't exist. Secondly, it should also be (slightly) more efficient since there's less dummy-data that we need to transfer between threads.
Finally, this prevents us from adding an empty/unnecessary span to *every* single page even in documents without any structure tree data.
2021-04-11 12:35:33 +02:00
Jonas Jenwald
ff4dae05b0 Ensure that getStructTree won't break with disableAutoFetch = true set (PR 13171 follow-up)
Open http://localhost:8888/web/viewer.html?file=/test/pdfs/pdf.pdf#disableStream=true&disableAutoFetch=true and observe the following message in the console (repeated for each page of the document):
```
Uncaught (in promise)
Object { message: "Missing data [19787293, 19787294)", name: "UnknownErrorException", details: "MissingDataException: Missing data [19787293, 19787294)", stack: "BaseExceptionClosure@http://localhost:8888/src/shared/util.js:458:29\n@http://localhost:8888/src/shared/util.js:462:3\n" }
```
2021-04-11 12:15:33 +02:00
Tim van der Meij
d9d626a5e1
Merge pull request #13214 from calixteman/signatures
Display widget signature
2021-04-10 19:35:16 +02:00
Calixte Denizet
5875ebb1ca Display widget signature
- but don't validate them for now;
  - Firefox will display a bar to warn that the signature validation is not supported (see https://bugzilla.mozilla.org/show_bug.cgi?id=854315)
  - almost all (all ?) pdf readers display signatures;
  - validation is done in edge but for now it's behind a pref.
2021-04-10 19:13:28 +02:00
Tim van der Meij
03c8c89002
Merge pull request #13171 from brendandahl/struct-tree
[api-minor] Add support for basic structure tree for accessibility.
2021-04-09 21:32:44 +02:00
Tim van der Meij
b0473eb353
Merge pull request #13207 from Snuffleupagus/api-AnnotationStorage-params
[api-minor] Remove the manual passing of an `AnnotationStorage`-instance when calling various API-methods
2021-04-09 21:09:16 +02:00
Brendan Dahl
fc9501a637 Add support for basic structure tree for accessibility.
When a PDF is "marked" we now generate a separate DOM that represents
the structure tree from the PDF.  This DOM is inserted into the <canvas>
element and allows screen readers to walk the tree and have more
information about headings, images, links, etc. To link the structure
tree DOM (which is empty) to the text layer, aria-owns is used. This
required modifying the text layer creation so that marked items are
now tracked.
2021-04-09 09:56:28 -07:00
Jonas Jenwald
737a8e846d Add deprecated handling of the now removed AnnotationStorage API-parameters
These changes are done separately, to make it easier to remove them in the future.
2021-04-09 13:25:03 +02:00
Jonas Jenwald
72ef183085 [api-minor] Remove the manual passing of an AnnotationStorage-instance when calling various API-methods
Note how we purposely don't expose the `AnnotationStorage`-class directly in the official API (see `src/pdf.js`), since trying to use *multiple* ones simultaneously doesn't really make sense (e.g. in the viewer).
Instead we lazily initialize, and cache, just *one* instance via `PDFDocumentProxy.annotationStorage` which should thus be available internally in the API itself without having to be manually passed to various methods.

To support these changes, the `AnnotationStorage`-instance initialization is moved into the `WorkerTransport`-class to allow both `PDFDocumentProxy` and `PDFPageProxy` to access it.
This patch implements the following simplifications:
 - Remove the `annotationStorage`-parameter from `PDFDocumentProxy.saveDocument`, since it's already available internally.
   Furthermore, while it's currently possible to call that method without an `AnnotationStorage`-instance, that really does *not* make any sense at all. In this case you're effectively reducing `PDFDocumentProxy.saveDocument` to a "regular" `PDFDocumentProxy.getData` call, but with *a lot* more overhead, which was obviously not the intention of the `PDFDocumentProxy.saveDocument`-method.

 - Try to discourage third-party users from calling `PDFDocumentProxy.saveDocument` unconditionally, as a replacement for `PDFDocumentProxy.getData` (note the previous point).

 - Replace the `annotationStorage`-parameter, in `PDFPageProxy.render`, with a boolean `includeAnnotationStorage`-parameter which simply indicates if the (internally available) `AnnotationStorage`-instance should be used during rendering (e.g. for printing).

 - By removing the need to *manually* provide `annotationStorage`-parameters to various API-methods, using the API should become simpler (e.g. for third-parties) since you no longer need to worry about manually fetching and passing around this data.
2021-04-09 13:24:25 +02:00
Ikko Ashimine
c4c4333d54
Fix typo in canvas.js
Reseting -> Resetting
2021-04-08 23:45:24 +09:00
Tim van der Meij
6429ccc002
Merge pull request #13194 from Snuffleupagus/ttcf-fuzzy-match
Fuzzy-match the fontName, for TrueType Collection fonts, where the "name"-table is wrong (issue 13193)
2021-04-07 20:50:19 +02:00
Tim van der Meij
5945f7c4a1
Merge pull request #13186 from Snuffleupagus/rm-deprecated-code
Remove some `deprecated` code
2021-04-07 20:38:59 +02:00
Jonas Jenwald
f986ccdf0e Fuzzy-match the fontName, for TrueType Collection fonts, where the "name"-table is wrong (issue 13193)
The fontName, as defined in the PDF document, cannot be found in *any* of the "name"-tables in the TrueType Collection font. To work-around that, this patch adds a *fallback* code-path to allow using an approximately matching fontName rather than outright failing.
2021-04-07 15:25:32 +02:00
Jonas Jenwald
4e81e0e14f Remove the deprecated AnnotationStorage.getOrCreateValue-method (PR 12759 follow-up)
While this method has only been deprecated in one release now, the `AnnotationStorage`-functionality is new enough that third-party implementations hopefully don't rely heavily on it just yet. (And removing this quickly should help reduce the likelihood that someone starts using it.)
2021-04-06 13:22:06 +02:00
Tim van der Meij
fc0cd4a443
Convert the startXRefParsedCache variable, in src/core/obj.js, from an object to a set
We only want to track XRef starting points instead of actual data, so
using a set conveys that intention more clearly and is slightly more
efficient.
2021-04-05 19:32:58 +02:00
Tim van der Meij
228adbf673
Merge pull request #13172 from Snuffleupagus/cleanup-keepFonts
[api-minor] Add an option, in `PDFDocumentProxy.cleanup`, to allow fonts to remain attached to the DOM
2021-04-05 14:21:34 +02:00
Jonas Jenwald
16fd838f52 Convert the renderTasks, used in PDFPageProxy.render/PDFPageProxy.getOperatorList, to a Set
When removing tasks we're currently forced to *indirectly* iterate through the array, which can be avoided by using a Set instead.
Furthermore, we can also (slightly) modernize the code responsible for initializing the `renderTasks`.
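
A minimal sketch of the difference, using hypothetical task objects rather than the real render-task instances:

```javascript
// Hypothetical task objects; only their identity matters for this sketch.
const taskA = { id: "A" };
const taskB = { id: "B" };

// Array-based bookkeeping: removal needs a search (which iterates the array
// internally) followed by a splice.
const taskArray = [taskA, taskB];
const index = taskArray.indexOf(taskA);
if (index >= 0) {
  taskArray.splice(index, 1);
}

// Set-based bookkeeping: removal is a single call.
const taskSet = new Set([taskA, taskB]);
taskSet.delete(taskA);
```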
2021-04-05 10:51:28 +02:00
Jonas Jenwald
68d3a333ac Change the seenStyles object, in PartialEvaluator.getTextContent, to a Set
Given that what we actually want is only to keep track of the loadedFont-names, rather than storing any actual data, using an object isn't really necessary here. Furthermore, in the current code, we're also using `in` when checking if the data exists, which is generally less efficient than just checking for the value directly.
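
The two approaches can be compared with a short sketch. The font name and the `markSeen` helper are made up for illustration and are not taken from `PartialEvaluator.getTextContent`.

```javascript
// Object-based tracking: a dummy value is stored only to mark presence, and
// the `in` operator is used for the lookup.
const seenObject = Object.create(null);
seenObject["g_d0_f1"] = true;
const foundInObject = "g_d0_f1" in seenObject;

// Set-based tracking: only the names themselves are stored.
const seenSet = new Set();

// Returns true the first time a name is seen, false on later calls.
function markSeen(fontName) {
  if (seenSet.has(fontName)) {
    return false;
  }
  seenSet.add(fontName);
  return true;
}

const firstVisit = markSeen("g_d0_f1");
const secondVisit = markSeen("g_d0_f1");
```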
2021-04-05 10:34:02 +02:00
Jonas Jenwald
a2bc6481a0 [api-minor] Add an option, in PDFDocumentProxy.cleanup, to allow fonts to remain attached to the DOM
As mentioned in the JSDoc comment, this should not be used unless you know what you're doing, since it will lead to increased memory usage. However, in some situations (e.g. SVG-rendering), we still want to be able to run general clean-up on both the main/worker-thread while keeping loaded fonts attached to the DOM.[1]

As part of these changes, `WorkerTransport.startCleanup` is converted to an async method and we'll also skip clean-up when destruction has started (since it's redundant).

---
[1] The SVG-rendering mode is obviously not officially supported, since it's both rather incomplete and inherently slower. However with recent changes, whereby we cache repeated images on the document rather than the page level, memory usage can be *a lot* worse than before if we never attempt to release e.g. cached image-data when the viewer is in SVG-rendering mode.
2021-04-02 12:32:31 +02:00
Jonas Jenwald
48ff20493f Mark some internal PDFDocumentProxy-properties as "private"
These two properties were *never* intended to be anything but "private", hence it really cannot hurt to actually indicate that they're *not* part of any official API.
2021-04-02 12:26:32 +02:00
Jonas Jenwald
0eb1433c78 [api-minor] Change the format of the fontName-property, in defaultAppearanceData, on Annotation-instances (PR 12831 follow-up)
Currently the `fontName`-property contains an actual /Name-instance, which is a problem given that its fallback value is an empty string; see ca7f546828/src/core/default_appearance.js (L35)
The reason that this is a problem can be seen in ca7f546828/src/core/primitives.js (L30-L34), since an empty string short-circuits the cache. Essentially, in PDF documents, a /Name-instance cannot be empty and the way that the `DefaultAppearanceEvaluator` does things is unfortunately not entirely correct.

Hence the `fontName`-property is changed to instead contain a string, rather than a /Name-instance, which simplifies the code overall.

*Please note:* I'm tagging this patch with "[api-minor]", since PR 12831 is included in the current pre-release (although we're not using the `fontName`-property in the display-layer).
2021-04-01 16:47:30 +02:00
Tim van der Meij
ca7f546828
Merge pull request #12908 from calixteman/11918
Slightly rescale lineWidth to workaround chrome rendering issue
2021-03-31 21:56:31 +02:00
Calixte Denizet
a0cfb0841f Slightly rescale lineWidth to workaround chrome rendering issue
2021-03-31 21:49:00 +02:00
Tim van der Meij
5a64157a2f
Merge pull request #13168 from janpe2/ttf-uni-glyphs
Use post table when Encoding has only Differences
2021-03-31 21:35:13 +02:00
Tim van der Meij
1a4af17d07
Merge pull request #13165 from Snuffleupagus/Annotation-rm-defaultAppearance-export
[api-minor] Stop exposing the *raw* `defaultAppearance`-string on Annotation-instances
2021-03-31 21:30:50 +02:00
Tim van der Meij
5be0fbe8f1
Merge pull request #13166 from Snuffleupagus/getDocument-URL
[api-minor] Support proper `URL`-objects, in addition to URL-strings, in `getDocument`
2021-03-31 21:20:08 +02:00
Tim van der Meij
2fb4d02ea5
Merge pull request #13158 from Snuffleupagus/rm-URL-polyfill
Remove the `URL` polyfill
2021-03-31 20:22:02 +02:00
Jani Pehkonen
0117ee5071 Use post table when Encoding has only Differences
Fixes #13107
In the issue, some TrueType glyph names have the format `uniXXXX`.
Font's `Encoding` dictionary has the entry `Differences` but no
`BaseEncoding`. `uniXXXX` names are converted to glyph indices
using font's `post` table but currently that is done only when
`BaseEncoding` exists. We must enable the conversion also when only
`Differences` exists.
2021-03-31 17:58:44 +03:00
Jonas Jenwald
db1e1612df [api-minor] Support proper URL-objects, in addition to URL-strings, in getDocument
Currently only URL-strings are officially supported by `getDocument`, however at this point in time I cannot really see any compelling reason to not support `URL`-objects as well.

Most likely the reason that we don't already support `URL`-objects, in `getDocument`, is that historically `URL` wasn't fully implemented across browsers and our old polyfill wasn't perfect; see https://developer.mozilla.org/en-US/docs/Web/API/URL/URL#browser_compatibility

*Please note:* Because of how the `url` parameter is currently handled, there's actually *some* cases where passing a `URL`-object to `getDocument` already works. That, in my opinion, provides additional motivation for supporting `URL`-objects officially, since it makes the API more consistent.

The following is an attempt to summarize the *current* situation, based on the actual code rather than the JSDocs:
 - `getDocument("url string")` works and is documented.[1]
 - `getDocument({ url: "url string", })` works and is documented.[1]
 - `getDocument(new URL(...))` throws immediately, since no supported parameters are found.
 - `getDocument({ url: new URL(...), })` actually works even though it's not documented.[1] Originally, when data was fetched on the worker-thread, this would likely have thrown since `URL` isn't clonable.[2]
 - `getDocument({ url: { abc: 123, }, })`, or some similarly meaningless input, will be "accepted" by `getDocument` and then throw a `MissingPDFException` when attempting to fetch the bogus data.

With the changes in this patch, not only are `URL`-objects now officially supported and documented when calling `getDocument`, but we'll also do a much better job at actually validating any URL-data passed to `getDocument` (and instead fail early).
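
The "validate and fail early" idea can be sketched roughly as follows. `toValidUrlString` is a hypothetical helper and the error message is invented; neither is the actual `getDocument` code.

```javascript
// Hypothetical helper: accept either a URL-string or a URL-object, and
// reject anything else up front instead of failing later during fetching.
function toValidUrlString(value, baseUrl) {
  if (value instanceof URL) {
    return value.href;
  }
  if (typeof value === "string") {
    try {
      // Only pass a base when one is available, so that `undefined` is
      // never used as the second argument.
      const url = baseUrl ? new URL(value, baseUrl) : new URL(value);
      return url.href;
    } catch (ex) {
      // Fall through to the generic error below.
    }
  }
  throw new Error("Invalid url data: a string or URL-object is expected.");
}

const fromObject = toValidUrlString(new URL("https://example.com/a.pdf"));
const fromString = toValidUrlString("b.pdf", "https://example.com/dir/");
```

Meaningless input such as `{ abc: 123 }` throws immediately in this sketch, rather than being "accepted" and failing only once fetching starts.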

---
[1] In *browsers*, we create a valid URL thus indirectly validating the input. In Node.js environments, on the other hand, no validation is done since obtaining a baseUrl is more difficult (and PDF.js is primarily written for browsers anyway).

[2] https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm#supported_types
2021-03-31 16:21:41 +02:00
Jonas Jenwald
27add0f1f3 Re-factor the source parsing, in getDocument, to use switch rather than if...else
Given the number of parameters that we now need to parse here, this code is no longer as readable as one would like. Hence this re-factoring, which will improve overall readability and also help with the next patch.
2021-03-31 16:21:37 +02:00
Jonas Jenwald
9c6770748c Move the PDFDocumentStats typedef closer to its usage
Currently this typedef appears slightly out-of-place, in the middle of the arguably much more important `getDocument` JSDocs.
2021-03-31 16:21:22 +02:00
calixteman
b3528868c1
XFA - Add support for few ui elements (#13115)
- input;
  - layout;
  - border;
  - margin;
  - color.
2021-03-31 15:42:21 +02:00
Jonas Jenwald
3df24254e3 [api-minor] Stop exposing the *raw* defaultAppearance-string on Annotation-instances
The reasons for making this change are:
 - This property is not, nor has it ever been, used anywhere in the PDF.js display-layer.
 - Related to the previous point, the format of the `defaultAppearance`-string is such that it'd be difficult to use it as-is in the display-layer anyway.
 - It (usually) contains the "raw" appearance-string, from the PDF document, which is neither parsed nor validated and could thus be bogus.
 - We now expose a `defaultAppearanceData`-property, which is first of all used in the display-layer and secondly contains actually parsed/validated data.
 - In the event that a third-party implementation needs the `defaultAppearance`-string, it could be easily constructed from the recently added `defaultAppearanceData`-property.

All-in-all, I'm thus suggesting that we stop exposing an unused and unnecessary property on all Annotation-instances.
2021-03-31 15:09:18 +02:00
Jonas Jenwald
38acde8375 Use template strings, to reduce unnecessary verbosity in a few warn(...) calls in src/core/annotation.js
2021-03-31 14:40:21 +02:00
calixteman
84d7cccb1d
JS - Handle correctly hierarchy of fields (#13133)
* JS - Handle correctly hierarchy of fields
  - it aims to fix #13132;
  - annotations can inherit their actions from the parent field;
  - there are some fields which act as a container for other fields:
    - they can be accessed through js so need to add them with an empty type (nothing in the spec about that but checked in Acrobat);
    - calculation order list (CO) can reference them so need to make them available through this.getField;
    - getArray method must return kids.
  - field values are number, string, ... depending on their type but nothing in the spec on how to know what the type is:
    - according to the comment for Canonical Format: https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=461
    - it seems that this "type" can be guessed from js action Format (when setting a type in Acrobat DC, the only affected thing is this action).
  - util.scand with an empty string returns the current date.
2021-03-30 08:50:35 -07:00
Jonas Jenwald
fa86a192f9 Remove the URL polyfill
Based on this compatibility information, given that IE 11 is now *explicitly* unsupported, we should no longer need to bundle a `URL` polyfill in any builds: https://developer.mozilla.org/en-US/docs/Web/API/URL/URL#browser_compatibility

Note that the caveat listed for older Safari-versions doesn't apply to any code in the PDF.js library, since we never call `new URL(url, undefined)` in the code-base.

Note also that Node.js has a web-compatible `URL` implementation, which according to the "History" section at https://nodejs.org/api/url.html#url_the_whatwg_url_api has been available since Node.js `10.0.0` (according to https://nodejs.org/en/about/releases/ that branch is one month away from being EOL-ed).
2021-03-29 18:00:36 +02:00
Tim van der Meij
1a2cdaffc5
Merge pull request #13152 from calixteman/13130
Skip extra objects in object stream in using offsets
2021-03-28 15:11:55 +02:00
Jonas Jenwald
19c2dfbb96 Move rotation normalization from PDFViewerApplication and into BaseViewer
The rotation handling that's currently living in `PDFViewerApplication` is *very* old, and pre-dates the introduction of the viewer components by years.
As can be seen in the `BaseViewer.pagesRotation` setter, we're not actually normalizing the rotation as intended and instead rely on the caller to handle that correctly. This is first of all inconsistent, given how other setters are implemented, and secondly it could also lead to the rotation being set to a value outside of the `[0, 360)`-range.

Finally, for improved consistency the rotation handling in `PageViewport` is updated similarly. Please note that in this case, it's *not* changing the pre-existing logic.
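
For reference, normalizing a rotation into the `[0, 360)`-range can be sketched as below. `normalizeRotation` is a hypothetical helper and not necessarily how `BaseViewer` implements it.

```javascript
// Hypothetical helper: map any rotation in degrees, including negative
// values and values of 360 or more, into the half-open range [0, 360).
function normalizeRotation(rotation) {
  // JavaScript's % keeps the sign of the dividend, so add 360 and take the
  // remainder again to handle negative input.
  return ((rotation % 360) + 360) % 360;
}

const rotations = [0, 90, 360, 450, -90, -360].map(normalizeRotation);
```

Doing this inside the setter means that callers no longer need to normalize the value themselves.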
2021-03-28 14:19:58 +02:00
Calixte Denizet
9296ee6986 Skip extra objects in object stream in using offsets
2021-03-28 13:03:05 +02:00
calixteman
81c602c61c
Set CFF header to 4 when writing it because it contains 4 elements (#13149)
2021-03-26 18:23:18 +01:00
calixteman
63471bcbbe
XFA - Convert some template properties into CSS ones (#13082)
- implement a few positioning properties: position, width, height, anchor;
  - implement font element;
  - implement fill element (used by font) and its children (linear, radial, ...);
  - font property is inherited from ancestor container (see https://www.pdfa.org/wp-content/uploads/2020/07/XFA-3_3.pdf#page=43) so let CSS handle that stuff;
  - in order to reduce the number of properties to set, only set non default properties and put the default in CSS;
  - set a background to some containers to be able to see them (will be removed in a future commit).
2021-03-25 13:02:39 +01:00