Commit Graph

3195 Commits

Author SHA1 Message Date
Jonas Jenwald
9faf2fa8a0
Merge pull request #16019 from Snuffleupagus/viewer-rm-require
Remove most build-time `require` statements from the viewer (PR 16009 follow-up)
2023-02-08 15:02:40 +01:00
Jonas Jenwald
90ffbc1d39 Remove most build-time require statements from the viewer (PR 16009 follow-up)
This further extends the web-specific import maps introduced in PR 16009, to allow removing *most* of the build-time `require` statements from the viewer. The few remaining ones are fallbacks used for the COMPONENTS respectively the `legacy` GENERIC builds.
2023-02-07 22:45:19 +01:00
Calixte Denizet
a25895bf72 [Annotation] Take into account the stroke alpha for a FreeText without appearance 2023-02-07 22:15:27 +01:00
calixteman
ecd86ccffc
Merge pull request #16020 from calixteman/bug1815476
[Annotation] Avoid to encrypt the appearance stream two times (bug 1815476)
2023-02-07 20:57:49 +01:00
Calixte Denizet
ea7b4b4d6c [Annotation] Avoid to encrypt the appearance stream two times (bug 1815476) 2023-02-07 19:26:46 +01:00
Jonas Jenwald
a98e80c4ff [GeckoView] Reduce the size of the *built* viewer
Given that the GV-viewer isn't using most of the UI-related components of the default-viewer, we can avoid including them in the *built* viewer to save space.[1]
The least "invasive" way of implementing this, at least that I could come up with, is to leverage import maps with suitable stubs for the GV-viewer.

The one slightly annoying thing is that we now have larger import maps across multiple html-files, and you'll need to remember to update all of them when making future changes.

---
[1] With this patch, the built `viewer.js` size is 391 kB and `viewer-geckoview.js` is 285 kB.
2023-02-05 14:12:32 +01:00
Tim van der Meij
e848a0e61c
Merge pull request #15981 from Snuffleupagus/cMapPacked-true
[api-minor] Let the `cMapPacked` parameter, in `getDocument`, default to `true`
2023-02-04 15:00:26 +01:00
Jonas Jenwald
851c394e64 Remove the isEmptyObj unit-test helper function
We should be able to let Jasmine simply compare directly against an actually empty Object, rather than using a manually implemented helper function for that.
2023-02-04 12:43:53 +01:00
Jonas Jenwald
c5d6391898 [api-minor] Let the cMapPacked parameter, in getDocument, default to true
The initial CMap support was added in PR 4259 using the "raw" Adobe files, however they were quickly deemed to be unnecessarily large. As a result PR 4470 introduced the more compact "binary" CMap format, with both of those PRs being included in the very same release (version `0.8.1334`) .

Please note that we've thus never shipped anything *except* the "binary" CMap files with the PDF library, and furthermore note that we've not even once updated the CMap files since they were originally added almost nine years ago.

Requiring users to remember that `cMapPacked = true` is necessary, in addition to setting the `cMapUrl` parameter, in order for CMap loading to work feels like a less than ideal API.
Hence this patch, which suggests that we simply let `cMapPacked` default to `true` now.
2023-01-30 15:35:02 +01:00
Jonas Jenwald
808ca828f1 Extend getGlyphMapForStandardFonts with additional entries (issue 15977) 2023-01-30 12:13:21 +01:00
Jonas Jenwald
07ba352903 Update npm packages 2023-01-28 08:13:13 +01:00
Calixte Denizet
6f4d037a8e [JS] Correctly format field with numbers (bug 1811694, bug 1811510)
In PR #15757, a value is automatically converted into a number when it's possible
but the case of numbers like "000123" has been overlooked and their format must
be preserved.
When a script is doing something like "foo.value + bar.value" and the values are
numbers then "foo.value" must return a number but the displayed value must be what
the user entered or what a script set, so this patch is just adding a a field
_orginalValue in order to track the value has it has defined.
Some people are used to use a comma as decimal separator, hence it must be considered
when a value is parsed into a number.
This patch is fixing a regression introduced by #15757.
2023-01-26 14:57:02 +01:00
Tim van der Meij
edfdb693e5
Merge pull request #15948 from Snuffleupagus/bug-1811668
Tweak `adjustType1ToUnicode` for fonts with a predefined *named* encoding (bug 1811668, PR 14050 follow-up)
2023-01-21 14:02:40 +01:00
Tim van der Meij
a27d7ba524
Merge pull request #15943 from Snuffleupagus/deprecate-direct-PDFDataRangeTransport
[api-minor] Deprecate calling `getDocument` directly with a `PDFDataRangeTransport`-instance
2023-01-21 13:50:20 +01:00
Jonas Jenwald
40a46e4397 Tweak adjustType1ToUnicode for fonts with a predefined *named* encoding (bug 1811668, PR 14050 follow-up)
*Please note:* I cannot reproduce the problem reported in bug 1811668, regarding the context menu, and in any case it's not clear that that part is even a PDF Viewer bug.

Looking at bug 1811668 I couldn't help but noticing that the textLayer isn't correct, and it's unfortunately once again a problem with the `adjustType1ToUnicode` function. That's intended to help improve text-selection for fonts without a /ToUnicode-entry, and in many cases it does help (the original PR fixed lots of issues) however it's also caused some problems.

In order to improve text-selection in bug 1811668, we'll now properly ignore fonts that have a predefined *named* encoding specified since that's really the intention with PR 14050.
2023-01-21 12:21:21 +01:00
Calixte Denizet
dc94b750de [GV] Avoid to update the finder when the results aren't complete
At the beginning of a search we can an update can be triggered with 0 over 0
found matches.
In the GeckoView context, we can't update the finder whenever we want but only
when it has been required.
2023-01-20 18:13:16 +01:00
Jonas Jenwald
f2fce93826 [JBIG2] Ensure that the decodeInteger function returns valid integers (issue 15942)
The JBIG2 images in this PDF document are corrupt enough that even Adobe Reader warns about it when opening the file.
*Please note:* I don't really know the JBIG2 image format at all, however from a very brief look at the specification it seems that integers should be 32-bit.
2023-01-19 17:14:17 +01:00
Jonas Jenwald
7976fc7851 [api-minor] Deprecate calling getDocument directly with a PDFDataRangeTransport-instance
In general it's recommended to pass a *parameter object* when calling the `getDocument`-function in the API, since that's the only way to provide additional options, and the fact that it also accepts a URL or TypedArray directly is now mostly for backwards compatibility reasons.
However, the `getDocument`-function also accepts a direct `PDFDataRangeTransport`-instance which just seems unnecessary.

*Please note:* The `PDFDataRangeTransport`-implementation was added specifically for the *built-in* Firefox PDF Viewer, however it's most likely not commonly used by any third-party (given that it requires manual PDF-data loading).
Furthermore, the default-viewer always provides a *parameter object* when calling the `getDocument`-function and it's thus completely unaffected by these changes.
2023-01-19 14:25:55 +01:00
Jonas Jenwald
d6be5141e9 Fallback to using the name table to infer the encoding for TrueType fonts missing such data (issue 15910)
The relevant TrueType font is missing both /ToUnicode *and* /Encoding entires, either of which would have prevented the (current) broken textLayer rendering.
My first idea was that we could use the `post` table in the TrueType font, see https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6post.html, to get the actual glyphNames and amend the fallback ToUnicode-map that way. Unfortunately that didn't work, since the `post` table only contained ".notdef" and "" (i.e. empty string) entries.

Instead we try to use the `name` table in the TrueType font, see https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6name.html, to determine if the platform is Windows and thus fallback to generate a ToUnicode-map from the `WinAnsiEncoding`.
2023-01-17 16:04:51 +01:00
Jonas Jenwald
d8d5545e03
Merge pull request #15926 from Snuffleupagus/annotation-appearance-stream
Ensure that Annotation `appearance`-entries are actually Streams
2023-01-16 15:00:12 +01:00
Jonas Jenwald
8f3fa18c93
Merge pull request #15920 from Snuffleupagus/transfer-pdf-data
[api-minor] Enable transferring of TypedArray PDF data by default (PR 15908 follow-up)
2023-01-16 13:20:57 +01:00
Jonas Jenwald
cefaecc2e8 Ensure that Annotation appearance-entries are actually Streams
Note how all over the `src/core/annotation.js`-code we're assuming that if an `appearance`-entry exists it's also a Stream. However, we're not actually checking that thoroughly enough which causes issues in some badly generated PDF documents.
2023-01-16 13:02:53 +01:00
Jonas Jenwald
32357e3d17 Update rimraf to version 4
The primary change is that the `rimraf` function now returns a Promise instead of taking a callback; please see https://github.com/isaacs/rimraf#major-changes-from-v3-to-v4
2023-01-15 15:38:30 +01:00
Tim van der Meij
9e3adb5ec7
Implement unit tests for the numberToString utility function 2023-01-14 15:09:58 +01:00
Tim van der Meij
a6dfcc89fa
Implement unit tests for the recoverJsURL utility function 2023-01-14 15:09:58 +01:00
Jonas Jenwald
397f943ca3 [api-minor] Enable transferring of TypedArray PDF data by default (PR 15908 follow-up)
This patch removes the recently introduced `transferPdfData` API-option, and simply enables transferring of TypedArray data *by default* instead of copying it. This will help reduce main-thread memory usage, however it will take ownership of the TypedArrays. Currently this only applies to the following cases:
 - TypedArrays passed to the `getDocument`-function in the API, in order to open PDF documents from binary data.
 - TypedArrays passed to a `PDFDataRangeTransport`-instance, used to support custom PDF document fetching/loading (see e.g. the Firefox PDF Viewer).

*PLEASE NOTE:* To avoid being affected by this, please simply *copy* any TypedArray data before passing it to either of the functions/methods mentioned above.

Now that we transfer TypedArray data that we previously only copied, we need to be more careful with input validation. Given how the `{IPDFStreamReader, IPDFStreamRangeReader}.read` methods will always return ArrayBuffer data, which is then transferred to the worker-thread[1], the actual TypedArray data passed to the API thus need to have the same exact size as its underlying ArrayBuffer to prevent issues.
Hence we'll check for this and only allow transferring of *safe* TypedArray data, and fallback to simply copying the data just as before. This obviously shouldn't be an issue in the Firefox PDF Viewer, but for the general PDF.js library we need to be more careful here.

---
[1] See e09ad99973/src/display/api.js (L2492-L2506) respectively e09ad99973/src/display/api.js (L2578-L2590)
2023-01-14 10:39:36 +01:00
Jonas Jenwald
bbe629018d [api-minor] Add a new transferPdfData option to allow transferring more data to the worker-thread (bug 1809164)
Also, removes the `initialData`-parameter JSDocs for the `getDocument`-function given that this parameter has been completely unused since PR 8982 (over five years ago). Note that the `initialData`-parameter is, and always was, intended to be provided when initializing a `PDFDataRangeTransport`-instance.
2023-01-10 21:03:44 +01:00
Jonas Jenwald
74e4b515c5
Merge pull request #15897 from Snuffleupagus/issue-15893
Support parsing encrypted documents in `XRef.indexObjects` (issue 15893)
2023-01-07 16:55:41 +01:00
Jonas Jenwald
1d5de9f4f4 Inline the setPDFNetworkStreamFactory functionality in src/display/api.js
Given that this is internal functionality, not exposed in the official API, it's not entirely clear (at least to me) why we can't just initialize this directly in `src/display/api.js` instead.
When testing both the development viewer and all the ways in which we run tests, everthing still appears to work just fine with this patch.
2023-01-06 13:23:07 +01:00
Jonas Jenwald
7d94fdeb48 Support parsing encrypted documents in XRef.indexObjects (issue 15893)
*Please note:* The reduced test-case is *not* a perfect reproduction of the original PDF document, since this one fails to open in e.g. Adobe Reader, but I do believe that it captures the most important points here.

For corrupt *and* encrypted PDF documents, it's possible that only some trailer dictionaries actually contain an /Encrypt-entry. Previously we'd could easily miss that, since we generally pick the first not obviously corrupt trailer dictionary, and the solution implemented here is to simply pre-parse all trailer dictionaries to see if there's any /Encrypt-entries.
2023-01-06 13:09:37 +01:00
Calixte Denizet
661f425934 [GV] Add an option in the find controller to update matches count only when the last page is reached (bug 1803188).
In GeckoView, on an event, a callback must be executed with the result of an action,
but the callback can be used only one time.
So for each FindInPage event, we must trigger only one matches count update.
2023-01-06 10:56:26 +01:00
Calixte Denizet
dea2471e96 [JS] UserActivation must be enabled before running document actions
else auto-print is broken (it's a regression from patch #15822).
2023-01-04 21:26:36 +01:00
Jonas Jenwald
42aa08563b
Merge pull request #15880 from Snuffleupagus/rm-docStats
[api-minor] Remove the `PDFDocumentProxy.stats` getter (PR 15758 follow-up)
2023-01-02 16:03:53 +01:00
Calixte Denizet
69c88477a9 Avoid an infinite loop when searching for a single diacritic 2023-01-02 12:27:07 +01:00
Jonas Jenwald
0c1fb4e740 [api-minor] Remove the PDFDocumentProxy.stats getter (PR 15758 follow-up)
This was deprecated in PR 15758 and given that it's quite unlikely that any third-party users are relying on this functionality, since it was only ever added to support telemetry reporting in the Firefox PDF Viewer, it should hopefully be fine to remove this fairly quickly.

These changes reduce the bundle size of the Firefox PDF Viewer by 4.5 kB in total.
2023-01-01 17:06:47 +01:00
Jonas Jenwald
2fcf8bb5be Re-factor searching for incomplete objects in XRef.indexObjects (issue 15803)
When trying to find incomplete objects, i.e. those missing the "endobj"-string at the end, there's unfortunately a number of possible operators that we need to check for. Otherwise we could miss e.g. the "trailer" at the end of a corrupt PDF document, which is why the referenced document didn't work.

Currently we do all searching on the "raw" bytes of the PDF document, for efficiency, however this doesn't really work when we need to check for *multiple* potential command-strings. To keep the complexity manageable we'll instead use regular expressions here, but we can at least avoid creating lots of substrings thanks to the `RegExp.lastIndex` property; which is well supported across browsers according to https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/lastIndex#browser_compatibility

Note that this repeated regular expression usage could perhaps be slightly less efficient than the old code, however this method is only invoked for corrupt PDF documents.
2022-12-19 23:01:09 +01:00
Calixte Denizet
f80880ccaa Strip out a reserved operator (9) from CFF char strings (fixes issue #15784) 2022-12-16 15:17:46 +01:00
Jonas Jenwald
26135b0313 Always parse the entire startXRefQueue in XRef.readXRef (issue 15833)
Previously we'd abort all parsing if an Error was encountered, despite the fact that multiple `startXRefQueue`-entries may be available and that continued parsing could thus eventually be able to find usable data.

Note that in the referenced PDF document the `startxref`-operator, at the end of the file, points to a position in the middle of an arbitrary `stream` which is why things break.
2022-12-15 13:46:28 +01:00
Jonas Jenwald
91524d1a60 [api-minor] Allow specifying an extra-delay, in RenderTask.cancel, for worker-thread aborting of operatorList parsing
This is done to support upcoming viewer-changes, and in order to prevent third-party users from outright breaking things we'll simply ignore too large values.
2022-12-14 12:34:16 +01:00
Calixte Denizet
2ebf8745a2 [JS] Run the named actions before running the format when the file is open (issue #15818)
It's a follow-up of #14950: some format actions are ran when the document is open
but we must be sure we've everything ready for that, hence we have to run some
named actions before runnig the global format.
In playing with the form, I discovered that the blur event wasn't triggered when
JS called `setFocus` (because in such a case the mouse was never down). So I removed
the mouseState thing to just use the correct commitKey when blur is triggered by a
TAB key.
2022-12-13 21:12:32 +01:00
Calixte Denizet
0c1ec946aa [JS] Handle correctly choice widgets where the display and the export values are different (issue #15815) 2022-12-13 19:08:26 +01:00
Calixte Denizet
1a397681fe The annotation layer dimensions must be set before adding some elements (follow-up of #15770)
In order to move the annotations in the DOM to have something which corresponds
to the visual order, we need to have their dimensions/positions which means that
the parent must have some dimensions.
2022-12-13 14:54:45 +01:00
Calixte Denizet
4f0bfabe7a Take all the viewBox into account when computing the coordinates of an annotation in the page (fixes #15789) 2022-12-08 15:02:20 +01:00
calixteman
fe3df4dcb4
Merge pull request #15782 from calixteman/15780
[api-minor][Editor] Don't use the editor parent which can be null.
2022-12-08 14:27:42 +01:00
Calixte Denizet
b93bf9f654 [Editor] Don't use the editor parent which can be null.
An annotation editor layer can be destroyed when it's invisible, hence some
annotations can have a null parent but when printing/saving or when changing
font size, color, ... of all added annotations (when selected with ctrl+a) we
still need to have some parent properties especially the page dimensions, global
scale factor and global rotation angle.
This patch aims to remove all the references to the parent in the editor instances
except in some cases where an editor should obviously have one.
It fixes #15780.
2022-12-08 14:06:06 +01:00
Jonas Jenwald
0ca92bf2a8
Merge pull request #15775 from Snuffleupagus/vars-all
Tighten the `vars`-argument for the ESLint `no-unused-vars` rule
2022-12-07 10:30:47 +01:00
Calixte Denizet
9af89381cd [Editor] Add a very basic and incomplete workaround for issue #15780
The main issue is due to the fact that an editor's parent can be null when
we want to serialize it and that lead to an exception which break all the
saving/printing process.
So this incomplete patch fixes only the saving/printing issue but not the
underlying problem (i.e. having a null parent) and doesn't bring that much
complexity, so it should help to uplift it the next Firefox release.
2022-12-06 16:22:24 +01:00
Jonas Jenwald
fe8fded23b [api-minor] Combine the textContent/textContentStream parameters
Rather than handling these parameters separately, which is a left-over from back when streaming of textContent was originally added, we can simply pass either data directly to the `TextLayer` and let it handle things accordingly.

Also, improves a few JSDoc comments and `typedef`-imports.
2022-12-04 21:22:14 +01:00
Jonas Jenwald
b659bacc43 Tighten the vars-argument for the ESLint no-unused-vars rule
Please see https://eslint.org/docs/latest/rules/no-unused-vars#vars
2022-12-04 16:15:50 +01:00
Tim van der Meij
99cfef882f
Merge pull request #15752 from Snuffleupagus/no-typeof-undefined
Enable the `no-typeof-undefined` ESLint plugin rule
2022-12-02 19:48:16 +01:00
Calixte Denizet
eed9bf71c5 Refactor the text layer code in order to avoid to recompute it on each draw
The idea is just to resuse what we got on the first draw.
Now, we only update the scaleX of the different spans and the other values
are dependant of --scale-factor.
Move some properties in the CSS in order to avoid any updates in JS.
2022-12-01 18:42:43 +01:00
Jonas Jenwald
47dbfc4ade Enable the no-typeof-undefined ESLint plugin rule
Please see https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/no-typeof-undefined.md
2022-12-01 18:20:39 +01:00
Calixte Denizet
20fd9099f8 [Annotation] Send correctly the updated values to the JS sandbox 2022-11-29 17:34:06 +01:00
Calixte Denizet
ea1995991b Don't add an extra space after a Katakana or a Hiragana at the eol when searching 2022-11-29 10:46:48 +01:00
Calixte Denizet
ae7da6ae48 [JS] By default, a text field value must be treated as a number (bug 1802888) 2022-11-28 16:24:01 +01:00
calixteman
33f9d1aab2
Merge pull request #15755 from calixteman/rounding_printf
[JS] Fix a rounding issue in printf (bug 1802888)
2022-11-28 15:39:36 +01:00
Calixte Denizet
4ee0c83548 [JS] Fix a rounding issue in printf (bug 1802888) 2022-11-28 14:37:15 +01:00
Jonas Jenwald
aa5b678f94 Add default icons for FileAttachment annotations (bug 1230933)
*Please note:* This "borrows" the icons from Thunderbird.

According to the PDF specification, see https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2096626, we should be providing default icons for FileAttachment annotations without appearances.
2022-11-26 11:24:59 +01:00
Jonas Jenwald
8fda3f04fe
Merge pull request #15732 from Snuffleupagus/issue-15719
Add a fallback for non-embedded *composite* Tahoma fonts (issue 15719)
2022-11-24 19:09:12 +01:00
Jonas Jenwald
d1c01b3164 Add a fallback for non-embedded *composite* Tahoma fonts (issue 15719) 2022-11-23 15:51:18 +01:00
Jonas Jenwald
47682985d3 Add support for Optional Content in TilingPatterns (issue 15716)
This can't be a particularly common feature, since we've supported Optional Content for over two years and this is the very first TilingPattern-case we've seen.
2022-11-23 12:58:00 +01:00
Jonas Jenwald
0ba242ea4a Support FileAttachments with hash-signs in the filename (issue 15729)
The reason for the issue is that we use the generic `getFilenameFromUrl` helper function, which was originally intended for regular URLs.
For the filenames we're dealing with in FileAttachments, we really only want to strip the path when one exists[1].

---
[1] See [bug 1230933](https://bugzilla.mozilla.org/show_bug.cgi?id=1230933) for an example of such a case.
2022-11-23 10:47:33 +01:00
Tim van der Meij
ae7c97aef8
Merge pull request #15710 from Snuffleupagus/issue-10791
Add localization support for the `annotationLayer` reference tests (issue 10791)
2022-11-19 11:25:15 +01:00
Jonas Jenwald
2ff904fb2b Add localization support for the annotationLayer reference tests (issue 10791) 2022-11-18 23:08:11 +01:00
Jonas Jenwald
7d029f8bfe Add a basic stringToUTF16HexString unit-test 2022-11-16 12:39:35 +01:00
Jonas Jenwald
9adc7859c8 Move the escapeString helper function into the worker-thread
Given that this helper function is only used on the worker-thread, there's no reason to duplicate it in both of the `pdf.js` and `pdf.worker.js` files.
2022-11-16 12:35:48 +01:00
Jonas Jenwald
e5859e145d Move the isAscii helper function into the worker-thread
Given that this helper function is only used on the worker-thread, there's no reason to duplicate it in both of the `pdf.js` and `pdf.worker.js` files.
2022-11-16 12:35:48 +01:00
Jonas Jenwald
2eaa708e3a Combine the stringToUTF16String and stringToUTF16BEString helper functions
Given that these functions are virtually identical, with the latter only adding a BOM, we can combine the two. Furthermore, since both functions were only used on the worker-thread, there's no reason to duplicate this functionality in both of the `pdf.js` and `pdf.worker.js` files.
2022-11-16 12:35:44 +01:00
Calixte Denizet
2be64d63e1 Normalize fullwidth, halfwidth and circled chars when searching 2022-11-14 19:27:51 +01:00
Jonas Jenwald
a1d48e3651 Add a *linked* test-case for issue 2618
Given that this PDF document is an interesting test-case for performance reasons, w.r.t. inline image caching, it probably can't hurt to add it to the test-suite to make it more readily available.
Considering the contents of that PDF document I'm not sure if we can include it directly in the repository, hence why a *linked* test-case was choosen here.
2022-11-12 16:31:01 +01:00
Jonas Jenwald
595711bd7c
Merge pull request #15679 from Snuffleupagus/bug-1799927-2
Use the *full* inline image as the cacheKey in `Parser.makeInlineImage` (bug 1799927)
2022-11-10 22:54:48 +01:00
Calixte Denizet
3ca03603c2 [Annotation] Fix printing/saving for annotations containing some non-ascii chars and with no fonts to handle them (bug 1666824)
- For text fields
 * when printing, we generate a fake font which contains some widths computed thanks to
   an OffscreenCanvas and its method measureText.
   In order to avoid to have to layout the glyphs ourselves, we just render all of them
   in one call in the showText method in using the system sans-serif/monospace fonts.
 * when saving, we continue to create the appearance streams if the fonts contain the char
   but when a char is missing, we just set, in the AcroForm dict, the flag /NeedAppearances
   to true and remove the appearance stream. This way, we let the different readers handle
   the rendering of the strings.
- For FreeText annotations
  * when printing, we use the same trick as for text fields.
  * there is no need to save an appearance since Acrobat is able to infer one from the
    Content entry.
2022-11-10 19:05:39 +01:00
Jonas Jenwald
b46e0d61cf Use the *full* inline image as the cacheKey in Parser.makeInlineImage (bug 1799927)
*Please note:* This only fixes the "wrong letter" part of bug 1799927.

It appears that the simple `computeAdler32` function, used when caching inline images, generates hash collisions for some (very short) TypedArrays. In this case that leads to some of the "letters", which are actually inline images, being rendered incorrectly.
Rather than switching to another hashing algorithm, e.g. the `MurmurHash3_64` class, we simply cache using a stringified version of the inline image data as the cacheKey to prevent any future collisions. While this will (naturally) lead to slightly higher peak memory usage, it'll however be limited to the current `Parser`-instance which means that it's not persistent.

One small benefit of these changes is that we can avoid creating lots of `Stream`-instances for already cached inline images.
2022-11-10 18:27:26 +01:00
Samuel Yuan
36fb5c1e2b Propagate the translated font name to TextContentItems.
This allows font data for system fonts to be looked up in the
PDFObjects.
2022-11-08 11:16:21 -08:00
Jonas Jenwald
23930a249e [api-minor] Let Catalog.getAllPageDicts return an *empty* dictionary when loading the first /Page fails (issue 15590)
In order to support opening certain corrupt PDF documents, particularly hand-edited ones, this patch adds support for letting the `Catalog.getAllPageDicts` method fallback to returning an *empty* dictionary to replace (only) the first /Page of the document.
Given that the viewer cannot initialize/load without access to the first page, this will thus allow e.g. document-level scripting to run as expected. Note that by effectively replacing a corrupt or missing first /Page in this way[1], we'll now render nothing but a *blank* page for certain cases of broken/corrupt PDF documents which may look weird.

*Please note:* This functionality is controlled via the existing `stopAtErrors` option, that can be passed to `getDocument`, since it's easy to imagine use-cases where this sort of fallback behaviour isn't desirable.

---
[1] Currently we still require that a /Pages-dictionary is found though, however it *may* be possible to relax even that assumption if that becomes absolutely necessary in future corrupt documents.
2022-11-03 12:51:48 +01:00
Jonas Jenwald
2516ffa78e Fallback to finding the first "obj" occurrence, when the trailer-dictionary is incomplete (issue 15590)
Note that the "trailer"-case is already a fallback, since normally we're able to use the "xref"-operator even in corrupt documents. However, when a "trailer"-operator is found we still expect "startxref" to exist and be usable in order to advance the stream position. When that's not the case, as happens in the referenced issue, we use a simple fallback to find the first "obj" occurrence instead.

This *partially* fixes issue 15590, since without this patch we fail to find any objects at all during `XRef.indexObjects`. However, note that the PDF document is still corrupt and won't render since there's no actual /Pages-dictionary and the /Root-entry simply points to the /OpenAction-dictionary instead.
2022-11-03 12:46:30 +01:00
calixteman
e42e1cde61
Merge pull request #15615 from calixteman/bug1796741
[Form] Don't use field appearances when /NeedAppearances is set to true (bug 1796741)
2022-10-31 09:58:27 +01:00
Jonas Jenwald
980acddbfa Prevent textLayer errors in documents with unbalanced beginMarkedContent/endMarkedContent operators (issue 15629) 2022-10-26 18:35:48 +02:00
Calixte Denizet
9f95a14e91 [Form] Don't use field appearances when /NeedAppearances is set to true (bug 1796741)
When a form isn't changed, we used the appearances we had in the file, but when
/NeedAppearances is true, all the appearances have to be regenerated whatever they're.
2022-10-26 12:10:51 +02:00
Calixte Denizet
2384fbcb89 Fix editor tests on Windows
- In #15373, we implemented copy/paste actions in using the system
clipboard.
For any reasons, on Windows, the clipboard doesn't contain the expected
data when the tests are ran in parallel, hence the tests which are
using the clipboard need to be ran sequentially.
- Make sure that we can paste after having copied.
2022-10-25 22:48:02 +02:00
Jonas Jenwald
71bd8b4de9 Let Lexer.getNumber treat more invalid "numbers" as zero (issue 15604)
In the referenced PDF document there are "numbers" which consist only of `-.`, and while that's obviously not valid Adobe Reader seems to handle it just fine.
Letting this method ignore more invalid "numbers" was suggested during the review of PR 14543, so let's simply relax our the validation here.
2022-10-20 22:36:15 +02:00
Jonas Jenwald
e591378ff1 Restore a weaker version of the /Pages dictionary /Count check for corrupt documents (PR 15593 follow-up)
It appears that PR 15593 broke `issue12402`, and we thus need to partially restore the /Count check.
 I completely missed this when looking at the test-results for PR 15593, both locally and on the bots, since the `Driver._getLastPageNumber` method would "swallow" an unavailable page number.
2022-10-20 14:22:29 +02:00
Calixte Denizet
6db9cefaaf [Annotation] Replace use of id by data-element-id to have the correct id 2022-10-19 23:36:28 +02:00
Jonas Jenwald
3c046c0a21 Extend getSupplementalGlyphMapForCalibri with some umlauts (issue 15594) 2022-10-19 17:49:40 +02:00
Jonas Jenwald
bc13a277ce Relax the /Pages dictionary /Count check for corrupt documents (issue 9105)
After PR 14311, and follow-up patches, we no longer require that the /Count entry (in the /Pages dictionary) is either present or even valid in order to parse/render a PDF document.
Hence it seems strange to keep this requirement for *corrupt* PDF documents, when trying to find a usable `trailer` in the `XRef.indexObjects` method.
2022-10-19 12:28:25 +02:00
Jonas Jenwald
de99f99a01 Fallback and try a *previous* generation if all else fails in XRef.indexObjects (issue 15577)
When we fail to find a usable PDF document `trailer` *and* there were errors during parsing, try and fallback to a *previous* generation as a last resort during fetching of uncompressed references.
*Please note:* This will not affect "normal" PDF documents, with valid /XRef data, and even most *corrupt* documents should be completely unaffected by these changes.
2022-10-18 20:24:01 +02:00
Jonas Jenwald
7bd484ebd3 Update npm packages 2022-10-16 09:38:58 +02:00
Calixte Denizet
556513a6e7 Use all the current transform as key when caching some image for masks used with pattern fill (bug 1795263, #15573) 2022-10-14 14:37:58 +02:00
Jonas Jenwald
15d4d80d45
Merge pull request #15563 from Snuffleupagus/issue-15559
Take the /CIDToGIDMap into account when getting the glyph mapping for CFF fonts (issue 15559)
2022-10-14 09:13:41 +02:00
Calixte Denizet
e756bb69e4 [JS] Take into account all the required fields for some computations
- Fix Field::getArray in order to collect only the fields which have a value;
- Fix AFSimple_Calculate:
  * allow to have a string with a list of field names as argument;
  * since a field can be non-terminal, use Field::getArray to collect
    the field under it and then apply the calculation on all the descendants.
2022-10-13 18:33:12 +02:00
Jonas Jenwald
858d941ff8 Take the /CIDToGIDMap into account when getting the glyph mapping for CFF fonts (issue 15559)
*Please note:* I don't really know what I'm doing here, however the patch appears to fix the referenced issue when comparing the rendering with Adobe Reader (with the caveat that I don't speak the language in question).
2022-10-13 10:02:25 +02:00
Jonas Jenwald
081e897588 Ensure that Page.getOperatorList handles Annotation parsing errors correctly (issue 15557)
*Fixes a regression from PR 15246, sorry about that!*

The return value of all `Annotation.getOperatorList` methods was changed in PR 15246, however I missed updating the error code-path in `Page.getOperatorList` which thus breaks all operatorList-parsing for pages with corrupt Annotations.
2022-10-10 09:48:01 +02:00
Jonas Jenwald
ce66fefbff [api-minor] Add partial support for the "GoToE" action (issue 8844)
*Please note:* The referenced issue is the only mention that I can find, in either GitHub or Bugzilla, of "GoToE" actions.
Hence why I've purposely settled for a very simple, and partial, "GoToE" implementation to avoid complicating things initially.[1] In particular, this patch only supports "GoToE" actions that references the /EmbeddedFiles-dict in the PDF document.

See https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2048909

---
[1] Usually I always prefer having *real-world* test-cases to work with, whenever I'm implementing new features.
2022-10-06 10:33:07 +02:00
Jonas Jenwald
fe5d9b4b6a Remove duplicated destroy-calls in the "custom ownerDocument" unit-tests
Given that `PDFDocumentProxy.destroy` is nothing but an alias for `PDFDocumentLoadingTask.destroy` calling both methods is obviously not useful.
2022-10-02 12:01:41 +02:00
Tim van der Meij
606fb8c394
Fix intermittent errors in the "check that first text field has focus" scripting test
This commit fixes the "Expected null to equal '401R'" errors that
surfaced after the Puppeteer 18 upgrade. Note that even before that
this would have been an improvement because it takes some time between
scripting being reported ready (i.e., triggering the execution of any
OpenActions) and those OpenActions actually completing execution, so
it's only safe to check which element is focused if we know an element
actually became focused.
2022-10-01 18:08:15 +02:00
Jonas Jenwald
7b24931f67
Merge pull request #15517 from Snuffleupagus/issue-15516
Add more non-standard ligatures in the `glyphlist.js` file (issue 15516)
2022-09-30 23:30:50 +02:00
Calixte Denizet
330048ad6b [JS] Add the function AFExactMatch 2022-09-29 14:23:56 -10:00
Jonas Jenwald
c87f90102c Add more non-standard ligatures in the glyphlist.js file (issue 15516)
Note that this PR only adds the "underscore"-variant of *actually existing* ligatures, however the referenced PDF document also uses a couple of non-standard ones (e.g. `ft`, `Th`, and `fh`) that we cannot easily support without larger changes (since they don't have official Unicode-entries).
Given that it's clearly the PDF document, and its fonts, that's the culprit here it's not entirely clear to me that we actually want to attempt a larger refactoring/rewriting of the `glyphlist.js` code, assuming it's even generally possible. Especially when this patch alone already improves our copy-paste behaviour when compared to both Adobe Reader and PDFium, and that this is only the *second* time this sort of bug has been reported.
2022-09-27 16:31:51 +02:00
calixteman
da1780f826
Merge pull request #15486 from nmtigor/fix_orders_of_prop
Fix property chain orders of Operators in isDotExpression
2022-09-25 04:13:25 -10:00
Jonas Jenwald
6538409282 Replace some Array.prototype-usage with spread syntax
We have a few, quite old, call-sites that use the `Array.prototype`-format and which can now be replaced with spread syntax instead.
2022-09-23 09:35:30 +02:00
calixteman
034017d526
Merge pull request #15494 from Snuffleupagus/issue-15492
Tweak the heuristic that handles JPEG images with a wildly incorrect SOF (Start of Frame) `scanLines` parameter (issue 15492)
2022-09-22 17:05:49 +02:00
Calixte Denizet
9e40938a29 [JS] Try to guess what the date is when it doesn't follow the given format (issue #15490)
We use the format to guess in which order we can find month, day, ... we get the numbers
in the date and consider them as month, day, ...
2022-09-22 16:30:39 +02:00
Jonas Jenwald
f1b0dc6f04 Tweak the heuristic that handles JPEG images with a wildly incorrect SOF (Start of Frame) scanLines parameter (issue 15492) 2022-09-22 14:09:04 +02:00
nmtigor
22cc9b7dc7 Fix property chain orders of Operators in isDotExpression and isSomPredicate 2022-09-21 17:20:23 +02:00
Calixte Denizet
198e9a3db1 Initialize values in the path bounding box before flushing the operator list (bug 1791583)
OperatorList.addOp can trigger a flush if it's required, hence the values passed to it must
be correctly initialized in order to avoid some wrong values in the renderer.
Because of that a clip path was considered as empty, nothing was clipped, hence the wrong
rendering in bug 1791583.
2022-09-20 20:01:54 +02:00
Calixte Denizet
f5b835157b [XFA] Fix an hidden issue in the FormCalc lexer
Since there are no script engine with XFA, the FormCalc parser is not used irl.
The bug @nmtigor noticed was hidden by another one (the wrong check on `match`).
2022-09-20 13:53:55 +02:00
Jonas Jenwald
20b9887476 Enable the unicorn/prefer-regexp-test ESLint plugin rule
Please see https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-regexp-test.md
2022-09-19 16:34:01 +02:00
Jonas Jenwald
7a19def34c Extend getSupplementalGlyphMapForCalibri with more entries (issue 15443) 2022-09-15 22:19:16 +02:00
Jonas Jenwald
2f2ecad8fd Extend getGlyphMapForStandardFonts with some quote-entries (issue 15441) 2022-09-15 11:37:20 +02:00
Jonas Jenwald
21fe5017bb Remove the abstract BaseViewer-class
After the changes in PR 14112 the `PDFViewer`-class is now "identical" to the `BaseViewer`-class and the `PDFSinglePageViewer`-class is just a very thin wrapper around the `BaseViewer`-class.
Hence we can rename these files, and also remove the abstract `BaseViewer`-class, which helps reduce some unnecessary "closures" in the *built* viewer.

*Please note:* These changes are made in two separate commits, to allow GitHub to preserve `blame` for the affected files.
2022-09-08 12:38:17 +02:00
Jonas Jenwald
6dc4c994b8 Remove the abstract BaseViewer-class
After the changes in PR 14112 the `PDFViewer`-class is now "identical" to the `BaseViewer`-class and the `PDFSinglePageViewer`-class is just a very thin wrapper around the `BaseViewer`-class.
Hence we can rename these files, and also remove the abstract `BaseViewer`-class, which helps reduce some unnecessary "closures" in the *built* viewer.

*Please note:* These changes are made in two separate commits, to allow GitHub to preserve `blame` for the affected files.
2022-09-08 12:38:17 +02:00
Jonas Jenwald
af6aacfc0e
Merge pull request #15398 from Snuffleupagus/more-optional-chaining
Use more optional chaining in the code-base
2022-09-06 20:31:03 +02:00
Jonas Jenwald
38ee28b1d3 Use more optional chaining in the code-base
This patch updates a bunch of older code, that makes conditional function calls, to use optional chaining rather than `if`-blocks.

These mostly mechanical changes reduce the size of the `gulp mozcentral` build by a little over 1 kB.
2022-09-05 15:41:53 +02:00
Jonas Jenwald
947d390421 Fallback to a standard font when a Type1 font program is empty (issue 15292)
*Please note:* This is only a, hopefully generally helpful, work-around rather than a proper solution to issue 15292.

There's something that's "special" about the Type1 fonts in the referenced PDF document, since we don't manage to find any actual font programs and thus cannot render anything.
Given that it shouldn't make sense for a Type1 font program to ever be empty, since that means that there's no glyph-data to render, we simply fallback to a standard font to at least try and render *something* in these rare cases.
2022-09-05 12:07:19 +02:00
Jonas Jenwald
9578152ae4
Merge pull request #15392 from Snuffleupagus/issue-15352
Don't allow `adjustToUnicode` to extend a built-in /ToUnicode map (issue 15352)
2022-09-04 15:12:10 +02:00
Calixte Denizet
6c6f6fb2b8 Don't replace cr by a white space when the last char on the line is an ideographic char 2022-09-04 14:21:05 +02:00
Jonas Jenwald
12d60e0acf Don't allow adjustToUnicode to extend a built-in /ToUnicode map (issue 15352)
Given that the change in PR 13393 was slightly speculative, given the lack of test-cases, let's just revert part of that to fix the referenced issue.
Based on a quick look at old issues and existing test-cases, it seems that most (if not all) PDF documents that benefit from using the font-data in this way lack any /ToUnicode maps which should mean that they're unaffected by these changes.
2022-09-03 23:11:42 +02:00
Jonas Jenwald
cc4baa2fe9 [api-minor] Add basic support for the SetOCGState action (issue 15372)
Note that this patch implements the `SetOCGState`-handling in `PDFLinkService`, rather than as a new method in `OptionalContentConfig`[1], since this action is nothing but a series of `setVisibility`-calls and that it seems quite uncommon in real-world PDF documents.

The new functionality also required some tweaks in the `PDFLayerViewer`, to ensure that the `layersView` in the sidebar is updated correctly when the optional-content visibility changes from "outside" of `PDFLayerViewer`.

---
[1] We can obviously move this code into `OptionalContentConfig` instead, if deemed necessary, but for an initial implementation I figured that doing it this way might be acceptable.
2022-09-01 17:34:24 +02:00
Jonas Jenwald
216b86a082 [api-minor] Support Named-actions in the outline (issue 15367)
Apparently this is implemented in e.g. Adobe Reader, and the specification does support it, however it cannot be commonly used in real-world PDF documents since it took over ten years for this feature to be requested.
2022-08-30 18:47:45 +02:00
Jonas Jenwald
571ce13dd6 [api-major] Remove the enhanceTextSelection functionality (PR 15145 follow-up)
For the `gulp mozcentral` command, this reduces the size of the *built* `pdf.js` file by `> 10` kB.
2022-08-28 15:04:47 +02:00
Jonas Jenwald
723584dd4f Enable the unicorn/prefer-array-find ESLint plugin rule
Please find additional information here:
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/find#browser_compatibility
 - https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-array-find.md
2022-08-24 16:46:26 +02:00
Calixte Denizet
c06c5f7cbd [Annotations] charLimit === 0 means unlimited (bug 1782564)
Changing the charLimit in JS had no impact, so this patch aims to fix
that and add an integration test for it.
2022-08-19 11:28:28 +02:00
Jonas Jenwald
0024165f1f Move binarySearchFirstItem back to the web/-folder (PR 15237 follow-up)
This was moved into the `src/display/`-folder in PR 15110, for the initial editor-a11y patch. However, with the changes in PR 15237 we're again only using `binarySearchFirstItem` in the `web/`-folder and it thus seem reasonable to move it back there.
The primary reason for moving it back is that `binarySearchFirstItem` is currently exposed in the public API, and we always want to avoid that unless it's either PDF-related functionality or code that simply must be shared between the `src/`- and `web/`-folders. In this case, `binarySearchFirstItem` is a general helper function that doesn't really satisfy either of those alternatives.
2022-08-14 11:38:17 +02:00
calixteman
6b4c2464ad
Merge pull request #15237 from calixteman/annotation_a11y
[Annotations] Add some aria-owns in the text layer to link to annotations (bug 1780375)
2022-08-12 15:04:56 +02:00
Calixte Denizet
f316300113 [Annotations] Add some aria-owns in the text layer to link to annotations (bug 1780375)
This patch doesn't structurally change the text layer: it just adds some aria-owns
attributes to some spans.
The aria-owns attribute expect to have an element id, hence it's why it adds back an
id on the element rendering an annotation, but this id is built in using crypto.randomUUID
to avoid any potential issues with the hash in the url.
The elements in the annotation layer are moved into the DOM in order to have them in the
same "order" as they visually are.
The overall goal is to help screen readers to present to the user the annotations as
they visually are and as they come in the text flow.
It is clearly not perfect, but it should improve readability for some people with visual
disabilities.
2022-08-12 14:35:26 +02:00
Jonas Jenwald
dd95e4f851 Add *official* support for passing ArrayBuffer-data to getDocument (issue 15269)
While this has always worked, as a consequence of the implementation, it's never been officially supported.
In addition to adding basic unit-tests, this patch also introduces a couple of new JSDoc `@typedef`s in the API to avoid overly long lines.
2022-08-10 14:13:01 +02:00
Calixte Denizet
04f78c935c Fix OTS issue with empty index (#15289) 2022-08-08 22:56:26 +02:00
Jonas Jenwald
f6db7975c5 Enable the ESLint prefer-spread rule
Note that in a couple of spots the argument could be `undefined` and there we simply disable the rule instead.

Please refer to https://eslint.org/docs/latest/rules/prefer-spread
2022-08-06 10:17:00 +02:00
Calixte Denizet
3c8d8f0d02 [Editor] A pasted FreeText editor was missing when printing/saving
When a FreeText editor is pasted then it hasn't an editorDiv yet when added
to the layer, hence it's empty.
So this patch just move the call to addToAnnotationStorage to ensure we've
what we need.
2022-08-04 13:00:45 +02:00
calixteman
b985eaa98c
Merge pull request #15267 from calixteman/freetext_a11y
[Annotation] Add a div containing the text of a FreeText annotation (bug 1780375)
2022-08-04 11:49:29 +02:00
Calixte Denizet
31155740c3 [Annotation] Add a div containing the text of a FreeText annotation (bug 1780375)
An annotation doesn't have to be in the text flow, hence it's likely a bad idea
to insert its text in the text layer. But the text must be visible from a screen
reader point of view so it must somewhere in the DOM.
So with this patch, the text from a FreeText annotation is extracted and added in
a div in its HTML counterpart, and with the patch #15237 the text should be visible
and positioned relatively to the text flow.
2022-08-04 11:14:05 +02:00
Calixte Denizet
6916fabd51 Skip unknown fields when calculating a value in using AFSimple_Calculate 2022-08-03 23:40:09 +02:00
Jonas Jenwald
899fc29eef Always set a border-radius for RadioButton annotations (issue 15262) 2022-08-02 13:58:20 +02:00
Jonas Jenwald
0c31320c12 [api-minor] Improve thumbnail handling in documents that contain interactive forms
To improve performance of the sidebar we use the page-canvases to generate the thumbnails whenever possible, since that avoids unnecessary re-rendering when the sidebar is open. This works generally well, however there's an old problem in PDF documents that contain interactive forms (when those are enabled): Note how the thumbnails become partially (or fully) blank, since those Annotations are not included in the OperatorList.[1]

We obviously want to keep using the `PDFThumbnailView.setImage`-method for most documents, however we need a way to skip it only for those pages that contain interactive forms.
As it turns out it's unfortunately not all that simple to tell, after the fact, from looking only at the OperatorList that some Annotations were skipped. While it might have been possible to try and infer that in the viewer, it'd not have been pretty considering that at the time when rendering finishes the annotationLayer has not yet been built.
The overall simplest solution that I could come up with, was instead to include a *summary* of the interactive form-state when doing the final "flushing" of the OperatorList and expose that information in the API.

---
[1] Some examples from our test-suite: `annotation-tx2.pdf` where the thumbnail is completely blank, and `bug1737260.pdf` where the thumbnail is missing the "buttons" found on the page.
2022-07-30 16:53:32 +02:00
Calixte Denizet
9a464b70c1 [Editor] Avoid to slightly move ink editor when undoing/redoing 2022-07-29 16:53:03 +02:00
Calixte Denizet
d092a85b6c Fix wrong order of arguments when calling the CipherTransform ctor (bug 1782186) 2022-07-29 12:46:45 +02:00
Jonas Jenwald
d7cc47d73a
Merge pull request #15235 from Snuffleupagus/annot-isUsingOwnCanvas
Ensure that the `isUsingOwnCanvas`-parameter is consistently included in operatorLists (PR 14247 follow-up)
2022-07-28 17:04:45 +02:00
Jonas Jenwald
2cad5cf45b Set opacity in the reference tests (PR 15219 follow-up)
Without these changes in the manifest, the affected test-cases fail to render correctly.
2022-07-28 14:35:09 +02:00
Jonas Jenwald
2fb083f3e2 Ensure that the isUsingOwnCanvas-parameter is consistently included in operatorLists (PR 14247 follow-up)
Currently some `OPS.beginAnnotation` arguments will contain a `Number` value for the `isUsingOwnCanvas`-parameter, or in some cases an `undefined` value, which is inconsistent from an API perspective.
2022-07-28 13:37:37 +02:00
Calixte Denizet
759116f4c5 [Editor] Avoid to add unexpected commands in the undo/redo queue when undoing/redoing (bug 1781790)
We can undo/redo a command which will at some point add a command in the queue: typically
it can happening when redoing an addition.
So the idea is to lock the queue when undoing/redoing.
2022-07-27 19:12:06 +02:00
Calixte Denizet
7831a100b3 [Editor] Add the possibility to change line opacity in Ink editor 2022-07-27 18:46:25 +02:00
Calixte Denizet
ce4144eee4 [Editor] Avoid editor creation/selection on right click (bug 1781762) 2022-07-27 17:53:22 +02:00
Jonas Jenwald
fc018ea9ea Support images with /Filter-entries that contain Arrays (issue 15220)
This patch "borrows" the code found in the `Parser.makeInlineImage`-method, to ensure that JBIG2 and JPX images can be rendered correctly.
2022-07-25 08:41:37 +02:00
Calixte Denizet
5bbe0d0782 [Editor] Unselect correctly removed editors
- After undoing a deletion of several editors, they appeared to be selected (they had a red border)
when in fact they were not, consequently, this patch aims to remove the selectedEditor class when
an editor is removed;
- Add a test with some ink editors.
2022-07-22 13:21:08 +02:00
Calixte Denizet
d6b9ca48a5 [Editor] Add the ability to make multiple selections (bug 1779582)
- several editors can be selected/unselected using ctrl+click;
- and then they can be copied, pasted, their properties can be changed.
2022-07-21 22:53:52 +02:00
Calixte Denizet
af41a5cb49 [Editor] Simplify the command manager
The previous version was maybe functional but definitely painful to maintain
(maybe more efficient... I don't know) so this patch aims to simplify it and
it adds some basic unit tests.
2022-07-21 18:44:41 +02:00
Jonas Jenwald
60bd9580e2 Ignore invalid /CIDToGIDMap-entries when parsing fonts (issue 15139)
In the referenced PDF document the fonts have /CIDToGIDMap-entries that cannot be loaded. Hence, only when `ignoreErrors` is set, we'll now ignore these corrupt /CIDToGIDMap-entries and fallback to simply assume that no such data is available.

Given that this is *clearly* a case of a corrupt PDF document, there's no guarantee that this will "fix" things in the general case since a /CIDToGIDMap may be *required* in order for some composite fonts to render correctly. However, attempting to render *something* is surely better than skipping a font altogether.
2022-07-20 11:58:44 +02:00
Jonas Jenwald
f46895d750
Merge pull request #15110 from calixteman/editing_a11y
[Editor] Improve a11y for newly added element (#15109)
2022-07-19 20:02:53 +02:00
Jonas Jenwald
98f70d87f6
Merge pull request #15174 from Snuffleupagus/more-for-of
Use more `for...of` loops in the code-base
2022-07-19 19:04:47 +02:00
Calixte Denizet
624b26e1de [Editor] Improve a11y for newly added element (#15109)
- In the annotationEditorLayer, reorder the editors in the DOM according
  the position of the elements on the screen;
- add an aria-owns attribute on the "nearest" element in the text layer
  which points to the added editor.
2022-07-19 18:52:17 +02:00
Calixte Denizet
3c17dbb43e [Editor] Use serialized data when copying/pasting
- in using the global clipboard, it'll be possible to copy from a
  pdf and paste in an other one;
- it'll allow to edit a previously created annotation;
- copy the editors in the current page.
2022-07-19 17:54:06 +02:00
Jonas Jenwald
37ebc28756 Use more for...of loops in the code-base
Note that these cases, which are all in older code, were found using the [`unicorn/no-for-loop`](https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/no-for-loop.md) ESLint plugin rule.
However, note that I've opted not to enable this rule by default since there's still *some* cases where I do think that it makes sense to allow "regular" for-loops.
2022-07-17 16:18:54 +02:00
Calixte Denizet
ad62ae561c Disable canvas acceleration for linux tests 2022-07-15 13:31:37 +02:00
Tim van der Meij
3d4476ac53
Update ttx from version 2.5 to version 3.19.0
The current version 2.5 is from September 2014, which is almost 8 years
old now. The new version 3.19.0 is from November 2017, which is still
almost 5 years old, but is a step forward towards eventually using the
most recent version. Note that we currently can't update any further;
see #11802 for the details.

Fortunately using this newer version only required a few changes:

- The `ttx` output regexes needed updating to ignore comments that `ttx`
  now puts after some XML nodes (`<!-- ... -->` and `/* ... */`).
- The `ttx` invocation now explicitly uses `python2` (except on Windows
  where this alias doesn't exist) since otherwise the font tests can't
  be run on modern systems anymore given that `python` is nowadays an
  alias for `python3`, and it now points at the new location of the
  `ttx.py` file since the `Tools` folder got removed.
- The note about needing a 32-bit Python 2.6 version is dropped since
  it's obsolete: this version (and also the existing one already) work
  just fine on a 64-bit Python 2.7 as well.
2022-07-10 21:18:36 +02:00
calixteman
2b6a67c5d0
Merge pull request #15153 from calixteman/1778692
[Annotation] A push button can have no action (bug 1778692)
2022-07-08 21:06:53 +02:00
Calixte Denizet
8f26ba5487 [Annotation] A push button can have no action (bug 1778692) 2022-07-08 15:39:56 +02:00
Jonas Jenwald
c2f7942aea Ensure that the /Resources-entry is actually a dictionary (issue 15150)
Prevent issues in *corrupt* PDF documents, if the /Resources-entry is not of the correct and expected type.
2022-07-08 12:43:43 +02:00
Jonas Jenwald
345bb18575 [editor] Use the fit-curve package (issue 15004)
Rather than including all of this external code in the PDF.js repository, we should be using the npm package instead.
Unfortunately this is slightly more complicated than you'd hope, since the `fit-curve` package (which is older) isn't directly compatible with modern JavaScript modules.
In particular, the following cases needed to be considered:
 - For the development viewer (i.e. `gulp server`) and the unit-tests, we thus need to build a fitCurve-bundle that can be directly `import`ed.
 - For the actual PDF.js build-targets, we can slightly reduce the sizes by depending on the "raw" `fit-curve` source-code.
 - For the Node.js unit-tests, the `fit-curve` package can be used as-is.
2022-07-07 10:43:43 +02:00
Jonas Jenwald
79cfc548fc Improve text-selection for Type3 fonts with bogus /FontBBox-entries (issue 14999)
This extends PR 13461, by also building a fallback bounding box for Type3 fonts that contain a much too small /FontBBox-entry.

*Please note:* While this patch improves things overall, copy-and-pasting still doesn't work perfectly for this document. In particular the lowercase letter "c" cannot be selected/copied, however this can be reproduced in both Adobe Reader and PDFium (in Google Chrome) too, which is caused by a lack of proper /ToUnicode-data in the PDF document.
2022-07-05 14:27:14 +02:00
Jonas Jenwald
552ee9decd Call AnnotationLayer.setDimensions as part of the render/update-methods (PR 15036 follow-up)
Rather than forcing the user to *manually* call `setDimensions`, which is also breaking any existing third-party code, it seems that we can simply let the `AnnotationLayer.{render, update}`-methods handle that internally.

As far as I can tell, based on testing manually in the viewer *and* running the browser-tests, everything still appears to work correctly with this patch.
2022-07-04 12:27:20 +02:00
Calixte Denizet
1a3ef2a0aa [editor] Add some UI elements in order to set font size & color, and ink thickness & color 2022-06-28 12:05:04 +02:00
Calixte Denizet
3789dab307 Always flush the current item with MarkedContent stuff when getting text (#15094) 2022-06-25 17:19:57 +02:00
calixteman
23fcdabb37
Merge pull request #15088 from calixteman/editor_rotation
Support rotating editor layer
2022-06-25 16:18:07 +02:00
Jonas Jenwald
c0f65657a2 Use the *built* components/pdf_viewer.css file in the reference tests
Currently we're loading the `web/annotation_layer_builder.css` and `web/xfa_layer_builder.css` files *directly* during the reference tests.
This becomes a problem is we want to reduce duplication in the CSS-files, e.g. by placing *common* rules in the `web/pdf_viewer.css` file.

Given that `gulp components` is already being utilized when running tests, we can thus use that to instead depend on the *entire* viewer-components CSS-file in the reference tests.
2022-06-25 09:54:05 +02:00
Calixte Denizet
0c420f5135 Support rotating editor layer
- As in the annotation layer, use percent instead of pixels as unit;
- handle the rotation of the editor layer in allowing editing when rotation
  angle is not zero;
- the different editors are rotated counterclockwise in order to be usable
  when the main page is itself rotated;
- add support for saving/printing rotated editors.
2022-06-24 20:02:32 +02:00
Jonas Jenwald
cd35b9bfac
Merge pull request #15095 from Snuffleupagus/Annotation-OC
Add (basic) support for Optional Content in Annotations
2022-06-24 19:10:11 +02:00
Calixte Denizet
6e46226cd7 Fix unit test (#15093 follow-up) 2022-06-24 18:55:35 +02:00
calixteman
b5fea8ff14
Merge pull request #15093 from calixteman/issue15092
[JS] Update siblings when a field is updated after a calculation (#15092)
2022-06-24 16:17:59 +02:00
Jonas Jenwald
c48dc251e0 Add (basic) support for Optional Content in Annotations
Given that Annotations can also have an `OC`-entry, we need to take that into account when generating their operatorLists.

Note that in order to simplify the patch the `getOperatorList`-methods, for the Annotation-classes, were converted to be `async`.
2022-06-24 15:19:56 +02:00
Calixte Denizet
a334a21a1d [JS] Update siblings when a field is updated after a calculation (#15092) 2022-06-24 14:23:06 +02:00
Jonas Jenwald
3fab4af949
Merge pull request #15043 from Snuffleupagus/PrintAnnotationStorage
[api-minor] Introduce a `PrintAnnotationStorage` with *frozen* serializable data
2022-06-24 09:30:07 +02:00
Calixte Denizet
e49d039853 Correctly order added annotations when saving or printing
- the annotations must be rendered in the same order as the chronological one.
- fix a bug in document.js which avoids to read a saved pdf correctly in Acrobat:
  there is no need to reset the xref state: it's done in worker.js once everything
  has been saved.
2022-06-23 17:39:12 +02:00
Jonas Jenwald
1cc7cecc7b [api-minor] Introduce a PrintAnnotationStorage with *frozen* serializable data
Given that printing is triggered *synchronously* in browsers, it's thus possible for scripting (in PDF documents) to modify the Annotation-data while printing is currently ongoing.
To work-around that we add a new printing-specific `AnnotationStorage`, where the serializable data is *frozen* upon initialization, which the viewer can thus create/utilize during printing.
2022-06-23 17:06:46 +02:00
Calixte Denizet
30c63eb0ec [Editor] Add support for printing newly added FreeText annotations 2022-06-22 13:26:09 +02:00
Calixte Denizet
f27c8c4471 [Editor] Add support for printing newly added Ink annotations 2022-06-21 18:21:49 +02:00
calixteman
8d466f5dac
Merge pull request #15060 from calixteman/annotation_rotation
Rotate annotations based on the MK::R value (bug 1675139)
2022-06-21 18:03:09 +02:00
Calixte Denizet
cdc58b7a52 Rotate annotations based on the MK::R value (bug 1675139)
- it aims to fix: https://bugzilla.mozilla.org/show_bug.cgi?id=1675139;
- An annotation can be rotated (counterclockwise);
- the rotation can be set in using JS.
2022-06-21 17:57:26 +02:00
Calixte Denizet
fd209d9685 Add an outline around popup trigger areas in ref-tests
- The goal is to avoid any future regressions.
2022-06-20 21:23:30 +02:00
Calixte Denizet
e2db9bacef Get rid of CSS transform on each annotation in the annotation layer
- each annotation has its coordinates/dimensions expressed in percentage,
  hence it's correctly positioned whatever the scale factor is;
- the font sizes are expressed in percentage too and the main font size
  is scaled thanks a css var (--scale-factor);
- the rotation is now applied on the div annotationLayer;
- this patch improve the rendering of some strings where the glyph spacing
  was not correct (it's a Firefox bug);
- it helps to simplify the code and it should slightly improve the update of
  page (on zoom or rotation).
2022-06-18 17:54:59 +02:00
Jonas Jenwald
03757d82b7 Replace element ids with custom attributes for Widget-annotations (issue 15056)
We want to avoid adding regular `id`s to Annotation-elements, since that means that they become "linkable" through the URL hash in a way that's not supported/intended. This could end up clashing with "named destinations", and that could easily lead to bugs; see issue 11499 and PR 11503 for some context.

Rather than using `id`s, we'll instead use a *custom* `data-element-id` attribute such that it's still possible to access the Annotation-elements directly.
Unfortunately these changes required updating most of the integration-tests, and to reduce the amount of repeated code a couple of helper functions were added.
2022-06-18 16:43:05 +02:00
Calixte Denizet
7e3941da9d [JS] Hide field borders and buttons (#15053)
- Since the border belongs to the section containing the HTML
  counterpart of an annotation, this section must be hidden when
  a JS action requires it;
- it wasn't possible to hide a button in using JS.
2022-06-17 17:36:38 +02:00
Jonas Jenwald
64cce1269e Add basic support for non-embedded ArialUnicodeMS fonts (issue 15044)
This appears to be a Microsoft-specific version of the regular Arial font, hence we simply map this to Helvetica in the same way that we treat many other Arial-named fonts.
2022-06-15 10:37:20 +02:00
Jonas Jenwald
2dca14028d Extend getGlyphMapForStandardFonts with some Hebrew entries (issue 15033)
This only adds the minimum entries required in order to render the referenced document correctly, rather than trying to support "all" Hebrew glyphs, to ensure that all lines in `getGlyphMapForStandardFonts` are covered by tests.
2022-06-13 10:08:39 +02:00
Jonas Jenwald
8129815538 Enable the unicorn/prefer-dom-node-append ESLint plugin rule
This rule will help enforce slightly shorter code, especially since you can insert multiple elements at once, and according to MDN `Element.append()` is available in all browsers that we currently support.

Please find additional information here:
 - https://developer.mozilla.org/en-US/docs/Web/API/Element/append
 - https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-dom-node-append.md
2022-06-12 13:07:03 +02:00
Jonas Jenwald
bbf857d635 [api-minor] Stop using the beginAnnotations/endAnnotations operators (PR 14998 follow-up)
After the changes in PR 14998, these operators are now no-ops in the `src/display/canvas.js` code and should no longer be necessary.
Given that `beginAnnotations`/`endAnnotations` are not in the PDF specification, but are rather *custom* PDF.js operators, it seems reasonable to stop using them now that they've become no-ops.
2022-06-11 14:21:26 +02:00
Tim van der Meij
a57a4bc6c2
Merge pull request #15018 from Snuffleupagus/issue-15016
Expose `TextLayerRenderTask` in the TypeScript definitions (issue 15016, PR 14013 follow-up)
2022-06-10 22:18:35 +02:00
Jonas Jenwald
e046b811b7 Expose TextLayerRenderTask in the TypeScript definitions (issue 15016, PR 14013 follow-up)
While `TextLayerRenderTask` apparently makes sense in TypeScript environments, given that it's being returned by the `renderTextLayer`-function in the API, we really don't want to extend the *public* API by simply exporting the class directly in `src/pdf.js` since it should never be called/initialized manually.
Hence we follow the same pattern as in PR 14013, and add some very basic unit-tests to ensure that `renderTextLayer` always returns a `TextLayerRenderTask`-instance as expected.
2022-06-10 22:12:32 +02:00
Jonas Jenwald
9ac4536693 Enable the unicorn/prefer-at ESLint plugin rule (PR 15008 follow-up)
Please find additional information here:
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/at
 - https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-at.md
2022-06-09 21:21:19 +02:00
calixteman
5d88233fbb
Merge pull request #15006 from calixteman/ink2
[editor] Add support for saving newly added Ink
2022-06-09 21:13:53 +02:00
Jonas Jenwald
3d244cb6a8 Render PopupAnnotations even if they have missing or empty /Rect-entries (issue 15012, PR 14439 follow-up)
This only applies to *corrupt* PDF documents, where Annotations are missing the required /Rect-entry. Rendering PopupAnnotations unconditionally shouldn't be a problem, since we're not using a `BaseSVGFactory`-instance in that case.
2022-06-09 15:10:54 +02:00
Calixte Denizet
36aae436bf [editor] Add support for saving newly added Ink 2022-06-08 22:16:01 +02:00
calixteman
2fbf14ace8
Merge pull request #14978 from calixteman/editor2
[editor] Add support for saving a newly added FreeText
2022-06-08 15:51:03 +02:00
Calixte Denizet
7773b3f5be [edition] Add support for saving a newly added FreeText 2022-06-08 14:34:09 +02:00
Calixte Denizet
2dd0c861bf Outline fields which are required (bug 1724918)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1724918;

- it applies for both Acroform and XFA.
2022-06-07 17:02:11 +02:00
Calixte Denizet
96d0d22d66 Reset all the canvas states after rendering each annotations (#14105)
- each annotation must be rendered independently of the others. So
  after having rendered each annotation, the canvas states are reset
  in order to have something clean to render the next one.
2022-06-07 14:59:02 +02:00
Jonas Jenwald
59dd4ea2b0 Lookup image-data correctly in paintImageMaskXObjectGroup (issue 14990)
*This fixes a regression from PR 14754.*

We didn't lookup the image-data correctly, with the result that we tried to render some ImageMasks using a string rather than the intended TypedArray. To make matters worse, this code-path was apparently not *properly* covered by existing test-cases.
2022-06-05 12:39:23 +02:00
Calixte Denizet
be1aa11986 [edition] Add a FreeText editor (#14970)
- add a basic UI to edit some text in a pdf;
- an editor can be moved, suppressed, cut, copied, pasted, selected;
- add an undo/redo manager.
2022-06-04 18:20:11 +02:00
Calixte Denizet
66b513fc00 [Annotations] Show buttons even if they've no actions
- it's a regression from PR #14247:
 - before the PR, the button was rendered on the canvas whatever its status was;
 - after the PR, the button image has been moved in an other canvas so when the button is not renderable
   (because it has no actions) then the image is not added the HTML element.
- the buttons in the pdf in bug 1737260 or in the pdf in #14308 were not visible
- make the button always renderable but don't add the link element if it's useless.
2022-05-28 23:50:50 +02:00
Calixte Denizet
9d82106d20 Set the text fields font size based on their height
- right now we're using the font size from the pdf itself but we use an other font
  in the annotation layer. So this size doesn't really make sense and leads to bad
  rendering (see pdf in #14928);
- use a sans-serif font for the fields containing text (fix issue #14736);
- remove useless padding in text-based fields (fix issue #14301);
- text fields allow/disallow scrolling bars (see bit 24 in Ff entry), so use this
  value to hide/show scrollbars in annotation layer.
2022-05-28 18:00:39 +02:00
Calixte Denizet
c7afce4210 Support Hangul syllables when searching some text (bug 1771477)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1771477;
- hangul contains some syllables which are decomposed when using NFD, hence
  the text must be correctly shifted in case it contains some of them.
2022-05-28 16:50:03 +02:00
Jonas Jenwald
5a2899c57e Skip bogus d1 operators in Type3-glyphs (issue 14953)
In the `src/display/canvas.js` code the `d1` operator will be used to set the clipping region, and it obviously cannot be empty since that prevents the Type3-glyph from rendering.

Also, the patch removes an outdated comment; refer to PR 12718.
2022-05-24 12:20:31 +02:00
Calixte Denizet
9407adc416 [JS] Format all the fields if any when the document is open (bug 1766987)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1766987.
2022-05-22 15:50:42 +02:00
Jonas Jenwald
5b67202851
Merge pull request #14929 from calixteman/14928
Display background when printing or saving a text widget (issue #14928)
2022-05-20 18:11:10 +02:00
Calixte Denizet
0a66e1f5ea Allow to have float numbers when getting image information in reftest-analyzer
- outputScale can be e.g. 1.5 in real life.
2022-05-20 13:02:22 +02:00
Calixte Denizet
60498c67e4 Display background when printing or saving a text widget (issue #14928) 2022-05-19 16:41:54 +02:00
Jonas Jenwald
5a774b7ed3 Adjust the heuristics for handling of incomplete path operators (issue 14917)
This limits the heuristics for handling of incomplete path operators, see PR 9838, to only apply to *sequences* of such operators. In practice a couple of invalid path operators are (hopefully) unlikely to completely break rendering, whereas a sequence of them will easily lead to fairly chaotic rendering artifacts.
2022-05-15 11:24:39 +02:00
Tim van der Meij
fbc7981c98
Merge pull request #14894 from Snuffleupagus/rm-mozOpaque
Try to remove the `mozOpaque` canvas-property (PR 6551 follow-up)
2022-05-12 22:03:05 +02:00
Jonas Jenwald
6bcc5b615d [api-minor] Include line endings in Line/Polyline Annotation-data (issue 14896)
Please refer to:
 - https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2109792
 - https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2096489
 - https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2096447

Note that we still won't attempt to use the /LE-data when creating fallback appearance streams, as mentioned in PR 13448, since custom line endings aren't common enough to warrant the added complexity.
Finally, note that according to the PDF specification we should *potentially* also take the line endings into account for FreeText Annotations. However, in that case their use is conditional on other parameters that we currently don't support.
2022-05-12 11:08:30 +02:00
Jonas Jenwald
af5789125f Try to remove the mozOpaque canvas-property (PR 6551 follow-up)
According to MDN, see https://developer.mozilla.org/en-US/docs/Web/API/HTMLCanvasElement/mozOpaque, the `mozOpaque` canvas-property is not only non-standard (obviously) but it's also been deprecated.
Instead it's recommended to use `alpha = false` when getting the canvas-context, see https://developer.mozilla.org/en-US/docs/Web/API/HTMLCanvasElement/getContext#contextattributes, which all of our affected code is already doing.
2022-05-09 13:03:08 +02:00
Jonas Jenwald
6e7e9d83d8 Add support for TrueType format 12 cmaps (issue 14881)
This is, as far as I can tell, the first case we've seen of a format 12 `cmap`.
Please see https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6cmap.html
2022-05-06 11:11:38 +02:00
calixteman
cfac6fa511
Merge pull request #14874 from calixteman/colors
[api-minor] Improve pdf reading in high contrast mode
2022-05-05 21:48:19 +02:00
Calixte Denizet
c8afd6ce8c [api-minor] Improve pdf reading in high contrast mode
- Use Canvas & CanvasText color when they don't have their default value
  as background and foreground colors.
- The colors used to draw (stroke/fill) in a pdf are replaced by the bg/fg
  ones according to their luminance.
2022-05-05 16:34:51 +02:00
Jonas Jenwald
8267fd8a52 Replace the AnnotationStorage.lastModified-getter with a proper hash-method
The current `lastModified`-getter, which only contains a time-stamp, is a fairly crude way of detecting if the stored data has actually been changed. In particular, when the `getRawValue`-method is used, the `lastModified`-getter doesn't cope with data being modified from the "outside".

To fix these issues[1], and to prevent any future bugs in this code, this patch introduces a new `AnnotationStorage.hash`-getter which computes a hash of the currently stored data. To simplify things this re-uses the existing `MurmurHash3_64`-implementation, which required moving that file into the `src/shared/`-folder, since its performance should be good enough here.

---
[1] Given how the `AnnotationStorage.lastModified`-getter was used, this would have been limited to *printing* of forms.
2022-05-04 15:21:30 +02:00
Jonas Jenwald
8135d7ccf6
Merge pull request #14869 from calixteman/14862
[JS] Fix few bugs present in the pdf for issue #14862
2022-05-03 18:31:31 +02:00
Calixte Denizet
6b866a5efa Add a small delay when typing in some integration tests to avoid intermittent failures 2022-05-03 16:25:29 +02:00
Calixte Denizet
094ff38da0 [JS] Fix few bugs present in the pdf for issue #14862
- since resetForm function reset a field value a calculateNow is consequently triggered.
  But the calculate callback can itself call resetForm, hence an infinite recursive loop.
  So basically, prevent calculeNow to be triggered by itself.
- in Firefox, the letters entered in some fields were duplicated: "AaBb" instead of "AB".
  It was mainly because beforeInput was triggering a Keystroke which was itself triggering
  an input value update and then the input event was triggered.
  So in order to avoid that, beforeInput calls preventDefault and then it's up to the JS to
  handle the event.
- fields have a property valueAsString which returns the value as a string. In the
  implementation it was wrongly used to store the formatted value of a field (2€ when the user
  entered 2). So this patch implements correctly valueAsString.
- non-rendered fields can be updated in using JS but when they're, they must take some properties
  in the annotationStorage. It was implemented for field values, but it wasn't for
  display, colors, ...
- it fixes #14862 and #14705.
2022-05-03 15:48:44 +02:00
Jonas Jenwald
df5a4fd0a7 Support encoded dest-strings in /GoTo destination dictionaries (issue 14864)
Interestingly enough this appears to be the very first case of *encoded* dest-strings, in /GoTo destination dictionaries, that we've actually come across. What's really fascinating is that it's less than a week after issue 14847, given that these issues are *somewhat* similar.
2022-05-02 10:14:32 +02:00
calixteman
b10b8dad7d
Merge pull request #14853 from calixteman/white_lines
Use integer coordinates when drawing images (bug 1264608, issue #3351)
2022-04-29 18:15:03 +02:00
Calixte Denizet
624d8a8e3e Use integer coordinates when drawing images (bug 1264608, issue #3351)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1264608;
- it's only a partial fix for #3351;
- some tiled images have some spurious white lines between the tiles.
  When the current transform is applyed the corners of an image can have
  some non-integer coordinates leading to some extra transparency added
  to handle that. So with this patch the current transform is applied on the
  point and on the dimensions in order to have at the end only integer values.
2022-04-29 16:01:34 +02:00
Jonas Jenwald
fbf6dee8ee [api-minor] Remove the forceClamped-functionality in the Streams (issue 14849)
As it turns out, most of the code-paths in the `PDFImage`-class won't actually pass the TypedArray (containing the image-data) to the `ColorSpace`-code. Hence we *generally* don't need to force the image-data to be a `Uint8ClampedArray`, and can just as well directly use a `Uint8Array` instead.

In the following cases we're returning the data without any `ColorSpace`-parsing, and the exact TypedArray used shouldn't matter:
 - b72a448327/src/core/image.js (L714)
 - b72a448327/src/core/image.js (L751)

In the following cases the image-data is only used *internally*, and again the exact TypedArray used shouldn't matter:
 - b72a448327/src/core/image.js (L762) with the actual image-data being defined (as `Uint8ClampedArray`) further below
 - b72a448327/src/core/image.js (L837)

*Please note:* This is tagged `api-minor` because it's API-observable, given that *some* image/mask-data will now be returned as `Uint8Array` rather than using `Uint8ClampedArray` unconditionally. However, that seems like a small price to pay to (slightly) reduce memory usage during image-conversion.
2022-04-29 14:46:30 +02:00
Jonas Jenwald
71370d012b Support destinations in NameTrees with encoded keys (issue 14847)
Initially I considered updating the `NameOrNumberTree`-implementation to handle encoded keys, however that quickly became somewhat messy (especially in the `NameOrNumberTree.get`-method) since only NameTrees using string-keys.
Hence the easiest solution, as far as I'm concerned, was thus to just update the `Catalog.destinations`-getter instead. Please note that in the referenced PDF document the `Catalog.destination`-method will thus fallback to fetch all destinations, which should be fine since this is the very first case of encoded keys that we've seen.

Also changes the `NameOrNumberTree.getAll`-method to prevent a possible run-time error, although we've so far not seen such a case, for any non-Array Kids-entries found in a NameTree/NumberTree.

Finally, to improve overall consistency and to hopefully prevent future bugs, the patch also updates a couple of other `NameTree` call-sites to correctly handle encoded keys. (Note that the `Catalog.attachments`-getter was already doing this.)
2022-04-27 11:19:55 +02:00
Calixte Denizet
314fd83bba Don't use pref 'browser.download.improvements_to_download_panel' in Firefox (#14822) 2022-04-25 15:05:43 +02:00
Tim van der Meij
752dee5caa
Merge pull request #14825 from Snuffleupagus/issue-14824
Ensure that worker-thread image caching doesn't break optional content (issue 14824)
2022-04-23 13:19:56 +02:00
Tim van der Meij
f9e54d9226
Merge pull request #14823 from Snuffleupagus/issue-14821
Ignore invalid /Encoding-entries when parsing fonts (issue 14821)
2022-04-23 13:19:26 +02:00
Jonas Jenwald
6c229dffb1 Ensure that worker-thread image caching doesn't break optional content (issue 14824)
Currently we only insert optionalContent-data into the operatorList the first time that an image is parsed, which will (in hindsight) obviously cause problems for cached images.
Hence we also need to insert the optionalContent-data in the various worker-thread image caches, such that it can be accessed in the fast-paths that are used to skip re-parsing of images.

In order to reduce the amount of repeated code, this patch also adds a new `OperatorList`-method that takes care of inserting the necessary data in the operatorList.
2022-04-22 14:49:16 +02:00
Jonas Jenwald
e723da7261 Ignore invalid /Encoding-entries when parsing fonts (issue 14821)
In the referenced PDF document the fonts have /Encoding-entries that are Streams (containing completely bogus data), which are thus obviously not valid here.
Hence, only when `ignoreErrors` is set, we'll now ignore these corrupt /Encoding-entries and fallback to the existing code to try and infer a usable encoding.

Given that this is *clearly* a case of corrupt PDF documents, there's no guarantee that this will "fix" all such cases, however it's the best that we do here and shouldn't really be worse than ignoring an entire font.
2022-04-22 11:49:03 +02:00
Jonas Jenwald
39d1bdde09 Ignore non-Stream /SMask-entries when parsing images (issue 14814)
This is similar to the pre-existing check used in the /Mask-case below, to handle *corrupt* PDF documents that include non-Stream /SMask-entries in images; please refer to the PDF specification:
https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=216

*Please note:* Adobe Reader also fails to render the image on the second page, and displays an error message.
2022-04-21 12:14:08 +02:00
Jonas Jenwald
5bc7339c1b Add support for the /Catalog Base-URI when resolving URLs (issue 14802)
As far as I can tell, this is actually the very first time that we've seen a PDF document with a Base-URI specified in the /Catalog; please refer to the specification:
https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2097122

To simplify the overall implementation, this new parameter is accessed via the existing `BasePdfManager.docBaseUrl`-getter and will thus override any user-specified `docBaseUrl` API-parameter.
2022-04-19 17:14:52 +02:00
Calixte Denizet
3d74d2c6cb Don't clip when the clip path is empty (issue #12306) 2022-04-18 10:33:44 +02:00
Calixte Denizet
f62d961dfe Improve performances with image masks (bug 857031)
- it's the second part of the fix for https://bugzilla.mozilla.org/show_bug.cgi?id=857031;
- some image masks can be used several times but at different positions;
- an image need to be pre-process before to be rendered:
  * rescale it;
  * use the fill color/pattern.
- the two operations above are time consuming so we can cache the generated canvas;
- the cache key is based on the current transform matrix (without the translation part)
  and the current fill color when it isn't a pattern.
- the rendering of the pdf in the above bug is really faster than without this patch.
2022-04-16 20:48:39 +02:00
Calixte Denizet
040fcae5ab Improve performance with image masks (bug 857031)
- it aims to partially fix performance issue reported: https://bugzilla.mozilla.org/show_bug.cgi?id=857031;
- the idea is too avoid to use byte arrays but use ImageBitmap which are a way faster to draw:
  * an ImageBitmap is Transferable which means that it can be built in the worker instead of in the main thread:
    - this is achieved in using an OffscreenCanvas when it's available, there is a bug to enable them
      for pdf.js: https://bugzilla.mozilla.org/show_bug.cgi?id=1763330;
    - or in using createImageBitmap: in Firefox a task is sent to the main thread to build the bitmap so
      it's slightly slower than using an OffscreenCanvas.
  * it's transfered from the worker to the main thread by "reference";
  * the byte buffers used to create the image data have a very short lifetime and ergo the memory used is globally
    less than before.
- Use the localImageCache for the mask;
- Fix the pdf issue4436r.pdf: it was expected to have a binary stream for the image;
- Move the singlePixel trick from operator_list to image: this way we can use this trick even if it isn't in a set
  as defined in operator_list.
2022-04-09 18:26:26 +02:00
Jonas Jenwald
36a289d747
Merge pull request #14735 from calixteman/14685
[Annotations] Some annotations can have their values stored in the xfa:datasets
2022-04-01 11:30:16 +02:00
Calixte Denizet
0b597304c1 [Annotations] Some annotations can have their values stored in the xfa:datasets
- it aims to fix #14685;
- add a basic object to get values from the parsed datasets;
- these annotations don't have an appearance so we must create one when printing or saving.
2022-04-01 10:28:04 +02:00
Jonas Jenwald
addb4cb12b Use String.prototype.repeat() in a couple of spots
Rather than using a temporary Array to manually create repeated strings, we can use `String.prototype.repeat()` instead.
The reason that we didn't use this from the start is most likely because some browsers, notably IE, didn't support this; note https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/repeat#browser_compatibility
2022-03-30 15:42:40 +02:00
Calixte Denizet
ad3fb71a02 [Annotations] Add support for printing/saving choice list with multiple selections
- it aims to fix issue #12189.
2022-03-29 18:59:44 +02:00
Calixte Denizet
18e79e3c0b [text selection] Add the whitespaces present in the pdf in the text chunk
- it aims to fix issue #14627;
- the basic idea of the recent text refactoring was to only consider the rendered visible whitespaces.
  But sometimes, the heuristics aren't correct and although some whitespaces are in the text stream
  they weren't in the text chunks because they were too small. Hence we added some exceptions, for example,
  we always add a whitespace when it is between two non-whitespace chars but only when in the same Tj.
  So basically, this patch removes the constraint to have the chars in the same Tj
  (in using a circular buffer to save the two last chars) but don't add a space when the visible space is really
  too small (hence `NOT_A_SPACE_FACTOR`).
2022-03-27 14:34:56 +02:00
Jonas Jenwald
849de5a508 Slightly improve validation of (some) parameters in getDocument
There's a couple of `getDocument` parameters that should be numbers, but which are currently not *fully* validated to prevent issues elsewhere in the code-base.
Also, improves validation of the `ownerDocument` parameter since we currently accept more-or-less anything here.
2022-03-21 13:32:17 +01:00
Calixte Denizet
f0b549c2a2 [JS] - Parse a date in using the given format first and then try the default date parser
- it aims to fix #14672.
2022-03-19 16:07:43 +01:00
Jonas Jenwald
c0736647f9 Add general iteration support in the RefSet and RefSetCache classes
This patch removes the existing `forEach` methods, in favor of making the classes properly iterable instead. Given that the classes are using a `Set` respectively a `Map` internally, implementing this is very easy/efficient and allows us to simplify some existing code.
2022-03-18 14:27:34 +01:00
Jonas Jenwald
fb345ee184 Enable the "gets fieldObjects" unit-test in Node.js (PR 14409 follow-up)
Apparently this unit-test works in Node.js now, hence it's *possible* that the reason it didn't work previously is that there were bugs in our old `structuredClone` polyfill.
2022-03-13 10:40:57 +01:00
Tim van der Meij
bcf453cf14
Merge pull request #14656 from Snuffleupagus/mv-isSameOrigin
Move the `isSameOrigin` helper function
2022-03-11 21:08:49 +01:00
Jonas Jenwald
537ed37835 Move the isSameOrigin helper function
This function is currently placed in the `src/shared/util.js` file, which means that the code is duplicated in both of the *built* `pdf.js` and `pdf.worker.js` files. Furthermore, it only has a single call-site which is also specific to the `GENERIC`-build of the PDF.js library.

Hence this helper function is instead moved into the `src/display/api.js` file, in such a way that it's conditionally defined but still can be unit-tested.
2022-03-10 13:51:09 +01:00
Jonas Jenwald
e08e3f4d37 Replace XMLHttpRequest usage with the Fetch API in send (in test/unit/testreporter.js)
Besides converting the `send` function to use the Fetch API, this patch also changes the method to return a `Promise` to get rid of the callback function. (Although, currently there's no call-site passing in a callback function.)
2022-03-10 12:55:08 +01:00
Tim van der Meij
ee39499a5a
Merge pull request #14651 from Snuffleupagus/Driver-inlineImages-fetch
Replace XMLHttpRequest usage with the Fetch API in `inlineImages` (in `test/driver.js`)
2022-03-09 20:47:38 +01:00
Jonas Jenwald
b3f4758183 Replace XMLHttpRequest usage with the Fetch API in inlineImages (in test/driver.js)
This is the final part in a series of patches that try to re-implement PR 14287 in smaller steps.

Besides converting `inlineImages` to use the Fetch API, this patch also combines the `inlineImages` and `resolveImages` functions since they are always used together.
2022-03-09 11:32:51 +01:00
Jonas Jenwald
19c2cc8689 Replace XMLHttpRequest usage with the Fetch API in Driver._send
This is another part in a series of patches that try to re-implement PR 14287 in smaller steps.

Besides converting `Driver._send` to use the Fetch API, this also changes the method to return a `Promise` to get rid of the callback function.
Please note that I *purposely* try to maintain the existing behaviour of re-sending the data on failure/unexpected response, including how/where the old callback function was invoked.
2022-03-07 16:00:52 +01:00
Tim van der Meij
3e593cfc1d
Merge pull request #14636 from Snuffleupagus/Driver-quit-fetch
Replace XMLHttpRequest usage with the Fetch API in `Driver._quit`
2022-03-06 18:26:55 +01:00
Jonas Jenwald
90445679e8 Replace XMLHttpRequest usage with the Fetch API in the reftest-analyzer 2022-03-06 16:02:34 +01:00
Jonas Jenwald
65d5974192 Replace XMLHttpRequest usage with the Fetch API in Driver._quit
This is another step in what'll hopefully become a series of patches to implement PR 14287 in smaller steps.
2022-03-06 15:36:48 +01:00
Jonas Jenwald
62e0939ce2 Replace XMLHttpRequest usage with the Fetch API in loadStyles (in test/driver.js)
This is another small step in what'll hopefully become a series of patches to implement PR 14287 in smaller steps.
2022-03-06 13:57:42 +01:00