2916 Commits

Author SHA1 Message Date
Calixte Denizet
9f95a14e91 [Form] Don't use field appearances when /NeedAppearances is set to true (bug 1796741)
When a form isn't changed, we used the appearances we had in the file, but when
/NeedAppearances is true, all the appearances have to be regenerated whatever they are.
2022-10-26 12:10:51 +02:00
Jonas Jenwald
71bd8b4de9 Let Lexer.getNumber treat more invalid "numbers" as zero (issue 15604)
In the referenced PDF document there are "numbers" which consist only of `-.`, and while that's obviously not valid Adobe Reader seems to handle it just fine.
Letting this method ignore more invalid "numbers" was suggested during the review of PR 14543, so let's simply relax the validation here.
2022-10-20 22:36:15 +02:00
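The relaxed validation described in commit 71bd8b4de9 can be illustrated with a small stand-alone sketch. This is not the actual `Lexer.getNumber` code: `parseLenientNumber` is a hypothetical helper that works on a string token, and it assumes that any token made up solely of a sign and/or a decimal point should evaluate to zero.

```javascript
// Hypothetical helper, not part of PDF.js: parse a numeric token leniently.
function parseLenientNumber(token) {
  // Tokens such as "-.", "-", "." or "+" contain no digits at all;
  // this sketch treats them as zero instead of rejecting them.
  if (/^[+-]?\.?$/.test(token)) {
    return 0;
  }
  const value = Number(token);
  if (Number.isNaN(value)) {
    // Anything else that is not numeric is still rejected.
    throw new Error(`Invalid number: ${token}`);
  }
  return value;
}
```

With this sketch, `parseLenientNumber("-.")` yields zero, while a token such as `"abc"` is still rejected.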
Jonas Jenwald
e591378ff1 Restore a weaker version of the /Pages dictionary /Count check for corrupt documents (PR 15593 follow-up)
It appears that PR 15593 broke `issue12402`, and we thus need to partially restore the /Count check.
I completely missed this when looking at the test-results for PR 15593, both locally and on the bots, since the `Driver._getLastPageNumber` method would "swallow" an unavailable page number.
2022-10-20 14:22:29 +02:00
Calixte Denizet
6db9cefaaf [Annotation] Replace use of id by data-element-id to have the correct id 2022-10-19 23:36:28 +02:00
Jonas Jenwald
3c046c0a21 Extend getSupplementalGlyphMapForCalibri with some umlauts (issue 15594) 2022-10-19 17:49:40 +02:00
Jonas Jenwald
bc13a277ce Relax the /Pages dictionary /Count check for corrupt documents (issue 9105)
After PR 14311, and follow-up patches, we no longer require that the /Count entry (in the /Pages dictionary) is either present or even valid in order to parse/render a PDF document.
Hence it seems strange to keep this requirement for *corrupt* PDF documents, when trying to find a usable `trailer` in the `XRef.indexObjects` method.
2022-10-19 12:28:25 +02:00
Jonas Jenwald
de99f99a01 Fallback and try a *previous* generation if all else fails in XRef.indexObjects (issue 15577)
When we fail to find a usable PDF document `trailer` *and* there were errors during parsing, try to fall back to a *previous* generation as a last resort during fetching of uncompressed references.
*Please note:* This will not affect "normal" PDF documents, with valid /XRef data, and even most *corrupt* documents should be completely unaffected by these changes.
2022-10-18 20:24:01 +02:00
Jonas Jenwald
7bd484ebd3 Update npm packages 2022-10-16 09:38:58 +02:00
Calixte Denizet
556513a6e7 Use all the current transform as key when caching some image for masks used with pattern fill (bug 1795263, #15573) 2022-10-14 14:37:58 +02:00
Jonas Jenwald
15d4d80d45
Merge pull request #15563 from Snuffleupagus/issue-15559
Take the /CIDToGIDMap into account when getting the glyph mapping for CFF fonts (issue 15559)
2022-10-14 09:13:41 +02:00
Calixte Denizet
e756bb69e4 [JS] Take into account all the required fields for some computations
- Fix Field::getArray in order to collect only the fields which have a value;
- Fix AFSimple_Calculate:
  * allow a string with a list of field names as the argument;
  * since a field can be non-terminal, use Field::getArray to collect
    the fields under it and then apply the calculation to all the descendants.
2022-10-13 18:33:12 +02:00
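The behaviour described in commit e756bb69e4 can be sketched roughly as follows. Everything here is hypothetical and only illustrates the idea: the field tree (`kids`, `value`), the `collectFieldsWithValue` helper and the `simpleSum` function are assumptions for the example, not the PDF.js scripting data model, and only a sum is computed.

```javascript
// Hypothetical sketch: collect the terminal descendants that have a value.
function collectFieldsWithValue(field) {
  if (!field.kids) {
    // Terminal field: keep it only when it actually holds a value.
    return field.value != null && field.value !== "" ? [field] : [];
  }
  // Non-terminal field: gather the qualifying fields under it.
  return field.kids.flatMap(collectFieldsWithValue);
}

// Hypothetical sum over fields given as an array of names or as a
// single comma-separated string of names.
function simpleSum(fieldNames, lookup) {
  const names =
    typeof fieldNames === "string"
      ? fieldNames.split(",").map(name => name.trim())
      : fieldNames;
  let sum = 0;
  for (const name of names) {
    const field = lookup.get(name);
    if (!field) {
      continue; // Unknown names are skipped in this sketch.
    }
    for (const leaf of collectFieldsWithValue(field)) {
      sum += Number(leaf.value);
    }
  }
  return sum;
}
```

Here `lookup` is assumed to be a `Map` from field names to field objects; empty fields contribute nothing to the result.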
Jonas Jenwald
858d941ff8 Take the /CIDToGIDMap into account when getting the glyph mapping for CFF fonts (issue 15559)
*Please note:* I don't really know what I'm doing here, however the patch appears to fix the referenced issue when comparing the rendering with Adobe Reader (with the caveat that I don't speak the language in question).
2022-10-13 10:02:25 +02:00
Jonas Jenwald
081e897588 Ensure that Page.getOperatorList handles Annotation parsing errors correctly (issue 15557)
*Fixes a regression from PR 15246, sorry about that!*

The return value of all `Annotation.getOperatorList` methods was changed in PR 15246, however I missed updating the error code-path in `Page.getOperatorList` which thus breaks all operatorList-parsing for pages with corrupt Annotations.
2022-10-10 09:48:01 +02:00
Jonas Jenwald
ce66fefbff [api-minor] Add partial support for the "GoToE" action (issue 8844)
*Please note:* The referenced issue is the only mention that I can find, in either GitHub or Bugzilla, of "GoToE" actions.
Hence why I've purposely settled for a very simple, and partial, "GoToE" implementation to avoid complicating things initially.[1] In particular, this patch only supports "GoToE" actions that reference the /EmbeddedFiles-dict in the PDF document.

See https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2048909

---
[1] I usually prefer having *real-world* test-cases to work with whenever I'm implementing new features.
2022-10-06 10:33:07 +02:00
Jonas Jenwald
fe5d9b4b6a Remove duplicated destroy-calls in the "custom ownerDocument" unit-tests
Given that `PDFDocumentProxy.destroy` is nothing but an alias for `PDFDocumentLoadingTask.destroy` calling both methods is obviously not useful.
2022-10-02 12:01:41 +02:00
Tim van der Meij
606fb8c394
Fix intermittent errors in the "check that first text field has focus" scripting test
This commit fixes the "Expected null to equal '401R'" errors that
surfaced after the Puppeteer 18 upgrade. Note that even before that
this would have been an improvement because it takes some time between
scripting being reported ready (i.e., triggering the execution of any
OpenActions) and those OpenActions actually completing execution, so
it's only safe to check which element is focused if we know an element
actually became focused.
2022-10-01 18:08:15 +02:00
Jonas Jenwald
7b24931f67
Merge pull request #15517 from Snuffleupagus/issue-15516
Add more non-standard ligatures in the `glyphlist.js` file (issue 15516)
2022-09-30 23:30:50 +02:00
Calixte Denizet
330048ad6b [JS] Add the function AFExactMatch 2022-09-29 14:23:56 -10:00
Jonas Jenwald
c87f90102c Add more non-standard ligatures in the glyphlist.js file (issue 15516)
Note that this PR only adds the "underscore"-variant of *actually existing* ligatures, however the referenced PDF document also uses a couple of non-standard ones (e.g. `ft`, `Th`, and `fh`) that we cannot easily support without larger changes (since they don't have official Unicode-entries).
Given that it's clearly the PDF document, and its fonts, that are the culprit here it's not entirely clear to me that we actually want to attempt a larger refactoring/rewriting of the `glyphlist.js` code, assuming it's even generally possible. Especially when this patch alone already improves our copy-paste behaviour when compared to both Adobe Reader and PDFium, and that this is only the *second* time this sort of bug has been reported.
2022-09-27 16:31:51 +02:00
calixteman
da1780f826
Merge pull request #15486 from nmtigor/fix_orders_of_prop
Fix property chain orders of Operators in isDotExpression
2022-09-25 04:13:25 -10:00
Jonas Jenwald
6538409282 Replace some Array.prototype-usage with spread syntax
We have a few, quite old, call-sites that use the `Array.prototype`-format and which can now be replaced with spread syntax instead.
2022-09-23 09:35:30 +02:00
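The kind of replacement described in commit 6538409282 might look like the sketch below. The arrays are made-up sample data; the actual call-sites touched by the commit are not shown in the log.

```javascript
const baseItems = [1, 2];
const extraItems = [3, 4];

// Older style: borrow `push` through `Array.prototype` and `apply`.
const oldStyle = baseItems.slice();
Array.prototype.push.apply(oldStyle, extraItems);

// Spread syntax expresses the same operation directly.
const newStyle = baseItems.slice();
newStyle.push(...extraItems);
```

Both variants append every element of `extraItems` to the copy of `baseItems`.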
calixteman
034017d526
Merge pull request #15494 from Snuffleupagus/issue-15492
Tweak the heuristic that handles JPEG images with a wildly incorrect SOF (Start of Frame) `scanLines` parameter (issue 15492)
2022-09-22 17:05:49 +02:00
Calixte Denizet
9e40938a29 [JS] Try to guess what the date is when it doesn't follow the given format (issue #15490)
We use the format to guess the order in which the month, day, ... appear; we then extract the numbers
from the date and treat them as month, day, ... in that order.
2022-09-22 16:30:39 +02:00
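The guessing strategy from commit 9e40938a29 can be sketched as follows. This is a simplified, hypothetical illustration rather than the PDF.js code: `guessDate` assumes a date-only format string that uses lowercase `y`, `m` and `d` for year, month and day, and it performs no range validation.

```javascript
// Hypothetical sketch: use the format only to learn the field order.
function guessDate(format, text) {
  // Order in which year ("y"), month ("m") and day ("d") first appear.
  const order = [];
  for (const ch of format) {
    if ("ymd".includes(ch) && !order.includes(ch)) {
      order.push(ch);
    }
  }
  // All the digit runs found in the user-provided text.
  const numbers = (text.match(/\d+/g) || []).map(Number);
  if (order.length === 0 || numbers.length < order.length) {
    return null; // Not enough information to make a guess.
  }
  // Assign the numbers to the fields in the order given by the format.
  const parts = {};
  order.forEach((field, i) => {
    parts[field] = numbers[i];
  });
  return {
    year: parts.y ?? null,
    month: parts.m ?? null,
    day: parts.d ?? null,
  };
}
```

For example, with the format `mm/dd/yyyy` the text `9-22-2022` does not follow the format, yet its three numbers can still be read as month, day and year.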
Jonas Jenwald
f1b0dc6f04 Tweak the heuristic that handles JPEG images with a wildly incorrect SOF (Start of Frame) scanLines parameter (issue 15492) 2022-09-22 14:09:04 +02:00
nmtigor
22cc9b7dc7 Fix property chain orders of Operators in isDotExpression and isSomPredicate 2022-09-21 17:20:23 +02:00
Calixte Denizet
198e9a3db1 Initialize values in the path bounding box before flushing the operator list (bug 1791583)
OperatorList.addOp can trigger a flush if it's required, hence the values passed to it must
be correctly initialized in order to avoid some wrong values in the renderer.
Because of that a clip path was considered as empty, nothing was clipped, hence the wrong
rendering in bug 1791583.
2022-09-20 20:01:54 +02:00
Calixte Denizet
f5b835157b [XFA] Fix a hidden issue in the FormCalc lexer
Since there is no script engine with XFA, the FormCalc parser is not used in real life.
The bug @nmtigor noticed was hidden by another one (the wrong check on `match`).
2022-09-20 13:53:55 +02:00
Jonas Jenwald
20b9887476 Enable the unicorn/prefer-regexp-test ESLint plugin rule
Please see https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-regexp-test.md
2022-09-19 16:34:01 +02:00
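The pattern that the `unicorn/prefer-regexp-test` rule from commit 20b9887476 targets can be sketched like this. The regular expression and the input string are made-up sample values.

```javascript
const hexColorPattern = /^#[0-9a-f]{6}$/i;
const sampleInput = "#1A2b3C";

// Before: `match` builds a result array that is only coerced to a boolean.
const viaMatch = !!sampleInput.match(hexColorPattern);

// After: `test` returns the boolean directly.
const viaTest = hexColorPattern.test(sampleInput);
```

Both expressions evaluate to the same boolean; `test` simply states the intent more directly.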
Jonas Jenwald
7a19def34c Extend getSupplementalGlyphMapForCalibri with more entries (issue 15443) 2022-09-15 22:19:16 +02:00
Jonas Jenwald
2f2ecad8fd Extend getGlyphMapForStandardFonts with some quote-entries (issue 15441) 2022-09-15 11:37:20 +02:00
Jonas Jenwald
21fe5017bb Remove the abstract BaseViewer-class
After the changes in PR 14112 the `PDFViewer`-class is now "identical" to the `BaseViewer`-class and the `PDFSinglePageViewer`-class is just a very thin wrapper around the `BaseViewer`-class.
Hence we can rename these files, and also remove the abstract `BaseViewer`-class, which helps reduce some unnecessary "closures" in the *built* viewer.

*Please note:* These changes are made in two separate commits, to allow GitHub to preserve `blame` for the affected files.
2022-09-08 12:38:17 +02:00
Jonas Jenwald
6dc4c994b8 Remove the abstract BaseViewer-class
After the changes in PR 14112 the `PDFViewer`-class is now "identical" to the `BaseViewer`-class and the `PDFSinglePageViewer`-class is just a very thin wrapper around the `BaseViewer`-class.
Hence we can rename these files, and also remove the abstract `BaseViewer`-class, which helps reduce some unnecessary "closures" in the *built* viewer.

*Please note:* These changes are made in two separate commits, to allow GitHub to preserve `blame` for the affected files.
2022-09-08 12:38:17 +02:00
Jonas Jenwald
af6aacfc0e
Merge pull request #15398 from Snuffleupagus/more-optional-chaining
Use more optional chaining in the code-base
2022-09-06 20:31:03 +02:00
Jonas Jenwald
38ee28b1d3 Use more optional chaining in the code-base
This patch updates a bunch of older code, that makes conditional function calls, to use optional chaining rather than `if`-blocks.

These mostly mechanical changes reduce the size of the `gulp mozcentral` build by a little over 1 kB.
2022-09-05 15:41:53 +02:00
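The mechanical change described in commit 38ee28b1d3 might look like the following sketch. The `notifyOld`/`notifyNew` functions are hypothetical examples, not code from the repository.

```javascript
function notifyOld(listener, value) {
  // Older pattern: guard the conditional call with an `if`-block.
  if (listener) {
    listener(value);
  }
}

function notifyNew(listener, value) {
  // Optional chaining: the call is skipped when `listener` is
  // null or undefined.
  listener?.(value);
}
```

For callbacks that are either a function or absent, the two forms behave the same, and the second one is shorter.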
Jonas Jenwald
947d390421 Fallback to a standard font when a Type1 font program is empty (issue 15292)
*Please note:* This is only a, hopefully generally helpful, work-around rather than a proper solution to issue 15292.

There's something that's "special" about the Type1 fonts in the referenced PDF document, since we don't manage to find any actual font programs and thus cannot render anything.
Given that it shouldn't make sense for a Type1 font program to ever be empty, since that means that there's no glyph-data to render, we simply fall back to a standard font to at least try and render *something* in these rare cases.
2022-09-05 12:07:19 +02:00
Jonas Jenwald
9578152ae4
Merge pull request #15392 from Snuffleupagus/issue-15352
Don't allow `adjustToUnicode` to extend a built-in /ToUnicode map (issue 15352)
2022-09-04 15:12:10 +02:00
Calixte Denizet
6c6f6fb2b8 Don't replace cr by a white space when the last char on the line is an ideographic char 2022-09-04 14:21:05 +02:00
Jonas Jenwald
12d60e0acf Don't allow adjustToUnicode to extend a built-in /ToUnicode map (issue 15352)
Given that the change in PR 13393 was slightly speculative, given the lack of test-cases, let's just revert part of that to fix the referenced issue.
Based on a quick look at old issues and existing test-cases, it seems that most (if not all) PDF documents that benefit from using the font-data in this way lack any /ToUnicode maps which should mean that they're unaffected by these changes.
2022-09-03 23:11:42 +02:00
Jonas Jenwald
cc4baa2fe9 [api-minor] Add basic support for the SetOCGState action (issue 15372)
Note that this patch implements the `SetOCGState`-handling in `PDFLinkService`, rather than as a new method in `OptionalContentConfig`[1], since this action is nothing but a series of `setVisibility`-calls and that it seems quite uncommon in real-world PDF documents.

The new functionality also required some tweaks in the `PDFLayerViewer`, to ensure that the `layersView` in the sidebar is updated correctly when the optional-content visibility changes from "outside" of `PDFLayerViewer`.

---
[1] We can obviously move this code into `OptionalContentConfig` instead, if deemed necessary, but for an initial implementation I figured that doing it this way might be acceptable.
2022-09-01 17:34:24 +02:00
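The idea that a `SetOCGState` action boils down to a series of `setVisibility`-calls (commit cc4baa2fe9) can be sketched as follows. This is an illustration under assumptions, not the `PDFLinkService` code: `FakeOptionalContentConfig` is a hypothetical stand-in, groups are identified by plain strings, and the `state` array is assumed to consist of an operator name (`ON`, `OFF` or `Toggle`) followed by the group ids it applies to.

```javascript
// Hypothetical stand-in for an optional-content configuration.
class FakeOptionalContentConfig {
  constructor(entries) {
    this.visible = new Map(entries); // group id -> boolean
  }

  setVisibility(id, visible) {
    this.visible.set(id, visible);
  }

  isVisible(id) {
    return this.visible.get(id);
  }
}

// Walk the state array; each operator applies to the ids that follow it.
function applyOCGState(config, state) {
  let operator = null;
  for (const item of state) {
    if (item === "ON" || item === "OFF" || item === "Toggle") {
      operator = item;
      continue;
    }
    switch (operator) {
      case "ON":
        config.setVisibility(item, true);
        break;
      case "OFF":
        config.setVisibility(item, false);
        break;
      case "Toggle":
        config.setVisibility(item, !config.isVisible(item));
        break;
      // Ids that appear before any operator are ignored in this sketch.
    }
  }
}
```

Since the whole action reduces to a loop over visibility changes, keeping it outside the configuration class, as the commit message describes, does not require any new state.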
Jonas Jenwald
216b86a082 [api-minor] Support Named-actions in the outline (issue 15367)
Apparently this is implemented in e.g. Adobe Reader, and the specification does support it, however it cannot be commonly used in real-world PDF documents since it took over ten years for this feature to be requested.
2022-08-30 18:47:45 +02:00
Jonas Jenwald
571ce13dd6 [api-major] Remove the enhanceTextSelection functionality (PR 15145 follow-up)
For the `gulp mozcentral` command, this reduces the size of the *built* `pdf.js` file by `> 10` kB.
2022-08-28 15:04:47 +02:00
Jonas Jenwald
723584dd4f Enable the unicorn/prefer-array-find ESLint plugin rule
Please find additional information here:
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/find#browser_compatibility
 - https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-array-find.md
2022-08-24 16:46:26 +02:00
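The pattern addressed by the `unicorn/prefer-array-find` rule from commit 723584dd4f can be sketched like this. The `sampleAnnotations` array is made-up sample data.

```javascript
const sampleAnnotations = [
  { id: "a1", type: "Link" },
  { id: "a2", type: "Widget" },
  { id: "a3", type: "Widget" },
];

// Before: `filter` visits every element and allocates a new array,
// even though only the first match is needed.
const firstViaFilter = sampleAnnotations.filter(a => a.type === "Widget")[0];

// After: `find` stops at the first match and returns it directly.
const firstViaFind = sampleAnnotations.find(a => a.type === "Widget");
```

Both expressions return the same element; `find` avoids building the intermediate array.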
Calixte Denizet
c06c5f7cbd [Annotations] charLimit === 0 means unlimited (bug 1782564)
Changing the charLimit in JS had no impact, so this patch aims to fix
that and add an integration test for it.
2022-08-19 11:28:28 +02:00
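The "zero means unlimited" convention from commit c06c5f7cbd can be illustrated with a minimal sketch. `applyCharLimit` is a hypothetical helper; truncating the text is an assumption made for the example and not necessarily how the annotation layer enforces the limit.

```javascript
// Hypothetical helper, not the PDF.js implementation.
function applyCharLimit(text, charLimit) {
  // Assumption for this sketch: 0 (or a missing value) means "no limit".
  if (!charLimit) {
    return text;
  }
  // Otherwise keep at most `charLimit` characters.
  return text.slice(0, charLimit);
}
```

The important part is the first branch: a limit of zero must leave the text untouched rather than emptying it.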
Jonas Jenwald
0024165f1f Move binarySearchFirstItem back to the web/-folder (PR 15237 follow-up)
This was moved into the `src/display/`-folder in PR 15110, for the initial editor-a11y patch. However, with the changes in PR 15237 we're again only using `binarySearchFirstItem` in the `web/`-folder and it thus seems reasonable to move it back there.
The primary reason for moving it back is that `binarySearchFirstItem` is currently exposed in the public API, and we always want to avoid that unless it's either PDF-related functionality or code that simply must be shared between the `src/`- and `web/`-folders. In this case, `binarySearchFirstItem` is a general helper function that doesn't really satisfy either of those alternatives.
2022-08-14 11:38:17 +02:00
calixteman
6b4c2464ad
Merge pull request #15237 from calixteman/annotation_a11y
[Annotations] Add some aria-owns in the text layer to link to annotations (bug 1780375)
2022-08-12 15:04:56 +02:00
Calixte Denizet
f316300113 [Annotations] Add some aria-owns in the text layer to link to annotations (bug 1780375)
This patch doesn't structurally change the text layer: it just adds some aria-owns
attributes to some spans.
The aria-owns attribute expects an element id, which is why this patch adds back an
id on the element rendering an annotation, but this id is built using crypto.randomUUID
to avoid any potential issues with the hash in the url.
The elements in the annotation layer are moved into the DOM in order to have them in the
same "order" as they visually are.
The overall goal is to help screen readers to present to the user the annotations as
they visually are and as they come in the text flow.
It is clearly not perfect, but it should improve readability for some people with visual
disabilities.
2022-08-12 14:35:26 +02:00
Jonas Jenwald
dd95e4f851 Add *official* support for passing ArrayBuffer-data to getDocument (issue 15269)
While this has always worked, as a consequence of the implementation, it's never been officially supported.
In addition to adding basic unit-tests, this patch also introduces a couple of new JSDoc `@typedef`s in the API to avoid overly long lines.
2022-08-10 14:13:01 +02:00
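One way a loader could accept both `ArrayBuffer` and typed-array input, as discussed in commit dd95e4f851, is to normalise the data up front. The sketch below is hypothetical: `toUint8Array` is an illustrative helper and not the code used by `getDocument`, and it only handles binary containers.

```javascript
// Hypothetical helper: normalise binary input to a Uint8Array.
function toUint8Array(data) {
  if (data instanceof Uint8Array) {
    return data; // Already in the desired form.
  }
  if (data instanceof ArrayBuffer) {
    return new Uint8Array(data); // View over the whole buffer, no copy.
  }
  if (ArrayBuffer.isView(data)) {
    // Other typed arrays or DataViews: view over the same bytes.
    return new Uint8Array(data.buffer, data.byteOffset, data.byteLength);
  }
  throw new TypeError("Unsupported binary data");
}
```

Normalising once at the entry point means the rest of the code only has to deal with a single representation.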
Calixte Denizet
04f78c935c Fix OTS issue with empty index (#15289) 2022-08-08 22:56:26 +02:00
Jonas Jenwald
f6db7975c5 Enable the ESLint prefer-spread rule
Note that in a couple of spots the argument could be `undefined` and there we simply disable the rule instead.

Please refer to https://eslint.org/docs/latest/rules/prefer-spread
2022-08-06 10:17:00 +02:00
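The `prefer-spread` rule from commit f6db7975c5, and the `undefined` caveat mentioned in its message, can be sketched as follows. The two functions are hypothetical examples; the actual call-sites are not shown in the log.

```javascript
// Flagged by the rule: `apply` used only to pass an array of arguments.
function maxWithApply(values) {
  return Math.max.apply(null, values);
}

// Preferred form: spread the array into the call.
function maxWithSpread(values) {
  return Math.max(...values);
}
```

The two forms are not interchangeable when the argument may be `undefined`: `apply` then calls the function without arguments, whereas spreading `undefined` throws a `TypeError`. That difference is a plausible reason for disabling the rule at such call-sites instead of rewriting them.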
Calixte Denizet
3c8d8f0d02 [Editor] A pasted FreeText editor was missing when printing/saving
When a FreeText editor is pasted, it doesn't have an editorDiv yet when it's added
to the layer, hence it's empty.
So this patch just moves the call to addToAnnotationStorage to ensure we have
what we need.
2022-08-04 13:00:45 +02:00