Commit Graph

1215 Commits

Author SHA1 Message Date
Jonas Jenwald
59dd4ea2b0 Lookup image-data correctly in paintImageMaskXObjectGroup (issue 14990)
*This fixes a regression from PR 14754.*

We didn't lookup the image-data correctly, with the result that we tried to render some ImageMasks using a string rather than the intended TypedArray. To make matters worse, this code-path was apparently not *properly* covered by existing test-cases.
2022-06-05 12:39:23 +02:00
Calixte Denizet
66b513fc00 [Annotations] Show buttons even if they've no actions
- it's a regression from PR #14247:
 - before the PR, the button was rendered on the canvas whatever its status was;
 - after the PR, the button image has been moved in an other canvas so when the button is not renderable
   (because it has no actions) then the image is not added the HTML element.
- the buttons in the pdf in bug 1737260 or in the pdf in #14308 were not visible
- make the button always renderable but don't add the link element if it's useless.
2022-05-28 23:50:50 +02:00
Calixte Denizet
9d82106d20 Set the text fields font size based on their height
- right now we're using the font size from the pdf itself but we use an other font
  in the annotation layer. So this size doesn't really make sense and leads to bad
  rendering (see pdf in #14928);
- use a sans-serif font for the fields containing text (fix issue #14736);
- remove useless padding in text-based fields (fix issue #14301);
- text fields allow/disallow scrolling bars (see bit 24 in Ff entry), so use this
  value to hide/show scrollbars in annotation layer.
2022-05-28 18:00:39 +02:00
Calixte Denizet
c7afce4210 Support Hangul syllables when searching some text (bug 1771477)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1771477;
- hangul contains some syllables which are decomposed when using NFD, hence
  the text must be correctly shifted in case it contains some of them.
2022-05-28 16:50:03 +02:00
Jonas Jenwald
5a2899c57e Skip bogus d1 operators in Type3-glyphs (issue 14953)
In the `src/display/canvas.js` code the `d1` operator will be used to set the clipping region, and it obviously cannot be empty since that prevents the Type3-glyph from rendering.

Also, the patch removes an outdated comment; refer to PR 12718.
2022-05-24 12:20:31 +02:00
Calixte Denizet
9407adc416 [JS] Format all the fields if any when the document is open (bug 1766987)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1766987.
2022-05-22 15:50:42 +02:00
Calixte Denizet
60498c67e4 Display background when printing or saving a text widget (issue #14928) 2022-05-19 16:41:54 +02:00
Jonas Jenwald
5a774b7ed3 Adjust the heuristics for handling of incomplete path operators (issue 14917)
This limits the heuristics for handling of incomplete path operators, see PR 9838, to only apply to *sequences* of such operators. In practice a couple of invalid path operators are (hopefully) unlikely to completely break rendering, whereas a sequence of them will easily lead to fairly chaotic rendering artifacts.
2022-05-15 11:24:39 +02:00
Jonas Jenwald
6e7e9d83d8 Add support for TrueType format 12 cmaps (issue 14881)
This is, as far as I can tell, the first case we've seen of a format 12 `cmap`.
Please see https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6cmap.html
2022-05-06 11:11:38 +02:00
Jonas Jenwald
8135d7ccf6
Merge pull request #14869 from calixteman/14862
[JS] Fix few bugs present in the pdf for issue #14862
2022-05-03 18:31:31 +02:00
Calixte Denizet
094ff38da0 [JS] Fix few bugs present in the pdf for issue #14862
- since resetForm function reset a field value a calculateNow is consequently triggered.
  But the calculate callback can itself call resetForm, hence an infinite recursive loop.
  So basically, prevent calculeNow to be triggered by itself.
- in Firefox, the letters entered in some fields were duplicated: "AaBb" instead of "AB".
  It was mainly because beforeInput was triggering a Keystroke which was itself triggering
  an input value update and then the input event was triggered.
  So in order to avoid that, beforeInput calls preventDefault and then it's up to the JS to
  handle the event.
- fields have a property valueAsString which returns the value as a string. In the
  implementation it was wrongly used to store the formatted value of a field (2€ when the user
  entered 2). So this patch implements correctly valueAsString.
- non-rendered fields can be updated in using JS but when they're, they must take some properties
  in the annotationStorage. It was implemented for field values, but it wasn't for
  display, colors, ...
- it fixes #14862 and #14705.
2022-05-03 15:48:44 +02:00
Jonas Jenwald
df5a4fd0a7 Support encoded dest-strings in /GoTo destination dictionaries (issue 14864)
Interestingly enough this appears to be the very first case of *encoded* dest-strings, in /GoTo destination dictionaries, that we've actually come across. What's really fascinating is that it's less than a week after issue 14847, given that these issues are *somewhat* similar.
2022-05-02 10:14:32 +02:00
Calixte Denizet
624d8a8e3e Use integer coordinates when drawing images (bug 1264608, issue #3351)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1264608;
- it's only a partial fix for #3351;
- some tiled images have some spurious white lines between the tiles.
  When the current transform is applyed the corners of an image can have
  some non-integer coordinates leading to some extra transparency added
  to handle that. So with this patch the current transform is applied on the
  point and on the dimensions in order to have at the end only integer values.
2022-04-29 16:01:34 +02:00
Jonas Jenwald
71370d012b Support destinations in NameTrees with encoded keys (issue 14847)
Initially I considered updating the `NameOrNumberTree`-implementation to handle encoded keys, however that quickly became somewhat messy (especially in the `NameOrNumberTree.get`-method) since only NameTrees using string-keys.
Hence the easiest solution, as far as I'm concerned, was thus to just update the `Catalog.destinations`-getter instead. Please note that in the referenced PDF document the `Catalog.destination`-method will thus fallback to fetch all destinations, which should be fine since this is the very first case of encoded keys that we've seen.

Also changes the `NameOrNumberTree.getAll`-method to prevent a possible run-time error, although we've so far not seen such a case, for any non-Array Kids-entries found in a NameTree/NumberTree.

Finally, to improve overall consistency and to hopefully prevent future bugs, the patch also updates a couple of other `NameTree` call-sites to correctly handle encoded keys. (Note that the `Catalog.attachments`-getter was already doing this.)
2022-04-27 11:19:55 +02:00
Tim van der Meij
752dee5caa
Merge pull request #14825 from Snuffleupagus/issue-14824
Ensure that worker-thread image caching doesn't break optional content (issue 14824)
2022-04-23 13:19:56 +02:00
Tim van der Meij
f9e54d9226
Merge pull request #14823 from Snuffleupagus/issue-14821
Ignore invalid /Encoding-entries when parsing fonts (issue 14821)
2022-04-23 13:19:26 +02:00
Jonas Jenwald
6c229dffb1 Ensure that worker-thread image caching doesn't break optional content (issue 14824)
Currently we only insert optionalContent-data into the operatorList the first time that an image is parsed, which will (in hindsight) obviously cause problems for cached images.
Hence we also need to insert the optionalContent-data in the various worker-thread image caches, such that it can be accessed in the fast-paths that are used to skip re-parsing of images.

In order to reduce the amount of repeated code, this patch also adds a new `OperatorList`-method that takes care of inserting the necessary data in the operatorList.
2022-04-22 14:49:16 +02:00
Jonas Jenwald
e723da7261 Ignore invalid /Encoding-entries when parsing fonts (issue 14821)
In the referenced PDF document the fonts have /Encoding-entries that are Streams (containing completely bogus data), which are thus obviously not valid here.
Hence, only when `ignoreErrors` is set, we'll now ignore these corrupt /Encoding-entries and fallback to the existing code to try and infer a usable encoding.

Given that this is *clearly* a case of corrupt PDF documents, there's no guarantee that this will "fix" all such cases, however it's the best that we do here and shouldn't really be worse than ignoring an entire font.
2022-04-22 11:49:03 +02:00
Jonas Jenwald
39d1bdde09 Ignore non-Stream /SMask-entries when parsing images (issue 14814)
This is similar to the pre-existing check used in the /Mask-case below, to handle *corrupt* PDF documents that include non-Stream /SMask-entries in images; please refer to the PDF specification:
https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=216

*Please note:* Adobe Reader also fails to render the image on the second page, and displays an error message.
2022-04-21 12:14:08 +02:00
Jonas Jenwald
5bc7339c1b Add support for the /Catalog Base-URI when resolving URLs (issue 14802)
As far as I can tell, this is actually the very first time that we've seen a PDF document with a Base-URI specified in the /Catalog; please refer to the specification:
https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2097122

To simplify the overall implementation, this new parameter is accessed via the existing `BasePdfManager.docBaseUrl`-getter and will thus override any user-specified `docBaseUrl` API-parameter.
2022-04-19 17:14:52 +02:00
Calixte Denizet
3d74d2c6cb Don't clip when the clip path is empty (issue #12306) 2022-04-18 10:33:44 +02:00
Calixte Denizet
f62d961dfe Improve performances with image masks (bug 857031)
- it's the second part of the fix for https://bugzilla.mozilla.org/show_bug.cgi?id=857031;
- some image masks can be used several times but at different positions;
- an image need to be pre-process before to be rendered:
  * rescale it;
  * use the fill color/pattern.
- the two operations above are time consuming so we can cache the generated canvas;
- the cache key is based on the current transform matrix (without the translation part)
  and the current fill color when it isn't a pattern.
- the rendering of the pdf in the above bug is really faster than without this patch.
2022-04-16 20:48:39 +02:00
Calixte Denizet
040fcae5ab Improve performance with image masks (bug 857031)
- it aims to partially fix performance issue reported: https://bugzilla.mozilla.org/show_bug.cgi?id=857031;
- the idea is too avoid to use byte arrays but use ImageBitmap which are a way faster to draw:
  * an ImageBitmap is Transferable which means that it can be built in the worker instead of in the main thread:
    - this is achieved in using an OffscreenCanvas when it's available, there is a bug to enable them
      for pdf.js: https://bugzilla.mozilla.org/show_bug.cgi?id=1763330;
    - or in using createImageBitmap: in Firefox a task is sent to the main thread to build the bitmap so
      it's slightly slower than using an OffscreenCanvas.
  * it's transfered from the worker to the main thread by "reference";
  * the byte buffers used to create the image data have a very short lifetime and ergo the memory used is globally
    less than before.
- Use the localImageCache for the mask;
- Fix the pdf issue4436r.pdf: it was expected to have a binary stream for the image;
- Move the singlePixel trick from operator_list to image: this way we can use this trick even if it isn't in a set
  as defined in operator_list.
2022-04-09 18:26:26 +02:00
Calixte Denizet
0b597304c1 [Annotations] Some annotations can have their values stored in the xfa:datasets
- it aims to fix #14685;
- add a basic object to get values from the parsed datasets;
- these annotations don't have an appearance so we must create one when printing or saving.
2022-04-01 10:28:04 +02:00
Calixte Denizet
18e79e3c0b [text selection] Add the whitespaces present in the pdf in the text chunk
- it aims to fix issue #14627;
- the basic idea of the recent text refactoring was to only consider the rendered visible whitespaces.
  But sometimes, the heuristics aren't correct and although some whitespaces are in the text stream
  they weren't in the text chunks because they were too small. Hence we added some exceptions, for example,
  we always add a whitespace when it is between two non-whitespace chars but only when in the same Tj.
  So basically, this patch removes the constraint to have the chars in the same Tj
  (in using a circular buffer to save the two last chars) but don't add a space when the visible space is really
  too small (hence `NOT_A_SPACE_FACTOR`).
2022-03-27 14:34:56 +02:00
Jonas Jenwald
1a7921dbf0 Compute the loca table endOffset, of the "first" glyph, correctly (issue 14618)
When there are *multiple* empty glyphs at the start of the data, ensure that the "first" glyph gets a correct `endOffset` to avoid skipping it during parsing in the `sanitizeGlyph` function.
2022-03-03 14:22:45 +01:00
Brendan Dahl
85ff7b117e
Merge pull request #14536 from calixteman/thin_line
Fix some issues with lineWidth < 1 after transform (bug 1753075, bug 1743245, bug 1710019)
2022-03-02 09:46:15 -08:00
Jeff Muizelaar
9b9609a6d8 [JPEG 2000] Add support for resetContextProbabilities (bug 1731483) 2022-02-26 13:05:23 +01:00
Calixte Denizet
46369e4aa5 Fix some issues with lineWidth < 1 after transform (bug 1753075, bug 1743245, bug 1710019)
- it aims to fix:
   - https://bugzilla.mozilla.org/show_bug.cgi?id=1753075;
   - https://bugzilla.mozilla.org/show_bug.cgi?id=1743245;
   - https://bugzilla.mozilla.org/show_bug.cgi?id=1710019;
   - issue #13211;
   - issue #14521.
 - previously we were trying to adjust lineWidth to have something correct after the current transform is applied but this approach was not correct because finally the pixel is rescaled with the same factors in both directions.
  And sometimes those factors must be different (see bug 1753075).
 - So the idea of this patch is to apply a scale matrix to the current transform just before setting lineWidth and stroking. This scale matrix is computed in order to ensure that after transform, a pixel will have its two thickness greater than 1.
2022-02-25 18:37:34 +01:00
Jonas Jenwald
530af48b8e
Merge pull request #14569 from brendandahl/smask-state
Fix canvas state getting out of sync from smasks. (bug 1755507)
2022-02-18 19:35:58 +01:00
Brendan Dahl
7def6d12c8 Fix canvas state getting out of sync from smasks. (bug 1755507)
Soft masks can be enabled/disabled at anytime and at different
points in the save/restore stack. This can lead to
the amount of save/restores becoming unbalanced across the
two canvases. Instead of save/restoring on the temporary canvas
change it so we only track state on the main (suspended canvas).

I was also getting an out balance stack from patterns, so I've also
fixed that and added a warning that will at least show up on chrome.
It would be nice to add this so Firefox at some point too.

Fixes #11328, #14297 and bug 1755507
2022-02-17 17:38:32 -08:00
Calixte Denizet
18f4e560ae [Search] Some matches were incorrectly shifted because of some '-\n'
- it aims to fix #14562;
- 'X-\n' were not correctly positioned;
- when X is a diacritic (e.g. in "sä-\n", which is decomposed into "sa¨-\n") we must handle both things:
  - diacritics on the one hand;
  - "-\n" on the other hand.
2022-02-14 10:12:33 +01:00
Calixte Denizet
18e3a98c2b [api-minor] Don't add in the text content the chars which are out-of-page (bug 1755201)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1755201;
- if the glyph position is not within the view then skip it.
2022-02-13 21:07:11 +01:00
Brendan Dahl
f8b2a99ddc
Merge pull request #14543 from Snuffleupagus/bug-1753983
Let `Lexer.getNumber` treat a single minus sign as zero (bug 1753983)
2022-02-09 14:06:35 -08:00
Jonas Jenwald
60efae96fd Update the file used with the xfa_bug1720182 test-case
The file used in this test-case is *identical* to, i.e. the md5 entry perfectly matches, the file used with the `xfa_bug1716380` test-case.
While it's obviously fine to use the same PDF document in different reference-tests, note how we e.g. have both `eq` and `text` tests for one document, we should always avoid adding *duplicate* files in the `test/pdfs/` folder.
2022-02-08 14:23:41 +01:00
Jonas Jenwald
64f3dbeb48 Let Lexer.getNumber treat a single minus sign as zero (bug 1753983)
This appears to be consistent with the behaviour in both Adobe Reader and PDFium (in Google Chrome); this is essentially the same approach as used for a single decimal point in PR 9827.
2022-02-07 17:09:47 +01:00
Calixte Denizet
1f41028fcb Support search with or without diacritics (bug 1508345, bug 916883, bug 1651113)
- get original index in using a dichotomic seach instead of a linear one;
  - normalize the text in using NFD;
  - convert the query string into a RegExp;
  - replace whitespaces in the query with \s+;
  - handle hyphens at eol use to break a word;
  - add some \s* around punctuation signs
2022-02-03 15:42:55 +01:00
Jonas Jenwald
e13353cf21 Remove the xfa_bug1716838 browser-test since it's a duplicate
The md5 entry perfectly matches the `xfa_bug1717668_1` test-case, which means that we unnecessarily test the same exact document twice.
2022-02-01 12:05:19 +01:00
Jonas Jenwald
ef1676678f Remove the MBTA-pretax-form-July2012 browser-test since it's a duplicate
The md5 entry perfectly matches the `xfa_bug1718521_2` test-case, which means that we unnecessarily test the same exact document twice.
2022-01-31 12:35:26 +01:00
Calixte Denizet
ae842e1c3a [api-minor] Annotations - Adjust the font size in text field in considering the total width (bug 1721335)
- it aims to fix #14502 and bug 1721335;
 - Acrobat and Pdfium do the same;
 - it'll avoid to have truncated data when printed;
 - change the factor to compute font size in using field height: lineHeight = 1.35*fontSize
  - this is the value used by Acrobat.
 - in order to not have truncated strings on the bottom, add few basic metrics for standard fonts.
2022-01-30 15:53:31 +01:00
calixteman
838909f8c1
Merge pull request #14491 from quaoaris/lines-rendered-too-thick
fix for lines (stroke) are rendered too thick  (Bug 1743245)
2022-01-27 18:46:26 +01:00
Calixte Denizet
3a7004ca25 Take into account all rotations before comparing glyph positions
- it aims to fix #14497;
 - previously, only rotations with an angle 0, 90, 180 or 270 were taken into account;
 - so generalize to any angle but keep the fast path for 0, 90, ... because they're likely more common than anything else.
2022-01-26 17:19:00 +01:00
quaoaris
3f77d80f31 fix for lines (stroke) are rendered too thick (Bug 1743245)
This commit fixes Bug 1743245 (Grided PDF file lines rendered too thick) which was created by a fix for  #12868 .
The lineWidth was set to round(1 * this._combinedScaleFactor) when the pixel is drawn as a parallelorgam with a height <1. This fix changes this to floor(1*this._combinedScaleFactor) .

This change shows a visual result comparable to Chrome and Acrobat.
Regarding the last PR 3 statements in canvas.js are affected and will change with this commit (stroke and paintChar).

renaming the reference files to naming comvention
2022-01-25 10:27:30 +01:00
Calixte Denizet
e1d3a3b414 Remove the invisible format marks from the text chunks
- it aims to fix issue #9186.
2022-01-24 13:47:24 +01:00
calixteman
88236e1163
Merge pull request #14430 from calixteman/beforeinput
[JS] Use beforeinput event to trigger a keystroke event in the sandbox
2022-01-23 20:42:33 +01:00
Calixte Denizet
6ac296e48e [JS] Use beforeinput event to trigger a keystroke event in the sandbox
- it aims to fix issue #14307;
 - this event has been added recently in Firefox and we can now use it;
 - fix few bugs in aform.js or in annotation_layer.js;
 - add some integration tests to test keystroke events (see `AFSpecial_Keystroke`);
 - make dispatchEvent in the quickjs sandbox async.
2022-01-23 19:53:01 +01:00
Tim van der Meij
23b6fde9fc
Merge pull request #14464 from Snuffleupagus/issue-14462
Support Type1 font files with incomplete /CharStrings definitions (issue 14462)
2022-01-19 20:38:46 +01:00
Calixte Denizet
74f25d2755 Font renderer - get int8 instead of uint8 in composite glyphes (bug 1749563)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1749563;
 - use some helper functions to get (u|i)int** values in buffer: it helps to have a clearer code;
 - in composite glyphes the translations values with a transformations are signed so consequently get some int8 instead of uint8;
 - add few TODOs.
2022-01-18 22:06:23 +01:00
Jonas Jenwald
a13ae5d97d Support Type1 font files with incomplete /CharStrings definitions (issue 14462)
Please refer to https://www.pdfa.org/norm-refs/Type1Fonts.pdf#page=15 for the expected format for the /CharStrings entries.
In the referenced PDF document the /CharStrings are missing the expected end-token, which causes us to swallow the start of the next glyph name.
2022-01-17 18:55:22 +01:00
Tim van der Meij
922dac035c
Merge pull request #14448 from Snuffleupagus/Type3-circular-refs
Prevent circular references in Type3 fonts
2022-01-15 14:11:47 +01:00