Commit Graph

15027 Commits

Author SHA1 Message Date
Jonas Jenwald
04bdc26d3a Update the eslint-plugin-unicorn package to the latest version 2021-11-14 10:27:29 +01:00
Jonas Jenwald
1dd74efb0f Update the eslint-plugin-no-unsanitized package to the latest version 2021-11-14 10:24:41 +01:00
Jonas Jenwald
bd1e140e2a Update the dommatrix package to the latest version 2021-11-14 10:20:54 +01:00
Jonas Jenwald
9f6d37263c Update npm packages 2021-11-14 10:17:30 +01:00
Jonas Jenwald
712621b508
Merge pull request #14255 from Snuffleupagus/GrabToPan-class
Convert `GrabToPan` to a standard `class`
2021-11-13 23:24:36 +01:00
Jonas Jenwald
08d56c67ae Convert GrabToPan to a standard class
This code is the last piece[1] of the viewer that's not using standard `class`es, and by converting this code we get rid of some now unneeded boilerplate code (slightly reducing the size of the *built* `web/viewer.js` file).
Note that while this code was originally imported from a separate repository, it was last sync-ed with upstream *five years* ago which is why this re-factoring should be OK as far as I'm concerned (and we've done some other clean-up since then as well).

---
[1] Technically the `web/debugger.js` file is left as well, however that code is first of all not bundled in the *built* `web/viewer.js` file and secondly it's not even loaded by default either.
2021-11-13 23:07:36 +01:00
Jonas Jenwald
ed6af0f844 [web/grab_to_pan.js] Inline the isLeftMouseReleased helper function
Given the support information listed in the function itself, the [MDN compatibility data](https://developer.mozilla.org/en-US/docs/Web/API/MouseEvent/buttons#browser_compatibility), and the [currently supported browsers](4bb9de4b00/gulpfile.js (L79-L87)) in the PDF.js project we should be able to simplify the code by inlining the function instead.
2021-11-13 23:00:15 +01:00
Jonas Jenwald
7a428345db
Merge pull request #14271 from calixteman/params
Parse query string in using URLSearchParams
2021-11-13 22:59:34 +01:00
Calixte Denizet
fe95e100e4 Parse query string in using URLSearchParams
- I just noticed in reading the code that we parse that stuff when something exists in the web api;
 - see https://developer.mozilla.org/en-US/docs/Web/API/URLSearchParams/URLSearchParams.
2021-11-13 21:10:54 +01:00
Tim van der Meij
de7cfed9e3
Merge pull request #14260 from Snuffleupagus/telemetry-pageInfo-once
Report "pageInfo" telemetry once, rather than for each rendered page
2021-11-13 20:22:58 +01:00
Tim van der Meij
138ebb09c0
Merge pull request #14253 from Snuffleupagus/ScrollMode-PAGE-chromium
[Chromium addon] Add the Page scrolling mode to the options (PR 14112 follow-up)
2021-11-13 20:12:04 +01:00
calixteman
85c6dd59ce
Merge pull request #14268 from calixteman/outline
Remove non-displayable chars from outline title (#14267)
2021-11-13 08:12:56 -08:00
Calixte Denizet
7041c62ccf Remove non-displayable chars from outline title (#14267)
- it aims to fix #14267;
 - there is nothing about chars in range [0-1F] in the specs but acrobat doesn't display them in any way.
2021-11-13 16:56:08 +01:00
Jonas Jenwald
db41c49321
Merge pull request #14270 from Snuffleupagus/issue-14269
When parsing corrupt documents without any trailer-dictionary, fallback to the "top"-dictionary (issue 14269)
2021-11-13 16:01:28 +01:00
Jonas Jenwald
afcc99a86d When parsing corrupt documents without any trailer-dictionary, fallback to the "top"-dictionary (issue 14269)
There's obviously no guarantee that this will work in general, if the document is sufficiently corrupt, but it should hopefully be better than just throwing `InvalidPDFException` as currently happens.

Please note that, as is often the case with corrupt documents, it's somewhat difficult to know if we're rendering the document "correctly" with this patch[1]. In this case even Adobe Reader cannot open the document, which is always a good sign that it's *really* corrupt, however we're at least able to render *something* with this patch.

---
[1] Whatever "correct" even means when dealing with corrupt PDF documents, where often times different PDF viewers won't agree completely.
2021-11-13 13:21:38 +01:00
Jonas Jenwald
28fb3975eb
Merge pull request #14266 from calixteman/bug931481
Don't consider space as real space when there is an extra spacing (bug 931481)
2021-11-12 21:42:32 +01:00
Calixte Denizet
a88ff34eb7 Don't consider space as real space when there is an extra spacing (bug 931481)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=931481;
 - real space chars are pushed in the chunk but when there is an extra spacing, the next char position must be compared with the previous one;
 - for example, an extra spacing can cancel a space so visually there are no space.
2021-11-12 18:53:48 +01:00
calixteman
7d6d3fc124
Merge pull request #14265 from calixteman/14150
XFA - Avoid an exception when looking for a font in a parent node
2021-11-12 09:53:19 -08:00
Calixte Denizet
5b7e1f5232 XFA - Avoid an exception when looking for a font in a parent node
- it aims to fix issue https://github.com/mozilla/pdf.js/issues/14150;
  - a parent can be null in case the root has been reached, so just add a check.
2021-11-12 16:27:08 +01:00
Brendan Dahl
3a31b7ef0c
Merge pull request #14258 from Snuffleupagus/issue-14256-2
Always prefer abbreviated keys, over full ones, when doing any dictionary lookups (issue 14256)
2021-11-11 14:23:43 -08:00
Jonas Jenwald
8eed0b9145 Report "pageInfo" telemetry once, rather than for each rendered page
Reporting telemetry, in Firefox, includes using `JSON.stringify` on the data and then sending an event to the `PdfStreamConverter.jsm`-code.
In that code the event is handled and `JSON.parse` is used to retrieve the data, and in the "pageInfo"-case we'll then proceed to ignore everything except *the first* such event; see https://searchfox.org/mozilla-central/rev/24fac1ad31fb9c6e9c4c767c6a7ff45d226078f3/toolkit/components/pdfjs/content/PdfStreamConverter.jsm#509-514

All-in-all, sending the "pageInfo" telemetry for each rendered page is thus unnecessary and this patch makes the viewer send it only *once* instead.
2021-11-11 12:36:06 +01:00
Jonas Jenwald
ea1c348c67 Always prefer abbreviated keys, over full ones, when doing any dictionary lookups (issue 14256)
Note that issue 14256 was specifically about *inline* images, please refer to:
 - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G7.1852045
 - https://www.pdfa.org/safedocs-unearths-pdf-inline-image-issue/
 - https://pdf-issues.pdfa.org/32000-2-2020/clause08.html#H8.9.7

However, during review of the initial PR in https://github.com/mozilla/pdf.js/pull/14257#issuecomment-964469710, it was suggested that we instead do this *unconditionally for all* dictionary lookups.
In addition to re-ordering the existing call-sites in the `src/core`-code, and adding non-PRODUCTION/TESTING asserts to catch future errors, for consistency a number of existing `if`/`switch`-blocks were re-factored to also check the abbreviated keys first.
2021-11-10 11:56:18 +01:00
Brendan Dahl
4ee906adf4
Merge pull request #14209 from Snuffleupagus/issue-14205
[Google Chrome] Ensure that `markedContent` spans are placed in the top-left corner (issue 14205)
2021-11-09 07:59:14 -08:00
calixteman
4bb9de4b00
Merge pull request #14239 from calixteman/1739502
XFA - Fix a breakBefore issue when target is a contentArea and startNew is 1 (bug 1739502)
2021-11-08 03:14:42 -08:00
Jonas Jenwald
27e461a897 [Chromium addon] Add the Page scrolling mode to the options (PR 14112 follow-up) 2021-11-08 10:18:25 +01:00
Jonas Jenwald
8064318015
Merge pull request #14250 from calixteman/14249
XFA - Encode tag names in UTF-8 when saving (fix #14249)
2021-11-07 22:28:53 +01:00
Calixte Denizet
13ae6d493a XFA - Encode tag names in UTF-8 when saving (fix #14249) 2021-11-07 21:41:37 +01:00
Tim van der Meij
891f21fba6
Merge pull request #14245 from Snuffleupagus/PDFPageViewBuffer-class
Convert `PDFPageViewBuffer` to a standard class, and use a `Set` internally
2021-11-07 14:37:33 +01:00
Jonas Jenwald
13ef763222 [Google Chrome] Ensure that markedContent spans are placed in the top-left corner (issue 14205)
*This is a tentative patch, since we unfortunately cannot easily test it (as far as I can tell).*

In Firefox this (obviously) works as-is, but in Google Chrome the `markedContent` spans are inserted within the regular text-content (in the DOM) and with non-zero heights.
2021-11-07 11:01:35 +01:00
calixteman
efb4455749
Merge pull request #14240 from calixteman/14014
XFA - Get each page asynchronously in order to avoid blocking the event loop (#14014)
2021-11-06 13:21:43 -07:00
Tim van der Meij
b820beb8d9
Merge pull request #14244 from Snuffleupagus/issue-14243
Prevent mobile devices from interfering with the textLayer-elements (issue 14243)
2021-11-06 19:30:47 +01:00
Calixte Denizet
1681e25008 XFA - Get each page asynchronously in order to avoid blocking the event loop (#14014) 2021-11-06 13:25:03 +01:00
Jonas Jenwald
12d41bcba4 Prevent mobile devices from interfering with the textLayer-elements (issue 14243)
*This is a tentative patch, since I don't have the necessary hardware to test it.*

See https://developer.mozilla.org/en-US/docs/Web/CSS/text-size-adjust, which is currently ignored in Firefox.
It seems overall safer, and more future-proof, to simply add this to the *entire* `textLayer` rather than its individual elements.
2021-11-06 11:39:43 +01:00
Jonas Jenwald
a774707e31 Remove the moveToEndOfArray helper function, since it's unused
With the previous patch, this helper function is no longer used and keeping it around will simply increase the size of the builds.
This removal is purposely done *separately*, to make it easy to revert the patch in the future if this helper function would become useful again.
2021-11-06 10:19:17 +01:00
Jonas Jenwald
f55bf42398 Convert PDFPageViewBuffer to use a Set internally
This relies on the fact that `Set`s preserve the insertion order[1], which means that we can utilize an iterator to access the *first* stored view.

Note that in the `resize`-method, we can now move the visible pages to the back of the buffer using a single loop (hence we don't need to use the `moveToEndOfArray` helper function any more).

---
[1] This applies to `Map`s as well, although that's not entirely relevant here.
2021-11-06 10:19:17 +01:00
Jonas Jenwald
0eba15b43a Convert PDFPageViewBuffer to a standard class
This patch makes use of private `class` fields, to ensure that the previously "private" properties remain as such.
2021-11-06 10:19:17 +01:00
Jonas Jenwald
c62bcb55ac Adjust the "handles resize correctly, with idsToKeep provided" unit-test (PR 14238 follow-up)
This small change will help validate an important part of the upcoming re-factoring, regarding the *correct* iteration of the `Set` in the `PDFPageViewBuffer.resize` method in particular.
2021-11-06 10:19:17 +01:00
Jonas Jenwald
38efd13a54
Merge pull request #14230 from brendandahl/reduce-gradient-size
Create shading patterns the size of the current path. (bug 1722807)
2021-11-06 10:17:49 +01:00
Brendan Dahl
b56cca0324 Create shading patterns the size of the current path. (bug 1722807)
Previously, when we created a shading pattern canvas we created it
as the same size as the page. This was good for caching if the same
pattern was used over and over again, but when lots of different
shadings are created that caused us to create many full page
canvases.

Instead of creating the full page canvses, create the canvas
as the same size as the current path bounding box. This reduces memory
consumption by a lot since most paths are pretty small. Also, in real world
PDFs it's rare for a shading (non shading fill) to be reused over and over again.
Bug 1721949 is an example where the same pattern is reused and it will be slightly
slower than before.
2021-11-05 20:44:18 -07:00
Brendan Dahl
3b5a463357
Merge pull request #14241 from brendandahl/group-bbox-matrix
Don't double apply a group xobject's bbox.
2021-11-05 20:39:02 -07:00
Brendan Dahl
8161d3f29d Don't double apply a group xobject's bbox.
In `beginGroup` we create a new canvas that is the size of the
bounding box and we translate it to the offset. This means we don't need to
also apply the bounding box during `paintFormXObjectBegin`.

This improves #6961 quite a bit, but it still is missing the indention
in the ruler.
2021-11-05 15:40:58 -07:00
Tim van der Meij
30bd5f0a39
Merge pull request #14238 from Snuffleupagus/PDFPageViewBuffer-tests
Add a couple of basic unit-tests for `PDFPageViewBuffer`
2021-11-05 19:58:15 +01:00
Jonas Jenwald
fe205efd8d Add a couple of basic unit-tests for PDFPageViewBuffer
The `PDFPageViewBuffer`-code is very important for the correct function of the viewer, but it's currently not tested at all.
While the `PDFPageViewBuffer` is obviously intended to be used with `PDFPageView`-instances, it only accesses a couple of `PDFPageView` properties/methods and consequently it's fairly easy to unit-test this code with dummy-data.

These unit-tests should help improve our confidence in this code, and will also come in handy with other changes that I'm working on (regarding modernizing and re-factoring the `PDFPageViewBuffer`-code).
2021-11-05 19:43:20 +01:00
Calixte Denizet
a08763f4aa XFA - Fix a breakBefore issue when target is a contentArea and startNew is 1 (bug 1739502)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1739502;
 - when the target area was the current content area, everything was pushed in it instead of creating a new one (and consequently a new pageArea is created).
 - the pdf shows an alignment issue on page 4:
   - the hAlign is "center" but the subform was the width of its parent, so compute the real width of the subform with tb layout;
 - there is an extra empty page at the end of the pdf:
   - there is a subform with some hidden elements which are not rendered for now (since there is no plugged JS engine it isn't possible to draw them in changing their visibility).
   - so in case a subform is empty and has no real dimensions (at least one is 0), we just consider it as empty.
2021-11-05 18:59:55 +01:00
calixteman
e136afbabc
Merge pull request #14218 from janekotovich/subform_min_0
XFA subform with occur min=0 and no bound data displaying.
2021-11-05 04:12:34 -07:00
Jonas Jenwald
8222d6530b
Merge pull request #14232 from brendandahl/show-text-pattern
Use correct matrix for patterns with showText.
2021-11-05 10:04:56 +01:00
Brendan Dahl
1c7048399b Use correct matrix for patterns with showText.
We were incorrectly using the transform in the pattern before it had been
adjusted causing the pattern to be misplaced relative to the page.

Fixes: ShowText-ShadingPattern.pdf (already in corpus)
Fixes: #8111
Fixes: #9243
2021-11-04 16:57:36 -07:00
Jane-Kotovich
56b502391c XFA subform with occur min=0 and no bound data displaying
Subfrom nomin displays even though it's subform is set to <occur max=-1 min=0>
If we look through specs of XFA 3.3 : https://www.pdfa.org/norm-refs/XFA-3_3.pdf
- The min attribute is used when processing a form that contains data. Regardless of the data at least this number of instances is included. It is permissible to set this value to zero, in which case the container is entirely excluded if there is no data for it.

However, in our case it doesn't happen, because we let our empty dataNode get through. Though by setting a clause:
- eliminate unmatched data with occur min=0
we are checking our empty data and sending it to uselessNode array where at the end it gets removed;
2021-11-04 20:22:05 +10:00
Jonas Jenwald
611627f5a1
Merge pull request #14219 from Snuffleupagus/getVisibleElements-ids
Let `getVisibleElements` return a Set containing the visible element `id`s
2021-11-03 23:49:27 +01:00
Jonas Jenwald
e1a35e7bb6
Merge pull request #14213 from Snuffleupagus/issue-11656
Tweak the Bidi-detection heuristics for very short RTL strings (issue 11656)
2021-11-03 22:09:14 +01:00