Commit Graph

1185 Commits

Author SHA1 Message Date
Jonas Jenwald
3ced0dec1b [api-major] Remove the SVG back-end (PR 15173 follow-up)
This has been deprecated since version `2.15.349`, which is a year ago.
Removing this will also simplify some upcoming changes, specifically outputting of JavaScript modules in the builds.
2023-10-01 23:14:29 +02:00
Jonas Jenwald
9624505f0f Use a standard export statement in the web/pdfjs.js file
This removes the only remaining old and non-standard handling of exports in the `web/`-folder, since some initial attempts at outputting JavaScript modules in the builds have identified this file as a potential problem.
While this uses a hard-coded list, for overall simplicity, I don't believe that that's a big problem since:
 - Generating this file automatically would require a bunch more parsing *every single time* that the library is built.
 - The official API-surface doesn't change often enough for this to really impede development in any significant way.
 - The added unit-test helps ensure that this list cannot accidentally become outdated.
2023-09-30 12:10:02 +02:00
Calixte Denizet
f2196f7803 StructParents entry isn't required on pages with no tagged contents (bug 1855641) 2023-09-28 14:23:10 +02:00
Calixte Denizet
3ee5268a23 [Editor] Don't try to add data to the struct tree when there is no accessibilityData (bug 1855157) 2023-09-26 11:02:14 +02:00
Jonas Jenwald
1df31c0284 Use one noContextMenu function in both the src/- and web/-folders
Currently we duplicate this event handler function in multiple places, which seems unnecessary.
2023-09-23 15:37:13 +02:00
Calixte Denizet
6545551e76 [Editor] Avoid to darken the current editor when opening the alt-text dialog 2023-09-21 20:44:53 +02:00
Jonas Jenwald
e2b7896826 [GeckoView] Avoid bundling the AltTextManager class, since it's unused 2023-09-21 12:51:34 +02:00
Calixte Denizet
a8573d4e1b [Editor] Add the ability to create/update the structure tree when saving a pdf containing newly added annotations (bug 1845087)
When there is no tree, the tags for the new annotations are just put under the root element.
When there is a tree, we insert the new tags at the right place using the value
of structTreeParentId (added in PR #16916).
2023-09-16 18:34:58 +02:00
Tim van der Meij
66507ccae8 Enable unit test "creates pdf doc from non-existent URL"
The unit test is re-enabled because it no longer seems to fail after 10
runs on Linux where this used to fail often. Code inspection also shows
that the code is correct and should not raise the previous exception
(anymore). Finally, a lot has changed since this test was disabled such
as new Jasmine versions, new Linux bot OS version and new browser
versions.
2023-09-10 15:47:04 +02:00
Jonas Jenwald
df9cce39c0 Slightly reduce asynchronicity when parsing Annotations
Over time the amount of "document level" data potentially needed during parsing of Annotations has increased a fair bit, which means that we currently need to ensure that a bunch of data is available for each individual Annotation.
Given that this data is "constant" for a PDF document we can instead create (and cache) it lazily, only when needed, *before* starting to parse the Annotations on a page. This way the parsing of individual Annotations should become slightly less asynchronous, which really cannot hurt.

An additional benefit of these changes is that we can reduce the number of parameters that need to be explicitly passed around in the annotation-code, which helps overall readability in my opinion.

One potential drawback of these changes is that the `AnnotationFactory.create` method no longer handles "everything" on its own, however given how few call-sites there are I don't think that's too much of a problem.
2023-09-08 13:27:27 +02:00
Calixte Denizet
a8a50c567a Construct the correct field name and strip out classes when searching
The classes were stripped out when creating the field name but
it led to a wrong name.
Since class components in a path are irrelevant, they're just ignored
when searching for a node in the datasets.
2023-09-07 15:56:47 +02:00
Calixte Denizet
ee3ac35e05 Revert fix for bug 1838855 (bug 1849876)
The issue described in the mentioned bug is really because
Acrobat is rendering the XFA instead of the Acroform.
The original patch just tried to work around the issue but it
introduces some regressions.
2023-08-23 12:34:41 -04:00
Tim van der Meij
5828ac0ee3 Merge pull request #16834 from Snuffleupagus/globalWorkerPort-parallel-test
Add a unit-test for the "correct" way of using the global `workerPort` in parallel (PR 16830 follow-up)
2023-08-19 13:38:16 +02:00
Jonas Jenwald
4d19db0b19 Re-format the code to account for prettier and globals updates
The `prettier` update slightly changed the formatting of some await-expressions; please see https://github.com/prettier/prettier/blob/main/CHANGELOG.md#302

The `globals` update removed the need for some eslint-disable statements; please see https://github.com/sindresorhus/globals/releases/tag/v13.21.0
2023-08-19 09:30:34 +02:00
Jonas Jenwald
29b2050ac2 Improve the "write a new annotation, save the pdf and check that the text content is correct" unit-test (PR 16559 follow-up)
Currently this unit-test will pass just fine if compression is disabled, e.g. by commenting out the relevant code in the `src/core/writer.js` file.
While we don't have a simple way of *directly* checking that the Annotation text-content is compressed, we can however use the resulting file-size as a fairly good proxy. (Note that if compression is disabled the file-size is more than doubled.)
2023-08-15 15:12:17 +02:00
Jonas Jenwald
2422492ee3 Add a unit-test for the "correct" way of using the global workerPort in parallel (PR 16830 follow-up)
Please note that for performance reasons it's not really advised to use the same worker-thread *in parallel* for parsing multiple PDF documents, since they will then unnecessarily compete for resources.
However, given that it's still possible to do that e.g. when using the global `workerPort` it probably won't hurt to add a unit-test for this particular situation.
2023-08-15 12:45:54 +02:00
Jonas Jenwald
66437917db Avoid using the global workerPort when destruction has started, but not yet finished (issue 16777)
Given that the `PDFDocumentLoadingTask.destroy()`-method is documented as being asynchronous, you thus need to await its completion before attempting to load a new PDF document when using the global `workerPort`.
If you don't await destruction as intended then a new `getDocument`-call can remain pending indefinitely, without any kind of indication of the problem, as shown in the issue.

In order to improve the current situation, without unnecessarily complicating the API-implementation, we'll now throw during the `getDocument`-call if the global `workerPort` is in the process of being destroyed.
This part of the code-base has apparently never been covered by any tests, hence the patch adds unit-tests for both the *correct* usage (awaiting destruction) as well as the specific case outlined in the issue.
2023-08-12 21:21:50 +02:00
Jonas Jenwald
389a26c115 Fallback to check all pages when getting the pageIndex of FieldObjects
Given that the FieldObjects are parsed in parallel, in combination with the existing caching in the `getPage`-method and `annotations`-getter, adding additional caches for this fallback code-path doesn't seem entirely necessary.
2023-08-10 17:10:04 +02:00
Jonas Jenwald
64e8557fb5 [api-minor] Deprecate the PDFDocumentProxy.getJavaScript method
This method is very old, however with the exception of the auto-print hack (when scripting is disabled) in the viewer it's never actually been used.

Most likely the idea with `PDFDocumentProxy.getJavaScript` was that it'd be useful if scripting support was added, however it turned out that it was a bit too simplistic and instead a number of new methods were added for the scripting use-cases.
2023-08-01 09:02:05 +02:00
Jonas Jenwald
d022912719 Remove most build-time require-calls from the src/display/-folder
By leveraging import maps we can get rid of *most* of the remaining `require`-calls in the `src/display/`-folder, since we should strive to use modern `import`-statements wherever possible.
The only remaining cases are Node.js-specific dependencies, since those seem very difficult to convert unless we start producing a bundle *specifically* for Node.js environments.
2023-07-17 19:47:13 +02:00
Jonas Jenwald
3a886e7264 Move the isNodeJS-helper into the src/shared/util.js file
With the changes in the previous patch the `isNodeJS`-helper no longer needs to live in its own file, which helps get rid of a closure in the *built* files.
2023-07-17 16:42:25 +02:00
Jonas Jenwald
86a868189c Re-factor the PDFScriptingManager-class for the viewer-components
Currently this class contains a few "special" code-paths for the COMPONENTS build-target, which normally wouldn't be a problem. However, in this particular case that means accessing code that we don't want to include unconditionally in all builds.
This is currently implemented using build-time `require`-calls which we nowadays want to avoid, and we should strive to remove all such cases from the code-base. (Generally speaking `import` is the future, and build-tools may not always play well with a mix of both formats.)

We can easily improve things here by using sub-classing for the COMPONENTS build-target, and then use the ability to re-name when exporting (to avoid breaking existing code).
2023-07-16 08:51:46 +02:00
Jonas Jenwald
f84657d837 Address formatting changes from Prettier version 3 2023-07-15 10:44:39 +02:00
Jonas Jenwald
506bca5e6d Add unit-tests to check that more PDF.js APIs expose the expected functionality
Similar to e.g. PR 16587, let's ensure that the `pdf.worker.js` and `pdf.image_decoders.js` files expose the expected functionality.
2023-07-07 12:36:21 +02:00
Jonas Jenwald
6442a6cc4e Improve parseAppearanceStream to handle more "complex" ColorSpaces
The existing code is unable to *correctly* extract the color from the appearance-stream when the ColorSpace-data is "complex". To reproduce this:
 - Open `freetexts.pdf` in the viewer.
 - Note the purple color of the "Hello World from Preview" annotation.
 - Enable any of the Editors.
 - Note how the relevant annotation is now black.
2023-07-06 15:58:09 +02:00
Calixte Denizet
77656ce881 [Editor] When saving/printing a FreeText, use the identity matrix for the AP and set the cm when rendering it
When there was a rotation, the generated bbox was wrong because the width and
height were swapped.
This patch aims to fix this issue by rewriting the FreeText code generation
to produce something similar to what Acrobat does.
It also fixes the name of the font, which wasn't the correct one when calling the
evaluator.
2023-07-05 16:37:01 +02:00
Jonas Jenwald
39113baa33 Move the transfers computation into the AnnotationStorage class
Rather than having to *manually* determine the potential `transfers` at various spots in the API, we can let the `AnnotationStorage.serializable` getter include this.
To further simplify things, we can also let the `serializable` getter compute and include the `hash`-string as well.
2023-06-29 19:51:57 +02:00
calixteman
88c7c8b5bf Merge pull request #16588 from calixteman/editor_stamp_2
[Editor] Add support for printing/saving newly added Stamp annotations
2023-06-28 22:42:54 +02:00
Calixte Denizet
599b9498f2 [Editor] Add support for printing/saving newly added Stamp annotations
In order to minimize the size of a saved pdf, we generate only one
image and use a reference in each annotation using it.
When printing, it's slightly different since we have to render each page
independently but we use the same image within a page.
2023-06-26 15:47:05 +02:00
Jonas Jenwald
5f5db4b160 Run the PDF.js-viewer API unit-test in Node.js environments (PR 16592 follow-up)
It occurred to me that we can actually run this unit-test in Node.js environments by making use of the preprocessor to stub out the browser globals there.
2023-06-26 09:37:34 +02:00
Jonas Jenwald
e153e3a741 Expose FindState in the viewer-components (issue 16589) 2023-06-24 13:23:02 +02:00
Jonas Jenwald
f596490a1b Add a unit-test to check that the *official* PDF.js-viewer API exposes the expected functionality
Until now we've not actually had *any* tests that ensure that the *official* PDF.js-viewer API exposes the intended functionality, which means that things can easily break accidentally.

*Please note:* This unit-test cannot (easily) be run in Node.js-environments, since the `external/webL10n/l10n.js` file contains various browser-specific functionality.
2023-06-23 12:22:54 +02:00
Jonas Jenwald
0bbadce066 Add a unit-test to check that the *official* PDF.js API exposes the expected functionality
Until now we've not actually had *any* tests that ensure that the *official* PDF.js API exposes the intended functionality, which means that things can easily break accidentally.
2023-06-22 15:21:10 +02:00
Calixte Denizet
d1e172458f [api-minor] Make the popup independent of their associated annotations
- it'll help to be able to move popups on screen to let the user read the text
- popups won't inherit some properties from their parent:
  - the popup can be misrendered if for example the parent has a clip-path property.
- add an outline to the popup when the parent is focused.
- hide a popup when it's clicked.
2023-06-20 15:30:39 +02:00
Calixte Denizet
5c0054d58d Guess that a checkbox belongs to a group in using its T value (bug 1838855) 2023-06-16 18:45:09 +02:00
Jonas Jenwald
2cb113b545 Improve handling of /Filter-entries in writeStream
Fix handling of /Filter-entries, since the current implementation could potentially corrupt the data if there are multiple filters present.
Please note that filters are applied *sequentially* during decoding, starting from the first one in the Array, hence the first Array-entry needs to be /FlateDecode in order for things to actually work correctly.

To prevent a future bug, if we want to save more "complex" data such as images, also ensure that we include any existing /DecodeParms-entries when updating the /Filter-entry.
2023-06-16 10:27:23 +02:00
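The /Filter ordering described in 2cb113b545 can be illustrated with a minimal sketch. It uses plain strings and arrays rather than the PDF.js primitives, and `prependFlateDecode` is a hypothetical helper, not the actual `writeStream` code. It only shows why the newly applied filter goes first and how any existing /DecodeParms-entries stay aligned with their filters.

```javascript
// Hypothetical helper (not the PDF.js implementation): compute the /Filter and
// /DecodeParms values for a stream that is additionally Flate-compressed.
function prependFlateDecode(filter, decodeParms) {
  const toArray = value => {
    if (value === undefined) {
      return [];
    }
    return Array.isArray(value) ? value : [value];
  };
  const filters = toArray(filter);
  const parms = toArray(decodeParms);

  // Filters are applied in Array order when decoding, so the compression
  // that is applied last when writing must be listed first.
  const newFilters = ["FlateDecode", ...filters];

  // Keep one /DecodeParms entry per filter; `null` means "no parameters".
  const hasParms = parms.some(entry => entry != null);
  const newParms = hasParms
    ? [null, ...filters.map((_, i) => parms[i] ?? null)]
    : undefined;

  return { filter: newFilters, decodeParms: newParms };
}

console.log(prependFlateDecode("DCTDecode"));
console.log(prependFlateDecode(["LZWDecode"], [{ EarlyChange: 0 }]));
```

In the second call the existing parameters move to index 1, next to the filter they belong to, while the new first entry is `null`.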
Calixte Denizet
85b38fc247 Add a test to check that the compression is ok when saving an annotation 2023-06-16 10:05:42 +02:00
Calixte Denizet
71479fdd21 [Editor] Avoid to have duplicated entries in the Annot array when saving an existing and modified annotation 2023-06-15 22:02:10 +02:00
Jonas Jenwald
877884029d Merge pull request #16551 from Snuffleupagus/page-destroyed-complete
Ensure that `cleanup` during rendering is actually ignored, to prevent a blank canvas
2023-06-15 12:26:57 +02:00
Jonas Jenwald
0650be4641 Merge pull request #16550 from Snuffleupagus/rm-RenderingCancelledException-type
[api-minor] Remove the `type` from `RenderingCancelledException` (PR 16226 follow-up)
2023-06-15 12:26:27 +02:00
Jonas Jenwald
a591c3de84 Ensure that cleanup during rendering is actually ignored, to prevent a blank canvas
The existing unit-test doesn't work as intended, since the page never actually renders. Note how `cleanup` is *not* allowed to run when parsing and/or rendering is ongoing, however an (old) incorrect condition could prevent rendering from ever starting.

This is very old code, which has been slightly re-factored a couple of times (many years ago), however this doesn't appear to affect e.g. the default viewer since the incorrect behaviour seems highly dependent on "unlucky" timing.
Note also how at the start of the `PDFPageProxy.prototype.render`-method we purposely cancel any pending `cleanup`-call, to prevent unnecessary re-parsing for multiple sequential `render`-calls.

Finally, avoid running `cleanup` when document/page destruction has already started since it's pointless in that case.
2023-06-15 11:39:26 +02:00
Jonas Jenwald
225734dd00 [api-minor] Remove the type from RenderingCancelledException (PR 16226 follow-up)
After PR 16226 we're only using `RenderingCancelledException` together with canvas-rendering, hence the `type`-property is no longer necessary.
2023-06-14 15:40:25 +02:00
Jonas Jenwald
fee850737b Enable the unicorn/prefer-optional-catch-binding ESLint plugin rule
According to MDN this format is available in all browsers/environments that we currently support, see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/try...catch#browser_compatibility

Please also see https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-optional-catch-binding.md
2023-06-12 11:46:11 +02:00
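As a small illustration of the format that the `unicorn/prefer-optional-catch-binding` rule in fee850737b asks for: when the caught error is not used, the `(ex)` binding can simply be omitted. The `parseOrNull` helper below is hypothetical and only serves as an example.

```javascript
// Hypothetical helper, used only to illustrate the optional catch binding.
function parseOrNull(text) {
  try {
    return JSON.parse(text);
  } catch {
    // No unused `(ex)` binding is needed when the error itself is ignored.
    return null;
  }
}

console.log(parseOrNull('{"page": 1}'));
console.log(parseOrNull("{not json"));
```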
Calixte Denizet
ba8c996623 [Editor] Guess font size and color from the AS of FreeText annotations 2023-06-05 17:15:17 +02:00
Jonas Jenwald
cf3a35e9da Enable the import/no-cycle ESLint plugin rule
Having cyclical imports is obviously not a good idea, and this ESLint plugin rule can help detect those; please see https://github.com/import-js/eslint-plugin-import/blob/main/docs/rules/no-cycle.md
2023-06-04 13:44:15 +02:00
Calixte Denizet
133d103186 [Editor] Add few more info when saving ink data (thickness, opacity, ...)
Fix the InkList entry: the coordinates were relative to the page and not
to the bounding box of the annotation.
2023-05-31 15:43:07 +02:00
Calixte Denizet
d2b4ed3cea [Editor] Improve curve smoothing for Ink tool (bug 1789443)
- Remove the dependency on fit-curve;
- Improve the way to draw the current line by using a Path2D and
  by clearing only the last part of the curve instead of clearing
  all the canvas;
- Smooth the curve when drawing to avoid having some changes after
  the drawing ends;
- Make the smoothing a bit less aggressive.
2023-05-23 17:15:21 +02:00
Jonas Jenwald
8c4821ceda [api-minor] Slightly shorten the marked-content ids used in the textLayer
Generally we try to keep the ids that we create short, hence we can slightly shorten the "static" parts of them.
2023-05-18 22:32:10 +02:00
Jonas Jenwald
04de155aaa Slightly shorten the loadedName-ids used with font-substitutions
Generally we try to keep the ids that we create short, hence we can slightly shorten the "static" part of them.
2023-05-18 22:27:11 +02:00
Calixte Denizet
3091e70aad Flush the current chunk when the font changed because of a restore op (issue #14755) 2023-05-18 19:37:16 +02:00
Calixte Denizet
385f275ad9 Warn when pdf.js can't load an OS font 2023-05-16 14:58:38 +02:00
Jonas Jenwald
cb1a10e358 Check the css property in the getFontSubstitution unit-tests
Given that the `css` property isn't constant, since it contains document/font ids, we cannot just check it directly. However, we can make use of regular expressions to ensure that the format is generally correct.
2023-05-14 19:11:35 +02:00
calixteman
4101128c09 Merge pull request #16421 from calixteman/font_subst_test
Add tests for the font substitution
2023-05-14 18:23:12 +02:00
Calixte Denizet
89140fcd98 Add tests for the font substitution 2023-05-14 18:07:03 +02:00
Jonas Jenwald
8fbd6755eb Enable the unicorn/no-useless-promise-resolve-reject ESLint plugin rule
Please see https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/no-useless-promise-resolve-reject.md

Note that this patch also re-sorts the existing `unicorn`-rules in proper alphabetical order.
2023-05-13 11:30:25 +02:00
Jonas Jenwald
8f3940fbf3 Move the sidebar-resizing handling into the PDFSidebar class
Originally the `PDFSidebarResizer` class was slightly larger, since the code used to contain e.g. feature testing for older (and no longer supported) browsers.
Given that there's some amount of overlap, when it comes to what DOM-elements and state that these classes need, it now seems reasonable to simply move the sidebar-resizing into the `PDFSidebar` class.

For the MOZCENTRAL build-target this patch reduces the size of the *built* `web/viewer.js` file by just over `1.1` kilobytes.
2023-05-12 10:00:12 +02:00
Calixte Denizet
cfb908c999 Add a cache to avoid to load several times a local font
On my computer, it takes a few tenths of a second to load a local font.
Since a font can be used several times in a document, the cache will
improve performance.
2023-05-10 20:01:21 +02:00
Calixte Denizet
2486536843 Compress the data when saving annotations
CompressionStream API has been added in Firefox 113
(see https://bugzilla.mozilla.org/show_bug.cgi?id=1823619)
hence we can use it to compress the streams with added/modified
annotations.
2023-05-09 14:46:50 +02:00
Jonas Jenwald
317abd6d07 Change the createPromiseCapability helper function into a PromiseCapability class
This is not only slightly more compact, but it also simplifies the handling of the `settled` getter.
2023-04-29 13:43:24 +02:00
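To make the idea behind 317abd6d07 concrete, here is a minimal sketch of what a `PromiseCapability` class with a `settled` getter could look like. It is written from the commit description only and may differ from the actual PDF.js implementation.

```javascript
// Sketch only: a promise bundled with its resolve/reject functions, plus a
// `settled` getter backed by a private field.
class PromiseCapability {
  #settled = false;

  constructor() {
    this.promise = new Promise((resolve, reject) => {
      this.resolve = data => {
        this.#settled = true;
        resolve(data);
      };
      this.reject = reason => {
        this.#settled = true;
        reject(reason);
      };
    });
  }

  get settled() {
    return this.#settled;
  }
}

const capability = new PromiseCapability();
console.log(capability.settled); // false
capability.resolve("done");
console.log(capability.settled); // true
```

Keeping the state in a private field means that `settled` cannot be changed from the outside, which is one reason a class can be tidier than a helper function returning a plain object.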
Calixte Denizet
19ca41896e Correctly clip the text in the text layer (fixes #16316) 2023-04-18 17:00:42 +02:00
Calixte Denizet
117bbf7cd9 [api-minor] Don't normalize the text used in the text layer.
Some Arabic chars like \ufe94 can be searched for in a pdf, hence they must be normalized
when creating the search query. To avoid duplicating the normalization code,
everything is moved into the find controller.
The previous code to normalize text was using NFKC but with a hardcoded map, hence it
has been replaced by the use of normalize("NFKC") (it helps to reduce the bundle size
by 30kb).
While playing with this \ufe94 char, I noticed that the bidi algorithm wasn't taking into
account some RTL unicode ranges, the generated font wasn't embedding the mapping for this
char and the unicode ranges in the OS/2 table weren't up-to-date.

When normalized, some chars can be replaced by several ones, which led to
some extra chars in the text layer. To avoid any regression, when copying some text
from the text layer, the copied string is normalized (NFKC) before being put in the
clipboard (it works like this in both Acrobat and Chrome).
2023-04-17 14:31:23 +02:00
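The effect of `normalize("NFKC")` mentioned in 117bbf7cd9 can be seen in a short sketch. The `normalizeQuery` helper is hypothetical and is not the find-controller code; it only shows that one char may turn into several, which is why the text layer and the search query need separate treatment.

```javascript
// Hypothetical helper, not the actual find-controller code.
function normalizeQuery(query) {
  return query.normalize("NFKC");
}

const ligature = "\ufb01"; // LATIN SMALL LIGATURE FI, one UTF-16 code unit
const normalized = normalizeQuery(ligature);
console.log(normalized, ligature.length, normalized.length); // fi 1 2

// The char from the commit message is also changed by NFKC normalization.
console.log(normalizeQuery("\ufe94") !== "\ufe94");
```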
Jonas Jenwald
0e19c3a120 [api-minor] Add support, in PDFFindController, for mixing phrase/word searches (issue 7442)
*Please note:* This patch only extends the `PDFFindController` implementation itself to support this functionality, however it's *purposely* not exposed in the default viewer.

This replaces the previous `phraseSearch`-parameter, and a `query`-string will now always be interpreted as a phrase-search.
To enable searching for individual words, the `query`-parameter must instead consist of an Array of strings. This way it's now also possible to combine phrase/word searches, with a `query`-parameter looking something like `["Lorem ipsum", "foo", "bar"]` which will search for the phrase "Lorem ipsum" *and* the words "foo" and "bar".
2023-04-15 13:32:37 +02:00
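The two `query` shapes from 0e19c3a120 can be sketched as follows. Only the shapes (a string for a phrase, an Array of strings for mixed phrase/word searches) come from the commit message; `findMatches` and `escapeRegExp` are hypothetical helpers, and the matching strategy (report a match of any entry) is an assumption rather than the actual `PDFFindController` logic.

```javascript
// Hypothetical sketch: each Array entry is treated as one phrase, and a match
// of any entry is reported. The real `PDFFindController` logic may differ.
function escapeRegExp(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function findMatches(text, query) {
  const phrases = Array.isArray(query) ? query : [query];
  const pattern = new RegExp(phrases.map(escapeRegExp).join("|"), "g");
  return Array.from(text.matchAll(pattern), match => match[0]);
}

const text = "Lorem ipsum dolor sit amet, foo and ipsum";
// A string is always a phrase search.
console.log(findMatches(text, "Lorem ipsum"));
// An Array mixes a phrase with individual words.
console.log(findMatches(text, ["Lorem ipsum", "foo", "bar"]));
```

Note that the lone "ipsum" at the end of the text is not reported in either call, since it is neither the full phrase nor one of the listed words.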
Calixte Denizet
d8795f9f8f Fix search of numbers inside fractions 2023-04-11 20:57:26 +02:00
Jonas Jenwald
5063a6f2a9 [api-minor] Remove the disableCombineTextItems option
*Please note:* This parameter has never been used within the PDF.js library/viewer itself, and it was only ever added for backwards compatibility reasons.

This parameter was added in PR 7475, over six years ago, to try and optionally maintain the previous *default* text-extraction behaviour.
However as part of the general text-extraction improvements in PR 13257, almost two years ago, the `disableCombineTextItems` functionality was accidentally "broken" in various ways. Note how the only (very basic) unit-test was updated in a way that doesn't really make sense, since generally speaking you'd expect that using the option should result in *more* (or at least the same number of) text-items. Furthermore there's also the recent issue 16209, where the option causes almost all textContent to be concatenated together.

Hence this patch proposes that we simply remove the `disableCombineTextItems` option since it's essentially unused/untested functionality, as evident from the fact that it took almost two years for someone to notice that it's broken.
2023-03-30 14:23:38 +02:00
Calixte Denizet
a96f10e55d Create a new chunk when the char is too raised compared to the previous one 2023-03-28 13:56:46 +02:00
Jonas Jenwald
1fc09f0235 Enable the unicorn/prefer-string-replace-all ESLint plugin rule
Note that the `replaceAll` method still requires that a *global* regular expression is used, however by using this method it's immediately obvious when looking at the code that all occurrences will be replaced; please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replaceAll#parameters

Please find additional details at https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-string-replace-all.md
2023-03-23 12:57:10 +01:00
Jonas Jenwald
5f64621d46 Use String.prototype.replaceAll() where appropriate
This fairly new method allows replacing *multiple* occurrences within a string without having to use regular expressions.

Please refer to:
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replaceAll
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replaceAll#browser_compatibility
2023-03-22 15:31:10 +01:00
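The behaviour behind the two `replaceAll` commits above (1fc09f0235 and 5f64621d46) can be checked with a few lines; the sample string is made up for illustration.

```javascript
const label = "page 1 of 10";

// With a string pattern, `replace` only touches the first occurrence...
console.log(label.replace(" ", "_")); // page_1 of 10
// ...while `replaceAll` handles every occurrence, no regular expression needed.
console.log(label.replaceAll(" ", "_")); // page_1_of_10

// A regular expression is still allowed, but it must be global.
console.log(label.replaceAll(/\s/g, "_")); // page_1_of_10

let threwTypeError = false;
try {
  label.replaceAll(/\s/, "_"); // non-global regular expression
} catch (ex) {
  threwTypeError = ex instanceof TypeError;
}
console.log(threwTypeError); // true
```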
Jonas Jenwald
137a2d6e30 Add even more non-standard ligatures (PR 15517 follow-up)
Given that we already create multi-byte ToUnicode entries in other cases, see e.g. the `getNormalizedUnicodes` table, this is hopefully fine.
2023-03-22 10:42:52 +01:00
Jonas Jenwald
9321758d91 Merge pull request #16186 from Snuffleupagus/issue-16176
Support multi-byte ToUnicode entries, when using predefined CMaps (issue 16176)
2023-03-21 22:17:18 +01:00
Jonas Jenwald
d4bcfe8c16 Support multi-byte ToUnicode entries, when using predefined CMaps (issue 16176)
Hopefully this makes sense, since we already "create" multi-byte ToUnicode entries in other cases (see e.g. the `getNormalizedUnicodes` table).
2023-03-21 21:35:57 +01:00
calixteman
8bfebf1c24 Merge pull request #16188 from calixteman/bug1823296
Use the position of the previous xref stream if any when saving a pdf (bug 1823296)
2023-03-21 21:21:49 +01:00
Calixte Denizet
2d0f30a67c Use the position of the previous xref stream if any when saving a pdf (bug 1823296) 2023-03-21 19:27:24 +01:00
Jonas Jenwald
c4a725fe98 Fix the transfer parameter, for structuredClone, in the LoopbackPort
The way that we handle the `transfer` parameter is unfortunately wrong, ever since PR 14392 which introduced the code, given that the MDN article originally contained incorrect information; please see https://github.com/mdn/content/pull/23164

By updating the `structuredClone` call such that it works correctly, we can enable more unit-tests in Node.js environments; please refer to https://developer.mozilla.org/en-US/docs/Web/API/structuredClone#parameters
2023-03-19 22:04:01 +01:00
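For reference, a minimal sketch of the `structuredClone` call shape that c4a725fe98 relies on: the second argument is an options object whose `transfer` property lists the transferable objects. This assumes an environment where `structuredClone` is available (e.g. Node.js 17 or later, or a current browser) and is not the `LoopbackPort` code itself.

```javascript
const buffer = new Uint8Array([1, 2, 3]).buffer;

// The transferables go in an options object, under the `transfer` key.
const clone = structuredClone({ data: buffer }, { transfer: [buffer] });

// The original buffer is detached by the transfer, the clone owns the bytes.
console.log(buffer.byteLength); // 0
console.log(clone.data.byteLength); // 3
```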
Jonas Jenwald
0e54a3c37a Warn about missing/incorrect --scale-factor CSS-variable in renderTextLayer (issue 16139)
Unfortunately I don't believe that we can simply add a default `--scale-factor` CSS-variable to the `container`-element, since that might not be entirely appropriate/correct in all cases.[1]
However, we can at least print a console-error to hopefully make this situation more apparent to users. (This is purposely not using the `warn` helper-function, since those messages can be disabled.)

---
[1] One example is in our reference-tests, where we don't need to add it to the `container`-element itself.
2023-03-16 11:53:12 +01:00
calixteman
b2a86350fc Merge pull request #16096 from bungeman/fix_trig_functions
Correct PostScript trigonometric operators
2023-03-11 14:32:23 +01:00
Calixte Denizet
07b094729e Fix search in a pdf containing some UTF-32 characters (bug 1820909)
Some chars were supposed to have a length equal to 1 but UTF-32 chars
can be longer.
2023-03-09 15:03:01 +01:00
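The length mismatch described in 07b094729e is easy to reproduce: a character outside the Basic Multilingual Plane occupies two UTF-16 code units in a JavaScript string. The sample string below is made up for illustration.

```javascript
const sample = "a\u{1F600}b"; // "a", an emoji outside the BMP, "b"

// `length` counts UTF-16 code units, so the emoji counts twice.
console.log(sample.length); // 4
// Iterating the string yields whole code points instead.
console.log([...sample].length); // 3
console.log(sample.codePointAt(1).toString(16)); // 1f600
```

Any code that maps match positions back to text therefore has to decide whether it counts code units or code points.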
Calixte Denizet
b8dda089e2 Slightly modify the max width of a tracking space 2023-03-07 19:38:49 +01:00
Ben Wagner
158c836e26 Correct PostScript trigonometric operators
PDF 32000-1:2008 7.10.5.1 "Type 4 (PostScript Calculator) Functions"
defers to the PostScript Language Reference for the description of these
functions. The PostScript Language Reference, third edition chapter 8
"Operators" defines the `angle` type as a "number of degrees". Section
8.1 defines "angle `sin` real", "angle `cos` real", and "num den `atan`
angle". The documentation for `atan` further states that it will return
an angle in degrees between 0 and 360.

Handle these operators correctly in `PostScriptEvaluator.execute`.
Convert the inputs to `sin` and `cos` from degrees to radians for use
with `Math.sin` and `Math.cos`. Correctly pop two values from the stack
for `atan`, use `Math.atan2`, and convert from radians to (positive)
degrees.
2023-03-03 17:25:11 -05:00
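The conversions described in 158c836e26 can be sketched with stand-alone functions. The names `psSin`, `psCos` and `psAtan` are hypothetical; the actual change lives in `PostScriptEvaluator.execute` and works on an operand stack, which is left out here.

```javascript
// Hypothetical stand-alone helpers, not the `PostScriptEvaluator` code.
const DEGREES_TO_RADIANS = Math.PI / 180;

// `angle sin real` and `angle cos real`: the operand is in degrees.
function psSin(angle) {
  return Math.sin(angle * DEGREES_TO_RADIANS);
}

function psCos(angle) {
  return Math.cos(angle * DEGREES_TO_RADIANS);
}

// `num den atan angle`: two operands in, one angle in degrees out.
function psAtan(num, den) {
  const degrees = Math.atan2(num, den) / DEGREES_TO_RADIANS;
  // `Math.atan2` can return negative angles; shift those into 0..360.
  return degrees < 0 ? degrees + 360 : degrees;
}

console.log(psSin(90)); // approximately 1
console.log(psCos(180)); // approximately -1
console.log(psAtan(1, 0)); // approximately 90
console.log(psAtan(-1, 0)); // approximately 270
```

The results are only approximate because of floating-point rounding in the degree/radian conversion.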
Calixte Denizet
fd03cd5493 [api-minor] Generate images in the worker instead of the main thread.
We introduced the use of OffscreenCanvas in #14754 and this patch aims
to use them for all kinds of images.
It'll slightly improve performance (and maybe slightly decrease memory use).
An image can be rendered using some transfer maps, but with an
OffscreenCanvas we don't have the underlying pixels array, so the transfer maps
handling is re-implemented using the SVG filter feComponentTransfer.
2023-03-01 17:40:12 +01:00
Jonas Jenwald
f42a2e8451 [api-minor] Move the canvasFactory option into getDocument
Rather than repeatedly initializing a `canvasFactory`-instance for every page, move it to the document-level instead.

*Please note:* This patch is written using the GitHub UI, since I'm currently without a dev machine, so hopefully it works correctly.
2023-03-01 09:07:16 +01:00
Calixte Denizet
3a21423386 [Acroform] Use the full path to find the node in the XFA datasets where to store the value
I noticed several 'Path not found' errors because of a field called #subform[2].
From the XFA specs, the hash is used for a class of elements in the template tree.
When we're looking for a node in the datasets tree, it doesn't make sense to search
for a class. Hence the path elements starting with a hash are just skipped.
2023-02-23 12:09:39 +01:00
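The skipping of class components described in 3a21423386 can be illustrated with a small sketch. It assumes a dot-separated path, and `stripClassComponents` is a hypothetical helper rather than the actual PDF.js code; only the `#subform[2]` component is taken from the commit message.

```javascript
// Hypothetical helper: drop the path components that name a class (leading
// "#"), since they are irrelevant when searching the datasets tree.
function stripClassComponents(path) {
  return path.split(".").filter(component => !component.startsWith("#"));
}

console.log(stripClassComponents("form1.#subform[2].address.city"));
```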
Calixte Denizet
fc7d74385f Don't replace an eol by a whitespace when the last char is a Katakana-Hiragana diacritic 2023-02-16 11:31:58 +01:00
Jonas Jenwald
6d4d402a78 Move the arrayBuffersToBytes helper function into the worker-thread
Given that this helper function is only used on the worker-thread, there's no reason to duplicate it in both of the *built* `pdf.js` and `pdf.worker.js` files.
2023-02-11 21:34:37 +01:00
Calixte Denizet
4e9f26afa3 Ignore position of combining diacritics when getting text (bug 1640217) 2023-02-09 17:13:57 +01:00
Jonas Jenwald
90ffbc1d39 Remove most build-time require statements from the viewer (PR 16009 follow-up)
This further extends the web-specific import maps introduced in PR 16009, to allow removing *most* of the build-time `require` statements from the viewer. The few remaining ones are fallbacks used for the COMPONENTS and the `legacy` GENERIC builds.
2023-02-07 22:45:19 +01:00
calixteman
ecd86ccffc Merge pull request #16020 from calixteman/bug1815476
[Annotation] Avoid to encrypt the appearance stream two times (bug 1815476)
2023-02-07 20:57:49 +01:00
Calixte Denizet
ea7b4b4d6c [Annotation] Avoid to encrypt the appearance stream two times (bug 1815476) 2023-02-07 19:26:46 +01:00
Jonas Jenwald
a98e80c4ff [GeckoView] Reduce the size of the *built* viewer
Given that the GV-viewer isn't using most of the UI-related components of the default-viewer, we can avoid including them in the *built* viewer to save space.[1]
The least "invasive" way of implementing this, at least that I could come up with, is to leverage import maps with suitable stubs for the GV-viewer.

The one slightly annoying thing is that we now have larger import maps across multiple html-files, and you'll need to remember to update all of them when making future changes.

---
[1] With this patch, the built `viewer.js` size is 391 kB and `viewer-geckoview.js` is 285 kB.
2023-02-05 14:12:32 +01:00
Tim van der Meij
e848a0e61c
Merge pull request #15981 from Snuffleupagus/cMapPacked-true
[api-minor] Let the `cMapPacked` parameter, in `getDocument`, default to `true`
2023-02-04 15:00:26 +01:00
Jonas Jenwald
851c394e64 Remove the isEmptyObj unit-test helper function
We should be able to let Jasmine simply compare directly against an actually empty Object, rather than using a manually implemented helper function for that.
2023-02-04 12:43:53 +01:00
Jonas Jenwald
c5d6391898 [api-minor] Let the cMapPacked parameter, in getDocument, default to true
The initial CMap support was added in PR 4259 using the "raw" Adobe files, however they were quickly deemed to be unnecessarily large. As a result PR 4470 introduced the more compact "binary" CMap format, with both of those PRs being included in the very same release (version `0.8.1334`).

Please note that we've thus never shipped anything *except* the "binary" CMap files with the PDF library, and furthermore note that we've not even once updated the CMap files since they were originally added almost nine years ago.

Requiring users to remember that `cMapPacked = true` is necessary, in addition to setting the `cMapUrl` parameter, in order for CMap loading to work feels like a less than ideal API.
Hence this patch, which suggests that we simply let `cMapPacked` default to `true` now.
2023-01-30 15:35:02 +01:00
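A minimal sketch of the calling convention described in the commit above. The helper `buildCMapParams`, the file name, and the `/cmaps/` location are assumptions made for illustration; the helper is not part of PDF.js and only models the new default so that the snippet runs on its own.

```javascript
// Hypothetical helper, not part of PDF.js: models how `cMapPacked`
// now defaults to `true` when the caller does not set it.
function buildCMapParams({ url, cMapUrl, cMapPacked = true }) {
  return { url, cMapUrl, cMapPacked };
}

// Previously, callers had to remember to pass `cMapPacked: true` themselves.
const params = buildCMapParams({
  url: "document.pdf", // assumed file name
  cMapUrl: "/cmaps/", // assumed location of the binary CMap files
});
console.log(params.cMapPacked); // true

// An explicit value still wins over the default.
const rawParams = buildCMapParams({
  url: "document.pdf",
  cMapUrl: "/raw-cmaps/", // assumed location of non-binary CMap files
  cMapPacked: false,
});
console.log(rawParams.cMapPacked); // false

// With the real library, such an object would be passed to `getDocument`:
//   const pdf = await pdfjsLib.getDocument(params).promise;
```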
Calixte Denizet
6f4d037a8e [JS] Correctly format field with numbers (bug 1811694, bug 1811510)
In PR #15757, a value is automatically converted into a number when possible,
but the case of numbers like "000123" was overlooked, and their format must
be preserved.
When a script does something like "foo.value + bar.value" and the values are
numbers, then "foo.value" must return a number, but the displayed value must be what
the user entered or what a script set. This patch therefore adds a field,
_orginalValue, to track the value as it was defined.
Some people use a comma as the decimal separator, so it must be considered
when a value is parsed into a number.
This patch fixes a regression introduced by #15757.
2023-01-26 14:57:02 +01:00
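The behaviour described in the commit above can be sketched roughly as follows. `SketchField` and `displayedValue` are hypothetical names, and the parsing rule is a simplification; this is not the PDF.js scripting implementation, only an illustration of keeping the entered text while exposing a number to scripts.

```javascript
// Hypothetical stand-in for a form field; not the PDF.js scripting class.
class SketchField {
  #original = "";

  set value(input) {
    // Remember the text exactly as it was entered or set.
    this.#original = String(input);
  }

  // Scripts see a number when the text is numeric, else the raw string.
  get value() {
    const text = this.#original.trim();
    if (text === "") {
      return this.#original;
    }
    // Treat a comma as a decimal separator before parsing.
    const parsed = Number(text.replace(",", "."));
    return Number.isNaN(parsed) ? this.#original : parsed;
  }

  // What the user typed or a script set, with its formatting preserved.
  get displayedValue() {
    return this.#original;
  }
}

const foo = new SketchField();
const bar = new SketchField();
foo.value = "000123";
bar.value = "1,5";

console.log(foo.value + bar.value); // 124.5
console.log(foo.displayedValue); // "000123"
```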
Tim van der Meij
a27d7ba524
Merge pull request #15943 from Snuffleupagus/deprecate-direct-PDFDataRangeTransport
[api-minor] Deprecate calling `getDocument` directly with a `PDFDataRangeTransport`-instance
2023-01-21 13:50:20 +01:00
Calixte Denizet
dc94b750de [GV] Avoid to update the finder when the results aren't complete
At the beginning of a search, an update can be triggered with 0 out of 0
found matches.
In the GeckoView context, we can't update the finder whenever we want, but only
when it has been requested.
2023-01-20 18:13:16 +01:00
Jonas Jenwald
7976fc7851 [api-minor] Deprecate calling getDocument directly with a PDFDataRangeTransport-instance
In general it's recommended to pass a *parameter object* when calling the `getDocument`-function in the API, since that's the only way to provide additional options, and the fact that it also accepts a URL or TypedArray directly is now mostly for backwards compatibility reasons.
However, the `getDocument`-function also accepts a direct `PDFDataRangeTransport`-instance which just seems unnecessary.

*Please note:* The `PDFDataRangeTransport`-implementation was added specifically for the *built-in* Firefox PDF Viewer, however it's most likely not commonly used by any third-party (given that it requires manual PDF-data loading).
Furthermore, the default-viewer always provides a *parameter object* when calling the `getDocument`-function and it's thus completely unaffected by these changes.
2023-01-19 14:25:55 +01:00
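A rough sketch of the argument handling that the commit above discusses, assuming that a directly passed transport is wrapped as `{ range: transport }`. `FakeRangeTransport` and `toParameterObject` are hypothetical names used so that the snippet runs without the library; the actual `getDocument` code may differ.

```javascript
// Stand-in for the real `PDFDataRangeTransport` class, so the sketch runs alone.
class FakeRangeTransport {}

// Hypothetical helper: wraps every accepted `src` form in a parameter object.
function toParameterObject(src) {
  if (typeof src === "string" || src instanceof URL) {
    return { url: src };
  }
  if (src instanceof ArrayBuffer || ArrayBuffer.isView(src)) {
    return { data: src };
  }
  if (src instanceof FakeRangeTransport) {
    // The direct form still works, but callers are nudged to the object form.
    console.warn("Deprecated: pass `{ range: transport }` instead.");
    return { range: src };
  }
  if (typeof src === "object" && src !== null) {
    return src; // already a parameter object
  }
  throw new Error("Invalid `src` argument.");
}

const transport = new FakeRangeTransport();
const preferred = toParameterObject({ range: transport }); // no warning
const legacy = toParameterObject(transport); // warns, same result shape
console.log(preferred.range === legacy.range); // true
```

Callers that already pass a parameter object, as the default viewer does, take the last branch and never see the warning.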
Jonas Jenwald
8f3fa18c93
Merge pull request #15920 from Snuffleupagus/transfer-pdf-data
[api-minor] Enable transferring of TypedArray PDF data by default (PR 15908 follow-up)
2023-01-16 13:20:57 +01:00
Tim van der Meij
9e3adb5ec7
Implement unit tests for the numberToString utility function 2023-01-14 15:09:58 +01:00
Tim van der Meij
a6dfcc89fa
Implement unit tests for the recoverJsURL utility function 2023-01-14 15:09:58 +01:00
Jonas Jenwald
397f943ca3 [api-minor] Enable transferring of TypedArray PDF data by default (PR 15908 follow-up)
This patch removes the recently introduced `transferPdfData` API-option, and simply enables transferring of TypedArray data *by default* instead of copying it. This will help reduce main-thread memory usage; however, it will take ownership of the TypedArrays. Currently this only applies to the following cases:
 - TypedArrays passed to the `getDocument`-function in the API, in order to open PDF documents from binary data.
 - TypedArrays passed to a `PDFDataRangeTransport`-instance, used to support custom PDF document fetching/loading (see e.g. the Firefox PDF Viewer).

*PLEASE NOTE:* To avoid being affected by this, please simply *copy* any TypedArray data before passing it to either of the functions/methods mentioned above.

Now that we transfer TypedArray data that we previously only copied, we need to be more careful with input validation. Given how the `{IPDFStreamReader, IPDFStreamRangeReader}.read` methods will always return ArrayBuffer data, which is then transferred to the worker-thread[1], the actual TypedArray data passed to the API thus need to have the same exact size as its underlying ArrayBuffer to prevent issues.
Hence we'll check for this and only allow transferring of *safe* TypedArray data, and fall back to simply copying the data just as before. This obviously shouldn't be an issue in the Firefox PDF Viewer, but for the general PDF.js library we need to be more careful here.

---
[1] See e09ad99973/src/display/api.js (L2492-L2506) and e09ad99973/src/display/api.js (L2578-L2590)
2023-01-14 10:39:36 +01:00
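The size check and the copy-before-passing advice from the commit above can be illustrated as follows. `coversWholeBuffer` is a hypothetical name for the kind of validation described; the actual check inside PDF.js may be written differently.

```javascript
// Hypothetical check: a view is "safe" to transfer only when it spans
// its entire underlying ArrayBuffer.
function coversWholeBuffer(typedArray) {
  return typedArray.byteLength === typedArray.buffer.byteLength;
}

const whole = new Uint8Array([1, 2, 3, 4]);
const partial = whole.subarray(1, 3); // shares the same 4-byte buffer

console.log(coversWholeBuffer(whole)); // true
console.log(coversWholeBuffer(partial)); // false

// Copy before handing data to the API if it is still needed afterwards;
// `slice()` allocates a new buffer, so the original stays usable.
const copyForApi = whole.slice();
console.log(copyForApi.buffer !== whole.buffer); // true
```

Transferring `partial` would move the whole shared buffer, including bytes outside the view, which is why such data falls back to being copied.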
Jonas Jenwald
bbe629018d [api-minor] Add a new transferPdfData option to allow transferring more data to the worker-thread (bug 1809164)
Also, removes the `initialData`-parameter JSDocs for the `getDocument`-function given that this parameter has been completely unused since PR 8982 (over five years ago). Note that the `initialData`-parameter is, and always was, intended to be provided when initializing a `PDFDataRangeTransport`-instance.
2023-01-10 21:03:44 +01:00