The current test fails intermittently only on Windows for unknown
reasons: the code is correct and on Linux it always passes. However, we
have already spent quite a lot of time on this test, so rather than
spending even more time on it I figured we should look at what behavior
the test is trying to check and find an alternative way to do it that
can't trigger this intermittent issue anymore.
This commit changes the test to use a term that only exists once in the
entire document so we cannot accidentally highlight another match
anymore. This doesn't change anything about the behavior that this test
aims to check: we still test searching in the XFA layer, we still test
that the original term is matched case-insensitively and we still test
that that match is actually highlighted. Note that the only objective of
the test is confirming that the search functionality covers the XFA
layer, so the exact phrase/match is not the interesting bit.
Given that we only use standard `import`/`export` statements now, after recent PRs, the "exports" global is unused.
Instead we add "__non_webpack_import__" to the `globals` to avoid having to sprinkle disable statements throughout the code.
Finally, the way that `globals` are defined has changed in ESLint and we should thus explicitly specify them as "readonly"; please find additional details at https://eslint.org/docs/latest/use/configure/language-options#specifying-globals
The previous change that set the timeout had effect because we have seen
quite a few protocol timeouts now correctly being raised in the context
of the active test, however we have also still seen a handful of cases
where this wasn't the case and the one second difference turned out to
be too low (likely because the operation was started slightly after one
second into the test run). We therefore tweak the value to be 75% of the
Jasmine timeout. This should be enough to catch operations that happen
later on in the test run, and if a single operation takes that long any
hope for success is already gone anyway.
It's not necessary because we have configured silent printing for
Firefox and Chrome in the browser arguments we pass in `test.mjs`. This
means that the print dialog is not even shown at all or disappears
automatically once printing is done, so the Escape key press serves no
purpose. Since it has been shown to time out, likely because the page
loses focus during printing, and because the page itself doesn't know
when the printing dialog is shown and we therefore can't possibly do the
key press at the right time anyway, this commit gets rid of it to
stabilize the test.
Those files only contain old debugging code that is not used/imported
anywhere anymore, which is generating code scanning alerts. Moreover,
they rely on globals/platform-specific code and don't import/export
logic properly.
Given the amount of work put into removing `require`-calls from the code-base, let's ensure that new ones aren't accidentally added in the future.
Note that we still have a couple of files where `require` is being used, in particular:
- The Node.js examples, however those will be updated to use `import` in PR 17081.
- The Webpack examples, and related support files, however I unfortunately don't know enough about Webpack to be able to update those. (Hopefully users of that code will help out here, once version `4` is released.)
- The `statcmp`-tool, since *some* of those `require`-calls cannot be converted to `import` without other code changes (and that file is only used during benchmarking).
Please find additional details at https://github.com/import-js/eslint-plugin-import/blob/main/docs/rules/no-commonjs.md
This *finally* allows us to mark the entire PDF.js library as a "module", which should thus conclude the (multi-year) effort to re-factor and improve how we import files/resources in the code-base.
This also means that the `gulp ci-test` target, which is what's run in GitHub Actions, now uses JavaScript modules since that's supported in modern Node.js versions.
For large/complex images it's possible that the image-data arrives in the API *after* the page has been scrolled out-of-view and thus been cleaned-up. In this case we obviously shouldn't cache such page-level data, since it'll first of all be unused and secondly can increase memory usage *a lot*.
Also, ensure that we *immediately* release any `ImageBitmap` data in this case to help reclaim memory faster.
The examples themselves were updated to account for JavaScript modules, which didn't require changing the actual URLs.
However, since it seems that JSFiddle doesn't support JavaScript modules in its separate "JavaScript" editing-area we need to change how we embed the examples to avoid showing a blank "JavaScript"-tab.
When pdfBug is true, the substitution font is used in the text layer in order
to be able to know what is the font really used thanks to the devtools.
And to be sure that fonts are loaded, the font cache isn't cleaned up when
the debugger is active.
The default protocol timeout is 180 seconds according to the
documentation at https://pptr.dev/api/puppeteer.browserconnectoptions,
but the Jasmine timeout we configure in the individual boot files is 30
seconds. The consequence of this is that if a protocol (CDP) error
occurs after 30 seconds Jasmine will fail the test, but the actual
protocol error from Puppeteer is raised much later in the context of
another test, which causes unrelated failures or tracebacks.
This commit fixes the problem by configuring Puppeteer to always use a
lower protocol timeout than the Jasmine timeout so that protocol errors
are always raised in the context of the test that actually triggered it.
It's been loaded as a JavaScript module for a long time, and given that the file is bundled as-is (without building) it seems reasonable to just change the file extension now.
The Windows bot is usually slower than the Linux bot, and therefore
text layer rendering is as well. However, the `autoprint` test awaited
text layer rendering to complete before activating the selector check,
which makes it timing-sensitive and causes it to never resolve because
the page is already printed (and the printed page div removed) by then.
This commit should fix the issue by activating the selector check as
soon as possible, namely as soon as the viewer appears, which should
ensure we're always registering the selector check in time because we're
doing it even before rendering is starting.
Comparing the currently supported browsers/environments, see [the FAQ](https://github.com/mozilla/pdf.js/wiki/Frequently-Asked-Questions#faq-support) and the [MDN compatibility data](https://developer.mozilla.org/en-US/docs/Web/API/structuredClone#browser_compatibility), the `structuredClone` polyfill is *only* needed in Google Chrome versions < 98. Because of some limitations in the core-js polyfill we're currently forced to special-case the `transfer` handling to prevent bugs, and it'd be nice to avoid that.
Note that `structuredClone`, with transfers, is only used in two spots:
- The `LoopbackPort` class, which is only used with fake workers. Given that fake workers should *never* be used in browsers, breaking that edge-case in older Google Chrome versions seem fine.
- The `AnnotationStorage` class, when Stamp-annotations have been added to the document. Given that Google Chrome isn't the main focus of development, breaking *part* of the editing-functionality in older Google Chrome versions should hopefully be acceptable.
To avoid problems with `export` statements in the QuickJS Javascript Engine, we can work-around that by *explicitly* exposing `pdfjsScripting` globally instead.
At this point in time all browsers, and also Node.js, support standard `import`/`export` statements and we can now finally consider outputting modern JavaScript modules in the builds.[1]
In order for this to work we can *only* use proper `import`/`export` statements throughout the main code-base, and (as expected) our Node.js support made this much more complicated since both the official builds and the GitHub Actions-based tests must keep working.[2]
One remaining issue is that the `pdf.scripting.js` file cannot be built as a JavaScript module, since doing so breaks PDF scripting.
Note that my initial goal was to try and split these changes into a couple of commits, however that unfortunately didn't really work since it turned out to be difficult for smaller patches to work correctly and pass (all) tests that way.[3]
This is a classic case of every change requiring a couple of other changes, with each of those changes requiring further changes in turn and the size/scope quickly increasing as a result.
One possible "issue" with these changes is that we'll now only output JavaScript modules in the builds, which could perhaps be a problem with older tools. However it unfortunately seems far too complicated/time-consuming for us to attempt to support both the old and modern module formats, hence the alternative would be to do "nothing" here and just keep our "old" builds.[4]
---
[1] The final blocker was module support in workers in Firefox, which was implemented in Firefox 114; please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/import#browser_compatibility
[2] It's probably possible to further improve/simplify especially the Node.js-specific code, but it does appear to work as-is.
[3] Having partially "broken" patches, that fail tests, as part of the commit history is *really not* a good idea in general.
[4] Outputting JavaScript modules was first requested almost five years ago, see issue 10317, and nowadays there *should* be much better support for JavaScript modules in various tools.
The user should *always* provide a correct `GlobalWorkerOptions.workerSrc` value when using the PDF.js library in browser environments. Note that the fallback:
- Has been deprecated ever since PR 11418, first released in version `2.4.456` over three years ago.
- Was always a best-effort solution, with no guarantees that it'd actually work correctly.
- With upcoming changes, w.r.t. outputting JavaScript modules, it'd now be more diffiult to determine the correct value.
The minified default viewer has never been distributed in either official releases or through pdfjs-dist, which means that it's most likely unused, and it's has never been tested nor actively maintained.