This is consistent with the implementation used in the (now removed) webL10n-library, and by only using lowercase language-codes internally in the `L10n`-implementations we should avoid future issues e.g. when users manually set the `locale`-option (in the default viewer).
This commit migrates the font tests away from the bots. Not only are the
font tests broken on the Windows bot since some time, they also run on
Python 2 (end of life since January 2020) and `ttx` 3.19.0 (released in
November 2017). The latter is installed via a submodule, which requires
more complicated logic for finding and running `ttx`.
We solve the issues by implementing a modern workflow that installs the
most recent stable Python and `ttx` (`fonttools` package) versions. This
simplifies the `ttx` driver code as well because it can now assume `ttx`
is available on the path (just like we do for e.g. `node` invocations).
GitHub Actions takes care of creating a virtual environment with
`fonttools` in it so that the `ttx` entrypoint is available. Locally
the font tests can be run in a similar way by creating and sourcing a
virtual environment with `fonttools` in it before running the font
tests, and a README file is included with instructions for doing so.
This commit prepares for running the font tests on GitHub Actions where
we can't spin up headful browsers because there are no display
capabilities on the workers. This will also be useful for porting other
test targets to GitHub Actions at a later time, as well as running the
tests locally in headless mode.
After recent PRs the size and scope of the CI workflow is now reduced, and this patch tries to simplify things further. More specifically we can directly specify the gulp-tasks in the workflow, and thus clean-up the `gulpfile` a tiny bit.
Note that this will technically be slower, since the tests are now run in series (rather than in parallel), however `gulp externaltest` runs so quickly that it really won't matter in practice.
This should *hopefully* fix 17228, by tweaking the build scripts to give the GENERIC viewer something to await to avoid breaking third-party users of the standalone viewer components.
Currently we *synchronously* fetch a number of browser preferences/options, from the platform code, during the viewer respectively PDF document initialization paths.
This seems unnecessary, and we can re-factor the code to instead include the relevant data when fetching the regular viewer preferences.
*Please note:* While following the steps in the README still works with this patch, in the sense that the example runs and successfully renders a PDF document, I unfortunately cannot tell if it illustrates Webpack best practices.
*Please note:* These changes only affect the GENERIC build, since `NullL10n` is only a stub elsewhere (see PR 17135).
After the changes in PR 17115, which modernized and improved l10n-handling, the `NullL10n`-implementation is no longer a good fallback for the "proper" `L10n`-classes.
To improve this situation, especially for the *standalone* viewer-components, this patch makes the following changes:
- Let the `NullL10n`-implementation extend an actual `L10n`-class, which is constant and lazily initialized, to ensure that it works *exactly* like the "proper" ones.
- Automatically bundle the "en-US" l10n-strings in the build, via the pre-processor, such that we don't need to remember to manually update them.
- Ensure that the *standalone* viewer-components register their DOM-elements for translation, similar to the default viewer, since this will allow future code improvements by using "data-l10n-id"/"data-l10n-args" in most (if not all) parts of the viewer.
- Remove the `NullL10n` from the `AnnotationLayer`, to avoid affecting bundle size too much.
For third-party users that access the `AnnotationLayer`, as exposed in the main PDF.js library, they'll now need to *manually* register it for translation. (However, the *standalone* viewer-components still works given the point above.)
- For the generic viewer we use @fluent/dom and @fluent/bundle
- For the builtin pdf viewer in Firefox, we set a localization url
and then we rely on document.l10n which is a DOMLocalization object.
Those files only contain old debugging code that is not used/imported
anywhere anymore, which is generating code scanning alerts. Moreover,
they rely on globals/platform-specific code and don't import/export
logic properly.
This *finally* allows us to mark the entire PDF.js library as a "module", which should thus conclude the (multi-year) effort to re-factor and improve how we import files/resources in the code-base.
This also means that the `gulp ci-test` target, which is what's run in GitHub Actions, now uses JavaScript modules since that's supported in modern Node.js versions.
It's been loaded as a JavaScript module for a long time, and given that the file is bundled as-is (without building) it seems reasonable to just change the file extension now.
Comparing the currently supported browsers/environments, see [the FAQ](https://github.com/mozilla/pdf.js/wiki/Frequently-Asked-Questions#faq-support) and the [MDN compatibility data](https://developer.mozilla.org/en-US/docs/Web/API/structuredClone#browser_compatibility), the `structuredClone` polyfill is *only* needed in Google Chrome versions < 98. Because of some limitations in the core-js polyfill we're currently forced to special-case the `transfer` handling to prevent bugs, and it'd be nice to avoid that.
Note that `structuredClone`, with transfers, is only used in two spots:
- The `LoopbackPort` class, which is only used with fake workers. Given that fake workers should *never* be used in browsers, breaking that edge-case in older Google Chrome versions seem fine.
- The `AnnotationStorage` class, when Stamp-annotations have been added to the document. Given that Google Chrome isn't the main focus of development, breaking *part* of the editing-functionality in older Google Chrome versions should hopefully be acceptable.
To avoid problems with `export` statements in the QuickJS Javascript Engine, we can work-around that by *explicitly* exposing `pdfjsScripting` globally instead.
At this point in time all browsers, and also Node.js, support standard `import`/`export` statements and we can now finally consider outputting modern JavaScript modules in the builds.[1]
In order for this to work we can *only* use proper `import`/`export` statements throughout the main code-base, and (as expected) our Node.js support made this much more complicated since both the official builds and the GitHub Actions-based tests must keep working.[2]
One remaining issue is that the `pdf.scripting.js` file cannot be built as a JavaScript module, since doing so breaks PDF scripting.
Note that my initial goal was to try and split these changes into a couple of commits, however that unfortunately didn't really work since it turned out to be difficult for smaller patches to work correctly and pass (all) tests that way.[3]
This is a classic case of every change requiring a couple of other changes, with each of those changes requiring further changes in turn and the size/scope quickly increasing as a result.
One possible "issue" with these changes is that we'll now only output JavaScript modules in the builds, which could perhaps be a problem with older tools. However it unfortunately seems far too complicated/time-consuming for us to attempt to support both the old and modern module formats, hence the alternative would be to do "nothing" here and just keep our "old" builds.[4]
---
[1] The final blocker was module support in workers in Firefox, which was implemented in Firefox 114; please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/import#browser_compatibility
[2] It's probably possible to further improve/simplify especially the Node.js-specific code, but it does appear to work as-is.
[3] Having partially "broken" patches, that fail tests, as part of the commit history is *really not* a good idea in general.
[4] Outputting JavaScript modules was first requested almost five years ago, see issue 10317, and nowadays there *should* be much better support for JavaScript modules in various tools.
The minified default viewer has never been distributed in either official releases or through pdfjs-dist, which means that it's most likely unused, and it's has never been tested nor actively maintained.
This has been deprecated since version `2.15.349`, which is a year ago.
Removing this will also simplify some upcoming changes, specifically outputting of JavaScript modules in the builds.
The old pre-processor used for CSS, and HTML, files leaves comments intact which unnecessarily contributes to the overall size of the *built* CSS files (note that the built JavaScript files don't include comments).
Rather than trying to "hack" comment removal into the pre-processor it seems easier to use a PostCSS plugin instead. The one potential issue is that it also affects *some* whitespaces, and it's not clear to me if this'll work with the various CSS-related tests that run in mozilla-central.
Please refer to https://www.npmjs.com/package/postcss-discard-comments for additional information.
Given the limitations of the old pre-processor that's used for CSS/HTML files, this unfortunately isn't as "easy" to implement as it is for JavaScript code.
Since this is the first case where we've wanted to do conditional CSS imports, rather than trying to completely re-write the pre-processor, this patch settles for handling it explicitly in the `expandCssImports` function.
The typescript compiler is now configured to know about the import map
to be able to resolve those imports and find the associated types.
As tsc outputs declaration files using the original module identifiers
and not the resolved ones, tsc-alias is used to post-process the
declaration files by resolving those paths.
Some configurations settings like `paths` cannot be provided through CLI
arguments but only in a configuration file. And when using a
configuration file, only a few options (like `--outDir` can still be
provided) through the CLI.
- The `src/core/unicode.js` exclude ought to have become unnecessary already with PR 16200, which significantly shortened and simplified that file.
- The `src/core/glyphlist.js` exclude no longer seems necessary in practice either, possibly because of improvements in Babel.
There are 2 rotation we've to deal with: the viewer one and the editor one.
The previous implementation was a bit complex and having to deal with these
rotation would have potentially increase it.
So this patch aims to simplify the implementation and deal with all the possible
cases.
The main idea is to transform the mouse deltas according to the rotations and then
apply the resizing in the page coordinates system.
We obviously don't want to re-introduce any `require` usage in e.g. the viewer, since we should strive to only use native `import` statements wherever possible.[1]
Hopefully exposing e.g. the library globally in more cases won't break anything, however it's somewhat difficult for me to imagine all the ways in which third-party users may be accessing the PDF.js library. (Given the lack of a runnable test-case in the issue, I also cannot guarantee that this is enough to fully address the problem.)
---
[1] Ideally we should probably not rely on e.g. `pdfjsLib` being globally available in the *built* viewer, and rather always `import` the library instead.
Unfortunately this would require larger (possibly breaking) changes in the builds that we provide, however note that Firefox only recently got support for `import` in workers and that Webpack still only have *experimental* support for outputting "proper" modules.