*This patch implements https://github.com/mozilla/pdf.js/pull/11777#issuecomment-609741348*
This extends the work from PR 11773 and 11777 further, by immediately releasing the `font.data` property once the font has been attached to the DOM. By not unnecessarily holding onto this data on the main-thread, we'll reduce the memory usage of fonts even further (especially beneficial in longer documents with composite fonts).
The new behaviour is controlled by the recently added `fontExtraProperties` API option (adding a new option just for this patch didn't seem necessary), since there's one edge-case in the SVG renderer where the `font.data` property is necessary (see the `pdf2svg` example).
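As a rough illustration of the idea, here is a minimal sketch. The `attachFont` function and the `attached` flag are hypothetical stand-ins and are not taken from the actual implementation; only `font.data` and `fontExtraProperties` come from the description above.

```javascript
// Hypothetical sketch: everything except `data` and `fontExtraProperties`
// is an illustrative placeholder, not the real pdf.js code.
function attachFont(font, { fontExtraProperties = false } = {}) {
  // Stand-in for the real work of registering the font with the DOM.
  font.attached = true;

  if (!fontExtraProperties) {
    // Once the font is attached, the raw bytes are no longer needed on the
    // main-thread, so drop the reference and let the GC reclaim the memory.
    font.data = null;
  }
  return font;
}

// Default: the data is released right after attaching.
const released = attachFont({ name: "f1", data: new Uint8Array(1024) });

// With `fontExtraProperties` set (e.g. for the SVG edge-case), it is kept.
const kept = attachFont(
  { name: "f2", data: new Uint8Array(1024) },
  { fontExtraProperties: true }
);
```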
Note that while the default viewer does run clean-up with an idle timeout, that timeout will be reset whenever rendering occurs *or* when scrolling happens in the viewer. In practice this means that unless the user leaves the viewer completely untouched for an extended period of time, currently set to 30 seconds, the `PDFDocumentProxy.cleanup` method will never be called and font resources will thus not be cleaned up.
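The reset behaviour of such an idle timeout can be sketched as follows. The `IdleCleanup` class and its method names are hypothetical and only meant to show why continuous activity keeps postponing the clean-up; this is not the viewer's actual code.

```javascript
// Hypothetical sketch of an idle timer that is re-armed on every activity.
class IdleCleanup {
  constructor(cleanup, timeoutMs = 30000) {
    this.cleanup = cleanup;
    this.timeoutMs = timeoutMs;
    this.timer = null;
  }

  // Call this on every render or scroll event. Any pending clean-up is
  // cancelled and scheduled again, so it only ever runs after a full
  // `timeoutMs` without activity.
  notifyActivity() {
    clearTimeout(this.timer);
    this.timer = setTimeout(() => this.cleanup(), this.timeoutMs);
  }

  // Cancel the pending clean-up entirely.
  stop() {
    clearTimeout(this.timer);
    this.timer = null;
  }
}
```

If `notifyActivity` is called more often than once every 30 seconds, the `cleanup` callback never fires, which is the situation described above.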
Please note that these changes were done automatically, using `gulp lint --fix`.
Given that the major version number was increased, there's a fair number of (primarily whitespace) changes; please see https://prettier.io/blog/2020/03/21/2.0.0.html
In order to reduce the size of these changes somewhat, this patch maintains the old "arrowParens" style for now (once mozilla-central updates Prettier we can simply choose the same formatting, assuming it will differ here).
Note that Prettier, purposely, has only limited [configuration options](https://prettier.io/docs/en/options.html). The configuration file is based on [the one in `mozilla-central`](https://searchfox.org/mozilla-central/source/.prettierrc) with just a few additions (to avoid future breakage if the defaults ever change).
Prettier is being used for a couple of reasons:
- To be consistent with `mozilla-central`, where Prettier is already in use across the tree.
- To ensure a *consistent* coding style everywhere, which is automatically enforced during linting (since Prettier is used as an ESLint plugin). This thus ends "all" formatting discussions once and for all, removing the need for review comments on most stylistic matters.
Many ESLint options are now redundant, and I've tried my best to remove all the now unnecessary options (but I may have missed some).
Note also that since Prettier considers the `printWidth` option as a guide, rather than a hard rule, this patch resorts to a small hack in the ESLint config to ensure that *comments* won't become too long.
*Please note:* This patch is generated automatically, by appending the `--fix` argument to the ESLint call used in the `gulp lint` task. It will thus require some additional clean-up, which will be done in a *separate* commit.
(On a more personal note, I'll readily admit that some of the changes Prettier makes are *extremely* ugly. However, in the name of consistency we'll probably have to live with that.)
If, as PR 10368 suggests, more parameters should be added to `getViewport`, I think that it would be a mistake not to change the signature *first*, to avoid needlessly unwieldy call-sites.
To not break any existing code and third-party use-cases, this is obviously implemented with a deprecation warning *and* with a working fallback[1] for the old method signature.
---
[1] This is limited to `GENERIC` builds, which should be sufficient.
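A minimal sketch of what a parameter-object signature with a fallback for the old positional form could look like. The parameter names `scale`, `rotation` and `dontFlip`, the warning text, and the returned object are assumptions for illustration; this is not the actual implementation.

```javascript
// Hypothetical sketch of an options-object signature with a deprecation
// fallback for the old positional call style.
function getViewport({ scale, rotation = 0, dontFlip = false } = {}) {
  // Old style: getViewport(scale, rotation, dontFlip).
  if (arguments.length > 1 || typeof arguments[0] === "number") {
    console.warn("Deprecated: getViewport is called with obsolete arguments.");
    scale = arguments[0];
    rotation = typeof arguments[1] === "number" ? arguments[1] : 0;
    dontFlip = typeof arguments[2] === "boolean" ? arguments[2] : false;
  }
  // A real implementation would build a viewport here; the sketch simply
  // returns the normalized parameters.
  return { scale, rotation, dontFlip };
}

const viaObject = getViewport({ scale: 2 });
const viaLegacy = getViewport(1.5, 90); // Logs a deprecation warning.
```

With this shape, adding further parameters later only extends the object, and existing call-sites stay readable.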
Implement a serialization "generator" for `DOMElement` in domutils.js
that yields the serialization of the SVG element. This method is used by
a newly added `ReadableSVGStream` class, which can be used like any
other readable stream in Node.js.
This reduces the memory requirements, since the full serialization no
longer has to fit in memory at once.
Note: The implementation of the serializer is a state machine in ES5
since the rest of the file is also in ES5. Its functionality is
equivalent to:
```
function* serializeSVGElement(elem) {
  // Opening tag, with namespace declarations on the root element.
  yield '<' + elem.nodeName;
  if (elem.nodeName === 'svg:svg') {
    yield ' xmlns:xlink="http://www.w3.org/1999/xlink"' +
          ' xmlns:svg="http://www.w3.org/2000/svg"';
  }
  for (let i in elem.attributes) {
    yield ' ' + i + '="' + xmlEncode(elem.attributes[i]) + '"';
  }
  yield '>';
  // Text-only elements are emitted directly; others recurse into children.
  if (elem.nodeName === 'svg:tspan' || elem.nodeName === 'svg:style') {
    yield xmlEncode(elem.textContent);
  } else {
    for (let childNode of elem.childNodes) {
      yield* serializeSVGElement(childNode);
    }
  }
  yield '</' + elem.nodeName + '>';
}
```
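To show how a readable stream can pull from such a serializer, here is a simplified sketch. It is written with a class and a generator for brevity (unlike the ES5 state machine mentioned above), the constructor signature is an assumption, and `serializeChunks` is a hypothetical stand-in for the real serializer.

```javascript
const { Readable } = require("stream");

// Hypothetical stand-in for the serializer: yields pre-made string chunks.
function* serializeChunks(chunks) {
  yield* chunks;
}

// Simplified sketch; the real class may take different arguments.
class ReadableSVGStream extends Readable {
  constructor(chunkIterator, options) {
    super(options);
    this._iterator = chunkIterator;
  }

  _read() {
    // Pull chunks lazily: stop when the consumer signals back-pressure
    // (push returns false) or when the serializer is exhausted.
    let result;
    do {
      result = this._iterator.next();
      if (result.done) {
        this.push(null); // Signal end-of-stream.
        return;
      }
    } while (this.push(result.value));
  }
}
```

Such a stream could then be piped into a file, e.g. with `stream.pipe(fs.createWriteStream(path))`, so that only a small part of the serialization is held in memory at any time.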
Do not directly export to global. Instead, export all stubs in domstubs.js and
add a method setStubs to assign all exported stubs to a namespace. Then replace
the import domstubs with an explicit call to this setStubs method. Also added
unsetStubs for undoing the changes. This is done to allow unit testing of the
SVG backend without namespace pollution.
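A minimal sketch of the pattern, assuming a plain object of stubs. The stub contents below are hypothetical placeholders; only the names setStubs and unsetStubs come from the description above.

```javascript
// Hypothetical stubs; the real domstubs.js exports DOM-like objects.
const stubs = {
  btoa: chars => Buffer.from(chars, "binary").toString("base64"),
  document: {
    createElementNS: (ns, name) => ({ nodeName: name, childNodes: [] }),
  },
};

// Assign every exported stub to the given namespace (e.g. `global`).
function setStubs(namespace) {
  for (const key of Object.keys(stubs)) {
    namespace[key] = stubs[key];
  }
}

// Undo setStubs, so that tests do not leak stubs into other tests.
function unsetStubs(namespace) {
  for (const key of Object.keys(stubs)) {
    delete namespace[key];
  }
}
```

A unit test can then call setStubs on a namespace object before the SVG backend runs and unsetStubs afterwards, instead of relying on side effects of an import.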
Wait for the completion of writing the generated SVG file before
processing the next page. This is to enable the garbage collector to
garbage-collect the (potentially large) SVG string before trying to
allocate memory again for the next page.
Note that since the PDF-to-SVG conversion is now sequential instead of
parallel, the time to generate all pages increases.
Test case:
node --max_old_space_size=200 examples/node/pdf2svg.js /tmp/FatalProcessOutOfMemory.pdf
Before this patch:
- Node.js crashes due to OOM after processing 20 pages.
After this patch:
- Node.js is able to convert all 203 pages to SVG without crashing.
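The sequential processing described above can be sketched as follows. The `renderPage` and `writePage` callbacks are hypothetical placeholders for rendering a page to SVG and for writing the result to disk; they are not functions from the example script.

```javascript
// Hypothetical sketch: process one page at a time, and only start the next
// page once the previous page has been completely written.
async function convertSequentially(pageCount, renderPage, writePage) {
  for (let pageNum = 1; pageNum <= pageCount; pageNum++) {
    const svg = await renderPage(pageNum);
    // Waiting here means the SVG data of this page becomes unreachable,
    // and thus collectable, before memory is allocated for the next page.
    await writePage(pageNum, svg);
  }
}
```

In practice `writePage` would return a promise that resolves once the underlying write stream has finished, which is what makes the conversion slower but bounded in memory.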
It doesn't really make sense to attempt to use the `NativeImageDecoder` in Node.js, since there's no native image support available there. Building on PR 8035, we can easily disable it in the example.
Fixes 7901.