Ignore globally cached images in PartialEvaluator.getTextContent (PR 11930 follow-up)

Given that we only cache `/XObject`s of the `Image` type globally, we can take advantage of that in `PartialEvaluator.getTextContent` as well. In cases such as issue 12098, this avoids fetching/parsing `/XObject`s that we already know to be `Image`s. This helps because `Stream`s are not cached on the `XRef` instance (given their potential size), so the lookup can be somewhat expensive in general.
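
The short-circuit described above can be sketched roughly as follows. This is a minimal illustration, not the pdf.js code: `toyXRef`, `toyGlobalImageCache`, `resolveForTextContent`, and the string references are hypothetical stand-ins, and a plain `Map` replaces the real cache.

```javascript
// Toy stand-ins (hypothetical, not the pdf.js classes): references are plain
// strings, and `fetch` only counts how often it is called.
const toyXRef = {
  fetchCount: 0,
  fetch(ref) {
    this.fetchCount++;
    return { ref, subtype: "Image" };
  },
};

// Assume "12R" was already cached globally, which implies it is an image.
const toyGlobalImageCache = new Map([["12R", { width: 1, height: 1 }]]);

// Returns `null` when the XObject can be ignored for text extraction,
// otherwise the fetched object.
function resolveForTextContent(ref) {
  if (toyGlobalImageCache.has(ref)) {
    return null; // Known image: it contains no text, so skip the fetch.
  }
  return toyXRef.fetch(ref);
}

const skipped = resolveForTextContent("12R");
const fetched = resolveForTextContent("34R");
console.log(skipped, fetched.ref, toyXRef.fetchCount);
```

Only the reference that is missing from the global cache reaches `fetch`, which is the saving the commit is after.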

Also, skip a redundant `RefSetCache.has` check in the `GlobalImageCache.getData` method.
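
To illustrate why the `has` check is redundant, here is a small sketch with a hypothetical `CountingCache` (a `Map` wrapper that counts lookups; it is not `RefSetCache`). It assumes that cached values are always truthy objects, as image data would be; with falsy values the two variants would differ.

```javascript
// Hypothetical cache that counts lookups; a plain Map stands in for the
// real cache class.
class CountingCache {
  constructor() {
    this._map = new Map();
    this.lookups = 0;
  }
  has(key) {
    this.lookups++;
    return this._map.has(key);
  }
  get(key) {
    this.lookups++;
    return this._map.get(key);
  }
  put(key, value) {
    this._map.set(key, value);
  }
}

// Variant with the extra check: two lookups on a cache hit.
function getWithHas(cache, key) {
  if (!cache.has(key)) {
    return null;
  }
  return cache.get(key);
}

// Variant without it: a single lookup, same result for truthy values.
function getDirect(cache, key) {
  const value = cache.get(key);
  if (!value) {
    return null;
  }
  return value;
}

const cacheA = new CountingCache();
const cacheB = new CountingCache();
const imageData = { width: 10, height: 10 };
cacheA.put("12R", imageData);
cacheB.put("12R", imageData);

const viaHas = getWithHas(cacheA, "12R");
const viaGet = getDirect(cacheB, "12R");
console.log(viaHas === viaGet, cacheA.lookups, cacheB.lookups);
```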
Jonas Jenwald 2021-01-27 16:56:17 +01:00
parent d52e5b0505
commit 72da2aa166
2 changed files with 12 additions and 2 deletions

@@ -2515,6 +2515,15 @@ class PartialEvaluator {
       return;
     }
+    const globalImage = self.globalImageCache.getData(
+      xobj,
+      self.pageIndex
+    );
+    if (globalImage) {
+      resolveXObject();
+      return;
+    }
     xobj = xref.fetch(xobj);
   }

@@ -247,13 +247,14 @@ class GlobalImageCache {
     if (pageIndexSet.size < GlobalImageCache.NUM_PAGES_THRESHOLD) {
       return null;
     }
-    if (!this._imageCache.has(ref)) {
+    const imageData = this._imageCache.get(ref);
+    if (!imageData) {
       return null;
     }
     // Ensure that we keep track of all pages containing the image reference.
     pageIndexSet.add(pageIndex);
-    return this._imageCache.get(ref);
+    return imageData;
   }
   setData(ref, data) {