pdf.js

Author SHA1 Message Date

Author	SHA1	Message	Date
Jonas Jenwald	628ca737dd	Make it possible to clear the cache, used by the `getB` function in `src/core/pattern.js` While this cache will not contain a huge amount of data in practice, it's nonetheless a global cache that currently will never be cleared. This patch also removes the existing closure, since it shouldn't really be necessary nowadays given that the code is a JavaScript module which means that only explicitly listed properties will be exported.	2023-09-15 12:23:06 +02:00
Jonas Jenwald	8836593b9e	Add a (global) cache to the `getCharUnicodeCategory` function Given that the regular expression has already become more complex (after the initial patch adding it), it seems to me that it probably cannot hurt to add a global cache to reduce unnecessary re-parsing. Obviously the `Glyph`-instances are being cached per font, however in most documents multiple fonts are being used and in practice there's very often a fair amount of overlap between the /ToUnicode-data in different fonts[1]. Consider for example loading and rendering the entire `tracemonkey.pdf` document (from the test-suite), which isn't a particularily large document. In that case the `getCharUnicodeCategory` function is being called a total of `601` times, however there's only `106` unique unicode-chars being checked. Please note: In practice I suppose that this won't have a huge effect on overall performance, however given the relative simplicity of this patch I figured that it'd not hurt to submit it for review. --- [1] Consider e.g. how there's usually different fonts used for regular, bold, respectively italic text.	2022-01-25 09:59:34 +01:00

Jonas Jenwald

628ca737dd

Make it possible to clear the cache, used by the getB function in src/core/pattern.js

While this cache will not contain a huge amount of data in practice, it's nonetheless a *global* cache that currently will never be cleared.

This patch also removes the existing closure, since it shouldn't really be necessary nowadays given that the code is a JavaScript module which means that only explicitly listed properties will be exported.

2023-09-15 12:23:06 +02:00

Jonas Jenwald

8836593b9e

Add a (global) cache to the getCharUnicodeCategory function

Given that the regular expression has already become more complex (after the initial patch adding it), it seems to me that it probably cannot hurt to add a global cache to reduce unnecessary re-parsing.
Obviously the `Glyph`-instances are being cached *per* font, however in most documents multiple fonts are being used and in practice there's very often a fair amount of overlap between the /ToUnicode-data in different fonts[1].

Consider for example loading and rendering the entire `tracemonkey.pdf` document (from the test-suite), which isn't a particularily large document. In that case the `getCharUnicodeCategory` function is being called a total of `601` times, however there's only `106` *unique* unicode-chars being checked.

*Please note:* In practice I suppose that this won't have a *huge* effect on overall performance, however given the relative simplicity of this patch I figured that it'd not hurt to submit it for review.

---
[1] Consider e.g. how there's usually different fonts used for regular, bold, respectively italic text.

2022-01-25 09:59:34 +01:00

2 Commits