We must force-fetch standard font data, when `disableFontFace = true` is set in the API, since otherwise rendering in e.g. the viewer is still broken (same as before PR 12726 landed).
*Please note:* We still need to also load standard font data for patterns and/or some text-rendering modes, however that will require larger changes so I figured that it cannot hurt to submit *this* patch right now.
*This implementation is basically a copy of the pre-existing `builtInCMapCache` implementation.*
For some, badly generated, PDF documents it's possible that we'll end up having to fetch the *same* standard font data over and over (which is obviously inefficient).
While not common, it's certainly possible that a PDF document uses *custom* font names where the actual font then references one of the standard fonts; see e.g. issue 11399 for one such example.
Note that I did suggest adding worker-thread caching of standard font data in PR 12726, however it wasn't deemed necessary at the time. Now that we have a real-world example that benefit from caching, I think that we should simply implement this now.
- Some js files contain scale factors for each glyph in order to rescale Liberation to have a final font with the correct width.
- A lot of XFA have some containers where their dimensions are based on their text content, so using default font from browser can lead to an almost unreadable pdf.
- a lot of xfa files are using Myriad pro or Arial fonts without embedding them and some containers have some dimensions based on those font metrics. So not having the exact same font leads to a wrong display.
- since it's pretty hard to find a replacement font with the exact same metrics, this patch gives the possibility to read glyf table, rescale each glyph and then write a new table.
- so once PR #12726 is merged we could rescale for example Helvetica to replace Myriad Pro.
This patch uses the new option added in PR 12726 to *also* allow fetching binary CMap data directly in the worker-thread in browsers.
Given that these changes remove the need to transfer data between threads for the default (browser) use-case, we can also revert the changes in PR 11118 since that simplifies the overall implementation.
- some elements weren't displayed because their rotation angle was not taken into account;
- fix box model (XFA concept):
- remove use of outline;
- position correctly border which isn't part of box dimensions;
- fix margins issues (see issue #13474).
- move border on button instead of having it on wrapping div;
This is necessary now, since with the previous patch the /FontBBox potentially depends on the contents of the /CharProcs-streams.
Note that if `getOperatorList` is called *before* `getTextContent`, this patch doesn't matter since the font is already fully loaded/parsed. However, for e.g. the `text` test-cases this is necessary to ensure correct reference images.
For Type3 fonts where the /CharProcs-streams of the individual glyph starts with a `d1` operator, we can use that to build a fallback bounding box for the font and thus improve text-selection in some cases.
While these objects aren't exactly that big and/or complex, they are nonetheless *only* necessary for XFA documents.
However, currently these objects are initialized *eagerly* for all PDF documents. By using the same pattern as elsewhere in the code-base, it's very easy to make these lazily initialized; so let's just do that instead :-)
- attribute 'use' was already implemented but not usehref
- in general, usehref should make reference to current document
- add support for SOM expressions in use and usehref to search a node.
- get prototype for all nodes if any.
For HighlightAnnotations with a built-in appearance stream, we still rely on it to specify the opacity correctly via a suitable blend mode. However, if the Annotation-drawing operators are placed *within* a /XObject of the /Form-type, the /ExtGState won't apply to the final rendering and the result is that the highlighting obscures the underlying text.
The more *correct* and general solution would likely be to somehow modify the implementation in `src/display/canvas.js`, to special-case handling of /Form-type /XObjects when rendering Annotations. Since we can very easily work-around this problem for now by using the "no appearance stream" code-path, doing *something* here ought to be preferable.
This patch is (obviously) merely a work-around, but given that the referenced issue is (as far as I know) the first case we've seen of this problem a simple solution will hopefully suffice for now.
This fixes the colours, by respecting the strokeAlpha/fillAlpha-values, for a couple of Annotations in the PDF document from issue 13447.[1]
---
[1] Some of the annotations still won't render at all, when compared with Adobe Reader, but that could/should probably be handled separately.
- for xfa rendering, fonts are loaded and used in html;
- when printed and saved in pdf, on linux, Firefox uses cairo backend
- when subsetting a font, cairo uses the font postscript name and when this one is empty that leads to a bug
(the append at 63f0d62684/src/cairo-cff-subset.c (L2049) is failing because of null length)
- so this patch adds a postscript name to the font to make cairo happy.
Currently `charsCache` is initialized *lazily*, which considering that it just contains a simple `Object` doesn't seem entirely necessary. This first of all forces us to do repeated exists-checks in the `Font.charsToGlyphs` method, and secondly the similar/related `glyphCache` is already initialized eagerly.
Furthermore, this patch also does a bit of clean-up in the `Font.charsToGlyphs` method since this code is quite old.
- the only goal of this patch is to be able to get synchronously the fake html when printing from firefox:
- in order to print we need to inject some html in beforeprint callback but we cannot block in waiting for all the pages.
- from a memory point of view: it doesn't change anything since the fake HTML is deleted in the worker;
- this way we don't break any assumptions.
- I thought it was possible to rely on browser layout engine to handle layout stuff but it isn't possible
- mainly because when a contentArea overflows, we must continue to layout in the next contentArea
- when no more contentArea is available then we must go to the next page...
- we must handle breakBefore and breakAfter which allows to "break" the layout to go to the next container
- Sometimes some containers don't provide their dimensions so we must compute them in order to know where to put
them in their parents but to compute those dimensions we need to layout the container itself...
- See top of file layout.js for more explanations about layout.
- fix few bugs in other places I met during my work on layout.
Given that `URL`s aren't supported by the structured clone algorithm, see https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm, the document in issue 1773 will cause the browser to throw `DataCloneError: The object could not be cloned.`-errors and nothing will render.
To fix this, we'll instead simply send the stringified version of the `URL` to prevent these errors from occuring.