I haven't got an example where the current code breaks, but given all the previous cases we've seen where PDF generators use indirect objects in Arrays it makes sense to fix this pro-actively.
I've modified the relevant unit-tests slightly, and they would *not* pass without the code changes in this patch.
*Note:* `Dict_getArray` only dereferences Array elements on the "top-level", to avoid recursion issues. Furthermore if you have to loop through the Array at the call-site anyway, then using `Dict_get` in combination with `XRef_fetchIfRef` is a tiny bit more efficient.
The original code is difficult to read and, more importantly, performs
actions that are not described in the specification. It replaces empty
names with a backtick and an index, but this behavior is not described
in the specification. While the specification is not entirely clear
about what should happen in this case, it does specify that the `T`
field is optional and that multiple field dictionaries may have the same
fully qualified name, so to achieve this it makes the most sense to
ignore missing `T` fields during construction of the field name. This is
the most specification-compliant solution and, judging by opened issue #6623, also the required and expected behavior.
Note that in `FIREFOX/MOZCENTRAL/CHROME` builds of the standard viewer the `docBaseUrl` parameter will be set by default, since in that case it makes sense to use the current URL as a base.
For the `GENERIC` viewer, or the API itself, it doesn't make sense to try and set the `docBaseUrl` by default. However, custom deployments/implementations may still find the parameter useful.
This not only reduces code duplication, but it also allow us to easily support the same kind of URLs we currently do for Link annotations in the Outline as well.
Directly use the hexadecimal representation, just like the
`AnnotationFlags`, to avoid calculations and to improve readability.
This allows us to simplify the unit tests for text widget annotations as
well.
When debugging issue 7643, I noticed that the `forms` tests currently doesn't look like the rendering in the viewer (with `renderInteractiveForms = true` set).
After scratching my head for a little while, I realized that PR 7633 make the implicit assumption that `Page_getOperatorList` (in `core/document.js`) is called *before* fetching the annotation with `PDFPageProxy_getAnnotations` (in `display/api.js`).
Hence this patch, that changes it so that we instead pass in the `renderInteractiveForms` parameter to `Annotation_appendToOperatorList` to ensure that it's always correctly set.
If interactive forms are enabled, then the display layer takes care of
rendering the form elements. There is no need to draw them on the canvas
as well. This also leads to issues when values are prefilled, because
the text fields are transparent, so the contents that have been rendered
onto the canvas will be visible too.
We address this issue by passing the `renderInteractiveForms` parameter
to the render task and handling it when the page is rendered (i.e., when
the canvas is rendered).
This patch improves the unit tests by testing the support for read-only
and multiline fields. Moreover, we add a reference test to ensure that
the text widgets are not only rendered, but also that their contents are
styled properly.
Finally, we perform minor improvements in `src/core/annotation.js`, for
example adding missing comments.
Currently, we only support text widget annotations (field type 'Tx')
partially. However, the current code does not make this entirely clear
and does not provide a warning when an unsupported field type is
encountered, making it harder to determine why rendering fails.
Moreover, in the display layer we make no distinction between the
various types of widget annotations, causing the code for text widget
annotations to also be executed for other types of widget annotations in
a fallback situation.
This patch improves the structure of the widget annotation code. In the
core layer, we use the same structure we use for non-widget annotations
in the factory and provide a clear warning when an unsupported type is
encountered. In the display layer, we do the same and split the
`WidgetAnnotationElement` class into two classes, namely
`TextWidgetAnnotationElement` for text widget annotations and
`WidgetAnnotationElement` for other unsupported annotations as a
fallback. From this it clear that we only support text widget
annotations and nothing else.
Note that I used a separate warning message for this case, instead of utilizing the same one as in the unsupported subtype case, to more clearly indicate that the PDF file itself is to blame rather than PDF.js.
Fixes 7446.
Fixes http://www.pdf-archive.com/2013/09/30/file2/file2.pdf.
Note how it's not possible to show the various Popup Annotations in the above document.
To fix that, this patch lets the Popup inherit the flags of the parent, in the special case where the parent is `viewable` *and* the Popup is not.
In general, I don't think that a Popup must have the same flags set as the parent. However, it seems very strange to have a `viewable` parent annotation, and then not being able to view the Popup.
Annoyingly the PDF specification doesn't, as far as I can find, mention anything about how this case should be handled, but this patch seem consistent with the actual behaviour in Adobe Reader.
Currently for explicit destinations, compared to named destinations, we manually try to build a hash that often times is a quite poor representation of the *actual* destination. (Currently this only, kind of, works for `\XYZ` destinations.)
For PDF files using explicit destinations, this can make it difficult/impossible to obtain a link to a specific section of the document through the URL.
Note that in practice most PDF files, especially newer ones, use named destinations and these are thus unnaffected by this patch.
This patch also fixes an existing issue in `PDFLinkService_getDestinationHash`, where a named destination consisting of only a number would not be handled correctly.
With the added, and already existing, type checks in place for destinations, I really don't think that this patch exposes any "sensitive" internal destination code not already accessible through normal hash parameters.
*Please note:* Just trying to improve the algorithm that generates the hash is unfortunately not possible in general, since there are a number of cases where it will simply never work well.
- First of all, note that `getDestinationHash` currently relies on the `_pagesRefCache`, hence it's possible that the hash returned is empty during e.g. ranged/streamed loading of a PDF file.
- Second of all, the currently computed hash is actually dependent on the document rotation. With named destinations, the fetched internal destination array is rotational invariant (as it should be), but this will not hold in general for the hash. We can easily avoid this issue by using a stringified destination array.
- Third of all, note that according to the PDF specification[1], `GoToR` destinations may actually contain explicit destination arrays. Since we cannot really construct a hash in `annotation.js`, we currently have no good way to support those. Even though this case seems *very* rare in practice (I've not actually seen such a PDF file), it's in the specification, and this patch allows us to support that for "free".
---
[1] http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G11.1951685
As evident from e.g. PRs 6485 and 7118, some bad PDF generators unfortunately create Arrays where *some* elements are indirect objects (i.e. `Ref`s). This seems to mostly affect Arrays that contain numbers, such as e.g. `Matrix/FontMatrix/BBox/FontBBox/Rect/Color/...`, and has manifested itself in PDF files that fail to render correctly (some elements are missing).
The problem in both the cases above, besides broken rendering, was that there were *no* errors/warnings that indicated what the problem was, making it difficult to pinpoint the issue.
Hence this patch, where I've audited all usages of `Dict_get` in `src/core/` files, and replaced it with `Dict_getArray` where appropriate to try and prevent unnecessary future bugs.
- Add support for the 'NewWindow' property.
- Ensure that destinations are applied to the *remote* document, instead of the current one.
- Handle the `F` entry being a standard string, instead of a dictionary.
Note that in the PDF files provided by the reporter, this issue was limited to `Rect` arrays in AcroForm entries (which we currently don't support).
However, since a bad PDF generator could create this problem in *any* kind of annotation, the reduced test-case included here uses a simple LinkAnnotation instead.
Fixes 7115.
Most code for Popup annotations is already present for Text annotations.
This patch extracts the popup creation logic from the Text annotation
code so it can be reused for Popup annotations.
Not only does this add support for Popup annotations, the Text
annotation code is also considerably easier. If a `Popup` entry is
available for a Text annotation, it will not be more than an image. The
popup will be handled by the Popup annotation. However, it is also
possible for Text annotations to not have a separate Popup annotation,
in which case the Text annotation handles the popup creation itself.
Now we have a full list of all possible annotation types and the
numbering corresponds to the order in the specification. Not only is
this more consistent and complete, it also prevents having to add these
constants when a new annotation type is implemented.
Additionally fix an issue where a regular Widget annotation would not
have `data.annotationType` set. It was only set for a
TextWidgetAnnotation, but instead move it to the base Widget annotation
class to add it for all Widget annotations (since TextWidgetAnnotation
inherits from WidgetAnnotation it will have it too).
This patch improves the code structure of the annotation code.
- Create the annotation border style object in the `setBorderStyle` method instead of in the constructor. The behavior is the same as the `setBorderStyle` method is always called, thus a border style object is still always available.
- Put all data object manipulation lines in one block in the constructor. This improves readability and maintainability as it is more visible which properties are exposed.
- Simplify `appendToOperatorList` by removing the promise capability and removing an unused parameter.
- Remove some unnecessary newlines/spaces.
This patch makes it possible to set and get all possible flags that the PDF specification defines. Even though we do not support all possible annotation types and not all possible annotation flags yet, this general framework makes it easy to access all flags for each annotation such that annotation type implementations can use this information.
We add constants for all possible annotation flags such that we do not need to hardcode the flags in the code anymore. The `isViewable()` and `isPrintable()` methods are now easier to read. Additionally, unit tests have been added to ensure correct behavior.
This is another part of #5218.
Currently, `src/core/core.js` uses the `fromRef` method on an `Annotation` object to obtain the right annotation type object (such as `LinkAnnotation` or `TextAnnotation`). That method in turn uses a method `getConstructor` to find out which annotation type object must be returned.
Aside from the fact that there is currently a lot of code to achieve this, these methods should not be part of the base `Annotation` class at all. Creation of annotation object should be done by a factory (as also recommended by @yurydelendik at https://github.com/mozilla/pdf.js/pull/5218#issuecomment-52779659) that handles finding out the correct annotation type object and returning it. This patch implements this separation of concerns.
Doing this allows us to also simplify the code quite a bit and to make it more readable. Additionally, we are now able to get rid of the hardcoded array of supported annotation types. The factory takes care of checking the annotation types and falls back to returning the base annotation type (and issuing a warning, which the current code also does not do well) when an annotation type is unsupported.
I have manually tested this commit with 20 test PDFs with different annotation types, such as /Link, /Text, /Widget, /FileAttachment and /FreeText. All render identically before and after the patch, and unsupported annotation types are now properly indicated with a warning in the console.
This patch refactors the code responsible for setting the annotation's rectangle. Its goal is to:
- Actually check that the input array is actually an array, and if so, that it contains exactly four elements.
- Only call `normalizeRect` if the input array is valid, i.e., we do not call it for the default rectangle anymore.
Unit tests are provided just like with the other patches in this series.
This became obsolete in bdeca30fbf. All it does is call the Annotation contructor and add hasHtml. This patch lets the Link and Text annotations directly extend the Annotation class and add hasHtml themselves.
This patch also removes an unused global.
This change does the following:
* Address TODO to remove getEmptyContainer helper.
* Not set container bg-color. The old code is incorrect, causing it to
not have any effect. It sets color to an array (item.color) rather
css string. Also in most cases it would set it to black background
which is incorrect.
* only add border instructions when there is actually a border
* reduce memory consumption by not creating new 3 element arrays for
annotation colors. In fact according to spec, this would be incorrect,
as the default should be "transparent" for an empty array. Adobe
Reader interprets a missing color array as black however.
Note that only Link annotations were actually setting a border style and
color. While Text annotations might have calculated a border they did
not color it. This behaviour is now controlled by the boolean flag.