This major version bump required two changes:
- The global line in the mobile viewer example should be removed because
the `.eslintrc` file already defines these globals and with the new
`eslint` version we otherwise get an error saying "'pdfjsLib' is already
defined as a built-in global variable".
- The ECMA version for the examples must be set to 6 since we're using
modules, otherwise we get an error saying "sourceType 'module' is not
supported when ecmaVersion < 2015". It turns out that the previous
version of `eslint` already used ECMA version 6 silently even though
we set 5, see https://github.com/eslint/eslint/issues/9687#issuecomment-432413384,
so in terms of our code nothing really changes.
If, as PR 10368 suggests, more parameters should be added to `getViewport` I think that it would be a mistake to not change the signature *first* to avoid needlessly unwieldy call-sites.
To not break any existing code and third-party use-cases, this is obviously implemented with a deprecation warning *and* with a working fallback[1] for the old method signature.
---
[1] This is limited to `GENERIC` builds, which should be sufficient.
This required the following changes in the Gulpfile:
- Defining a series of tasks is no longer done with arrays, but with the
`gulp.series` function. The `web` target is refactored to use a
smaller number of tasks to prevent tasks from running multiple times.
- Getting all tasks must now be done through the task registry.
- Tasks that don't return anything must call `done` upon completion.
Moreover, this upgrade allows us to use the latest Node.js on Travis CI
again.
*This patch is based on something that I noticed while working on PR 10126.*
The recent re-factoring of `PDFFindController` brought many improvements, among those the fact that access to `BaseViewer` is no longer required. However, with these changes there's one thing which now strikes me as not particularly user-friendly[1]: The fact that in order for searching to actually work, `PDFFindController.setDocument` must be called *and* a 'pagesinit' event must be dispatched (from somewhere).
For all other viewer components, calling the `setDocument` method[2] is enough in order for the component to actually be usable.
The `PDFFindController` thus stands out quite a bit, and it also becomes difficult to work with in any sort of custom implementation. For example: Imagine someone trying to use `PDFFindController` separately from the viewer[3], which *should* now be relatively simple given the re-factoring, and thus having to (somehow) figure out that they'll also need to manually dispatch a 'pagesinit' event for searching to work.
Note that the above even affects the unit-tests, where an out-of-place 'pagesinit' event is being used.
To attempt to address these problems, I'm thus suggesting that *only* `setDocument` should be used to indicate that searching may start. For the default viewer and/or the viewer components, `BaseViewer.setDocument` will now call `PDFFindController.setDocument` when the document is ready, thus requiring no outside configuration anymore[4]. For custom implementation, and the unit-tests, it's now as simple as just calling `PDFFindController.setDocument` to allow searching to start.
---
[1] I should have caught this during review of PR 10099, but unfortunately it's sometimes not until you actually work with the code in question that things like these become clear.
[2] Assuming, obviously, that the viewer component in question actually implements such a method :-)
[3] There's even a very recent issue, filed by someone trying to do just that.
[4] Short of providing a `PDFFindController` instance when creating a `BaseViewer` instance, of course.
With `PDFFindController` instances no longer (directly) depending on
`BaseViewer` instances, we can pass a single `findController` when
initializing a viewer, similar to other components.
This removes the dependency on a `PDFViewer` instance from the find
controller, which makes it more similar to other components and makes it
easier to unit test with a mock link service.
Finally, we remove the search capabilities from the SVG example since it
doesn't work there because there is no separate text layer.
Now it follows the same pattern as e.g., the document properties
component, which allows us to have one instance of the find controller
and set a new document to search upon switching documents.
Moreover, this allows us to get rid of the dependency on `pdfViewer` in
order to fetch the text content for a page. This is working towards
getting rid of the `pdfViewer` dependency upon initializing the
component entirely in future commits.
Finally, we make the `reset` method private since it's not supposed to
be used from the outside anymore now that `setDocument` takes care of
this, similar to other components.
The purpose of this patch is to provide a better default behaviour when `JpegImage` is used to parse standalone JPEG images with CMYK colour spaces.
Since the issue that the patch concerns is somewhat of a special-case, the implementation utilizes the already existing decode support in an attempt to minimize the impact w.r.t. code size.
*Please note:* It's always possible for the user of `JpegImage` to control image inversion, and thus override the new behaviour, by simply passing a custom `decodeTransform` array upon initialization.
This should provide a better out-of-the-box experience when using PDF.js in a Node.js environment, since it's missing native support for both `@font-face` and `Image`.
Please note that this change *only* affects the default values, hence it's still possible for an API consumer to override those values when calling `getDocument`.
Also, prevents "ReferenceError: document is not defined" errors, when running the unit-tests in Node.js/Travis.
We need to pass `disableFontFace` and `nativeImageDecoderSupport`
because Node.js has no native support for `@font-face` and `Image`.
Doing so makes it possible to render e.g., the Tracemonkey paper, which
failed before. I made this PDF file the default because it's also the
default in other examples/demos and because it showcases the
possibilities better than the very simple hello world PDF file.
Building the library with `gulp dist-install` is easier and is already
recommended in the other examples.
Rather than having two different (but connected) options for the textLayer, I think that it makes sense to try and unify this. For example: currently if `disableTextLayer === true`, then the value of `enhanceTextSelection` is simply ignored.
Since PDF.js version `2.0` already won't be backwards compatible in lots of ways, I don't think that we need to worry about migrating existing preferences here.
This removes the `PDFJS.useOnlyCssZoom` dependency from the viewer components, but please note that as a *temporary* solution the default viewer still uses it.
Despite this patch removing the `disableWorker` option itself, please note that we'll still fallback to loading the worker file(s) on the main-thread when running in environments without proper Web Worker support.
Furthermore it's still possible, even with this patch, to force the use of fake workers by manually loading the necessary file using a `<script>` tag on the main-thread.[1]
That way, the functionality of the now removed `SINGLE_FILE` build target and the resulting `build/pdf.combined.js` file can still be achieved simply by adding e.g. `<script src="build/pdf.worker.js"></script>` to the HTML (obviously with the path adjusted as needed).
Finally note that the `disableWorker` option is a performance footgun, and unfortunately many existing third-party examples actually use it without providing any sort of warning/justification.
---
[1] This approach is used in the default viewer, since certain kind of debugging may be easier if the code is running directly on the main-thread.
These were removed in PR 9170, since they were unused in the browsers that we'll support in PDF.js version `2.0`.
However looking at the output of Travis, where a subset of the unit-tests are run using Node.js, there's warnings about `btoa` being undefined. This doesn't appear to cause any errors, which probably explains why we didn't notice this before (despite PR 9201).
The `DOMParser` is most likely overkill and may be less secure.
Moreover, it is not supported in Node.js environments.
This patch replaces the `DOMParser` with a simple XML parser. This
should be faster and gives us Node.js support for free. The simple XML
parser is a port of the one that existed in the examples folder with a
small regex fix to make the parsing work correctly.
The unit tests are extended for increased test coverage of the metadata
code. The new method `getAll` is provided so the example does not have
to access internal properties of the object anymore.