In the PDF from issue 8527, the clip operator (W) shows up before a path
is defined. The current SVG backend however expects a path to exist
before generating a `<svg:clipPath>` element.
In the example, the path was defined after the clip, followed by a
endPath operator (n).
So this commit fixes the bug by moving the path generation logic from
clip to endPath.
Our canvas backend appears to use similar logic:
`CanvasGraphics_endPath` calls `consumePath`, which in turn draws the
clip and resets the `pendingClip` state. The canvas backend calls
`consumePath` from multiple other places, so we probably need to check
whether doing so is also necessary for the SVG backend.
I scanned our corpus of PDF files in test/pdfs, and found that in every
instance (except for one), the "W" PDF operator (clip) is immediately
followed by "n" (endPath). The new test from this commit (clippath.pdf)
starts with "W", followed by a path definition and then "n".
# Commands used to find some of the clipping commands:
grep -ra '^W$' -C7 | less -S
grep -ra '^W ' -C7 | less -S
grep -ra ' W$' -C7 | less -S
test/pdfs/issue6413.pdf is the only file where "W" (a tline 55) is not
followed by "n". In fact, the "W" is the last operation of a series of
XObject painting operations, and removing it does not have any effect
on the rendered PDF (confirmed by looking at the output of PDF.js's
canvas backend, and ImageMagick's convert command).
*As mentioned the last time that I touched this particular part of the font code, I'm sincerely hope that this doesn't cause any regressions!*
However, the patch passes all tests added in PRs 5770, 6270, and 7904 (and obviously all other tests as well). Furthermore, I've manually checked all the issues/bugs referenced in those PRs without finding any issues.
Fixes 8480.
Adds functionality to accept Queueing Strategy in
sendWithStream method. Using Queueing Strategy we
can control the data that is enqueued into the sink,
and hence regulated the flow of chunks from worker
to main thread.
Adds capability in pull and cancel methods.
Adds ready and desiredSize property in streamSink.
Adds unit test for ReadableStream and sendWithStream.
PR 7341 added special handling for `nameddest`s that look like pageNumbers, to prevent issues since we previously *incorrectly* supported specifying a pageNumber directly in the hash; i.e. `#10` versus the correct `#page=10` format.
Since this behaviour wasn't correct, PR 7757 fixed and deprecated the old format, which means that we no longer need to maintain the `nameddest` hack in multiple files.
Added test for ReadableStream.
Adds ref-implementation license-header in streams-lib
and change gulp task to copy external/streams/ in build/
external/streams/ and build/dist/external/streams folder.
Adds README.md and LICENSE.md
Refactors `Driver._cleanup` to return a `Promise` which is resolved once all opened documents have been destroyed.
This is then used in `Driver._nextTask` to ensure that we wait for everything to be cleaned up, such that the tests run sequentially.
For some reason, we're putting all kind of images *except* JPEG into the `imageCache` in `evaluator.js`.[1]
This means that in the PDF file in issue 8380, we'll keep sending the *same* two small images[2] to the main-thread and decoding them over and over. This is obviously hugely inefficient!
As can be seen from the discussion in the issue, the performance becomes *extremely* bad if the user has the addon "Adblock Plus" installed. However, even in a clean Firefox profile, the performance isn't that great.
This patch not only addresses the performance implications of the "Adblock Plus" addon together with that particular PDF file, but it *also* improves the rendering times considerably for *all* users.
Locally, with a clean profile, the rendering times are reduced from `~2000 ms` to `~500 ms` for my setup!
Obviously, the general structure of the PDF file and its operator sequence is still hugely inefficient, however I'd say that the performance with this patch is good enough to consider the issue (as it stands) resolved.[3]
Fixes 8380.
---
[1] Not technically true, since inline images are cached from `parser.js`, but whatever :-)
[2] The two JPEG images have dimensions 1x2, respectively 4x2.
[3] To make this even more efficient, a new state would have to be added to the `QueueOptimizer`. Given that PDF files this stupid fortunately aren't too common, I'm not convinced that it's worth doing.
Currently these methods accept a large number of parameters, which creates quite unwieldy call-sites. When invoking them, you have to remember not only what arguments to supply, but also the correct order, to avoid runtime errors.
Furthermore, since some of the parameters are optional, you also have to remember to pass e.g. `null` or `undefined` for those ones.
Also, adding new parameters to these methods (which happens occasionally), often becomes unnecessarily tedious (based on personal experience).
Please note that I do *not* think that we need/should convert *every* single method in `evaluator.js` (or elsewhere in `/core` files) to take parameter objects. However, in my opinion, once a method starts relying on approximately five parameter (or even more), passing them in individually becomes quite cumbersome.
With these changes, I obviously needed to update the `evaluator_spec.js` unit-tests. The main change there, except the new method signatures[1], is that it's now re-using *one* `PartialEvalutor` instance, since I couldn't see any compelling reason for creating a new one in every single test.
*Note:* If this patch is accepted, my intention is to (time permitting) see if it makes sense to convert additional methods in `evaluator.js` (and other `/core` files) in a similar fashion, but I figured that it'd be a good idea to limit the initial scope somewhat.
---
[1] A fun fact here, note how the `PartialEvaluator` signature used in `evaluator_spec.js` wasn't even correct in the current `master`.
Please see http://eslint.org/docs/rules/object-shorthand.
Unfortunately, based on commit 9276d1dcd9, it seems that we still need to maintain compatibility with old Node.js versions, hence certain files/directories that are executed in Node.js are currently exempt from this rule.
Furthermore, since the files specific to the Chromium extension are not run through Babel, the `/extensions/chromium/` directory is also exempt from this rule.
The test runner is automated, so if the default browser test is
performed, the browser hangs waiting for user input it never gets.
Disable the test to fix that.
Moreover, enable E10s now that it is mature. This may help with the
performance of the test runner as well.
Please refer to the JBIG2 standard, see https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-T.88-200002-I!!PDF-E&type=items.
In particular, section "6.3.5.3 Fixed templates and adaptive templates" mentions that the offsets should be *subtracted*; where the offsets are defined according to "Table 6" under section "6.3.2 Input parameters".
Fixes 7145.
Fixes 7308.
Fixes 7401.
Fixes 7850.
Fixes 8270.
With the exception of just one test-case, all the current `ui_utils` unit-tests can run successfully on Node.js (since most of them doesn't rely on the DOM).
To get this working, I had to first of all add a new `LIB` build flag such that `gulp lib` produces a `web/pdfjs.js` file that is able to load `pdf.js` successfully.
Second of all, since neither `document` nor `navigator` is available in Node.js, `web/ui_utils.js` was adjusted slightly to avoid errors.
The patch also changes the `defaultFilename` to use the ES6 default parameter notation, and fixes the formatting of the JSDoc comment.
Finally, since `getPDFFileNameFromURL` currently has no unit-tests, a few basic ones are added to avoid regressions.
[api-minor] Always allow e.g. rendering to continue even if there are errors, and add a `stopAtErrors` parameter to `getDocument` to opt-out of this behaviour (issue 6342, issue 3795, bug 1130815)
This patch implements support for line annotations. Other viewers only
show the popup annotation when hovering over the line, which may have
any orientation. To make this possible, we render an invisible line (SVG
element) over the line on the canvas that acts as the trigger for the
popup annotation. This invisible line has the same starting coordinates,
ending coordinates and width of the line on the canvas.
Other PDF readers, e.g. Adobe Reader and PDFium (in Chrome), will attempt to render as much of a page as possible even if there are errors present.
Currently we just bail as soon the first error is hit, which means that we'll usually not render anything in these cases and just display a blank page instead.
NOTE: This patch changes the default behaviour of the PDF.js API to always attempt to recover as much data as possible, even when encountering errors during e.g. `getOperatorList`/`getTextContent`, which thus improve our handling of corrupt PDF files and allow the default viewer to handle errors slightly more gracefully.
In the event that an API consumer wishes to use the old behaviour, where we stop parsing as soon as an error is encountered, the `stopAtErrors` parameter can be set at `getDocument`.
Fixes, inasmuch it's possible since the PDF files are corrupt, e.g. issue 6342, issue 3795, and [bug 1130815](https://bugzilla.mozilla.org/show_bug.cgi?id=1130815) (and probably others too).
Considering how extremely simple this patch turned out to be, I'm almost worried that I completely misunderstood why the current code looks like it does...
I happened to notice that the error handling wasn't that great, which I missed previously since there were no unit-tests for failure to load built-in CMap files.
Hence this patch, which improves the error handling *and* adds tests.
I really cannot understand why this change is necessary, since modern browsers such as Firefox and Chrome work just fine with the old code.
Hence this is patch is yet another "hack" that's needed just because IE apparently cannot just work like you'd expect.
For consistency, the Node factory used in the CMap unit-tests is changed as well.
Fixes 8193.