Remove `PDFPageView.updatePosition`, since it's not actually necessary
This method is currently called from `PDFViewer._scrollUpdate` on *every* scroll event in the viewer. However, I cannot see why this code is now necessary (assuming that it once was), since text-selection and searching still work *exactly* the same way with this patch as with the current `master`.

When `PDFPageView.updatePosition` is called, the page can be in one of these states:

1. The page hasn't been rendered, in which case the `textLayer` property doesn't exist yet.
2. The page is currently rendering, meaning that the `textLayer` property exists. Given that the `textContent` won't be fetched until the page has been successfully rendered, `TextLayerBuilder.render` will return immediately and effectively be a no-op (since there's nothing to render yet).
3. The page has been rendered, and the `textLayer` is currently rendering.
4. The page, and its `textLayer`, has been completely rendered. In this case, `TextLayerBuilder.render` will return immediately and effectively be a no-op.

Here, only the *third* case seems to require any further analysis:

When scrolling occurs while the `textLayer` is rendering, `TextLayerBuilder.render` will, via a helper method, call `TextLayerRenderTask.cancel` (in `src/display/text_layer.js`) to stop processing. However, due to the run-to-completion nature of JavaScript, once `TextLayerRenderTask._render` has been invoked, `appendText` will always run.[1]

So even though we cancel rendering of pending `textLayer`s during scrolling, via the repeated `TextLayerBuilder.render` calls from within the `PDFPageView.updatePosition` method, that does *not* prevent us from running the code inside of `TextLayerRenderTask._render` over and over for the *same* page, which all seems *very* inefficient to me.[2]

All this will thus have the effect of delaying the *actual* rendering of a `textLayer` ever so slightly while scrolling in the viewer. However, it does so at the expense of potentially hundreds of unnecessary `appendText` calls.[3] Hence it seems to me that it's less resource intensive overall to simply let rendering of the `textLayer` complete, once it has started. Obviously, we still abort all rendering of a page, and its `textLayer`, when it's being destroyed (e.g. by being evicted from the page cache).

In case there's any worry that the patch could affect e.g. highlighting of search results, please note that the existing code in `TextLayerBuilder.render` already calls `updateMatches` when the `TextLayerTask` resolves successfully.

*I'm sorry that this became quite long, but to try and summarize:* `PDFPageView.updatePosition` doesn't actually do anything in *most* cases. In the one case where it matters, it seems that it's actually doing more harm than good, which is why I'm proposing that we just remove it.

---

[1] Although we may be able to skip the `render` call, provided that it happens *after* a `timeout` (as is the case in the default viewer).

[2] With current work being done to support streaming of `TextContent`, we'd also need to add just as many duplicate API calls to `PDFPageView.updatePosition`.

[3] The number of duplicate `appendText` calls is directly proportional not only to the scroll speed, but also to the number of pages in the document.
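To make the cost in the third case more concrete, here is a rough, simplified model of it. This is only a sketch under stated assumptions, not the actual PDF.js code: the `simulate` function, its parameters, and the idea that each restart re-runs one synchronous loop over all text items are hypothetical stand-ins for the `TextLayerRenderTask._render` / `appendText` behaviour described above.

```javascript
// Hypothetical, simplified model of the behaviour described above; none of
// these names are part of the PDF.js API.
function simulate({ textItems, scrollEvents, restartOnScroll }) {
  let appendTextCalls = 0;

  // Stand-in for starting a text-layer render task: the per-item loop runs
  // synchronously, so a later cancel() cannot interrupt it.
  function startTask() {
    for (let i = 0; i < textItems; i++) {
      appendTextCalls++; // stand-in for one `appendText` call
    }
    return { canceled: false };
  }

  let pendingTask = startTask(); // initial render of the text layer

  for (let i = 0; i < scrollEvents; i++) {
    if (restartOnScroll) {
      // Cancelling only skips the deferred final step of the old task;
      // the per-item work has already been done and is now repeated.
      pendingTask.canceled = true;
      pendingTask = startTask();
    }
    // Otherwise the already-started task is simply left to complete.
  }

  return { appendTextCalls };
}

const restart = simulate({ textItems: 200, scrollEvents: 50, restartOnScroll: true });
const letFinish = simulate({ textItems: 200, scrollEvents: 50, restartOnScroll: false });

console.log(restart.appendTextCalls);   // 10200
console.log(letFinish.appendTextCalls); // 200
```

In this model the restart-on-scroll strategy performs `(scrollEvents + 1) * textItems` per-item calls for a single page, whereas letting the task finish performs `textItems` calls. The real numbers depend on timing and on how many pages are affected, so treat the figures as illustrative only.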
PDF.js
PDF.js is a Portable Document Format (PDF) viewer that is built with HTML5.
PDF.js is community-driven and supported by Mozilla Labs. Our goal is to create a general-purpose, web standards-based platform for parsing and rendering PDFs.
Contributing
PDF.js is an open source project and always looking for more contributors. To get involved, visit:
- Issue Reporting Guide
- Code Contribution Guide
- Frequently Asked Questions
- Good Beginner Bugs
- Projects
Feel free to stop by #pdfjs on irc.mozilla.org for questions or guidance.
Getting Started
Online demo
Browser Extensions
Firefox (and Seamonkey)
PDF.js is built into version 19+ of Firefox; however, one extension is still available:
- Development Version - This extension is mainly intended for developers/testers, and it is updated every time new code is merged into the PDF.js codebase. It should be quite stable, but might break from time to time.
  - Please note that the extension is not guaranteed to be compatible with Firefox versions that are older than the current ESR version, see the Release Calendar.
  - The extension should also work in Seamonkey, provided that it is based on a Firefox version as above (see Which version of Firefox does SeaMonkey 2.x correspond with?), but we do not guarantee compatibility.
Chrome
- The official extension for Chrome can be installed from the Chrome Web Store. This extension is maintained by @Rob--W.
- Build Your Own - Get the code as explained below and issue `gulp chromium`. Then open Chrome, go to `Tools > Extension` and load the (unpackaged) extension from the directory `build/chromium`.
Getting the Code
To get a local copy of the current code, clone it using git:
$ git clone git://github.com/mozilla/pdf.js.git
$ cd pdf.js
Next, install Node.js via the official package or via nvm. You need to install the gulp package globally (see also gulp's getting started):
$ npm install -g gulp-cli
If everything worked out, install all dependencies for PDF.js:
$ npm install
Finally, you need to start a local web server, as some browsers do not allow opening PDF files using a `file://` URL. Run:
$ gulp server
and then you can open
It is also possible to view all test PDF files on the right side by opening
Building PDF.js
In order to bundle all `src/` files into two production scripts and build the generic viewer, run:
$ gulp generic
This will generate `pdf.js` and `pdf.worker.js` in the `build/generic/build/` directory. Both scripts are needed but only `pdf.js` needs to be included since `pdf.worker.js` will be loaded by `pdf.js`. The PDF.js files are large and should be minified for production.
Using PDF.js in a web application
To use PDF.js in a web application you can choose to use a pre-built version of the library or to build it from source. We supply pre-built versions for usage with NPM and Bower under the `pdfjs-dist` name. For more information and examples please refer to the wiki page on this subject.
Learning
You can play with the PDF.js API directly from your browser using the live demos below:
The repository contains a hello world example that you can run locally:
More examples can be found in the examples folder. Some of them use the `pdfjs-dist` package, which can be built and installed in this repo directory via the `gulp dist-install` command.
For an introduction to the PDF.js code, check out the presentation by our contributor Julian Viereck:
More learning resources can be found at:
Questions
Check out our FAQs and get answers to common questions:
Talk to us on IRC:
- #pdfjs on irc.mozilla.org
File an issue:
Follow us on twitter: @pdfjs