pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	90b2664622	Add better validation for the "PREFERENCE" kind `AppOptions` Given that the "PREFERENCE" kind is used e.g. to generate the preference-list for the Firefox PDF Viewer, those options need to be carefully validated. With this patch we'll now check this unconditionally in development mode, during testing, and when creating the preferences in the gulpfile.	2024-02-20 18:38:15 +01:00
Calixte Denizet	29de9bdce6	Format json files using prettier	2024-01-16 19:40:25 +01:00
Jonas Jenwald	3ced0dec1b	[api-major] Remove the SVG back-end (PR 15173 follow-up) This has been deprecated since version `2.15.349`, which is a year ago. Removing this will also simplify some upcoming changes, specifically outputting of JavaScript modules in the builds.	2023-10-01 23:14:29 +02:00
Jonas Jenwald	506bca5e6d	Add unit-tests to check that more PDF.js APIs expose the expected functionality Similar to e.g. PR 16587, let's ensure that the `pdf.worker.js` and `pdf.image_decoders.js` files expose the expected functionality.	2023-07-07 12:36:21 +02:00
Jonas Jenwald	5f5db4b160	Run the PDF.js-viewer API unit-test in Node.js environments (PR 16592 follow-up) It occurred to me that we can actually run this unit-test in Node.js environments by making use of the preprocessor to stub out the browser globals there.	2023-06-26 09:37:34 +02:00
Jonas Jenwald	0bbadce066	Add a unit-test to check that the official PDF.js API exposes the expected functionality Until now we've not actually had any tests that ensure that the official PDF.js API exposes the intended functionality, which means that things can easily break accidentally.	2023-06-22 15:21:10 +02:00
Calixte Denizet	89140fcd98	Add tests for the font substitution	2023-05-14 18:07:03 +02:00
Jonas Jenwald	21fe5017bb	Remove the abstract `BaseViewer`-class After the changes in PR 14112 the `PDFViewer`-class is now "identical" to the `BaseViewer`-class and the `PDFSinglePageViewer`-class is just a very thin wrapper around the `BaseViewer`-class. Hence we can rename these files, and also remove the abstract `BaseViewer`-class, which helps reduce some unnecessary "closures" in the built viewer. Please note: These changes are made in two separate commits, to allow GitHub to preserve `blame` for the affected files.	2022-09-08 12:38:17 +02:00
Jonas Jenwald	345bb18575	[editor] Use the `fit-curve` package (issue 15004) Rather than including all of this external code in the PDF.js repository, we should be using the npm package instead. Unfortunately this is slightly more complicated than you'd hope, since the `fit-curve` package (which is older) isn't directly compatible with modern JavaScript modules. In particular, the following cases needed to be considered: - For the development viewer (i.e. `gulp server`) and the unit-tests, we thus need to build a fitCurve-bundle that can be directly `import`ed. - For the actual PDF.js build-targets, we can slightly reduce the sizes by depending on the "raw" `fit-curve` source-code. - For the Node.js unit-tests, the `fit-curve` package can be used as-is.	2022-07-07 10:43:43 +02:00
Jonas Jenwald	e046b811b7	Expose `TextLayerRenderTask` in the TypeScript definitions (issue 15016, PR 14013 follow-up) While `TextLayerRenderTask` apparently makes sense in TypeScript environments, given that it's being returned by the `renderTextLayer`-function in the API, we really don't want to extend the public API by simply exporting the class directly in `src/pdf.js` since it should never be called/initialized manually. Hence we follow the same pattern as in PR 14013, and add some very basic unit-tests to ensure that `renderTextLayer` always returns a `TextLayerRenderTask`-instance as expected.	2022-06-10 22:12:32 +02:00
Jonas Jenwald	0a19ef6864	Move the `EventBus`, and related functionality, into its own file The size of the `web/ui_utils.js` file has increased over time, as more code has been added to (or moved into) that file. To reduce its size slightly, this patch moves the event-related functionality into a separate file.	2021-12-15 17:18:57 +01:00
Jonas Jenwald	fe205efd8d	Add a couple of basic unit-tests for `PDFPageViewBuffer` The `PDFPageViewBuffer`-code is very important for the correct function of the viewer, but it's currently not tested at all. While the `PDFPageViewBuffer` is obviously intended to be used with `PDFPageView`-instances, it only accesses a couple of `PDFPageView` properties/methods and consequently it's fairly easy to unit-test this code with dummy-data. These unit-tests should help improve our confidence in this code, and will also come in handy with other changes that I'm working on (regarding modernizing and re-factoring the `PDFPageViewBuffer`-code).	2021-11-05 19:43:20 +01:00
Calixte Denizet	429ffdcd2f	XFA - Save filled data in the pdf when downloading the file (Bug 1716288) - when binding (after parsing) we get a map between some template nodes and some data nodes; - so set user data in input handlers using data node uids in the annotation storage; - to save the form, just put the value we have in the storage in the correct data nodes, serialize the xml as a string and then write the string at the end of the pdf using src/core/writer.js; - fix a few bugs around data bindings: - the "Off" issue in Bug 1716980.	2021-06-25 18:57:01 +02:00
Calixte Denizet	a4c986515f	XFA -- Display text content - display xhtml; - allow spaces in xhtml (xfa-spacerun:yes); - support column layout; - fix some border issues.	2021-04-12 14:13:49 +02:00
Brendan Dahl	fc9501a637	Add support for basic structure tree for accessibility. When a PDF is "marked" we now generate a separate DOM that represents the structure tree from the PDF. This DOM is inserted into the <canvas> element and allows screen readers to walk the tree and have more information about headings, images, links, etc. To link the structure tree DOM (which is empty) to the text layer aria-owns is used. This required modifying the text layer creation so that marked items are now tracked.	2021-04-09 09:56:28 -07:00
calixteman	b5be515375	XFA - Add a lexer/parser for FormCalc language (#12936) - the language specifications are: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=1049 - it can be used: * as a scripting language for calculation, validations, ... * in SOM expressions to select nodes: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.2157&rep=rep1&type=pdf#page=101	2021-02-17 20:28:06 +01:00
Jonas Jenwald	0068dba009	[api-minor] Rename `-es5` to `-legacy`, to reduce confusion over what's actually supported (issue 12976) Please note that this will also require some edits of the Wiki.	2021-02-10 16:01:59 +01:00
Calixte Denizet	0ff5cd7eb5	XFA - Add a parser for XFA files - the parser is based on a class extending XMLParserBase - it handles xml namespaces: * each namespace is associated with a builder * builder builds nodes belonging to the namespace * when a node is inserted in the parent namespace compatibility is checked (if required) - to avoid name collision between xml names and object properties, use Symbol.	2021-02-01 13:45:31 +01:00
calixteman	1039698697	Add a parser to get font data from the default appearance (#12831 ) * Add a parser to get font data from the default appearance - pdfium & poppler use a special parser too to get these info. * Update src/core/default_appearance.js Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com> Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>	2021-01-21 20:15:31 +01:00
Calixte Denizet	c7974e9996	JS -- Add a sandbox based on quickjs * quickjs-eval.js has been generated using https://github.com/mozilla/pdf.js.quickjs/ * lazy load of sandbox code * Rewrite tests to use the sandbox * Add a task `watch-sandbox` which update bundle pdf.sandbox.js on change in the sandbox code	2020-11-19 13:40:46 +01:00
Calixte Denizet	f69e848b1c	JS -- Add 'util' object This patch provides an implementation of the util object as described: * https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/js_api_reference.pdf#page=716	2020-11-06 18:12:29 +01:00
calixteman	68b99c59ee	Save form data in XFA datasets when pdf is a mix of acroforms and xfa (#12344 ) * Move display/xml_parser.js in shared to use it in worker * Save form data in XFA datasets when pdf is a mix of acroforms and xfa Co-authored-by: Brendan Dahl <brendan.dahl@gmail.com>	2020-09-08 15:13:52 -07:00
Calixte Denizet	1a6816ba98	Add support for saving forms	2020-08-12 10:32:59 +02:00
Calixte Denizet	584902dbf8	Add an annotation storage in order to save annotation data in acroforms	2020-07-24 10:50:11 +02:00
Jonas Jenwald	710704508c	Fail early, in modern `GENERIC` builds, if certain required browser functionality is missing (issue 11762) With two kinds of builds now being produced, with/without translation/polyfills, it's unfortunately somewhat easy for users to accidentally pick the wrong one. In the case where a user would attempt to use a modern build of PDF.js in an older browser, such as e.g. IE11, the failure would be immediate when the code is loaded (given the use of unsupported ECMAScript features). However in some browsers/environments, in particular Node.js, a modern PDF.js build may load correctly and thus appear to function, only to fail for e.g. certain API calls. To hopefully lessen the support burden, and to try and improve things overall, this patch adds checks to ensure that a modern build of PDF.js cannot be used in browsers/environments which lack native support for critical functionality (such as e.g. `ReadableStream`). Hence we'll fail early, with an error message telling users to pick an ES5-compatible build instead. To ensure that we actually test things better especially w.r.t. usage of the PDF.js library in Node.js environments, the `gulp npm-test` task as used by Node.js/Travis was changed (back) to test an ES5-compatible build. (Since the bots still test the code as-is, without transpilation/polyfills, this shouldn't really be a problem as far as I can tell.) As part of these changes there's now both `gulp lib` and `gulp lib-es5` build targets, similar to e.g. the generic builds, which thanks to some re-factoring only required adding a small amount of code. Please note: While it's probably too early to tell if this will be a widespread issue, it's possible that this is the sort of patch that may warrant being `git cherry-pick`ed onto the current beta version (v2.4.456).	2020-04-01 19:42:48 +02:00
Jonas Jenwald	c5cf3ab808	Run the `custom_spec` unit-tests in Node.js/Travis (PR 10537 follow-up)	2019-02-26 22:40:55 +01:00
Jonas Jenwald	db5dc14158	Move worker-thread only functions from `src/shared/util.js` and into a new `src/core/core_utils.js` file The `src/shared/util.js` file is being bundled into both the `pdf.js` and `pdf.worker.js` files, meaning that its code is by definition duplicated. Some main-thread only utility functions have already been moved to a separate `src/display/display_utils.js` file, and this patch simply extends that concept to utility functions which are used only on the worker-thread. Note in particular the `getInheritableProperty` function, which expects a `Dict` as input and thus cannot possibly ever be used on the main-thread.	2019-02-24 00:35:39 +01:00
Jonas Jenwald	a1f7517996	Rename the `src/display/dom_utils.js` file to `src/display/display_utils.js` This file (currently) contains not only DOM-specific helper functions/classes, but is used generally for various helper code relevant for main-thread functionality.	2019-02-23 16:30:16 +01:00
Tim van der Meij	1b402996cf	Implement a basic unit test for the find controller This commit shows that we can now unit test the find controller and that executing regular queries works. Note that this is only a first step and not a complete suite of unit tests for all possible options of the find controller. While writing this unit test, I found two smaller issues that I addressed directly. The first one is that in the previous find controller refactoring I forgot to rename some occurrences of a now private member variable. Fortunately this did not cause any bugs since we did have a public getter and the fetched value may be changed by reference, but it's nevertheless good to fix. The second issue is that some entries in the `test/unit/clitests.json` file were not correct, resulting in these tests not being executed on e.g., Travis CI.	2018-09-30 18:32:34 +02:00
Jonas Jenwald	6d804d657f	Add initial support for "Whole words" searching in the viewer As outlined in https://bugzilla.mozilla.org/show_bug.cgi?id=1282759 the internal Firefox name for the feature is `entireWord`, hence that name is used here as well for consistency (with "Whole words" being limited to the UI). Given existing limitations of the PDF.js search functionality, e.g. the existing problems of searching across "new lines", there's some edge-cases where "Whole words" searching will ignore (valid) results. However, considering that this is a pre-existing issue related to the way that the find controller joins text-content together, that shouldn't have to block this new feature in my opinion. Please note: In order to enable this feature in the `MOZCENTRAL` version, a small follow-up patch for [PdfjsChromeUtils.jsm](https://hg.mozilla.org/mozilla-central/file/tip/browser/extensions/pdfjs/content/PdfjsChromeUtils.jsm) will be required once this has landed in `mozilla-central`.	2018-09-10 11:59:29 +02:00
Brendan Dahl	b76cf665ec	Map all glyphs to the private use area and duplicate the first glyph. There have been lots of problems with trying to map glyphs to their unicode values. It's more reliable to just use the private use areas so the browser's font renderer doesn't mess with the glyphs. Using the private use area for all glyphs did highlight other issues that this patch also had to fix: * small private use area - Previously, only the BMP private use area was used which can't map many glyphs. Now, the (much bigger) PUP 16 area can also be used. * glyph zero not shown - Browsers will not use the glyph from a font if it is glyph id = 0. This issue was less prevalent when we mapped to unicode values since the fallback font would be used. However, when using the private use area, the glyph would not be drawn at all. This is illustrated in one of the current test cases (issue #8234) where there's an "ä" glyph at position zero. The PDF looked like it rendered correctly, but it was actually not using the glyph from the font. To properly show the first glyph it is always duplicated and appended to the glyphs and the maps are adjusted. * supplementary characters - The private use area PUP 16 is 4 bytes, so String.fromCodePoint must be used where we previously used String.fromCharCode. This is actually an issue that should have been fixed regardless of this patch. * charset - Freetype fails to load fonts when the charset size doesn't match number of glyphs in the font. We now write out a fake charset with the correct length. This also brought up the issue that glyphs with seac/endchar should only ever write a standard charset, but we now write a custom one. To get around this the seac analysis is permanently enabled so those glyphs are instead always drawn as two glyphs.	2018-09-05 14:04:54 -07:00
Jonas Jenwald	44d8afd46b	Move `MessageHandler` into a separate `src/shared/message_handler.js` file The `MessageHandler` itself, and its assorted helper functions, are currently the single largest[1] piece of code in the `src/shared/util.js` file. By moving this code into its own file, `src/shared/util.js` thus becomes smaller and more manageable.	2018-06-04 12:53:08 +02:00
Jonas Jenwald	42c71cd99f	Utilize `PDFNodeStream` to run more API unit-tests on Node.js/Travis	2018-01-28 17:14:08 +01:00
Tim van der Meij	c7af2db2ec	Implement unit tests for the encodings and fix missing items Initially I just implemented the unit tests, but quickly found that they were failing my expectation of having a size of 256 items. Some of them did contain 256 items and some did not. I looked up various resources and figured that they indeed all need to have 256 items. One of the good resources is https://github.com/davidben/poppler/blob/master/poppler/FontEncodingTables.cc Aside from some missing `notdef` (empty string) entries at the end of the arrays, which I assume causes issues since it may cause out-of-bounds array access which in JavaScript gives `undefined`, there was a `notdef` entry missing in the `MacExpertEncoding`, causing the entries after that to be shifted. The fix for this is similar to the one in #8589. The unit tests verify that, for known encoding names, the return value is not only an array, but that it is also of the right length and contains only strings.	2017-12-24 18:14:40 +01:00
Tim van der Meij	957e2d420d	Implement unit tests for the network utility code This should provide 100% coverage for the file.	2017-12-23 19:24:11 +01:00
Tim van der Meij	2281061882	Enable metadata unit tests for Travis CI and Node.js	2017-09-19 23:09:07 +02:00
Jonas Jenwald	388851e37b	Add a `isDestsEqual` helper function, to allow comparing explicit destinations, in `pdf_history.js`	2017-08-30 19:45:13 +02:00
Mukul Mishra	d16709f5e4	Adds tests for node_stream	2017-08-24 12:46:44 +05:30
Apoorv Mishra	a129de7bd1	Add unit-tests for colorspace.js Added unit-tests for DeviceGray, DeviceRGB and DeviceCMYK Added unit-tests for CalGray Added unit-tests for CalRGB Removed redundant code Added unit-tests for LabCS Added unit-tests for IndexedCS Update comment Change lookup to Uint8Array as mentioned in pdf specs(these tests will pass after PR #8666 is merged). Added unit-tests for AlternateCS Resolved code-style issues Fixed code-style issues Addressed issues pointed out in https://github.com/mozilla/pdf.js/pull/8611#pullrequestreview-52865469	2017-07-28 14:24:56 +05:30
Rob Wu	01f03fe393	Optimize PNG compression in SVG backend on Node.js Use the environment's zlib implementation if available to get reasonably-sized SVG files when an XObject image is converted to PNG. The generated PNG is not optimal because we do not use a PNG predictor. Further, when our SVG backend is run in a browser, the generated PNG images will still be unnecessarily large (though the use of blob:-URLs when available should reduce the impact on memory usage). If we want to optimize PNG images in browsers too, we can either try to use a DEFLATE library such as pako, or re-use our XObject image painting logic in src/display/canvas.js. This potential improvement is not implemented by this commit. Tested with: - Node.js 8.1.3 (uses zlib) - Node.js 0.11.12 (uses zlib) - Node.js 0.10.48 (falls back to inferior existing implementation). - Chrome 59.0.3071.86 - Firefox 54.0 Tests: Unit test on Node.js: ``` $ gulp lib $ JASMINE_CONFIG_PATH=test/unit/clitests.json node ./node_modules/.bin/jasmine --filter=SVG ``` Unit test in browser: Run `gulp server` and open http://localhost:8888/test/unit/unit_test.html?spec=SVGGraphics To verify that the patch works as desired, ``` $ node examples/node/pdf2svg.js test/pdfs/xobject-image.pdf $ du -b svgdump/xobject-image-1.svg # ^ Calculates the file size. Confirm that the size is small # (784 instead of 80664 bytes). ```	2017-07-10 18:56:57 +02:00
Mukul Mishra	bbd9968f76	Added sendWithStream method in MessageHandler. Adds functionality to accept Queueing Strategy in sendWithStream method. Using Queueing Strategy we can control the data that is enqueued into the sink, and hence regulate the flow of chunks from worker to main thread. Adds capability in pull and cancel methods. Adds ready and desiredSize property in streamSink. Adds unit test for ReadableStream and sendWithStream.	2017-06-07 21:05:27 +05:30
Jonas Jenwald	bbe8c3d8ed	Enable running a subset of the API unit-tests on Travis Notably, this patch skips all canvas rendering tests in Node.js.	2017-05-12 11:48:27 +02:00
Jonas Jenwald	ae04cf1c37	Enable running the `ui_utils` unit-tests on Travis With the exception of just one test-case, all the current `ui_utils` unit-tests can run successfully on Node.js (since most of them don't rely on the DOM). To get this working, I had to first of all add a new `LIB` build flag such that `gulp lib` produces a `web/pdfjs.js` file that is able to load `pdf.js` successfully. Second of all, since neither `document` nor `navigator` is available in Node.js, `web/ui_utils.js` was adjusted slightly to avoid errors.	2017-04-25 13:37:56 +02:00
Yury Delendik	39e8ad24f7	Creates 'lib' for the 'dist' build target.	2017-03-03 16:37:58 -06:00
Jonas Jenwald	9082f08e37	Enable running the `cmap` unit-tests on Travis by utilizing a `NodeCMapReaderFactory`	2017-02-17 23:15:36 +01:00
Jonas Jenwald	e88c9c75db	Simplify the `FileAttachmentAnnotation` unit-test to avoid having to use the entire API in the test Every other unit-test in `annotation_spec.js` is already only testing the annotation code. Hence it seems unnecessarily convoluted to make use of the API here, when we can (fairly) simply provide the necessary data explicitly as in all the other annotation unit-tests.	2017-01-12 19:10:37 +01:00
Syed Abdullah	857a5da8f1	Fix inverted calculation of RTL text percentage in bidi.	2017-01-12 23:54:06 +08:00
Jonas Jenwald	642d8621ef	Replace direct lookup of `uniquePrefix`/`idCounters`, in `Page` instances, with an `idFactory` containing a `createObjId` method instead We're currently making use of `uniquePrefix`/`idCounters` in multiple files, to create unique object id's, and adding a new occurrence of them requires some care to ensure that an object id isn't accidentally reused. Furthermore, having to pass around multiple parameters as we currently do seems like something you want to avoid. Instead, this patch adds a factory which means that there's only one thing that needs to be passed around. And since it's now only necessary to call a method in order to obtain a unique object id, the details are thus abstracted away at the call-sites which avoids accidental reuse of object id's. To test that this works as expected a very simple `Page` unit-test is added, and the existing `Annotation layer` tests are also adjusted slightly.	2017-01-09 23:16:25 +01:00
Yury Delendik	c45300e06c	Enables some unit tests on travis.	2017-01-09 15:43:45 -06:00

49 Commits