pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	13e44c0776	Attempt to reduce intermittent failures in the "multiple render() on the same canvas" unit-test This patch should hopefully remove the intermittent unit-test failure, by using the same `optionalContentConfigPromise` for both `renderTask`s and thus get more predictable timing behaviour.	2020-08-04 22:31:24 +02:00
Brendan Dahl	ac494a2278	Add support for optional marked content. Add a new method to the API to get the optional content configuration. Add a new render task param that accepts the above configuration. For now, the optional content is not controllable by the user in the viewer, but renders with the default configuration in the PDF. All of the test files added exhibit different uses of optional content. Fixes #269. Fix test to work with optional content. - Change the stopAtErrors test to ensure the operator list has something, instead of asserting the exact number of operators.	2020-08-04 09:26:55 -07:00
Tim van der Meij	00a8b42e67	Merge pull request #12102 from ineiti/add_types_annotations Add types annotations	2020-08-02 16:45:37 +02:00
Tim van der Meij	5a66c56eca	Merge pull request #12108 from calixteman/radio Add support for radios printing	2020-08-02 14:47:46 +02:00
Tim van der Meij	b789a0e216	Log the total number of tests and the random seed in the test runner This might make debugging intermittent failures a bit easier in the future because it allows us to spot unexpected differences in the number of tests being run and allows us to run the tests locally in the same order in case of intermittent failures.	2020-08-01 21:09:01 +02:00
Tim van der Meij	662ac5548f	Log suite start failures in the test runner	2020-08-01 21:02:20 +02:00
Tim van der Meij	c19d76f9b8	Use a `for...of` loop in the `specDone` handler in the test reporter Moreover, remove a left-over reference to `test.py` since that was ported to JavaScript a long time ago.	2020-08-01 20:50:30 +02:00
Jonas Jenwald	05baa4c89f	Revert "[api-minor] Allow loading pdf fonts into another document."	2020-08-01 12:52:39 +02:00
Tim van der Meij	173b92a873	Merge pull request #12131 from jsg2021/issue-8271 [api-minor] Allow loading pdf fonts into another document.	2020-08-01 01:13:41 +02:00
Jonathan Grimes	9b16b8ef71	Allow loading pdf fonts into another document.	2020-07-31 11:41:48 -05:00
Jonas Jenwald	346afd1e1c	[api-minor] Fix the `AnnotationStorage` usage properly in the viewer/tests (PR 12107 and 12143 follow-up) The [api-minor] label probably ought to have been added to the original PR, given the changes to the `createAnnotationLayerBuilder` signature (if nothing else). This patch fixes the following things: - Let the `AnnotationLayer.render` method create an `AnnotationStorage`-instance if none was provided, thus making the parameter properly optional. This not only fixes the reference tests, it also prevents issues when the viewer components are used. - Stop exporting `AnnotationStorage` in the official API, i.e. the `src/pdf.js` file, since it's no longer necessary given the change above. Generally speaking, unless absolutely necessary we probably shouldn't export unused things in the API. - Fix a number of JSDocs `typedef`s, in `src/display/` and `web/` code, to actually account for the new `annotationStorage` parameter. - Update `web/interfaces.js` to account for the changes in `createAnnotationLayerBuilder`. - Initialize the storage, in `AnnotationStorage`, using `Object.create(null)` rather than `{}` (which is the PDF.js default).	2020-07-31 16:32:46 +02:00
Calixte Denizet	f22e702ecc	Amend test for checkboxes printing to test the unchecked appearance	2020-07-31 14:39:11 +02:00
Calixte Denizet	538017f7a7	Add support for radios printing	2020-07-31 14:31:49 +02:00
Aki Sasaki	7bb65bab7f	fix reftests after #12107 The f1040-annotations reftest started hanging after #12107. We traced this to `TypeError: can't access property "getOrCreateValue", storage is undefined`. We essentially need to add `annotationStorage` to the parameters in test/driver.js.	2020-07-30 12:25:27 -07:00
Linus Gasser	f1bbfdc16d	Add typescript definitions This PR adds typescript definitions from the JSDoc already present. It adds a new gulp-target 'types' that calls 'tsc', the typescript compiler, to create the definitions. To use the definitions, users can simply do the following: ``` import {getDocument, GlobalWorkerOptions} from "pdfjs-dist"; import pdfjsWorker from "pdfjs-dist/build/pdf.worker.entry"; GlobalWorkerOptions.workerSrc = pdfjsWorker; const pdf = await getDocument("file:///some.pdf").promise; ``` Co-authored-by: @oBusk Co-authored-by: @tamuratak	2020-07-30 11:10:37 +02:00
Tim van der Meij	eb4d6a0652	Merge pull request #12107 from calixteman/checkbox Add support for checkboxes printing	2020-07-30 00:11:41 +02:00
Calixte Denizet	cb60523a15	Add support for checkboxes printing	2020-07-29 16:42:57 +02:00
Tim van der Meij	65e76a3c6b	Fix a bug in the temporary folder check in the test runner The `noPrompt` option doesn't exist and should be `noPrompts`.	2020-07-26 20:41:19 +02:00
Tim van der Meij	01e2610cf4	Merge pull request #12126 from Snuffleupagus/unittest-shall_fail_cleanup Attempt to reduce intermittent failures in the "cleans up document resources during rendering of page" unit-test	2020-07-26 14:33:12 +02:00
Jonas Jenwald	86a8fd9810	Attempt to reduce intermittent failures in the "cleans up document resources during rendering of page" unit-test This patch should hopefully remove the `Unhandled promise rejection: ...` errors, by returning the "final" promise. Also, by pausing/delaying rendering slightly, the likelihood of the test failing in the first place should be reduced.	2020-07-26 14:05:46 +02:00
Jonas Jenwald	e4ad91be05	Include the browser name when printing unit-test results This uses a similar format to the reference-test logging, and will help determine in exactly which browser the failure occurred (since the tests run concurrently).	2020-07-26 12:54:16 +02:00
Calixte Denizet	584902dbf8	Add an annotation storage in order to save annotation data in acroforms	2020-07-24 10:50:11 +02:00
Jonas Jenwald	ea8e432c45	Add a `getRawValues` method, to `Dict` instances, to provide an easier way of getting all raw values When the old `Dict.getAll()` method was removed, it was replaced with a `Dict.getKeys()` call and `Dict.get(...)` calls (in a loop). While this pattern obviously makes a lot of sense in many cases, there's some instances where we actually want the raw `Dict` values (i.e. `Ref`s where applicable). In those cases, `Dict.getRaw(...)` calls are instead used within the loop. However, by introducing a new `Dict.getRawValues()` method we can reduce the number of (strictly unnecessary) function calls by simply getting the raw `Dict` values directly.	2020-07-17 16:32:00 +02:00
Jonas Jenwald	6381b5b08f	Add a `size` getter, to `Dict` instances, to provide an easier way of checking the number of entries This removes the need to manually call `Dict.getKeys()` and check its length.	2020-07-17 16:06:11 +02:00
Tim van der Meij	e63d1ebff5	Merge pull request #12087 from Snuffleupagus/LocalGStateCache Add local caching of "simple" Graphics State (ExtGState) data in `PartialEvaluator.{getOperatorList, getTextContent}` (issue 2813)	2020-07-17 16:02:45 +02:00
Tim van der Meij	29adbb7cd7	Implement unit tests for the `RefSetCache` primitive This primitive did not have unit test coverage yet, which is important for upcoming refactoring of the primitive.	2020-07-17 13:35:29 +02:00
Jonas Jenwald	90eb579713	Add local caching of "simple" Graphics State (ExtGState) data in `PartialEvaluator.getOperatorList` (issue 2813) This patch will help pathological cases the most, with issue 2813 being a particularly problematic example. While there are only four `/ExtGState` resources, there's a total of `29062` `setGState` operators. Even though parsing of a single `/ExtGState` resource is quite fast, having to re-parse them thousands of times does add up quite significantly. For simplicity we'll only cache "simple" `/ExtGState` resources, since e.g. the general `SMask` case cannot be easily cached (without re-factoring other code, which may have undesirable effects on general parsing). By caching "simple" `/ExtGState` resources, we thus improve performance by: - Not having to fetch/validate/parse the same `/ExtGState` data over and over. - Handling of repeated `setGState` operators becomes synchronous during the `OperatorList` building, instead of having to defer to the event-loop/microtask-queue since the `/ExtGState` parsing is done asynchronously. --- Obviously I had intended to include (standard) benchmark results with this patch, but for reasons I don't understand the test run-time (even with `master`) of the document in issue 2813 is a lot slower than in the development viewer (making normal benchmarking infeasible). However, testing this manually in the development viewer (using `pdfBug=Stats`) shows a reduction of `~10 %` in the rendering time of the PDF document in issue 2813.	2020-07-14 10:34:43 +02:00
Jonas Jenwald	d4d7ac1b88	Stop special-casing the (very unlikely) "no `/XObject` found"-scenario, when parsing `OPS.paintXObject` operators, in `PartialEvaluator.{getOperatorList, getTextContent}` Originally there weren't any (generally) good ways to handle errors gracefully, on the worker-side, however that's no longer the case and we can simply fall back to the existing `ignoreErrors` functionality instead. Also, please note that the "no `/XObject` found"-scenario should be extremely unlikely in practice and would only occur in corrupt/broken documents. Note that the `PartialEvaluator.getOperatorList` case is especially bad currently, since we'll simply (attempt to) send the data as-is to the main-thread. This is quite bad, since in a corrupt/broken document the data could contain anything and e.g. be unclonable (which would cause breaking errors). Also, we're (obviously) not attempting to do anything with this "raw" `OPS.paintXObject` data on the main-thread and simply ensuring that we never send it definitely seems like the correct approach.	2020-07-12 21:59:59 +02:00
Tim van der Meij	7dabc5ecc8	Merge pull request #12063 from Snuffleupagus/issue-10989 Tweak the heuristic, in `src/core/jpg.js`, that handles JPEG images with a wildly incorrect SOF (Start of Frame) `scanLines` parameter (issue 10989)	2020-07-11 00:05:11 +02:00
Jonas Jenwald	d18cf47419	Remove the special handling, used when creating Indexed ColorSpaces, for the case where the `lookup`-data is a `Stream` This special-case was added in PR 1992, however it became unnecessary with the changes in PR 4824 since all of the ColorSpace parsing is now done on the worker-thread (with only RGB-data being sent to the main-thread).	2020-07-10 17:22:55 +02:00
Jonas Jenwald	4cc6797f17	Re-factor the `idFactory` functionality, used in the `core/`-code, and move the `fontID` generation into it Note how the `getFontID`-method in `src/core/fonts.js` is completely global, rather than properly tied to the current document. This means that if you repeatedly open and parse/render, and then close, even the same PDF document the `fontID`s will still be incremented continuously. For comparison the `createObjId` method, on `idFactory`, will always create a consistent id, assuming of course that the document and its pages are parsed/rendered in the same order. In order to address this inconsistency, it thus seems reasonable to add a new `createFontId` method on the `idFactory` and use that when obtaining `fontID`s. (When the current `getFontID` method was added the `idFactory` didn't actually exist yet, which explains why the code looks the way it does.) Please note: Since the document id is (still) part of the `loadedName`, it's thus not possible for different documents to have identical font names.	2020-07-07 16:33:31 +02:00
Jonas Jenwald	1d66fce781	Tweak the heuristic, in `src/core/jpg.js`, that handles JPEG images with a wildly incorrect SOF (Start of Frame) `scanLines` parameter (issue 10989)	2020-07-06 13:06:49 +02:00
Jonas Jenwald	4a7e29865d	[api-minor] Use the `NodeCanvasFactory`/`NodeCMapReaderFactory` classes as defaults in Node.js environments (issue 11900) This moves, and slightly simplifies, code that's currently residing in the unit-test utils into the actual library, such that it's bundled with `GENERIC`-builds and used in e.g. the API-code. As an added bonus, this also brings out-of-the-box support for CMaps in e.g. the Node.js examples.	2020-07-02 04:44:23 +02:00
Brendan Dahl	fe3df495cc	Merge pull request #12040 from wojtekmaj/replace-non-inclusive Replace non-inclusive "whitelist" term with "allowlist"	2020-07-01 15:41:41 -07:00
Jonas Jenwald	fef24658e7	Adjust the heuristics used when dealing with rectangles, i.e. `re` operators, with zero width/height (issue 12010)	2020-07-02 00:02:49 +02:00
Tim van der Meij	75fed02630	Merge pull request #12043 from Snuffleupagus/issue-4260-test Add a reduced test-case for issue 4260 (PR 4521 follow-up)	2020-07-01 23:51:21 +02:00
Jonas Jenwald	e451cabe37	Add a reduced test-case for issue 4260 (PR 4521 follow-up)	2020-06-30 09:26:41 +02:00
Wojciech Maj	78970bbbe1	Replace non-inclusive "whitelist" term with "allowlist"	2020-06-29 17:15:14 +02:00
Jonas Jenwald	4a5b68e077	Add at least some test-coverage for the `RenderTask.onContinue` functionality The default viewer, and thus Firefox, depends on the `RenderTask.onContinue` functionality to pause/continue rendering (such that the most visible page always renders first). Despite this functionality thus being very important, it has however never actually been tested at all as far as I can tell. Hence this patch which adds a new boolean `renderTaskOnContinue` parameter (`false` by default), that can be used to force a reference-test to use the `RenderTask.onContinue` code-path in the `InternalRenderTask` class. Note that I purposely made this new reference-test behaviour optional, since I didn't want to negatively affect the general runtime of the tests (given that there's a slight delay added to the rendering). Also, for e.g. benchmarking you'd most likely want to stay away from the `RenderTask.onContinue` functionality for similar reasons.	2020-06-29 00:38:34 +02:00
Jonas Jenwald	28d2ada59c	Attempt to detect inline images which contain "EI" sequence in the actual image data (issue 11124) This should reduce the possibility of accidentally truncating some inline images, while not causing the "EI" detection to become significantly slower.[1] There's obviously a possibility that these added checks are not sufficient to catch every single case of "EI" sequences within the actual inline image data, but without specific test-cases I decided against over-engineering the solution here. Please note: The interpolation issues are somewhat orthogonal to the main issue here, which is the truncated image, and it's already tracked elsewhere. --- [1] I've looked at the issue a few times, and this is the first approach that I was able to come up with that didn't cause unacceptable performance regressions in e.g. issue 2618.	2020-06-26 13:15:06 +02:00
Jonas Jenwald	19d7976483	Improve (local) caching of parsed `ColorSpace`s (PR 12001 follow-up) This patch contains the following notable improvements: - Changes the `ColorSpace.parse` call-sites to, where possible, pass in a reference rather than actual ColorSpace data (necessary for the next point). - Adds (local) caching of `ColorSpace`s by `Ref`, when applicable, in addition to the caching by name. This (generally) improves `ColorSpace` caching for e.g. the SMask code-paths. - Extends the (local) `ColorSpace` caching to also apply when handling Images and Patterns, thus further reducing unneeded re-parsing. - Adds a new `ColorSpace.parseAsync` method, almost identical to the existing `ColorSpace.parse` one, but returning a Promise instead (this simplifies some code in the `PartialEvaluator`).	2020-06-24 23:53:10 +02:00
Jonas Jenwald	e22bc483a5	Re-factor `ColorSpace.parse` to take a parameter object, rather than a bunch of (randomly) ordered parameters Given the number of existing parameters, this will avoid needlessly unwieldy call-sites especially with upcoming changes in later patches.	2020-06-24 23:53:10 +02:00
Jonas Jenwald	e18fa3fc45	Tweak the `QueueOptimizer` to recognize `OPS.paintImageMaskXObject` operators as repeated when the "skew" transformation matrix elements are non-zero (issue 8078) First of all, I should mention that my understanding of the finer details of the `QueueOptimizer` (and its related `CanvasGraphics` methods) is somewhat limited. Hence I'm not sure if there's actually a very good reason for only considering ImageMasks where the "skew" transformation matrix elements are zero as repeated, however simply looking at the code I just don't see why these elements cannot be non-zero as long as they are all identical for the ImageMasks. Furthermore, looking at the group case (which is what we're currently falling back to), there's no particular limitation placed upon the transformation matrix elements. While this patch obviously isn't enough to completely fix the issue, since there should be a visible Pattern rendered as well[1], it seems (at least to me) like enough of an improvement that submitting this is justified. With these changes the referenced PDF document will no longer hang the entire browser, and rendering also finishes in a reasonable time (< 10 seconds for me) which seems fine given the huge number of identical inline images present.[2] --- [1] Temporarily changing the Pattern to a solid color does render the correct/expected area, which suggests that the remaining problem is a pre-existing issue related to the Pattern-handling itself rather than the `QueueOptimizer` functionality. [2] The document isn't exactly rendered immediately in e.g. Adobe Reader either.	2020-06-20 12:18:48 +02:00
Jonas Jenwald	4b51bcc733	Ensure that `PDFImage.buildImage` won't accidentally swallow errors, e.g. from ColorSpace parsing (issue 6707, PR 11601 follow-up) Because of a really stupid `Promise`-related mistake on my part, when re-factoring `PDFImage.buildImage` during the `NativeImageDecoder` removal, we're no longer re-throwing errors occurring during image parsing/decoding as intended. The result is that some (fairly) corrupt documents will never finish loading, and unfortunately there were apparently no sufficiently corrupt images in the test-suite to catch this.	2020-06-13 15:02:37 +02:00
Jonas Jenwald	88fdb482b0	Move the `isEmptyObj` helper function from `src/shared/util.js` to `test/unit/test_utils.js` Since this helper function is no longer used anywhere in the main code-base, but only in a couple of unit-tests, it's thus being moved to a more appropriate spot. Finally, the implementation of `isEmptyObj` is also tweaked slightly by removing the manual loop.	2020-06-09 17:50:16 +02:00
Tim van der Meij	550a38f1ba	Improve unit test coverage for primitives This commit includes unit tests for: - `isEOF` - `isStream` - `Ref`'s string representation and caching - `Dict`'s XRef assignment	2020-06-07 17:31:40 +02:00
Tim van der Meij	2bd0690fdd	Convert `var` to `const`/`let` in `test/unit_primitives_spec.js`	2020-06-07 15:04:24 +02:00
Carlos Rodríguez	802aa14a99	JPEGs encoded as RGB (instead of YCbCr) write the component indices as "RGB" in ASCII to signal this In ISO/IEC 10918-6:2013 (E), section 6.1: (http://www.itu.int/rec/T-REC-T.872-201206-I/en) "Images encoded with three components are assumed to be RGB data encoded as YCbCr unless the image contains an APP14 marker segment as specified in 6.5.3, in which case the colour encoding is considered either RGB or YCbCr according to the application data of the APP14 marker segment" But common JPEG libraries also treat the image as RGB if the component indices are ASCII R (0x52), G (0x47) and B (0x42): https://stackoverflow.com/questions/50798014/determining-color-space-for-jpeg/50861048 Issue #11931	2020-06-04 15:08:47 +02:00
Tim van der Meij	3b615e4ca3	Merge pull request #11601 from Snuffleupagus/rm-nativeImageDecoderSupport [api-minor] Decode all JPEG images with the built-in PDF.js decoder in `src/core/jpg.js`	2020-05-23 15:33:46 +02:00
Jonas Jenwald	56ebf01ae0	Avoid hanging the worker-thread for CMap data with ridiculously large ranges (issue 11922) This patch was inspired by `ad2b64f124/xpdf/CharCodeToUnicode.cc (L480-L484)`	2020-05-22 15:23:17 +02:00

2036 Commits