pdf.js

Author	SHA1	Message	Date
Tim van der Meij	c8ee63319d	Merge pull request #9965 from timvandermeij/updates Update translations and packages	2018-08-05 21:35:22 +02:00
Tim van der Meij	f6eaa99cb2	Reword test reporter message The font tests use Jasmine too, so while they are technically unit tests, it's a bit confusing to see `Started unit tests` when the font tests are run on the bots.	2018-08-05 21:21:46 +02:00
Tim van der Meij	0a2ec871b6	Update packages	2018-08-05 21:20:37 +02:00
Tim van der Meij	fa40c068af	Update translations	2018-08-05 21:15:58 +02:00
Tim van der Meij	4111871ac5	Merge pull request #9958 from brendandahl/always-fallback Always fallback to system font on font failure.	2018-08-05 19:58:48 +02:00
Tim van der Meij	eec7e185d9	Merge pull request #9961 from Snuffleupagus/getFontFileType Parse the font file to determine the correct type/subtype, rather than relying on the (often incorrect) data in the font dictionary	2018-08-05 17:30:10 +02:00
Tim van der Meij	27e8a2f6fe	Merge pull request #9959 from brendandahl/test-util Utility script to add a reference test.	2018-08-05 16:53:37 +02:00
Tim van der Meij	b65d0450f5	Merge pull request #9960 from brendandahl/strict-verify Fail when MD5 of test files fails on bots.	2018-08-05 16:44:12 +02:00
Jonas Jenwald	3177f6aa55	Parse the font file to determine the correct type/subtype, rather than relying on the (often incorrect) data in the font dictionary The current font type/subtype detection code is quite inconsistent/unwieldy. In some cases it will simply assume that the font dictionary is correct, in others it will somewhat "arbitrarily" check the actual font file (more of these cases have been added over the years to fix specific bugs). As is evident from e.g. issue 9949, the font type/subtype detection code is continuing to cause issues. In an attempt to get rid of these hacks once and for all, this patch instead re-factors the type/subtype detection to always parse the font file. Please note that, as far as I can tell, we still appear to need to rely on the composite font detection based on the font dictionary. However, even if the composite/non-composite detection would get it wrong, that shouldn't really matter too much given that there's basically only two different code-paths (for "TrueType-like" vs "Type1-like" fonts).	2018-08-05 11:13:16 +02:00
Jonas Jenwald	9bbca04579	Add a (basic) `isCFFFile` helper function to detect CFF font files Compared to most other font formats, the CFF doesn't have a constant header, which makes it slightly more difficult to detect such font files. Please refer to the Compact Font Format specification: https://www.adobe.com/content/dam/acom/en/devnet/font/pdfs/5176.CFF.pdf#G3.32094	2018-08-05 11:13:14 +02:00
Jonas Jenwald	f4db38aadf	Update the TrueType font file detection to also recognize the Mac specific header 'true' Please refer to the TrueType specification: https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6.html#ScalerTypeNote	2018-08-05 10:33:56 +02:00
Brendan Dahl	482ea2af32	Fail when MD5 of test files fails on bots.	2018-08-03 17:48:47 -07:00
Brendan Dahl	8b3ed473c1	Utility script to add a reference test.	2018-08-03 17:24:24 -07:00
Brendan Dahl	5f67a6a237	Always fallback to system font on font failure. The font in the PDF is marked as a CIDFontType0, but the font file is actually a true type font. To fully address this issue we should really peek into the font file and try to determine what it is. However, this is the first case of this issue, so I think this solution is acceptable for now.	2018-08-03 16:49:22 -07:00
Tim van der Meij	444976bcd5	Merge pull request #9956 from brendandahl/allow-zero-progress Allow loaded progress of 0 in unit tests.	2018-08-04 00:19:02 +02:00
Tim van der Meij	f19ee127a3	Merge pull request #9874 from boundlesshq/master [api-minor] Include export value for checkboxes	2018-08-03 23:43:23 +02:00
Tim van der Meij	ee9a5c1269	Merge pull request #9954 from Snuffleupagus/rm-PDFImage-Filter-warn Stop warning for non-Name /Filter entries in the `PDFImage` constructor (PR 9897 follow-up)	2018-08-03 23:21:14 +02:00
Brendan Dahl	d762567bcf	Allow loaded progress of 0 in unit tests.	2018-08-03 10:31:46 -07:00
Jonas Jenwald	a504befc76	Stop warning for non-Name /Filter entries in the `PDFImage` constructor (PR 9897 follow-up) Fixes a stupid oversight on my part, since /Filter may (obviously) contain an Array, which resulted in unnecessary console warning spam in perfectly valid PDF files. Note that it still makes sense to check that /Filter is actually a Name, before attempting to access its `name` property, but the warning should definitely be removed.	2018-08-03 10:23:08 +02:00
Tim van der Meij	8a4be24645	Merge pull request #9948 from Snuffleupagus/url-polyfill-unit-tests Add (basic) unit-tests for the non-global `URL` constructor (PR 9868 follow-up)	2018-08-02 23:32:07 +02:00
Brendan Dahl	e5e96e434f	Merge pull request #9946 from brianholle/CalRGB_Conversion_Fix Removed Extraneous Matrix Check in CalRGB Conversion	2018-08-02 11:24:34 -07:00
Brian	2a665ebad4	Removed Extraneous Matrix Check in CalRGB Conversion	2018-08-02 10:16:42 -07:00
Jonas Jenwald	f8388710e6	Add (basic) unit-tests for the non-global `URL` constructor (PR 9868 follow-up) This should really have been included in PR 9868, since it will help ensure that the `URL` constructor is correctly imported/exported by `src/shared/util.js`.	2018-08-02 10:32:06 +02:00
Tim van der Meij	716acf63d4	Merge pull request #9938 from Snuffleupagus/issue-9915 Ensure that Type0, i.e. composite, OpenType fonts with `CFF ` tables are not treated as CFF fonts if their glyph mapping is non-default (issue 9915)	2018-08-02 00:11:18 +02:00
Rob Wu	20fddef5ba	Merge pull request #9897 from Snuffleupagus/issue-9650 Prefer the Width/Height of the image data, rather than the image dictionary, for JPEG 2000 images (issue 9650)	2018-08-02 00:03:23 +02:00
Jonas Jenwald	3ce420131f	Prefer the Width/Height of the image data, rather than the image dictionary, for JPEG 2000 images (issue 9650) According to the PDF specification, see https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#page=45 > When using the JPXDecode filter with image XObjects, the following changes to and constraints on some entries in the image dictionary shall apply (see 8.9.5, "Image Dictionaries" for details on these entries): > > - Width and Height shall match the corresponding width and height values in the JPEG2000 data. > > - . . . Hence it seems reasonable to use the Width/Height of the image data itself, rather than the image dictionary when there's a mismatch. Given that JPEG 2000 images are already being parsed, in order to obtain basic parameters, the actual Width/Height is readily available in the `PDFImage` constructor.	2018-08-01 16:42:26 +02:00
Jonas Jenwald	17f65908ae	Add more validation of the /Filter entry, in image dictionaries, to the `PDFImage` constructor Given that the code is currently assuming that the /Filter entry is a `Name`, it cannot hurt to actually ensure that's the case. Also fixes an error message, for JPEG 2000 images with unsupported ColorSpaces, since `this.numComps` hasn't been initialized when it's accessed during the `throw new Error()` invocation.	2018-08-01 16:41:15 +02:00
Jonas Jenwald	690bcc8c8a	Add a reduced, `eq`, test-case for issue 9915	2018-07-29 23:06:15 +02:00
Jonas Jenwald	17eac2d48a	Ensure that Type0, i.e. composite, OpenType fonts with `CFF ` tables are not treated as CFF fonts if their glyph mapping is non-default (issue 9915) This particular code-path has been the source of numerous regressions to date, so hopefully this patch won't cause any more of those. Fixes 9915.	2018-07-29 23:06:15 +02:00
Jonas Jenwald	cfdb597e4a	Ensure that the `CIDSystemInfo` strings, in Type0 fonts, are correctly decoded This isn't directly related to the subsequent patch, but just something that I happened to notice while poking around in the font code.	2018-07-29 23:06:15 +02:00
Tim van der Meij	3521424576	Merge pull request #9920 from Snuffleupagus/getMetadata-linearization [api-minor] Add an `IsLinearized` property to the `PDFDocument.documentInfo` getter, to allow accessing the linearization status through the API (via `PDFDocumentProxy.getMetadata`)	2018-07-29 20:23:22 +02:00
Tim van der Meij	f45450bd78	Merge pull request #9931 from Snuffleupagus/refactor-getPage Refactor `getPage` (in the worker), and attempt to use the `Linearization` dictionary to lookup the first Page	2018-07-29 19:33:46 +02:00
Tim van der Meij	a2c317f12b	Merge pull request #9925 from Snuffleupagus/StreamsSequenceStream-maybeLength Attempt to estimate the minimum required `buffer` length when initializing `StreamsSequenceStream` instances	2018-07-29 16:52:34 +02:00
Tim van der Meij	d19e13ee2e	Merge pull request #9936 from Snuffleupagus/BasePreferences-validate Validate the Preferences when fetching them from storage	2018-07-29 16:16:48 +02:00
Tim van der Meij	39846a6de3	Merge pull request #9935 from Snuffleupagus/builtInCMapCache-cleanup-regression [Regression] Convert `Catalog.builtInCMapCache` into a `Map`, instead of an Object, to ensure that it's correctly reset (PR 8064 follow-up)	2018-07-29 16:07:45 +02:00
Jonas Jenwald	ec3728b540	Use the `Linearization` dictionary, if it exists, when fetching the first Page Since PDF.js already supports range requests and streaming, not to mention chunked rendering, attempting to use the `Linearization` dictionary in `PDFDocument.getPage` probably isn't going to improve performance in any noticeable way. Nonetheless, when `Linearization` data is available, it will allow looking up the first Page directly without having to descend into the `Pages` tree to find the correct object.	2018-07-28 22:23:36 +02:00
Jonas Jenwald	fbb25ff4e2	Move `getPage`, on the worker side, from `Catalog` and into `PDFDocument` instead Addresses an existing TODO, and avoids having to pass in a `pageFactory` when creating `Catalog` instances.	2018-07-28 22:23:36 +02:00
Jonas Jenwald	81b471c781	[Regression] Convert `Catalog.builtInCMapCache` into a `Map`, instead of an Object, to ensure that it's correctly reset (PR 8064 follow-up) With the `builtInCMapCache` being a simple Object, it unfortunately means that the `Catalog.cleanup` method isn't resetting it as intended. By just replacing the `builtInCMapCache` with an empty Object, existing references to it will not actually be updated. The result is that e.g. `Page` instances still keep references to CMap data that should have been removed. To fix these problems, the `builtInCMapCache` is converted into a `Map` instead (since it can be easily reset).	2018-07-28 22:20:43 +02:00
Jonas Jenwald	08b05b9fda	Validate the Preferences when fetching them from storage When updating Preferences using the `set` method, the input is carefully validated. However, no validation is (currently) done when a `BasePreferences` instance is created, which probably isn't that great. Hence this patch that simply ignores, to not unnecessarily break loading of the viewer itself, any invalid Preferences.	2018-07-28 14:32:24 +02:00
Jonas Jenwald	780cbadcd7	Stop re-loading the Preferences in `PDFViewerApplication.open`, and remove the `BasePreferences.reload` method Given that the various Preferences are currently, and have been for quite some time, only used when initializing `PDFViewerApplication`, re-loading them when a new PDF file is opened in the viewer is essentially a no-op. Furthermore, with the only usage of `BasePreferences.reload` now gone, the value of that method seems questionable at best. In the event that the functionality is actually needed again, similar to the `ViewHistory`, it'd probably make more sense to simply replace `PDFViewerApplication.preferences` with a new `BasePreferences` instance instead (using e.g. `DefaultExternalServices.createPreferences`).	2018-07-28 13:50:16 +02:00
bion	c31ddf7edc	[api-minor] Include export value for checkboxes	2018-07-28 00:30:41 -07:00
Tim van der Meij	d6f378fbaf	Merge pull request #9933 from perlun/patch-1 README.md: suggest usage of https instead of git protocol	2018-07-28 00:09:24 +02:00
Per Lundberg	82f1d3c82a	README.md: suggest usage of https instead of git protocol The `git` protocol is unencrypted, which means other parties could potentially eavesdrop on your traffic. `https` or `ssh` is often encouraged because of this. (For example, the Ruby package manager `bundler` prints a warning when `git` sources are being used.)	2018-07-27 23:26:59 +03:00
Jonas Jenwald	522040d130	Expose the Linearization status in the document properties dialog This uses the same terminology, i.e. "Fast Web View", as is used by Adobe software.	2018-07-26 17:30:46 +02:00
Jonas Jenwald	928b89382e	[api-minor] Add an `IsLinearized` property to the `PDFDocument.documentInfo` getter, to allow accessing the linearization status through the API (via `PDFDocumentProxy.getMetadata`) There was a (somewhat) recent question on IRC about accessing the linearization status of a PDF document, and this patch contains a simple way to expose that through already existing API methods. Please note that during setup/parsing in `PDFDocument` the linearization data is already being fetched and parsed, provided of course that it exists. Hence this patch will not cause any additional data to be loaded.	2018-07-26 15:54:19 +02:00
Jonas Jenwald	8a4466139b	Simplify the `DocumentInfoValidators` definition With this file now being a proper (ES6) module, it's no longer (technically) necessary for this structure to be lazily initialized. Considering its size, and simplicity, I therefore cannot see the harm in letting `DocumentInfoValidators` just be a simple Object instead. While I'm not aware of any bugs caused by the current code, it cannot hurt to add an `isDict` check in `PDFDocument.documentInfo` (since the current code assumes that `infoDict` being defined implies it also being a Dictionary). Finally, the patch also converts a couple of `var` to `let`/`const`.	2018-07-26 15:54:01 +02:00
Jonas Jenwald	2d51bce941	Remove unnecessary `stream.length` check from `PDFDocument.linearization` Note first of all that `PDFDocument` will be initialized with either a `Stream` or a `ChunkedStream`, and that both of these have `length` getters. Secondly, the `PDFDocument` constructor will assert that the `stream` has a non-zero (and positive) length. Hence there's no point in checking `stream.length` in the `linearization` getter.	2018-07-26 15:54:01 +02:00
Yury Delendik	51b0e60f9b	Merge pull request #9924 from ErikNijland/master fix(browser): zlib is not available in browser	2018-07-26 08:35:52 -05:00
Jonas Jenwald	32bfa55d98	Attempt to estimate the minimum required `buffer` length when initializing `StreamsSequenceStream` instances For most other `DecodeStream` based streams, we'll attempt to estimate the minimum `buffer` length based on the raw stream data. The purpose of this is to avoid having to unnecessarily re-size the `buffer`, thus reducing the number of intermediate allocations necessary when decoding the stream data. However, currently no such optimization is attempted for `StreamsSequenceStream`, and given that they can often be quite large that seems unfortunate. To improve this, at least somewhat, this patch utilizes the raw sizes of the `StreamsSequenceStream` sub-streams to estimate the minimum required `buffer` length. Most likely this patch won't have a huge effect on memory consumption, however for pathological cases it should help reduce peak memory usage slightly. One example is the PDF file in issue 2813, where currently the `StreamsSequenceStream` instances would grow their `buffer`s as `2 MiB -> 4 MiB -> 8 MiB -> 16 MiB -> 32 MiB`. With this patch, the same stream `buffer`s grow as `8 MiB -> 16 MiB -> 32 MiB`, thus avoiding a total of `12 MiB` of intermediate allocations (since there's two `StreamsSequenceStream` used, for rendering/text-extraction).	2018-07-26 13:42:59 +02:00
Erik Nijland	26c734e493	fix(browser): zlib is not available in browser	2018-07-26 12:01:10 +02:00
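
The `builtInCMapCache` regression described in 81b471c781 comes down to ordinary JavaScript reference semantics: assigning a fresh Object to a property does not touch the old Object that other code still holds, whereas `Map.prototype.clear` empties the shared instance in place. The sketch below illustrates only that difference. The class names, the `cache` property, and the `"Identity-H"` key are hypothetical stand-ins chosen for the example, not the actual pdf.js code.

```javascript
// Hypothetical stand-ins for illustration; these are not pdf.js classes.
class ObjectCacheOwner {
  constructor() {
    this.cache = Object.create(null);
  }
  cleanup() {
    // Rebinds the property only; earlier references keep the old object.
    this.cache = Object.create(null);
  }
}

class MapCacheOwner {
  constructor() {
    this.cache = new Map();
  }
  cleanup() {
    // Empties the single shared Map in place.
    this.cache.clear();
  }
}

// A consumer grabs a reference to the cache before cleanup runs.
const objectOwner = new ObjectCacheOwner();
const staleObjectRef = objectOwner.cache;
staleObjectRef["Identity-H"] = "cmap data";
objectOwner.cleanup();

const mapOwner = new MapCacheOwner();
const sharedMapRef = mapOwner.cache;
sharedMapRef.set("Identity-H", "cmap data");
mapOwner.cleanup();

// The Object-based consumer still sees the old entry; the Map-based one does not.
const objectEntriesLeft = Object.keys(staleObjectRef).length;
const mapEntriesLeft = sharedMapRef.size;
```

After both `cleanup` calls, the consumer holding the Object reference still sees its entry, while the consumer holding the Map reference sees an empty cache.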

10974 Commits
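
Commits f4db38aadf and 9bbca04579 above both concern detecting a font format from the file's leading bytes. As a rough sketch of that idea, and not the helpers those commits add, the code below checks the two TrueType scaler types given in the Apple reference (`0x00010000` and the Mac-specific tag `true`) and applies a plausibility check to the four-byte CFF header (major, minor, hdrSize, offSize), which has no constant signature. The function names, the plain byte-array input, and the exact checks are assumptions made for illustration.

```javascript
// Hypothetical helpers for illustration; names, signatures, and checks are
// assumptions, not the functions added in the commits listed above.

// TrueType files start with the scaler type 0x00010000 or the tag 'true'.
function looksLikeTrueType(bytes) {
  if (bytes.length < 4) {
    return false;
  }
  const isVersion1 =
    bytes[0] === 0x00 && bytes[1] === 0x01 &&
    bytes[2] === 0x00 && bytes[3] === 0x00;
  const tag = String.fromCharCode(bytes[0], bytes[1], bytes[2], bytes[3]);
  return isVersion1 || tag === "true";
}

// The CFF header is: major, minor, hdrSize, offSize. There is no magic
// number, so this only checks that the fields are plausible.
function looksLikeCFF(bytes) {
  if (bytes.length < 4) {
    return false;
  }
  const major = bytes[0];
  const offSize = bytes[3];
  // offSize is defined by the CFF specification to be in the range 1..4.
  return major >= 1 && offSize >= 1 && offSize <= 4;
}

// Example inputs: a version-1 TrueType header, the Mac 'true' tag,
// and a plausible CFF header (major 1, minor 0, hdrSize 4, offSize 1).
const trueTypeV1 = new Uint8Array([0x00, 0x01, 0x00, 0x00]);
const trueTypeMac = new Uint8Array([0x74, 0x72, 0x75, 0x65]);
const cffHeader = new Uint8Array([0x01, 0x00, 0x04, 0x01]);
```

Because the CFF check is loose by necessity, a detector built along these lines would presumably try the formats with fixed signatures first and fall back to the CFF check last.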