pdf.js

Author	SHA1	Message	Date
Tim van der Meij	5194e68134	Lint: correct code style violations Manual observations and working with other linting tools found these.	2016-11-01 15:04:21 +01:00
Jonas Jenwald	d284cfd5eb	[api-minor] Add support for relative URLs, in both annotations and the outline, by adding a `docBaseUrl` parameter to `PDFJS.getDocument` (bug 766086) Note that in `FIREFOX/MOZCENTRAL/CHROME` builds of the standard viewer the `docBaseUrl` parameter will be set by default, since in that case it makes sense to use the current URL as a base. For the `GENERIC` viewer, or the API itself, it doesn't make sense to try and set the `docBaseUrl` by default. However, custom deployments/implementations may still find the parameter useful.	2016-10-19 22:20:24 +02:00
Jonas Jenwald	71a781ee5c	Deprecate the `isValidUrl` utility function and replace it with `createValidAbsoluteUrl`/`isValidProtocol` functions instead, since the main URL validation is now done using the `new URL` constructor	2016-10-19 22:11:22 +02:00
Jonas Jenwald	42f07c6262	[api-minor] Use the `new URL` constructor when validating URLs in annotations and the outline, as a complement to only checking the protocol, and add a bit more validation to `Catalog_parseDestDictionary` Note that this will automatically reject any relative URL. To make the API more useful to consumers, URLs that are rejected will be available via the `unsafeUrl` property in the data object returned by `PDFPageProxy_getAnnotations`. The patch also adds a bit more validation of the data for `Named` actions.	2016-10-19 22:11:17 +02:00
Jonas Jenwald	e64bc1fd13	Move parsing of destination dictionaries to a helper function This not only reduces code duplication, but it also allows us to easily support the same kind of URLs we currently do for Link annotations in the Outline as well.	2016-10-18 16:14:07 +02:00
Tim van der Meij	2e20000b71	Merge pull request #7727 from Snuffleupagus/parser-stream-decodeParms Let `Parser_makeFilter` pass in the `DecodeParms` data to various image `Stream`s, instead of re-fetching it in various `[...]Stream.prototype.ensureBuffer` methods	2016-10-15 20:04:17 +02:00
Yury Delendik	ea5949f1fd	Merge pull request #7668 from Snuffleupagus/issue-7665 Prevent an infinite loop in `XRef_fetchUncompressed` for encrypted PDF files with indirect objects in the /Encrypt dictionary (issue 7665)	2016-10-15 10:52:08 -05:00
Jonas Jenwald	c8f83d6487	Let `Parser_makeFilter` pass in the `DecodeParms` data to various image `Stream`s, instead of re-fetching it in various `[...]Stream.prototype.ensureBuffer` methods In `Parser_filter` the `DecodeParms` data is fetched and passed to `Parser_makeFilter`, where we also make sure that a `Ref` is resolved to a direct object. We can thus pass this along to the various image `Stream` constructors, to avoid the current situation where we look up/resolve data that is already available. Note also that we currently do not handle the case where `DecodeParms` is an Array entirely correctly in the various image `Stream`s, and this patch fixes that for free.	2016-10-15 12:09:51 +02:00
Jonas Jenwald	1da59bec9b	Remove a remaining old-style preprocessor from `src/core/fonts.js` (PR 7322 follow-up) Note that this code was added after PR 7322 was opened, which thus explains why it was missed during rebasing.	2016-10-15 11:33:09 +02:00
Yury Delendik	0576c9c6c6	Replaces all preprocessor directives with PDFJSDev calls.	2016-10-14 10:57:53 -05:00
Chas Emerick	85c52f1fd6	Fix getTextContent evaluation to only apply TJ horizontal offsets using numeric items/args While the array argument to TJ should only contain strings and numbers, other unfortunate items are found in PDFs in the wild, e.g.: [(Grandes) 0.0 Tc -250.0 (Client\350les,) 0.0 Tc -250.0 (Financements) 0.0 Tc -250.0 (et) 0.0 Tc -250.0 (March\351s) ] TJ getOperatorList already properly ignores any non-string, non-numeric values in TJ arrays; without this patch to getTextContent, returned text items can have NaN widths due to calculations being applied to those non-numeric values.	2016-10-13 08:08:31 -04:00
Tim van der Meij	9b3a91f365	Merge pull request #7671 from timvandermeij/interactive-forms-choice-fields Interactive forms: render choice widget annotations	2016-10-05 23:27:45 +02:00
Tim van der Meij	d5d9f362aa	Choice widget annotations: core and display layer implementation	2016-10-05 21:25:29 +02:00
Yury Delendik	7b2a9ee4e0	Merge pull request #7670 from Snuffleupagus/Parser_makeFilter-maybeLength Only skip parsing a stream in `Parser_makeFilter` when we know for sure that it is empty (PR 6372 follow-up)	2016-10-05 10:38:12 -05:00
Jonas Jenwald	54ee83eb12	Attempt to skip zero bytes at the end of Scan blocks when decoding JPEG images (issue 4090)	2016-09-28 16:31:02 +02:00
Jonas Jenwald	116ba19dd9	Respect the 'ColorTransform' entry in the image dictionary when decoding JPEG images (bug 956965, issue 6574) Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=956965. Fixes 6574.	2016-09-26 21:55:43 +02:00
Jonas Jenwald	a22f0ae820	Only skip parsing a stream in `Parser_makeFilter` when we know for sure that it is empty (PR 6372 follow-up) For PDF files with multiple `/Filter`s, where the `/Length` entry is zero, we fail to render the file correctly. The reason is that `maybeLength` is `null` for every filter except the first, and `!maybeLength` is thus truthy. Hence it seems that we should completely ignore the `/Length` entry and also explicitly check `maybeLength === 0`. Note that I've not (yet) come across a PDF file with this issue in the wild, but given all the stupid things PDF generators do I wouldn't be surprised if such a file actually exists. In order to prevent a possible future bug, I'm submitting this patch which includes a hand-edited PDF file that we currently cannot render correctly (but e.g. Adobe Reader can).	2016-09-25 12:40:15 +02:00
Jonas Jenwald	3e77cf6b32	Prevent an infinite loop in `XRef_fetchUncompressed` for encrypted PDF files with indirect objects in the /Encrypt dictionary (issue 7665)	2016-09-25 00:18:47 +02:00
Jonas Jenwald	6c263c1994	Merge pull request #7649 from timvandermeij/interactive-forms-tx-comb Text widget annotations: implement comb support	2016-09-22 11:36:30 +02:00
Tim van der Meij	375229d6b9	Widget annotations: simplify field flag handling Directly use the hexadecimal representation, just like the `AnnotationFlags`, to avoid calculations and to improve readability. This allows us to simplify the unit tests for text widget annotations as well.	2016-09-21 21:11:37 +02:00
Jonas Jenwald	ded01356c7	Pass in the `renderInteractiveForms` parameter to `Annotation_appendToOperatorList`, in `Page_getOperatorList`, instead of to the `Annotation` constructor (PR 7633 follow-up) When debugging issue 7643, I noticed that the `forms` tests currently don't look like the rendering in the viewer (with `renderInteractiveForms = true` set). After scratching my head for a little while, I realized that PR 7633 makes the implicit assumption that `Page_getOperatorList` (in `core/document.js`) is called before fetching the annotation with `PDFPageProxy_getAnnotations` (in `display/api.js`). Hence this patch, which changes it so that we instead pass in the `renderInteractiveForms` parameter to `Annotation_appendToOperatorList` to ensure that it's always correctly set.	2016-09-21 12:21:20 +02:00
Tim van der Meij	6100ab4b18	Text widget annotations: implement comb support	2016-09-20 22:31:10 +02:00
Brendan Dahl	15e1ae4e3f	Merge pull request #7639 from Snuffleupagus/bug-1252420 Replace empty CharStrings with '.notdef' in `Type1Font_wrap` to prevent OTS from rejecting the font (bug 1252420)	2016-09-20 11:56:47 -07:00
Tim van der Meij	f062695d62	Merge pull request #7633 from timvandermeij/interactive-forms-tx-flags Text widget annotations: support read-only/multiline fields and improve testing	2016-09-17 17:19:47 +02:00
Tim van der Meij	dbea302a6e	Text widget annotations: do not render on canvas as well If interactive forms are enabled, then the display layer takes care of rendering the form elements. There is no need to draw them on the canvas as well. This also leads to issues when values are prefilled, because the text fields are transparent, so the contents that have been rendered onto the canvas will be visible too. We address this issue by passing the `renderInteractiveForms` parameter to the render task and handling it when the page is rendered (i.e., when the canvas is rendered).	2016-09-17 15:24:48 +02:00
Tim van der Meij	adf0972ca5	Text widget annotations: improve unit and reference tests This patch improves the unit tests by testing the support for read-only and multiline fields. Moreover, we add a reference test to ensure that the text widgets are not only rendered, but also that their contents are styled properly. Finally, we perform minor improvements in `src/core/annotation.js`, for example adding missing comments.	2016-09-17 15:24:48 +02:00
Tim van der Meij	f6965fadc0	Text widget annotations: support multiline and read-only fields Moreover, this patch provides us with a framework for handling field flags in general for all types of widget annotations.	2016-09-17 15:24:47 +02:00
Jonas Jenwald	aadcbe98c8	Replace empty CharStrings with '.notdef' in `Type1Font_wrap` to prevent OTS from rejecting the font (bug 1252420) Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1252420.	2016-09-17 14:39:10 +02:00
Jonas Jenwald	4acd31f51e	Merge pull request #7550 from Snuffleupagus/Type1-toUnicode-builtInEncoding-fallback For embedded Type1 fonts without included `ToUnicode`/`Encoding` data, attempt to improve text selection by using the `builtInEncoding` to amend the `toUnicode` map (issue 6901, issue 7182, issue 7217, bug 917796, bug 1242142)	2016-09-16 17:51:55 +02:00
Tim van der Meij	323e86c442	Text widget annotations: implement unit testing and sanitize data values	2016-09-13 14:57:11 +02:00
Jonas Jenwald	f620f61887	Change `src/core/jpx.js` to use the `error` utility function instead of using `throw new Error` Note that in `parseCodestream` I purposely left the `throw new Error` instances inside of the `try` block, since we don't want to throw any `Errors` while in recovery mode. Finally, somewhat unrelated to the rest of the patch, I moved the `doNotRecover` variable declaration outside of the `try` block to avoid variable hoisting given that it's accessed inside the `catch` block.	2016-09-12 11:05:43 +02:00
Jonas Jenwald	325f7afcca	For embedded Type1 fonts without included `ToUnicode`/`Encoding` data, attempt to improve text selection by using the `builtInEncoding` to amend the `toUnicode` map (issue 6901, issue 7182, issue 7217, bug 917796, bug 1242142) Note that in order to prevent any possible issues, this patch does not try to amend the `toUnicode` data for Type1 fonts that contain either `ToUnicode` or `Encoding` entries in the font dictionary. Fixes, or at least improves, issues/bugs such as e.g. 6658, 6901, 7182, 7217, bug 917796, bug 1242142.	2016-09-11 20:54:10 +02:00
Jonas Jenwald	0b75f63c03	Don't duplicate the first entry in the `charCodeToGlyphId` map for CIDFontType2 fonts with a `CIDToGIDMap` that already mapped the first entry to a non-zero `glyphId` (issue 7544) Fixes 7544.	2016-09-09 22:33:41 +02:00
Tim van der Meij	b112f9f9f4	Merge pull request #7600 from Snuffleupagus/issue-7598 Check that Type1C fonts do not actually contain OpenType font files (issue 7598)	2016-09-09 22:02:58 +02:00
Jonas Jenwald	44b75c01a1	Check that Type1C fonts do not actually contain OpenType font files (issue 7598) This patch is yet another instalment in the (never ending) series of patches for PDF files that specify completely incorrect Type/Subtype for their fonts. In this case Type1/Type1C, when in fact OpenType would have been correct. Fixes 7598.	2016-09-06 10:13:11 +02:00
Tim van der Meij	576f742047	Improve the structure for widget annotations Currently, we only support text widget annotations (field type 'Tx') partially. However, the current code does not make this entirely clear and does not provide a warning when an unsupported field type is encountered, making it harder to determine why rendering fails. Moreover, in the display layer we make no distinction between the various types of widget annotations, causing the code for text widget annotations to also be executed for other types of widget annotations in a fallback situation. This patch improves the structure of the widget annotation code. In the core layer, we use the same structure we use for non-widget annotations in the factory and provide a clear warning when an unsupported type is encountered. In the display layer, we do the same and split the `WidgetAnnotationElement` class into two classes, namely `TextWidgetAnnotationElement` for text widget annotations and `WidgetAnnotationElement` for other unsupported annotations as a fallback. From this it is clear that we only support text widget annotations and nothing else.	2016-09-06 00:26:05 +02:00
Jonas Jenwald	a35773ec8c	Change `src/core/jpg.js` to use the `error` utility function instead of `throw`ing This allows us to remove the `try/catch` statements used in `src/core/stream.js` when parsing JPEG images. As far as I can tell, the only reason for the current usage of plain `throw` is that `jpg.js` originally was external code. Given that this code now lives in our repo, this patch brings the JPEG code more in line with e.g. `src/core/jpx.js` and `src/core/jbig2.js`.	2016-09-04 16:28:23 +02:00
Jonas Jenwald	1bbc694ac3	Assign the `quantizationTables` after parsing the entire JPEG image, to prevent issues when the DQT (Define Quantization Tables) marker is encountered after SOF{n} (Start of Frame) markers (issue 7406) This is a tentative patch that fixes 7406.	2016-08-31 18:42:05 +02:00
Yury Delendik	ffa99397ad	Merge pull request #7387 from Snuffleupagus/issue-5808 Attempt to ignore multiple identical Tf (setFont) commands in `PartialEvaluator_getTextContent` (issue 5808)	2016-08-30 15:21:41 -05:00
Tim van der Meij	f520616e00	Merge pull request #7570 from Snuffleupagus/issue-7569 Create a fallback annotation `id` for entries in `Annots` dictionaries that are not indirect objects (issue 7569)	2016-08-28 00:23:59 +02:00
Jonas Jenwald	088ce6c009	Add a unit-test to check that `ProblematicCharRanges` contains valid entries When adding new entries to `ProblematicCharRanges`, you have to be careful to not make any mistakes since that could cause glyph mapping issues. Currently the existing reference tests should probably help catch any errors, but based on experience I think that having a unit-test which specifically checks `ProblematicCharRanges` would be both helpful and timesaving when modifying/reviewing changes to this code. Hence this patch which adds a function (and unit-test) that is used to validate the entries in `ProblematicCharRanges`, and also checks that we don't accidentally add more character ranges than the Private Use Area can actually contain. The way that the validation code, and thus the unit-test, is implemented also means that we have an easy way to tell how much of the Private Use Area is potentially utilized by re-mapped characters.	2016-08-27 11:56:00 +02:00
Jonas Jenwald	78889646c8	Create a fallback annotation `id` for entries in `Annots` dictionaries that are not indirect objects (issue 7569) According to the PDF specification, see http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#page=86, entries in `Annots` dictionaries should be indirect objects, but obviously there're PDF generators that ignore this. Fixes 7569.	2016-08-27 10:56:16 +02:00
Tim van der Meij	b4c8814fc9	Merge pull request #7534 from Snuffleupagus/isName-name-check Add a parameter to the `isName` function that enables checking not just that something is a `Name`, but also that the actual `name` property matches	2016-08-17 15:48:42 +02:00
Jonas Jenwald	544d29f5cb	Add a `recoveryMode` that suppresses errors from the `Parser`, and utilize it when searching for the main trailer in `XRef_indexObjects` (bug 1250079) Instead of having `Parser_getObj` fail unconditionally for the referenced PDF file, this patch attempts to let searching for the main trailer continue even if there are errors. Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1250079.	2016-08-17 12:37:35 +02:00
Jonas Jenwald	83ce6f0b6d	Adjust the (applicable) existing `isName` callsites to use the new `isName(v, name)` version of the function	2016-08-10 11:15:08 +02:00
Jonas Jenwald	af636aae96	Add a parameter to the `isName` function that enables checking not just that something is a `Name`, but also that the actual `name` property matches This is similar to the existing `isCmd` and `isDict` functions, which already support similar kinds of checks. With the updated `isName` function, we'll be able to simplify many callsites from: `isName(someVariable) && someVariable.name === 'someName'` to: `isName(someVariable, 'someName')`.	2016-08-10 11:15:03 +02:00
Jonas Jenwald	77c6ed5389	Attempt to ignore multiple identical Tf (setFont) commands in `PartialEvaluator_getTextContent` (issue 5808) This patch improves the performance of issue 5808, but I'm not sure if it's enough to call it fixed. On average, this patch reduces the number of textLayer div's by a factor of 3, and it also reduces the time spent in `getTextContent` by a factor of ~2. The PDF file is generated by `Scribus PDF`, which for reasons I cannot understand is placing redundant `Tf` commands before every showText command. Note how the PDF file also contains lots of (basically) identical fonts, but with slightly different names, which causes unnecessary font-switching. This causes some unnecessary breaking of textLayer div's, but this issue cannot be easily worked around.	2016-07-27 21:37:52 +02:00
Yury Delendik	a02e2686b9	Merge pull request #7475 from Snuffleupagus/api-getTextContent-combineTextItems [api-minor] Add a parameter to `PDFPageProxy_getTextContent` that controls whether `PartialEvaluator_getTextContent` will attempt to combine same line text items	2016-07-27 08:34:24 -05:00
Jonas Jenwald	558a22cd02	Prevent errors when parsing Annotations with missing (or invalid) /Subtype entries (issue 7446) Note that I used a separate warning message for this case, instead of utilizing the same one as in the unsupported subtype case, to more clearly indicate that the PDF file itself is to blame rather than PDF.js. Fixes 7446.	2016-07-25 13:59:26 +02:00
Brendan Dahl	5678486802	Merge pull request #7347 from Snuffleupagus/evaluator-more-Ref_toString Slightly refactor the `fontRef` handling in `PartialEvaluator_loadFont` (issue 7403 and issue 7402)	2016-07-22 17:21:47 -07:00
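
The `isName(v, name)` change described in commit af636aae96 above can be illustrated with a minimal sketch. This is not the PDF.js source: the `Name` constructor below is a hypothetical stand-in for the library's own primitive, and the body of `isName` is an assumption based only on the behaviour the commit message describes (a type check, plus an optional comparison of the `name` property).

```javascript
// Hypothetical stand-in for the PDF.js `Name` primitive (assumption, not the real class).
function Name(name) {
  this.name = name;
}

// Sketch of the two-argument form: with one argument only the type is checked,
// with two arguments the `name` property must match as well.
function isName(v, name) {
  return v instanceof Name && (name === undefined || v.name === name);
}

var subtype = new Name('Link');

// Call site in the old style, as quoted in the commit message.
var oldCheck = isName(subtype) && subtype.name === 'Link';
// The same check in the simplified style.
var newCheck = isName(subtype, 'Link');

console.log(oldCheck === newCheck);
```

Keeping the second parameter optional means existing one-argument call sites continue to work unchanged, which matches the commit's description of adjusting only the applicable call sites.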

982 Commits