pdf.js

Author	SHA1	Message	Date
Tim van der Meij	f062695d62	Merge pull request #7633 from timvandermeij/interactive-forms-tx-flags Text widget annotations: support read-only/multiline fields and improve testing	2016-09-17 17:19:47 +02:00
Tim van der Meij	dbea302a6e	Text widget annotations: do not render on canvas as well If interactive forms are enabled, then the display layer takes care of rendering the form elements. There is no need to draw them on the canvas as well. This also leads to issues when values are prefilled, because the text fields are transparent, so the contents that have been rendered onto the canvas will be visible too. We address this issue by passing the `renderInteractiveForms` parameter to the render task and handling it when the page is rendered (i.e., when the canvas is rendered).	2016-09-17 15:24:48 +02:00
Tim van der Meij	adf0972ca5	Text widget annotations: improve unit and reference tests This patch improves the unit tests by testing the support for read-only and multiline fields. Moreover, we add a reference test to ensure that the text widgets are not only rendered, but also that their contents are styled properly. Finally, we perform minor improvements in `src/core/annotation.js`, for example adding missing comments.	2016-09-17 15:24:48 +02:00
Tim van der Meij	f6965fadc0	Text widget annotations: support multiline and read-only fields Moreover, this patch provides us with a framework for handling field flags in general for all types of widget annotations.	2016-09-17 15:24:47 +02:00
Jonas Jenwald	4acd31f51e	Merge pull request #7550 from Snuffleupagus/Type1-toUnicode-builtInEncoding-fallback For embedded Type1 fonts without included `ToUnicode`/`Encoding` data, attempt to improve text selection by using the `builtInEncoding` to amend the `toUnicode` map (issue 6901, issue 7182, issue 7217, bug 917796, bug 1242142)	2016-09-16 17:51:55 +02:00
Tim van der Meij	26da2d57ce	Merge pull request #7632 from Snuffleupagus/more-efficient-expandTextDivs [EnhanceTextSelection] Make `expandTextDivs` more efficient by updating all styles at once instead of piecewise	2016-09-15 16:01:07 +02:00
Jonas Jenwald	8eaa2cbce3	Remove the deprecated `mozDash`/`mozDashOffset` canvas 2D context methods According to [MDN](https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/setLineDash#Browser_compatibility) the standard versions of these methods have been supported since Firefox 27, which was released over two and a half years ago. (See the dates in https://wiki.mozilla.org/RapidRelease/Calendar#Past_branch_dates) Furthermore the non-standard properties are now in the process of being removed, please see https://groups.google.com/forum/#!topic/mozilla.dev.platform/UIudMABegcY. Hence I don't think that we need to keep the old `moz` prefixed ones as fallback any more.	2016-09-15 10:05:40 +02:00
Jonas Jenwald	cb5f9df0c8	[EnhanceTextSelection] Make `expandTextDivs` more efficient by updating all styles at once instead of piecewise I intended to provide proper benchmarking results here, as outlined in https://github.com/mozilla/pdf.js/wiki/Benchmarking-your-changes, but after wasting a couple of hours over the weekend getting weird results I gave up. It appears that there's a lot of, i.e. way too much, variance between subsequent runs of `text` tests for the results to be meaningful. (Previously I've only benchmarked `eq` tests, so I don't know if the `text` tests have never worked well or if it's a newer problem. For reference, please see the results of back-to-back benchmark runs on the current `master` with a very simple manifest file: [link here].) Instead I used `console.time/timeEnd` in `appendText` and `expandTextDivs` to be able to compare the performance with/without the patch. The entire viewer was (skip-cache) reloaded between measurements, and the results are available here: [link here]. Given the troubles I've had with benchmarking, I've not yet computed any statistics on the results (e.g. mean, variance, confidence intervals, and so on). However, just by looking at the data I think it's safe to say that this patch first of all doesn't seem to regress the current performance. Secondly it certainly looks very likely that this patch actually improves the performance, especially for the one-glyph-per-text-div case (cf. issue 7224). Re: issue 7584.	2016-09-14 21:19:28 +02:00
Tim van der Meij	323e86c442	Text widget annotations: implement unit testing and sanitize data values	2016-09-13 14:57:11 +02:00
Yash Srivastav	4e428c7675	Fix lint warnings in URL polyfill	2016-09-12 20:34:51 +05:30
Tim van der Meij	03588ccbf7	Merge pull request #7623 from Snuffleupagus/jpx-error Change `src/core/jpx.js` to use the `error` utility function instead of using `throw new Error`	2016-09-12 15:34:05 +02:00
Yury Delendik	160b176109	Adding "proper" message port for fake worker.	2016-09-12 11:17:10 +02:00
Jonas Jenwald	f620f61887	Change `src/core/jpx.js` to use the `error` utility function instead of using `throw new Error` Note that in `parseCodestream` I purposely left the `throw new Error` instances inside of the `try` block, since we don't want to throw any `Errors` while in recovery mode. Finally somewhat unrelated to the rest of the patch, but I moved the `doNotRecover` variable declaration outside of the `try` block to avoid variable hoisting given that it's accessed inside the `catch` block.	2016-09-12 11:05:43 +02:00
Jonas Jenwald	325f7afcca	For embedded Type1 fonts without included `ToUnicode`/`Encoding` data, attempt to improve text selection by using the `builtInEncoding` to amend the `toUnicode` map (issue 6901, issue 7182, issue 7217, bug 917796, bug 1242142) Note that in order to prevent any possible issues, this patch does not try to amend the `toUnicode` data for Type1 fonts that contain either `ToUnicode` or `Encoding` entries in the font dictionary. Fixes, or at least improves, issues/bugs such as e.g. 6658, 6901, 7182, 7217, bug 917796, bug 1242142.	2016-09-11 20:54:10 +02:00
Tim van der Meij	be485f59ab	Text widget annotations: implement maximum length and text alignment Moreover, we refactor the code a bit to extract code that is shared between the two branches and we only apply text alignment (and create the array) when it is actually defined, since it's optional and left is already the default.	2016-09-11 20:49:00 +02:00
Jonas Jenwald	0b75f63c03	Don't duplicate the first entry in the `charCodeToGlyphId` map for CIDFontType2 fonts with a `CIDToGIDMap` that already mapped the first entry to a non-zero `glyphId` (issue 7544) Fixes 7544.	2016-09-09 22:33:41 +02:00
Tim van der Meij	b112f9f9f4	Merge pull request #7600 from Snuffleupagus/issue-7598 Check that Type1C fonts do not actually contain OpenType font files (issue 7598)	2016-09-09 22:02:58 +02:00
Tim van der Meij	e686db250c	Render interactive form (AcroForm) text widget annotations This patch is the first step towards implementing support for interactive forms (AcroForms). It makes it possible to render text widget annotations exactly like Adobe Reader/Acrobat. Everything we implement for AcroForms is disabled by default using a preference, mainly because it is not ready to use yet, but has to be implemented in many steps to avoid complexity. The preference allows us to work with the code while not exposing the behavior by default. Mainly storing entered values and printing them is still absent, which would be minimal requirements for enabling this by default.	2016-09-07 15:37:28 +02:00
Jonas Jenwald	8dbb5a7c4a	Merge pull request #7596 from timvandermeij/widget-annotation-cleanup Improve the structure for widget annotations	2016-09-06 13:46:31 +02:00
Jonas Jenwald	44b75c01a1	Check that Type1C fonts do not actually contain OpenType font files (issue 7598) This patch is yet another instalment in the (never ending) series of patches for PDF files that specify completely incorrect Type/Subtype for their fonts. In this case Type1/Type1C, when in fact OpenType would have been correct. Fixes 7598.	2016-09-06 10:13:11 +02:00
Tim van der Meij	576f742047	Improve the structure for widget annotations Currently, we only support text widget annotations (field type 'Tx') partially. However, the current code does not make this entirely clear and does not provide a warning when an unsupported field type is encountered, making it harder to determine why rendering fails. Moreover, in the display layer we make no distinction between the various types of widget annotations, causing the code for text widget annotations to also be executed for other types of widget annotations in a fallback situation. This patch improves the structure of the widget annotation code. In the core layer, we use the same structure we use for non-widget annotations in the factory and provide a clear warning when an unsupported type is encountered. In the display layer, we do the same and split the `WidgetAnnotationElement` class into two classes, namely `TextWidgetAnnotationElement` for text widget annotations and `WidgetAnnotationElement` for other unsupported annotations as a fallback. From this it is clear that we only support text widget annotations and nothing else.	2016-09-06 00:26:05 +02:00
Jonas Jenwald	37998076c9	In `display/api.js` ensure that we always reject with an `Error` in `JpegDecode`, and adjust a couple of other rejection sites as well In the case where the document was destroyed, we were rejecting the `Promise` in `JpegDecode` with a string instead of an `Error`. The patch also brings the wording more in line with other such rejections. Use the `isInt` utility function when validating the `pageNumber` parameter in `WorkerTransport_getPage`, to make it more obvious what's actually happening. There's also a couple more unit-tests added, to ensure that we always fail in the expected way. Finally, we can simplify the rejection handling in `WorkerTransport_getPageIndexByRef` somewhat. (Note that the only reason for using `catch` here is that since the promise is rejected on the worker side, the `reason` becomes a string instead of an `Error` which is why we "re-reject" on the display side.)	2016-09-05 16:35:32 +02:00
Jonas Jenwald	38c85039d1	Merge pull request #7588 from timvandermeij/text-layer-weakmap Use a `WeakMap` in `src/display/text_layer.js`	2016-09-04 21:25:48 +02:00
Tim van der Meij	96593571eb	Optimize scale calculation in `text_layer.js` This patch avoids having to calculate the scale twice by saving it in the properties object. Moreover, we remove a temporary variable and place parentheses around a calculation inside a string concatenation.	2016-09-04 20:19:31 +02:00
Jonas Jenwald	a35773ec8c	Change `src/core/jpg.js` to use the `error` utility function instead of `throw`ing This allows us to remove the `try/catch` statements used in `src/core/stream.js` when parsing JPEG images. As far as I can tell, the only reason for the current usage of plain `throw` is that `jpg.js` originally was external code. Given that this code now lives in our repo, this patch brings the JPEG code more in line with e.g. `src/core/jpx.js` and `src/core/jbig2.js`.	2016-09-04 16:28:23 +02:00
Tim van der Meij	d03651efff	Merge pull request #7407 from Snuffleupagus/issue-7406 Assign the `quantizationTables` after parsing the entire JPEG image, to prevent issues when the DQT (Define Quantization Tables) marker is encountered after SOF{n} (Start of Frame) markers (issue 7406)	2016-09-04 14:49:01 +02:00
Tim van der Meij	b3818d5c36	Replace `div.dataset` with a `WeakMap` in `text_layer.js` This patch improves performance by avoiding unnecessary type conversions, which also help the JIT for optimizations. Moreover, this patch fixes issues with the div expansion code where `textScale` would be undefined in a division. Because of the `dataset` usage, other comparisons evaluated to `true` while `false` would have been correct. This makes the expansion mode now work correctly for cases with, for example, each glyph in one div. The polyfill for `WeakMap` has been provided by @yurydelendik.	2016-09-03 20:06:42 +02:00
Tim van der Meij	b10add14f3	Refactor `text_layer.js` to pass the task as a parameter We pass many parameters to `appendText` while we might as well pass the `task` object that contains them. This saves a few lines of code and makes the signature of `appendText` more clear. We do the same for `expand`, which is useful for the next commit in which we replace `div.dataset` with a `WeakMap`. Furthermore, this patch adds a missing parameter to a comment block to make it clear which parameters remain.	2016-09-02 20:46:36 +02:00
Tim van der Meij	7c961b6b7a	Minor code style improvements after #7539	2016-09-01 18:07:12 +02:00
Tim van der Meij	6bb95e3129	Merge pull request #7539 from jeremypress/fairexpand [api-minor] Expanding divs to improve selection	2016-09-01 17:43:31 +02:00
Jeremy Press	6faa84abdb	Continuing fairexpand #6663 1. Expanding divs to improve text selection. (Yury) 2. Adding enhanceTextSelection as an option. 3. Moving feature functionality from text_layer_builder.js to text_layer.js. 4. Added expandTextDivs method to only load expanded divs on first click, and only show on subsequent clicks	2016-08-31 09:54:52 -07:00
Jonas Jenwald	1bbc694ac3	Assign the `quantizationTables` after parsing the entire JPEG image, to prevent issues when the DQT (Define Quantization Tables) marker is encountered after SOF{n} (Start of Frame) markers (issue 7406) This is a tentative patch that fixes 7406.	2016-08-31 18:42:05 +02:00
Yury Delendik	ffa99397ad	Merge pull request #7387 from Snuffleupagus/issue-5808 Attempt to ignore multiple identical Tf (setFont) commands in `PartialEvaluator_getTextContent` (issue 5808)	2016-08-30 15:21:41 -05:00
Tim van der Meij	f520616e00	Merge pull request #7570 from Snuffleupagus/issue-7569 Create a fallback annotation `id` for entries in `Annots` dictionaries that are not indirect objects (issue 7569)	2016-08-28 00:23:59 +02:00
Jonas Jenwald	088ce6c009	Add a unit-test to check that `ProblematicCharRanges` contains valid entries When adding new entries to `ProblematicCharRanges`, you have to be careful to not make any mistakes since that could cause glyph mapping issues. Currently the existing reference tests should probably help catch any errors, but based on experience I think that having a unit-test which specifically checks `ProblematicCharRanges` would be both helpful and timesaving when modifying/reviewing changes to this code. Hence this patch which adds a function (and unit-test) that is used to validate the entries in `ProblematicCharRanges`, and also checks that we don't accidentally add more character ranges than the Private Use Area can actually contain. The way that the validation code, and thus the unit-test, is implemented also means that we have an easy way to tell how much of the Private Use Area is potentially utilized by re-mapped characters.	2016-08-27 11:56:00 +02:00
Jonas Jenwald	78889646c8	Create a fallback annotation `id` for entries in `Annots` dictionaries that are not indirect objects (issue 7569) According to the PDF specification, see http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#page=86, entries in `Annots` dictionaries should be indirect objects, but obviously there're PDF generators that ignore this. Fixes 7569.	2016-08-27 10:56:16 +02:00
Jonas Jenwald	5379749d4b	Try to prevent `CanvasGraphics_getSinglePixelWidth` from intermittently returning incorrect values in Firefox (issue 7188) Fixes 7188.	2016-08-22 20:00:24 +02:00
Tim van der Meij	b4c8814fc9	Merge pull request #7534 from Snuffleupagus/isName-name-check Add a parameter to the `isName` function that enables checking not just that something is a `Name`, but also that the actual `name` property matches	2016-08-17 15:48:42 +02:00
Jonas Jenwald	544d29f5cb	Add a `recoveryMode` that suppresses errors from the `Parser`, and utilize it when searching for the main trailer in `XRef_indexObjects` (bug 1250079) Instead of having `Parser_getObj` fail unconditionally for the referenced PDF file, this patch attempts to let searching for the main trailer continue even if there are errors. Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1250079.	2016-08-17 12:37:35 +02:00
Jonas Jenwald	83ce6f0b6d	Adjust the (applicable) existing `isName` callsites to use the new `isName(v, name)` version of the function	2016-08-10 11:15:08 +02:00
Jonas Jenwald	af636aae96	Add a parameter to the `isName` function that enables checking not just that something is a `Name`, but also that the actual `name` property matches This is similar to the existing `isCmd` and `isDict` functions, which already support similar kinds of checks. With the updated `isName` function, we'll be able to simplify many callsites from: `isName(someVariable) && someVariable.name === 'someName'` to: `isName(someVariable, 'someName')`.	2016-08-10 11:15:03 +02:00
Jonas Jenwald	77c6ed5389	Attempt to ignore multiple identical Tf (setFont) commands in `PartialEvaluator_getTextContent` (issue 5808) This patch improves the performance of issue 5808, but I'm not sure if it's enough to call it fixed. On average, this patch reduces the number of textLayer div's by a factor of 3, and it also reduces the time spent in `getTextContent` by a factor of ~2. The PDF file is generated by `Scribus PDF`, which for reasons I cannot understand is placing redundant `Tf` commands before every showText command. Note how the PDF file also contains lots of (basically) identical fonts, but with slightly different names, which causes unnecessary font-switching. This causes some unnecessary breaking of textLayer div's, but this issue cannot be easily worked around.	2016-07-27 21:37:52 +02:00
Yury Delendik	a02e2686b9	Merge pull request #7475 from Snuffleupagus/api-getTextContent-combineTextItems [api-minor] Add a parameter to `PDFPageProxy_getTextContent` that controls whether `PartialEvaluator_getTextContent` will attempt to combine same line text items	2016-07-27 08:34:24 -05:00
Jonas Jenwald	558a22cd02	Prevent errors when parsing Annotations with missing (or invalid) /Subtype entries (issue 7446) Note that I used a separate warning message for this case, instead of utilizing the same one as in the unsupported subtype case, to more clearly indicate that the PDF file itself is to blame rather than PDF.js. Fixes 7446.	2016-07-25 13:59:26 +02:00
Brendan Dahl	5678486802	Merge pull request #7347 from Snuffleupagus/evaluator-more-Ref_toString Slightly refactor the `fontRef` handling in `PartialEvaluator_loadFont` (issue 7403 and issue 7402)	2016-07-22 17:21:47 -07:00
Brendan Dahl	50d6e4f147	Merge pull request #7447 from Snuffleupagus/buildToUnicode-notdef Ignore .notdef in the `differences` array when building a fallback `toUnicode` map in `PartialEvaluator_buildToUnicode` (issue 5256)	2016-07-22 14:33:32 -07:00
Jonas Jenwald	390c02a3e9	Attempt to cache fonts that are direct objects (i.e. `Dict`s), as opposed to `Ref`s, to prevent re-rendering after `cleanup` from breaking (issue 7403 and issue 7402) Fonts that are not referenced by `Ref`s are very uncommon in practice, but it can unfortunately happen. In this case, we're currently not caching them in the usual way, i.e. by `Ref`, which leads to failures when a page is rendered after `cleanup` has run. The simplest solution would have been to remove the `font.translated` workaround, but since this would have meant loading these kinds of fonts over and over, the patch attempts to be a bit clever about this situation. Note that if we instead loaded fonts per page, instead of per document, this issue wouldn't have existed.	2016-07-21 16:04:07 +02:00
Jonas Jenwald	2e9cd3ea64	Slightly refactor the `fontRef` handling in `PartialEvaluator_loadFont` (issue 7403 and issue 7402) Originally, I was just going to change this code to use `Ref_toString` in a couple more places. When I started reading the code, I figured that it wouldn't hurt to clean up a couple of comments. While doing this, I noticed that the logic for the (rare) `isDict(fontRef)` case could do with a few improvements. There should be no functional changes with this patch, but given the added reference checks, we will now avoid bogus `Ref`s when resolving font aliases. In practice, as issue 7403 shows, the current code can break certain PDF files even if it's very rare. Note that the only thing that this patch will change, is the `font.loadedName` in the case where a `fontRef` is a reference and the font doesn't have a descriptor. Previously for `fontRef = Ref(4, 0)` we'd get `font.loadedName = 'g_d0_f4_0'`, and with this patch `font.loadedName = 'g_d0_f4R'`, which is actually one character shorter in most cases. (Given that `Ref_toString` contains an optimization for the `gen === 0` case, which is by far the most common `gen` value.) In the already existing fallback case, where the `fontName` is used when creating the `font.loadedName`, we allow any alphanumeric character. Hence I don't see how (as mentioned above) e.g. `font.loadedName = 'g_d0_f4R'` would be an issue here.	2016-07-21 16:03:33 +02:00
Tim van der Meij	10f9f11ec4	Merge pull request #7490 from Snuffleupagus/issue-7426 Don't map glyphs to the Lepcha Unicode block (issue 7426)	2016-07-21 14:39:19 +02:00
Jonas Jenwald	f297e4d17c	[api-minor] Add a parameter to `PDFPageProxy_getTextContent` that controls whether `PartialEvaluator_getTextContent` will attempt to combine same line text items From the discussion in issue 7445, it seems that there may be cases where an API consumer would want to get the text content as is, without combined text items.	2016-07-19 13:38:57 +02:00
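
Note on commit af636aae96: the call-site simplification it describes, from `isName(someVariable) && someVariable.name === 'someName'` to `isName(someVariable, 'someName')`, can be sketched roughly as follows. This is a minimal illustration and not the code from the patch: the `Name` class here is a hypothetical stand-in for the PDF.js primitive, and the `subtype` value is made up for the example.

```javascript
// Hypothetical stand-in for the `Name` primitive; not the actual PDF.js class.
class Name {
  constructor(name) {
    this.name = name;
  }
}

// Sketch of an `isName` with an optional second parameter: when `name` is
// omitted only the type is checked, otherwise the `name` property must match.
function isName(v, name) {
  return v instanceof Name && (name === undefined || v.name === name);
}

// Illustrative value only.
const subtype = new Name('Widget');

// Call site before the change: type check plus a separate comparison.
const oldStyle = isName(subtype) && subtype.name === 'Widget';

// Call site after the change: both checks in a single call.
const newStyle = isName(subtype, 'Widget');
```

Keeping the second parameter optional means that existing one-argument call sites behave as before, which would match how the commit describes `isCmd` and `isDict`.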


2589 Commits
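
Note on commit b3818d5c36: the reason for replacing `div.dataset` with a `WeakMap` can be illustrated with a small sketch. A DOM `dataset` stores every value as a string, so booleans and numbers have to be converted back on each read, whereas a `WeakMap` keyed by the element keeps the original types. The code below is an assumption-laden illustration and not the patch itself: plain objects stand in for the `<div>` elements so that it runs without a DOM, string coercion is emulated with `String(...)`, and the property names (`isWhitespace`, `angle`, `scale`) are hypothetical.

```javascript
// A plain object stands in for a text-layer <div> so this runs without a DOM.
const divA = { id: 'divA' };

// Emulates what assigning to `div.dataset` does: the value becomes a string.
const datasetLike = {};
datasetLike.isWhitespace = String(false); // stored as the string 'false'

// A non-empty string is truthy, so this check gives the wrong answer.
const viaDataset = Boolean(datasetLike.isWhitespace);

// With a WeakMap keyed by the element, values keep their original types.
const divProperties = new WeakMap();
divProperties.set(divA, { isWhitespace: false, angle: 0, scale: 1.5 });

const props = divProperties.get(divA);
const viaWeakMap = Boolean(props.isWhitespace);
```

Because the keys are held weakly, entries do not by themselves keep removed elements alive, which is one reason a `WeakMap` suits per-element metadata of this kind.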