pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	5d578ea36a	[src/core/writer.js] Remove unnecessary string-wrapping for boolean values in `writeValue` (PR 13998 follow-up)	2021-09-12 15:45:45 +02:00
Jonas Jenwald	761519ef3f	Merge pull request #13998 from calixteman/bug1729971 Write boolean value when saving a form (bug 1729971)	2021-09-12 15:38:10 +02:00
Jonas Jenwald	a47844d1fc	Let `Lexer.getObj` return a dummy-`Cmd` for commands that start with a non-visible ASCII character (issue 13999) This way we avoid breaking badly generated PDF documents where a non-visible ASCII character is "glued" to a valid command.	2021-09-11 19:54:13 +02:00
Tim van der Meij	e97f01b17c	Merge pull request #13977 from Snuffleupagus/enqueueChunk-batch [api-minor] Reduce `postMessage` overhead, in `PartialEvaluator.getTextContent`, by sending text chunks in batches (issue 13962)	2021-09-11 13:34:07 +02:00
Jonas Jenwald	0e54f568fb	Re-factor the `CSS_PIXELS_PER_INCH`/`PDF_PIXELS_PER_INCH` exports (PR 13991 follow-up) For improved maintainability, since these constants are being exposed in the official API, this patch moves them into an Object instead.	2021-09-11 11:15:25 +02:00
Jonas Jenwald	bd51bbfd16	Remove `mozImageSmoothingEnabled` fallback in `CanvasGraphics.endGroup` This was added all the way back in PR 2936, however it's been unnecessary ever since Firefox 51 (released on 2017-01-24); please see the MDN compatibility data: https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/imageSmoothingEnabled#browser_compatibility	2021-09-11 10:30:39 +02:00
Jonas Jenwald	9ce63a6dc6	Merge pull request #13991 from brendandahl/interpolate Enable/disable image smoothing based on image interpolate value. (bug 1722191)	2021-09-11 10:02:53 +02:00
Brendan Dahl	f38fb42b42	Enable/disable image smoothing based on image interpolate value. (bug 1722191) While some of the output looks worse to my eye, this behavior more closely matches what I see when I open the PDFs in Adobe Acrobat. Fixes: #4706, #9713, #8245, #1344	2021-09-10 14:23:35 -07:00
Calixte Denizet	474ab7c86d	Write boolean value when saving a form (bug 1729971) - it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1729971#c4.	2021-09-10 14:10:25 +02:00
calixteman	57b80074a2	Merge pull request #13995 from calixteman/xfa_record XFA - Handle $record shortcut in SOM expression (issue #13994)	2021-09-10 13:57:50 +02:00
Calixte Denizet	c5841b3794	XFA - Handle $record shortcut in SOM expression (issue #13994)	2021-09-09 19:54:45 +02:00
Calixte Denizet	623860bf8f	XFA - Remove the checked attribute from the checkbox when unchecked (bug 1729877) - it aims to fix: https://bugzilla.mozilla.org/show_bug.cgi?id=1729877.	2021-09-09 19:14:16 +02:00
Jonas Jenwald	45ddb12f61	Remove no-op `onPull`/`onCancel` streamSink callbacks from the "GetTextContent"-handler The `MessageHandler`-implementation already handles either of these callbacks being undefined, hence there's no particular reason (as far as I can tell) to add no-op functions here. Also, in a couple of `MessageHandler`-methods, utilize an already existing local variable more.	2021-09-09 00:01:10 +02:00
Jonas Jenwald	f90f9466e3	[api-minor] Reduce `postMessage` overhead, in `PartialEvaluator.getTextContent`, by sending text chunks in batches (issue 13962) Following the STR in the issue, this patch reduces the number of `PartialEvaluator.getTextContent`-related `postMessage`-calls by approximately 78 percent.[1] Note that by enforcing a relatively low value when batching text chunks, we should thus improve worst-case scenarios while not negatively affecting all `textLayer` building. While working on these changes I noticed, thanks to our unit-tests, that the implementation of the `appendEOL` function unfortunately means that the number and content of the textItems could actually be affected by the particular chunking used. That seems extremely unfortunate, since in practice this means that the particular chunking used is thus observable through the API. Obviously that should be a completely internal implementation detail, which is why this patch also modifies `appendEOL` to mitigate that.[2] Given that this patch adds a minimum batch size in `enqueueChunk`, there's obviously nothing preventing it from becoming a lot larger than the limit (depending e.g. on the PDF structure and the CPU load/speed). While sending more text chunks at once isn't an issue in itself, it could become problematic at the main-thread during `textLayer` building. Note how both the `PartialEvaluator` and `CanvasGraphics` implementations utilize `Date.now()`-checks, to prevent long-running parsing/rendering from "hanging" the respective thread. In the `textLayer` building we don't utilize such a construction[3], and streaming of textContent is thus essentially acting as a simple stand-in for that functionality. Hence why we want to avoid choosing a too large minimum batch size, since that could thus indirectly affect main-thread performance negatively. 
--- [1] While it'd be possible to go even lower, that'd likely require more invasive re-factoring/changes to the `PartialEvaluator.getTextContent`-code to ensure that the batches don't become too large. [2] This should also, as far as I can tell, explain some of the regressions observed in the "enhance" text-selection tests back in PR 13257. Looking closer at the `appendEOL` function it should potentially be changed even more, however that should probably not be done here. [3] I'd really like to avoid implementing something like that for the `textLayer` building as well, given that it'd require adding a fair bit of complexity.	2021-09-09 00:01:07 +02:00
Jonas Jenwald	69034ab8dc	Improve glyph mapping for non-embedded composite standard fonts (issue 11088) For non-embedded CIDFontType2 fonts with a non-/Identity encoding, use the /ToUnicode data to improve the glyph mapping.	2021-09-08 15:15:33 +02:00
Jonas Jenwald	4c1b586dd2	Reduce the size of `TextLayerRenderTask._textDivProperties` in "regular" text-selection mode While these changes will obviously not have a significant effect on overall memory usage, it cannot hurt as far as I'm concerned. This patch makes the following changes: - Clear out `_textDivProperties` once rendering is done, since those properties are only necessary to keep alive when enhanced text-selection is being used. - Reduce the size of the `_textDivProperties`-entries by default, since a majority of the properties are only relevant when enhanced text-selection is being used.	2021-09-05 12:12:34 +02:00
Tim van der Meij	1b20f61b56	Merge pull request #13972 from Snuffleupagus/issue-13971 Treat all content as visible when no optional content groups are defined (issue 13971)	2021-09-04 15:53:44 +02:00
Tim van der Meij	680f33c31c	Merge pull request #13961 from Snuffleupagus/simpler-regexp Simplify some regular expressions	2021-09-04 15:39:30 +02:00
Jonas Jenwald	6318ccf6d2	Treat all content as visible when no optional content groups are defined (issue 13971) In the referenced PDF document the /Contents stream contains MarkedContent-operators, however no optional content dictionary exists; according to [the specification](https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G7.3883825): > Null values or references to deleted objects shall be ignored. If this entry is not present, is an empty array, or contains references only to null or deleted objects, the membership dictionary shall have no effect on the visibility of any content.	2021-09-04 08:13:37 +02:00
Jonas Jenwald	3ccf277f58	Fallback to the /ToUnicode map for TrueType fonts with (3, 1) and (1, 0) cmap-tables (issue 13316) In the PDF document some of the glyphs have bogus `differences`-entries[1] that cannot be resolved to valid glyph names, thus causing the glyph mapping to fail. My initial idea was to use a similar approach as in the `PartialEvaluator._simpleFontToUnicode`-method, to extract the charCodes from those entries, however it turned out that that didn't actually help in this case (the mapping was still wrong). To fix this I'm thus proposing that we fallback to the /ToUnicode map when no other useable data exists (e.g. no post-table), since it hopefully shouldn't make things any worse than leaving parts of the glyph map empty (which currently happens). --- [1] As can be seen below, some of the entries are completely normal while others are non-standard: ``` Differences (array) 0 = 65 1 = /g5167 2 = /space 3 = /g11927 4 = /g17737 5 = /g11540 6 = /g2180 7 = /K 8 = /P 9 = /two 10 = /zero 11 = /one 12 = /five 13 = /four 14 = /g6932 15 = /g7246 16 = /g1691 17 = /g2343 18 = /g14792 19 = /g3325 20 = /g4280 21 = /g20383 22 = /g18166 23 = /g16988 24 = /g17943 25 = /g19223 26 = /g10830 27 = 97 28 = /g982 29 = /g1226 30 = /g5059 31 = /g2677 32 = /g1042 33 = /g11568 34 = /L 35 = /three 36 = /seven 37 = /g2364 38 = /g12063 39 = /g5356 40 = /g2173 41 = /g17877 42 = /g7273 43 = /g7647 44 = /g7224 45 = /g19327 46 = /g5054 47 = /g2342 48 = /g10136 49 = /g6856 50 = /g13381 51 = /g7257 52 = /g12093 53 = /g2359 ```	2021-09-04 07:38:22 +02:00
Brendan Dahl	da15dbf962	Merge pull request #13698 from linfangrong/master [FIX] fix jpx tag tree decode (issue 11957)	2021-09-03 10:00:19 -07:00
Brendan Dahl	a8ce15a2d7	Merge pull request #13966 from calixteman/no_ns XFA - Created data node mustn't belong to datasets namespace	2021-09-03 09:59:40 -07:00
Calixte Denizet	77b9657e57	XFA - Overwrite AcroForm dictionary when saving if no datasets in XFA (bug 1720179) - aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1720179 - in some pdfs the XFA array in AcroForm dictionary doesn't contain an entry for 'datasets' (which contains saved data), so basically this patch allows overwriting the AcroForm dictionary with an updated XFA array when doing an incremental update.	2021-09-03 17:04:03 +02:00
Calixte Denizet	57ae3a5a76	XFA - Created data node mustn't belong to datasets namespace - when some named nodes in the template don't have their counterpart in datasets we create some nodes: the main node mustn't belong to the datasets namespace because it doesn't make sense and Acrobat Reader isn't able to read PDFs with such nodes. - so created nodes under a datasets node have a namespaceId set to -1 and consequently when serialized no namespace prefix will appear.	2021-09-03 15:43:25 +02:00
Brendan Dahl	804abb3786	Merge pull request #13959 from calixteman/encrypt Correctly pad strings when saving an encrypted pdf (bug 1726789)	2021-09-02 11:41:02 -07:00
Jonas Jenwald	c42887221a	Simplify some regular expressions There's a fair number of regular expressions throughout the code-base which are slightly more verbose than strictly necessary, in particular: - We have a lot of regular expressions that use `[0-9]` explicitly, and those can be simplified to use `\d` instead. - We have one instance of a regular expression containing a `A-Za-z0-9_` sequence, which can be simplified to use `\w` instead.	2021-09-02 11:50:42 +02:00
Calixte Denizet	9619bf92be	Correctly pad strings when saving an encrypted pdf (bug 1726789)	2021-09-02 10:37:21 +02:00
Tim van der Meij	0a366dda6a	Merge pull request #13955 from Snuffleupagus/issue-13433 Always prefer the post-table for TrueType fonts with (0, x) cmap-tables (issue 13433)	2021-09-01 21:46:34 +02:00
Tim van der Meij	19ce2de6f7	Merge pull request #13952 from Snuffleupagus/ItcSymbol Extend `getNonStdFontMap` for non-embedded versions of the ItcSymbol font (issue 11532)	2021-09-01 21:38:59 +02:00
Jonas Jenwald	b7b6076294	Always prefer the post-table for TrueType fonts with (0, x) cmap-tables (issue 13433) While I don't know if this is necessarily the "correct" solution, it does fix issue 13433 without breaking any of the existing reference-tests.	2021-09-01 12:35:49 +02:00
Jonas Jenwald	ba9f004097	Extend `getNonStdFontMap` for non-embedded versions of the ItcSymbol font (issue 11532) Despite its name, the fonts in the ItcSymbol-family are "regular" fonts and not Symbol ones. However, given that the font name contains the word "Symbol" we ended up picking the wrong code-path in the `Font.fallbackToSystemFont`-method. Please note: While this patch ensures that the text becomes readable, by falling back to a standard font, the rendering will obviously not be perfect. However, that's the PDF generator's "fault" since non-embedded fonts cannot be guaranteed to render correctly in all environments.	2021-08-31 23:21:16 +02:00
Jonas Jenwald	1f56451d56	Implement `PDFNetworkStreamRangeRequestReader._onError`, to handle range request errors with XMLHttpRequest (issue 9883) Given that the Fetch API is normally being used now, these changes are probably less important now than they used to be. However, given that it's simple enough to implement this I figured why not just fix issue 9883 (better late than never I suppose).	2021-08-31 10:23:57 +02:00
Jonas Jenwald	bd9a92a161	Use optional chaining more in the `src/display/network.js` file Also changes the different `_onDone`/`_onProgress` methods to use consistent parameter names, and some other small improvements.	2021-08-31 10:23:54 +02:00
linfangrong	369f1899c6	[FIX] fix jpx tag tree decode (issue 11957)	2021-08-31 11:44:26 +08:00
Brendan Dahl	a7f807b059	Only use base encoding if it's populated. (bug 1727053) The font dict in this file has an encoding entry, but only specifies a differences map. The base encoding is empty in this case and shouldn't be used.	2021-08-30 12:51:59 -07:00
Brendan Dahl	306119b12a	Merge pull request #13932 from Snuffleupagus/oc-images Support Optional Content in Image-/XObjects (issue 13931)	2021-08-30 10:10:14 -07:00
Jonas Jenwald	cf0ccc4bab	Merge pull request #13937 from overleaf/jpa-fix-error-handling Fix handling of fetch errors	2021-08-30 15:50:03 +02:00
Jakob Ackermann	291ffd3059	Fix handling of fetch errors Testing: - delete the pdf file while the initial request is inflight - delete the pdf file after the initial request has finished Repeat for a small file and large file, exercising both one-off and chunked transports.	2021-08-30 12:43:28 +01:00
Tim van der Meij	954e1a1694	Merge pull request #13943 from Snuffleupagus/api-more-async Use `async` a bit more in the API	2021-08-29 14:34:14 +02:00
Jonas Jenwald	ce3f5ea2bf	Use `async` a bit more in the API This patch changes the `PDFDocumentLoadingTask.destroy`-method and the `_fetchDocument`-function to be `async`, which slightly simplifies the relevant code. Furthermore, remove the catch-handler from the `WorkerTransport.getPageIndex`-method since it's no longer needed. Given that the `MessageHandler` is nowadays wrapping every possible Exception, it's no longer necessary to try and re-wrap the reason here.	2021-08-29 12:31:28 +02:00
Jonas Jenwald	9ea3fa0747	Ensure that `PasswordException` is handled correctly in the `wrapReason` function While running the unit-tests with some logging statements added to this code, I noticed that `PasswordException` was missing from the list of potential Errors that could be passed to the `wrapReason` function.	2021-08-28 12:24:12 +02:00
Tim van der Meij	153d058b3a	Merge pull request #13933 from brendandahl/xfa-checkbox2 Fix saving of XFA checkboxes. (bug 1726381)	2021-08-27 22:45:44 +02:00
Jonas Jenwald	b34d2cdc42	Ensure that beginMarkedContentProps/endMarkedContent-operators, for /XObjects, are balanced in corrupt documents (PR 13854 follow-up) Something that I just realized is that while PR 13854 fixed an issue as reported, it could still cause bugs in other similarly broken documents since we'll not insert a matching endMarkedContent-operator in the operatorList.	2021-08-26 17:05:30 +02:00
Jonas Jenwald	853b1172a1	Support Optional Content in Image-/XObjects (issue 13931) Currently, in the `PartialEvaluator`, we only support Optional Content in Form-/XObjects. Hence this patch adds support for Image-/XObjects as well, which looks like a simple oversight in PR 12095 since the canvas-implementation already contains the necessary code to support this.	2021-08-26 16:54:15 +02:00
Brendan Dahl	6d2193a812	Fix saving of XFA checkboxes. (bug 1726381) Previously we were always setting the storage value to the on value.	2021-08-24 15:53:55 -07:00
Jonas Jenwald	2a0ad8e696	Add deprecation warnings for the `renderInteractiveForms` and `includeAnnotationStorage` options, in `PDFPageProxy.render` This is done separately from the previous patch, to make it easier to revert these changes once they've been included in a couple of releases. Please note that because these two options are mutually exclusive, which is a large part of the reason for the previous patch, it's not guaranteed that the fallback-values will always be correct in every situation (but it's the best that we can do).	2021-08-24 01:40:12 +02:00
Jonas Jenwald	41efa3c071	[api-minor] Introduce a new `annotationMode`-option, in `PDFPageProxy.{render, getOperatorList}` This is a follow-up to PRs 13867 and 13899. This patch is tagged `api-minor` for the following reasons: - It replaces the `renderInteractiveForms`/`includeAnnotationStorage`-options, in the `PDFPageProxy.render`-method, with the single `annotationMode`-option that controls which annotations are being rendered and how. Note that the old options were mutually exclusive, and setting both to `true` would result in undefined behaviour. - For improved consistency in the API, the `annotationMode`-option will also work together with the `PDFPageProxy.getOperatorList`-method. - It's now also possible to disable all annotation rendering in both the API and the Viewer, since the other changes meant that this could now be supported with a single added line on the worker-thread[1]; fixes 7282. --- [1] Please note that in order to simplify the overall implementation, we'll purposely only support disabling of all annotations and that the option is being shared between the API and the Viewer. For any more "specialized" use-cases, where e.g. only some annotation-types are being rendered and/or the API and Viewer render different sets of annotations, that'll have to be handled in third-party implementations/forks of the PDF.js code-base.	2021-08-24 01:13:02 +02:00
Brendan Dahl	56e7bb626c	Merge pull request #13660 from calixteman/no_xfaf XFA - Disable xfa rendering for XFAF pdfs	2021-08-23 12:30:29 -07:00
Calixte Denizet	04573d2dc8	XFA - Disable xfa rendering for XFAF pdfs - we'll implement XFAF support later.	2021-08-23 12:18:20 -07:00
Brendan Dahl	bf5a45ce6d	Merge pull request #13908 from brendandahl/xfa-find [api-minor] XFA - Support text search in XFA documents.	2021-08-23 08:53:02 -07:00
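
The regular-expression simplification described in commit c42887221a above can be illustrated with a small sketch. The patterns and sample strings below are hypothetical and are not taken from the PDF.js source; the sketch only checks that, in JavaScript, `\d` behaves like `[0-9]` and `\w` behaves like `[A-Za-z0-9_]` on a handful of inputs.

```javascript
// Hypothetical patterns for illustration only; not copied from PDF.js.
const verboseDigits = /^[0-9]+$/;
const conciseDigits = /^\d+$/;
const verboseWord = /^[A-Za-z0-9_]+$/;
const conciseWord = /^\w+$/;

// Sample inputs, including an empty string and a non-ASCII digit.
const samples = ["13998", "g5167", "Cmd_1", "a-b", "", "٣"];

// Returns true when both patterns accept/reject every input identically.
function sameMatches(a, b, inputs) {
  return inputs.every(s => a.test(s) === b.test(s));
}

console.log(sameMatches(verboseDigits, conciseDigits, samples));
console.log(sameMatches(verboseWord, conciseWord, samples));
```

In JavaScript `\d` only matches the ASCII digits 0-9, which is why the shorter form is a drop-in replacement here; in regex flavours where `\d` also matches other Unicode digits, the two forms would not be interchangeable.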

4938 Commits