For the operators that we currently support, the arguments are not `Dict`s, which means that it's not really necessary to use `Dict_getAll` in `EvaluatorPreprocessor_read`.
Also, I do think that if/when we support operators that use `Dict`s as arguments, that should be dealt with in the corresponding `case` in `PartialEvaluator_getOperatorList` which handles the operator.
The only reason I can find for using `Dict_getAll` like that is that, prior to PR 6550, we would just append certain (currently unsupported) operators without doing any further processing/checking. But as issue 6549 showed, that can lead to issues in practice, which is why it was changed.
In an effort to prevent possible issues with unsupported operators, this patch simply ignores operators with `Dict` arguments in `PartialEvaluator_getOperatorList`.
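A sketch of the check (assuming the existing `isDict` helper); an unknown operator whose arguments trip it would simply be skipped, with a warning, instead of being appended to the operator list:

    // Returns true if any of the operator's arguments is a dictionary,
    // in which case the operator cannot safely be queued/serialized.
    function argsContainDict(args) {
      for (var i = 0, ii = args ? args.length : 0; i < ii; i++) {
        if (isDict(args[i])) {
          return true;
        }
      }
      return false;
    }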
Some bad PDF generators, in particular "Scribus PDF", duplicate resources *a lot* at various levels of the PDF files. This can lead to `PartialEvaluator_hasBlendModes` taking an unreasonable amount of time to complete.
The reason is that the current code is using `Dict_getAll`, which recursively dereferences *all* indirect objects, which can be really slow. This patch instead uses `Dict_getKeys`, and then manually looks up only the necessary indirect objects.
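Roughly, the new approach looks like this (simplified; `isDict`/`isName` are the existing helpers, and `Dict_get` already resolves individual indirect objects):

    // Look up only the 'BM' entries we actually need, instead of letting
    // `Dict_getAll` dereference every indirect object recursively.
    function hasNonNormalBlendMode(resources) {
      var graphicStates = resources.get('ExtGState');
      if (!isDict(graphicStates)) {
        return false;
      }
      var keys = graphicStates.getKeys();
      for (var i = 0, ii = keys.length; i < ii; i++) {
        var graphicState = graphicStates.get(keys[i]);
        var bm = graphicState.get('BM');
        if (isName(bm) && bm.name !== 'Normal') {
          return true;
        }
      }
      return false;
    }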
I've added the PDF file as a `load` test. The most important thing here is probably to ensure that the file remains available in the repo, and the comment should help reduce the chance of regressions. (Note that locally, the `load` test times out without this patch, but we cannot really assume that that always happens.)
Fixes 6961.
*This patch is based on something I noticed while debugging some of the PDF files in issue 6931.*
In a number of the cases in `setGState`, we're implicitly assuming that we're not dealing with indirect objects (i.e. `Ref`s). See e.g. the 'Font' case, or the various cases where we simply do `gStateObj.push([key, value]);` (since the code in `canvas.js` won't be able to deal with a `Ref` for those cases).
The reason that I didn't use `Dict_forEach` instead, is that it would re-introduce the unnecessary closures that PR 5205 removed.
The intention of PR 5192 was to avoid adding empty `setGState` ops to the operatorList. But the patch accidentally used `>=` instead of `>`, so the check is not actually working as intended: an empty array has `length === 0`, which still satisfies `length >= 0`.
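The fix is a one-character change to the guard around the existing `operatorList.addOp` call; roughly:

    // Only queue non-empty state changes; the old check used `>= 0`,
    // which even an empty array satisfies.
    if (gStateObj.length > 0) {
      operatorList.addOp(OPS.setGState, [gStateObj]);
    }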
Even though the currently known test-cases render correctly without this patch, that seems more like a lucky coincidence, given that there's no guarantee that `transferMap[255] === 0` for every possible transfer function.
This patch fixes an issue that I inadvertently introduced in PR 5815, where we accidentally modify the `Differences` array in the encoding dictionary for indirect objects.
Instead of this change, we could also have used the now existing `Dict_getArray`. However in this case I don't think that would have been a good idea, since it would mean iterating through the array *twice*.
This patch goes a bit further than issue 6612 requires, and replaces all kinds of whitespace with standard spaces.
When testing this locally, it actually seemed to slightly improve two existing test-cases (`tracemonkey-text` and `taro-text`).
Fixes 6612.
In the `RenderPageRequest` handler in `worker.js`, we attempt to print an `info` message containing the rendering time and the length of the operator list. The latter is currently broken (and has been for quite some time), since the `length` of an `OperatorList` is reset when flushing occurs.
This patch attempts to rectify this, by adding a getter which keeps track of the total length.
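A minimal sketch of the idea (simplified from the real `OperatorList`):

    function OperatorList() {
      this.fnArray = [];
      this.argsArray = [];
      this._totalLength = 0;
    }
    OperatorList.prototype = {
      get length() {
        return this.argsArray.length; // reset on every flush
      },
      // Survives flushing, unlike `length`.
      get totalLength() {
        return this._totalLength + this.length;
      },
      flush: function() {
        this._totalLength += this.length;
        // ... transfer the current chunk to the main thread ...
        this.fnArray.length = 0;
        this.argsArray.length = 0;
      }
    };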
`operatorList.addOp` adds the arguments to the list which is then
passed as-is by postMessage to the main thread. But since we don't
parse these operations, they are raw PDF objects and may therefore
cause a serialization error.
This is a conservative patch, and only affects operators which are
known to be unsupported. We should ignore all unknown operators,
but I haven't really looked into the consequences of doing that.
Fixes #6549
According to section 5.3.2 of the PDF spec, a positive adjustment
value means, in horizontal writing mode, that the next glyph is placed
further to the left (so the glyph is effectively narrower), and in
vertical writing mode, that it is placed further down (so the glyph is
effectively wider).
This change fixes the way PDF.js has interpreted the value.
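For reference, the displacement formula from that section makes the
sign explicit; a positive `Tj` is subtracted (the helper below is
hypothetical, just to illustrate the sign):

    // tx = ((w0 - Tj / 1000) * Tfs + Tc + Tw) * Th
    function applyAdjustment(w0, adjustment, fontSize) {
      return (w0 - adjustment / 1000) * fontSize;
    }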
CMaps may be sparse. Array.prototype.forEach is terribly slow in Chrome
(and also in Firefox) when the sparse array contains a key with a high
value. E.g.
    console.time('forEach sparse');
    var a = [];
    a[0xFFFFFF] = 1;
    a.forEach(function(){});
    console.timeEnd('forEach sparse');
    // Chrome: 2890ms
    // Firefox: 1345ms
Switching to CMap.prototype.forEach, which is optimized for such
scenarios, fixes the problem.
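A sketch of what such a forEach can look like, assuming the codes live
in an internal (possibly sparse) array:

    function CMap() {
      this._map = [];
    }
    CMap.prototype.forEach = function(callback) {
      var map = this._map;
      var length = map.length;
      if (length <= 0x10000) {
        // Small/dense map: a plain indexed loop is cheap.
        for (var i = 0; i < length; i++) {
          if (map[i] !== undefined) {
            callback(i, map[i]);
          }
        }
      } else {
        // Sparse map: for..in visits only the keys that actually exist
        // (note that they arrive as strings), avoiding the slow path
        // that Array.prototype.forEach hits here.
        for (var key in map) {
          callback(key, map[key]);
        }
      }
    };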
Currently if a font contains a `toUnicode` entry, we always create a new `ToUnicodeMap` in evaluator.js. This is done even for `IdentityV/IdentityH`, despite the possibility of using the much more compact `IdentityToUnicodeMap` representation.
This patch refactors the `IdentityH/IdentityV` cases, to:
- Avoid calling `IdentityCMap.getMap`, since that would allocate and iterate through an array with 65536 elements.
- Ensure that the handling of `toUnicode` is actually correct in fonts.js.
We rely on `toUnicode instanceof IdentityToUnicodeMap` in a few places, and currently this does not work correctly for `IdentityH/IdentityV`.
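A simplified sketch of the compact representation: values are computed on demand, so no 65536-element array is ever built, and `toUnicode instanceof IdentityToUnicodeMap` becomes a cheap, reliable check:

    function IdentityToUnicodeMap(firstChar, lastChar) {
      this.firstChar = firstChar;
      this.lastChar = lastChar;
    }
    IdentityToUnicodeMap.prototype = {
      get: function(i) {
        return (this.firstChar <= i && i <= this.lastChar) ?
          String.fromCharCode(i) : undefined;
      },
      has: function(i) {
        return this.firstChar <= i && i <= this.lastChar;
      }
    };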
As described in #5444, the evaluator will perform identity checking of
paintImageMaskXObjects to decide if it can use
paintImageMaskXObjectRepeat instead of paintImageMaskXObjectGroup.
This can only ever work if the entry is a cache hit. However, the
previous caching implementation did lazy caching, which would only
consider an image cache-worthy once it was repeated; only then would
the repeated instance be cached.
As a result of this the sequence of identical images A1 A2 A3 A4 would
be seen as A1 A2 A2 A2 by the evaluator, which prevents using the
"repeat" optimization. Also only the last encountered image is cached,
so A1 B1 A2 B2, would stay A1 B1 A2 B2.
The new implementation drops the "lazy" init of the cache. The threshold
for enabling an image to be cached is rather small, so the potential waste
in storage and adler32 calculation is rather low. It also caches any
eligible image by its adler32.
The two examples from above would now be A1 A1 A1 A1 and A1 B1 A1 B1,
which not only saves temporary storage, but also prevents computing
identical masks over and over again (which is the main performance
impact of #2618).
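For reference, a minimal sketch of the Adler-32 checksum used as the
cache key, computed over the image's raw bytes:

    function computeAdler32(bytes) {
      var a = 1, b = 0;
      for (var i = 0, ii = bytes.length; i < ii; i++) {
        a = (a + (bytes[i] & 0xff)) % 65521;
        b = (b + a) % 65521;
      }
      return (b << 16) | a;
    }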
setGStateForKey() is a closure that serves no particularly useful
purpose. This change inlines it at the single call site. This avoids 1.7
MiB of allocations (because closures are objects) for the MTA map
mentioned in https://bugzilla.mozilla.org/show_bug.cgi?id=835380#c17.
For the document in #2504, 11% of the ops are `setGState` with a
`gStateObj` that is an empty array, which is a no-op. This is possible
because we ignore various setGState keys (OP, OPM, BG, etc.).
This change prevents these ops from being inserted into the operator
list.
EvaluatorPreprocessor_read() is called in two cases. For the normal
layer, the args array it produces is used beyond the bounds of the loop
in which EvaluatorPreprocessor_read() is called.
But for the text layer, the args array is used in a very short-term
fashion. This change reworks things so that a single array is repeatedly
used for the text layer. This reduces total JS allocations for the
Spoorkaart map by 11%, and has similar effects on many other PDFs.
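The reuse pattern, sketched (simplified from the text-layer loop):

    var operation = {};
    var args = [];
    while (true) {
      args.length = 0;       // empty the array without reallocating it
      operation.args = args; // read() refills this same array each time
      if (!preprocessor.read(operation)) {
        break;
      }
      // ... consume operation.fn / operation.args in this iteration ...
    }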
After PR 4982, the rendering of the first two pages of http://www.openmagazin.cz/pdf/2011/openMagazin-2011-04.pdf (from issue 215) no longer completes.
The issue is that we cannot have `args === null` in `PartialEvaluator_buildPath`, but *must* use an empty array instead.
In this patch I've also moved the `argsLength` variable definition in `EvaluatorPreprocessor_read`, to make sure that it's always defined.
QueueOptimizer is really hard to read. Enough so that it's blocking my
efforts to streamline the representation used for operator lists.
This patch improves its readability in the following ways.
- More descriptive variable names make the sequence checking much clearer,
as do additional comments.
- The addState() functions now return the index of the first op past the
sequence, instead of setting context.currentOperation to the last op of
the sequence.
- The loop in optimize() is clearer.
- The array modification in the fourth addState() function is much clearer
-- we're just removing trios of ops.
- All four |addState| functions are now more consistent with each other.
I used some debug printfs to find documents where these optimizations are
used and then checked that the number of optimized ops was the same before
and after my changes.
This function can be called 100s of 1000s or even millions of times, and the
allocated return object accounts for 10% of all GC thing allocations for some
documents. It's easy to avoid, which reduces stress on the garbage collector,
and this patch does that.
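The shape of the fix, as a toy example with hypothetical names:

    // Before: one new object is allocated per call.
    function readAlloc(source) {
      return { fn: source.fn, args: source.args };
    }
    // After: the caller owns a single object that gets filled in.
    function readInto(source, operation) {
      operation.fn = source.fn;
      operation.args = source.args;
      return true;
    }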
PartialEvaluator.getTextContent() builds up textChunk strings 1 char at a time,
creating many 100s of 1000s of intermediate strings along the way. This patch
makes it instead push chars to an array and then join them at the end, as we
have done in numerous other places.
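The pattern in miniature (illustrative data):

    var chars = ['p', 'd', 'f']; // stand-in for per-glyph output

    // Before: each += creates a new intermediate string.
    var str = '';
    for (var i = 0; i < chars.length; i++) {
      str += chars[i];
    }

    // After: collect the pieces and build the string once.
    var strBuf = [];
    for (var j = 0; j < chars.length; j++) {
      strBuf.push(chars[j]);
    }
    str = strBuf.join('');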
The function `assertWellFormed` was doing nothing different from `assert`, which
is available in the same namespace. Removing it will lighten the file size,
albeit very slightly, and reduce complexity.
Different fonts can point to the same font descriptor
(see https://github.com/mozilla/pdf.js/issues/4339 for details). With this
commit such fonts are treated as aliases if they also have the same encoding
and the same toUnicode map. The corresponding info is stored on the font
descriptor. This change must also ensure that aliases always use the same font
name, because translated fonts can get cleared depending on the CLEANUP_TIMEOUT
setting.
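Sketched with illustrative names (`computeHash` and `nextFontID` are
stand-ins; a real hash such as MurmurHash3 could back `computeHash`):

    // Store the alias table directly on the shared font descriptor.
    if (!descriptor.fontAliases) {
      descriptor.fontAliases = Object.create(null);
    }
    var hash = computeHash(encoding, toUnicodeData);
    if (!descriptor.fontAliases[hash]) {
      // The first translated font wins; later aliases reuse its name.
      descriptor.fontAliases[hash] = { fontID: nextFontID() };
    }
    var fontID = descriptor.fontAliases[hash].fontID;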
This makes the code much simpler, and the extra memory use is tiny -- a vanilla
1000-element array is only 4000 bytes larger than a Uint32Array of the same
size.
When decoding a stream, the decode buffer is often grown multiple times, its
byte size increasing like so: 512, 1024, 2048, etc. This patch estimates the
minimum size in advance (using the length of the encoded stream), often
allowing the smaller sizes to be skipped. It also renames numerous |length|
variables as |maybeLength| to make it clear that they can be |null|.
I measured this change on eight documents. This change reduces the cumulative
size of decode buffer allocations by 0--32%, with 10--20% being typical. This
reduces peak RSS by 10 or 20 MiB for several of them.
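A sketch of the growth strategy (simplified; `minBufferLength` is the
power-of-two estimate precomputed from the encoded length):

    function ensureBuffer(stream, requested) {
      var buffer = stream.buffer;
      if (buffer && requested <= buffer.byteLength) {
        return buffer;
      }
      var size = stream.minBufferLength; // estimated minimum, >= 512
      while (size < requested) {
        size *= 2;
      }
      var newBuffer = new Uint8Array(size);
      if (buffer) {
        newBuffer.set(buffer); // carry over the already-decoded bytes
      }
      return (stream.buffer = newBuffer);
    }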
Added an "InteractiveAnnotation" class to homogenize the annotations' structure (highlighting) and user interactions (for now, used for text and link annotations).
Text annotations:
- The appearance (AP) has priority over the icon (Name).
- The popup extends horizontally (up to a limit) as well as vertically.
- Reduced the title's font size.
- The annotation's color (C) is used to color the popup's background.
- On top of the mouseover show/hide behavior, a click on the icon will lock the annotation open (for mobile purposes). It can be closed with another click on either the icon or the popup.
Whether an annotation is printed is determined by its "print" bit.
Unsupported annotations are not displayed at all.