pdf.js

Author	SHA1	Message	Date
Calixte Denizet	9dae421a0d	Handle all the whitespaces the same way when creating text chunks	2022-01-15 21:44:00 +01:00
Tim van der Meij	922dac035c	Merge pull request #14448 from Snuffleupagus/Type3-circular-refs Prevent circular references in Type3 fonts	2022-01-15 14:11:47 +01:00
Tim van der Meij	a72d188599	Merge pull request #14439 from Snuffleupagus/issue-14438 Ignore Annotations with empty /Rect-entries in the display-layer (issue 14438)	2022-01-15 14:11:25 +01:00
Tim van der Meij	78f160b656	Merge pull request #14453 from Snuffleupagus/viewer-documenterror Dispatch a "documenterror" event in `PDFViewerApplication._documentError` (issue 14451)	2022-01-15 14:00:16 +01:00
Tim van der Meij	c0d2932faf	Merge pull request #14454 from Snuffleupagus/util-more-unreachable Replace some `assert` usage with `unreachable` in the `src/shared/util.js` file	2022-01-15 13:52:10 +01:00
Tim van der Meij	625f829842	Merge pull request #14446 from Snuffleupagus/issue-14435 Expose even more API-functionality in the TypeScript definitions (issue 14435, PR 14013 follow-up)	2022-01-15 13:46:11 +01:00
Jonas Jenwald	0e1b93bf20	Replace some `assert` usage with `unreachable` in the `src/shared/util.js` file Inlining the checks should be a tiny bit more efficient, since it avoids have to make unconditional function calls in these fairly commonly used helper functions.	2022-01-15 13:01:25 +01:00
Jonas Jenwald	bf8a58e5e3	Dispatch a "documenterror" event in `PDFViewerApplication._documentError` (issue 14451) Please note: This is a tentative patch, since I don't know if this is deemed important enough to fix. The new event could be seen as a supplement to the existing "documentinit" and "documentloaded" events, but for the case when a PDF document fails to load. To make the "documenterror" event generally useful, it'll include both the localized error message as well as the original reason for the error (when that exists).	2022-01-15 11:55:44 +01:00
Jonas Jenwald	e0032811cd	Merge pull request #14450 from Snuffleupagus/issue-14449 Add (basic) UTF-8 support in the `stringToPDFString` helper function (issue 14449)	2022-01-14 20:55:13 +01:00
Jonas Jenwald	12d8f0b64d	Re-factor the `stringToPDFString` helper function for UTF-16 strings This patch changes the function to instead utilize the `TextDecoder` for both kinds of UTF-16 BOM strings.	2022-01-14 20:38:40 +01:00
Jonas Jenwald	76444888fb	Add (basic) UTF-8 support in the `stringToPDFString` helper function (issue 14449) This patch implements this by looking for the UTF-8 BOM, i.e. `\xEF\xBB\xBF`, in order to determine the encoding.[1] The actual conversion is done using the `TextDecoder` interface, which should be available in all environments/browsers that we support; please see https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder#browser_compatibility --- [1] Assuming that everything lacking a UTF-16 BOM would have to be UTF-8 encoded really doesn't seem correct.	2022-01-14 18:57:07 +01:00
Jonas Jenwald	4c55563574	Add an additional test-case for circular references in Type3 fonts The PDF document in this patch already worked without the previous patch, but I wanted to improve our test-coverage for the Type3-parsing. The attached PDF document was also found in https://github.com/pdf-association/safedocs/tree/main/Miscellaneous%20Targeted%20Test%20PDFs	2022-01-13 17:59:57 +01:00
Jonas Jenwald	53d4ee7990	Prevent circular references in Type3 fonts In corrupt PDF documents Type3 fonts may introduce circular dependencies, thus resulting in the affected font(s) never loading and parsing/rendering never completing. Note that I've not seen any real-world examples of this kind of font corruption, but the attached PDF document was rather found in https://github.com/pdf-association/safedocs/tree/main/Miscellaneous%20Targeted%20Test%20PDFs Please note: That repository contains a number of reduced test-cases that are specifically intended to test interoperability (between PDF viewer) and parsing/rendering for various kinds of strange/corrupt PDF documents. Some of the test-cases found there may thus not make sense to try and "fix" upfront, in my opinion, unless the problems are also found in real-world PDF documents.	2022-01-13 17:58:37 +01:00
Jonas Jenwald	b9849e38b8	Expose even more API-functionality in the TypeScript definitions (issue 14435, PR 14013 follow-up) While `PageViewport` apparently makes sense in TypeScript environments, given that it's being returned by the `PDFPageProxy.getViewport`-method in the API, we really don't want to extend the public API by simply exporting the class directly in `src/pdf.js` since it should never be called/initialized manually. Hence we follow the same pattern as in PR 14013, and also extend the API unit-tests to ensure that `PDFPageProxy.getViewport` always returns a `PageViewport`-instance as expected.	2022-01-13 12:05:40 +01:00
Tim van der Meij	ea57ef116e	Merge pull request #14443 from Snuffleupagus/issue-14442 Prevent run-time errors in `BaseViewer` when it's falling back to `SimpleLinkService` (issue 14442, PR 14295 follow-up)	2022-01-12 20:09:14 +01:00
Jonas Jenwald	8286066372	Prevent run-time errors in `BaseViewer` when it's falling back to `SimpleLinkService` (issue 14442, PR 14295 follow-up)	2022-01-12 17:04:51 +01:00
Jonas Jenwald	08d88a0235	Ignore Annotations with empty /Rect-entries in the display-layer (issue 14438) This prevents the `BaseSVGFactory.create`-method from throwing, and thus preventing any remaining Annotations (on the page) from rendering in corrupt documents.	2022-01-11 13:54:35 +01:00
Tim van der Meij	236c8d4786	Merge pull request #14432 from Snuffleupagus/update-packages Update packages and translations	2022-01-09 15:13:53 +01:00
Jonas Jenwald	365538a383	Update l10n files	2022-01-09 11:32:34 +01:00
Jonas Jenwald	457ff0d54a	Update Jasmine to version 4 For the unit-tests that were updated in this patch, note that I settled on simply using `toEqual` comparisons rather than updating the custom matchers (since those don't seem necessary any more). Please refer to the following resources for additional information: - https://github.com/jasmine/jasmine/blob/main/release_notes/4.0.0.md - https://github.com/jasmine/jasmine-npm/blob/main/release_notes/4.0.0.md - https://jasmine.github.io/tutorials/upgrading_to_Jasmine_4.0	2022-01-09 11:32:34 +01:00
Jonas Jenwald	38e574f1d5	Update npm packages	2022-01-09 10:49:21 +01:00
Tim van der Meij	8ac0ccc227	Merge pull request #14424 from Snuffleupagus/mv-addLinkAttributes [api-minor] Move `addLinkAttributes`, `LinkTarget`, and `removeNullCharacters` into the viewer (PR 14092 follow-up)	2022-01-08 13:19:11 +01:00
Tim van der Meij	8cf0a8c357	Merge pull request #14423 from Snuffleupagus/rm-getViewerConfiguration-eventBus Remove the `eventBus` parameter from `getViewerConfiguration`	2022-01-08 13:02:02 +01:00
calixteman	f25e95a2b4	Merge pull request #14429 from calixteman/14306 [JS] Fix few errors around AFSpecial_Keystroke	2022-01-08 03:50:05 -08:00
Calixte Denizet	6369617e6f	[JS] Fix few errors around AFSpecial_Keystroke - @cincodenada found some errors which are fixed in this patch; - it partially fixes issue #14306; - add some tests.	2022-01-08 12:34:56 +01:00
Calixte Denizet	9bb636402a	Use the correct dimension to know if we have to add an EOL in vertical mode	2022-01-07 15:19:03 +01:00
Jonas Jenwald	7b8794b37e	[api-minor] Move `removeNullCharacters` into the viewer This helper function has never been used in e.g. the worker-thread, hence its placement in `src/shared/util.js` led to a small amount of unnecessary duplication. After the previous patches this helper function is now only used in the viewer, hence it no longer seems necessary to expose it through the official API. Please note: It seems somewhat unlikely that third-party users were relying directly on this helper function, which is why it's not being exported as part of the viewer components. (If necessary, we can always change this later on.)	2022-01-06 12:25:33 +01:00
Jonas Jenwald	00aa9811e6	Convert the `pagesRefCache`, on `PDFLinkService`, from an Object to a Map This seems like a more appropriate data structure, and as part of these changes the property was also converted to a private one.	2022-01-06 12:25:33 +01:00
Jonas Jenwald	fc31e1ba87	Convert the `isValidExplicitDestination` helper to a private static method on `PDFLinkService` This patch also changes a previously "private" method, on `PDFLinkService`, to be properly private since that's now supported.	2022-01-06 12:25:33 +01:00
Jonas Jenwald	2d2b6463b8	[api-minor] Move `addLinkAttributes` and `LinkTarget` into the viewer As part of the changes/improvement in PR 14092, we're no longer using the `addLinkAttributes` directly in e.g. the AnnotationLayer-code. Given that the helper function is now only used in the viewer, hence it no longer seems necessary to expose it through the official API. Please note: It seems somewhat unlikely that third-party users were relying directly on the helper function, which is why it's not being exported as part of the viewer components. (If necessary, we can always change this later on.)	2022-01-06 12:25:33 +01:00
Jonas Jenwald	08256e6795	Remove the `eventBus` parameter from `getViewerConfiguration` This structure contains almost exclusively references to DOM elements (and a couple of simple strings), rather than complete classes/functions. Hence the `eventBus`-option sticks out a fair bit, and I'd guess that it's mostly unused in e.g. third-party implementations. Given that we, in multiple places, mention that the default viewer shouldn't be used as-is I really don't think that we need to keep this special `eventBus`-option around. Furthermore, nowadays it's also a lot easier to (safely) access the existing `EventBus`-instance in the viewer; see https://github.com/mozilla/pdf.js/wiki/Third-party-viewer-usage#initialization-promise which shows how to listen for the default viewer being initialized (and its `eventBus` thus being available).	2022-01-06 12:18:04 +01:00
Jonas Jenwald	290cbc5232	Merge pull request #14418 from calixteman/14415 Use positive dimensions for text chunks in the text layer (issue #14415)	2022-01-05 12:00:36 +01:00
Calixte Denizet	6cdae5ac4d	Use positive dimensions for text chunks in the text layer (issue #14415 ).	2022-01-05 10:49:56 +01:00
Jonas Jenwald	568633cf62	Merge pull request #14417 from mozilla/revert-14367-integration-tests Revert "Disable failing print actions integration test in Firefox"	2022-01-04 14:37:15 +01:00
Jonas Jenwald	2722deb610	Revert "Disable failing print actions integration test in Firefox"	2022-01-04 14:19:27 +01:00
Jonas Jenwald	2ca432d318	Merge pull request #14413 from timvandermeij/drop-beta Drop the beta logic from the Gulpfile/website/`pdfjs.config` file	2022-01-02 15:00:50 +01:00
Tim van der Meij	378c08a9b1	Drop the beta logic from the Gulpfile/website/`pdfjs.config` file From now on we only make stable releases, so the beta logic should be removed to simplify the code.	2022-01-02 14:38:36 +01:00
Tim van der Meij	f287c5f817	Merge pull request #14411 from Snuffleupagus/getAllPageDicts-async Convert `Catalog.getAllPageDicts` to an `async` method	2022-01-01 14:43:20 +01:00
Jonas Jenwald	b0e774d9c5	Convert `Catalog.getAllPageDicts` to an `async` method The patch in PR 14335 essentially re-introduced the old code from before PR 3848, however looking at this code a bit closer it should be possible to simplify it by making the method asynchronous. While this method is currently only used as a fallback in corrupt documents, the way that `MissingDataException`s are handled is less than ideal. Note that if a `MissingDataException` is thrown, we're forced to re-parse the entire /Pages tree[1]. With this method now being asynchronous, we're able to handle fetching of References in a much easier/nicer way than before without having to throw `MissingDataException`s and re-parse anything. These changes also let us simplify the call-site slightly, by calling the method directly instead of using the `PDFManager`-instance (since again it will no longer throw `MissingDataException`s). Furthermore, this patch contains the following other changes: - Reduce unnecessary duplication in the various `catch` handlers throughout the method, by simply moving the `XRefEntryException` handling into the `addPageError` helper function instead. - Move the "circular references"-check to occur slightly earlier, since there's obviously no point in asynchronously fetching data just to then throw an Error immediately afterwards. --- [1] Imagine e.g. a thousand page document, where there's a `MissingDataException` thrown when fetching/parsing page 900.	2021-12-31 22:03:10 +01:00
Tim van der Meij	3d7bb6c38d	Merge pull request #14409 from Snuffleupagus/getPageIndex-better-caching Improve caching for the `Catalog.getPageIndex` method (PR 13319 follow-up)	2021-12-31 19:19:14 +01:00
Jonas Jenwald	1491459dea	Improve caching for the `Catalog.getPageIndex` method (PR 13319 follow-up) This method is now being used a lot more, compared to when it's added, since it's now used together with scripting as part of the `PDFDocument.fieldObjects` parsing (called during viewer initialization). For /Page Dictionaries that we've already parsed, the `pageIndex` corresponding to a particular Reference is already known and we're thus able to skip all parsing in the `Catalog.getPageIndex` method for those cases.	2021-12-29 20:29:14 +01:00
Jonas Jenwald	a20393e6e4	Update `PDFDocument._getLinearizationPage` to do the /Type-check correctly (PR 14400 follow-up) I forgot about this in PR 14400, since we should obviously be consistent and given that the existing check is actually wrong; sorry about this!	2021-12-29 13:26:58 +01:00
Jonas Jenwald	b99927e1ee	Improve the API unit-tests for scripting-related functionality I happened to notice that we didn't have any unit-tests for either `getFieldObjects` or `getCalculationOrderIds`, on the `PDFDocumentProxy` class, which seems unfortunate since it's API functionality that we depend on in e.g. the viewer.	2021-12-29 12:57:32 +01:00
Tim van der Meij	e42d54e1b5	Merge pull request #14400 from Snuffleupagus/getPageDict-async [api-minor] Convert `Catalog.getPageDict` to an asynchronous method	2021-12-28 19:40:34 +01:00
Tim van der Meij	01b25b2612	Merge pull request #14391 from KouWakai/annot-border-correct Handle non-integer Annotation border widths correctly (issue 14203)	2021-12-28 19:28:32 +01:00
Tim van der Meij	07c32f0f4f	Merge pull request #14401 from Snuffleupagus/update-packages Update packages and translations	2021-12-28 19:17:31 +01:00
Jonas Jenwald	ea55e8bf41	Update l10n files	2021-12-26 11:19:19 +01:00
Jonas Jenwald	69f14b1ee9	Update npm packages	2021-12-26 11:09:29 +01:00
Jonas Jenwald	b513c64d9d	[api-minor] Convert `Catalog.getPageDict` to an asynchronous method Besides converting `Catalog.getPageDict` to an `async` method, thus simplifying the code, this patch also allows us to pro-actively fix a existing issue. Note how we're looking up References in such a way that `MissingDataException`s won't cause trouble, however it's technically possible that the entries (i.e. /Count, /Kids, and /Type) in a /Pages Dictionary could actually be indirect objects as well. In the existing code this could lead to some, or even all, pages failing to load/render as intended. In practice that doesn't appear to happen in real-world PDF documents, but given all the weird things that PDF software do I'd prefer to fix this pro-actively (rather than waiting for a bug report). With `Catalog.getPageDict` being `async` this is now really simple to address, however I didn't want to introduce a bunch more unconditional asynchronicity in this method if it could be avoided (since that could slow things down). Hence we'll synchronously lookup the raw data in a /Pages Dictionary, and only fallback to asynchronous data lookup when a Reference was encountered. In addition to the above, this patch also makes the following notable changes: - Let `Catalog.getPageDict` consistently reject with the actual error, regardless of what data we're fetching. Previously we'd "swallow" the actual errors except when looking up Dictionary entries, which is inconsistent and thus seem unfortunate. As can be seen from the updated unit-tests this change is API-observable, hence why the patch is tagged `[api-minor]`. - Improve the consistency of the Dictionary /Type-checks in both the `Catalog.getPageDict` and `Catalog.getAllPageDicts` methods. In `Catalog.getPageDict` there's a fallback code-path where we're incorrectly checking the /Page Dictionary for a /Contents-entry, which is wrong since a /Page Dictionary doesn't need to have a /Contents-entry in order to be valid. For consistency the `Catalog.getAllPageDicts` method is also updated to handle errors in the /Type-lookup correctly. - Reduce the `PagesCountLimit.PAUSE_EAGER_PAGE_INIT` viewer constant, to further improve loading/rendering performance of the second page during initialization of very long documents; PR 14359 follow-up.	2021-12-25 15:22:48 +01:00
KouWakai	98158b67a3	Handle non-integer Annotation border widths correctly (issue 14203) The existing code appears to be wrong, since according to the PDF specification the border width of an Annotation only has to be a number and not specifically an integer. Please see: - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=392 - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2096210 - https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G6.1965562	2021-12-24 22:10:19 +09:00
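
Commits 76444888fb and 12d8f0b64d above describe picking an encoding from a leading byte-order mark and leaving the conversion to `TextDecoder`. The following is a minimal sketch of that idea only, not the pdf.js implementation: the name `decodeWithBom` is hypothetical, the input is assumed to be a "binary string" with one byte per character, and returning BOM-less input unchanged is an assumption made to keep the example short.

```javascript
// Hypothetical helper illustrating BOM-based decoding; not taken from pdf.js.
function decodeWithBom(str) {
  let encoding = null;
  let bomLength = 0;

  if (str.startsWith("\xFE\xFF")) {
    encoding = "utf-16be";
    bomLength = 2;
  } else if (str.startsWith("\xFF\xFE")) {
    encoding = "utf-16le";
    bomLength = 2;
  } else if (str.startsWith("\xEF\xBB\xBF")) {
    encoding = "utf-8";
    bomLength = 3;
  }

  if (encoding === null) {
    // Assumption: without a BOM the string is handed back as-is.
    return str;
  }

  // Copy the bytes that follow the BOM into a typed array for TextDecoder.
  const bytes = new Uint8Array(str.length - bomLength);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = str.charCodeAt(i + bomLength) & 0xff;
  }
  return new TextDecoder(encoding).decode(bytes);
}
```

For example, `decodeWithBom("\xEF\xBB\xBF\xC3\xA9")` decodes the two UTF-8 bytes after the BOM into a single `é`. Skipping the BOM by hand keeps the result independent of the decoder's own BOM handling.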

15268 Commits