Sakurai/pdf.js

Author	SHA1	Message	Date
Calixte Denizet	cfb908c999	Add a cache to avoid loading a local font several times On my computer, it takes a few tenths of a second to load a local font. Since a font can be used several times in a document, the cache will improve performance.	2023-05-10 20:01:21 +02:00
Jonas Jenwald	d950b91c4e	Introduce some logical assignment in the `src/core/` folder	2023-04-29 13:49:37 +02:00
Jonas Jenwald	5f64621d46	Use `String.prototype.replaceAll()` where appropriate This fairly new method allows replacing multiple occurrences within a string without having to use regular expressions. Please refer to: - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replaceAll - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replaceAll#browser_compatibility	2023-03-22 15:31:10 +01:00
Jonas Jenwald	23930a249e	[api-minor] Let `Catalog.getAllPageDicts` return an empty dictionary when loading the first /Page fails (issue 15590) In order to support opening certain corrupt PDF documents, particularly hand-edited ones, this patch adds support for letting the `Catalog.getAllPageDicts` method fallback to returning an empty dictionary to replace (only) the first /Page of the document. Given that the viewer cannot initialize/load without access to the first page, this will thus allow e.g. document-level scripting to run as expected. Note that by effectively replacing a corrupt or missing first /Page in this way[1], we'll now render nothing but a blank page for certain cases of broken/corrupt PDF documents which may look weird. Please note: This functionality is controlled via the existing `stopAtErrors` option, that can be passed to `getDocument`, since it's easy to imagine use-cases where this sort of fallback behaviour isn't desirable. --- [1] Currently we still require that a /Pages-dictionary is found though, however it may be possible to relax even that assumption if that becomes absolutely necessary in future corrupt documents.	2022-11-03 12:51:48 +01:00
Jonas Jenwald	d470010293	Re-factor the PDF version parsing in the worker-thread Part of this is very old code, and back when support for parsing the catalog-version was added things became less clear (in my opinion). Hence this patch tries to improve things, by e.g. validating the header- and catalog-version separately.	2022-10-15 12:06:39 +02:00
Jonas Jenwald	ce66fefbff	[api-minor] Add partial support for the "GoToE" action (issue 8844) Please note: The referenced issue is the only mention that I can find, in either GitHub or Bugzilla, of "GoToE" actions. Hence I've purposely settled for a very simple, and partial, "GoToE" implementation to avoid complicating things initially.[1] In particular, this patch only supports "GoToE" actions that reference the /EmbeddedFiles-dict in the PDF document. See https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2048909 --- [1] I usually prefer having real-world test-cases to work with whenever I'm implementing new features.	2022-10-06 10:33:07 +02:00
Jonas Jenwald	60f6272ed9	Use more `for...of` loops in the code-base Most, if not all, of this code is old enough to predate the general availability of `for...of` iteration.	2022-10-03 13:08:38 +02:00
Jonas Jenwald	cc4baa2fe9	[api-minor] Add basic support for the `SetOCGState` action (issue 15372) Note that this patch implements the `SetOCGState`-handling in `PDFLinkService`, rather than as a new method in `OptionalContentConfig`[1], since this action is nothing but a series of `setVisibility`-calls and that it seems quite uncommon in real-world PDF documents. The new functionality also required some tweaks in the `PDFLayerViewer`, to ensure that the `layersView` in the sidebar is updated correctly when the optional-content visibility changes from "outside" of `PDFLayerViewer`. --- [1] We can obviously move this code into `OptionalContentConfig` instead, if deemed necessary, but for an initial implementation I figured that doing it this way might be acceptable.	2022-09-01 17:34:24 +02:00
Jonas Jenwald	216b86a082	[api-minor] Support Named-actions in the outline (issue 15367) Apparently this is implemented in e.g. Adobe Reader, and the specification does support it, however it cannot be commonly used in real-world PDF documents since it took over ten years for this feature to be requested.	2022-08-30 18:47:45 +02:00
Calixte Denizet	5f0c95e70e	[JS] Embedded JS scripts can have some null chars	2022-07-15 16:05:25 +02:00
Jonas Jenwald	9ac4536693	Enable the `unicorn/prefer-at` ESLint plugin rule (PR 15008 follow-up) Please find additional information here: - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/at - https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/docs/rules/prefer-at.md	2022-06-09 21:21:19 +02:00
Jonas Jenwald	df5a4fd0a7	Support encoded dest-strings in /GoTo destination dictionaries (issue 14864) Interestingly enough this appears to be the very first case of encoded dest-strings, in /GoTo destination dictionaries, that we've actually come across. What's really fascinating is that it's less than a week after issue 14847, given that these issues are somewhat similar.	2022-05-02 10:14:32 +02:00
Jonas Jenwald	71370d012b	Support destinations in NameTrees with encoded keys (issue 14847) Initially I considered updating the `NameOrNumberTree`-implementation to handle encoded keys, however that quickly became somewhat messy (especially in the `NameOrNumberTree.get`-method) since only NameTrees use string-keys. Hence the easiest solution, as far as I'm concerned, was to just update the `Catalog.destinations`-getter instead. Please note that in the referenced PDF document the `Catalog.destination`-method will thus fall back to fetching all destinations, which should be fine since this is the very first case of encoded keys that we've seen. Also changes the `NameOrNumberTree.getAll`-method to prevent a possible run-time error, although we've so far not seen such a case, for any non-Array Kids-entries found in a NameTree/NumberTree. Finally, to improve overall consistency and to hopefully prevent future bugs, the patch also updates a couple of other `NameTree` call-sites to correctly handle encoded keys. (Note that the `Catalog.attachments`-getter was already doing this.)	2022-04-27 11:19:55 +02:00
Jonas Jenwald	5bc7339c1b	Add support for the /Catalog Base-URI when resolving URLs (issue 14802) As far as I can tell, this is actually the very first time that we've seen a PDF document with a Base-URI specified in the /Catalog; please refer to the specification: https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#G11.2097122 To simplify the overall implementation, this new parameter is accessed via the existing `BasePdfManager.docBaseUrl`-getter and will thus override any user-specified `docBaseUrl` API-parameter.	2022-04-19 17:14:52 +02:00
Jonas Jenwald	a919959d83	Slightly simplify the `Catalog._readMarkInfo` method We don't need to first check if the Dictionary contains the key, since trying to get a non-existent key simply returns `undefined` and we're already ensuring that the value is a boolean. Furthermore, we shouldn't need to worry about the `Object.prototype` containing enumerable properties since the checks (in `src/core/worker.js`) done for `Array.prototype` indirectly also cover `Object`s. (Keep in mind that an `Array` is just a special kind of `Object` in JavaScript.)	2022-04-05 16:37:51 +02:00
Jonas Jenwald	addb4cb12b	Use `String.prototype.repeat()` in a couple of spots Rather than using a temporary Array to manually create repeated strings, we can use `String.prototype.repeat()` instead. The reason that we didn't use this from the start is most likely because some browsers, notably IE, didn't support this; note https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/repeat#browser_compatibility	2022-03-30 15:42:40 +02:00
Jonas Jenwald	c0736647f9	Add general iteration support in the `RefSet` and `RefSetCache` classes This patch removes the existing `forEach` methods, in favor of making the classes properly iterable instead. Given that the classes are using a `Set` respectively a `Map` internally, implementing this is very easy/efficient and allows us to simplify some existing code.	2022-03-18 14:27:34 +01:00
Jonas Jenwald	939e6f0c4c	Fix a couple of small typos in JSDoc `typedef` comments While this doesn't affect the official API documentation, these cases should nonetheless be fixed.	2022-03-04 12:11:52 +01:00
Jonas Jenwald	99cd24ce3e	Remove the `isString` helper function The call-sites are replaced by direct `typeof`-checks instead, which removes unnecessary function calls. Note that in the `src/`-folder we already had more `typeof`-cases than `isString`-calls.	2022-02-26 16:33:41 +01:00
Jonas Jenwald	3704283f5b	Remove the `isBool` helper function The call-sites are replaced by direct `typeof`-checks instead, which removes unnecessary function calls.	2022-02-23 13:31:03 +01:00
Jonas Jenwald	82f1ee1755	Re-factor the `Catalog.viewerPreferences` method This removes the `ViewerPreferencesValidators` structure, and thus (slightly) simplifies the code overall. With these changes we only have to iterate through, and validate, the actually available Dictionary entries.	2022-02-23 13:25:56 +01:00
Jonas Jenwald	05edd91bdb	Remove the `isNum` helper function The call-sites are replaced by direct `typeof`-checks instead, which removes unnecessary function calls. Note that in the `src/`-folder we already had more `typeof`-cases than `isNum`-calls. These changes were mostly done using regular expression search-and-replace, with two exceptions: - In `Font._charToGlyph` we no longer unconditionally update the `width`, since that seems completely unnecessary. - In `PDFDocument.documentInfo`, when parsing custom entries, we now do the `typeof`-check once.	2022-02-22 11:55:34 +01:00
Jonas Jenwald	b282814e38	Prefer `instanceof Name` rather than calling `isName()` with one argument Unless you actually need to check that something is both a `Name` and also of the correct type, using `instanceof Name` directly should be a tiny bit more efficient since it avoids one function call and an unnecessary `undefined` check. This patch uses ESLint to enforce this, since we obviously still want to keep the `isName` helper function for where it makes sense.	2022-02-21 12:45:00 +01:00
Jonas Jenwald	4df82ad31e	Prefer `instanceof Dict` rather than calling `isDict()` with one argument Unless you actually need to check that something is both a `Dict` and also of the correct type, using `instanceof Dict` directly should be a tiny bit more efficient since it avoids one function call and an unnecessary `undefined` check. This patch uses ESLint to enforce this, since we obviously still want to keep the `isDict` helper function for where it makes sense.	2022-02-21 12:44:56 +01:00
Jonas Jenwald	2cb2f633ac	Remove the `isRef` helper function This helper function is not really needed, since it's just a wrapper around a simple `instanceof` check, and it only adds unnecessary indirection in the code.	2022-02-19 15:33:42 +01:00
Jonas Jenwald	1a31855977	Remove the `isStream` helper function At this point all the various Stream-classes extend an abstract base-class, hence this helper function is no longer necessary and only adds unnecessary indirection in the code.	2022-02-17 13:51:36 +01:00
Jonas Jenwald	8836593b9e	Add a (global) cache to the `getCharUnicodeCategory` function Given that the regular expression has already become more complex (after the initial patch adding it), it seems to me that it probably cannot hurt to add a global cache to reduce unnecessary re-parsing. Obviously the `Glyph`-instances are being cached per font, however in most documents multiple fonts are being used and in practice there's very often a fair amount of overlap between the /ToUnicode-data in different fonts[1]. Consider for example loading and rendering the entire `tracemonkey.pdf` document (from the test-suite), which isn't a particularly large document. In that case the `getCharUnicodeCategory` function is being called a total of `601` times, however there are only `106` unique unicode-chars being checked. Please note: In practice I suppose that this won't have a huge effect on overall performance, however given the relative simplicity of this patch I figured that it'd not hurt to submit it for review. --- [1] Consider e.g. how there are usually different fonts used for regular, bold, and italic text respectively.	2022-01-25 09:59:34 +01:00
Jonas Jenwald	b0e774d9c5	Convert `Catalog.getAllPageDicts` to an `async` method The patch in PR 14335 essentially re-introduced the old code from before PR 3848, however looking at this code a bit closer it should be possible to simplify it by making the method asynchronous. While this method is currently only used as a fallback in corrupt documents, the way that `MissingDataException`s are handled is less than ideal. Note that if a `MissingDataException` is thrown, we're forced to re-parse the entire /Pages tree[1]. With this method now being asynchronous, we're able to handle fetching of References in a much easier/nicer way than before without having to throw `MissingDataException`s and re-parse anything. These changes also let us simplify the call-site slightly, by calling the method directly instead of using the `PDFManager`-instance (since again it will no longer throw `MissingDataException`s). Furthermore, this patch contains the following other changes: - Reduce unnecessary duplication in the various `catch` handlers throughout the method, by simply moving the `XRefEntryException` handling into the `addPageError` helper function instead. - Move the "circular references"-check to occur slightly earlier, since there's obviously no point in asynchronously fetching data just to then throw an Error immediately afterwards. --- [1] Imagine e.g. a thousand page document, where there's a `MissingDataException` thrown when fetching/parsing page 900.	2021-12-31 22:03:10 +01:00
Jonas Jenwald	1491459dea	Improve caching for the `Catalog.getPageIndex` method (PR 13319 follow-up) This method is now being used a lot more, compared to when it's added, since it's now used together with scripting as part of the `PDFDocument.fieldObjects` parsing (called during viewer initialization). For /Page Dictionaries that we've already parsed, the `pageIndex` corresponding to a particular Reference is already known and we're thus able to skip all parsing in the `Catalog.getPageIndex` method for those cases.	2021-12-29 20:29:14 +01:00
Jonas Jenwald	b513c64d9d	[api-minor] Convert `Catalog.getPageDict` to an asynchronous method Besides converting `Catalog.getPageDict` to an `async` method, thus simplifying the code, this patch also allows us to pro-actively fix an existing issue. Note how we're looking up References in such a way that `MissingDataException`s won't cause trouble, however it's technically possible that the entries (i.e. /Count, /Kids, and /Type) in a /Pages Dictionary could actually be indirect objects as well. In the existing code this could lead to some, or even all, pages failing to load/render as intended. In practice that doesn't appear to happen in real-world PDF documents, but given all the weird things that PDF software does I'd prefer to fix this pro-actively (rather than waiting for a bug report). With `Catalog.getPageDict` being `async` this is now really simple to address, however I didn't want to introduce a bunch more unconditional asynchronicity in this method if it could be avoided (since that could slow things down). Hence we'll synchronously look up the raw data in a /Pages Dictionary, and only fall back to asynchronous data lookup when a Reference is encountered. In addition to the above, this patch also makes the following notable changes: - Let `Catalog.getPageDict` consistently reject with the actual error, regardless of what data we're fetching. Previously we'd "swallow" the actual errors except when looking up Dictionary entries, which is inconsistent and thus seems unfortunate. As can be seen from the updated unit-tests this change is API-observable, hence why the patch is tagged `[api-minor]`. - Improve the consistency of the Dictionary /Type-checks in both the `Catalog.getPageDict` and `Catalog.getAllPageDicts` methods. In `Catalog.getPageDict` there's a fallback code-path where we're incorrectly checking the /Page Dictionary for a /Contents-entry, which is wrong since a /Page Dictionary doesn't need to have a /Contents-entry in order to be valid. For consistency the `Catalog.getAllPageDicts` method is also updated to handle errors in the /Type-lookup correctly. - Reduce the `PagesCountLimit.PAUSE_EAGER_PAGE_INIT` viewer constant, to further improve loading/rendering performance of the second page during initialization of very long documents; PR 14359 follow-up.	2021-12-25 15:22:48 +01:00
Jonas Jenwald	fa51fd9428	Slightly reduce asynchronicity in the `Catalog.getPageDict` method (PR 14338 follow-up) After the changes in PR 14338, specifically in the `XRef.parse`-method, the /Pages-entry will now always have been fetched/validated when the `Catalog`-instance is created. Hence we can directly access the /Pages-entry in `Catalog.getPageDict` and thus avoid one asynchronous data-lookup per page in the document. (In practice this is unlikely to show up in e.g. benchmarks, but it really cannot hurt.) Finally, make sure that the `getPageDict`/`getAllPageDicts`-methods track the /Pages-tree reference correctly to prevent circular references in corrupt documents.	2021-12-13 21:18:06 +01:00
Tim van der Meij	a6dd39b645	Merge pull request #14358 from Snuffleupagus/checkLastPage-improvements Improve `PDFDocument.checkLastPage`/`Catalog.getAllPageDicts` for documents with corrupt XRef tables (PR 14311, 14335 follow-up)	2021-12-11 13:07:54 +01:00
Jonas Jenwald	70ac6b1694	Update `Catalog.getAllPageDicts` to always propagate the actual Errors (PR 14335 follow-up) Rather than "swallowing" the actual Errors, when data fetching fails, ensure that they're always being propagated as intended to the call-site instead. Note that we purposely handle `XRefEntryException` specially, to make it possible to fallback to indexing all XRef objects.	2021-12-10 15:22:36 +01:00
Jonas Jenwald	8a05db230e	Further improve caching in `Catalog.getPageDict`, for `disableAutoFetch` mode (PR 8207 follow-up) PR 8207 added caching to improve the performance of `Catalog.getPageDict`, by not having to repeatedly fetch the same data and also reducing the asynchronicity of that method. However, because of another oversight on my part, we're only caching /Page references once we've found the correct page. As long as all pages are loaded in order this doesn't really matter (happens by default in the viewer), but when `disableAutoFetch` is used the pages may be fetched in a more random order (this patch reduces the asynchronicity of `Catalog.getPageDict` slightly in that case).	2021-12-09 12:54:49 +01:00
Jonas Jenwald	5f295ba280	Improve caching in `Catalog.getPageDict` (PR 8207 follow-up) PR 8207 added caching to improve the performance of `Catalog.getPageDict`, by not having to repeatedly fetch the same data and also reducing the asynchronicity of that method. However, because of annoying off-by-one errors[1] the caching became less efficient than it could/should be.[2] Note here that the /Pages-tree is zero-indexed, and that e.g. `pageIndex = 5` thus correspond to the sixth page of the document. --- [1] In particular the `currentPageIndex + count < pageIndex` part. [2] For example, even when loading a relatively small/simple document such as `tracemonkey.pdf` in the viewer, the number of `xref.fetchAsync(currentNode)` calls are reduced from `56` to `44` with this patch.	2021-12-06 11:49:31 +01:00
Jonas Jenwald	40291d1943	Handle errors when fetching the raw /Metadata (issue 14305) Currently the `Catalog.metadata` getter only handles errors during parsing, however in a corrupt PDF document fetching of the raw /Metadata can obviously fail as well. Without this patch the `PDFDocumentProxy.getMetadata` method, in the API, can thus fail which it never should and this will cause the viewer to not initialize all state as expected. Fixes one of the documents in issue 14305.	2021-12-04 09:41:42 +01:00
Jonas Jenwald	1fac6371d3	[Regression] Eagerly fetch/parse the entire /Pages-tree in corrupt documents (issue 14303, PR 14311 follow-up) Please note: This is similar to the method that existed prior to PR 3848, but the new method will only be used as a fallback when parsing of corrupt PDF documents. The implementation in PR 14311 unfortunately turned out to be way too simplistic, as evidenced by the recently added test-files in issue 14303, since it may cause infinite loops in `PDFDocument.checkLastPage` for some corrupt PDF documents.[1] To avoid this, the easiest solution that I could come up with was to fall back to eagerly parsing the entire /Pages-tree when the /Count-entry validation fails during document initialization. Fixes at least two of the issues listed in issue 14303, namely the `poppler-395-0.pdf...` and `GHOSTSCRIPT-698804-1.pdf...` documents. --- [1] The whole point of PR 14311 was obviously to get rid of infinite loops during document initialization, not to introduce any more of those.	2021-12-02 14:31:04 +01:00
Jonas Jenwald	e045cd4520	Remove the unused `skipCount` parameter from `Catalog.getPageDict` (PR 14311 follow-up) This was added in PR 14311, but given that I completely forgot to update the `PDFDocument.getPage` signature accordingly it's completely unused. Given that things work just fine as-is, let's simply remove that optional parameter for now; sorry about the churn here!	2021-12-02 11:51:38 +01:00
Jonas Jenwald	63be23f05b	Handle errors correctly when data lookup fails during /Pages-tree parsing (issue 14303) This only applies to severely corrupt documents, where it's possible that the `Parser` throws when we try to access e.g. a /Kids-entry in the /Pages-tree. Fixes two of the issues listed in issue 14303, namely the `poppler-742-0.pdf...` and `poppler-937-0.pdf...` documents.	2021-12-02 10:54:40 +01:00
Jonas Jenwald	d0c4bbd828	[api-minor] Validate the /Pages-tree /Count entry during document initialization (issue 14303) This patch basically extends the approach from PR 10392, by also checking the last page. Currently, in e.g. the `Catalog.numPages`-getter, we're simply assuming that if the /Pages-tree has an integer /Count entry it must also be correct/valid. As can be seen in the referenced PDF documents, that entry may be completely bogus which causes general parsing to break down elsewhere in the worker-thread (and hang the browser). Rather than hoping that the /Count entry is correct, similar to all other data found in PDF documents, we obviously need to validate it. This turns out to be a little less straightforward than one would like, since the only way to do this (as far as I know) is to parse the entire /Pages-tree and essentially count the pages. To avoid doing that for all documents, this patch tries to take a short-cut by checking if the last page (based on the /Count entry) can be successfully fetched. If so, we assume that the /Count entry is correct and use it as-is, otherwise we'll iterate through (potentially) the entire /Pages-tree to determine the number of pages. Unfortunately these changes will have a number of somewhat negative side-effects, please see a possibly incomplete list below, however I cannot see a better way to address this bug. - This will slow down initial loading/rendering of all documents, at least by some amount, since we now need to fetch/parse more of the /Pages-tree in order to be able to access the last page of the PDF documents. - For poorly generated PDF documents, where the entire /Pages-tree only has one level, we'll unfortunately need to fetch/parse the entire /Pages-tree to get to the last page. While there's a cache to help reduce repeated data lookups, this will affect initial loading/rendering of some long PDF documents. - This will affect the `disableAutoFetch = true` mode negatively, since we now need to fetch/parse more data during document initialization. While the `disableAutoFetch = true` mode should still be helpful in larger/longer PDF documents, for smaller ones the effect/usefulness may unfortunately be lost. As one small additional bonus, we should now also be able to support opening PDF documents where the /Pages-tree /Count entry is completely invalid (e.g. contains a non-integer value). Fixes two of the issues listed in issue 14303, namely the `poppler-67295-0.pdf` and `poppler-85140-0.pdf` documents.	2021-11-27 21:57:35 +01:00
Jonas Jenwald	00720d059a	[api-minor] Include the /Lang-property in the `documentInfo`, and use it in the viewer (issue 14110) Please note: This is a tentative patch, since I don't have the necessary a11y-software to actually test it. To avoid having to add a new API-method just for a single string, I figured that adding the new property to the existing `documentInfo`-data (accessed via `PDFDocumentProxy.getMetadata` in the API) will hopefully be deemed acceptable.	2021-10-16 14:27:47 +02:00
Jonas Jenwald	cd94a44ca1	Remove some duplication in simple shadowed getters in `src/core/`-code In these cases there's no good reason, in my opinion, to duplicate the `shadow`-lines since that unnecessarily increases the risk of simple typos (see the previous patch).	2021-10-16 12:56:17 +02:00
Calixte Denizet	aecbd7cd89	AcroForm: Add support for ResetForm action - it aims to fix #12721. - Thanks to PR #14023, we've now the fieldObjects in the annotation layer so we can easily map fields names on their id if needed. - Reset values in the storage, in the JS sandbox and in the visible html elements.	2021-09-30 22:02:33 +02:00
Jonas Jenwald	e6e04694f4	[api-minor] Move the `addDefaultProtocolToUrl`/`tryConvertUrlEncoding` functionality into the `createValidAbsoluteUrl` function Having recently worked with, and reviewed patches touching, this code it seemed that it's probably not a bad idea to move that functionality into `createValidAbsoluteUrl` as new options instead. For the `addDefaultProtocolToUrl` functionality in particular, the existing helper function was not only moved but slightly improved as well. Looking at the code, I realized that there's a small risk that it would incorrectly match a relative URL-string too. With these changes, the `createValidAbsoluteUrl` call-sites in the `src/core/`-code can be simplified a little bit. Please note: This patch may, indirectly, change the format of the `unsafeUrl`-property returned with relevant Annotations and OutlineItems; hence the `api-minor` tag. However, I'd argue that it's actually more correct this way since the whole purpose of `unsafeUrl` is/was to return the URL data as-is without any parsing done.	2021-09-26 14:29:54 +02:00
Calixte Denizet	558e58f354	XFA - Add <a> element in button when an url is detected (bug 1716758) - it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1716758; - some buttons have a JS action with the pattern `app.launchURL(...)` (or similar) so extract when it's possible the url and generate a <a> element with the href equals to the found url; - pdf.js already had some code to handle that so this patch slightly refactor that.	2021-09-25 21:59:39 +02:00
Jonas Jenwald	81a1c1cef7	Correctly validate URLs in XFA documents (bug 1731240) With this patch we'll ensure that only valid absolute URLs can be used in XFA documents, similar to the existing validation done for "regular" PDF documents. Furthermore, we'll also attempt to add a default protocol (i.e. `http`) to URLs beginning with "www." in XFA documents as well; this on its own is enough to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1731240	2021-09-21 21:21:01 +02:00
Calixte Denizet	77b9657e57	XFA - Overwrite AcroForm dictionary when saving if no datasets in XFA (bug 1720179) - aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1720179 - in some pdfs the XFA array in AcroForm dictionary doesn't contain an entry for 'datasets' (which contains saved data), so basically this patch allows to overwrite the AcroForm dictionary with an updated XFA array when doing an incremental update.	2021-09-03 17:04:03 +02:00
Brendan Dahl	56e7bb626c	Merge pull request #13660 from calixteman/no_xfaf XFA - Disable xfa rendering for XFAF pdfs	2021-08-23 12:30:29 -07:00
Calixte Denizet	04573d2dc8	XFA - Disable xfa rendering for XFAF pdfs - we'll implement XFAF support later.	2021-08-23 12:18:20 -07:00
Calixte Denizet	5cdee80c8e	XFA - An image can be a stream in the pdf (bug 1718521) - hrefs can be found in catalog > Names > XFAImages	2021-07-05 14:06:23 +02:00
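
Several of the commits above replace custom `forEach` helpers with standard iteration (c0736647f9 makes `RefSet` and `RefSetCache` iterable; 60f6272ed9 moves loops to `for...of`). The following is only a minimal sketch of that general technique, not the actual pdf.js code: the class names `SimpleRefSet` and `SimpleRefCache`, their method names, and the string keys such as "12R" are hypothetical and chosen purely for illustration.

```javascript
// Hypothetical Set-backed collection; an illustration, not the pdf.js RefSet.
class SimpleRefSet {
  constructor() {
    this._set = new Set();
  }

  has(ref) {
    return this._set.has(ref.toString());
  }

  put(ref) {
    this._set.add(ref.toString());
  }

  // Delegating to the inner Set's iterator makes the class usable
  // with `for...of` and spread syntax, so no `forEach` method is needed.
  [Symbol.iterator]() {
    return this._set.values();
  }
}

// Hypothetical Map-backed cache; iteration yields the cached values.
class SimpleRefCache {
  constructor() {
    this._map = new Map();
  }

  get(ref) {
    return this._map.get(ref.toString());
  }

  put(ref, obj) {
    this._map.set(ref.toString(), obj);
  }

  [Symbol.iterator]() {
    return this._map.values();
  }
}

const seen = new SimpleRefSet();
seen.put("12R");
seen.put("7R");
seen.put("12R"); // duplicate key, ignored by the underlying Set

const collected = [];
for (const ref of seen) {
  collected.push(ref);
}
console.log(collected);
```

Because both wrappers hand out the iterator of the built-in `Set` or `Map`, iteration follows insertion order and costs nothing beyond what the underlying collection already provides.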
