pdf.js

Author	SHA1	Message	Date
Brendan Dahl	53991d0924	Fix tiling pattern with smask. After drawing a tiling pattern we were not calling endDrawing, which handles compositing any active smasks. Fixes #8565.	2021-05-12 11:42:08 -07:00
Tim van der Meij	ba99e54c66	Merge pull request #13361 from brendandahl/patterns-fixes Fix several issues with radial/axial shadings and tiling patterns.	2021-05-12 20:27:37 +02:00
Brendan Dahl	ac44afa70e	Fix several issues with radial/axial shadings and tiling patterns. Previously, we set the base transformation and pattern matrix directly to the main rendering ctx of the page, however doing this caused the current transform to be lost. This would cause issues with things like shear missing so the pattern was misaligned or when stroke was used the scale of the line width or dash would be wrong. Instead we should leave the current transform and use setTransform on the pattern so it is applied correctly. For axial and radial shadings I had to create a temporary canvas to draw the shading so I could in turn use setTransform. Fixes: #13325, #6769, #7847, #11018, #11597, #11473 The following already in the corpus are improved: issue8078-page1 issue1877-page1	2021-05-11 16:32:24 -07:00
Jonas Jenwald	7548dc5ea2	Only include the `renderer`-preference in builds where `SVGGraphics` is defined After PR 13117 it's now (finally) possible for different build targets to specify individual options/preferences, and we can utilize that to only expose the `renderer`-preference in builds where `SVGGraphics` is actually defined. Note that for e.g. `MOZCENTRAL`-builds, trying to enable SVG-rendering will throw immediately and the preference thus doesn't make sense to include there. Also, update the dummy `SVGGraphics` to use a class, tweak the `PDFJSDev`-check in `src/display/svg.js` to agree fully with the option/preference, and remove an unnecessary `eslint-disable`.	2021-05-10 12:03:53 +02:00
Jonas Jenwald	2ba4b65ca8	[api-minor] Remove the WebGL implementation Reasons for the removal include: - This functionality was always somewhat experimental and has never been enabled by default, partly because of worries about rendering bugs caused by e.g. bad/outdated graphics drivers. - After the initial implementation, in PR 4286 (back in 2014), no additional functionality has been added to the WebGL implementation. - The vast majority of all documents do not benefit from WebGL rendering, since only a couple of specific features are supported (e.g. some Soft Masks and Patterns). - There is, and has always been, zero test-coverage for the WebGL implementation. - Overall performance, in the PDF.js library, has improved since the experimental WebGL implementation was added. Rather than shipping unused and untested code, it seems reasonable to simply remove the WebGL implementation for now; thanks to version control it's always possible to bring back the code should the need ever arise.	2021-05-09 16:38:44 +02:00
Jonas Jenwald	9a1758c6b8	Remove unnecessary closure in `src/display/text_layer.js`, and use standard classes With modern JavaScript modules, where you explicitly list the properties that should be exported, it's no longer necessary to wrap all of the code in a closure.[1] This patch also tries to clean-up/improve a couple of the existing JSDoc-comments. --- [1] This reduces the size, even of the built `pdf.js` file, since there's now a lot less unnecessary whitespace.	2021-05-05 18:44:56 +02:00
Calixte Denizet	3f29892d63	[JS] Fix several issues found in pdf in #13269 - app.alert and few other function can use an object as parameter ({cMsg: ...}); - support app.alert with a question and a yes/no answer; - update field siblings when one is changed in an action; - stop calculation if calculate is set to false in the middle of calculations; - get a boolean for checkboxes when they've been set through annotationStorage instead of a string.	2021-05-04 19:21:51 +02:00
Jonas Jenwald	90b5fcb8e0	Remove unnecessary TypedArray re-initialization in `FontFaceObject.createFontFaceRule` The `this.data` property is, when defined, sent from the worker-thread as a `Uint8Array` and there's thus no reason to re-initialize the TypedArray here. Note also the `FontFaceObject.createNativeFontFace` method just above, where we simply use `this.data` as-is. The explanation for this code looking like it does is, as is often the case, for historical reasons. Originally we only supported `@font-face`, before the Font Loading API existed, and back then we also polyfilled TypedArrays (using regular Arrays) which should explain this particular line of code.	2021-05-01 19:20:36 +02:00
calixteman	af4dc55019	[api-minor] Fix the way to chunk the strings (#13257) - Improve chunking in order to fix some bugs where spaces are missing: * track the last position where a glyph has been drawn; * when a new glyph (first glyph in a chunk) is added then compare its position with the last saved one and add a space or break: - there are multiple ways to move the glyphs and to avoid having to deal with all the different possibilities it's way easier to just compare positions; - and so there is now one function (i.e. "compareWithLastPosition") where all the job is done. - Add some breaks in order to get lines; - Remove the multiple white spaces: * some spaces were filled with several white spaces and so it makes it harder to find some sequences of words using the search tool; * other pdf readers replace spaces by one white space. Update src/core/evaluator.js Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>	2021-04-30 14:41:13 +02:00
Jonas Jenwald	e6601f4582	Convert the code in `src/display/canvas.js` to use standard classes This gets rid of a lot of boilerplate that stems from our old way of simulating classes, and it actually reduces the filesize noticeably. For e.g. `gulp mozcentral`, the built `pdf.js` files decreases from `318 404` to `314 722` bytes (~1 percent) with this patch.	2021-04-26 22:10:38 +02:00
Jonas Jenwald	4078dd856c	Clear some Arrays, rather than re-initialize them, in `src/display/`-code It's generally better to re-use the same Array, by clearing out all of its elements, rather than creating a new Array.	2021-04-24 13:00:53 +02:00
Jonas Jenwald	da22146b95	Replace a bunch of `Array.prototype.forEach()` cases with `for...of` loops instead Using `for...of` is a modern and generally much nicer pattern, since it gets rid of unnecessary callback-functions. (In a couple of spots, a "regular" `for` loop had to be used.)	2021-04-24 13:00:19 +02:00
Tim van der Meij	da0e7ea969	Merge pull request #13272 from calixteman/issue13271 Update all the text widgets having the same name with the same value	2021-04-23 21:08:54 +02:00
Brendan Dahl	5231d922ec	Add presentation role to text layer spans. (#13278 ) Keeps screen readers from pausing on every span so paragraphs are read more naturally. Note: this only seems to affect Firefox, Chrome automatically combines the spans.	2021-04-21 10:47:51 +02:00
Calixte Denizet	e868ab0051	Update all the text widgets having the same name with the same value	2021-04-20 20:03:19 +02:00
Brendan Dahl	ac3fa1e3d7	Merge pull request #13146 from calixteman/xfa_fonts XFA -- Load fonts permanently from the pdf	2021-04-16 12:55:12 -07:00
Calixte Denizet	7e9579045f	XFA -- Load fonts permanently from the pdf - Different fonts can be used in xfa and some of them are embedded in the pdf. - Load all the fonts in window.document. Update src/core/document.js Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com> Update src/core/worker.js Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>	2021-04-15 17:57:42 +02:00
Jani Pehkonen	3a96977ea8	Implement visibility expressions for optional content	2021-04-14 17:39:41 +03:00
Cetin Sert	d498897ab5	Fix annotation input focus trap regression in Safari (#13232 ) `setSelectionRange(0, 0)` added in `44b24fcc29` for #12359, required only by Firefox ([bug](https://bugzilla.mozilla.org/show_bug.cgi?id=860329)), causes issues mozilla#13191, mozilla#12592 in Safari. `scrollLeft = 0` is a fix that breaks the focus trap in Safari while keeping Firefox behavior same for #12359.	2021-04-13 20:40:52 +02:00
Jonas Jenwald	5adee0cdd1	[api-minor] Let `PDFPageProxy.getStructTree` return `null`, rather than an empty structTree, for documents without any accessibility data (PR 13171 follow-up) This is first of all consistent with existing API-methods, where we return `null` when the data in question doesn't exist. Secondly, it should also be (slightly) more efficient since there's less dummy-data that we need to transfer between threads. Finally, this prevents us from adding an empty/unnecessary span to every single page even in documents without any structure tree data.	2021-04-11 12:35:33 +02:00
Tim van der Meij	03c8c89002	Merge pull request #13171 from brendandahl/struct-tree [api-minor] Add support for basic structure tree for accessibility.	2021-04-09 21:32:44 +02:00
Tim van der Meij	b0473eb353	Merge pull request #13207 from Snuffleupagus/api-AnnotationStorage-params [api-minor] Remove the manual passing of an `AnnotationStorage`-instance when calling various API-method	2021-04-09 21:09:16 +02:00
Brendan Dahl	fc9501a637	Add support for basic structure tree for accessibility. When a PDF is "marked" we now generate a separate DOM that represents the structure tree from the PDF. This DOM is inserted into the <canvas> element and allows screen readers to walk the tree and have more information about headings, images, links, etc. To link the structure tree DOM (which is empty) to the text layer aria-owns is used. This required modifying the text layer creation so that marked items are now tracked.	2021-04-09 09:56:28 -07:00
Jonas Jenwald	737a8e846d	Add `deprecated` handling of the now removed `AnnotationStorage` API-parameters These changes are done separately, to make it easier to remove them in the future.	2021-04-09 13:25:03 +02:00
Jonas Jenwald	72ef183085	[api-minor] Remove the manual passing of an `AnnotationStorage`-instance when calling various API-method Note how we purposely don't expose the `AnnotationStorage`-class directly in the official API (see `src/pdf.js`), since trying to use multiple ones simultaneously doesn't really make sense (e.g. in the viewer). Instead we lazily initialize, and cache, just one instance via `PDFDocumentProxy.annotationStorage` which should thus be available internally in the API itself without having to be manually passed to various methods. To support these changes, the `AnnotationStorage`-instance initialization is moved into the `WorkerTransport`-class to allow both `PDFDocumentProxy` and `PDFPageProxy` to access it. This patch implements the following simplifications: - Remove the `annotationStorage`-parameter from `PDFDocumentProxy.saveDocument`, since it's already available internally. Furthermore, while it's currently possible to call that method without an `AnnotationStorage`-instance, that really does not make any sense at all. In this case you're effectively reducing `PDFDocumentProxy.saveDocument` to a "regular" `PDFDocumentProxy.getData` call, but with a lot more overhead, which was obviously not the intention of the `PDFDocumentProxy.saveDocument`-method. - Try to discourage third-party users from calling `PDFDocumentProxy.saveDocument` unconditionally, as a replacement for `PDFDocumentProxy.getData` (note the previous point). - Replace the `annotationStorage`-parameter, in `PDFPageProxy.render`, with a boolean `includeAnnotationStorage`-parameter which simply indicates if the (internally available) `AnnotationStorage`-instance should be used during rendering (e.g. for printing). - By removing the need to manually provide `annotationStorage`-parameters to various API-methods, using the API should become simpler (e.g. for third-parties) since you no longer need to worry about manually fetching and passing around this data.	2021-04-09 13:24:25 +02:00
Ikko Ashimine	c4c4333d54	Fix typo in canvas.js Reseting -> Resetting	2021-04-08 23:45:24 +09:00
Jonas Jenwald	4e81e0e14f	Remove the deprecated `AnnotationStorage.getOrCreateValue`-method (PR 12759 follow-up) While this method has only been deprecated in one release now, the `AnnotationStorage`-functionality is new enough that third-party implementations hopefully don't rely heavily on it just yet. (And removing this quickly should help reduce the likelihood that someone starts using it.)	2021-04-06 13:22:06 +02:00
Tim van der Meij	228adbf673	Merge pull request #13172 from Snuffleupagus/cleanup-keepFonts [api-minor] Add an option, in `PDFDocumentProxy.cleanup`, to allow fonts to remain attached to the DOM	2021-04-05 14:21:34 +02:00
Jonas Jenwald	16fd838f52	Convert the `renderTasks`, used in `PDFPageProxy.render`/`PDFPageProxy.getOperatorList`, to a Set When removing tasks we're currently forced to indirectly iterate through the array, which can be avoided by using a Set instead. Furthermore, we can also (slightly) modernize the code responsible for initializing the `renderTasks`.	2021-04-05 10:51:28 +02:00
Jonas Jenwald	a2bc6481a0	[api-minor] Add an option, in `PDFDocumentProxy.cleanup`, to allow fonts to remain attached to the DOM As mentioned in the JSDoc comment, this should not be used unless you know what you're doing, since it will lead to increased memory usage. However, in some situations (e.g. SVG-rendering), we still want to be able to run general clean-up on both the main/worker-thread while keeping loaded fonts attached to the DOM.[1] As part of these changes, `WorkerTransport.startCleanup` is converted to an async method and we'll also skip clean-up when destruction has started (since it's redundant). --- [1] The SVG-rendering mode is obviously not officially supported, since it's both rather incomplete and inherently slower. However with recent changes, whereby we cache repeated images on the document rather than the page level, memory usage can be a lot worse than before if we never attempt to release e.g. cached image-data when the viewer is in SVG-rendering mode.	2021-04-02 12:32:31 +02:00
Jonas Jenwald	48ff20493f	Mark some internal `PDFDocumentProxy`-properties as "private" These two properties were never intended to be anything but "private", hence it really cannot hurt to actually indicate that they're not part of any official API.	2021-04-02 12:26:32 +02:00
Tim van der Meij	ca7f546828	Merge pull request #12908 from calixteman/11918 Slightly rescale lineWidth to workaround chrome rendering issue	2021-03-31 21:56:31 +02:00
Calixte Denizet	a0cfb0841f	Slightly rescale lineWidth to workaround chrome rendering issue	2021-03-31 21:49:00 +02:00
Jonas Jenwald	db1e1612df	[api-minor] Support proper `URL`-objects, in addition to URL-strings, in `getDocument` Currently only URL-strings are officially supported by `getDocument`, however at this point in time I cannot really see any compelling reason to not support `URL`-objects as well. Most likely the reason that we don't already support `URL`-objects, in `getDocument`, is that historically `URL` wasn't fully implemented across browsers and our old polyfill wasn't perfect; see https://developer.mozilla.org/en-US/docs/Web/API/URL/URL#browser_compatibility Please note: Because of how the `url` parameter is currently handled, there are actually some cases where passing a `URL`-object to `getDocument` already works. That, in my opinion, provides additional motivation for supporting `URL`-objects officially, since it makes the API more consistent. The following is an attempt to summarize the current situation, based on the actual code rather than the JSDocs: - `getDocument("url string")` works and is documented.[1] - `getDocument({ url: "url string", })` works and is documented.[1] - `getDocument(new URL(...))` throws immediately, since no supported parameters are found. - `getDocument({ url: new URL(...), })` actually works even though it's not documented.[1] Originally, when data was fetched on the worker-thread, this would likely have thrown since `URL` isn't clonable.[2] - `getDocument({ url: { abc: 123, }, })`, or some similarly meaningless input, will be "accepted" by `getDocument` and then throw a `MissingPDFException` when attempting to fetch the bogus data. With the changes in this patch, not only are `URL`-objects now officially supported and documented when calling `getDocument`, but we'll also do a much better job at actually validating any URL-data passed to `getDocument` (and instead fail early). --- [1] In browsers, we create a valid URL thus indirectly validating the input. In Node.js environments, on the other hand, no validation is done since obtaining a baseUrl is more difficult (and PDF.js is primarily written for browsers anyway). [2] https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm#supported_types	2021-03-31 16:21:41 +02:00
Jonas Jenwald	27add0f1f3	Re-factor the `source` parsing, in `getDocument`, to use `switch` rather than `if...else` Given the number of parameters that we now need to parse here, this code is no longer as readable as one would like. Hence this re-factoring, which will improve overall readability and also help with the next patch.	2021-03-31 16:21:37 +02:00
Jonas Jenwald	9c6770748c	Move the `PDFDocumentStats` typedef closer to its usage Currently this typedef appears slightly out-of-place, in the middle of the arguably much more important `getDocument` JSDocs.	2021-03-31 16:21:22 +02:00
calixteman	84d7cccb1d	JS - Handle correctly hierarchy of fields (#13133 ) * JS - Handle correctly hierarchy of fields - it aims to fix #13132; - annotations can inherit their actions from the parent field; - there are some fields which act as a container for other fields: - they can be access through js so need to add them with an empty type (nothing in the spec about that but checked in Acrobat); - calculation order list (CO) can reference them so need make them through this.getField; - getArray method must return kids. - field values are number, string, ... depending of their type but nothing in the spec on how to know what's the type: - according to the comment for Canonical Format: https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=461 - it seems that this "type" can be guessed from js action Format (when setting a type in Acrobat DC, the only affected thing is this action). - util.scand with an empty string returns the current date.	2021-03-30 08:50:35 -07:00
Jonas Jenwald	19c2dfbb96	Move rotation normalization from `PDFViewerApplication` and into `BaseViewer` The rotation handling that's currently living in `PDFViewerApplication` is very old, and pre-dates the introduction of the viewer components by years. As can be seen in the `BaseViewer.pagesRotation` setter, we're not actually normalizing the rotation as intended and instead rely on the caller to handle that correctly. This is first of all inconsistent, given how other setters are implemented, and secondly it could also lead to the rotation being set to a value outside of the `[0, 360)`-range. Finally, for improved consistency the rotation handling in `PageViewport` is updated similarly. Please note that, in this case, it's not changing the pre-existing logic.	2021-03-28 14:19:58 +02:00
calixteman	63471bcbbe	XFA - Convert some template properties into CSS ones (#13082 ) - implement few positioning properties: position, width, height, anchor; - implement font element; - implement fill element (used by font) and its children (linear, radial, ...); - font property is inherited from ancestor container (see https://www.pdfa.org/wp-content/uploads/2020/07/XFA-3_3.pdf#page=43) so let CSS handles that stuff; - in order to reduce the number of properties to set, only set non default properties and put the default in CSS; - set a background to some containers to be able to see them (will be removed in a future commit).	2021-03-25 13:02:39 +01:00
Tim van der Meij	8269ddbd16	Merge pull request #13105 from Snuffleupagus/BasePdfManager-parseDocBaseUrl Improve memory usage around the `BasePdfManager.docBaseUrl` parameter (PR 7689 follow-up)	2021-03-19 23:03:20 +01:00
Jonas Jenwald	57e7557235	Actually reset the `PDFPageProxy._xfaPromise` property as intended (PR 13069 follow-up) (#13119 ) Similar to the existing `annotationsPromise` and `_jsActionsPromise` properties, the new `_xfaPromise` should obviously also be reset, since otherwise you might end up holding onto a lot of data for pages that are no longer active. (That caching wasn't present in the original version of PR 13069, which is why I didn't spot it until now.)	2021-03-19 11:31:54 +01:00
calixteman	24e598a895	XFA - Add a layer to display XFA forms (#13069 ) - add an option to enable XFA rendering if any; - for now, let the canvas layer: it could be useful to implement XFAF forms (embedded pdf in xml stream for the background and xfa form for the foreground); - ui elements in template DOM are pretty close to their html counterpart so we generate a fake html DOM from template one: - it makes easier to translate template properties to html ones; - it makes faster the creation of the html element in the main thread.	2021-03-19 10:11:40 +01:00
Jonas Jenwald	c4c7216171	Improve memory usage around the `BasePdfManager.docBaseUrl` parameter (PR 7689 follow-up) While there is nothing outright wrong with the existing implementation, it can however lead to increased memory usage in one particular case (that I completely overlooked when implementing this): For "data:"-URLs, which by definition contains the entire PDF document and can thus be arbitrarily large, we obviously want to avoid sending, storing, and/or logging the "raw" docBaseUrl in that case. To address this, this patch makes the following changes: - Ignore any non-string in the `docBaseUrl` option passed to `getDocument`, since those are unsupported anyway, already on the main-thread. - Ignore "data:"-URLs in the `docBaseUrl` option passed to `getDocument`, to avoid having to send what could potentially be a very long string to the worker-thread. - Parse the `docBaseUrl` option directly in the `BasePdfManager`-constructors, on the worker-thread, to avoid having to store the "raw" docBaseUrl in the first place.	2021-03-17 15:48:24 +01:00
Jonas Jenwald	bd9dee1544	Move the `getPdfFilenameFromUrl` helper function from `web/ui_utils.js` and into `src/display/display_utils.js` It seems reasonable to place this alongside the similar `getFilenameFromUrl` helper function. This way, with the changes in the next patch, we also avoid having to expose the `isDataScheme` function in the API itself and we instead expose `getPdfFilenameFromUrl` in the API (which feels overall more appropriate).	2021-03-17 15:48:24 +01:00
Jonas Jenwald	50681d71c8	Ensure that `getDocument` handles Node.js `Buffer`s more gracefully (issue 13075) While the JSDocs have never advertised `getDocument` as supporting Node.js `Buffer`s, that apparently doesn't stop users from passing such data structures to `getDocument`. In theory the existing `instanceof Uint8Array` check ought to have caught Node.js `Buffer`s, however for reasons that I don't even pretend to understand that check actually passes. Hence this patch which, only in Node.js environments, will special-case `Buffer`s to hopefully provide a slightly better out-of-the-box behaviour in Node.js environments[1]. --- [1] Although I'm not sure that we necessarily want to advertise this in the JSDocs, given the specialized use-case.	2021-03-13 10:52:38 +01:00
Jonas Jenwald	b326432895	Simplify the data lookup in the `AnnotationStorage.getValue` method Rather than first checking if data exists before fetching it from storage, we can simply do the lookup directly and then check its value. Note that this follows the same pattern as utilized in the `AnnotationStorage.setValue` method.	2021-03-11 16:37:38 +01:00
Jonas Jenwald	a0e584eeb2	Replace the `objectFromEntries` helper function with an `objectFromMap` one instead Given that it's only used with `Map`s, and that it's currently implemented in such a way that we (indirectly) must iterate through the data twice, some simplification cannot hurt here. Note that the only reason that we're not using `Object.fromEntries(...)` directly, at each call-site, is that that one won't guarantee that a `null` prototype is being used.	2021-03-11 16:37:34 +01:00
Calixte Denizet	c01ef24541	JS - reset correctly radio buttons	2021-03-07 11:04:40 +01:00
Jonas Jenwald	6fd899dc44	[api-minor] Support the Content-Disposition filename in the Firefox PDF Viewer (bug 1694556, PR 9379 follow-up) As can be seen [in the mozilla-central code](https://searchfox.org/mozilla-central/rev/a6db3bd67367aa9ddd9505690cab09b47e65a762/toolkit/components/pdfjs/content/PdfStreamConverter.jsm#1222-1225), we're already getting the Content-Disposition filename. However, that data isn't passed through to the viewer nor to the `PDFDataTransportStream`-implementation, which explains why it's currently being ignored. Please note: This will also require a small mozilla-central patch, see https://bugzilla.mozilla.org/show_bug.cgi?id=1694556, to forward the necessary data to the viewer.	2021-02-26 10:50:29 +01:00
Jonas Jenwald	df931ef685	Move the opening of PDF file attachments into the `DownloadManager`-implementations Note how the `PDFAttachmentViewer` handles PDF file attachments specially, by opening them in a new window/tab, rather than forcing them to be downloaded. This is done to improve the overall UX, since browsers in general are able to handle PDF files internally. However, for file annotations we're currently not attempting to do the same thing and are instead just downloading them directly. In order to unify the behaviour, without having to duplicate a lot of code, the opening of PDF file attachments is thus moved into a new `DownloadManager.openOrDownloadData` method.	2021-02-23 13:44:23 +01:00
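
Two of the smaller changes listed above are easy to picture in isolation: the `objectFromMap` helper (a0e584eeb2), which copies a `Map` into an object with a `null` prototype, and the normalization of rotation values into the `[0, 360)` range (19c2dfbb96). The following is only a rough sketch of those two ideas, not the code from the commits: the function bodies are assumptions based on the commit messages, and `normalizeRotation` is a hypothetical name that does not appear in the log.

```javascript
// Sketch of the `objectFromMap` idea: iterate the Map once and copy each
// entry into an object created with a `null` prototype. The commit message
// notes that `Object.fromEntries(...)` is avoided because it does not
// guarantee a `null` prototype.
function objectFromMap(map) {
  const obj = Object.create(null);
  for (const [key, value] of map) {
    obj[key] = value;
  }
  return obj;
}

// Hypothetical helper (name and body are assumptions): map any finite
// rotation, in degrees, into the half-open range [0, 360).
function normalizeRotation(rotation) {
  if (!Number.isFinite(rotation)) {
    throw new TypeError("rotation must be a finite number");
  }
  // The double modulo handles negative inputs, e.g. -90 becomes 270.
  return ((rotation % 360) + 360) % 360;
}
```

With a `null` prototype, a key such as `"toString"` is an ordinary entry rather than something that shadows an inherited method, which is presumably why the helper is preferred over `Object.fromEntries` at the call-sites.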

1188 Commits