Sakurai/pdf.js - pdf.js - Gitea on kemo

Sakurai/pdf.js

Author	SHA1	Message	Date
MMeent	3631121841	Add normalization for Hyphen -> Hyphen-minus Previously these two characters were not searchable interchangeably, even though Hyphen-Minus is changed to Hyphen in some text-to-PDF pipelines.	2021-06-04 15:54:52 +02:00
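The idea behind this normalization can be sketched as follows. This is a minimal illustration, not the actual pdf.js code: the helper names `normalizeHyphens` and `pageContainsQuery` are hypothetical, and the only facts assumed are that U+2010 is HYPHEN and U+002D is HYPHEN-MINUS.

```javascript
// Hypothetical helper (not the pdf.js implementation).
// U+2010 is HYPHEN; U+002D is HYPHEN-MINUS, the character typed on most keyboards.
function normalizeHyphens(text) {
  return text.replace(/\u2010/g, "-");
}

// Normalize both the page text and the query before comparing them,
// so that either hyphen variant matches the other.
function pageContainsQuery(pageText, query) {
  return normalizeHyphens(pageText).includes(normalizeHyphens(query));
}

// The text layer contains U+2010, but the user types a plain hyphen-minus.
const pageText = "A well\u2010known result";
console.log(pageContainsQuery(pageText, "well-known")); // prints true
```

Without the normalization step, `pageText.includes("well-known")` would be false for this input.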
Tim van der Meij	ff393d6e96	Convert the `pendingFindMatches` member, in `web/pdf_find_controller.js`, from an object to a set We only want to track page numbers instead of actual data, so using a set conveys that intention more clearly and is slightly more efficient.	2021-04-05 19:33:53 +02:00
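A small sketch of why a set fits this use case. The variable name `pendingPages` is a hypothetical stand-in, not the member in `web/pdf_find_controller.js`:

```javascript
// Hypothetical stand-in for tracking which pages still await match results.
// Only page numbers are stored, so a Set expresses the intent directly.
const pendingPages = new Set();

pendingPages.add(3);
pendingPages.add(7);
pendingPages.add(3); // adding the same page again has no effect

console.log(pendingPages.has(3)); // prints true
console.log(pendingPages.size); // prints 2

// Once the results for a page arrive, it is no longer pending.
pendingPages.delete(3);
console.log(pendingPages.has(3)); // prints false
```

With a plain object, the same bookkeeping would need dummy values (for example `pending[3] = true`) and `delete pending[3]`.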
Jonas Jenwald	bc13932ac1	Use more optional chaining in the `web/`-folder (PR 12961 follow-up) I overlooked these cases previously, but there's no reason why optional chaining (and nullish coalescing) cannot be used here as well.	2021-03-07 16:20:52 +01:00
Ross Johnson	6dae2677d5	[api-minor] Highlight search results correctly for normalized text (PR 9448) This patch is a rebased and refactored version of PR 9448, such that it applies cleanly given that `PDFFindController` has changed since that PR was opened; obviously keeping the original author information intact. This patch will thus ensure that e.g. fractions, and other things that we normalize before searching, will still be highlighted correctly in the textLayer. Furthermore, this patch also adds basic unit-tests for this functionality. Note: The `[api-minor]` tag is added, since third-party implementations of the `PDFFindController` must now always use the `pageMatchesLength` property to get accurate length information (see the `web/text_layer_builder.js` changes). Co-authored-by: Ross Johnson <ross@mazira.com> Co-authored-by: Jonas Jenwald <jonas.jenwald@gmail.com>	2021-01-12 18:08:08 +01:00
DesWurstes	72f48ee089	Return the query with the findcontrols	2020-08-20 11:18:43 +01:00
Jonas Jenwald	426945b480	Update Prettier to version 2.0 Please note that these changes were done automatically, using `gulp lint --fix`. Given that the major version number was increased, there's a fair number of (primarily whitespace) changes; please see https://prettier.io/blog/2020/03/21/2.0.0.html In order to reduce the size of these changes somewhat, this patch maintains the old "arrowParens" style for now (once mozilla-central updates Prettier we can simply choose the same formatting, assuming it will differ here).	2020-04-14 12:28:14 +02:00
Jonas Jenwald	7fd5f2dd61	[api-minor] Remove the `getGlobalEventBus` viewer functionality (PR 11631 follow-up) The correct/intended way of working with the "viewer components" is by providing an `EventBus` instance upon initialization, and the `getGlobalEventBus` was only added for backwards compatibility. Note, for example, that using `getGlobalEventBus` doesn't really work at all well with a use-case where there are multiple `PDFViewer` instances on one page, since it may then be difficult/impossible to tell which viewer a particular event originated from. All of the "viewer components" examples have been previously updated, such that there's no longer any code/examples which rely on the now removed `getGlobalEventBus` functionality.	2020-03-29 12:20:23 +02:00
Jonas Jenwald	886b256ada	Remove variable shadowing from the JavaScript files in the `web/` folder This is part of a series of patches that will try to split PR 11566 into smaller chunks, to make reviewing more feasible. Once all the code has been fixed, we'll be able to eventually enable the ESLint `no-shadow` rule; see https://eslint.org/docs/rules/no-shadow	2020-03-13 12:59:58 +01:00
Jonas Jenwald	4a1b056c82	Re-factor the `EventBus` to allow servicing of "external" event listeners after the viewer components have updated Since the goal has always been, essentially since the `EventBus` abstraction was added, to remove all dispatching of DOM events[1] from the viewer components this patch tries to address one thing that came up when updating the examples: The DOM events are always dispatched last, and it's thus guaranteed that all internal event listeners have been invoked first. However, there are no such guarantees with the general `EventBus` functionality and the order in which event listeners are invoked is not specified. With the promotion of the `EventBus` in the examples, over DOM events, it seems like a good idea to at least try to keep this ordering invariant[2] intact. Obviously this won't prevent anyone from manually calling the new internal viewer component methods on the `EventBus`, but hopefully that won't be too common since any existing third-party code would obviously use the `on`/`off` methods and since all of the examples show the correct usage (which should be similarly documented on the "Third party viewer usage" Wiki-page). --- [1] Looking at the various Firefox-tests, I'm not sure that it'll be possible to (easily) re-write all of them to not rely on DOM events (since getting access to `PDFViewerApplication` might be generally difficult/messy depending on scopes). In any case, even if technically feasible, it would most likely add a lot of complication that may not be desirable in the various Firefox-tests. All-in-all, I'd be fine with keeping the DOM events only for the `MOZCENTRAL` target and gated on `Cu.isInAutomation` (or similar) rather than a preference. [2] I wouldn't expect any real bugs in a custom implementation, simply based on event ordering, but it nonetheless seems like a good idea if any "external" events are still handled last.	2020-02-27 19:38:13 +01:00
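The ordering invariant described in this commit (internal listeners first, "external" listeners last) can be illustrated with a toy event bus. This is a sketch under stated assumptions, not the pdf.js `EventBus`: the class name `OrderedEventBus` and its internal `_on` method are hypothetical and chosen only to show the idea.

```javascript
// Hypothetical toy event bus (not the pdf.js EventBus).
// Listeners added with `on` are treated as external; listeners added with
// `_on` are treated as internal and are always invoked first.
class OrderedEventBus {
  constructor() {
    this._listeners = new Map();
  }

  on(eventName, listener) {
    this._add(eventName, listener, /* external = */ true);
  }

  _on(eventName, listener) {
    this._add(eventName, listener, /* external = */ false);
  }

  _add(eventName, listener, external) {
    if (!this._listeners.has(eventName)) {
      this._listeners.set(eventName, []);
    }
    this._listeners.get(eventName).push({ listener, external });
  }

  dispatch(eventName, data) {
    const entries = this._listeners.get(eventName);
    if (!entries) {
      return;
    }
    // Copy the list so listeners may register others during dispatch.
    const snapshot = entries.slice();
    for (const entry of snapshot) {
      if (!entry.external) {
        entry.listener(data);
      }
    }
    for (const entry of snapshot) {
      if (entry.external) {
        entry.listener(data);
      }
    }
  }
}

const calls = [];
const bus = new OrderedEventBus();
// The external listener is registered first, yet it still runs last.
bus.on("pagechanging", () => calls.push("external"));
bus._on("pagechanging", () => calls.push("internal"));
bus.dispatch("pagechanging", { pageNumber: 2 });
console.log(calls); // prints [ 'internal', 'external' ]
```

Registration order no longer decides invocation order, which mirrors the guarantee that DOM events previously provided by being dispatched last.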
Jonas Jenwald	9a437a158f	[api-minor] Deprecate `getGlobalEventBus` and update the "viewer components" examples accordingly To avoid outright breaking third-party usages of the "viewer components" the `getGlobalEventBus` functionality is left intact, but a deprecation message is printed if the function is invoked. The various examples are updated to explicitly initialize an `EventBus` instance, and provide that when initializing the relevant viewer components.	2020-02-27 14:44:48 +01:00
Jonas Jenwald	36881e3770	Ensure that all `import` and `require` statements, in the entire code-base, have a `.js` file extension In order to eventually get rid of SystemJS and start using native `import`s instead, we'll need to provide "complete" file identifiers since otherwise there'll be MIME type errors when attempting to use `import`.	2020-01-04 13:01:43 +01:00
Jonas Jenwald	a63f7ad486	Fix the linting errors, from the Prettier auto-formatting, that ESLint `--fix` couldn't handle This patch makes the following changes: - Remove no longer necessary inline `// eslint-disable-...` comments. - Fix `// eslint-disable-...` comments that Prettier moved down, thus causing new linting errors. - Concatenate strings which now fit on just one line. - Fix comments that are now too long. - Finally, and most importantly, adjust comments that Prettier moved down, since the new positions are often confusing or outright wrong.	2019-12-26 12:35:12 +01:00
Jonas Jenwald	de36b2aaba	Enable auto-formatting of the entire code-base using Prettier (issue 11444) Note that Prettier, purposely, has only limited [configuration options](https://prettier.io/docs/en/options.html). The configuration file is based on [the one in `mozilla central`](https://searchfox.org/mozilla-central/source/.prettierrc) with just a few additions (to avoid future breakage if the defaults ever change). Prettier is being used for a couple of reasons: - To be consistent with `mozilla-central`, where Prettier is already in use across the tree. - To ensure a consistent coding style everywhere, which is automatically enforced during linting (since Prettier is used as an ESLint plugin). This thus ends "all" formatting discussions once and for all, removing the need for review comments on most stylistic matters. Many ESLint options are now redundant, and I've tried my best to remove all the now unnecessary options (but I may have missed some). Note also that since Prettier considers the `printWidth` option as a guide, rather than a hard rule, this patch resorts to a small hack in the ESLint config to ensure that comments won't become too long. Please note: This patch is generated automatically, by appending the `--fix` argument to the ESLint call used in the `gulp lint` task. It will thus require some additional clean-up, which will be done in a separate commit. (On a more personal note, I'll readily admit that some of the changes Prettier makes are extremely ugly. However, in the name of consistency we'll probably have to live with that.)	2019-12-26 12:34:24 +01:00
Tim van der Meij	8b4ae6f3eb	Consistently use `@type` for getter data types in JSDoc comments Sometimes we also used `@return` or `@returns`, but `@type` is what the JSDoc documentation recommends. This also improves the documentation because before this commit the types were not shown and now they are.	2019-10-13 13:58:17 +02:00
Jonas Jenwald	d6cc393cd9	Remove a superfluous `linkService.isPageVisible` check from `PDFFindController` (PR 10217 follow-up) Unless the `PDFLinkService` instance contains all of the expected methods, a lot of things will break in various places in the default viewer. Hence there's not much value in having this check, and outright failing seems more appropriate. Finally, this also makes the return value explicit in this case, since that's consistent with the rest of the `PDFFindController._shouldDirtyMatch` method.	2019-06-10 21:04:47 +02:00
Tim van der Meij	4724ebbcf1	Merge pull request #10231 from Snuffleupagus/find-no-scroll-highlightAll Stop scrolling the document when "Highlight All" is toggled in the findbar (issue 5561)	2018-11-10 20:37:47 +01:00
Tim van der Meij	5b1b5730a1	Merge pull request #10220 from Snuffleupagus/find-less-scrolling Only scroll search results into view as a result of an actual find operation, and not when the user scrolls/zooms/rotates the document (bug 1237076, issue 6746)	2018-11-10 20:29:02 +01:00
Jonas Jenwald	06609b5337	Prevent errors if `PDFFindController.executeCommand` is ever called without a `state` object Most of the code in `PDFFindController` assumes that a valid `state` always exists, hence it cannot hurt to add a simple check to avoid errors being thrown.	2018-11-09 11:32:19 +01:00
Jonas Jenwald	8afb550218	When the search query changes, regardless of the search command, always re-calculate matches (bug 1030622) Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1030622	2018-11-09 11:32:19 +01:00
Jonas Jenwald	de6b0fd12d	Stop scrolling the document when "Highlight All" is toggled in the findbar (issue 5561) This is consistent with the general, e.g. HTML, search functionality of the Firefox browser.	2018-11-09 11:31:59 +01:00
Jonas Jenwald	fd87f13521	Only scroll search results into view as a result of an actual find operation, and not when the user scrolls/zooms/rotates the document (bug 1237076, issue 6746) Currently searching, and particularly highlighting of search results, may interfere with subsequent user-interactions such as scrolling/zooming/rotating which can result in a somewhat jarring UX where the document suddenly "jumps" to a previous position. This is especially annoying in cases where the highlighted search result isn't even visible when a user initiated scrolling/zooming/rotating happens, and there exist a couple of bugs/issues about this behaviour. It seems reasonable, as far as I'm concerned, to treat searching as one operation and any subsequent non-search user interactions with the viewer as separate and thus not scroll the current search result into view unless the user is actually doing another search. This also seems consistent with general searching in e.g. Firefox and Adobe Reader: - Compare with "regular" searching of e.g. HTML files in Firefox, where the user scrolling and/or zooming the document will not force a currently highlighted search result to become re-scrolled into view. - Compare also with Adobe Reader, where the user scrolling, zooming, and/or rotating the document will not force the currently highlighted search result to become re-scrolled into view. The question is then why search highlighting was implemented this way in PDF.js to begin with. It might be that this wasn't really intended behaviour, but more a consequence of the asynchronous nature of the API. Considering that most operations, such as fetching the page, rendering it and extracting its text-content are all asynchronous; searching and highlighting of matches thus becomes asynchronous too. However, it should be possible to track when search results have been scrolled into view and highlighted, and thus prevent these weird "jumps" when the user interacts with the document. 
Please note: Unfortunately this required moving the scrolling of matches back into `PDFFindController`, since I simply couldn't see any other (reasonable) way of implementing the functionality without tracking the `_shouldScroll` property in only one spot. However, given that the new `PDFFindController.scrollMatchIntoView` method follows a similar pattern as `BaseViewer.scrollPageIntoView` and `PDFThumbnailViewer.scrollThumbnailIntoView`, this is hopefully deemed OK.	2018-11-09 11:30:45 +01:00
Jonas Jenwald	d805d799ff	For repeated 'findagain' operations, attempt to reset the search position if the user has e.g. scrolled in the document (issue 4141) Currently we'll only attempt to start from the current page when a new search is done, however for 'findagain' operations we'll always continue from the last match position. This could easily lead to confusing behaviour if the user has scrolled to a completely different part of the document. In an attempt to improve this somewhat, for repeated 'findagain' operations, we'll instead reset the position to the current page when it's absolutely certain that the user has scrolled. Note that this required adding a new `BaseViewer` method, and exposing that through `PDFLinkService`, in order to check if a given page is visible. In an attempt to avoid issues, in custom implementations of `PDFFindController`, the code checks for the existence of the `PDFLinkService.isPageVisible` method before using it.	2018-11-03 12:03:11 +01:00
Jonas Jenwald	d7941b4ce7	Add a helper method, in `PDFFindController`, to determine if matches need to be re-calculated when a new search operation occurs	2018-11-03 11:52:48 +01:00
Jonas Jenwald	af99d1dc08	Attempt to improve readability of `PDFFindController.executeCommand` by (slightly) refactoring the code responsible for calling `PDFFindController._nextMatch` Unfortunately the `PDFFindController.executeCommand` method has now become a bit more complicated than one would like, but hopefully this small change will improve the structure somewhat (especially for subsequent patches).	2018-11-03 11:48:40 +01:00
Jonas Jenwald	e2e9657ed0	Remove the `attachDOMEventsToEventBus` functionality, since `EventBus` instances are able to re-dispatch events to the DOM (PR 10019, bug 1492849 follow-up) This also removes the old 'pagechange'/'scalechange'/'documentload' events.	2018-10-31 23:32:39 +01:00
Jonas Jenwald	014b7a3147	Reduce the number of redundant `updatetextlayermatches` events dispatched when calculating matches in `PDFFindController` Currently `PDFFindController._calculateMatch` is (indirectly) dispatching an `updatetextlayermatches` event for every single page of the document. For short documents, such as the `tracemonkey` file, this probably doesn't matter too much, but for documents with a couple of thousand pages it seems unfortunate. It shouldn't be necessary, in general, to dispatch `updatetextlayermatches` events here, since that's already being taken care of in `PDFFindController._updateMatch` which is always called when a match has been found. However, when `highlightAll` is set we still need to ensure that pages which finished rendering before searching began are updated correctly.	2018-10-31 16:05:12 +01:00
Jonas Jenwald	96abb4bbe7	[Regression] Ensure that "Highlight All" is propagated to all pages for 'findagain' events where the findbar was previously closed (PR 10100 follow-up) STR: 1. Open the default viewer, with the `tracemonkey` file. 2. Open the findbar, and search for "trace". 3. Enable the "Highlight All" option. 4. Close the findbar. 5. Re-open the findbar, and click on the "findNext" button. 6. Scroll down to the second page of the document. ER: Since "Highlight All" is active, all matches on the second page should be highlighted. AR: No matches are highlighted on the second page.	2018-10-29 19:50:29 +01:00
Tim van der Meij	991a574c60	Merge pull request #10184 from Snuffleupagus/findbarclose-abort Ensure that matches are not scrolled into view after the findbar has been closed (PR 10100 follow-up)	2018-10-28 14:01:03 +01:00
Tim van der Meij	04ce2afd4a	Merge pull request #10182 from Snuffleupagus/TextLayerBuilder-rm-findController-checks Small clean-up of the search related methods in `TextLayerBuilder`	2018-10-28 13:45:01 +01:00
Jonas Jenwald	5dc12f9a6d	Only normalize the search query once, in `PDFFindController`, for every page being searched For a short document, such as e.g. the `tracemonkey` file, this repeated normalization won't matter much, but for documents with a couple of thousand pages it seems completely unnecessary (and wasteful) to keep repeating the normalization for every single page.	2018-10-27 11:44:24 +02:00
Jonas Jenwald	84ae4f9a5e	Only normalize the text-content once, in `PDFFindController`, and not on every new search operation Currently the text-content is normalized every time that a new search operation is started, which seems completely useless considering that the "raw" text-content is never used for anything. For a short document, such as e.g. the `tracemonkey` file, this repeated normalization won't matter much, but for documents with a couple of thousand pages it seems completely unnecessary (and wasteful) to keep repeating the normalization whenever e.g. a new search operation starts.	2018-10-26 20:23:32 +02:00
Jonas Jenwald	12d8b52c49	Move the `normalize` helper function out of `PDFFindController` In the event that multiple instances of `PDFFindController` ever exists simultaneously, they will all be able to share just one `normalize` function in this way. Furthermore, the regular expression is now created lazily rather than at class construction time.	2018-10-26 18:22:32 +02:00
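The pattern this commit describes, a module-level helper whose regular expression is built lazily on first use, might look roughly like the sketch below. The replacement table is an assumed, abbreviated example and the exact pdf.js table and code may differ.

```javascript
// Assumed, abbreviated replacement table (illustrative only).
const CHARACTERS_TO_NORMALIZE = {
  "\u2018": "'", // left single quotation mark
  "\u2019": "'", // right single quotation mark
  "\u201C": '"', // left double quotation mark
  "\u201D": '"', // right double quotation mark
};

// Shared by every caller in the module; stays null until first needed.
let normalizationRegex = null;

function normalize(text) {
  if (!normalizationRegex) {
    // Build a character class such as /[‘’“”]/g only once, on first use.
    const chars = Object.keys(CHARACTERS_TO_NORMALIZE).join("");
    normalizationRegex = new RegExp(`[${chars}]`, "g");
  }
  return text.replace(normalizationRegex, ch => CHARACTERS_TO_NORMALIZE[ch]);
}

console.log(normalize("\u201Cit\u2019s\u201D")); // prints "it's" (with straight quotes)
```

Because `normalize` lives outside any class, several controller instances can share it, and no regular expression is constructed if no search ever runs.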
Jonas Jenwald	64d75c32bf	Ensure that matches are not scrolled into view after the findbar has been closed (PR 10100 follow-up) Despite all highlighted matches being removed in response to the 'findbarclose' event, there's a risk that a match could still be scrolled into view after the findbar has been closed[1]. Hence we need to ensure that long running searches, particularly those happening in large and/or slow loading documents[2], are ignored as well. --- [1] The match is hidden, as expected, but the document could still scroll unexpectedly. [2] Large documents loaded with `disableAutoFetch = true` and `disableStream = true` set are particularly susceptible to this issue.	2018-10-26 12:43:12 +02:00
Jonas Jenwald	27b21f2558	Add a `_updateAllPages` helper method to `PDFFindController` in order to reduce the amount of event dispatching Given that dispatching the 'updatetextlayermatches' event with `pageIndex = -1` set is now used to target the textLayers of all pages, there's no need to send individual events to every single page during `_nextMatch`. Since there can be an arbitrary number of pages in a document, this small/simple optimization seems too easy to ignore.	2018-10-26 11:50:44 +02:00
Jonas Jenwald	d73a71fb90	Small clean-up of the search related methods in `TextLayerBuilder` This patch does four things: - Change the search related methods in `TextLayerBuilder` to be "private", since they're only called from within the class itself now. - Use `const` for local variables not intended to change in the search related methods in `TextLayerBuilder`. - Remove most `this.findController` checks since they are redundant. Note how both `this._convertMatches` and `this._renderMatches` are only ever called, from `this._updateMatches`, when `this.findController` is actually defined. Hence there's really no need to repeat those checks all over the place, especially with all the relevant methods now being marked as "private". - Always initialize the `this._pageMatchesLength` property with an empty array, to simplify the code in `TextLayerBuilder`.	2018-10-25 21:38:25 +02:00
Jonas Jenwald	2ed3591b22	Make `PDFFindController` less confusing to use, by allowing searching to start when `setDocument` is called This patch is based on something that I noticed while working on PR 10126. The recent re-factoring of `PDFFindController` brought many improvements, among those the fact that access to `BaseViewer` is no longer required. However, with these changes there's one thing which now strikes me as not particularly user-friendly[1]: The fact that in order for searching to actually work, `PDFFindController.setDocument` must be called and a 'pagesinit' event must be dispatched (from somewhere). For all other viewer components, calling the `setDocument` method[2] is enough in order for the component to actually be usable. The `PDFFindController` thus stands out quite a bit, and it also becomes difficult to work with in any sort of custom implementation. For example: Imagine someone trying to use `PDFFindController` separately from the viewer[3], which should now be relatively simple given the re-factoring, and thus having to (somehow) figure out that they'll also need to manually dispatch a 'pagesinit' event for searching to work. Note that the above even affects the unit-tests, where an out-of-place 'pagesinit' event is being used. To attempt to address these problems, I'm thus suggesting that only `setDocument` should be used to indicate that searching may start. For the default viewer and/or the viewer components, `BaseViewer.setDocument` will now call `PDFFindController.setDocument` when the document is ready, thus requiring no outside configuration anymore[4]. For custom implementation, and the unit-tests, it's now as simple as just calling `PDFFindController.setDocument` to allow searching to start. --- [1] I should have caught this during review of PR 10099, but unfortunately it's sometimes not until you actually work with the code in question that things like these become clear. 
[2] Assuming, obviously, that the viewer component in question actually implements such a method :-) [3] There's even a very recent issue, filed by someone trying to do just that. [4] Short of providing a `PDFFindController` instance when creating a `BaseViewer` instance, of course.	2018-10-04 10:28:50 +02:00
Jonas Jenwald	6be4921eaf	Make the clearing of find highlights, when closing the findbar, asynchronous Since searching itself is an asynchronous operation, removal of highlights needs to be asynchronous too since otherwise there's a risk that the events happen in the wrong order and find highlights thus remain visible. Also, this patch will now ensure that only 'findbarclose' events for the current document are handled, since other ones don't really matter. Note in particular that when no document is loaded text-layers are, obviously, not present and subsequently it's unnecessary to attempt to hide non-existent find highlights.	2018-10-03 10:47:14 +02:00
Jonas Jenwald	236871c68b	[Regression] Restore the ability to start searching before a document has loaded, and ignore searches for previously opened documents (PR 10099 follow-up) For many years it's been possible to enter a search term into the findbar(s) before the document has finished loading, such that searching starts immediately once it has loaded. PR 10099 accidentally broke that, which I unfortunately missed during reviewing. Since searching is asynchronous you cannot directly check in `executeCommand` if the document is loaded/current, but need to wait until searching is actually enabled first. Furthermore this patch also ensures that the `_findTimeout` is always correctly cleared given that it adds further asynchronous behaviour to searching, since you obviously only want to deal with searches relevant to the current document.	2018-10-03 10:47:07 +02:00
Tim van der Meij	1b402996cf	Implement a basic unit test for the find controller This commit shows that we can now unit test the find controller and that executing regular queries works. Note that this is only a first step and not a complete suite of unit tests for all possible options of the find controller. While writing this unit test, I found two smaller issues that I addressed directly. The first one is that in the previous find controller refactoring I forgot to rename some occurrences of a now private member variable. Fortunately this did not cause any bugs since we did have a public getter and the fetched value may be changed by reference, but it's nevertheless good to fix. The second issue is that some entries in the `test/unit/clitests.json` file were not correct, resulting in these tests not being executed on e.g., Travis CI.	2018-09-30 18:32:34 +02:00
Tim van der Meij	38ff79186a	Replace callbacks for updating the UI with dispatching events on the event bus This makes it more similar to how other components update the viewer UI and avoids the need to have extra member variables and checks.	2018-09-30 16:59:57 +02:00
Tim van der Meij	e0c811f2ed	Use the link service for getting and setting page information This removes the dependency on a `PDFViewer` instance from the find controller, which makes it more similar to other components and makes it easier to unit test with a mock link service. Finally, we remove the search capabilities from the SVG example since it doesn't work there because there is no separate text layer.	2018-09-30 16:59:46 +02:00
Tim van der Meij	e293c12afc	Implement the `setDocument` method for the find controller Now it follows the same pattern as e.g., the document properties component, which allows us to have one instance of the find controller and set a new document to search upon switching documents. Moreover, this allows us to get rid of the dependency on `pdfViewer` in order to fetch the text content for a page. This is working towards getting rid of the `pdfViewer` dependency upon initializing the component entirely in future commits. Finally, we make the `reset` method private since it's not supposed to be used from the outside anymore now that `setDocument` takes care of this, similar to other components.	2018-09-30 16:57:40 +02:00
Tim van der Meij	b14c1fbc28	Use the `updatetextlayermatches` event for highlighting matches on a page This makes use of the event bus instead of requiring the PDF viewer instance to get the page view for a page and calling `updateMatches` on it.	2018-09-30 16:57:18 +02:00
Jonas Jenwald	f29b4d1116	Clear all find highlights when the findbar is closed (issue 7468) Please note that this will require a `mozilla-central` follow-up patch, in order for this to work in the built-in Firefox PDF viewer as well.	2018-09-26 10:20:45 +02:00
Jonas Jenwald	be7fdf148c	Further ensure that `PDFFindController._requestMatchesCount` won't return broken data (PR 10052 follow-up) This prevents the findbar from intermittently displaying `0 of {number} matches`, which could theoretically happen for large and/or slow loading documents.	2018-09-15 23:45:38 +02:00
Tim van der Meij	67e1e39f99	Move scrolling the selected match into view from the find controller to the text layer builder The find controller should only coordinate finding a string in the document and should not be responsible for presenting the matches to the user. The text layer builder already contains the logic to render the matches in the viewer, so it should also take care of scrolling the selected match into view.	2018-09-13 22:06:01 +02:00
Tim van der Meij	ede414554e	Change `let` to `const` where possible in the find controller Doing so clearly indicates which variables are read-only and may not be mutated, which helps readability and prevents subtle issues.	2018-09-13 22:06:00 +02:00
Tim van der Meij	38c9f5fc24	Mark all private members as such in the find controller Moreover, use getters for all members that are only being read.	2018-09-13 22:05:41 +02:00
Tim van der Meij	a859f0eafd	Remove unnecessary `startedTextExtraction` member variable from the find controller The find controller already has quite a lot of state to maintain. We can avoid keeping track of this member variable because when the find controller is reset, so is the extract text promises array. Therefore, we can just check if that array contains items or not to determine if text extraction already started. Moreover, there is no need to reset the `pageContents` array since the `reset` method already takes care of that.	2018-09-11 21:19:55 +02:00
Tim van der Meij	21d959bb82	Remove unused member variable `hadMatch` from the find controller It's only being assigned, but not read anymore.	2018-09-11 21:19:41 +02:00
