# Life of a Navigation

Navigation is one of the main functions of a browser. It is the process through
which the user loads documents. This documentation traces the life of a
navigation from the time a URL is typed in the URL bar to the time the web page
is completely loaded. This is one example of many types of navigations, some of
which may start in different places (e.g., in the renderer process).

See also:
* [Life of a Navigation tech talk](https://youtu.be/mX7jQsGCF6E) and
  [slides](https://docs.google.com/presentation/d/1YVqDmbXI0cllpfXD7TuewiexDNZYfwk6fRdmoXJbBlM/edit),
  for an overview from Chrome University.
* [Navigation Concepts](navigation_concepts.md), for useful notes on
  navigation-related concepts in Chromium.

[TOC]


## BeforeUnload

Once a URL is entered, the first step of a navigation is to execute the
beforeunload event handler of the previous document, if a document is already
loaded. This allows the previous document to ask the user whether they want to
leave, to avoid losing any unsaved data. If the user cancels the navigation at
this prompt, no more work is performed.


## Network Request and Response

If there is no beforeunload handler registered, or the user agrees to proceed,
the next step is making a network request to the specified URL to retrieve the
contents of the document to be rendered. (Note that not all navigations will go
to the actual network, for cases like ServiceWorkers, WebUI, cache, data:, etc.)
Assuming no network error is encountered (e.g. DNS resolution error, socket
connection timeout, etc.), the server will respond with data, with the response
headers coming first. The parsed headers give enough information to determine
what needs to be done next.

The HTTP response code allows the browser process to know whether one of the
following conditions has occurred:

* A successful response follows (2xx)
* A redirect has been encountered (response 3xx)
* An HTTP level error has occurred (response 4xx, 5xx)

There are two cases where a navigation network request can complete without
resulting in a new document being rendered. The first one is HTTP response code
204 or 205, which tells the browser that the response was successful, but there
is no content that follows, and therefore the current document must remain
active. The other case is when the server responds with a `Content-Disposition`
response header indicating that the response must be treated as a download
instead of a navigation.
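
The cases above can be summarized in a small standalone sketch. It is not
Chromium code: the `ResponseOutcome` enum, the `ClassifyResponse` function, and
the `is_attachment` flag are hypothetical names introduced here for
illustration, and the sketch treats every 3xx code as a redirect, as the list
above does.

```cpp
#include <cassert>

// Hypothetical outcome categories for this sketch; not Chromium types.
enum class ResponseOutcome {
  kCommitDocument,         // 2xx with content: commit a new document.
  kFollowRedirect,         // 3xx: make another request.
  kCommitErrorPage,        // 4xx or 5xx: an HTTP-level error.
  kStayOnCurrentDocument,  // 204 or 205: no content follows.
  kDownload,               // Content-Disposition asks for a download.
};

// `is_attachment` is an assumed stand-in for "the Content-Disposition header
// says the response must be treated as a download".
ResponseOutcome ClassifyResponse(int status_code, bool is_attachment) {
  // This sketch only models final (non-informational) response codes.
  assert(status_code >= 200 && status_code <= 599);
  if (status_code == 204 || status_code == 205)
    return ResponseOutcome::kStayOnCurrentDocument;
  if (status_code >= 400)
    return ResponseOutcome::kCommitErrorPage;
  if (status_code >= 300)
    return ResponseOutcome::kFollowRedirect;
  if (is_attachment)
    return ResponseOutcome::kDownload;
  return ResponseOutcome::kCommitDocument;
}
```

For example, `ClassifyResponse(204, false)` yields `kStayOnCurrentDocument`,
while `ClassifyResponse(200, true)` yields `kDownload`; in both cases no new
document is rendered.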

If the server responds with a redirect, Chromium makes another request based on
the HTTP response code and the Location header. The browser continues following
redirects until either an error or a successful response is encountered.
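
As a rough illustration of this loop, the following standalone C++17 sketch
follows redirects against an in-memory map that stands in for the network.
`FakeServer`, `FakeResponse`, `NavigationResult`, and `FollowRedirects` are
hypothetical names, and the `max_redirects` cap is an assumption of the sketch
rather than something stated above.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-ins for a server response and the network.
struct FakeResponse {
  int status_code;
  std::string location;  // Value of the Location header, empty if absent.
};
using FakeServer = std::map<std::string, FakeResponse>;

struct NavigationResult {
  std::string final_url;
  int status_code;
  int redirect_count;
};

// Follows redirects until a non-redirect response is found. Returns
// std::nullopt for an error: an unknown URL (standing in for a network
// error) or more than `max_redirects` redirects.
std::optional<NavigationResult> FollowRedirects(const FakeServer& server,
                                                std::string url,
                                                int max_redirects) {
  assert(max_redirects >= 0);
  int redirects = 0;
  while (true) {
    auto it = server.find(url);
    if (it == server.end())
      return std::nullopt;
    const FakeResponse& response = it->second;
    const bool is_redirect = response.status_code >= 300 &&
                             response.status_code < 400 &&
                             !response.location.empty();
    if (!is_redirect)
      return NavigationResult{url, response.status_code, redirects};
    if (redirects == max_redirects)
      return std::nullopt;
    url = response.location;  // The next request targets the Location URL.
    ++redirects;
  }
}
```

With a server that maps `http://a.test/` to a 302 pointing at `http://b.test/`,
which in turn returns 200, `FollowRedirects` reports `http://b.test/` as the
final URL with one redirect followed.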

Once there are no more redirects, the network stack determines if MIME type
sniffing is needed to detect what type of response the server has sent. This is
only needed if the response is neither a 204/205 nor a download, doesn't
already have a `Content-Type` response header, and doesn't include an
`X-Content-Type-Options: nosniff` response header. If MIME type sniffing is
needed, the network stack will read a small chunk of the actual response data
before proceeding with the commit.
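
The conditions in this paragraph can be written as a single predicate. This is
a minimal sketch that only encodes the rules as stated here; `ResponseInfo` and
`NeedsMimeSniffing` are hypothetical names, and header names are assumed to be
stored lowercased.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical summary of a response; not a Chromium type.
struct ResponseInfo {
  int status_code = 200;
  bool is_download = false;
  // Response headers, with names assumed to be lowercased by the caller.
  std::map<std::string, std::string> headers;
};

// Returns true when, per the rules described above, the response body has to
// be sniffed to determine its MIME type.
bool NeedsMimeSniffing(const ResponseInfo& response) {
  assert(response.status_code >= 200 && response.status_code <= 599);
  if (response.status_code == 204 || response.status_code == 205)
    return false;  // No content follows, so there is nothing to sniff.
  if (response.is_download)
    return false;  // Downloads are not rendered as documents.
  if (response.headers.count("content-type") > 0)
    return false;  // The server already declared a type.
  auto it = response.headers.find("x-content-type-options");
  if (it != response.headers.end() && it->second == "nosniff")
    return false;  // The server opted out of sniffing.
  return true;
}
```

A 200 response with no headers needs sniffing under these rules, while adding
either a `Content-Type` header or `X-Content-Type-Options: nosniff` turns it
off.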


## Commit

At this point the response is passed from the network stack to the browser
process to be used for rendering a new document. The browser process selects
an appropriate renderer process for the new document based on the origin and
headers of the response as well as the current process model and isolation
policy. It then sends the response to the chosen process, waiting for it to
create the document and send an acknowledgement. This acknowledgement from the
renderer process marks the _commit_ time, when the browser process changes its
security state to reflect the new document and creates a session history entry
for the previous document.

As part of creating the new document, the old document needs to be unloaded.
In navigations that stay in the same renderer process, the old document is
unloaded by Blink before the new document is created, including running any
registered unload handlers. In the case of a navigation that goes
cross-process, any unload handlers are executed in the previous document's
process concurrently with the creation of the new document in the new process.

Once the creation of the new document is complete and the browser process
receives the commit message from the renderer process, the navigation is
complete.


## Loading

Even once navigation is complete, the user doesn't actually see the new page
yet. Most people use the word navigation to describe the act of moving from
one page to another, but in Chromium we separate that process into two phases.
So far we have described the _navigation_ phase; once the navigation has been
committed, Chromium moves into the _loading_ phase. Loading consists of
reading the remaining response data from the server, parsing it, rendering the
document so it is visible to the user, executing any script accompanying it,
and loading any subresources specified by the document.

The main reason for splitting into these two phases is that errors are treated
differently before and after a navigation commits. Consider the case where the
server responds with an HTTP error code. When this happens, the browser still
commits a new document, but that document is an error page. The error page is
either generated based on the HTTP response code or read as the response data
from the server. On the other hand, if a successful navigation has committed a
real document and has moved to the loading phase, it is still possible to
encounter an error, for example a network connection can be terminated or
time out. In that case the browser displays as much of the new document as it
can, without showing an error page.
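
The difference between the two phases can be sketched as a small decision
function. The names `Phase`, `ErrorResult`, and `ResultForError` are
hypothetical, and the rule that a server-provided body takes precedence over a
generated error page is an assumption of this sketch; the text above only says
that either source may be used.

```cpp
#include <cassert>

// Hypothetical names for this sketch; not Chromium types.
enum class Phase { kNavigation, kLoading };

enum class ErrorResult {
  kGeneratedErrorPage,       // Error page built from the HTTP response code.
  kServerProvidedErrorPage,  // Error page read from the response data.
  kPartialDocument,          // Show as much of the document as was loaded.
};

// Describes what the user ends up seeing when an error occurs in `phase`.
// `server_sent_body` is only meaningful before commit (assumed rule: a body
// from the server is preferred over a generated page).
ErrorResult ResultForError(Phase phase, bool server_sent_body) {
  if (phase == Phase::kLoading)
    return ErrorResult::kPartialDocument;
  return server_sent_body ? ErrorResult::kServerProvidedErrorPage
                          : ErrorResult::kGeneratedErrorPage;
}
```

Under these assumptions, an error during the loading phase never replaces the
committed document with an error page, regardless of the second argument.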


## WebContentsObserver

Chromium exposes the various stages of navigation and document loading through
methods on the [WebContentsObserver] interface.

### Navigation

* `DidStartNavigation` - invoked after executing the beforeunload event handler
  and before making the initial network request.
* `DidRedirectNavigation` - invoked every time a server redirect is encountered.
* `ReadyToCommitNavigation` - invoked at the time the browser process has
  determined that it will commit the navigation and has picked a renderer
  process for it, but before it has sent it to the renderer process. It is not
  invoked for same-document navigations.
* `DidFinishNavigation` - invoked once the navigation has committed. The commit
  can be either an error page if the server responded with an error code or a
  successful document.


### Loading

* `DidStartLoading` - invoked once per WebContents, when a navigation is about
  to start, after executing the beforeunload handler. This is equivalent to the
  browser UI starting to show a spinner or other visual indicator for
  navigation and is invoked before the `DidStartNavigation` method for the
  navigation.
* `DOMContentLoaded` - invoked per RenderFrameHost, when the document itself
  has completed loading, but possibly before its subresources have completed
  loading.
* `DidFinishLoad` - invoked per RenderFrameHost, when the document and all of
  its subresources have finished loading.
* `DidStopLoading` - invoked once per WebContents, when the top-level document,
  all of its subresources, all subframes, and their subresources have completed
  loading. This is equivalent to the browser UI ceasing to show a spinner or
  other visual indicator for navigation and loading.
* `DidFailLoad` - invoked per RenderFrameHost, when the document load failed,
  for example due to network connection termination before reading all of the
  response data.
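
To make the relative order of these callbacks concrete, the sketch below lists
the method names in the order the descriptions above suggest for one
successful cross-document navigation. It is a standalone model, not code
against the real `WebContentsObserver` interface: `ExpectedObserverCalls` is a
hypothetical helper, and it assumes a single frame with no subframes and no
load failure.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns WebContentsObserver method names in the order suggested by the
// descriptions above, for one successful cross-document navigation in a
// single frame. Assumes no subframes and no load failure.
std::vector<std::string> ExpectedObserverCalls(int redirect_count) {
  assert(redirect_count >= 0);
  // DidStartLoading is described as preceding DidStartNavigation.
  std::vector<std::string> calls = {"DidStartLoading", "DidStartNavigation"};
  // One DidRedirectNavigation per server redirect.
  for (int i = 0; i < redirect_count; ++i)
    calls.push_back("DidRedirectNavigation");
  // The navigation phase ends at commit; the loading phase follows.
  for (const char* name :
       {"ReadyToCommitNavigation", "DidFinishNavigation", "DOMContentLoaded",
        "DidFinishLoad", "DidStopLoading"}) {
    calls.push_back(name);
  }
  return calls;
}
```

For a navigation with two redirects, this model produces nine entries, starting
with `DidStartLoading` and ending with `DidStopLoading`.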


## NavigationThrottles

NavigationThrottles allow observing, deferring, blocking, and canceling a given
navigation. They should not generally be used for modifying a navigation (e.g.,
simulating a redirect), as discussed in
[Navigation Concepts](navigation_concepts.md#rules-for-canceling-navigations).
They are typically registered in
`NavigationThrottleRunner::RegisterNavigationThrottles` or
`ContentBrowserClient::CreateThrottlesForNavigation`.

The most common NavigationThrottle events are `WillStartRequest`,
`WillRedirectRequest`, and `WillProcessResponse`, which allow intercepting a
navigation before sending the network request, during any redirects, and after
receiving the response. These events are only invoked on navigations that
require a URLLoader (see `NavigationRequest::NeedsUrlLoader`).
A NavigationThrottle that wishes to intercept a non-URLLoader navigation
(same-document navigations, about:blank, etc.) should register itself in
`NavigationThrottleRunner::RegisterNavigationThrottlesForCommitWithoutUrlLoader`,
and will get a single `WillCommitWithoutUrlLoader` event instead of the full
set of events centered on network requests. Page-activation navigations, such
as activating a prerendered page or restoring a page from the back-forward
cache, skip NavigationThrottles entirely.
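
The three cases in this paragraph can be modeled as follows. This is a
standalone sketch rather than code against the real `NavigationThrottle`
class: `NavigationKind` and `ExpectedThrottleEvents` are hypothetical names,
and the sketch assumes that no throttle cancels or defers the navigation and
that the request does not fail.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical classification of navigations for this sketch.
enum class NavigationKind {
  kNeedsUrlLoader,          // Regular navigations that make a request.
  kCommitWithoutUrlLoader,  // Same-document navigations, about:blank, etc.
  kPageActivation,          // Prerender activation, back-forward cache restore.
};

// Returns the throttle event names a navigation of `kind` would trigger, in
// order, assuming nothing cancels it along the way.
std::vector<std::string> ExpectedThrottleEvents(NavigationKind kind,
                                                int redirect_count) {
  assert(redirect_count >= 0);
  std::vector<std::string> events;
  switch (kind) {
    case NavigationKind::kNeedsUrlLoader:
      events.push_back("WillStartRequest");
      for (int i = 0; i < redirect_count; ++i)
        events.push_back("WillRedirectRequest");
      events.push_back("WillProcessResponse");
      break;
    case NavigationKind::kCommitWithoutUrlLoader:
      events.push_back("WillCommitWithoutUrlLoader");
      break;
    case NavigationKind::kPageActivation:
      break;  // Throttles are skipped entirely.
  }
  return events;
}
```

In this model a URLLoader navigation with one redirect yields three events, a
same-document navigation yields only `WillCommitWithoutUrlLoader`, and a
page-activation navigation yields none.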

[WebContentsObserver]: https://source.chromium.org/chromium/chromium/src/+/main:content/public/browser/web_contents_observer.h