# Navigation Concepts

This document covers a set of important concepts related to navigation. For a
timeline of how a given navigation proceeds, see [Life of a
Navigation](navigation.md).

[TOC]


## Same-Document and Cross-Document Navigations

Chromium defines two types of navigations based on whether the navigation
results in a new document or not. A _cross-document_ navigation is one that
results in creating a new document to replace an existing document. This is
the type of navigation that most users are familiar with. A _same-document_
navigation does not create a new document, but rather keeps the same document
and changes state associated with it. A same-document navigation does create a
new session history entry, even though the same document remains active. A
same-document navigation can be the result of one of the following cases:

* Navigating to a fragment within an existing document (e.g.
  `https://foo.com/1.html#fragment`).
* A document calling the `history.pushState()` or `history.replaceState()` APIs.
* A new document created via `document.open()`, which may change the URL to
  match the document that initiated the call (possibly from another frame).
* A session history navigation that stays in the same document, such as going
  back/forward to an existing entry for the same document.

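As a rough illustration of the first case in the list above, the sketch below
checks whether a navigation changes only the fragment of the current URL. The
helper names `StripFragment` and `IsFragmentNavigation` are hypothetical and are
not Chromium APIs. The real classification handles many more cases (for
example, `pushState` and session history navigations).

```cpp
#include <string_view>

// Returns `url` with any "#fragment" suffix removed.
// Hypothetical helper for illustration only.
std::string_view StripFragment(std::string_view url) {
  // find() returns npos when there is no '#', and substr(0, npos) is the
  // whole string, so URLs without a fragment are returned unchanged.
  return url.substr(0, url.find('#'));
}

// Returns true if navigating from `current_url` to `target_url` would only
// change the fragment: the target carries a fragment and the two URLs are
// equal once fragments are ignored. This is a simplification of the real
// same-document check.
bool IsFragmentNavigation(std::string_view current_url,
                          std::string_view target_url) {
  return target_url.find('#') != std::string_view::npos &&
         StripFragment(current_url) == StripFragment(target_url);
}
```

Under this sketch, going from `https://foo.com/1.html` to
`https://foo.com/1.html#fragment` counts as a fragment navigation, while going
to `https://foo.com/2.html#fragment` does not.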

## Browser-Initiated and Renderer-Initiated Navigations

Chromium also defines two types of navigations based on which process started
the navigation: _browser-initiated_ and _renderer-initiated_. This distinction
is useful when making decisions about navigations, for example whether an
ongoing navigation needs to be cancelled or not when a new navigation is
starting. It is also used for some security decisions, such as whether to
display the target URL of the navigation in the address bar or not.
Browser-initiated navigations are more trustworthy, as they are usually in
response to a user interaction with the UI of the browser. Renderer-initiated
navigations originate in the renderer process, which may be under the control of
an attacker. Note that some renderer-initiated navigations may be considered
user-initiated, if they were performed with a [user
activation](https://mustaqahmed.github.io/user-activation-v2/) (e.g., links),
while others are not user-initiated (e.g., script navigations).


## Last Committed, Pending, and Visible URLs

Many features care about the URL or Origin of a given document, or about a
pending navigation, or about what is showing in the address bar. These are all
different concepts with different security implications, so be sure to use the
correct value for your use case.

See [Origin vs URL](security/origin-vs-url.md) when deciding whether to check
the Origin or URL. In many cases that care about the security context, Origin
should be preferred.

The _last committed_ URL or Origin represents the document that is currently in
the frame, regardless of what is showing in the address bar. This is almost
always what should be used for feature-related state, unless a feature is
explicitly tied to the address bar (e.g., padlock icon). This is empty if no
navigation has ever committed (e.g., if a tab is newly opened for a navigation
that is then cancelled). See
`RenderFrameHost::GetLastCommittedOrigin` (or URL) and
`NavigationController::GetLastCommittedEntry`.

The _pending_ URL exists when a main frame navigation has started but has not
yet committed. This URL is only sometimes shown to the user in the address bar;
see the description of visible URLs below. Features should rarely need to care
about the pending URL, unless they are probing for a navigation they expect to
have started. See `NavigationController::GetPendingEntry`.

The _visible_ URL is what the address bar displays. This is carefully managed to
show the main frame's last committed URL in most cases, and the pending URL in
cases where it is safe and unlikely to be abused for a _URL spoof attack_ (where
an attacker is able to display content as if it came from a victim URL). In
general, the visible URL is:

* The pending URL for browser-initiated navigations like typed URLs or
  bookmarks, excluding session history navigations. This becomes empty if the
  navigation is cancelled.
* The last committed URL for renderer-initiated navigations, where an attacker
  might have control over the contents of the document and the pending URL.
  This is also used when there is no ongoing navigation, and it is empty when
  no navigation has ever committed.
* A renderer-initiated navigation's URL is only visible while pending if it
  opens in a new unmodified tab (so that an unhelpful `about:blank` URL is not
  displayed), but only until another document tries to access the initial empty
  document of the new tab. For example, an attacker window might open a new tab
  to a slow victim URL, then inject content into the initial `about:blank`
  document as if the slow URL had committed. If that occurs, the visible URL
  reverts to `about:blank` to avoid a URL spoof scenario. Once the initial
  navigation commits in the new tab, pending renderer-initiated navigation URLs
  are no longer displayed.

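The three rules above can be condensed into a small decision function. The
following is a simplified model rather than Chromium's implementation: the
`TabState` and `PendingNavigation` types, their fields, and `VisibleUrl` are
all hypothetical names introduced for illustration. It also assumes the
initial empty document of a newly opened tab is recorded as `about:blank`.

```cpp
#include <optional>
#include <string>

// Hypothetical model of a pending main frame navigation.
struct PendingNavigation {
  std::string url;
  bool browser_initiated = false;
  bool is_session_history = false;
};

// Hypothetical model of the state that determines the visible URL.
struct TabState {
  // Unset if no navigation has ever committed.
  std::optional<std::string> last_committed_url;
  std::optional<PendingNavigation> pending;
  // True for a newly opened tab whose initial navigation has not committed.
  bool is_new_unmodified_tab = false;
  // True once another document has accessed the initial empty document.
  bool initial_document_accessed = false;
};

// Returns the URL the address bar would show under the rules listed above.
std::string VisibleUrl(const TabState& tab) {
  if (tab.pending) {
    const PendingNavigation& nav = *tab.pending;
    // Rule 1: browser-initiated, non-history navigations show the pending URL.
    if (nav.browser_initiated && !nav.is_session_history) {
      return nav.url;
    }
    // Rule 3: renderer-initiated navigations show the pending URL only in a
    // new unmodified tab whose initial document has not been accessed.
    if (!nav.browser_initiated && tab.is_new_unmodified_tab &&
        !tab.initial_document_accessed) {
      return nav.url;
    }
  }
  // Rule 2: otherwise fall back to the last committed URL, which is empty if
  // nothing has committed.
  return tab.last_committed_url.value_or("");
}
```

In this model, flipping `initial_document_accessed` to true for a popup that is
still loading makes the function fall back from the pending URL to
`about:blank`, which corresponds to the spoof defense described in the last
bullet.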

## Virtual URLs

Virtual URLs are a way for features to change how certain URLs are displayed to
the user (whether visible or committed). They are generally implemented using
BrowserURLHandlers. Examples include:

* View Source URLs, where the `view-source:` prefix is not present in the
  actual committed URL.
* DOM Distiller URLs, where the original URL is displayed to the user rather
  than the more complex distiller URL.


## Redirects

Navigations can redirect to other URLs in two different ways.

A _server redirect_ happens when the browser receives a 300-level HTTP response
code before the document commits, telling it to request a different URL,
possibly cross-origin. The new request will usually be an HTTP GET request,
unless the redirect is triggered by a 307 or 308 response code, which preserves
the original request method and body. Server redirects are managed by a single
NavigationRequest. No document is committed to session history, but the original
URL remains in the redirect chain.

In contrast, a _client redirect_ happens after a document has committed, when
the HTML in the document instructs the browser to request a new document (e.g.,
via meta tags or JavaScript). Blink classifies the navigation as a client
redirect based partly on how much time has passed. In this case, a session
history item is created for the redirecting document, but it gets replaced when
the actual destination document commits. A separate NavigationRequest is used
for the second navigation.

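To make the method handling for server redirects concrete, here is a minimal
sketch of how the request method changes across a redirect. It follows the
method-rewriting rule in the Fetch Standard's HTTP-redirect steps and does not
reproduce Chromium's network stack code. `RedirectedRequest` and
`FollowRedirect` are hypothetical names.

```cpp
#include <string>

// Hypothetical result type: the method used for the follow-up request and
// whether the original request body is sent again.
struct RedirectedRequest {
  std::string method;
  bool keeps_body;
};

// Computes the follow-up request for a redirect with the given status code.
// 301/302 rewrite POST to GET, and 303 rewrites everything except GET/HEAD
// to GET. 307 and 308 never rewrite, so the method and body are preserved.
RedirectedRequest FollowRedirect(int status, const std::string& method) {
  const bool rewrite_to_get =
      ((status == 301 || status == 302) && method == "POST") ||
      (status == 303 && method != "GET" && method != "HEAD");
  if (rewrite_to_get) {
    return {"GET", /*keeps_body=*/false};
  }
  return {method, /*keeps_body=*/true};
}
```

For example, a form POST answered with a 302 is followed up with a GET and no
body, while the same POST answered with a 307 is re-sent as a POST with its
body.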

## Concurrent Navigations

Many navigations can be in progress simultaneously. In general, every frame is
considered independent and may have its own navigation(s), with each tracked by
a NavigationRequest. Within a frame, it is possible to have multiple concurrent
navigations:

* **A cross-document navigation waiting for its final response (at most one per
  frame).** The NavigationRequest is owned by FrameTreeNode during this stage,
  which can take several seconds. Some special case navigations do not use a
  network request and skip this stage (e.g., `about:blank`, `about:srcdoc`,
  MHTML).
* **A queue of cross-document navigations that are between "ready to commit"
  and "commit," while the browser process waits for a commit acknowledgement
  from the renderer process.** While rare, it is possible for multiple
  navigations to be in this stage concurrently if the renderer process is slow.
  The NavigationRequests are owned by the RenderFrameHost during this stage,
  which is usually short-lived.
* **Same-document navigations.** These can be:
    * Renderer-initiated (e.g., `pushState`, fragment link click). In this case,
      the browser process creates and destroys a NavigationRequest in the same
      task.
    * Browser-initiated (e.g., omnibox fragment change). In this case, the
      browser process creates a NavigationRequest owned by the RenderFrameHost
      and immediately tells the renderer to commit.

Note that the navigation code is not re-entrant. Callers must not start a new
navigation while a call to `NavigateWithoutEntry` or
`NavigateToExistingPendingEntry` is on the stack, to avoid a CHECK that guards
against use-after-free for `pending_entry_`.


## Rules for Canceling Navigations

We generally do not want an abusive page to prevent the user from navigating
away, such as by endlessly starting new navigations that interrupt or cancel the
user's attempts. Generally, a new navigation will cancel an existing one in a
frame, but we make the following exception: a renderer-initiated navigation is
ignored iff there is an ongoing browser-initiated navigation and the new
navigation lacks a user activation. (This is implemented in
`Navigator::ShouldIgnoreIncomingRendererRequest`.)

NavigationThrottles also have an ability to cancel navigations when desired by a
feature. Keep in mind that it is problematic to simulate a redirect by canceling
a navigation and starting a new one, since this may lose relevant context from
the original navigation (e.g., ReloadType, CSP state, Sec-Fetch-Metadata state,
redirect chain, etc.), and it will lead to unexpected observer events and metrics
(e.g., extra navigation starts, inflated numbers of canceled navigations, etc.).
Feature authors that want to simulate redirects may want to consider using a
URLLoaderRequestInterceptor instead.

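The exception stated at the start of this section can be written as a single
predicate. The sketch below is a simplified model and not the body of
`Navigator::ShouldIgnoreIncomingRendererRequest`. The `IncomingNavigation` type
and the `ShouldIgnoreIncoming` function are hypothetical names.

```cpp
// Hypothetical description of a navigation that is about to start.
struct IncomingNavigation {
  bool renderer_initiated;
  bool has_user_activation;
};

// Returns true if the incoming navigation should be ignored rather than
// allowed to cancel the ongoing one. This happens only when all three
// conditions hold: the incoming navigation is renderer-initiated, the ongoing
// navigation is browser-initiated, and the incoming one has no user
// activation.
bool ShouldIgnoreIncoming(bool ongoing_is_browser_initiated,
                          const IncomingNavigation& incoming) {
  return incoming.renderer_initiated && ongoing_is_browser_initiated &&
         !incoming.has_user_activation;
}
```

In this model, a script-driven navigation that starts while the user's typed
URL is still loading is dropped, whereas a link click with a user activation
still replaces the ongoing navigation.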

## Error Pages

There are several types of error pages that can be displayed when a navigation
is not successful.

The server can return a custom error page, such as a 400 or 500 level HTTP
response code page. These pages are rendered much like a successful navigation
to the site (and go into an appropriate process for that site), but the error
code is available and `NavigationHandle::IsErrorPage()` is true.

If the navigation fails to get a response from the server (e.g., the DNS lookup
fails), then Chromium will display an error page. For main frames, this error
page will be in a special error page process, not affiliated with any site or
containing any untrustworthy content from the web. In these failed cases,
NetErrorHelperCore may try to reload the URL at a later time (e.g., if a network
connection comes back online), to load the document in an appropriate process.

If instead the navigation is blocked (e.g., by an extension API or a
NavigationThrottle), then Chromium will similarly display an error page in a
special error page process. However, in blocked cases, Chromium will not attempt
to reload the URL at a later time.


## Interstitial Pages

Interstitial pages are implemented as committed error pages. (Prior to
[issue 448486](https://crbug.com/448486), they were implemented as overlays.)
The original in-progress navigation is canceled when the interstitial is
displayed, and Chromium repeats the navigation if the user chooses to proceed.

Note that some interstitials can be shown after a page has committed (e.g., when
a subresource load triggers a Safe Browsing error). In this case, Chromium
navigates away from the original page to the interstitial page, with the intent
of replacing the original NavigationEntry. However, the original NavigationEntry
is preserved in `NavigationControllerImpl::entry_replaced_by_post_commit_error_`
in case the user chooses to dismiss the interstitial and return to the original
page.