# Web Test Baseline Fallback

*** promo
Read [Web Test Expectations and Baselines](web_test_expectations.md) first
if you have not.
***

Baselines can vary by platform, in which case we need to check in multiple
versions of a baseline. At the same time, we would like to avoid storing
identical baselines by allowing one platform to fall back to another. This
document first introduces how platform-specific baselines are structured and how
we search for a baseline (the fallback mechanism), and then goes into the
details of baseline optimization and rebaselining.

[TOC]

## Terminology

* **Root directory**:
  [`//src/third_party/blink/web_tests`](../../third_party/blink/web_tests)
  is the root directory (of all the web tests and baselines). All relative
  paths in this document start from this directory.
* **Test name**: the name of a test is its relative path from the root
  directory (e.g. `html/dom/foo/bar.html`).
* **Baseline name**: replacing the extension of a test name with
  `-expected.{txt,png,wav}` gives the corresponding baseline name.
* **Virtual tests**: tests can have virtual variants. For example,
  `virtual/gpu/html/dom/foo/bar.html` is the virtual variant of
  `html/dom/foo/bar.html` in the `gpu` suite. Only the latter file exists on
  disk, and is called the base of the virtual test. See
  [Web Tests#Testing Runtime Flags](web_tests.md#testing-runtime-flags)
  for more details.
* **Platform directory**: each directory under
  [`platform/`](../../third_party/blink/web_tests/platform) is a platform
  directory that contains baselines (no tests) for that platform. Directory
  names are in the form of `PLATFORM-VERSION` (e.g. `mac-mac10.12`), except
  for the latest version of a platform which is just `PLATFORM` (e.g. `mac`).
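
The naming rules above can be sketched in a few lines of Python. This is only an
illustration, not code from `blinkpy`: the helper names `baseline_name` and
`base_test_name` are hypothetical, and the sketch ignores test names that carry
anything after the file extension.

```python
import posixpath


def baseline_name(test_name: str, extension: str = "txt") -> str:
    """Replace the test's extension with `-expected.<extension>`.

    Hypothetical helper; `extension` is one of txt, png or wav.
    """
    stem, _ = posixpath.splitext(test_name)
    return f"{stem}-expected.{extension}"


def base_test_name(virtual_test_name: str) -> str:
    """Strip the `virtual/<suite>/` prefix to get the base test name.

    Hypothetical helper; the base is the only file that exists on disk.
    """
    prefix, _suite, base = virtual_test_name.split("/", 2)
    if prefix != "virtual":
        raise ValueError(f"not a virtual test name: {virtual_test_name}")
    return base
```

For example, `baseline_name("html/dom/foo/bar.html", "png")` gives
`html/dom/foo/bar-expected.png`, and
`base_test_name("virtual/gpu/html/dom/foo/bar.html")` gives
`html/dom/foo/bar.html`.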

## Baseline fallback

Each platform has a pre-configured fallback that is used when a baseline cannot
be found in its own platform directory. As a general rule, older versions of an
OS fall back to newer versions. In addition, Android falls back to Linux, which
in turn falls back to Windows. Eventually, all platforms fall back to the root
directory (i.e. the generic baselines that live alongside tests). The rules are
configured by `FALLBACK_PATHS` in each Port class in
[`//src/third_party/blink/tools/blinkpy/web_tests/port`](../../third_party/blink/tools/blinkpy/web_tests/port).

All platforms can be organized into a tree based on their fallback relations (we
are not considering virtual test suites yet). See the lower half (the
non-virtual subtree) of this
[graph](https://docs.google.com/drawings/d/13l3IUlSE99RoKjDwEWuY1O77simAhhF6Wi0fZdkSaMA/).
Walking from a platform to the root gives the **search path** of that platform.
We check each directory on the search path in order to see whether "directory +
baseline name" points to a file on disk (note that baseline names are relative
paths), and stop at the first one found.
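
A minimal sketch of this lookup follows. The `FALLBACK_PARENT` map is a
hypothetical, simplified stand-in for the real `FALLBACK_PATHS` configuration
(the actual platform list differs), the function names are made up for this
example, and a set of paths stands in for the files on disk.

```python
import posixpath

# Hypothetical fallback tree: each platform directory maps to the directory it
# falls back to. The empty string denotes the root directory.
FALLBACK_PARENT = {
    "platform/mac-mac10.12": "platform/mac",
    "platform/mac": "",
    "platform/android": "platform/linux",
    "platform/linux": "platform/win",
    "platform/win": "",
}


def search_path(platform_dir):
    """Walk from a platform directory up to the root directory."""
    path = [platform_dir]
    while path[-1] != "":
        path.append(FALLBACK_PARENT[path[-1]])
    return path


def find_baseline(platform_dir, baseline_name, files_on_disk):
    """Return the first existing 'directory + baseline name', or None."""
    for directory in search_path(platform_dir):
        candidate = posixpath.join(directory, baseline_name)
        if candidate in files_on_disk:
            return candidate
    return None
```

With this map, Android finds a baseline stored under `platform/win` before it
ever looks at the generic baseline in the root directory.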

### Virtual test suites

Now we add virtual test suites to the picture, using a test named
`virtual/gpu/html/dom/foo/bar.html` as an example to demonstrate the process.
The baseline search process for a virtual test consists of two passes:

1. Treat the virtual test name as a regular test name and search for the
   corresponding baseline name using the same search path, which means we are in
   fact searching in directories like `platform/*/virtual/gpu/...`, and
   eventually `virtual/gpu/...` (a.k.a. the virtual root).
2. If no baseline can be found so far, we retry with the non-virtual (base) test
   name `html/dom/foo/bar.html` and walk the search path again.
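
The two passes can be sketched as follows. This is an illustration under stated
assumptions rather than the `blinkpy` implementation: `SEARCH_PATH` is a
hypothetical search path for a single platform, both function names are made up,
and a set of paths stands in for the files on disk.

```python
import posixpath

# Hypothetical search path of one platform; "" denotes the root directory.
SEARCH_PATH = ["platform/linux", "platform/win", ""]


def first_existing(baseline_name, files_on_disk):
    """One pass: check every directory on the search path in order."""
    for directory in SEARCH_PATH:
        candidate = posixpath.join(directory, baseline_name)
        if candidate in files_on_disk:
            return candidate
    return None


def find_virtual_baseline(virtual_baseline_name, files_on_disk):
    """Two-pass search for a baseline of a virtual test."""
    # Pass 1: treat the virtual name as a regular name. This ends at the
    # virtual root, e.g. virtual/gpu/html/dom/foo/bar-expected.txt.
    found = first_existing(virtual_baseline_name, files_on_disk)
    if found is not None:
        return found
    # Pass 2: drop the virtual/<suite>/ prefix and walk the same path again.
    _virtual, _suite, base_name = virtual_baseline_name.split("/", 2)
    return first_existing(base_name, files_on_disk)
```

Note that a baseline at the virtual root wins over a platform-specific baseline
of the base test, because the first pass completes before the second one starts.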

The [graph](https://docs.google.com/drawings/d/13l3IUlSE99RoKjDwEWuY1O77simAhhF6Wi0fZdkSaMA/)
visualizes the full picture. Note that the two passes are in fact the same with
different test names, so the virtual subtree is a mirror of the non-virtual
subtree. The two trees are connected by the virtual root that has different
ancestors (fallbacks) depending on which platform we start from; this is the
result of the two-pass baseline search.

*** promo
__Note:__ there are in fact two more places to be searched before everything
else: additional directories given via command line arguments and flag-specific
baseline directories. They are maintained manually and are not discussed in this
document.
***

## Tooling implementation

This section describes the implications the fallback mechanism has on the
implementation details of tooling, namely `blink_tool.py`. If you are not
hacking `blinkpy`, you can stop here.

### Optimization

We can remove a baseline if it is the same as its fallback. An extreme example
is that if all platforms have the same result, we can just have a single generic
baseline. Here is the algorithm used by
[`blink_tool.py optimize-baselines`](../../third_party/blink/tools/blinkpy/common/checkout/baseline_optimizer.py)
to optimize the duplication away.

Notice from the previous section that the virtual and non-virtual parts are two
identically structured subtrees. Trees are easy to work with: we can simply
traverse the tree from the leaves up to the root, and if two nodes on the path
have identical baselines and are either adjacent or separated only by nodes
without baselines, keep the one closer to the root (delete the baseline on the
node further from the root).
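
The rule can be sketched on a single subtree as below. This is a simplified
illustration, not the actual `baseline_optimizer.py` code: the tree shape is
hypothetical, `optimize` is a made-up name, and baseline contents are
represented by short strings instead of file digests.

```python
def optimize(parent, baselines):
    """Return a copy of `baselines` without the redundant entries.

    parent:    maps each node to the node it falls back to (root maps to None).
    baselines: maps nodes that have a baseline to its content.
    """
    result = dict(baselines)
    for node in baselines:
        # Find the nearest ancestor that has a baseline, skipping empty nodes.
        ancestor = parent[node]
        while ancestor is not None and ancestor not in baselines:
            ancestor = parent[ancestor]
        # Identical to its effective fallback: the one further from the root goes.
        if ancestor is not None and baselines[ancestor] == baselines[node]:
            del result[node]
    return result


# Hypothetical fallback tree and baselines.
PARENT = {
    "android": "linux", "linux": "win", "win": "root",
    "mac-mac10.12": "mac", "mac": "root", "root": None,
}
optimized = optimize(PARENT, {"android": "A", "win": "A", "mac": "B", "root": "B"})
```

Here `android` is dropped because it matches `win` (with the empty `linux` in
between), and `mac` is dropped because it matches the root. Comparing against
the original baselines rather than the partially optimized ones is safe: a
removed baseline is always identical to the one that replaces it on the search
path.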

The virtual root is special because it has multiple parents. Yet if we can cut
the edges between the two subtrees (i.e. make the virtual subtree
self-contained), we can apply the same algorithm to both of them. A subtree is
self-contained when it does not need to fall back to ancestors, which can be
guaranteed by placing a baseline on its root. If the virtual root already has a
baseline, we can simply ignore these edges without doing anything; otherwise, we
need to make sure all children of the virtual root have baselines by copying
the non-virtual fallbacks to the ones that do not (we cannot copy the generic
baseline to the virtual root because virtual platforms may have different
results).
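
The patching step might look roughly like this. It is a sketch under
assumptions, not the real optimizer: the tree is hypothetical, the names
`resolve` and `patch_virtual_subtree` are made up, and the virtual and
non-virtual baselines are kept in two dictionaries keyed by the same node names.

```python
def resolve(parent, baselines, node):
    """Walk up from `node` and return the first baseline content found."""
    while node is not None:
        if node in baselines:
            return baselines[node]
        node = parent[node]
    return None


def patch_virtual_subtree(parent, virtual, non_virtual, root="root"):
    """Make the virtual subtree self-contained before optimizing it."""
    patched = dict(virtual)
    if root in patched:
        # The virtual root has a baseline: nothing ever falls through it.
        return patched
    for node, up in parent.items():
        # Children of the virtual root that lack a virtual baseline inherit
        # whatever they would have found in the non-virtual subtree.
        if up == root and node not in patched:
            fallback = resolve(parent, non_virtual, node)
            if fallback is not None:
                patched[node] = fallback
    return patched


PARENT = {
    "android": "linux", "linux": "win", "win": "root",
    "mac-mac10.12": "mac", "mac": "root", "root": None,
}
patched = patch_virtual_subtree(PARENT, {"android": "V"}, {"win": "A", "root": "B"})
```

After patching, every search that starts in the virtual subtree ends inside it,
so the tree-based optimization can run on the virtual subtree alone.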

In addition, the optimizer also removes redundant all-PASS testharness.js
results. Such baselines are redundant when there are no other fallbacks later
on the search path (including the case where the all-PASS baselines are at the
root), because `run_web_tests.py` assumes all-PASS testharness.js results when
baselines cannot be found for a platform.
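
As a rough sketch of that rule on the non-virtual subtree: the `ALL_PASS` marker
below is a hypothetical placeholder for "this baseline is an all-PASS
testharness.js result" (the real tool inspects the file contents), the function
name is made up, and repeating the scan until nothing changes is a choice of
this sketch.

```python
ALL_PASS = "ALL-PASS"  # hypothetical marker, not the real baseline text


def remove_redundant_all_pass(parent, baselines):
    """Drop all-PASS baselines that have no baseline later on the search path."""
    remaining = dict(baselines)
    changed = True
    while changed:
        changed = False
        for node, content in list(remaining.items()):
            if content != ALL_PASS:
                continue
            # Look for any baseline later on this node's search path.
            ancestor = parent[node]
            while ancestor is not None and ancestor not in remaining:
                ancestor = parent[ancestor]
            if ancestor is None:
                # Nothing to fall back to: a missing baseline already means
                # all-PASS, so this file carries no information.
                del remaining[node]
                changed = True
    return remaining


PARENT = {
    "android": "linux", "linux": "win", "win": "root",
    "mac-mac10.12": "mac", "mac": "root", "root": None,
}
cleaned = remove_redundant_all_pass(
    PARENT, {"linux": ALL_PASS, "root": ALL_PASS, "mac": "FAIL"})
```

In this example the root baseline is removed first, which then leaves `linux`
without any fallback and makes it redundant as well.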

### Rebaseline

The fallback mechanism also affects the rebaseline tool (`blink_tool.py
rebaseline{-cl}`). When asked to rebaseline a test on some platforms, the tool
downloads results from the corresponding try bots and puts them into the
respective platform directories. This is potentially problematic: because of the
fallback mechanism, the new baselines may affect other platforms that are not
being rebaselined but fall back to the rebaselined platforms.

The solution is to copy the current baselines from the to-be-rebaselined
platforms to all the platforms that immediately fall back to them (i.e. down one
level in the fallback tree) before downloading new baselines. This is done in a
hidden internal command
[`blink_tool.py copy-existing-baselines`](../../third_party/blink/tools/blinkpy/tool/commands/copy_existing_baselines.py),
which is always executed by `blink_tool.py rebaseline`.
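
The copy step can be sketched as follows. This is an illustration, not the
`copy_existing_baselines.py` implementation: the tree and function names are
hypothetical, and skipping children that already have a baseline or are
themselves being rebaselined is an assumption of this sketch.

```python
def resolve(parent, baselines, node):
    """Walk up from `node` and return the first baseline content found."""
    while node is not None:
        if node in baselines:
            return baselines[node]
        node = parent[node]
    return None


def copy_existing_baselines(parent, baselines, rebaselined):
    """Preserve current results for platforms one level below `rebaselined`."""
    updated = dict(baselines)
    for platform in rebaselined:
        current = resolve(parent, baselines, platform)
        if current is None:
            continue  # nothing to preserve
        for child, up in parent.items():
            # Assumption: children with their own baseline, or that are being
            # rebaselined too, do not need a copy.
            if up == platform and child not in rebaselined and child not in updated:
                updated[child] = current
    return updated


PARENT = {"android": "linux", "linux": "win", "win": "root", "root": None}
before = copy_existing_baselines(PARENT, {"win": "old"}, {"win"})
after = dict(before, win="new")  # the downloaded result replaces win only
```

After the new `win` baseline lands, `linux` (and `android`, through `linux`)
still resolves to the old result, so only the rebaselined platform changes.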

Finally, `blink_tool.py rebaseline{-cl}` also runs the optimization at the end
by default.