# GPU Testing

This set of pages documents the setup and operation of the GPU bots and try
servers, which verify the correctness of Chrome's graphically accelerated
rendering pipeline.

[TOC]

## Overview

The GPU bots run a different set of tests than the majority of the Chromium
test machines. The GPU bots specifically focus on tests which exercise the
graphics processor, and whose results are likely to vary between graphics card
vendors.

Most of the tests on the GPU bots are run via the [Telemetry framework].
Telemetry was originally conceived as a performance testing framework, but has
proven valuable for correctness testing as well. Telemetry directs the browser
to perform various operations, like page navigation and test execution, from
external scripts written in Python. The GPU bots launch the full Chromium
browser via Telemetry for the majority of the tests. Using the full browser to
execute tests, rather than smaller test harnesses, has yielded several
advantages: testing what is shipped, improved reliability, and improved
performance.

[Telemetry framework]: https://github.com/catapult-project/catapult/tree/master/telemetry

A subset of the tests, called "pixel tests", grab screen snapshots of the web
page in order to validate Chromium's rendering architecture end-to-end. Where
necessary, GPU-specific results are maintained for these tests. Some of these
tests verify just a few pixels, using handwritten code, in order to use the
same validation for all brands of GPUs.

The GPU bots use the Chrome infrastructure team's [recipe framework], and
specifically the [`chromium`][recipes/chromium] and
[`chromium_trybot`][recipes/chromium_trybot] recipes, to describe what tests to
execute. Compared to the legacy master-side buildbot scripts, recipes make it
easy to add new steps to the bots, change the bots' configuration, and run the
tests locally in the same way that they are run on the bots. Additionally, the
`chromium` and `chromium_trybot` recipes make it possible to send try jobs which
add new steps to the bots. This single capability is a huge step forward from
the previous configuration where new steps were added blindly, and could cause
failures on the tryservers. For more details about the configuration of the
bots, see the [GPU bot details].

[recipe framework]: https://chromium.googlesource.com/external/github.com/luci/recipes-py/+/main/doc/user_guide.md
[recipes/chromium]: https://chromium.googlesource.com/chromium/tools/build/+/main/scripts/slave/recipes/chromium.py
[recipes/chromium_trybot]: https://chromium.googlesource.com/chromium/tools/build/+/main/scripts/slave/recipes/chromium_trybot.py
[GPU bot details]: gpu_testing_bot_details.md

The physical hardware for the GPU bots lives in the Swarming pool\*. The
Swarming infrastructure ([new docs][new-testing-infra], [older but currently
more complete docs][isolated-testing-infra]) provides many benefits:

* Increased parallelism for the tests; all steps for a given tryjob or
  waterfall build run in parallel.
* Simpler scaling: just add more hardware in order to get more capacity. No
  manual configuration or distribution of hardware needed.
* Easier to run certain tests only on certain operating systems or types of
  GPUs.
* Easier to add new operating systems or types of GPUs.
* Clearer description of the binary and data dependencies of the tests. If
  they run successfully locally, they'll run successfully on the bots.

(\* All but a few one-off GPU bots are in the swarming pool. The exceptions to
the rule are described in the [GPU bot details].)

The bots on the [chromium.gpu.fyi] waterfall are configured to always test
top-of-tree ANGLE. This setup is done with a few lines of code in the
[tools/build workspace]; search the code for "angle".

These aspects of the bots are described in more detail below, and in linked
pages. There is a [presentation][bots-presentation] which gives a brief
overview of this documentation and links back to various portions.

<!-- XXX: broken link -->
[new-testing-infra]: https://github.com/luci/luci-py/wiki
[isolated-testing-infra]: https://www.chromium.org/developers/testing/isolated-testing/infrastructure
[chromium.gpu]: https://ci.chromium.org/p/chromium/g/chromium.gpu/console
[chromium.gpu.fyi]: https://ci.chromium.org/p/chromium/g/chromium.gpu.fyi/console
[tools/build workspace]: https://source.chromium.org/chromium/chromium/tools/build/+/HEAD:recipes/recipe_modules/chromium_tests/builders/chromium_gpu_fyi.py
[bots-presentation]: https://docs.google.com/presentation/d/1BC6T7pndSqPFnituR7ceG7fMY7WaGqYHhx5i9ECa8EI/edit?usp=sharing

## Fleet Status

Please see the [GPU Pixel Wrangling instructions] for links to dashboards
showing the status of various bots in the GPU fleet.

[GPU Pixel Wrangling instructions]: http://go/gpu-pixel-wrangler#fleet-status

## Using the GPU Bots

Most Chromium developers interact with the GPU bots in two ways:

1. Observing the bots on the waterfalls.
2. Sending try jobs to them.

The GPU bots are grouped on the [chromium.gpu] and [chromium.gpu.fyi]
waterfalls. Their current status can be easily observed there.

To send try jobs, you must first upload your CL to the codereview server. Then
either click the "CQ dry run" link or run the following from the command line:

```sh
git cl try
```

Either approach sends your job to the default set of try servers.

The GPU tests are part of the default set for Chromium CLs, and are run as part
of the following tryservers' jobs:

* [linux-rel], formerly on the `tryserver.chromium.linux` waterfall
* [mac-rel], formerly on the `tryserver.chromium.mac` waterfall
* [win10_chromium_x64_rel_ng], formerly on the `tryserver.chromium.win` waterfall

[linux-rel]: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux-rel?limit=100
[mac-rel]: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/mac-rel?limit=100
[win10_chromium_x64_rel_ng]: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win10_chromium_x64_rel_ng?limit=100

Scan down through the steps looking for the text "GPU"; that identifies those
tests run on the GPU bots. For each test the "trigger" step can be ignored; the
step further down for the test of the same name contains the results.

It's usually not necessary to explicitly send try jobs just for verifying GPU
tests. If you want to, you must invoke `git cl try` separately for each
tryserver you want to target, for example:

```sh
git cl try -b linux-rel
git cl try -b mac-rel
git cl try -b win7-rel
```

Alternatively, the Gerrit UI can be used to send a patch set to these try
servers.

Three optional tryservers are also available which run additional tests. As of
this writing, they run longer-running tests that can't run against all Chromium
CLs due to lack of hardware capacity. They are automatically included in the
set of tryservers for code changes to certain sub-directories.

* [linux_optional_gpu_tests_rel] on the [luci.chromium.try] waterfall
* [mac_optional_gpu_tests_rel] on the [luci.chromium.try] waterfall
* [win_optional_gpu_tests_rel] on the [luci.chromium.try] waterfall

[linux_optional_gpu_tests_rel]: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_optional_gpu_tests_rel
[mac_optional_gpu_tests_rel]: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/mac_optional_gpu_tests_rel
[win_optional_gpu_tests_rel]: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win_optional_gpu_tests_rel
[luci.chromium.try]: https://ci.chromium.org/p/chromium/g/luci.chromium.try/builders

Tryservers for the [ANGLE project] are also present on the
[tryserver.chromium.angle] waterfall. These are invoked from the Gerrit user
interface. They are configured similarly to the tryservers for regular Chromium
patches, and run the same tests that are run on the [chromium.gpu.fyi]
waterfall, in the same way (e.g., against ToT ANGLE).

If you find it necessary to try patches against other sub-repositories than
Chromium (`src/`) and ANGLE (`src/third_party/angle/`), please
[file a bug](http://crbug.com/new) with component Internals\>GPU\>Testing.

[ANGLE project]: https://chromium.googlesource.com/angle/angle/+/main/README.md
[tryserver.chromium.angle]: https://build.chromium.org/p/tryserver.chromium.angle/waterfall
[file a bug]: http://crbug.com/new

## Running the GPU Tests Locally

All of the GPU tests running on the bots can be run locally from a Chromium
build. Many of the tests are simple executables:

* `angle_unittests`
* `gl_tests`
* `gl_unittests`
* `tab_capture_end2end_tests`

Some run only on the chromium.gpu.fyi waterfall, either because there isn't
enough machine capacity at the moment, or because they're closed-source tests
which aren't allowed to run on the regular Chromium waterfalls:

* `angle_deqp_gles2_tests`
* `angle_deqp_gles3_tests`
* `angle_end2end_tests`
* `audio_unittests`

The remaining GPU tests are run via Telemetry. In order to run them, just
build the `chrome` target and then invoke
`src/content/test/gpu/run_gpu_integration_test.py` with the appropriate
argument. The tests this script can invoke are in
`src/content/test/gpu/gpu_tests/`. For example:

* `run_gpu_integration_test.py context_lost --browser=release`
* `run_gpu_integration_test.py webgl_conformance --browser=release --webgl-conformance-version=1.0.2`
* `run_gpu_integration_test.py maps --browser=release`
* `run_gpu_integration_test.py screenshot_sync --browser=release`
* `run_gpu_integration_test.py trace_test --browser=release`

The pixel tests are a bit special. See
[the section on running them locally](#Running-the-pixel-tests-locally) for
details.

If you're testing on Android and have built and deployed
`ChromePublic.apk` to the device, use `--browser=android-chromium` to
invoke it.

**Note:** The tests require some third-party Python packages. Obtaining these
packages is handled automatically by `vpython`, and the script's shebang should
use vpython if running the script directly. If you're used to invoking `python`
to run a script, simply use `vpython` instead, e.g.
`vpython run_gpu_integration_test.py ...`.

You can run a subset of tests with this harness:

* `run_gpu_integration_test.py webgl_conformance --browser=release
  --test-filter=conformance_attribs`
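
When the harness is invoked repeatedly with different suites or filters, it can
help to assemble the command line in one place. The following is a minimal
Python sketch and not part of the Chromium tree: `gpu_test_command` is a
hypothetical helper name, and it relies only on `vpython` and the `--browser`
and `--test-filter` flags shown above. It builds and prints the argument list
rather than running anything.

```python
import shlex


def gpu_test_command(suite, browser="release", test_filter=None, extra_args=()):
    """Build the argv for one run of run_gpu_integration_test.py.

    Hypothetical helper: it only assembles arguments, it does not run them.
    """
    argv = ["vpython", "run_gpu_integration_test.py", suite, f"--browser={browser}"]
    if test_filter:
        argv.append(f"--test-filter={test_filter}")
    # Any further flags are passed through untouched.
    argv.extend(extra_args)
    return argv


if __name__ == "__main__":
    cmd = gpu_test_command("webgl_conformance", test_filter="conformance_attribs")
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(shlex.join(cmd))
```

The printed line can be run from `src/content/test/gpu/`, or the list can be
handed to `subprocess.run` from the same directory.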

The exact command used to invoke the test on the bots can be found in one of
two ways:

1. Looking at the [json.input][trigger_input] of the trigger step under
   `requests[task_slices][command]`. The arguments after the last `--` are
   used to actually run the test.
1. Looking at the top of a [swarming task][sample_swarming_task].

In both cases, the following can be omitted when running locally since they're
only necessary on swarming:

* `testing/test_env.py`
* `testing/scripts/run_gpu_integration_test_as_googletest.py`
* `--isolated-script-test-output`
* `--isolated-script-test-perf-output`
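
The trimming described above can be sketched in a few lines of Python. This is
an illustration under assumptions, not a tool that exists in the tree:
`local_command` is a hypothetical name, the two flags are assumed to appear in
`--flag=value` form, and the wrapper scripts are matched by path suffix so that
prefixes such as `../../` do not matter.

```python
# Entries the text above lists as only necessary on swarming.
SWARMING_ONLY_PATHS = (
    "testing/test_env.py",
    "testing/scripts/run_gpu_integration_test_as_googletest.py",
)
SWARMING_ONLY_FLAGS = {
    "--isolated-script-test-output",
    "--isolated-script-test-perf-output",
}


def local_command(bot_command):
    """Return the arguments of a bot command that matter for a local run."""
    args = list(bot_command)
    # Keep only what follows the last "--", if there is one.
    if "--" in args:
        last = len(args) - 1 - args[::-1].index("--")
        args = args[last + 1:]
    kept = []
    for arg in args:
        path = arg.replace("\\", "/")    # tolerate Windows separators
        flag = arg.split("=", 1)[0]      # assumes the --flag=value form
        if path.endswith(SWARMING_ONLY_PATHS) or flag in SWARMING_ONLY_FLAGS:
            continue
        kept.append(arg)
    return kept


if __name__ == "__main__":
    # Made-up command in the general shape of a trigger step's command list.
    example = [
        "wrapper", "--wrapper-flag", "--",
        "../../testing/test_env.py",
        "../../testing/scripts/run_gpu_integration_test_as_googletest.py",
        "../../content/test/gpu/run_gpu_integration_test.py",
        "--isolated-script-test-output=out.json",
        "webgl_conformance", "--browser=release",
    ]
    print(local_command(example))
```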

[trigger_input]: https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/8849851608240828544/+/u/test_pre_run__14_/l_trigger__webgl2_conformance_d3d11_passthrough_tests_on_NVIDIA_GPU_on_Windows_on_Windows-10-18363/json.input
[sample_swarming_task]: https://chromium-swarm.appspot.com/task?id=52f06058bfb31b10

The Maps test requires you to authenticate to cloud storage in order to access
the Web Page Replay archive containing the test. See [Cloud Storage Credentials]
for documentation on setting this up.

[Cloud Storage Credentials]: gpu_testing_bot_details.md#Cloud-storage-credentials

### Telemetry Test Suites

The Telemetry-based tests are all technically the same target,
`telemetry_gpu_integration_test`, just run with different runtime arguments. The
first positional argument passed determines which suite will run, and additional
runtime arguments may cause the step name to change on the bots. Here is a list
of all suites and resulting step names as of April 15th 2021:

* `context_lost`
    * `context_lost_passthrough_tests`
    * `context_lost_tests`
    * `context_lost_validating_tests`
    * `gl_renderer_context_lost_tests`
* `depth_capture`
    * `depth_capture_tests`
    * `gl_renderer_depth_capture_tests`
* `hardware_accelerated_feature`
    * `gl_renderer_hardware_accelerated_feature_tests`
    * `hardware_accelerated_feature_tests`
* `gpu_process`
    * `gl_renderer_gpu_process_launch_tests`
    * `gpu_process_launch_tests`
* `info_collection`
    * `info_collection_tests`
* `maps`
    * `gl_renderer_maps_pixel_tests`
    * `maps_pixel_passthrough_test`
    * `maps_pixel_test`
    * `maps_pixel_validating_test`
    * `maps_tests`
* `pixel`
    * `android_webview_pixel_skia_gold_test`
    * `dawn_pixel_skia_gold_test`
    * `egl_pixel_skia_gold_test`
    * `gl_renderer_pixel_skia_gold_tests`
    * `pixel_skia_gold_passthrough_test`
    * `pixel_skia_gold_validating_test`
    * `pixel_tests`
    * `skia_renderer_pixel_skia_gold_test`
    * `vulkan_pixel_skia_gold_test`
* `power`
    * `power_measurement_test`
* `screenshot_sync`
    * `gl_renderer_screenshot_sync_tests`
    * `screenshot_sync_passthrough_tests`
    * `screenshot_sync_tests`
    * `screenshot_sync_validating_tests`
* `trace_test`
    * `trace_test`
* `webgl_conformance`
    * `webgl2_conformance_d3d11_passthrough_tests`
    * `webgl2_conformance_gl_passthrough_tests`
    * `webgl2_conformance_gles_passthrough_tests`
    * `webgl2_conformance_tests`
    * `webgl2_conformance_validating_tests`
    * `webgl_conformance_d3d11_passthrough_tests`
    * `webgl_conformance_d3d9_passthrough_tests`
    * `webgl_conformance_fast_call_tests`
    * `webgl_conformance_gl_passthrough_tests`
    * `webgl_conformance_gles_passthrough_tests`
    * `webgl_conformance_metal_passthrough_tests`
    * `webgl_conformance_swangle_passthrough_tests`
    * `webgl_conformance_swiftshader_validating_tests`
    * `webgl_conformance_tests`
    * `webgl_conformance_validating_tests`
    * `webgl_conformance_vulkan_passthrough_tests`
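
Because step names cannot always be derived from the suite name (for example,
`gpu_process` produces `gpu_process_launch_tests`), a lookup table is one way to
go from a step seen on a bot back to the suite argument. The sketch below is
illustrative only: it copies a small subset of the list above, the names
`SUITE_STEPS` and `suite_for_step` are hypothetical, and it assumes that any
text after the first space in a step name (such as "on NVIDIA GPU on Windows")
can be ignored.

```python
# A subset of the suite list above (as of April 15th 2021), keyed by suite.
SUITE_STEPS = {
    "context_lost": [
        "context_lost_passthrough_tests",
        "context_lost_tests",
        "context_lost_validating_tests",
        "gl_renderer_context_lost_tests",
    ],
    "gpu_process": [
        "gl_renderer_gpu_process_launch_tests",
        "gpu_process_launch_tests",
    ],
    "trace_test": ["trace_test"],
}

# Invert the mapping so that a step name can be looked up directly.
STEP_TO_SUITE = {
    step: suite for suite, steps in SUITE_STEPS.items() for step in steps
}


def suite_for_step(step_name):
    """Return the suite (first positional argument) for a step, or None."""
    # Assumption: hardware/OS suffixes follow the first space and can be dropped.
    base_name = step_name.split(" ", 1)[0]
    return STEP_TO_SUITE.get(base_name)


if __name__ == "__main__":
    print(suite_for_step("gpu_process_launch_tests on NVIDIA GPU"))
```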

### Running the pixel tests locally

The pixel tests are a special case because they use an external Skia service
called Gold to handle image approval and storage. See
[GPU Pixel Testing With Gold] for specifics.

[GPU Pixel Testing With Gold]: gpu_pixel_testing_with_gold.md

In short, the pixel tests use a binary called `goldctl` to download and upload
data.

Normally, `goldctl` uploads images and image metadata to the Gold server when
used. This is not desirable when running locally for a couple of reasons:

1. Uploading requires the user to be whitelisted on the server, and whitelisting
   everyone who wants to run the tests locally is not a viable solution.
2. Images produced during local runs are usually slightly different from those
   that are produced on the bots due to hardware/software differences. Thus, most
   images uploaded to Gold from local runs would likely only ever actually be used
   by tests run on the machine that initially generated those images, which just
   adds noise to the list of approved images.

Additionally, the tests normally rely on the Gold server for viewing images
produced by a test run. This does not work if the data is not actually uploaded.

The pixel tests contain logic to automatically determine whether they are
running on a workstation or not, as well as to determine what git revision is
being tested. This *should* mean that the pixel tests will automatically work
when run locally. However, if the local run detection code fails for some
reason, you can manually pass some flags to force the same behavior.

In order to get around the local run issues, simply pass the
`--local-pixel-tests` flag to the tests. This will disable uploading, but
otherwise go through the same steps as a test normally would. Each test will
also print out `file://` URLs to the produced image, the closest image for the
test known to Gold, and the diff between the two.

Because the image produced by the test locally is likely slightly different from
any of the approved images in Gold, local test runs are likely to fail during
the comparison step. In order to cut down on the amount of noise, you can also
pass the `--no-skia-gold-failure` flag to not fail the test on a failed image
comparison. When using `--no-skia-gold-failure`, you'll also need to pass the
`--passthrough` flag in order to actually see the link output.

Example usage:
`run_gpu_integration_test.py pixel --no-skia-gold-failure --local-pixel-tests
--passthrough`

If, for some reason, the local run code is unable to determine what the git
revision is, simply pass `--git-revision aabbccdd`. Note that `aabbccdd` must
be replaced with an actual Chromium src revision (typically whatever revision
origin/main is currently synced to) in order for the tests to work. This can
be done automatically using:
``run_gpu_integration_test.py pixel --no-skia-gold-failure --local-pixel-tests
--passthrough --git-revision `git rev-parse origin/main` ``
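
For scripted local runs, the same flags can be assembled in Python. This is a
rough sketch rather than an existing utility: `current_revision` and
`local_pixel_command` are hypothetical names, and the only external command
involved is `git rev-parse`, exactly as in the shell form above.

```python
import subprocess


def current_revision(ref="origin/main", cwd=None):
    """Resolve a git ref to a revision by running `git rev-parse`."""
    result = subprocess.run(
        ["git", "rev-parse", ref],
        cwd=cwd, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def local_pixel_command(git_revision=None):
    """Build the argv for a local pixel test run that does not upload."""
    argv = [
        "vpython", "run_gpu_integration_test.py", "pixel",
        "--no-skia-gold-failure", "--local-pixel-tests", "--passthrough",
    ]
    # Only needed when the tests cannot determine the revision themselves.
    if git_revision:
        argv += ["--git-revision", git_revision]
    return argv
```

Inside a Chromium checkout, `local_pixel_command(current_revision())` yields
the equivalent of the last command above.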

## Running Binaries from the Bots Locally

Any binary run remotely on a bot can also be run locally, assuming the local
machine loosely matches the architecture and OS of the bot.

The easiest way to do this is to find the ID of the swarming task and use
`swarming reproduce` to re-run it:

* `./src/tools/luci-go/swarming reproduce -S https://chromium-swarm.appspot.com [task ID]`

The task ID can be found in the stdio for the "trigger" step for the test. For
example, look at the `gl_unittests` step of a recent build from the
[Mac Release (Intel)] bot. You will see something like:

[Mac Release (Intel)]: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20Release%20%28Intel%29/

```
Triggered task: gl_unittests on Intel GPU on Mac/Mac-10.12.6/[TRUNCATED_ISOLATE_HASH]/Mac Release (Intel)/83664
To collect results, use:
  swarming.py collect -S https://chromium-swarm.appspot.com --json /var/folders/[PATH_TO_TEMP_FILE].json
Or visit:
  https://chromium-swarm.appspot.com/user/task/[TASK_ID]
```

There is a difference between the isolate's hash and Swarming's task ID. Make
sure you use the task ID and not the isolate's hash.
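
One way to avoid mixing up the two values is to take the task ID from the
"Or visit:" URL instead of copying it by hand. The sketch below assumes the
trigger output has the shape shown above and that task IDs are lowercase
hexadecimal strings (as in the sample swarming task linked earlier);
`task_id_from_trigger_output` and `reproduce_command` are hypothetical names.

```python
import re

# Matches the "Or visit:" URL printed by the trigger step.
TASK_URL_RE = re.compile(
    r"https://chromium-swarm\.appspot\.com/user/task/([0-9a-f]+)"
)


def task_id_from_trigger_output(stdio):
    """Return the Swarming task ID found in trigger-step stdio, or None."""
    match = TASK_URL_RE.search(stdio)
    return match.group(1) if match else None


def reproduce_command(task_id):
    """Build the `swarming reproduce` argv for the given task ID."""
    return [
        "./src/tools/luci-go/swarming", "reproduce",
        "-S", "https://chromium-swarm.appspot.com", task_id,
    ]
```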

As of this writing, there seems to be a
[bug](https://github.com/luci/luci-py/issues/250)
when attempting to re-run the Telemetry based GPU tests in this way. For the
time being, this can be worked around by instead downloading the contents of
the isolate. To do so, look into the "Reproducing the task locally" section on
a swarming task, which contains something like:

```
Download inputs files into directory foo:
# (if needed, use "\${platform}" as-is) cipd install "infra/tools/luci/cas/\${platform}" -root bar
# (if needed) ./bar/cas login
./bar/cas download -cas-instance projects/chromium-swarm/instances/default_instance -digest 68ae1d6b22673b0ab7b4427ca1fc2a4761c9ee53474105b9076a23a67e97a18a/647 -dir foo
```

Before attempting to download an isolate, you must ensure you have permission
to access the isolate server. Full instructions can be [found
here][isolate-server-credentials]. For most cases, you can simply run:

* `./src/tools/luci-go/isolate login`

The above link requires that you log in with your @google.com credentials. It's
not known at the present time whether this works with @chromium.org accounts.
Email kbr@ if you try this and find it doesn't work.

[isolate-server-credentials]: gpu_testing_bot_details.md#Isolate-server-credentials

## Running Locally Built Binaries on the GPU Bots

See the [Swarming documentation] for instructions on how to upload your binaries
to the isolate server and trigger execution on Swarming.

Be sure to use the correct swarming dimensions for your desired GPU, e.g.
"1002:6613" instead of "AMD Radeon R7 240 (1002:6613)", which is how it appears
on the swarming task page. You can query bots in the chromium.tests.gpu pool to
find the correct dimensions:

* `tools\luci-go\swarming bots -S chromium-swarm.appspot.com -d pool=chromium.tests.gpu`
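
Converting the label shown on the task page into the bare dimension value is a
small string operation. The following sketch assumes the label always carries
the ID in parentheses in `vendor:device` form, as in the example above;
`gpu_dimension` is a hypothetical helper, and anything else inside the
parentheses is kept as-is.

```python
import re

# A parenthesised value starting with "vendor:device", e.g. "(1002:6613)".
GPU_ID_RE = re.compile(r"\(([0-9a-fA-F]{4}:[0-9a-fA-F]{4}[^)]*)\)")


def gpu_dimension(label):
    """Extract the bare GPU ID from a swarming task-page label."""
    match = GPU_ID_RE.search(label)
    # Without a parenthesised ID, assume the input is already the bare value.
    return match.group(1) if match else label


if __name__ == "__main__":
    print(gpu_dimension("AMD Radeon R7 240 (1002:6613)"))
```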

[Swarming documentation]: https://www.chromium.org/developers/testing/isolated-testing/for-swes#TOC-Run-a-test-built-locally-on-Swarming

## Moving Test Binaries from Machine to Machine

To create a zip archive of your personal Chromium build plus all of
the Telemetry-based GPU tests' dependencies, which you can then move
to another machine for testing:

1. Build Chrome (into `out/Release` in this example).
1. `vpython tools/mb/mb.py zip out/Release/ telemetry_gpu_integration_test out/telemetry_gpu_integration_test.zip`

Then copy telemetry_gpu_integration_test.zip to another machine. Unzip
it, and cd into the resulting directory. Invoke
`content/test/gpu/run_gpu_integration_test.py` as above.

This workflow has been tested successfully on Windows with a
statically-linked Release build of Chrome.

Note: on one macOS machine, this command failed because of a broken
`strip-json-comments` symlink in
`src/third_party/catapult/common/node_runner/node_runner/node_modules/.bin`. Deleting
that symlink allowed it to proceed.

Note also: on the same macOS machine, with a component build, this
command failed to zip up a working Chromium binary. The browser failed
to start with the following error:

`[0626/180440.571670:FATAL:chrome_main_delegate.cc(1057)] Check failed: service_manifest_data_pack_.`

In a pinch, this command can still be used to bundle up everything else:
delete the "out" directory from the resulting zip archive, and move the
Chromium binaries over to the target machine separately. Then the
command line arguments `--browser=exact --browser-executable=[path]`
can be used to launch that specific browser.

See the [user guide for mb](../../tools/mb/docs/user_guide.md#mb-zip), the
meta-build system, for more details.

## Adding New Tests to the GPU Bots

The goal of the GPU bots is to avoid regressions in Chrome's rendering stack.
To that end, let's add as many tests as possible that will help catch
regressions in the product. If you see a crazy bug in Chrome's rendering which
would be easy to catch with a pixel test running in Chrome and hard to catch in
any of the other test harnesses, please, invest the time to add a test!

There are a couple of different ways to add new tests to the bots:

1. Adding a new test to one of the existing harnesses.
2. Adding an entire new test step to the bots.

### Adding a new test to one of the existing test harnesses

Adding new tests to the GTest-based harnesses is straightforward and
essentially requires no explanation.

As of this writing it isn't as easy as desired to add a new test to one of the
Telemetry based harnesses. See [Issue 352807](http://crbug.com/352807). Let's
collectively work to address that issue. It would be great to reduce the number
of steps on the GPU bots, or at least to avoid significantly increasing the
number of steps on the bots. The WebGL conformance tests should probably remain
a separate step, but some of the smaller Telemetry based tests
(`context_lost_tests`, `memory_test`, etc.) should probably be combined into a
single step.

If you are adding a new test to one of the existing tests (e.g., `pixel_test`),
all you need to do is make sure that your new test runs correctly via isolates.
See the documentation from the GPU bot details on [adding new isolated
tests][new-isolates] for the gn args and authentication needed to upload
isolates to the isolate server. Most likely the new test will be Telemetry
based, and included in the `telemetry_gpu_test_run` isolate.

[new-isolates]: gpu_testing_bot_details.md#Adding-a-new-isolated-test-to-the-bots

### Adding new steps to the GPU Bots

The tests that are run by the GPU bots are described by a couple of JSON files
in the Chromium workspace:

* [`chromium.gpu.json`](https://chromium.googlesource.com/chromium/src/+/main/testing/buildbot/chromium.gpu.json)
* [`chromium.gpu.fyi.json`](https://chromium.googlesource.com/chromium/src/+/main/testing/buildbot/chromium.gpu.fyi.json)

These files are autogenerated by the following script:

* [`generate_buildbot_json.py`](https://chromium.googlesource.com/chromium/src/+/main/testing/buildbot/generate_buildbot_json.py)

This script is documented in
[`testing/buildbot/README.md`](https://chromium.googlesource.com/chromium/src/+/main/testing/buildbot/README.md). The
JSON files are parsed by the chromium and chromium_trybot recipes, and describe
two basic types of tests:

* GTests: those which use the Googletest and Chromium `base/test/launcher/`
  frameworks.
* Isolated scripts: tests whose initial entry point is a Python script which
  follows a simple convention of command line argument parsing.

The majority of the GPU tests, however, are:

* Telemetry based tests: an isolated script test which is built on the
  Telemetry framework and which launches the entire browser.

A prerequisite of adding a new test to the bots is that the test [run via
isolates][new-isolates]. Once that is done, modify `test_suites.pyl` to add the
test to the appropriate set of bots. Be careful when adding large new test steps
to all of the bots, because the GPU bots are a limited resource and do not
currently have the capacity to absorb large new test suites. It is safer to get
new tests running on the chromium.gpu.fyi waterfall first, and expand from there
to the chromium.gpu waterfall (which will also make them run against every
Chromium CL by virtue of the `linux-rel`, `mac-rel`, `win7-rel` and
`android-marshmallow-arm64-rel` tryservers' mirroring of the bots on this
waterfall – so be careful!).

Tryjobs which add new test steps to the chromium.gpu.json file will run those
new steps during the tryjob, which helps ensure that the new test won't break
once it starts running on the waterfall.

Tryjobs which modify chromium.gpu.fyi.json can be sent to the
`win_optional_gpu_tests_rel`, `mac_optional_gpu_tests_rel` and
`linux_optional_gpu_tests_rel` tryservers to help ensure that they won't
break the FYI bots.

## Debugging Pixel Test Failures on the GPU Bots

If pixel tests fail on the bots, the build step will contain either one or more
links titled `gold_triage_link for <test name>` or a single link titled
`Too many artifacts produced to link individually, click for links`, which
itself will contain links. In either case, these links will direct to Gold
pages showing the image produced by the test and the approved image that most
closely matches it.

Note that for the tests which programmatically check colors in certain regions of
the image (tests with `expected_colors` fields in [pixel_test_pages]), there
likely won't be a closest approved image since those tests only upload data to
Gold in the event of a failure.

[pixel_test_pages]: https://cs.chromium.org/chromium/src/content/test/gpu/gpu_tests/pixel_test_pages.py

## Updating and Adding New Pixel Tests to the GPU Bots

If your CL adds a new pixel test or modifies existing ones, it's likely that
you will have to approve new images. Simply run your CL through the CQ and
follow the steps outlined [here][pixel wrangling triage] under the "Check if any
pixel test failures are actual failures or need to be rebaselined." step.

[pixel wrangling triage]: http://go/gpu-pixel-wrangler-info#how-to-keep-the-bots-green

If you are adding a new pixel test, it is beneficial to set the
`grace_period_end` argument in the test's definition. This will allow the test
to run for a period without actually failing on the waterfall bots, giving you
some time to triage any additional images that show up on them. This helps
prevent new tests from making the bots red because they're producing slightly
different but valid images from the ones triaged while the CL was in review.
Example:

```
from datetime import date

...

PixelTestPage(
    'foo_pixel_test.html',
    ...
    grace_period_end=date(2020, 1, 1)
)
```

You should typically set the grace period to end 1-2 days after the CL will
land.
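
To make the "1-2 days after the CL will land" guidance concrete, the sketch
below computes a candidate end date from the day you expect the CL to land. The
helper name `suggested_grace_period_end` and its `grace_days` parameter are
hypothetical and not part of the test harness; the result is meant to be copied
into the test definition as a literal `date(...)`, since a date computed at
runtime would keep moving forward and the grace period would never end.

```python
from datetime import date, timedelta


def suggested_grace_period_end(expected_land_date, grace_days=2):
    """Returns a date `grace_days` after the expected landing date.

    Hypothetical helper for picking the literal value to write into the
    test definition; it is not part of the pixel test harness.
    """
    if grace_days < 0:
        raise ValueError('grace_days must be non-negative')
    return expected_land_date + timedelta(days=grace_days)


# Example: a CL expected to land on 2019-12-30 with a two-day grace period.
end = suggested_grace_period_end(date(2019, 12, 30))
print('grace_period_end=date(%d, %d, %d)' % (end.year, end.month, end.day))
```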

Once your CL passes the CQ, you should be mostly good to go, although you should
keep an eye on the waterfall bots for a short period after your CL lands in case
any configurations not covered by the CQ need to have images approved as well.
All untriaged images for your test can be found by substituting your test name
into:

`https://chrome-gpu-gold.skia.org/search?query=name%3D<test name>`
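
The substitution can also be scripted. The following sketch builds the search
URL from the template above and percent-encodes the test name so that any
unusual characters in it do not break the query. The function name
`gold_search_url` and the test name `Pixel_Foo` are made up for illustration.

```python
from urllib.parse import quote

# URL prefix taken from the template above; "%3D" is an encoded "=".
GOLD_SEARCH_PREFIX = 'https://chrome-gpu-gold.skia.org/search?query=name%3D'


def gold_search_url(test_name):
    """Returns the Gold search URL for the given test name."""
    # safe='' ensures characters such as '/' are percent-encoded too.
    return GOLD_SEARCH_PREFIX + quote(test_name, safe='')


# 'Pixel_Foo' is a hypothetical test name.
print(gold_search_url('Pixel_Foo'))
```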

**NOTE** If you have a grace period active for your test, then Gold will be told
to ignore results for the test. This is so that it does not comment on unrelated
CLs about untriaged images if your test is noisy. Images will still be uploaded
to Gold and can be triaged, but will not show up on the main page's untriaged
image list, and you will need to enable the "Ignored" toggle at the top of the
page when looking at the triage page specific to your test.

## Stamping out Flakiness

It's critically important to aggressively investigate and eliminate the root
cause of any flakiness seen on the GPU bots. The bots have been known to run
reliably for days at a time, and any flaky failures that are tolerated on the
bots translate directly into instability of the browser experienced by
customers. Critical bugs in subsystems like WebGL, affecting high-profile
products like Google Maps, have escaped notice in the past because the bots
were unreliable. After much re-work, the GPU bots are now among the most
reliable automated test machines in the Chromium project. Let's keep them that
way.

Flakiness affecting the GPU tests can come in from highly unexpected sources.
Here are some examples:

* Intermittent pixel_test failures on Linux where the captured pixels were
  black, caused by the Display Power Management System (DPMS) kicking in.
  The X server's built-in screen saver was disabled on the GPU bots in
  response.
* GNOME dbus-related deadlocks causing intermittent timeouts ([Issue
  309093](http://crbug.com/309093) and related bugs).
* Windows Audio system changes causing intermittent assertion failures in the
  browser ([Issue 310838](http://crbug.com/310838)).
* Enabling assertion failures in the C++ standard library on Linux causing
  random assertion failures ([Issue 328249](http://crbug.com/328249)).
* V8 bugs causing random crashes of the Maps pixel test (V8 issues
  [3022](https://code.google.com/p/v8/issues/detail?id=3022),
  [3174](https://code.google.com/p/v8/issues/detail?id=3174)).
* TLS changes causing random browser process crashes ([Issue
  264406](http://crbug.com/264406)).
* Isolated test execution flakiness caused by failures to reliably clean up
  temporary directories ([Issue 340415](http://crbug.com/340415)).
* The Telemetry-based WebGL conformance suite caught a bug in the memory
  allocator on Android not caught by any other bot ([Issue
  347919](http://crbug.com/347919)).
* context_lost test failures caused by the compositor's retry logic ([Issue
  356453](http://crbug.com/356453)).
* Multiple bugs in Chromium's support for lost contexts causing flakiness of
  the context_lost tests ([Issue 365904](http://crbug.com/365904)).
* Maps test timeouts caused by Content Security Policy changes in Blink
  ([Issue 395914](http://crbug.com/395914)).
* Weak pointer assertion failures in various webgl\_conformance\_tests caused
  by changes to the media pipeline ([Issue 399417](http://crbug.com/399417)).
* A change to a default WebSocket timeout in Telemetry causing intermittent
  failures to run all WebGL conformance tests on the Mac bots ([Issue
  403981](http://crbug.com/403981)).
* Chrome leaking suspended sub-processes on Windows, apparently a preexisting
  race condition that suddenly showed up ([Issue
  424024](http://crbug.com/424024)).
* Changes to Chrome's cross-context synchronization primitives causing the
  wrong tiles to be rendered ([Issue 584381](http://crbug.com/584381)).
* A bug in V8's handling of array literals causing flaky failures of
  texture-related WebGL 2.0 tests ([Issue 606021](http://crbug.com/606021)).
* Assertion failures in sync point management related to lost contexts that
  exposed a real correctness bug ([Issue 606112](http://crbug.com/606112)).
* A bug in glibc's `sem_post`/`sem_wait` primitives breaking V8's parallel
  garbage collection ([Issue 609249](http://crbug.com/609249)).
* A change to Blink's memory purging primitive which caused intermittent
  timeouts of WebGL conformance tests on all platforms ([Issue
  840988](http://crbug.com/840988)).
* Screen DPI being inconsistent across seemingly identical Linux machines,
  causing the Maps pixel test to flakily produce incorrectly sized images
  ([Issue 1091410](https://crbug.com/1091410)).

If you notice flaky test failures on either the GPU waterfalls or the try
servers, please file bugs right away with the component Internals>GPU>Testing
and include links to the failing builds and copies of the logs, since the logs
expire after a few days. [GPU pixel wranglers] should give the highest priority
to eliminating flakiness on the tree.

[GPU pixel wranglers]: http://go/gpu-pixel-wrangler