# GPU Testing

This set of pages documents the setup and operation of the GPU bots and try
servers, which verify the correctness of Chrome's graphically accelerated
rendering pipeline.

[TOC]

## Overview

The GPU bots run a different set of tests than the majority of the Chromium
test machines. The GPU bots specifically focus on tests which exercise the
graphics processor, and whose results are likely to vary between graphics card
vendors.

Most of the tests on the GPU bots are run via the [Telemetry framework].
Telemetry was originally conceived as a performance testing framework, but has
proven valuable for correctness testing as well. Telemetry directs the browser
to perform various operations, like page navigation and test execution, from
external scripts written in Python. The GPU bots launch the full Chromium
browser via Telemetry for the majority of the tests. Using the full browser to
execute tests, rather than smaller test harnesses, has yielded several
advantages: testing what is shipped, improved reliability, and improved
performance.

[Telemetry framework]: https://github.com/catapult-project/catapult/tree/master/telemetry

A subset of the tests, called "pixel tests", grab screen snapshots of the web
page in order to validate Chromium's rendering architecture end-to-end. Where
necessary, GPU-specific results are maintained for these tests. Some of these
tests verify just a few pixels, using handwritten code, in order to use the
same validation for all brands of GPUs.

The GPU bots use the Chrome infrastructure team's [recipe framework], and
specifically the [`chromium`][recipes/chromium] and
[`chromium_trybot`][recipes/chromium_trybot] recipes, to describe what tests to
execute. Compared to the legacy master-side buildbot scripts, recipes make it
easy to add new steps to the bots, change the bots' configuration, and run the
tests locally in the same way that they are run on the bots. Additionally, the
`chromium` and `chromium_trybot` recipes make it possible to send try jobs which
add new steps to the bots. This single capability is a huge step forward from
the previous configuration where new steps were added blindly, and could cause
failures on the tryservers. For more details about the configuration of the
bots, see the [GPU bot details].

[recipe framework]: https://chromium.googlesource.com/external/github.com/luci/recipes-py/+/main/doc/user_guide.md
[recipes/chromium]: https://chromium.googlesource.com/chromium/tools/build/+/main/scripts/slave/recipes/chromium.py
[recipes/chromium_trybot]: https://chromium.googlesource.com/chromium/tools/build/+/main/scripts/slave/recipes/chromium_trybot.py
[GPU bot details]: gpu_testing_bot_details.md

The physical hardware for the GPU bots lives in the Swarming pool\*. The
Swarming infrastructure ([new docs][new-testing-infra], [older but currently
more complete docs][isolated-testing-infra]) provides many benefits:

* Increased parallelism for the tests; all steps for a given tryjob or
  waterfall build run in parallel.
* Simpler scaling: just add more hardware in order to get more capacity. No
  manual configuration or distribution of hardware needed.
* Easier to run certain tests only on certain operating systems or types of
  GPUs.
* Easier to add new operating systems or types of GPUs.
* Clearer description of the binary and data dependencies of the tests. If
  they run successfully locally, they'll run successfully on the bots.

(\* All but a few one-off GPU bots are in the swarming pool. The exceptions to
the rule are described in the [GPU bot details].)

The bots on the [chromium.gpu.fyi] waterfall are configured to always test
top-of-tree ANGLE. This setup is done with a few lines of code in the
[tools/build workspace]; search the code for "angle".

These aspects of the bots are described in more detail below, and in linked
pages. There is a [presentation][bots-presentation] which gives a brief
overview of this documentation and links back to various portions.

<!-- XXX: broken link -->
[new-testing-infra]: https://github.com/luci/luci-py/wiki
[isolated-testing-infra]: https://www.chromium.org/developers/testing/isolated-testing/infrastructure
[chromium.gpu]: https://ci.chromium.org/p/chromium/g/chromium.gpu/console
[chromium.gpu.fyi]: https://ci.chromium.org/p/chromium/g/chromium.gpu.fyi/console
[tools/build workspace]: https://source.chromium.org/chromium/chromium/tools/build/+/HEAD:recipes/recipe_modules/chromium_tests/builders/chromium_gpu_fyi.py
[bots-presentation]: https://docs.google.com/presentation/d/1BC6T7pndSqPFnituR7ceG7fMY7WaGqYHhx5i9ECa8EI/edit?usp=sharing

## Fleet Status

Please see the [GPU Pixel Wrangling instructions] for links to dashboards
showing the status of various bots in the GPU fleet.

[GPU Pixel Wrangling instructions]: http://go/gpu-pixel-wrangler#fleet-status

## Using the GPU Bots

Most Chromium developers interact with the GPU bots in two ways:

1. Observing the bots on the waterfalls.
2. Sending try jobs to them.

The GPU bots are grouped on the [chromium.gpu] and [chromium.gpu.fyi]
waterfalls. Their current status can be easily observed there.

To send try jobs, you must first upload your CL to the code review server. Then
either click the "CQ dry run" link, or run the following from the command line:

```sh
git cl try
```

Either action sends your job to the default set of try servers.

The GPU tests are part of the default set for Chromium CLs, and are run as part
of the following tryservers' jobs:

* [linux-rel], formerly on the `tryserver.chromium.linux` waterfall
* [mac-rel], formerly on the `tryserver.chromium.mac` waterfall
* [win10_chromium_x64_rel_ng], formerly on the `tryserver.chromium.win` waterfall

[linux-rel]: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux-rel?limit=100
[mac-rel]: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/mac-rel?limit=100
[win10_chromium_x64_rel_ng]: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win10_chromium_x64_rel_ng?limit=100

Scan down through the steps looking for the text "GPU"; that identifies those
tests run on the GPU bots. For each test the "trigger" step can be ignored; the
step further down for the test of the same name contains the results.

It's usually not necessary to explicitly send try jobs just for verifying GPU
tests. If you want to, you must invoke "git cl try" separately for each
tryserver you want to use, for example:

```sh
git cl try -b linux-rel
git cl try -b mac-rel
git cl try -b win10_chromium_x64_rel_ng
```

Alternatively, the Gerrit UI can be used to send a patch set to these try
servers.

Three optional tryservers are also available which run additional tests. As of
this writing, they run longer-running tests that can't be run against all
Chromium CLs due to lack of hardware capacity. They are automatically included
in the set of tryjobs for code changes to certain sub-directories.

* [linux_optional_gpu_tests_rel] on the [luci.chromium.try] waterfall
* [mac_optional_gpu_tests_rel] on the [luci.chromium.try] waterfall
* [win_optional_gpu_tests_rel] on the [luci.chromium.try] waterfall

[linux_optional_gpu_tests_rel]: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_optional_gpu_tests_rel
[mac_optional_gpu_tests_rel]: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/mac_optional_gpu_tests_rel
[win_optional_gpu_tests_rel]: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win_optional_gpu_tests_rel
[luci.chromium.try]: https://ci.chromium.org/p/chromium/g/luci.chromium.try/builders

Tryservers for the [ANGLE project] are also present on the
[tryserver.chromium.angle] waterfall. These are invoked from the Gerrit user
interface. They are configured similarly to the tryservers for regular Chromium
patches, and run the same tests that are run on the [chromium.gpu.fyi]
waterfall, in the same way (e.g., against ToT ANGLE).

If you find it necessary to try patches against other sub-repositories than
Chromium (`src/`) and ANGLE (`src/third_party/angle/`), please
[file a bug](http://crbug.com/new) with component Internals\>GPU\>Testing.

[ANGLE project]: https://chromium.googlesource.com/angle/angle/+/main/README.md
[tryserver.chromium.angle]: https://build.chromium.org/p/tryserver.chromium.angle/waterfall
[file a bug]: http://crbug.com/new

## Running the GPU Tests Locally

All of the GPU tests running on the bots can be run locally from a Chromium
build. Many of the tests are simple executables:

* `angle_unittests`
* `gl_tests`
* `gl_unittests`
* `tab_capture_end2end_tests`
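
These GTest-based binaries are produced directly in the build output directory
and can be invoked from there. A minimal sketch (the `--use-gpu-in-tests` flag
shown here is an assumption; check the arguments in the bots' step output for
the exact flags used on the waterfall):

```sh
# Run one of the GTest-based suites from a local out/Release build.
out/Release/gl_tests --use-gpu-in-tests
```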

Some run only on the chromium.gpu.fyi waterfall, either because there isn't
enough machine capacity at the moment, or because they're closed-source tests
which aren't allowed to run on the regular Chromium waterfalls:

* `angle_deqp_gles2_tests`
* `angle_deqp_gles3_tests`
* `angle_end2end_tests`
* `audio_unittests`

The remaining GPU tests are run via Telemetry. In order to run them, just
build the `telemetry_gpu_integration_test` target (or
`telemetry_gpu_integration_test_android_chrome` for Android) and then
invoke `src/content/test/gpu/run_gpu_integration_test.py` with the appropriate
argument. The tests this script can invoke are
in `src/content/test/gpu/gpu_tests/`. For example:

* `run_gpu_integration_test.py context_lost --browser=release`
* `run_gpu_integration_test.py webgl_conformance --browser=release --webgl-conformance-version=1.0.2`
* `run_gpu_integration_test.py maps --browser=release`
* `run_gpu_integration_test.py screenshot_sync --browser=release`
* `run_gpu_integration_test.py trace_test --browser=release`

The pixel tests are a bit special. See
[the section on running them locally](#Running-the-pixel-tests-locally) for
details.

The `--browser=release` argument can be changed to `--browser=debug` if you
built in a directory such as `out/Debug`. If you built in some non-standard
directory such as `out/my_special_gn_config`, you can instead specify
`--browser=exact --browser-executable=out/my_special_gn_config/chrome`.
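
For example, a sketch of such an invocation from the `src/` directory, reusing
the hypothetical `out/my_special_gn_config` directory from above:

```sh
# Point the harness at a browser binary in a non-standard output directory.
vpython content/test/gpu/run_gpu_integration_test.py trace_test \
    --browser=exact \
    --browser-executable=out/my_special_gn_config/chrome
```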

If you're testing on Android, use `--browser=android-chromium` instead of
`--browser=release` or `--browser=debug`. Additionally, Telemetry will likely
complain about being unable to find the browser binary on Android if you build
in a non-standard output directory. Thus, `out/Release` or `out/Debug` are
suggested when testing on Android.
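
A minimal sketch of an Android invocation, assuming an `out/Release` build of
the `telemetry_gpu_integration_test_android_chrome` target and a connected,
authorized device:

```sh
# Run the WebGL conformance suite against Chromium on the attached device.
vpython content/test/gpu/run_gpu_integration_test.py webgl_conformance \
    --browser=android-chromium
```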

**Note:** The tests require some third-party Python packages. Obtaining these
packages is handled automatically by `vpython`, and the script's shebang points
at `vpython`, so it is used automatically if you run the script directly. If
you're used to invoking `python` to run a script, simply use `vpython` instead,
e.g. `vpython run_gpu_integration_test.py ...`.

You can run a subset of tests with this harness:

* `run_gpu_integration_test.py webgl_conformance --browser=release
  --test-filter=conformance_attribs`

The exact command used to invoke the test on the bots can be found in one of
two ways:

1. Looking at the [json.input][trigger_input] of the trigger step under
   `requests[task_slices][command]`. The arguments after the last `--` are
   used to actually run the test.
1. Looking at the top of a [swarming task][sample_swarming_task].

In both cases, the following can be omitted when running locally since they're
only necessary on swarming:

* `testing/test_env.py`
* `testing/scripts/run_gpu_integration_test_as_googletest.py`
* `--isolated-script-test-output`
* `--isolated-script-test-perf-output`
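
As a rough illustration only (the wrapper arguments vary between builders, and
the relative paths here are assumptions), a bot invocation reduces to a local
one roughly as follows:

```sh
# On the bots (approximately):
#   vpython3 ../../testing/test_env.py \
#       ../../testing/scripts/run_gpu_integration_test_as_googletest.py \
#       ../../content/test/gpu/run_gpu_integration_test.py webgl_conformance \
#       --browser=release --isolated-script-test-output=output.json
# Locally, from the src/ checkout root, the equivalent is roughly:
vpython content/test/gpu/run_gpu_integration_test.py webgl_conformance \
    --browser=release
```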


[trigger_input]: https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/8849851608240828544/+/u/test_pre_run__14_/l_trigger__webgl2_conformance_d3d11_passthrough_tests_on_NVIDIA_GPU_on_Windows_on_Windows-10-18363/json.input
[sample_swarming_task]: https://chromium-swarm.appspot.com/task?id=52f06058bfb31b10

The Maps test requires you to authenticate to cloud storage in order to access
the Web Page Replay archive containing the test. See [Cloud Storage Credentials]
for documentation on setting this up.

[Cloud Storage Credentials]: gpu_testing_bot_details.md#Cloud-storage-credentials

### Telemetry Test Suites

The Telemetry-based tests are all technically the same target,
`telemetry_gpu_integration_test`, just run with different runtime arguments. The
first positional argument determines which suite will run, and additional
runtime arguments may cause the step name to change on the bots. Here is a list
of all suites and resulting step names as of April 15th, 2021:

* `context_lost`
    * `context_lost_passthrough_tests`
    * `context_lost_tests`
    * `context_lost_validating_tests`
    * `gl_renderer_context_lost_tests`
* `depth_capture`
    * `depth_capture_tests`
    * `gl_renderer_depth_capture_tests`
* `hardware_accelerated_feature`
    * `gl_renderer_hardware_accelerated_feature_tests`
    * `hardware_accelerated_feature_tests`
* `gpu_process`
    * `gl_renderer_gpu_process_launch_tests`
    * `gpu_process_launch_tests`
* `info_collection`
    * `info_collection_tests`
* `maps`
    * `gl_renderer_maps_pixel_tests`
    * `maps_pixel_passthrough_test`
    * `maps_pixel_test`
    * `maps_pixel_validating_test`
    * `maps_tests`
* `pixel`
    * `android_webview_pixel_skia_gold_test`
    * `dawn_pixel_skia_gold_test`
    * `egl_pixel_skia_gold_test`
    * `gl_renderer_pixel_skia_gold_tests`
    * `pixel_skia_gold_passthrough_test`
    * `pixel_skia_gold_validating_test`
    * `pixel_tests`
    * `skia_renderer_pixel_skia_gold_test`
    * `vulkan_pixel_skia_gold_test`
* `power`
    * `power_measurement_test`
* `screenshot_sync`
    * `gl_renderer_screenshot_sync_tests`
    * `screenshot_sync_passthrough_tests`
    * `screenshot_sync_tests`
    * `screenshot_sync_validating_tests`
* `trace_test`
    * `trace_test`
* `webgl_conformance`
    * `webgl2_conformance_d3d11_passthrough_tests`
    * `webgl2_conformance_gl_passthrough_tests`
    * `webgl2_conformance_gles_passthrough_tests`
    * `webgl2_conformance_tests`
    * `webgl2_conformance_validating_tests`
    * `webgl_conformance_d3d11_passthrough_tests`
    * `webgl_conformance_d3d9_passthrough_tests`
    * `webgl_conformance_fast_call_tests`
    * `webgl_conformance_gl_passthrough_tests`
    * `webgl_conformance_gles_passthrough_tests`
    * `webgl_conformance_metal_passthrough_tests`
    * `webgl_conformance_swangle_passthrough_tests`
    * `webgl_conformance_swiftshader_validating_tests`
    * `webgl_conformance_tests`
    * `webgl_conformance_validating_tests`
    * `webgl_conformance_vulkan_passthrough_tests`

### Running the pixel tests locally

The pixel tests are a special case because they use an external Skia service
called Gold to handle image approval and storage. See
[GPU Pixel Testing With Gold] for specifics.

[GPU Pixel Testing With Gold]: gpu_pixel_testing_with_gold.md

TL;DR is that the pixel tests use a binary called `goldctl` to download and
upload data when running pixel tests.

Normally, `goldctl` uploads images and image metadata to the Gold server when
used. This is not desirable when running locally for a couple reasons:

1. Uploading requires the user to be whitelisted on the server, and whitelisting
everyone who wants to run the tests locally is not a viable solution.
2. Images produced during local runs are usually slightly different from those
that are produced on the bots due to hardware/software differences. Thus, most
images uploaded to Gold from local runs would likely only ever actually be used
by tests run on the machine that initially generated those images, which just
adds noise to the list of approved images.

Additionally, the tests normally rely on the Gold server for viewing images
produced by a test run. This does not work if the data is not actually uploaded.

The pixel tests contain logic to automatically determine whether they are
running on a workstation or not, as well as to determine what git revision is
being tested. This *should* mean that the pixel tests will automatically work
when run locally. However, if the local run detection code fails for some
reason, you can manually pass the `--local-pixel-tests` flag to force the same
behavior. This will disable uploading, but otherwise go through the same steps
as a test normally would. Each test will also print out `file://` URLs to the
produced image, the closest image for the test known to Gold, and the diff
between the two.

Because the image produced by the test locally is likely slightly different from
any of the approved images in Gold, local test runs are likely to fail during
the comparison step. In order to cut down on the amount of noise, you can also
pass the `--no-skia-gold-failure` flag to not fail the test on a failed image
comparison. When using `--no-skia-gold-failure`, you'll also need to pass the
`--passthrough` flag in order to actually see the link output.

Example usage:
`run_gpu_integration_test.py pixel --no-skia-gold-failure --local-pixel-tests
--passthrough`

If, for some reason, the local run code is unable to determine what the git
revision is, simply pass `--git-revision aabbccdd`. Note that `aabbccdd` must
be replaced with an actual Chromium src revision (typically whatever revision
origin/main is currently synced to) in order for the tests to work. This can
be done automatically using:
``run_gpu_integration_test.py pixel --no-skia-gold-failure --local-pixel-tests
--passthrough --git-revision `git rev-parse origin/main` ``

## Running Binaries from the Bots Locally

Any binary run remotely on a bot can also be run locally, assuming the local
machine loosely matches the architecture and OS of the bot.

The easiest way to do this is to find the ID of the swarming task and use
`swarming reproduce` to re-run it:

* `./src/tools/luci-go/swarming reproduce -S https://chromium-swarm.appspot.com [task ID]`

The task ID can be found in the stdio for the "trigger" step for the test. For
example, look at a recent build from the [Mac Release (Intel)] bot, and
look at the `gl_unittests` step. You will see something like:

[Mac Release (Intel)]: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20Release%20%28Intel%29/

```
Triggered task: gl_unittests on Intel GPU on Mac/Mac-10.12.6/[TRUNCATED_ISOLATE_HASH]/Mac Release (Intel)/83664
To collect results, use:
  swarming.py collect -S https://chromium-swarm.appspot.com --json /var/folders/[PATH_TO_TEMP_FILE].json
Or visit:
  https://chromium-swarm.appspot.com/user/task/[TASK_ID]
```

There is a difference between the isolate's hash and Swarming's task ID. Make
sure you use the task ID and not the isolate's hash.

As of this writing, there seems to be a
[bug](https://github.com/luci/luci-py/issues/250)
when attempting to re-run the Telemetry based GPU tests in this way. For the
time being, this can be worked around by instead downloading the contents of
the isolate. To do so, look into the "Reproducing the task locally" section on
a swarming task, which contains something like:

```
Download inputs files into directory foo:
# (if needed, use "\${platform}" as-is) cipd install "infra/tools/luci/cas/\${platform}" -root bar
# (if needed) ./bar/cas login
./bar/cas download -cas-instance projects/chromium-swarm/instances/default_instance -digest 68ae1d6b22673b0ab7b4427ca1fc2a4761c9ee53474105b9076a23a67e97a18a/647 -dir foo
```

Before attempting to download an isolate, you must ensure you have permission
to access the isolate server. Full instructions can be [found
here][isolate-server-credentials]. For most cases, you can simply run:

* `./src/tools/luci-go/isolate login`

The above link requires that you log in with your @google.com credentials. It's
not known at the present time whether this works with @chromium.org accounts.
Email kbr@ if you try this and find it doesn't work.

[isolate-server-credentials]: gpu_testing_bot_details.md#Isolate-server-credentials

## Running Locally Built Binaries on the GPU Bots

See the [Swarming documentation] for instructions on how to upload your binaries to the isolate server and trigger execution on Swarming.

Be sure to use the correct swarming dimensions for your desired GPU, e.g.
"1002:6613" rather than "AMD Radeon R7 240 (1002:6613)", which is how it appears
on the swarming task page. You can query the bots in the chromium.tests.gpu pool
to find the correct dimensions:

* `tools\luci-go\swarming bots -S chromium-swarm.appspot.com -d pool=chromium.tests.gpu`

[Swarming documentation]: https://www.chromium.org/developers/testing/isolated-testing/for-swes#TOC-Run-a-test-built-locally-on-Swarming

## Moving Test Binaries from Machine to Machine

To create a zip archive of your personal Chromium build plus all of
the Telemetry-based GPU tests' dependencies, which you can then move
to another machine for testing:

1. Build Chrome (into `out/Release` in this example).
1. `vpython tools/mb/mb.py zip out/Release/ telemetry_gpu_integration_test out/telemetry_gpu_integration_test.zip`

Then copy telemetry_gpu_integration_test.zip to another machine. Unzip
it, and cd into the resulting directory. Invoke
`content/test/gpu/run_gpu_integration_test.py` as above.
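
A minimal sketch of the steps on the destination machine (the unzip destination
directory is an arbitrary choice):

```sh
# Unpack the archive and run one of the Telemetry-based suites from inside it.
unzip telemetry_gpu_integration_test.zip -d telemetry_gpu_integration_test
cd telemetry_gpu_integration_test
vpython content/test/gpu/run_gpu_integration_test.py trace_test --browser=release
```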

This workflow has been tested successfully on Windows with a
statically-linked Release build of Chrome.

Note: on one macOS machine, this command failed because of a broken
`strip-json-comments` symlink in
`src/third_party/catapult/common/node_runner/node_runner/node_modules/.bin`. Deleting
that symlink allowed it to proceed.

Note also: on the same macOS machine, with a component build, this
command failed to zip up a working Chromium binary. The browser failed
to start with the following error:

`[0626/180440.571670:FATAL:chrome_main_delegate.cc(1057)] Check failed: service_manifest_data_pack_.`

In a pinch, this command could be used to bundle up everything, but
the "out" directory could be deleted from the resulting zip archive,
and the Chromium binaries moved over to the target machine. Then the
command line arguments `--browser=exact --browser-executable=[path]`
can be used to launch that specific browser.

See the [user guide for mb](../../tools/mb/docs/user_guide.md#mb-zip), the
meta-build system, for more details.

## Adding New Tests to the GPU Bots

The goal of the GPU bots is to avoid regressions in Chrome's rendering stack.
To that end, let's add as many tests as possible that will help catch
regressions in the product. If you see a crazy bug in Chrome's rendering which
would be easy to catch with a pixel test running in Chrome and hard to catch in
any of the other test harnesses, please, invest the time to add a test!

There are a couple of different ways to add new tests to the bots:

1. Adding a new test to one of the existing harnesses.
2. Adding an entire new test step to the bots.

### Adding a new test to one of the existing test harnesses

Adding new tests to the GTest-based harnesses is straightforward and
essentially requires no explanation.

As of this writing it isn't as easy as desired to add a new test to one of the
Telemetry based harnesses. See [Issue 352807](http://crbug.com/352807). Let's
collectively work to address that issue. It would be great to reduce the number
of steps on the GPU bots, or at least to avoid significantly increasing the
number of steps on the bots. The WebGL conformance tests should probably remain
a separate step, but some of the smaller Telemetry based tests
(`context_lost_tests`, `memory_test`, etc.) should probably be combined into a
single step.

If you are adding a new test to one of the existing test suites (e.g.,
`pixel_test`), all you need to do is make sure that your new test runs correctly
via isolates. See the documentation in the GPU bot details on [adding new
isolated tests][new-isolates] for the gn args and authentication needed to
upload isolates to the isolate server. Most likely the new test will be
Telemetry based, and included in the `telemetry_gpu_test_run` isolate.

[new-isolates]: gpu_testing_bot_details.md#Adding-a-new-isolated-test-to-the-bots
Jamie Madill5b0716b2019-10-24 16:43:47506### Adding new steps to the GPU Bots
Kai Ninomiyaa6429fb32018-03-30 01:30:56507
508The tests that are run by the GPU bots are described by a couple of JSON files
509in the Chromium workspace:
510
John Palmer046f9872021-05-24 01:24:56511* [`chromium.gpu.json`](https://chromium.googlesource.com/chromium/src/+/main/testing/buildbot/chromium.gpu.json)
512* [`chromium.gpu.fyi.json`](https://chromium.googlesource.com/chromium/src/+/main/testing/buildbot/chromium.gpu.fyi.json)
Kai Ninomiyaa6429fb32018-03-30 01:30:56513
514These files are autogenerated by the following script:
515
John Palmer046f9872021-05-24 01:24:56516* [`generate_buildbot_json.py`](https://chromium.googlesource.com/chromium/src/+/main/testing/buildbot/generate_buildbot_json.py)
Kai Ninomiyaa6429fb32018-03-30 01:30:56517
Kenneth Russell8a386d42018-06-02 09:48:01518This script is documented in
John Palmer046f9872021-05-24 01:24:56519[`testing/buildbot/README.md`](https://chromium.googlesource.com/chromium/src/+/main/testing/buildbot/README.md). The
Kenneth Russell8a386d42018-06-02 09:48:01520JSON files are parsed by the chromium and chromium_trybot recipes, and describe
521two basic types of tests:
Kai Ninomiyaa6429fb32018-03-30 01:30:56522
523* GTests: those which use the Googletest and Chromium's `base/test/launcher/`
524 frameworks.
Kenneth Russell8a386d42018-06-02 09:48:01525* Isolated scripts: tests whose initial entry point is a Python script which
526 follows a simple convention of command line argument parsing.
527
528The majority of the GPU tests are however:
529
530* Telemetry based tests: an isolated script test which is built on the
531 Telemetry framework and which launches the entire browser.
Kai Ninomiyaa6429fb32018-03-30 01:30:56532
533A prerequisite of adding a new test to the bots is that that test [run via
Kenneth Russell8a386d42018-06-02 09:48:01534isolates][new-isolates]. Once that is done, modify `test_suites.pyl` to add the
535test to the appropriate set of bots. Be careful when adding large new test steps
536to all of the bots, because the GPU bots are a limited resource and do not
537currently have the capacity to absorb large new test suites. It is safer to get
538new tests running on the chromium.gpu.fyi waterfall first, and expand from there
539to the chromium.gpu waterfall (which will also make them run against every
Stephen Martinis089f5f02019-02-12 02:42:24540Chromium CL by virtue of the `linux-rel`, `mac-rel`, `win7-rel` and
541`android-marshmallow-arm64-rel` tryservers' mirroring of the bots on this
542waterfall – so be careful!).
Kai Ninomiyaa6429fb32018-03-30 01:30:56543
544Tryjobs which add new test steps to the chromium.gpu.json file will run those
545new steps during the tryjob, which helps ensure that the new test won't break
546once it starts running on the waterfall.
547
548Tryjobs which modify chromium.gpu.fyi.json can be sent to the
549`win_optional_gpu_tests_rel`, `mac_optional_gpu_tests_rel` and
550`linux_optional_gpu_tests_rel` tryservers to help ensure that they won't
551break the FYI bots.
552
Kenneth Russellfa3ffde2018-10-24 21:24:38553## Debugging Pixel Test Failures on the GPU Bots
554
Brian Sheedyc4650ad02019-07-29 17:31:38555If pixel tests fail on the bots, the build step will contain either one or more
556links titled `gold_triage_link for <test name>` or a single link titled
557`Too many artifacts produced to link individually, click for links`, which
558itself will contain links. In either case, these links will direct to Gold
559pages showing the image produced by the image and the approved image that most
560closely matches it.
Kenneth Russellfa3ffde2018-10-24 21:24:38561
Quinten Yearsley317532d2021-10-20 17:10:31562Note that for the tests which programmatically check colors in certain regions of
Brian Sheedyc4650ad02019-07-29 17:31:38563the image (tests with `expected_colors` fields in [pixel_test_pages]), there
564likely won't be a closest approved image since those tests only upload data to
565Gold in the event of a failure.
Kenneth Russellfa3ffde2018-10-24 21:24:38566
Brian Sheedyc4650ad02019-07-29 17:31:38567[pixel_test_pages]: https://cs.chromium.org/chromium/src/content/test/gpu/gpu_tests/pixel_test_pages.py
Kenneth Russellfa3ffde2018-10-24 21:24:38568
Kai Ninomiyaa6429fb32018-03-30 01:30:56569## Updating and Adding New Pixel Tests to the GPU Bots
570
Brian Sheedyc4650ad02019-07-29 17:31:38571If your CL adds a new pixel test or modifies existing ones, it's likely that
572you will have to approve new images. Simply run your CL through the CQ and
573follow the steps outline [here][pixel wrangling triage] under the "Check if any
574pixel test failures are actual failures or need to be rebaselined." step.
Kai Ninomiyaa6429fb32018-03-30 01:30:56575
Brian Sheedy5a4c0a392021-09-22 21:28:35576[pixel wrangling triage]: http://go/gpu-pixel-wrangler-info#how-to-keep-the-bots-green
Kai Ninomiyaa6429fb32018-03-30 01:30:56577
Brian Sheedy5a88cc72019-09-27 23:04:35578If you are adding a new pixel test, it is beneficial to set the
579`grace_period_end` argument in the test's definition. This will allow the test
580to run for a period without actually failing on the waterfall bots, giving you
581some time to triage any additional images that show up on them. This helps
582prevent new tests from making the bots red because they're producing slightly
583different but valid images from the ones triaged while the CL was in review.
584Example:
585
586```
587from datetime import date
588
589...
590
591PixelTestPage(
592 'foo_pixel_test.html',
593 ...
594 grace_period_end=date(2020, 1, 1)
595)
596```
597
598You should typically set the grace period to end 1-2 days after the the CL will
599land.
600
Brian Sheedyc4650ad02019-07-29 17:31:38601Once your CL passes the CQ, you should be mostly good to go, although you should
602keep an eye on the waterfall bots for a short period after your CL lands in case
603any configurations not covered by the CQ need to have images approved, as well.
Brian Sheedy5a88cc72019-09-27 23:04:35604All untriaged images for your test can be found by substituting your test name
605into:
606
607`https://chrome-gpu-gold.skia.org/search?query=name%3D<test name>`
Kai Ninomiyaa6429fb32018-03-30 01:30:56608
Brian Sheedye4a03fc2020-05-13 23:12:00609**NOTE** If you have a grace period active for your test, then Gold will be told
610to ignore results for the test. This is so that it does not comment on unrelated
611CLs about untriaged images if your test is noisy. Images will still be uploaded
612to Gold and can be triaged, but will not show up on the main page's untriaged
613image list, and you will need to enable the "Ignored" toggle at the top of the
614page when looking at the triage page specific to your test.
615
Kai Ninomiyaa6429fb32018-03-30 01:30:56616## Stamping out Flakiness
617
618It's critically important to aggressively investigate and eliminate the root
619cause of any flakiness seen on the GPU bots. The bots have been known to run
620reliably for days at a time, and any flaky failures that are tolerated on the
621bots translate directly into instability of the browser experienced by
622customers. Critical bugs in subsystems like WebGL, affecting high-profile
623products like Google Maps, have escaped notice in the past because the bots
624were unreliable. After much re-work, the GPU bots are now among the most
625reliable automated test machines in the Chromium project. Let's keep them that
626way.
627
628Flakiness affecting the GPU tests can come in from highly unexpected sources.
629Here are some examples:
630
631* Intermittent pixel_test failures on Linux where the captured pixels were
632 black, caused by the Display Power Management System (DPMS) kicking in.
633 Disabled the X server's built-in screen saver on the GPU bots in response.
634* GNOME dbus-related deadlocks causing intermittent timeouts ([Issue
635 309093](http://crbug.com/309093) and related bugs).
636* Windows Audio system changes causing intermittent assertion failures in the
637 browser ([Issue 310838](http://crbug.com/310838)).
638* Enabling assertion failures in the C++ standard library on Linux causing
639 random assertion failures ([Issue 328249](http://crbug.com/328249)).
640* V8 bugs causing random crashes of the Maps pixel test (V8 issues
641 [3022](https://code.google.com/p/v8/issues/detail?id=3022),
642 [3174](https://code.google.com/p/v8/issues/detail?id=3174)).
643* TLS changes causing random browser process crashes ([Issue
644 264406](http://crbug.com/264406)).
645* Isolated test execution flakiness caused by failures to reliably clean up
646 temporary directories ([Issue 340415](http://crbug.com/340415)).
647* The Telemetry-based WebGL conformance suite caught a bug in the memory
648 allocator on Android not caught by any other bot ([Issue
649 347919](http://crbug.com/347919)).
650* context_lost test failures caused by the compositor's retry logic ([Issue
651 356453](http://crbug.com/356453)).
652* Multiple bugs in Chromium's support for lost contexts causing flakiness of
653 the context_lost tests ([Issue 365904](http://crbug.com/365904)).
654* Maps test timeouts caused by Content Security Policy changes in Blink
655 ([Issue 395914](http://crbug.com/395914)).
656* Weak pointer assertion failures in various webgl\_conformance\_tests caused
657 by changes to the media pipeline ([Issue 399417](http://crbug.com/399417)).
658* A change to a default WebSocket timeout in Telemetry causing intermittent
659 failures to run all WebGL conformance tests on the Mac bots ([Issue
660 403981](http://crbug.com/403981)).
661* Chrome leaking suspended sub-processes on Windows, apparently a preexisting
662 race condition that suddenly showed up ([Issue
663 424024](http://crbug.com/424024)).
664* Changes to Chrome's cross-context synchronization primitives causing the
665 wrong tiles to be rendered ([Issue 584381](http://crbug.com/584381)).
666* A bug in V8's handling of array literals causing flaky failures of
667 texture-related WebGL 2.0 tests ([Issue 606021](http://crbug.com/606021)).
668* Assertion failures in sync point management related to lost contexts that
669 exposed a real correctness bug ([Issue 606112](http://crbug.com/606112)).
670* A bug in glibc's `sem_post`/`sem_wait` primitives breaking V8's parallel
671 garbage collection ([Issue 609249](http://crbug.com/609249)).
Kenneth Russelld5efb3f2018-05-11 01:40:45672* A change to Blink's memory purging primitive which caused intermittent
673 timeouts of WebGL conformance tests on all platforms ([Issue
674 840988](http://crbug.com/840988)).
Brian Sheedy382a59b42020-06-09 00:22:32675* Screen DPI being inconsistent across seemingly identical Linux machines,
676 causing the Maps pixel test to flakily produce incorrectly sized images
677 ([Issue 1091410](https://crbug.com/1091410)).
Kai Ninomiyaa6429fb32018-03-30 01:30:56678
679If you notice flaky test failures either on the GPU waterfalls or try servers,
680please file bugs right away with the component Internals>GPU>Testing and
681include links to the failing builds and copies of the logs, since the logs
682expire after a few days. [GPU pixel wranglers] should give the highest priority
683to eliminating flakiness on the tree.
684
Brian Sheedy5a4c0a392021-09-22 21:28:35685[GPU pixel wranglers]: http://go/gpu-pixel-wrangler