CI: retry flaky CUDA compute check #2034

Draft
choijon5 wants to merge 1 commit into main from choijon5/stack/5

Conversation

choijon5 (Contributor) commented Apr 17, 2026

The current retry bot retries a failed job on a different machine. In some of the failures I've seen, the machine was not down. This change retries the check multiple times on the same machine before retrying on a different machine.
We are not seeing B200 failures right now, so I will try this out when B200 failures show up again.
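
For illustration only, here is a minimal Python sketch of the "retry on the same machine first" idea. It is not the code in this PR: the function name `run_with_local_retries`, the attempt count, and the delay are all hypothetical. The sketch assumes the check is a command whose non-zero exit code means failure, and that an existing mechanism (such as the retry bot) reschedules the job on a different machine once the job itself fails.

```python
import subprocess
import time


def run_with_local_retries(cmd, max_attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Run `cmd` up to `max_attempts` times on the current machine.

    Returns the 1-based number of the attempt that succeeded. If every
    attempt fails, re-raises the last subprocess.CalledProcessError so the
    job still fails and an outer retry mechanism (assumed, not shown) can
    move it to a different machine.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            # check=True turns a non-zero exit code into CalledProcessError.
            subprocess.run(cmd, check=True)
            return attempt
        except subprocess.CalledProcessError as err:
            last_error = err
            # Pause between attempts, but not after the final one.
            if attempt < max_attempts:
                sleep(delay_seconds)

    # All local attempts failed: surface the failure to the caller.
    raise last_error
```

A CI step could wrap the flaky check command with this helper, so that a transient failure is retried in place and only a persistent failure reaches the cross-machine retry.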

choijon5 added a commit that referenced this pull request Apr 17, 2026
stack-info: PR: #2034, branch: choijon5/stack/5
meta-cla Bot added the CLA Signed label Apr 17, 2026
choijon5 added a commit that referenced this pull request Apr 17, 2026
stack-info: PR: #2034, branch: choijon5/stack/5
Labels

CLA Signed
