[Autotuner] Add FROM_BEST_AVAILABLE initial population strategy#1365
Merged
86dc446 to
b4d04a7
Compare
Collaborator
Author
7df1c0a to
ff8d8c8
Compare
Moved imports to top
Made MAX_BEST_AVAILABLE_CONFIGS configurable
Made cache scan limit configurable
Updated tests accordingly
Rename test_warm_start to reflect from_best_available
…igSpec attributes instead of hardcoded list
other improvements
2c448da to
c10607d
Compare
Collaborator
Author
@jansel I don't think the failed test is related to this PR.
jansel
requested changes
Feb 10, 2026
Contributor
jansel
left a comment
@fulvius31 can you rebase and resolve the merge conflict? That might also fix the test.
Collaborator
Author
jansel
requested changes
Feb 28, 2026
Collaborator
Author
@jansel I don't think the test failures were related to this PR.
jansel
requested changes
Mar 1, 2026
jansel
approved these changes
Mar 3, 2026
Contributor
@fulvius31 can you rebase and fix merge conflicts?
653df7f to
549bfef
Compare
Collaborator
Author
@jansel done
nullplay
pushed a commit
to nullplay/helion
that referenced
this pull request
Mar 17, 2026
umechand-amd
pushed a commit
to umechand-amd/helion
that referenced
this pull request
Mar 23, 2026
Summary
Adds a FROM_BEST_AVAILABLE initial population strategy that bootstraps autotuning from previously cached best configs, likely addressing the request for "bootstrapping from a known good config" in #1274.
Target use case: Developers iterating on kernel code who want faster autotuning without sacrificing kernel performance and without falling back to fixed, pre-defined configs.
How it works
Cache matching uses hardware name + normalized specialization key (tensor dtype, device, shape, strides), filtering out code object references so configs transfer across kernel edits.
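The matching rule above can be illustrated with a minimal sketch. This is not the PR's implementation: the names `normalize_spec_key` and `cache_match_key`, and the tuple layout of the specialization key, are hypothetical and chosen only to show why dropping code object references lets a cached config survive a kernel edit.

```python
import types
from typing import Any

_DROPPED = object()  # sentinel marking entries removed from the key


def normalize_spec_key(key: Any) -> Any:
    """Recursively drop code/function objects from a (hypothetical) key."""
    if isinstance(key, (types.CodeType, types.FunctionType)):
        return _DROPPED
    if isinstance(key, (tuple, list)):
        normalized = (normalize_spec_key(item) for item in key)
        return tuple(item for item in normalized if item is not _DROPPED)
    return key


def cache_match_key(hardware: str, spec_key: Any) -> tuple:
    """Combine the hardware name with the normalized specialization key."""
    return (hardware, normalize_spec_key(spec_key))


# Two versions of the "same" kernel whose code objects differ after an edit.
def kernel_v1(x):
    return x + 1


def kernel_v2(x):
    return x + 2


# Assumed key layout: (dtype, device, shape, strides) plus a code reference.
spec = ("float16", "cuda", (1024, 1024), (1024, 1))
key_v1 = cache_match_key("NVIDIA RTX 5090", (spec, kernel_v1.__code__))
key_v2 = cache_match_key("NVIDIA RTX 5090", (spec, kernel_v2.__code__))
key_other_hw = cache_match_key("other-gpu", (spec, kernel_v1.__code__))
```

In this sketch `key_v1` and `key_v2` compare equal, so a config cached for the first version would be found for the edited kernel, while `key_other_hw` differs because the hardware name is part of the key.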
Benchmark results using 1 cached best_config and default PatternSearch
The kernels used are the ones from ~/examples.
Hardware: NVIDIA RTX 5090
torch Version: 2.10.0+cu130
helion Version: 0.2.11.dev7+ga7e94e60c
triton Version: 3.6.0+git9844da95
MatMul Benchmark
Result: FROM_BEST_AVAILABLE with full cache matches Full Random kernel times across all implementations at 13x less tuning cost. FROM_DEFAULT is 19x faster but produces 14-31% slower kernels.
Softmax Benchmark
Result: Mixed outcome—FROM_DEFAULT wins for Helion Simple (best kernel at lowest cost), but FROM_BEST_AVAILABLE (default cache) wins for Helion Two Pass.
Key takeaways
When to use
Usage
HELION_AUTOTUNE_EFFORT=quick HELION_AUTOTUNER_INITIAL_POPULATION=from_best_available python example/{matmul.py,softmax.py}
Configuration
HELION_BEST_AVAILABLE_MAX_CONFIGS
HELION_BEST_AVAILABLE_MAX_CACHE_SCAN