[Pallas] Use fori_loop if loop bounds includes non-constexpr symbolic values#1927

Merged
AmesingFlank merged 1 commit into main from AmesingFlank/stack/3
Apr 3, 2026

Conversation

@AmesingFlank AmesingFlank commented Apr 2, 2026

AmesingFlank added a commit that referenced this pull request Apr 2, 2026
… values

stack-info: PR: #1927, branch: AmesingFlank/stack/3
@AmesingFlank AmesingFlank force-pushed the AmesingFlank/stack/3 branch from 927f4d5 to 31c7abf Compare April 2, 2026 17:05
@meta-cla meta-cla Bot added the CLA Signed label Apr 2, 2026
@AmesingFlank AmesingFlank changed the title [Pallas] Use fori_loop if loop bounds includes non-constexpr symbolic values [Pallas] Use fori_loop/emit_pipeline if loop bounds includes non-constexpr symbolic values Apr 2, 2026
Comment thread helion/language/loops.py Outdated
@AmesingFlank AmesingFlank marked this pull request as draft April 2, 2026 20:36
@AmesingFlank AmesingFlank changed the title [Pallas] Use fori_loop/emit_pipeline if loop bounds includes non-constexpr symbolic values [Pallas] Use fori_loop if loop bounds includes non-constexpr symbolic values Apr 2, 2026
@AmesingFlank AmesingFlank marked this pull request as ready for review April 2, 2026 20:36
@AmesingFlank AmesingFlank marked this pull request as draft April 2, 2026 20:38
@AmesingFlank AmesingFlank marked this pull request as ready for review April 2, 2026 20:39
@AmesingFlank AmesingFlank marked this pull request as draft April 2, 2026 20:41
@AmesingFlank AmesingFlank marked this pull request as ready for review April 2, 2026 20:41
@AmesingFlank AmesingFlank marked this pull request as draft April 2, 2026 20:53
@AmesingFlank AmesingFlank marked this pull request as ready for review April 2, 2026 20:53
AmesingFlank added a commit that referenced this pull request Apr 2, 2026
… values

stack-info: PR: #1927, branch: AmesingFlank/stack/3
@AmesingFlank AmesingFlank marked this pull request as draft April 2, 2026 21:00
@AmesingFlank AmesingFlank force-pushed the AmesingFlank/stack/3 branch from 31c7abf to 2314a4c Compare April 2, 2026 21:00
@AmesingFlank AmesingFlank marked this pull request as ready for review April 2, 2026 21:00
@AmesingFlank AmesingFlank requested a review from v0i0 April 2, 2026 21:01
@AmesingFlank AmesingFlank marked this pull request as draft April 2, 2026 21:17
@AmesingFlank AmesingFlank marked this pull request as ready for review April 2, 2026 21:18
@AmesingFlank AmesingFlank marked this pull request as draft April 2, 2026 21:20
@AmesingFlank AmesingFlank marked this pull request as ready for review April 2, 2026 21:21
@AmesingFlank AmesingFlank marked this pull request as draft April 2, 2026 21:31
@AmesingFlank AmesingFlank marked this pull request as ready for review April 2, 2026 21:31
@AmesingFlank AmesingFlank marked this pull request as draft April 2, 2026 21:41
@AmesingFlank AmesingFlank marked this pull request as ready for review April 2, 2026 21:41
@AmesingFlank AmesingFlank marked this pull request as draft April 3, 2026 00:57
@AmesingFlank AmesingFlank marked this pull request as ready for review April 3, 2026 00:58
@AmesingFlank AmesingFlank marked this pull request as draft April 3, 2026 01:02
@AmesingFlank AmesingFlank marked this pull request as ready for review April 3, 2026 01:02
@AmesingFlank AmesingFlank marked this pull request as draft April 3, 2026 01:23
@AmesingFlank AmesingFlank marked this pull request as ready for review April 3, 2026 01:23
… values

stack-info: PR: #1927, branch: AmesingFlank/stack/3
@AmesingFlank AmesingFlank marked this pull request as draft April 3, 2026 03:40
@AmesingFlank AmesingFlank force-pushed the AmesingFlank/stack/3 branch from 2314a4c to ad0f62f Compare April 3, 2026 03:40
@AmesingFlank AmesingFlank marked this pull request as ready for review April 3, 2026 03:40
@AmesingFlank AmesingFlank marked this pull request as draft April 3, 2026 03:46
@AmesingFlank AmesingFlank marked this pull request as ready for review April 3, 2026 03:47
@AmesingFlank AmesingFlank marked this pull request as draft April 3, 2026 05:01
@AmesingFlank AmesingFlank marked this pull request as ready for review April 3, 2026 05:02
@AmesingFlank AmesingFlank marked this pull request as draft April 3, 2026 05:06
@AmesingFlank AmesingFlank marked this pull request as ready for review April 3, 2026 05:07
@AmesingFlank AmesingFlank merged commit a6ba80b into main Apr 3, 2026
21 checks passed
norx1991 added a commit that referenced this pull request Apr 3, 2026
For single-tile Pallas kernels with no inner loops, pallas_loop_type
(emit_pipeline/fori_loop) is meaningless and wastes autotuner trials.
Move the fragment registration from PallasBackend.tunable_fragments()
to ConfigSpec._flat_fields(), gated on a has_pallas_inner_loops flag
that is set during tracing when an inner device loop is encountered.

Also simplify the symbolic-bounds logic from #1927: instead of mutating
backend_tunable_fragments, set a has_pallas_symbolic_bounds flag and
filter out "default" when building the fragment in _flat_fields().
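A minimal sketch of the gating described in this commit message, not Helion's actual code. `SketchConfigSpec`, `flat_fields`, and `LOOP_TYPE_CHOICES` are hypothetical stand-ins, and the exact set of `pallas_loop_type` choices is an assumption inferred from the names mentioned above ("default", emit_pipeline, fori_loop). Only the two flag names and the fragment name come from the message itself.

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumed choice list: the commit message names "default", emit_pipeline
# and fori_loop, but the real fragment may define more or different values.
LOOP_TYPE_CHOICES = ("default", "emit_pipeline", "fori_loop")


@dataclass
class SketchConfigSpec:
    """Hypothetical stand-in for ConfigSpec, for illustration only."""

    # Set during tracing when an inner device loop is encountered.
    has_pallas_inner_loops: bool = False
    # Set when a loop bound contains a non-constexpr symbolic value.
    has_pallas_symbolic_bounds: bool = False

    def flat_fields(self) -> dict[str, tuple[str, ...]]:
        """Return the tunable fields the autotuner would search over."""
        fields: dict[str, tuple[str, ...]] = {}
        # Single-tile kernels have no inner loops, so the loop type is not
        # registered at all and costs the autotuner no trials.
        if self.has_pallas_inner_loops:
            choices = LOOP_TYPE_CHOICES
            # With symbolic bounds, drop "default" while building the
            # fragment instead of mutating a shared fragment table.
            if self.has_pallas_symbolic_bounds:
                choices = tuple(c for c in choices if c != "default")
            fields["pallas_loop_type"] = choices
        return fields
```

In this sketch the flags are plain booleans that tracing would set once, and the choice list is derived from them each time the fields are built, so no shared state has to be edited after the fact.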