inductor: fix issue of computing index_expr range #103147
XiaobingSuper wants to merge 7 commits into gh/XiaobingSuper/129/base
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/103147
Note: Links to docs will display an error until the docs builds have been completed.
✅ 2 Unrelated Failures. As of commit e95601d: UNSTABLE. The following jobs failed but were likely due to flakiness present on trunk and have been marked as unstable.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
I just met one index expr which only has one
```python
# min_value may be greater than max_value, such as ModularIndexing(513*i2 + i3 + 262400, 512, 513),
# with vars_ranges is {i2: ValueRanges(lower=0, upper=256), i3: ValueRanges(lower=0, upper=513)}.
```
I guess a better approach is to deduce the range from the divisor. For example, we could create a new symbol with the range [0, divisor-1] (supposing the divisor is constant), and put this range into vars_ranges to calculate the min/max value with the algorithm in this function.
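A minimal sketch of that suggestion, outside the inductor codebase: `ModularIndexing` here is a plain sympy stand-in for the real op, and the `mod_index{n}` symbol names and `replace_modular_indexing` helper are illustrative, not the PR's actual diff.

```python
import itertools

import sympy

# Stand-in for inductor's ModularIndexing(x, y, z) == FloorDiv(x, y) % z.
ModularIndexing = sympy.Function("ModularIndexing")
cnt = itertools.count()

def replace_modular_indexing(expr, vars_ranges):
    """Replace each ModularIndexing(x, y, z) with a fresh symbol whose range
    is [0, z - 1] (sound when z is a positive constant), so the usual
    min/max propagation over vars_ranges can proceed on the result."""
    def rep(x, y, z):
        if z.is_constant():
            new_var = sympy.Symbol(
                f"mod_index{next(cnt)}", integer=True, nonnegative=True
            )
            vars_ranges[new_var] = (0, z - 1)
            return new_var
        return x / y  # non-constant divisor: fall back to the division rewrite
    return expr.replace(ModularIndexing, rep)
```

On the problematic expression from this PR, the whole `ModularIndexing(513*i2 + i3 + 262400, 512, 513)` collapses to one bounded symbol with range (0, 512), and no min > max inversion can occur.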
On the CPU inductor side, there is an optimization that converts an ```int64``` index_expr to ```int32``` for better performance (https://github.com/pytorch/pytorch/blob/main/torch/_inductor/codegen/cpp.py#L2034). But for a ```ModularIndexing``` expr, we replace it with a division (https://github.com/pytorch/pytorch/blob/main/torch/_inductor/optimize_indexing.py#L73; ```ModularIndexing``` doesn't have a derivative) to compute the derivative and then the expr's value range. This can hit an issue where the min value is greater than the max value (```ModularIndexing(513*i2 + i3 + 262400, 512, 513)```, with vars_ranges ```{i2: ValueRanges(lower=0, upper=256), i3: ValueRanges(lower=0, upper=513)}```). One solution is not to replace ```ModularIndexing```, but then we can't get the value range. Another solution is to return an ```inf``` range when the min val is greater than the max val. cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 ipiszy ngimel yf225
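The "return an inf range when min > max" fallback can be sketched as a guard on the narrowing decision. This is a hypothetical helper (`can_use_int32` is not the PR's actual function name), assuming the optimization only narrows when the proven range fits in int32:

```python
import math

INT32_MIN, INT32_MAX = -(2 ** 31), 2 ** 31 - 1

def can_use_int32(lower, upper):
    """Decide whether an index expr with proven value range [lower, upper]
    may be emitted with 32-bit arithmetic instead of 64-bit."""
    # An invalid range (min > max) or an infinite one means we could not
    # prove anything, so conservatively keep int64.
    if math.isinf(lower) or math.isinf(upper) or lower > upper:
        return False
    return INT32_MIN <= lower and upper <= INT32_MAX
```

Returning an infinite range for the broken case then simply disables the conversion instead of producing wrong codegen.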
```python
if len(symbols) == 0:
    return ValueRanges(expr, expr)

vars_ranges_temp = vars_ranges.copy()
```
Simpler to do ```vars_ranges = vars_ranges.copy()```?
We should just return a (0, z-1) range for ModularIndexing, and not go through derivative computation. cc @eellison
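That range is easy to sanity-check outside sympy, modeling ModularIndexing as Python floor division followed by `%` (a stand-in model, not inductor code):

```python
def modular_indexing(x, y, z):
    # Python-level model of inductor's ModularIndexing: FloorDiv(x, y) % z.
    return (x // y) % z

# For a positive constant z, the result lies in [0, z - 1] for every x
# (even negative x, since Python's % follows the divisor's sign), so
# (0, z - 1) is a sound range with no derivative computation needed.
assert all(
    0 <= modular_indexing(x, 512, 513) <= 512
    for x in range(-100000, 100000, 997)
)
```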
Yes, for a simple expr which only has ModularIndexing, it is OK to just return (0, z-1); but for a complex expr, I think we still need to use derivatives and replace ModularIndexing with a variable that has the range (0, z-1). @eellison
```python
def mod_indexing_rep(x, y, z):
    if z.is_constant():
        return x / y
    new_var = sympy_symbol("mod_index" + f"{next(cnt)}")
```
Should we check if x / y has a range <= z and return x / y in that case?
If we want to return x/y, we need to check whether z is positive and whether the x/y range has a consistent sign. For example, if the x/y range is [-2, 2] and z is 4, we can't directly return x/y. Returning x/y would require handling many conditions, so I think using z's range is OK even if that range is huge in some cases.
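The counterexample can be checked directly, again modeling ModularIndexing as floor division followed by `%`:

```python
# With x/y in [-2, 2] and z = 4, ModularIndexing does not equal x / y:
x, y, z = -2, 1, 4
mod_val = (x // y) % z  # model of ModularIndexing(x, y, z)
assert mod_val == 2     # lands in [0, z - 1]
assert x / y == -2      # plain division disagrees, so returning x/y is unsound here
```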
Maybe add a TODO to optimize this more.
Yeah, #102722 is strictly better than this, but this is better than the current state so I'm fine with merging this one. FWIW, I'm going to be on PTO for the next 3 weeks, so let's merge this one now and I'll have the other one ready when I'm back. |
Added.
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here |
Successfully rebased |
```python
    (torch.randn(8),),
)

@patch("torch.cuda.is_available", lambda: False)
```
@pytorchbot merge |
Merge started. Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.