Skip to content

indutor: fix issue of compute index_expr range#103147

Closed
XiaobingSuper wants to merge 7 commits intogh/XiaobingSuper/129/basefrom
gh/XiaobingSuper/129/head
Closed

indutor: fix issue of compute index_expr range#103147
XiaobingSuper wants to merge 7 commits intogh/XiaobingSuper/129/basefrom
gh/XiaobingSuper/129/head

Conversation

@XiaobingSuper
Copy link
Copy Markdown
Collaborator

@XiaobingSuper XiaobingSuper commented Jun 7, 2023

Stack from ghstack (oldest at bottom):

For the CPU inductor side, there has an optimization to convert int64 index_expr to int32 for good performance(https://github.com/pytorch/pytorch/blob/main/torch/_inductor/codegen/cpp.py#L2034), but for ModularIndexing exp, we replace it as division(https://github.com/pytorch/pytorch/blob/main/torch/_inductor/optimize_indexing.py#L73, ModularIndexing doesn't have derivative) to compute derivative and then compute the expr's value range, there may meet issue which the min value may greater than the max value(ModularIndexing(513*i2 + i3 + 262400, 512, 513), with vars_ranges is {i2: ValueRanges(lower=0, upper=256), i3: ValueRanges(lower=0, upper=513)}).

One solution is that we don't replace ModularIndexing, but it can't get the value range.
Another solution is that return inf range when the min val is great than the max val.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @ngimel @yf225 @chenyang78

@pytorch-bot
Copy link
Copy Markdown

pytorch-bot Bot commented Jun 7, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/103147

Note: Links to docs will display an error until the docs builds have been completed.

✅ 2 Unrelated Failures

As of commit e95601d:

UNSTABLE - The following jobs failed but were likely due to flakiness present on trunk and has been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@XiaobingSuper XiaobingSuper marked this pull request as draft June 7, 2023 07:38
@XiaobingSuper XiaobingSuper marked this pull request as ready for review June 7, 2023 07:59
@XiaobingSuper
Copy link
Copy Markdown
Collaborator Author

I just meet one index expr which only has one ModularIndexing, for a complex expr has ModularIndexing and other operatoions, I don't think it is ok using the current method to get the value range.

Comment thread torch/_inductor/optimize_indexing.py Outdated
Comment on lines +128 to +129
# min_value may be greater than max_value, such as ModularIndexing(513*i2 + i3 + 262400, 512, 513),
# with vars_ranges is {i2: ValueRanges(lower=0, upper=256), i3: ValueRanges(lower=0, upper=513)}.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess a better approach is to deduce the range with the divisor. For example, we may create a new symbol with the range [0, divisor-1] suppose the divisor is constant, and put this range to the vars_ranges to calculate the min/max value with the algorithm in this function.

For the CPU inductor side, there has an optimization to convert ```int64``` index_expr to ```int32``` for good performance(https://github.com/pytorch/pytorch/blob/main/torch/_inductor/codegen/cpp.py#L2034), but for ```ModularIndexing``` exp, we replace it as division(https://github.com/pytorch/pytorch/blob/main/torch/_inductor/optimize_indexing.py#L73, ```ModularIndexing``` doesn't have derivative) to compute derivative and then compute the expr's value range, there may meet issue which the min value may greater than the max value(```ModularIndexing(513*i2 + i3 + 262400, 512, 513), with vars_ranges is {i2: ValueRanges(lower=0, upper=256), i3: ValueRanges(lower=0, upper=513)}```).

One solution is that we don't replace ```ModularIndexing```, but it can't get the value range.
Another solution is that return ```inf``` range when the min val is great than the max val.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 ipiszy ngimel yf225

[ghstack-poisoned]
XiaobingSuper added a commit that referenced this pull request Jun 7, 2023
ghstack-source-id: a4a1ec5
Pull Request resolved: #103147
Comment thread torch/_inductor/optimize_indexing.py Outdated
if len(symbols) == 0:
return ValueRanges(expr, expr)

vars_ranges_temp = vars_ranges.copy()
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simpler to do vars_ranges = vars_ranges.copy()?

@ngimel
Copy link
Copy Markdown
Collaborator

ngimel commented Jun 7, 2023

We should just return (0, z-1) range for ModularIndexing, and not go through derivative computation. cc @eellison

@XiaobingSuper
Copy link
Copy Markdown
Collaborator Author

We should just return (0, z-1) range for ModularIndexing, and not go through derivative computation. cc @eellison

Yes, for a simple expr which only has ModularIndexing, it is ok just to return (0, z-1), but for a complex expr, I think we need still to use derivative and replace ModularIndexing as a variable that has a range (0, z-1). @eellison

Copy link
Copy Markdown
Contributor

@eellison eellison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possible to add test ?

def mod_indexing_rep(x, y, z):
if z.is_constant():
return x / y
new_var = sympy_symbol("mod_index" + f"{next(cnt)}")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check if x / y has a range <= z and return x / y in that case ?

Copy link
Copy Markdown
Collaborator Author

@XiaobingSuper XiaobingSuper Jun 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to return x/y, it needs to check z is positive or not, and x/y ranges are the same sign, for example, if x/y ranges are [-2, 2], z is 4, we can't direct return x/y. if we want to return x/y, we may need to add consider many conditions, I think using z's range is ok even if the range is huge for some cases.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add TODO to optimize more

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, added.

@ngimel
Copy link
Copy Markdown
Collaborator

ngimel commented Jun 9, 2023

cc @lezcano, does this overlap with #102722?

@lezcano
Copy link
Copy Markdown
Collaborator

lezcano commented Jun 10, 2023

Yeah, #102722 is strictly better than this, but this is better than the current state so I'm fine with merging this one.

FWIW, I'm going to be on PTO for the next 3 weeks, so let's merge this one now and I'll have the other one ready when I'm back.

For the CPU inductor side, there has an optimization to convert ```int64``` index_expr to ```int32``` for good performance(https://github.com/pytorch/pytorch/blob/main/torch/_inductor/codegen/cpp.py#L2034), but for ```ModularIndexing``` exp, we replace it as division(https://github.com/pytorch/pytorch/blob/main/torch/_inductor/optimize_indexing.py#L73, ```ModularIndexing``` doesn't have derivative) to compute derivative and then compute the expr's value range, there may meet issue which the min value may greater than the max value(```ModularIndexing(513*i2 + i3 + 262400, 512, 513), with vars_ranges is {i2: ValueRanges(lower=0, upper=256), i3: ValueRanges(lower=0, upper=513)}```).

One solution is that we don't replace ```ModularIndexing```, but it can't get the value range.
Another solution is that return ```inf``` range when the min val is great than the max val.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 ipiszy ngimel yf225

[ghstack-poisoned]
XiaobingSuper added a commit that referenced this pull request Jun 12, 2023
ghstack-source-id: c543560
Pull Request resolved: #103147
@XiaobingSuper
Copy link
Copy Markdown
Collaborator Author

XiaobingSuper commented Jun 12, 2023

possible to add test ?

Added.

For the CPU inductor side, there has an optimization to convert ```int64``` index_expr to ```int32``` for good performance(https://github.com/pytorch/pytorch/blob/main/torch/_inductor/codegen/cpp.py#L2034), but for ```ModularIndexing``` exp, we replace it as division(https://github.com/pytorch/pytorch/blob/main/torch/_inductor/optimize_indexing.py#L73, ```ModularIndexing``` doesn't have derivative) to compute derivative and then compute the expr's value range, there may meet issue which the min value may greater than the max value(```ModularIndexing(513*i2 + i3 + 262400, 512, 513), with vars_ranges is {i2: ValueRanges(lower=0, upper=256), i3: ValueRanges(lower=0, upper=513)}```).

One solution is that we don't replace ```ModularIndexing```, but it can't get the value range.
Another solution is that return ```inf``` range when the min val is great than the max val.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 ipiszy ngimel yf225

[ghstack-poisoned]
@XiaobingSuper
Copy link
Copy Markdown
Collaborator Author

@pytorchbot rebase

@XiaobingSuper XiaobingSuper requested a review from ngimel June 13, 2023 00:47
@pytorchmergebot
Copy link
Copy Markdown
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

For the CPU inductor side, there has an optimization to convert ```int64``` index_expr to ```int32``` for good performance(https://github.com/pytorch/pytorch/blob/main/torch/_inductor/codegen/cpp.py#L2034), but for ```ModularIndexing``` exp, we replace it as division(https://github.com/pytorch/pytorch/blob/main/torch/_inductor/optimize_indexing.py#L73, ```ModularIndexing``` doesn't have derivative) to compute derivative and then compute the expr's value range, there may meet issue which the min value may greater than the max value(```ModularIndexing(513*i2 + i3 + 262400, 512, 513), with vars_ranges is {i2: ValueRanges(lower=0, upper=256), i3: ValueRanges(lower=0, upper=513)}```).

One solution is that we don't replace ```ModularIndexing```, but it can't get the value range.
Another solution is that return ```inf``` range when the min val is great than the max val.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 ipiszy ngimel yf225

[ghstack-poisoned]
@pytorchmergebot
Copy link
Copy Markdown
Collaborator

Successfully rebased gh/XiaobingSuper/129/orig onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via ghstack checkout https://github.com/pytorch/pytorch/pull/103147)

pytorchmergebot pushed a commit that referenced this pull request Jun 13, 2023
ghstack-source-id: 30df406
Pull Request resolved: #103147
Comment thread test/inductor/test_cpu_repro.py Outdated
(torch.randn(8),),
)

@patch("torch.cuda.is_available", lambda: False)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need this ?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove it.

def mod_indexing_rep(x, y, z):
if z.is_constant():
return x / y
new_var = sympy_symbol("mod_index" + f"{next(cnt)}")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add TODO to optimize more

For the CPU inductor side, there has an optimization to convert ```int64``` index_expr to ```int32``` for good performance(https://github.com/pytorch/pytorch/blob/main/torch/_inductor/codegen/cpp.py#L2034), but for ```ModularIndexing``` exp, we replace it as division(https://github.com/pytorch/pytorch/blob/main/torch/_inductor/optimize_indexing.py#L73, ```ModularIndexing``` doesn't have derivative) to compute derivative and then compute the expr's value range, there may meet issue which the min value may greater than the max value(```ModularIndexing(513*i2 + i3 + 262400, 512, 513), with vars_ranges is {i2: ValueRanges(lower=0, upper=256), i3: ValueRanges(lower=0, upper=513)}```).

One solution is that we don't replace ```ModularIndexing```, but it can't get the value range.
Another solution is that return ```inf``` range when the min val is great than the max val.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 ipiszy ngimel yf225 chenyang78

[ghstack-poisoned]
@XiaobingSuper XiaobingSuper added release notes: inductor ciflow/trunk Trigger trunk jobs on your pull request labels Jun 16, 2023
XiaobingSuper added a commit that referenced this pull request Jun 16, 2023
ghstack-source-id: 48bae4a
Pull Request resolved: #103147
For the CPU inductor side, there has an optimization to convert ```int64``` index_expr to ```int32``` for good performance(https://github.com/pytorch/pytorch/blob/main/torch/_inductor/codegen/cpp.py#L2034), but for ```ModularIndexing``` exp, we replace it as division(https://github.com/pytorch/pytorch/blob/main/torch/_inductor/optimize_indexing.py#L73, ```ModularIndexing``` doesn't have derivative) to compute derivative and then compute the expr's value range, there may meet issue which the min value may greater than the max value(```ModularIndexing(513*i2 + i3 + 262400, 512, 513), with vars_ranges is {i2: ValueRanges(lower=0, upper=256), i3: ValueRanges(lower=0, upper=513)}```).

One solution is that we don't replace ```ModularIndexing```, but it can't get the value range.
Another solution is that return ```inf``` range when the min val is great than the max val.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 ipiszy ngimel yf225 chenyang78

[ghstack-poisoned]
XiaobingSuper added a commit that referenced this pull request Jun 19, 2023
ghstack-source-id: 73040eb
Pull Request resolved: #103147
@XiaobingSuper
Copy link
Copy Markdown
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Copy link
Copy Markdown
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

[Inductor][CPU] ValueRangeError: Invalid ranges cased model crash for hf_Longformer & AllenaiLongformerBase

7 participants