
[MPS] MPSNDArray error: product of dimension sizes > 2**31 #84039


Closed
junukwon7 opened this issue Aug 25, 2022 · 38 comments
Labels
module: mps (Related to Apple Metal Performance Shaders framework), triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Comments

@junukwon7

junukwon7 commented Aug 25, 2022

🐛 Describe the bug

Full error message (no traceback):

AppleInternal/Library/BuildRoots/20d6c351-ee94-11ec-bcaf-7247572f23b4/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:705: failed assertion '[MPSNDArray initWithDevice:descriptor:] Error: product of dimension sizes > 2**31 '

How to reproduce

  1. Install stable-diffusion using the instructions for macOS.
  2. Run python scripts/txt2img.py --prompt "a horse" --plms --n_samples 1 --n_rows 1 --n_iter 1: this runs well.
  3. Add the flags --W 1024 --H 1024 (width and height, respectively) and it will raise the error.
  • The default width and height are 512, so running without the flags means 512x512.

I'm looking for a way to reproduce it without installing the whole pipeline, and I'll update this issue soon.

Edit: repro by @Birch-san

from torch import einsum, ones
import argparse

parser = argparse.ArgumentParser(description='mpsndarray test')
parser.add_argument('--n_samples', type=int, default=2)
args = parser.parse_args()
n_samples = args.n_samples

einsum('b i d, b j d -> b i j', ones(16 * n_samples, 4096, 40, device='mps'), ones(16 * n_samples, 4096, 40, device='mps')).shape

print(n_samples, 'passed')

It fails when n_samples is 2 or greater than 7, which looks pretty weird.

About VRAM?

As you would all expect, the error seems to be something about VRAM. However, a few questions remain.

  1. The error seems to come from a size exceeding INT_MAX (2**31).
    The error doesn't occur at --W 512 --H 512 or lower resolutions.
  2. The error is a software issue.
    Unlike errors like CUDA out of memory, this error isn't about the real memory limit.
    If the error were due to lack of VRAM, the command above (--W 1024 --H 1024) should run on an M1 Max with 64GB, since --W 512 --H 512 runs well on my M1 8GB MacBook. Also, the 2**31 limit is a fixed number, which would not change with the current memory usage.

So, my expectation is that something is being computed in 32-bit, which shouldn't be.
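
For a rough sense of scale (back-of-the-envelope arithmetic only, based on the repro shapes above and assuming the token count scales with image resolution; whether the limit counts elements or bytes isn't confirmed here):

heads_x_batch = 16           # first dim of the repro tensors for n_samples == 1
tokens_512  = 64 * 64        # 512x512 image -> 64x64 latent -> 4096 tokens
tokens_1024 = 128 * 128      # 1024x1024 image -> 128x128 latent -> 16384 tokens
print(heads_x_batch * tokens_512 ** 2)    # 268_435_456, well below 2**31
print(heads_x_batch * tokens_1024 ** 2)   # 4_294_967_296 = 2**32, above 2**31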

This might not be torch's problem - maybe (surely) Metal's.

However, any help will be gratefully accepted.

Thanks.

Versions

PyTorch version: 1.13.0.dev20220824
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 12.5.1 (arm64)
GCC version: Could not collect
Clang version: 13.1.6 (clang-1316.0.21.2.5)
CMake version: version 3.24.1
Libc version: N/A

Python version: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:14) [Clang 12.0.1 ] (64-bit runtime)
Python platform: macOS-12.5.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.23.2
[pip3] pytorch-lightning==1.7.2
[pip3] torch==1.13.0.dev20220824
[pip3] torch-fidelity==0.3.0
[pip3] torchaudio==0.13.0.dev20220824
[pip3] torchmetrics==0.9.3
[pip3] torchvision==0.14.0.dev20220824
[conda] numpy 1.23.2 py38h579d673_0 conda-forge
[conda] pytorch 1.13.0.dev20220824 py3.8_0 pytorch-nightly
[conda] pytorch-lightning 1.7.2 pypi_0 pypi
[conda] torch-fidelity 0.3.0 pypi_0 pypi
[conda] torchaudio 0.13.0.dev20220824 py38_cpu pytorch-nightly
[conda] torchmetrics 0.9.3 pypi_0 pypi
[conda] torchvision 0.14.0.dev20220824 py38_cpu pytorch-nightly

cc @kulinseth @albanD

@DenisVieriu97 DenisVieriu97 added the module: mps Related to Apple Metal Performance Shaders framework label Aug 25, 2022
@Karric

Karric commented Aug 28, 2022

I had the same issue, but with an M1 Mac + TensorFlow (for Mac); no PyTorch, but I am using NumPy. Not sure if that's helpful.

@junukwon7
Author

I had the same issue, but with an M1 Mac + TensorFlow (for Mac); no PyTorch, but I am using NumPy. Not sure if that's helpful.

@Karric Thanks for your reply. It seems the error is something about Metal, then. Could you provide more information about the situation with the M1 Mac + TensorFlow?

Thanks.

@dagitses dagitses added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Aug 29, 2022
@i3oc9i

i3oc9i commented Sep 5, 2022

I have the same issue on my Mac Studio Ultra with 128GB :(

@pcuenca

pcuenca commented Sep 6, 2022

We are having the same problem in 🤗 diffusers, but I fail to see where the offending dimensions are being generated or used. The error arises when attempting to run the unet module of the model on input latents with shape (4, 4, 64, 64). This is what torchinfo.summary has to say about the output shapes of that model's layers:

===================================================================================================================
Layer (type:depth-idx)                                            Output Shape              Param #
===================================================================================================================
UNet2DConditionModel                                              [4, 4, 64, 64]            510,807,680
├─Conv2d: 1-4                                                     [4, 320, 64, 64]          (recursive)
├─Timesteps: 1-2                                                  [4, 320]                  --
├─TimestepEmbedding: 1-3                                          [4, 1280]                 --
│    └─Linear: 2-1                                                [4, 1280]                 410,880
│    └─SiLU: 2-2                                                  [4, 1280]                 --
│    └─Linear: 2-3                                                [4, 1280]                 1,639,680
├─Conv2d: 1-4                                                     [4, 320, 64, 64]          (recursive)
├─ModuleList: 1                                                   --                        --
│    └─CrossAttnDownBlock2D: 2-4                                  [4, 320, 32, 32]          --
│    └─CrossAttnDownBlock2D: 2-5                                  [4, 640, 16, 16]          --
│    └─CrossAttnDownBlock2D: 2-6                                  [4, 1280, 8, 8]           --
│    └─DownBlock2D: 2-7                                           [4, 1280, 8, 8]           --
├─UNetMidBlock2DCrossAttn: 1-5                                    [4, 1280, 8, 8]           --
│    └─ModuleList: 2                                              --                        --
│    │    └─ResnetBlock2D: 3-1                                    [4, 1280, 8, 8]           31,138,560
│    └─ModuleList: 2                                              --                        --
│    │    └─SpatialTransformer: 3-2                               [4, 1280, 8, 8]           34,760,960
│    └─ModuleList: 2                                              --                        --
│    │    └─ResnetBlock2D: 3-3                                    [4, 1280, 8, 8]           31,138,560
├─ModuleList: 1                                                   --                        --
│    └─UpBlock2D: 2-8                                             [4, 1280, 16, 16]         --
│    └─CrossAttnUpBlock2D: 2-9                                    [4, 1280, 32, 32]         --
│    └─CrossAttnUpBlock2D: 2-10                                   [4, 640, 64, 64]          --
│    └─CrossAttnUpBlock2D: 2-11                                   [4, 320, 64, 64]          --
├─GroupNorm: 1-6                                                  [4, 320, 64, 64]          640
├─SiLU: 1-7                                                       [4, 320, 64, 64]          --
├─Conv2d: 1-8                                                     [4, 4, 64, 64]            11,524
===================================================================================================================

It looks to me like those sizes shouldn't trigger any internal Metal limitation on 32-bit sizes being exceeded. Is there any hint as to what might be causing this? Thank you!

@pcuenca

pcuenca commented Sep 6, 2022

In fact, as @FahimF reports, the crash does not occur for larger batches such as those that would result in latent shapes of (6, 4, 64, 64) or (8, 4, 64, 64). Is there any way to debug where the problem is being triggered?

@Birch-san

Birch-san commented Sep 7, 2022

I have a repro (from stable-diffusion's attention forward-pass).

https://github.com/CompVis/stable-diffusion/blob/69ae4b35e0a0f6ee1af8bb9a5d0016ccb27e36dc/ldm/modules/attention.py#L180

from torch import einsum, ones
# crashes with "product of dimension sizes > 2**31"
# this is equivalent to invoking stable-diffusion with --n_samples 2
einsum('b i d, b j d -> b i j', ones(32, 4096, 40, device='mps'), ones(32, 4096, 40, device='mps')).shape

# doesn't crash, even though it's bigger
# this is equivalent to invoking stable-diffusion with --n_samples 3
einsum('b i d, b j d -> b i j', ones(48, 4096, 40, device='mps'), ones(48, 4096, 40, device='mps')).shape

Perhaps related to this?

#80808 (comment)

Depending on the size/number of dimensions, different algorithm might get selected leading to small differences.

Perhaps at the larger size of 48 we hit a tensor size that persuades einsum() to use a different algorithm, which doesn't crash?

I'm not sure whether this is necessarily the same line of stable-diffusion code on which @junukwon7's run is crashing (i.e. due to image dimensions), but if we're lucky it's the same mechanism.

@junukwon7
Author

junukwon7 commented Sep 7, 2022

@Birch-san

Thanks for your repro!
I've tried running a bash loop over it, and found out that 16*i fails when i == 2 or i >= 8.

from torch import einsum, ones
import argparse

parser = argparse.ArgumentParser(description='mpsndarray test')
parser.add_argument('--n_samples', type=int, default=2)
args = parser.parse_args()
n_samples = args.n_samples

einsum('b i d, b j d -> b i j', ones(16 * n_samples, 4096, 40, device='mps'), ones(16 * n_samples, 4096, 40, device='mps')).shape

print(n_samples, 'passed')

I saved this as a Python file and ran:

for i in {1..80}; do python3 mpsndarray.py --n_samples ${i}; done

The result was:

1 pass
2 ERR
3 ~ 7 pass
8 ~ 35 ERR
36 ~ RuntimeError: Invalid buffer size: 36.00 GB

M1 Max MBP with 64GB RAM
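
For reference (back-of-the-envelope arithmetic only; it matches the n_samples == 2 crash and the 36 GB buffer error, but doesn't explain why 3-7 pass): the einsum output of shape (16*n, 4096, 4096) in float32 is exactly n GiB.

out_bytes = lambda n: 16 * n * 4096 * 4096 * 4   # float32 output of shape (16*n, 4096, 4096)
print(out_bytes(1))    # 1_073_741_824 bytes = 1 GiB, passes
print(out_bytes(2))    # 2_147_483_648 bytes = exactly 2**31, crashes
print(out_bytes(36))   # 38_654_705_664 bytes = 36 GiB, matches "Invalid buffer size: 36.00 GB"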

@pcuenca

pcuenca commented Sep 9, 2022

Regarding the reproducibility code, @patil-suraj replaced einsum with matmul in the diffusers codebase and the problem still occurs in exactly the same way.

In fact, following up on @Birch-san's example, this behaves exactly like they describe:

import torch

# Crashes
t1 = torch.rand((32, 4096, 40))
t2 = torch.rand((32, 4096, 40))
torch.matmul(t1.to("mps"), t2.to("mps").transpose(1, 2)).shape

# Doesn't crash, even though it's bigger
t1 = torch.rand((48, 4096, 40))
t2 = torch.rand((48, 4096, 40))
torch.matmul(t1.to("mps"), t2.to("mps").transpose(1, 2)).shape

In addition, this is something different but very weird:

import torch
from torch import einsum

t1 = torch.rand((48, 4096, 40))
t2 = torch.rand((48, 4096, 40))

x_mps = einsum('b i d, b j d -> b i j', t1.to('mps'), t2.to('mps'))
x_cpu = einsum('b i d, b j d -> b i j', t1, t2)
print((x_mps.to("cpu") - x_cpu).abs().max())
# tensor(9.4567) !?

As you can see, the output seems to be wrong. However, if we do other operations first, the einsum is now correct:

t1 = torch.rand((48, 4096, 40))
t2 = torch.rand((48, 4096, 40))

x_mm_mps = torch.matmul(t1.to("mps"), t2.to("mps").transpose(1, 2))
x_mm_cpu = torch.matmul(t1, t2.transpose(1, 2))
print((x_mm_mps.to("cpu") - x_mm_cpu).abs().max())
# tensor(0.)

x_mps = einsum('b i d, b j d -> b i j', t1.to('mps'), t2.to('mps'))
x_cpu = einsum('b i d, b j d -> b i j', t1, t2)
print((x_mps.to("cpu") - x_cpu).abs().max())
# tensor(0.)

Is there any guidance or hint as to what could be going on here?

TL;DR:

  • torch.matmul crashes for some sizes when using mps.
  • torch.einsum produces inconsistent results "sometimes". Doing some previous operations with similar memory requirements seems to prevent the issue, maybe?

@DenisVieriu97
Collaborator

Thank you for the report @junukwon7 and for the repro code @pcuenca, @Birch-san. This is an issue on the MPS side - we are working on a fix.

@malfet
Contributor

malfet commented Sep 20, 2022

I have an even simpler 1-line reproducer:

% python -c "import torch;print(torch.ones(32, 4096, 4096, device='mps').shape)"
/AppleInternal/Library/BuildRoots/5381bdfb-27e8-11ed-bdc1-96898e02b808/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:705: failed assertion `[MPSTemporaryNDArray initWithDevice:descriptor:] Error: product of dimension sizes > 2**31'

@malfet
Contributor

malfet commented Sep 20, 2022

And I can reproduce the matmul crash using the following pure-MPS script:

import MetalPerformanceShadersGraph

let graph = MPSGraph()
let x = graph.constant(1, shape: [32, 4096, 40], dataType: .float32)
let y = graph.constant(1, shape: [32, 40, 4096], dataType: .float32)
let z = graph.matrixMultiplication(primary: x, secondary: y, name: nil)
let device = MTLCreateSystemDefaultDevice()!
let buf = device.makeBuffer(length: 16384)!
let td = MPSGraphTensorData(buf, shape: [64, 64], dataType: .int32)
let cmdBuf = MPSCommandBuffer(from: device.makeCommandQueue()!)
graph.encode(to: cmdBuf, feeds: [:], targetOperations: nil, resultsDictionary: [z:td], executionDescriptor: nil)
cmdBuf.commit()

@kulinseth
Collaborator

And I can reproduce the matmul crash using the following pure-MPS script: […]

Thanks @malfet. As Denis mentioned above, this is an MPS-side issue, so I expected it would be reproducible by a simple test case outside of Torch as well. There were a couple of issues: one was that the heap logic which allocates the size for an MPSNDArray (the basic struct for allocating a tensor in MPS) was incorrectly erroring out on the product of dimension sizes when compared with the allocated buffer size; the other was a restriction on how large a buffer can be allocated. Both conditions have been updated and fixed, and the fixes will be available in upcoming updates. Having said this, we do have a current limit on the size of an NDArray that can be created, which is 2^32. So, for instance, (63, 4096, 4096) in 32-bit values will work, but 64 will not.
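
For reference, if the 2^32 limit is counted in total bytes (an assumption, not stated above), the (63 vs 64, 4096, 4096) example lines up:

print(63 * 4096 * 4096 * 4)   # 4_227_858_432 bytes, just under 2**32
print(64 * 4096 * 4096 * 4)   # 4_294_967_296 bytes, exactly 2**32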

@junukwon7
Author

@kulinseth Thanks for your response.
May I ask why the devs limited the size to 32-bit values?

@malfet
Contributor

malfet commented Sep 27, 2022

@junukwon7 I don't know the exact details, but I assume using 32-bit indexes results in faster kernels, as one can perform twice as many 32-bit operations per SIMD instruction compared to 64-bit ones.

Consider the AVX2 instruction set as an example: _mm256_add_epi32 and _mm256_add_epi64 have the same latency, but the former can do 8 32-bit additions while the latter does 4 64-bit ones. That is, more threads are needed to accomplish the same computation on CPU using 64-bit indices than 32-bit ones.

And as the majority of tensors have fewer than 4 billion elements, it makes perfect sense to use 32-bit indices while performing operations on them.
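
As a concrete illustration (back-of-the-envelope only, not the actual MPS code), the byte size of the failing (32, 4096, 4096) float32 tensor is one past what a signed 32-bit field can hold:

nbytes = 32 * 4096 * 4096 * 4                # 2_147_483_648 == 2**31
wrapped = (nbytes + 2**31) % 2**32 - 2**31   # value if stored in a signed 32-bit integer
print(nbytes > 2**31 - 1, wrapped)           # True -2147483648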

@Vargol

Vargol commented Oct 12, 2022

Hi.

Is the fix for this in the nightlies? I'm still seeing this error under some strange circumstances.

For example, in the following code, note that the first tensor is actually larger than the second, yet it is the second that fails due to the > 2**31 error:

import torch

mps_device = torch.device("mps")

ix = torch.zeros(1, 256, 960, 1024, device=mps_device)
iy = torch.nn.functional.interpolate(ix, scale_factor=2.0, mode="nearest")

print(str(iy.device))
print(str(iy.shape))

ix = torch.zeros(1, 256, 512, 1024, device=mps_device)
iy = torch.nn.functional.interpolate(ix, scale_factor=2.0, mode="nearest")

print(str(iy.device))
print(str(iy.shape))

gives

% python test.py
mps:0
torch.Size([1, 256, 1920, 2048])
/AppleInternal/Library/BuildRoots/a0876c02-1788-11ed-b9c4-96898e02b808/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:705: failed assertion `[MPSTemporaryNDArray initWithDevice:descriptor:] Error: product of dimension sizes > 2**31'
zsh: abort      python test.py

using this version of pytorch

 % pip list | grep torch
torch              1.14.0.dev20221011
torchaudio         0.13.0.dev20221010
torchvision        0.15.0.dev20221010

It seems to be that exact number of values that fails:

if ix = torch.zeros(1, 256, 512, 1025, device=mps_device), torch.nn.functional.interpolate works
if ix = torch.zeros(2, 256, 256, 1024, device=mps_device), torch.nn.functional.interpolate fails.
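
For reference (my arithmetic only, not a confirmed explanation): the interpolate output in the failing cases is exactly 2**31 bytes of float32, while the output of the first, larger case is well past that and still works:

fp32 = 4
print(1 * 256 * 1920 * 2048 * fp32)   # 4_026_531_840 bytes - works
print(1 * 256 * 1024 * 2048 * fp32)   # 2_147_483_648 bytes = exactly 2**31 - fails
print(2 * 256 * 512 * 2048 * fp32)    # 2_147_483_648 bytes = exactly 2**31 - fails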

@kulinseth
Collaborator

Is the fix for this in the nightlies? I'm still seeing this error under some strange circumstances. […]

@Vargol, the fix is not in the PyTorch nightlies; the required fix is needed on the OS side, in the MPS library. Can you please try with the latest Ventura beta build? Locally it seems to work and outputs:

mps:0
torch.Size([1, 256, 1920, 2048])
mps:0
torch.Size([1, 256, 1024, 2048])

@Vargol

Vargol commented Oct 13, 2022

No plans on installing a Beta OS, especially one where some of the tools I use are not fully working yet :-)

@kulinseth
Collaborator

No plans on installing a Beta OS, especially one where some of the tools I use are not fully working yet :-)

Sure, you can wait till the official release and try it out.

@pcuenca

pcuenca commented Oct 17, 2022

Update after testing PyTorch 1.13.0 (from the test channel) on Ventura 13.0 Beta (22A5373b).

  • My previous crash repro, and the ones reported by @malfet, now work as reported by @kulinseth :)
  • However, this Python script crashes:
import torch

t1 = torch.ones((32, 4096, 4096))
t2 = torch.ones((32, 4096, 1))
torch.matmul(t1.to("mps"), t2.to("mps")).shape

Output:

/AppleInternal/Library/BuildRoots/48415f5a-4155-11ed-be84-7ef33c48bc85/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:705: failed assertion `[MPSTemporaryNDArray initWithDevice:descriptor:] Error: product of dimension sizes > 2**31'

But this Swift script (just changed the dimensions in @malfet's example) doesn't:

import MetalPerformanceShadersGraph

let graph = MPSGraph()
let x = graph.constant(1, shape: [32, 4096, 4096], dataType: .float32)
let y = graph.constant(1, shape: [32, 4096, 1], dataType: .float32)
let z = graph.matrixMultiplication(primary: x, secondary: y, name: nil)
let device = MTLCreateSystemDefaultDevice()!
let buf = device.makeBuffer(length: 16384)!
let td = MPSGraphTensorData(buf, shape: [64, 64], dataType: .int32)
let cmdBuf = MPSCommandBuffer(from: device.makeCommandQueue()!)
graph.encode(to: cmdBuf, feeds: [:], targetOperations: nil, resultsDictionary: [z:td], executionDescriptor: nil)
cmdBuf.commit()
  • The following diffusers code still crashes:
from diffusers import StableDiffusionPipeline

sdm = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    safety_checker=None,
).to("mps")

prompt = "A painting of a squirrel eating a burger"
num_samples = 2

images = sdm(prompt, num_images_per_prompt=num_samples).images
for i, image in enumerate(images):
    image.save(f"squirrel_{i}.png")
  • The previous diffusers code does not crash if num_samples is increased to 3, but the resulting images are corrupt.

@JacopoMangiavacchi

I'm having the same issue with the VToonify model from https://github.com/williamyang1991/VToonify

@pcuenca

pcuenca commented Dec 2, 2022

Gently pinging @kulinseth. Do we have confirmation of whether this lies on the MPS side or in PyTorch itself, given that the MPSGraph version worked fine when I checked?
#84039 (comment)

@kulinseth
Collaborator

@pcuenca Sorry for the delay in response. The underlying issue you linked in your comment occurs during creation of the NDArray buffer (in the MPS framework), which is shared between PyTorch and MPSGraph.
You are indeed right, it's coming from PyTorch. The difference in behavior comes from how PyTorch's torch.matmul layer is implemented (the generic implementation uses transposes and reshapes while building the graph).

I will take a look at how we can improve this layer, and I will update here with more details soon.

@pcuenca

pcuenca commented Jan 4, 2023

Hi @kulinseth, sorry for the ping :) It isn't urgent, but a broad time-frame estimation would be awesome here. Thanks a lot for your work!

@kulinseth
Collaborator

kulinseth commented Jan 6, 2023

@pcuenca broadly we are targeting a fix or a workaround for this issue in the 2.0 timeline.

@pcuenca

pcuenca commented Jan 6, 2023

Thanks, @kulinseth! Happy to test on any of the nightlies when it makes it there.

@tux-o-matic

@kulinseth, which version of Python is Apple testing the MPS backend with? I'm seeing many MPS-backed ML tasks work on Python 3.9 but nothing newer.

@Birch-san

Birch-san commented Jan 14, 2023

Can you give some examples? Because I've done stable diffusion inference and TI training just fine on Python 3.10, with mainline master branch, and have done inference just fine with kulinseth's master branch. On Ventura 13.1 public beta 4.

@tux-o-matic

@Birch-san, I'm asking because I've seen the same stable diffusion project and different Torch 2.0 nightlies run with MPS on Python 3.9 but throw the error from this issue when running with Python 3.10.
And like many, I'm hoping to see a nightly build of Torch on Python 3.11 for arm64.

@pcuenca

pcuenca commented Mar 8, 2023

@kulinseth My repro above works fine on macOS Ventura 13.3 beta, thanks! However, inference using diffusers with some model architectures still fails (but it works for others). For example, the following snippet fails with the error 'NDArray dimension length > INT_MAX':

from diffusers import StableDiffusionPipeline

sdm = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    safety_checker=None,
).to("mps")

prompt = "A painting of a squirrel eating a burger"
num_samples = 2

images = sdm(prompt, num_images_per_prompt=num_samples, num_inference_steps=20).images
for i, image in enumerate(images):
    image.save(f"squirrel_{i}.png")

As a workaround, enabling attention slicing in the snippet above works fine (sdm.enable_attention_slicing()). Using model stabilityai/stable-diffusion-2-1-base (different attention heads) works fine too.
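
For context, attention slicing avoids materializing the full (batch*heads, tokens, tokens) score tensor at once. A minimal sketch of the idea (a hypothetical helper, not diffusers' actual implementation):

import torch

def sliced_attention(q, k, v, slice_size=8):
    # Compute attention in slices over the batch*heads dimension so the large
    # (tokens x tokens) score matrix only exists for slice_size entries at a time.
    out = torch.empty_like(q)
    scale = q.shape[-1] ** -0.5
    for i in range(0, q.shape[0], slice_size):
        scores = torch.matmul(q[i:i + slice_size], k[i:i + slice_size].transpose(-2, -1)) * scale
        out[i:i + slice_size] = torch.matmul(scores.softmax(dim=-1), v[i:i + slice_size])
    return out

q = torch.rand((32, 4096, 40), device="mps")
k = torch.rand((32, 4096, 40), device="mps")
v = torch.rand((32, 4096, 40), device="mps")
print(sliced_attention(q, k, v).shape)   # torch.Size([32, 4096, 40])

Each slice's score matrix is only (slice_size, 4096, 4096), which stays well under the limit.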

I'll try to isolate a more specific reproduction.

@kulinseth
Collaborator

Thanks @pcuenca, that will help. Meanwhile we will look at the example you provided. There are indeed integer limits on dimension lengths for the underlying buffers currently.

@qqaatw
Collaborator

qqaatw commented Mar 26, 2023

@kulinseth My repro above works fine on macOS Ventura 13.3 beta, thanks! However, inference using diffusers with some model architectures still fails (but it works for others). […]

With the repro, the problem can be narrowed down to:

import torch

device = "mps"
for bsz in range(16):
    if bsz in [4]:  # only bsz == 4 fails.
        continue
    query = torch.randn((bsz, 8, 4096, 40), device=device)
    key = torch.randn((bsz, 8, 4096, 40), device=device)
    value = torch.randn((bsz, 8, 4096, 40), device=device)
    hidden_states = torch.nn.functional.scaled_dot_product_attention(
        query, key, value, dropout_p=0.0, is_causal=False
    )
    # or simply
    hidden_states = torch.matmul(query, key.transpose(-2, -1))
    # or
    hidden_states = torch.bmm(query.view(-1, 4096, 40),  key.transpose(-2, -1).view(-1, 40, 4096))

Since it only fails with bsz == 4, and the failing point is when runMPSGraph is executed in the bmm MPS implementation, I assume the graph gathering (Placeholder creation) has no issue. I suspect there is an optimized (or not optimized) path for matrix multiplication in the MPS library, triggered by some condition related to the matrix sizes, and that condition has a bug that lets a resulting matrix larger than 2**31 get onto that path.
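
For what it's worth (arithmetic only): bsz == 4 is exactly the case where the score tensor's float32 byte size lands on 2**31, matching the pattern elsewhere in this thread where exactly 2**31 bytes fails while larger sizes pass:

bsz, heads, tokens = 4, 8, 4096
score_bytes = bsz * heads * tokens * tokens * 4   # (4, 8, 4096, 4096) in float32
print(score_bytes == 2**31)                       # True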

@malfet
Contributor

malfet commented Jan 9, 2024

I can not reproduce any of the failures reported above while running on macOS Sonoma, which makes me hopeful some of the issues were addressed. cc: @kulinseth

@kulinseth
Collaborator


That's great @malfet, thanks for confirming. Can we add the above test to our test_mps when we enable the Sonoma runners?

@malfet
Contributor

malfet commented Jan 9, 2024


@kulinseth I already did, accidentally, while working on 64-bit index select (see #116942), and am now working on a PR that raises an exception if one tries to allocate a 4GB+ tensor on Ventura and changes the skip to xfail.

@kulinseth
Collaborator


Awesome, thanks. The PR for 64-bit looks good to me; we can close this issue with that.

@bghira

bghira commented Mar 25, 2024

#116942 is now closed

@rusnov

rusnov commented Dec 17, 2024

Is this issue still relevant? I am getting the following error with M4 Sequoia:
/AppleInternal/Library/BuildRoots/.../Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:850: failed assertion `[MPSNDArray initWithDevice:descriptor:isTextureBacked:] Error: total bytes of NDArray > 2**32'

The minimal code to produce this error:

import torch
device = torch.device("mps")
mask_bool = torch.triu(torch.ones(1024, 1024, device=device), diagonal=1).bool()
attn_scores = torch.rand(48, 25, 1024, 1024, device=device)
attn_scores.masked_fill_(mask_bool, 0)

Created Issue: #143477

@jhavukainen
Collaborator

The initial issue here was first addressed in commit 92f282c, with the supported range further expanded in commit afa313e. Note that it requires the user to be on macOS 15; otherwise the user only gets the error and a recommendation to update. This is because tiling the operation requires an API that was only released in macOS 15.0.

@rusnov the masked_fill_ issue you noted seems separate, although I'm pretty sure the root cause and solution will be similar. I'll continue that thread in the issue you filed. Thanks for bringing this up!

I'll close this issue as addressed. Please reopen if you feel this is inaccurate.
