[LTXPipeline] Update latents dtype to match VAE dtype #11533


Conversation

james-p-xu
Contributor

What does this PR do?

The LTX Video HuggingFace Documentation states:

Note: The recommended dtype is for the transformer component. The VAE and text encoders can be either torch.float32, torch.bfloat16 or torch.float16 but the recommended dtype is torch.bfloat16 as used in the original repository.

This implies that the VAE dtype can differ from the rest of the pipeline, e.g. the VAE could be in fp32 while the transformer is in fp16.

However, when running with an fp32 VAE and an fp16 transformer, I hit the following error:

RuntimeError: Input type (c10::Half) and bias type (float) should be the same
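For context, here is a minimal sketch of how such a mixed-dtype setup arises (assuming the public Lightricks/LTX-Video checkpoint; the prompt and frame count are illustrative):

```python
import torch
from diffusers import LTXPipeline

# Load with fp16 for the transformer, per the recommended-dtype note...
pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.float16)
# ...but keep the VAE in fp32, which the docs say is allowed.
pipe.vae.to(torch.float32)
pipe.to("cuda")  # moves devices only; dtypes stay mixed

# Without the fix, decoding the fp16 latents with the fp32 VAE raises:
# RuntimeError: Input type (c10::Half) and bias type (float) should be the same
video = pipe(prompt="a ship sailing at sunset", num_frames=9).frames[0]
```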

I believe the proper fix is to ensure that the latents tensor (the conv input) has the same dtype as the VAE's parameters (the bias in the error message). We can ensure this by casting the latents tensor to the VAE dtype right before the VAE decoder is run.
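A minimal sketch of the idea (`decode_latents` is an illustrative helper, not the exact diff in this PR):

```python
import torch

def decode_latents(vae, latents):
    # Cast the latents to the VAE's parameter dtype so the decoder's
    # conv inputs match its weights/bias, avoiding the Half/float mismatch.
    latents = latents.to(vae.dtype)
    with torch.no_grad():
        return vae.decode(latents, return_dict=False)[0]
```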

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@james-p-xu
Contributor Author

cc: @a-r-r-o-w, not able to request you as a reviewer. Happy to chat about any changes / other ideas you might have here.

@a-r-r-o-w
Member


Thanks, makes sense!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@a-r-r-o-w a-r-r-o-w merged commit 3c0a012 into huggingface:main May 9, 2025
12 checks passed
@james-p-xu james-p-xu deleted the jamxu/update-ltx-pipeline-latents-dtype branch May 9, 2025 16:54