[LTXPipeline] Update latents dtype to match VAE dtype #11533


Conversation

james-p-xu
Contributor

What does this PR do?

The LTX Video HuggingFace Documentation states:

Note: The recommended dtype is for the transformer component. The VAE and text encoders can be either torch.float32, torch.bfloat16 or torch.float16 but the recommended dtype is torch.bfloat16 as used in the original repository.

This implies that the VAE dtype can differ from the rest of the pipeline, e.g. the VAE could be in fp32 while the transformer is in fp16.

However, when running with an fp32 VAE and an fp16 transformer, I hit the following error:

RuntimeError: Input type (c10::Half) and bias type (float) should be the same
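For context, here is a minimal sketch of how such a mixed-dtype setup arises (assuming the public Lightricks/LTX-Video checkpoint; the prompt and frame count are illustrative):

```python
import torch
from diffusers import LTXPipeline

# Load with fp16 for the transformer, per the recommended-dtype note...
pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.float16)
# ...but keep the VAE in fp32, which the docs say is allowed.
pipe.vae.to(torch.float32)
pipe.to("cuda")  # moves devices only; dtypes stay mixed

# Without the fix, decoding the fp16 latents with the fp32 VAE raises:
# RuntimeError: Input type (c10::Half) and bias type (float) should be the same
video = pipe(prompt="a ship sailing at sunset", num_frames=9).frames[0]
```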

I believe the proper fix is to ensure that the latents tensor (the conv input) has the same dtype as the VAE's parameters (the bias in the error message). We can ensure this by casting the latents tensor to the VAE dtype right before the VAE decoder is run.
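A minimal sketch of the idea (`decode_latents` is an illustrative helper, not the exact diff in this PR):

```python
import torch

def decode_latents(vae, latents):
    # Cast the latents to the VAE's parameter dtype so the decoder's
    # conv inputs match its weights/bias, avoiding the Half/float mismatch.
    latents = latents.to(vae.dtype)
    with torch.no_grad():
        return vae.decode(latents, return_dict=False)[0]
```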

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@james-p-xu
Contributor Author

cc: @a-r-r-o-w, not able to request you as a reviewer. Happy to chat about any changes / other ideas you might have here.

@a-r-r-o-w
Member


Thanks, makes sense!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@a-r-r-o-w a-r-r-o-w merged commit 3c0a012 into huggingface:main May 9, 2025
12 checks passed
@james-p-xu james-p-xu deleted the jamxu/update-ltx-pipeline-latents-dtype branch May 9, 2025 16:54