[Feature request] LTX-Video v0.9.6 15x faster inference than non-distilled model. #11359
Comments
```python
import torch
from diffusers import (
    LTXConditionPipeline,
    LTXVideoTransformer3DModel,
    FlowMatchEulerDiscreteScheduler,
)
from diffusers.utils import export_to_video

transformer = LTXVideoTransformer3DModel.from_pretrained(
    "multimodalart/ltxv-2b-0.9.6-distilled",
    subfolder="transformer",
    torch_dtype=torch.bfloat16,
    variant="bf16",
)
scheduler = FlowMatchEulerDiscreteScheduler.from_pretrained(
    "multimodalart/ltxv-2b-0.9.6-distilled",
    subfolder="scheduler",
)
pipe = LTXConditionPipeline.from_pretrained(
    "Lightricks/LTX-Video-0.9.5",
    transformer=transformer,
    scheduler=scheduler,  # add or remove the scheduler to see the difference
    torch_dtype=torch.bfloat16,
)
pipe.enable_sequential_cpu_offload()
# pipe.enable_model_cpu_offload()

prompt = "A woman eating a burger"
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"
generator = torch.Generator(device="cuda").manual_seed(42)

video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=1216,
    height=704,
    num_frames=121,
    num_inference_steps=8,
    guidance_scale=1,
    generator=generator,
).frames[0]
export_to_video(video, "distilled_scheduler.mp4", fps=24)
```

distilled_scheduler.mp4

BTW: this model is insane, on 8 GB of VRAM.
Removed the scheduler: distilled_scheduler1.mp4
Using LTXPipeline with FlowMatchEulerDiscreteScheduler: distilled_scheduler3.mp4
Using LTXPipeline with the default scheduler: distilled_scheduler4.mp4
Hello @yiyixuxu, could this be considered if it doesn't require a lot of changes, please?
No longer needed. The distilled model works fine with LTXPipeline (do not use the conditioning pipeline). Refer to the discussion here
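A minimal, untested sketch of the working setup described above: the same checkpoints and settings as the snippet earlier in this thread, but loading the distilled transformer into LTXPipeline instead of LTXConditionPipeline. The repo ids and generation parameters are taken from that snippet; nothing here is official guidance from Lightricks.

```python
import torch
from diffusers import LTXPipeline, LTXVideoTransformer3DModel
from diffusers.utils import export_to_video

# Distilled transformer dropped into the 0.9.5 base pipeline, as in the
# snippet earlier in this thread.
transformer = LTXVideoTransformer3DModel.from_pretrained(
    "multimodalart/ltxv-2b-0.9.6-distilled",
    subfolder="transformer",
    torch_dtype=torch.bfloat16,
    variant="bf16",
)
pipe = LTXPipeline.from_pretrained(
    "Lightricks/LTX-Video-0.9.5",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()

video = pipe(
    prompt="A woman eating a burger",
    negative_prompt="worst quality, inconsistent motion, blurry, jittery, distorted",
    width=1216,
    height=704,
    num_frames=121,
    num_inference_steps=8,  # distilled model: 8 (recommended), 4, 2 or 1 steps
    guidance_scale=1,       # distilled model needs no classifier-free guidance
    generator=torch.Generator(device="cuda").manual_seed(42),
).frames[0]
export_to_video(video, "ltxpipeline_distilled.mp4", fps=24)
```

This requires a CUDA GPU and the model weights, so treat it as a starting point rather than a verified recipe.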
Is your feature request related to a problem? Please describe.
No problem. This request is low priority; address it as and when time allows.
Describe the solution you'd like.
Please support the new release of LTX-Video 0.9.6
Describe alternatives you've considered.
The original repo has support, but it is easier to use with diffusers.
Additional context.
April 15th, 2025: new checkpoints v0.9.6:
Release a new checkpoint ltxv-2b-0.9.6-dev-04-25 with improved quality
Release a new distilled model ltxv-2b-0.9.6-distilled-04-25
15x faster inference than non-distilled model.
Does not require classifier-free guidance and spatio-temporal guidance.
Supports sampling with 8 (recommended), 4, 2 or 1 diffusion steps.
Improved prompt adherence, motion quality and fine details.
New default resolution and FPS: 1216 × 704 pixels at 30 FPS
Still real time on H100 with the distilled model.
Other resolutions and FPS are still supported.
Support stochastic inference (can improve visual quality when using the distilled model)
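As an aside on the new default resolution: the snippet earlier in this thread uses 1216 × 704 at 121 frames, and as I understand the diffusers LTX pipelines, width and height should be multiples of 32 and the frame count of the form 8·k + 1 (121 = 8·15 + 1). A small hypothetical helper (`snap_ltx_dims` is my name, not part of diffusers) that rounds an arbitrary request to valid sizes:

```python
def snap_ltx_dims(width: int, height: int, num_frames: int) -> tuple[int, int, int]:
    """Hypothetical helper: round a requested video size to values the
    LTX pipelines accept -- width/height to multiples of 32, frame count
    to the nearest 8*k + 1 (assumption based on the thread's settings)."""
    def snap(value: int, multiple: int) -> int:
        return max(multiple, round(value / multiple) * multiple)

    frames = max(1, round((num_frames - 1) / 8) * 8 + 1)
    return snap(width, 32), snap(height, 32), frames

print(snap_ltx_dims(1216, 704, 121))  # already valid -> (1216, 704, 121)
```

Feeding the snapped values into the pipeline call avoids shape errors when experimenting with non-default resolutions.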
https://github.com/Lightricks/LTX-Video
Feedback on the distilled model:
https://www.reddit.com/r/StableDiffusion/comments/1k1xk1m/6_seconds_video_in_60_seconds_in_this_quality_is/
https://www.reddit.com/r/StableDiffusion/comments/1k1o4x8/the_new_ltxvideo_096_distilled_model_is_actually/