
[New Pipeline]: SmoothCache: A Universal Inference Acceleration Technique for Diffusion Transformers #11135

Open
@ziyu-guo

Description

Model/Pipeline/Scheduler description

Repo: https://github.com/Roblox/SmoothCache
Paper: https://huggingface.co/papers/2411.10510

This is a training-free acceleration technique for DiT pipelines that controls the caching behavior of individual components and works across different pipelines and modalities.

There is a non-intrusive helper class implemented for plug-and-play integration with the Diffusers DiT pipeline; no changes inside Diffusers are needed for this to work.
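
For intuition, here is a minimal sketch of the underlying idea (illustrative only; `CachedBlock` and the schedule layout are made up for this example, not the actual SmoothCache implementation): layer outputs in a DiT change slowly across many adjacent denoising steps, so a block can reuse its cached output instead of recomputing, driven by a precomputed per-step schedule.

```python
# Illustrative sketch of the caching idea only; CachedBlock and the schedule
# layout are hypothetical, not part of the SmoothCache API.
import torch.nn as nn

class CachedBlock(nn.Module):
    def __init__(self, block, schedule):
        super().__init__()
        self.block = block        # expensive sub-block, e.g. attention or MLP
        self.schedule = schedule  # one 0/1 flag per denoising step; 1 = recompute
        self.step = 0
        self.cache = None

    def forward(self, x, *args, **kwargs):
        if self.cache is None or self.schedule[self.step]:
            # Recompute and cache when the schedule says this step matters
            self.cache = self.block(x, *args, **kwargs)
        self.step += 1
        # Otherwise reuse the previous step's output, skipping the computation
        return self.cache
```

The actual schedules are derived from layer-wise representation errors measured on a small calibration set, as described in the paper.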

We're happy to add SmoothCache in the form of a doc-only PR to benefit a broader user base.

Open source status

  • The model implementation is available.
  • The model weights are available (only relevant if the addition is not a scheduler).

Provide useful links for the implementation

No response

Activity

ukaprch commented on Mar 23, 2025

I ran into a problem with pip install dit-smoothcache: it didn't copy over all the components.

Once I got that settled, I had to comment out the following from `__init__.py`:

```python
'''
# Try to import DiTCacheHelper
try:
    from .dit_cache_helper import DiTCacheHelper
    __all__.append('DiTCacheHelper')
except ImportError:
    print("Warning: DiTCacheHelper not imported. Ensure necessary dependencies are installed.")
'''
```

I kept getting this error when importing DiTCacheHelper:

```python
try:
    # Assuming DiTBlock is defined in 'models/dit.py' in the DiT repository
    from models import DiTBlock
except ImportError:
    print("Warning: DiT library is not accessible. DiTCacheHelper cannot be used.")
    DiTBlock = None
```

which I assume needs to be investigated or implemented for this to work?

Then, after the above, I ran your included code in my FluxPipeline inference:

```python
# Import SmoothCacheHelper
from SmoothCache import DiffuserCacheHelper
import json

# Initialize the DiffuserCacheHelper with the model
with open("./smoothcache_schedules/diffuser_schedule.json", "r") as f:
    schedule = json.load(f)
cache_helper = DiffuserCacheHelper(pipe.transformer, schedule=schedule)

# Enable the caching helper
cache_helper.enable()
```

But I didn't see any performance improvement. Is it possible this is due to my transformer being quantized?

ziyu-guo (Author) commented on Mar 24, 2025

Hi @ukaprch, thank you for trying it out. We have implemented inference helpers for the original DiT and the Diffusers DiT pipeline. The optional import warnings check whether either of these dependencies is present.

Assuming you have the latest Diffusers package installed, you can start your experiment with the following code:

```python
import json
import torch
from diffusers import DiTPipeline, DPMSolverMultistepScheduler

# Import SmoothCacheHelper
from SmoothCache import DiffuserCacheHelper

# Load the DiT pipeline and scheduler
pipe = DiTPipeline.from_pretrained("facebook/DiT-XL-2-256", torch_dtype=torch.float16)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

# Initialize the DiffuserCacheHelper with the model
with open("smoothcache_schedules/50-N-3-threshold-0.35.json", "r") as f:
    schedule = json.load(f)
cache_helper = DiffuserCacheHelper(pipe.transformer, schedule=schedule)

# Enable the caching helper
cache_helper.enable()

# Prepare the input
words = ["Labrador retriever"]
class_ids = pipe.get_label_ids(words)

# Generate images with the pipeline
generator = torch.manual_seed(33)
image = pipe(class_labels=class_ids, num_inference_steps=50, generator=generator).images[0]

# Restore the original forward method and disable the helper
# disable() should be paired up with enable()
cache_helper.disable()
```
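
If you want that pairing to hold even when generation raises, a plain try/finally works (this is ordinary Python, not a SmoothCache feature):

```python
# Ensure disable() always runs, even if generation fails midway.
cache_helper.enable()
try:
    image = pipe(class_labels=class_ids, num_inference_steps=50,
                 generator=generator).images[0]
finally:
    cache_helper.disable()  # restores the original forward methods
```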

We've also provided a set of example caching schedules for 30/50/70 timesteps in the smoothcache_schedules folder of the repo.
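
The schedules are plain JSON. As a rough illustration of the expected shape (the component key "attn1" and the exact layout are assumptions here; the shipped files are authoritative), a 50-step schedule with one 0/1 flag per inference step might be built like this:

```python
import json

# Assumed shape: component name -> one 0/1 flag per inference step,
# where 1 = recompute the component and 0 = reuse its cached output.
# The key "attn1" and this layout are illustrative assumptions.
schedule = {"attn1": ([1, 0, 0] * 17)[:50]}  # e.g. recompute every 3rd of 50 steps
with open("my_schedule.json", "w") as f:
    json.dump(schedule, f)
print(len(schedule["attn1"]))  # must match num_inference_steps (here 50)
```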

If you wish to adapt the SmoothCacheHelper class to a new pipeline, you can follow the code example for DiffuserCacheHelper.
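
As a very rough sketch of what such an adaptation could look like (the SmoothCacheHelper import path and its constructor arguments below are assumptions modeled on the DiffuserCacheHelper pattern, and FluxCacheHelper is a hypothetical name):

```python
# Hypothetical sketch; the base-class signature (block_classes,
# components_to_wrap) is assumed here, not documented API. Check
# DiffuserCacheHelper in the SmoothCache repo for the real pattern.
from SmoothCache import SmoothCacheHelper  # assumed import path

class FluxCacheHelper(SmoothCacheHelper):
    def __init__(self, model, schedule):
        super().__init__(
            model=model,
            # the repeated transformer block class to search for in the model
            block_classes=type(model.transformer_blocks[0]),
            # named sub-modules of each block whose outputs get cached
            components_to_wrap=["attn"],
            schedule=schedule,
        )
```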

Let us know if this answers your questions.

ziyu-guo (Author) commented on Apr 15, 2025

A quick follow-up: we're working on extending SmoothCache to make it stackable on top of FluxPipeline. Please stay tuned.

github-actions (Contributor) commented on May 10, 2025

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
