add sana-sprint by yiyixuxu · Pull Request #11074 · huggingface/diffusers

yiyixuxu · 2025-03-17T01:18:49Z

# test sana sprint
"""
python scripts/convert_sana_to_diffusers.py \
  --orig_ckpt_path /raid/.cache/huggingface/yiyi/models--JunsongChen--Sana_Sprint_1600M_1024px/snapshots/8ecfdc7e6269e5b065f5b2cf3fac9a2a1a778c6a/Sana_Sprint_1600M_1024px_36K.pth \
  --model_type SanaSprint_1600M_P1_D20 \
  --image_size 1024 \
  --dump_path /raid/yiyi/Sana-Sprint-yiyi \
  --save_full_pipeline \
  --scheduler_type scm
"""

from diffusers import SanaSprintPipeline
import torch

device = "cuda:0"
dtype = torch.bfloat16

repo = "/raid/yiyi/Sana-Sprint-yiyi"

pipeline = SanaSprintPipeline.from_pretrained(repo, torch_dtype=dtype)
pipeline.to(device)


prompt = "a tiny astronaut hatching from an egg on the moon"

image = pipeline(prompt=prompt, num_inference_steps=2).images[0]
image.save("test_out.png")

Vibe tests with different timestep settings:

# test sana sprint
# (pipeline)
test_max_timesteps = [1.57080, 1.56830, 1.56580, 1.56454, 1.56246, 1.55830, 1.55413, 1.55080, 1.54580]
test_intermediate_timesteps = [None, 1.0, 1.1, 1.2, 1.3, 1.4]
test_num_inference_steps = [1,2,4]

# test_max_timesteps = [1.57080]
# test_intermediate_timesteps = [None]
# test_num_inference_steps = [1]

from diffusers import SanaSprintPipeline
import torch

device = "cuda:0"
dtype = torch.bfloat16
repo = "/raid/yiyi/Sana-Sprint-yiyi"

def run_sana(pipeline, num_inference_steps, max_timesteps, intermediate_timesteps):
    prompt = "a tiny astronaut hatching from an egg on the moon"
    generator = torch.Generator(device=device).manual_seed(123)
    test_name = f"num_inference_steps_{num_inference_steps}_max_timesteps_{max_timesteps}_intermediate_timesteps_{intermediate_timesteps}"
    print("--------------------------------")
    print("Running test:")
    print(f"num_inference_steps: {num_inference_steps}")
    print(f"max_timesteps: {max_timesteps}")
    print(f"intermediate_timesteps: {intermediate_timesteps}")
    try:
        image = pipeline(prompt=prompt, num_inference_steps=num_inference_steps, max_timesteps=max_timesteps, intermediate_timesteps=intermediate_timesteps, generator=generator).images[0]
        image.save(f"yiyi_test_10_1_out_{test_name}.png")
    except Exception as e:
        print(e)
    print("--------------------------------")


pipeline = SanaSprintPipeline.from_pretrained(repo, torch_dtype=dtype)
pipeline.to(device)

for num_inference_steps in test_num_inference_steps:
    for max_timesteps in test_max_timesteps:
        for intermediate_timesteps in test_intermediate_timesteps:
            run_sana(pipeline, num_inference_steps, max_timesteps, intermediate_timesteps)

HuggingFaceDocBuilderDev · 2025-03-17T01:25:08Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

ishan-modi · 2025-03-17T03:35:39Z

Nice work! Just a heads-up: this PR might have conflicts with #11040 if merged first.

lawrence-cj · 2025-03-17T04:03:28Z

Wonderful work. Since SANA-Sprint and SANA-1.5 follow the same architecture, this PR would make SANA-1.5 work as well.
@yiyixuxu @sayakpaul

* 1. update conversion script for sana1.5; 2. add conversion script for sana-sprint;
* separate sana and sana-sprint conversion scripts;
* update for upstream
* fix the } bug
* add a doc for SanaSprintPipeline;
* minor update;
* make style && make quality

yiyixuxu · 2025-03-20T10:48:09Z

@bot /style

yiyixuxu · 2025-03-20T10:48:24Z

@bot/ style

github-actions · 2025-03-20T10:49:05Z

Style fixes have been applied. View the workflow run here.

yiyixuxu · 2025-03-20T10:51:15Z

cc @lawrence-cj can you do a review?

a-r-r-o-w

Really amazing work! Can't wait for the release ❤️

a-r-r-o-w · 2025-03-20T12:51:05Z

+        >>> from diffusers import SanaPipeline
+
+        >>> pipe = SanaPipeline.from_pretrained(


Example to be updated to SanaSprintPipeline

hlky

Thanks @yiyixuxu

hlky · 2025-03-20T11:32:30Z

+            attention_mask = attn.prepare_attention_mask(attention_mask, sequence_length, batch_size)
+            # scaled_dot_product_attention expects attention_mask shape to be
+            # (batch, heads, source_length, target_length)
+            attention_mask = attention_mask.view(batch_size, attn.heads, -1, attention_mask.shape[-1])


In other recent models we found that an attention mask with shape [B, 1, 1, N] is faster, as the total size is smaller and PyTorch's broadcasting handles it. Something to look into; if we see a benefit, all occurrences of this code can be updated.
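A minimal sketch of the broadcasting point above, not the PR's code: it assumes a boolean key-padding mask (True means "attend") and made-up tensor sizes, and only checks that a [B, 1, 1, N] mask gives the same result as the fully expanded [B, heads, Q, N] mask while holding far fewer elements. It makes no claim about speed.

```python
import torch
import torch.nn.functional as F

# Illustrative sizes only (assumptions, not taken from the PR).
batch, heads, q_len, kv_len, head_dim = 2, 4, 8, 8, 16

torch.manual_seed(0)
query = torch.randn(batch, heads, q_len, head_dim)
key = torch.randn(batch, heads, kv_len, head_dim)
value = torch.randn(batch, heads, kv_len, head_dim)

# Boolean key-padding mask: True = attend, False = masked out.
keep = torch.ones(batch, kv_len, dtype=torch.bool)
keep[1, -3:] = False  # pretend the last 3 tokens of sample 1 are padding

# Compact mask: one row per sample, broadcast over heads and query positions.
compact_mask = keep[:, None, None, :]  # shape [B, 1, 1, N]

# Fully materialized mask with an explicit entry per head and query position.
full_mask = compact_mask.expand(batch, heads, q_len, kv_len).contiguous()

out_compact = F.scaled_dot_product_attention(query, key, value, attn_mask=compact_mask)
out_full = F.scaled_dot_product_attention(query, key, value, attn_mask=full_mask)

print(compact_mask.numel(), full_mask.numel())
print(torch.allclose(out_compact, out_full, atol=1e-6))
```

The compact mask stores `batch * kv_len` elements instead of `batch * heads * q_len * kv_len`, which is the size reduction the comment refers to.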

hlky · 2025-03-20T13:47:36Z

+            latents = latents.to(self.vae.dtype)
+            try:
+                image = self.vae.decode(latents / self.vae.config.scaling_factor, return_dict=False)[0]
+            except torch.cuda.OutOfMemoryError as e:


Nice!

For XPU we need to use torch.OutOfMemoryError; it looks like that will also work on CUDA.

https://github.com/pytorch/pytorch/blob/00a2c68f67adbd38847845016fd1ab9275cefbab/test/test_xpu.py#L446
https://github.com/pytorch/pytorch/blob/00a2c68f67adbd38847845016fd1ab9275cefbab/test/test_cuda.py#L3950
https://github.com/pytorch/pytorch/blob/00a2c68f67adbd38847845016fd1ab9275cefbab/test/test_cuda.py#L4154

+1 on this.
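A hedged sketch of the device-agnostic variant suggested above. The helper name `decode_with_oom_guard` and the `decode_fn` callable are hypothetical stand-ins for the VAE decode call, and the `getattr` fallback is an assumption for PyTorch builds that do not expose `torch.OutOfMemoryError`.

```python
import torch

# Prefer the device-agnostic exception; fall back to the CUDA-specific one
# on PyTorch builds that do not expose torch.OutOfMemoryError (assumption).
OOMError = getattr(torch, "OutOfMemoryError", torch.cuda.OutOfMemoryError)


def decode_with_oom_guard(decode_fn, latents):
    """Call decode_fn(latents); return None instead of raising on OOM.

    decode_fn is a hypothetical stand-in for the pipeline's VAE decode step.
    """
    try:
        return decode_fn(latents)
    except OOMError as error:
        print(f"Decoding ran out of memory: {error}")
        return None


def fake_decode_that_runs_out_of_memory(latents):
    # Simulate the failure so the sketch runs without a GPU.
    raise OOMError("simulated out-of-memory error")


latents = torch.zeros(1, 4, 8, 8)
print(decode_with_oom_guard(fake_decode_that_runs_out_of_memory, latents))
print(decode_with_oom_guard(lambda x: x * 2, latents).shape)
```

Errors other than out-of-memory still propagate, so the guard does not hide unrelated failures.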

lawrence-cj · 2025-03-21T09:43:04Z

@yiyixuxu
Comments here: #11131
Official model: https://huggingface.co/Efficient-Large-Model/Sana_Sprint_1.6B_1024px_diffusers

sayakpaul · 2025-03-21T10:00:33Z

+            try:
+                image = self.vae.decode(latents / self.vae.config.scaling_factor, return_dict=False)[0]
+            except torch.cuda.OutOfMemoryError as e:
+                warnings.warn(


Should this be logger.warning()?
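For context on the question above, a small sketch of how the two mechanisms differ. It uses the standard-library `logging` module as a stand-in for the library's own logger; the logger name, the message text, and the `ListHandler` helper are all hypothetical.

```python
import logging
import warnings

# Hypothetical logger name and message, for illustration only.
logger = logging.getLogger("sana_sprint_demo")
MESSAGE = "VAE decode ran out of memory"


class ListHandler(logging.Handler):
    """Collect formatted log messages in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
logger.addHandler(handler)
logger.propagate = False

# logger.warning() is filtered by the logger's level.
logger.setLevel(logging.WARNING)
logger.warning(MESSAGE)  # recorded
logger.setLevel(logging.ERROR)
logger.warning(MESSAGE)  # dropped: below the configured level

# warnings.warn() goes through the warnings filters instead.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.warn(MESSAGE)

print(handler.messages)
print([str(w.message) for w in caught])
```

With a logger, users can silence or surface the message through the verbosity level; with `warnings.warn`, visibility depends on the active warning filters.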

sayakpaul · 2025-03-21T10:04:38Z

+        else:
+            # max_timesteps=arctan(80/0.5)=1.56454 is the default from sCM paper, we choose a different value here
+            self.timesteps = torch.linspace(max_timesteps, 0, num_inference_steps + 1, device=device).float()
+        print(f"Set timesteps: {self.timesteps}")


Suggested change (remove the debug print):

print(f"Set timesteps: {self.timesteps}")
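The code comment in the hunk above quotes arctan(80/0.5) = 1.56454 as the sCM paper default. The sketch below recomputes that value and builds the same evenly spaced schedule in plain Python. Reading 80 as a maximum noise level and 0.5 as the data standard deviation is an assumption, as are the names `sigma_max`, `sigma_data`, and `linear_timesteps`; the list comprehension mirrors `torch.linspace(max_timesteps, 0, num_inference_steps + 1)` without needing torch.

```python
import math

# Assumed interpretation of the constants in arctan(80 / 0.5).
sigma_max = 80.0
sigma_data = 0.5

paper_default = math.atan(sigma_max / sigma_data)
upper_bound = math.pi / 2  # compare with 1.57080, the first value in the sweep


def linear_timesteps(max_timesteps, num_inference_steps):
    """Evenly spaced timesteps from max_timesteps down to 0 (inclusive).

    Returns num_inference_steps + 1 values, like the linspace call in the diff.
    """
    n = num_inference_steps
    return [max_timesteps * (n - i) / n for i in range(n + 1)]


print(f"{paper_default:.6f}", f"{upper_bound:.6f}")
print(linear_timesteps(paper_default, 2))
```

For a two-step run this gives three values: the maximum, its midpoint, and zero.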

first commit

c952370

lawrence-cj and others added 10 commits March 16, 2025 20:39

change name from SanaSCMPipeline to SanaSprintPipeline. (#11076)

9714187

add conversion sript

ae4c3fd

style

0d6309a

copies

5b19b22

pipeline_sana_scm -> pipeline_sana_sprint

4eef82b

remove unused __init__ arg for scm scheduler

398ca0c

up

8070495

up upp

be73b59

Merge branch 'sana-sprint' of github.com:huggingface/diffusers into s…

da3c917

…ana-sprint

yiyixuxu requested review from a-r-r-o-w and hlky March 20, 2025 10:47

Apply style fixes

9cd5f1e

a-r-r-o-w approved these changes Mar 20, 2025

hlky approved these changes Mar 20, 2025

sayakpaul and others added 7 commits March 20, 2025 06:38

[docs] add a note about max_timesteps (#11122)

8e4f711

add a note about max_timesteps

add to torctree

1de087e

Apply suggestions from code review

3734af8

Co-authored-by: Aryan <aryan@huggingface.co>

up

8c07fcc

update docstring example

eae8ed7

add tests

c4d049c

up

c3e107f

yiyixuxu commented Mar 21, 2025

Comment thread scripts/convert_sana_to_diffusers.py Outdated

yiyixuxu commented Mar 21, 2025

Comment thread scripts/convert_sana_to_diffusers.py Outdated

Apply suggestions from code review

94d87d5

sayakpaul reviewed Mar 21, 2025

This was referenced Mar 21, 2025

🎉Glad to announce SANA-Sprint is available! NVlabs/Sana#209

Open

Sana finetuned example NVlabs/Sana#74

Open

lawrence-cj and others added 2 commits March 21, 2025 05:57

[SANA-Sprint] remove used multi-scale bin (#11131)

a220997

* change sample prompt; * only 1024px is supported;

style

7a04604

yiyixuxu merged commit 8a63aa5 into main Mar 21, 2025

7 participants
