Qwen Image generating black images

#78
by rsshekhawat - opened

Why does this code generate black images with the Qwen-Image model?

import torch
from diffusers import QwenImagePipeline
from PIL import Image
from IPython.display import display

pipe = QwenImagePipeline.from_pretrained("Qwen/Qwen-Image", torch_dtype=torch.bfloat16)
pipe.enable_sequential_cpu_offload()
pipe.vae.enable_slicing()
pipe.vae.enable_tiling()
pipe.to(torch.float16)
prompt = "A cat holding a sign that says hello world"

image = pipe(
    prompt=prompt,
    num_inference_steps=4,
    true_cfg_scale=1.0,
    generator=torch.Generator(device="cuda").manual_seed(21),
).images[0]
display(image)
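
One likely culprit in the snippet above is the `pipe.to(torch.float16)` call, which casts a pipeline loaded in bfloat16 down to float16. float16 has a much narrower range (largest finite value 65504), so large intermediate activations can overflow to inf and then turn into NaN, and NaN pixels typically end up as 0 (black) when the output is converted to an 8-bit image. The sketch below only illustrates the range difference using the standard library; `fits_in_float16` and `to_bfloat16` are hypothetical helpers, the bfloat16 conversion is simulated by truncation, and the value 70000.0 is a made-up stand-in for a large activation, not something measured from Qwen-Image.

```python
import struct


def fits_in_float16(x: float) -> bool:
    """Return True if x can be packed as an IEEE half-precision (float16) value."""
    try:
        struct.pack("<e", x)  # "e" is the half-precision format code
        return True
    except OverflowError:
        return False


def to_bfloat16(x: float) -> float:
    """Simulate bfloat16 by keeping only the top 16 bits of the float32 encoding."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


activation = 70000.0  # hypothetical large intermediate value

print(fits_in_float16(activation))  # False: above the float16 maximum of 65504
print(to_bfloat16(activation))      # 69632.0: less precise, but still finite
```

If that is what is happening here, dropping the float16 cast and staying in bfloat16 would be the first thing to try.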

I have the same problem. I'm trying to use Qwen-Image with diffusers (and the sample code on the model card page) with an RTX 3060 with 12GB of VRAM. It runs with no errors but only outputs a black image.

Hey, actually I got it working now.

What did you have to change?

I think this code should work for you

import torch
from diffusers import QwenImagePipeline
from PIL import Image
from IPython.display import display

pipe = QwenImagePipeline.from_pretrained("Qwen/Qwen-Image", torch_dtype=torch.bfloat16)
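print("Loading LoRA weights...")
# Two Lightning LoRAs are loaded here (8-step, then 4-step). The call below
# uses 4 steps, so the 4-step file alone may be what is intended.
pipe.load_lora_weights("lightx2v/Qwen-Image-Lightning", weight_name="Qwen-Image-Lightning-8steps-V1.1.safetensors")
pipe.load_lora_weights("lightx2v/Qwen-Image-Lightning", weight_name="Qwen-Image-Lightning-4steps-V1.0-bf16.safetensors")
# Offload submodules to the CPU and move them to the GPU only when needed.
pipe.enable_sequential_cpu_offload()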
prompt = "A cat holding a sign that says hello world"

image = pipe(
    prompt=prompt,
    num_inference_steps=4,
    true_cfg_scale=1.0,
    generator=torch.Generator(device="cuda").manual_seed(21),
).images[0]
display(image)

May I ask what you use to train your LoRAs? Do you use a cloud GPU provider, or do you train them locally?

I don’t train LoRAs, so I’m not sure what you mean by that. I do everything locally on a dual-Xeon setup with 128GB of system RAM and four RTX 3060 GPUs with 12GB of VRAM each.

I’ll have to try out the code snippet you posted, but it’ll probably be a couple of days, as I just kicked off another batch inference job and my system is tied up with that right now.

The code posted above (the version that loads the Qwen-Image-Lightning LoRA weights) still generates black frames for me.
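
When the output still looks black, it may help to confirm that the image is literally all zeros (which usually points to NaNs somewhere upstream) and not just very dark. A minimal sketch, assuming the result is a PIL image: `is_all_black` is a hypothetical helper that takes the value returned by PIL's `Image.getextrema()`, which is one `(min, max)` pair for a single-band image and a tuple of such pairs for a multi-band image.

```python
def is_all_black(extrema) -> bool:
    """Return True if every band's maximum is 0, i.e. the image is entirely black."""
    # Single-band images give one (min, max) pair; wrap it so both cases look alike.
    if isinstance(extrema[0], (int, float)):
        extrema = (extrema,)
    return all(band_max == 0 for _, band_max in extrema)


# Example extrema values in the shape getextrema() returns for an RGB image.
print(is_all_black(((0, 0), (0, 0), (0, 0))))    # True
print(is_all_black(((0, 37), (0, 12), (0, 0))))  # False
```

With the pipeline output from the snippets in this thread, this would be called as `is_all_black(image.getextrema())`.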
