Commit 223e441

Add Anima Pipeline
Signed-off-by: akshatvishu <akshatnayak197@gmail.com>
1 parent 31886d6 commit 223e441

21 files changed

Lines changed: 2810 additions & 38 deletions

benchmarks/diffusion/README.md

Lines changed: 71 additions & 0 deletions
@@ -149,3 +149,74 @@ batch may still pay compile or CUDA-graph capture cost.
 
 For a Qwen-Image continuous-batching replay example, see
 [`performance_dashboard/qwen_image_serving_performance.md`](./performance_dashboard/qwen_image_serving_performance.md).
+
+## 4. Anima Native Single-File Benchmarking
+
+Native Anima is benchmarked as a text-to-image model through the same serving
+benchmark entrypoint. Unlike standard HuggingFace model IDs, Anima serves the
+raw single-file transformer checkpoint and loads non-denoiser components from a
+Diffusers-layout component directory.
+
+Download the official Anima checkpoint and components first. The commands below
+use `/path/to/models` as a placeholder; replace it with any local directory that
+has enough space for the checkpoint and component files.
+
+```bash
+mkdir -p /path/to/models/anima-official
+mkdir -p /path/to/models/anima-components
+
+hf download circlestone-labs/Anima \
+  split_files/diffusion_models/anima-base-v1.0.safetensors \
+  --local-dir /path/to/models/anima-official
+
+hf download circlestone-labs/Anima-Base-v1.0-Diffusers \
+  --local-dir /path/to/models/anima-components
+
+CHECKPOINT=/path/to/models/anima-official/split_files/diffusion_models/anima-base-v1.0.safetensors
+COMPONENTS=/path/to/models/anima-components
+```
+
+Run these commands from the vLLM-Omni repository in the Python environment or
+container where vLLM-Omni is installed.
+
+Start the server with the checkpoint as `--model` and pass the component
+directory through `--diffusers-load-kwargs`:
+
+```bash
+vllm serve "$CHECKPOINT" \
+  --omni \
+  --port 8099 \
+  --model-class-name AnimaPipeline \
+  --diffusers-load-kwargs "{\"components_path\":\"$COMPONENTS\"}"
+```
+
+Then run the standard diffusion serving benchmark:
+
+```bash
+python3 benchmarks/diffusion/diffusion_benchmark_serving.py \
+  --base-url http://localhost:8099 \
+  --endpoint /v1/chat/completions \
+  --model "$CHECKPOINT" \
+  --task t2i \
+  --dataset random \
+  --num-prompts 10 \
+  --max-concurrency 1 \
+  --warmup-requests 1 \
+  --warmup-concurrency 1 \
+  --width 1024 \
+  --height 1024 \
+  --num-inference-steps 50
+```
+
+This matches the Diffusers baseline defaults for Anima: 1024x1024, 50 denoising
+steps, `max_sequence_length=512`, one image per prompt, empty negative prompt,
+and CFG scale 4.0 from the default guider. Do not pass `guidance_scale` through
+the benchmark unless you are intentionally measuring a non-default CFG setting.
+
+Native Anima currently supports baseline single-GPU execution. Cache-DiT,
+TeaCache, CPU offload, layer-wise offload, quantization, TP/SP, CFG parallel,
+HSDP, and step execution are not supported by `AnimaPipeline` yet.
+
+Anima uses the default single diffusion stage for local single-file checkpoint
+discovery when `--model-class-name AnimaPipeline` is provided; no deploy config
+is required.
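
Note on the `vllm serve` command in this README hunk: the `--diffusers-load-kwargs` value is hand-escaped JSON inside a double-quoted shell string, which is easy to get wrong when the component path contains spaces or quotes. A minimal sketch of one way to avoid the manual escaping is to build the argument list in Python and let `json.dumps` produce the JSON. The helper name `build_serve_command` is hypothetical and not part of this commit; the flags are the ones shown in the README text.

```python
import json
import shlex


def build_serve_command(checkpoint: str, components: str, port: int = 8099) -> list[str]:
    """Return an argv list mirroring the `vllm serve` command from the README.

    `build_serve_command` is an illustrative helper, not a vLLM-Omni API.
    """
    # json.dumps handles quoting of the path, so no manual backslash escaping.
    load_kwargs = json.dumps({"components_path": components})
    return [
        "vllm", "serve", checkpoint,
        "--omni",
        "--port", str(port),
        "--model-class-name", "AnimaPipeline",
        "--diffusers-load-kwargs", load_kwargs,
    ]


if __name__ == "__main__":
    argv = build_serve_command(
        "/path/to/models/anima-official/split_files/diffusion_models/anima-base-v1.0.safetensors",
        "/path/to/models/anima-components",
    )
    # shlex.join prints a copy-pasteable, shell-quoted command line.
    print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv)` would bypass the shell entirely, so the JSON reaches the server argument parser unchanged.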

docs/models/supported_models.md

Lines changed: 1 addition & 0 deletions
@@ -85,5 +85,6 @@ th {
 | `MiniCPMO45OmniForConditionalGeneration` | MiniCPM-o 4.5 | `openbmb/MiniCPM-o-4_5` | ✅︎ | | ✅︎ | |
 | `ErnieImagePipeline` | ERNIE-Image | `baidu/ERNIE-Image`, `baidu/ERNIE-Image-Turbo` | ✅︎ | ✅︎ | ✅︎ | ✅︎ |
 |`HiDreamImagePipeline` | HiDream-I1-Full | `HiDream-ai/HiDream-I1-Full` | ✅︎ | ✅︎ | | |
+| `AnimaPipeline` | Anima | `circlestone-labs/Anima` | ✅︎ | ✅︎ | | |
 
 ✅︎ indicates the model is supported on that backend. Empty cells mean not listed as supported on that backend.

docs/user_guide/examples/online_serving/diffusers_pipeline_adapter.md

Lines changed: 57 additions & 3 deletions
@@ -39,9 +39,63 @@ vllm serve "stable-diffusion-v1-5/stable-diffusion-v1-5" \
   --diffusion-load-format diffusers
 ```
 
-Users turn on the diffusers backend primarily through `--diffusion-load-format diffusers` argument.
-There are two more optional arguments, `--diffusers-load-kwargs` and `--diffusers-call-kwargs`,
-which are only valid together with `--diffusion-load-format diffusers`.
+Users turn on the diffusers backend primarily through the `--diffusion-load-format diffusers` argument.
+
+### Single-File Checkpoints
+
+For single-file checkpoints (such as `.safetensors` or `.ckpt`), users can load them via the `--diffusion-load-format diffusers_single_file` argument (or simply point `--model` to a local single checkpoint file).
+
+If a Diffusers pipeline class is needed, specify it using `--model-class-name`:
+
+```bash
+vllm serve "/path/to/model.safetensors" \
+  --omni \
+  --diffusion-load-format diffusers_single_file \
+  --model-class-name SomeDiffusersPipeline
+```
+
+Using `--diffusion-load-format diffusers_single_file` explicitly bypasses standard directory-based config loading. This allows you to pass a Hugging Face Hub ID (e.g. `repo/model`) or URL as the `--model` argument to fetch single files remotely, provided the specified Diffusers pipeline supports remote loading.
+
+### Native Anima Single-File Checkpoints
+
+Anima single-file checkpoints are served through the native `AnimaPipeline`, not through `AnimaModularPipeline.from_single_file()`. If `--model-class-name AnimaModularPipeline` is passed for a local single-file checkpoint, vLLM-Omni maps it to `AnimaPipeline`.
+
+Use `--model-class-name AnimaPipeline`. The native path reads the Anima transformer single-file checkpoint directly, converts original Cosmos transformer keys when needed, and loads the Cosmos transformer and text conditioner into vLLM-Omni native modules.
+
+The native path also needs the non-denoiser components (`text_encoder`, `tokenizer`, `t5_tokenizer`, `vae`, and optionally `scheduler`). These must be in Diffusers `from_pretrained()` layout. Raw Anima auxiliary files such as `qwen_3_06b_base.safetensors` and `qwen_image_vae.safetensors` are converter inputs; they are not accepted directly as `components_path`.
+
+Use the Anima converter from the Diffusers reference implementation to prepare the component directory:
+
+```bash
+python /path/to/convert_anima_to_diffusers.py \
+  --transformer_ckpt_path /path/to/anima-base-v1.0.safetensors \
+  --text_encoder_ckpt_path /path/to/qwen_3_06b_base.safetensors \
+  --vae_ckpt_path /path/to/qwen_image_vae.safetensors \
+  --qwen_tokenizer_path /path/to/qwen-tokenizer \
+  --t5_tokenizer_path /path/to/t5-tokenizer \
+  --output_path /path/to/anima-components \
+  --save_pipeline
+```
+
+Then point `--model` at the raw Anima transformer checkpoint and `components_path` at the converted directory:
+
+```bash
+vllm serve "/path/to/anima.safetensors" \
+  --omni \
+  --model-class-name AnimaPipeline \
+  --diffusers-load-kwargs '{
+    "components_path": "/path/to/anima-components"
+  }'
+```
+
+No deploy config is required for local Anima single-file checkpoint discovery
+when `--model-class-name AnimaPipeline` is provided.
+
+Native Anima currently supports baseline single-GPU execution. Cache-DiT,
+TeaCache, CPU offload, layer-wise offload, quantization, TP/SP, CFG parallel,
+HSDP, and step execution are not supported by `AnimaPipeline` yet.
+
+There are two more optional arguments, `--diffusers-load-kwargs` and `--diffusers-call-kwargs`, which are valid together with `--diffusion-load-format diffusers` or `diffusers_single_file`. Native Anima also accepts `--diffusers-load-kwargs` for component paths such as `components_path`, but does not delegate denoising to Diffusers.
 
 After launching the model, users send a request as usual. Refer to other documentation pages on how to request a particular input/output modality, such as `examples/online_serving/text_to_image/openai_chat_client.py`.
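
Note on the component requirements described in this file: a pre-flight check before starting the server can catch a missing component early. The sketch below assumes that a Diffusers `from_pretrained()` layout stores each component in a subdirectory named after it (`text_encoder/`, `tokenizer/`, `t5_tokenizer/`, `vae/`, `scheduler/`); that directory convention and the helper `check_components_dir` are assumptions for illustration, not behavior confirmed by this commit.

```python
from pathlib import Path

# Component names taken from the documentation text in this diff.
REQUIRED = ("text_encoder", "tokenizer", "t5_tokenizer", "vae")
OPTIONAL = ("scheduler",)


def check_components_dir(components_path: str) -> dict[str, list[str]]:
    """Report which expected component subdirectories are absent.

    Assumes one subdirectory per component; hypothetical helper only.
    """
    root = Path(components_path)
    missing = [name for name in REQUIRED if not (root / name).is_dir()]
    absent_optional = [name for name in OPTIONAL if not (root / name).is_dir()]
    return {"missing": missing, "absent_optional": absent_optional}


if __name__ == "__main__":
    report = check_components_dir("/path/to/anima-components")
    if report["missing"]:
        print("Missing required components:", ", ".join(report["missing"]))
    else:
        print("All required component directories are present.")
```

A non-empty `missing` list most likely means `components_path` points at raw Anima auxiliary files rather than the converted directory.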

examples/offline_inference/text_to_image/README.md

Lines changed: 17 additions & 0 deletions
@@ -196,6 +196,23 @@ python examples/offline_inference/text_to_image/text_to_image.py \
   --auxiliary-text-encoder meta-llama/Meta-Llama-3.1-8B-Instruct \
   --output /output.png
 ```
+### Anima Single-File Checkpoints
+
+To load Anima, point `--model` to the single-file checkpoint path, pass the native pipeline class name using `--model-class-name`, and supply the converted components directory using `--diffusers-load-kwargs`:
+
+```bash
+python examples/offline_inference/text_to_image/text_to_image.py \
+  --model /path/to/models/anima-official/split_files/diffusion_models/anima-base-v1.0.safetensors \
+  --model-class-name AnimaPipeline \
+  --diffusers-load-kwargs '{"components_path": "/path/to/models/anima-components"}' \
+  --prompt "A cinematic close-up of a glass teapot on a wooden table." \
+  --seed 42 \
+  --guidance-scale 4.0 \
+  --num-inference-steps 50 \
+  --height 1024 \
+  --width 1024 \
+  --output anima_output.png
+```
 
 ### Batch Requests (Multiple Prompts)

examples/offline_inference/text_to_image/text_to_image.py

Lines changed: 31 additions & 1 deletion
@@ -40,6 +40,16 @@ def parse_profiler_config(value: str) -> dict[str, Any]:
     return config
 
 
+def parse_json_dict(value: str) -> dict[str, Any]:
+    try:
+        config = json.loads(value)
+    except json.JSONDecodeError as e:
+        raise argparse.ArgumentTypeError(f"Must be a valid JSON object: {e}") from e
+    if not isinstance(config, dict):
+        raise argparse.ArgumentTypeError("Must be a JSON object (dict)")
+    return config
+
+
 def parse_args() -> argparse.Namespace:
     parser = argparse.ArgumentParser(description="Generate an image with supported diffusion models.")
     parser.add_argument(
@@ -306,6 +316,18 @@ def parse_args() -> argparse.Namespace:
         default=None,
         help="Supplementary auxiliary text encoder parameters model name or path (especially for Hidream-l1-full).",
     )
+    parser.add_argument(
+        "--model-class-name",
+        type=str,
+        default=None,
+        help="Override the diffusion pipeline class name (e.g. AnimaPipeline).",
+    )
+    parser.add_argument(
+        "--diffusers-load-kwargs",
+        type=parse_json_dict,
+        default=None,
+        help='JSON object passed to model loader (e.g. \'{"components_path": "/path"}\').',
+    )
     current_omni_platform.pre_register_and_update(parser)
     return parser.parse_args()
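
Note on the two new flags: the sketch below reproduces `parse_json_dict` from this diff in a standalone parser to show how `argparse` applies it as a `type=` converter. `build_parser` is a hypothetical helper that registers only the two new options; the real `parse_args()` registers many more.

```python
import argparse
import json
from typing import Any


def parse_json_dict(value: str) -> dict[str, Any]:
    """Copied from the diff: accept only a JSON object."""
    try:
        config = json.loads(value)
    except json.JSONDecodeError as e:
        raise argparse.ArgumentTypeError(f"Must be a valid JSON object: {e}") from e
    if not isinstance(config, dict):
        raise argparse.ArgumentTypeError("Must be a JSON object (dict)")
    return config


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical reduced parser with just the two options added here."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model-class-name", type=str, default=None)
    parser.add_argument("--diffusers-load-kwargs", type=parse_json_dict, default=None)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        [
            "--model-class-name", "AnimaPipeline",
            "--diffusers-load-kwargs", '{"components_path": "/path/to/anima-components"}',
        ]
    )
    # argparse stores the already-decoded dict, not the raw string.
    print(args.model_class_name, args.diffusers_load_kwargs)
```

When the converter raises `ArgumentTypeError`, `argparse` reports a usage error and exits, so a JSON array or malformed string is rejected before any model loading starts.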

@@ -401,9 +423,13 @@ def main():
     }
     if args.stage_configs_path:
         omni_kwargs["stage_configs_path"] = args.stage_configs_path
-    if use_nextstep:
+    if args.model_class_name:
+        omni_kwargs["model_class_name"] = args.model_class_name
+    elif use_nextstep:
         # NextStep-1.1 requires explicit pipeline class
         omni_kwargs["model_class_name"] = "NextStep11Pipeline"
+    if args.diffusers_load_kwargs is not None:
+        omni_kwargs["diffusers_load_kwargs"] = args.diffusers_load_kwargs
     omni = Omni(**omni_kwargs)
 
     if profiler_enabled:
@@ -432,6 +458,10 @@ def main():
         print(f" LoRA: scale={args.lora_scale}")
     if args.stage_configs_path:
         print(f" stage-configs-path: {args.stage_configs_path}")
+    if args.model_class_name:
+        print(f" Model class name: {args.model_class_name}")
+    if args.diffusers_load_kwargs is not None:
+        print(f" Diffusers load kwargs: {args.diffusers_load_kwargs}")
     print(f"{'=' * 60}\n")
 
     # Build LoRA request when --lora-path is set
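
Note on the `main()` change in this file: an explicit `--model-class-name` now takes precedence over the NextStep default, and `--diffusers-load-kwargs` is forwarded whenever it is not `None`. The sketch below isolates that branching in a pure function so the precedence is easy to see; `resolve_omni_kwargs` is a hypothetical name, and the real script builds `omni_kwargs` inline alongside other keys.

```python
from __future__ import annotations

from typing import Any


def resolve_omni_kwargs(
    model_class_name: str | None,
    use_nextstep: bool,
    diffusers_load_kwargs: dict[str, Any] | None,
) -> dict[str, Any]:
    """Mirror the branching added to main(); illustrative helper only."""
    kwargs: dict[str, Any] = {}
    if model_class_name:
        # An explicit override wins, even for NextStep models.
        kwargs["model_class_name"] = model_class_name
    elif use_nextstep:
        kwargs["model_class_name"] = "NextStep11Pipeline"
    if diffusers_load_kwargs is not None:
        # `is not None` means an empty dict is still forwarded.
        kwargs["diffusers_load_kwargs"] = diffusers_load_kwargs
    return kwargs


if __name__ == "__main__":
    print(resolve_omni_kwargs("AnimaPipeline", False, {"components_path": "/path/to/anima-components"}))
    print(resolve_omni_kwargs(None, True, None))
```

One consequence of the `if`/`elif` ordering is that passing `--model-class-name` together with a NextStep model replaces the `NextStep11Pipeline` default instead of being ignored.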

examples/online_serving/diffusers_pipeline_adapter/README.md

Lines changed: 25 additions & 1 deletion
@@ -38,7 +38,31 @@ vllm serve "stable-diffusion-v1-5/stable-diffusion-v1-5" \
 
 Users turn on the diffusers backend primarily through `--diffusion-load-format diffusers` argument.
 There are two more optional arguments, `--diffusers-load-kwargs` and `--diffusers-call-kwargs`,
-which are only valid together with `--diffusion-load-format diffusers`.
+which are valid together with `--diffusion-load-format diffusers` or `diffusers_single_file`.
+Native Anima also accepts `--diffusers-load-kwargs` for component paths such as `components_path`,
+but does not delegate denoising to Diffusers.
+
+### Native Anima Single-File Checkpoints
+
+Anima single-file checkpoints are served through the native `AnimaPipeline`, not through
+`AnimaModularPipeline.from_single_file()`. If `--model-class-name AnimaModularPipeline`
+is passed for a local single-file checkpoint, vLLM-Omni maps it to `AnimaPipeline`.
+
+```bash
+vllm serve "/path/to/anima-base-v1.0.safetensors" \
+  --omni \
+  --model-class-name AnimaPipeline \
+  --diffusers-load-kwargs '{"components_path": "/path/to/anima-components"}'
+```
+
+No deploy config is required for local Anima single-file checkpoint discovery
+when `--model-class-name AnimaPipeline` is provided.
+
+The native path needs the non-denoiser components (`text_encoder`, `tokenizer`,
+`t5_tokenizer`, `vae`, and optionally `scheduler`) in Diffusers `from_pretrained()`
+layout. Native Anima currently supports baseline single-GPU execution.
+Cache-DiT, TeaCache, CPU offload, layer-wise offload, quantization, TP/SP, CFG
+parallel, HSDP, and step execution are not supported by `AnimaPipeline` yet.
 
 After launching the model, users send a request as usual. Refer to other documentation pages on how to request a particular input/output modality, such as `examples/online_serving/text_to_image/openai_chat_client.py`.
