Releases: vipshop/cache-dit
v1.0.15
What's Changed
- feat: support cache & tp for wan vace by @DefTruth in #406
- feat: support mochi-1-preview Tensor Parallelism by @gameofdimension in #408
- chore: Update README.md by @DefTruth in #409
- feat: support HunyuanDiT Tensor Parallelism by @gameofdimension in #411
- bugfix: fix summary stats from dict by @DefTruth in #412
- bugfix: fix strify error while no-cache by @DefTruth in #414
- feat: support wan vace context parallel by @DefTruth in #415
- chore: Update README.md by @DefTruth in #416
- feat: support Wan2.1-VACE Tensor Parallelism by @gameofdimension in #417
- misc: use dummy blocks for flux by default by @DefTruth in #418
Full Changelog: v1.0.14...v1.0.15
v1.0.14
v1.0.13
v1.0.12
What's Changed
- chore: update support matrix by @DefTruth in #383
- fix hyvideo cp monkey patch by @DefTruth in #384
- chore: Update README.md by @DefTruth in #385
- chore: update bench configs by @DefTruth in #386
- parallel: auto select backend if needed by @DefTruth in #387
- feat: support cogvideox context parallel by @DefTruth in #388
- feat: support Tensor Parallelism for Chroma by @gameofdimension in #389
- feat: support cogview4 & consisid cp by @DefTruth in #391
- chore: Update CP/TP docs by @DefTruth in #392
- feat: support chroma-hd cp by @DefTruth in #393
- feat: support Kandinsky5 Tensor Parallelism by @gameofdimension in #394
- feat: fully support nunchaku flux cp by @DefTruth in #396
- feat: fully support nunchaku qwen-image cp by @DefTruth in #397
- hotfix for qwen-image-lightning nunchaku + cp by @DefTruth in #398
Full Changelog: v1.0.11...v1.0.12
v1.0.11
What's Changed
- chore: update parallelism docs by @DefTruth in #346
- chore: add params modifier docs by @DefTruth in #347
- chore: Update cache-dit desc by @DefTruth in #348
- chore: Update cache-dit desc by @DefTruth in #349
- chore: Update DBCache docs by @DefTruth in #350
- refactor: move cache_factory to caching by @DefTruth in #351
- fix example utils key error by @DefTruth in #352
- feat: add wan2.2 cp example by @DefTruth in #354
- feat: add ltxvideo cp example by @DefTruth in #357
- bugfix: fix ltxvideo context parallel noise result by @DefTruth in #361
- chore: Update README.md by @DefTruth in #362
- chore: Update README.md by @DefTruth in #363
- chore: Update README.md by @DefTruth in #364
- chore: Update README.md by @DefTruth in #365
- chore: Update README.md by @DefTruth in #366
- feat: make cache compatible with block-level cp by @DefTruth in #369
- feat: use template cp for native attn by @DefTruth in #371
- feat: add Qwen-Image-Lightning CP/TP example by @DefTruth in #372
- feat: support TP for hunyuanimage2.1/video by @gameofdimension in #373
- chore: Update Tensor Parallelism docs by @DefTruth in #376
- chore: Update examples by @DefTruth in #377
- misc: add custom attn dispatch env by @DefTruth in #378
- feat: add diffusers' hyimage21 adapter by @DefTruth in #379
- feat: support hyimage21 context parallel by @DefTruth in #381
- feat: support hyvideo context parallel by @DefTruth in #382
Full Changelog: v1.0.10...v1.0.11
v1.0.10
What's Changed
- feat: mark check forward pattern default false by @DefTruth in #321
- feat: support transformer-only api by @DefTruth in #323
- chore: cache-dit recommended by HelloGithub by @DefTruth in #328
- chore: Update News by @DefTruth in #329
- feat: support FLUX.1 w/ Tensor Parallelism by @gameofdimension in #326
- feat: support longcat-video w/ 4/8-bits by @DefTruth in #325
- chore: Update News by @DefTruth in #334
- feat: support Qwen-Image tensor parallelism by @gameofdimension in #331
- chore: update badges by @DefTruth in #336
- feat: refactor tensor parallelism dispatch by @DefTruth in #337
- chore: update qwen-image tp example by @DefTruth in #338
- refactor context parallelism dispatch by @DefTruth in #340
- feat: Support Wan series tensor parallelism by @gameofdimension in #341
- docs: add quantize api docs by @DefTruth in #343
- chore: update quantize api docs by @DefTruth in #344
- docs: add parallelism docs by @DefTruth in #345
Full Changelog: v1.0.9...v1.0.10
v1.0.9
What's Changed
- chore: Update News by @DefTruth in #314
- chore: Update README.md by @DefTruth in #315
- feat: allow empty cache_config w/ parallelism by @DefTruth in #317
- feat: flux nunchaku + context parallelism by @DefTruth in #318
- chore: add perf mode to example by @DefTruth in #319
Full Changelog: v1.0.8...v1.0.9
v1.0.8
v1.0.7
What's Changed
- fix some typo & parallel config check by @DefTruth in #299
- remove wrong comments by @DefTruth in #301
- feat: Qwen-Image context parallelism example by @DefTruth in #303
- feat: support kandinsky5 pipeline by @DefTruth in #304
- feat: support Photoroom/PRX pipeline by @DefTruth in #308
- chore: add Kandinsky-5 cache assets by @DefTruth in #310
- chore: add Kandinsky-5 cache assets by @DefTruth in #311
Full Changelog: v1.0.6...v1.0.7
⚡️Hybrid Context Parallelism
cache-dit is compatible with context parallelism. It currently supports a hybrid cache + context parallelism scheme through the NATIVE_DIFFUSER parallelism backend, so context parallelism can be combined with caching to speed up inference further. For more details, see 📚examples/parallelism.
import cache_dit
from cache_dit import DBCacheConfig, ParallelismConfig

# pipe_or_adapter is the pipeline (or adapter) object created beforehand.
cache_dit.enable_cache(
    pipe_or_adapter,
    cache_config=DBCacheConfig(...),
    # Set ulysses_size > 1 to enable Ulysses-style context parallelism.
    parallelism_config=ParallelismConfig(ulysses_size=2),
)
# Then run with torchrun:
# torchrun --nproc_per_node=2 parallel_cache.py
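To give a rough intuition for what a `ulysses_size` of 2 implies, the sketch below simulates the data movement usually associated with Ulysses-style context parallelism: each rank starts with a slice of the sequence for all attention heads, and an all-to-all exchange leaves each rank with the full sequence for a subset of heads. This is a conceptual, single-process illustration only, not cache-dit's or diffusers' implementation; the function names `shard_sequence` and `all_to_all_seq_to_heads` are hypothetical, and the sequence length and head count are made up for the example.

```python
# Conceptual sketch only: simulates the Ulysses-style all-to-all in one process.
# Names below are hypothetical and do not come from cache-dit.

def shard_sequence(seq_len, num_heads, world_size):
    """Give each rank a contiguous slice of tokens, covering all heads."""
    assert seq_len % world_size == 0, "sequence must split evenly across ranks"
    chunk = seq_len // world_size
    return [
        [(t, h) for t in range(r * chunk, (r + 1) * chunk) for h in range(num_heads)]
        for r in range(world_size)
    ]


def all_to_all_seq_to_heads(shards, num_heads):
    """Exchange data so each rank holds every token for its own group of heads."""
    world_size = len(shards)
    assert num_heads % world_size == 0, "heads must split evenly across ranks"
    heads_per_rank = num_heads // world_size
    received = [[] for _ in range(world_size)]
    for src_rank in range(world_size):
        for token, head in shards[src_rank]:
            dst_rank = head // heads_per_rank  # the rank that owns this head
            received[dst_rank].append((token, head))
    return [sorted(items) for items in received]


# Two ranks (as with ulysses_size=2), 8 tokens, 4 heads (illustrative sizes).
before = shard_sequence(seq_len=8, num_heads=4, world_size=2)
after = all_to_all_seq_to_heads(before, num_heads=4)

for rank, items in enumerate(after):
    tokens = sorted({t for t, _ in items})
    heads = sorted({h for _, h in items})
    print(f"rank {rank}: tokens {tokens}, heads {heads}")
```

After the exchange, each simulated rank can run attention over the whole sequence for its own heads; a second all-to-all in the opposite direction would restore the sequence-sharded layout.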