Releases · vipshop/cache-dit · GitHub

13 Nov 03:27

DefTruth

v1.0.15

What's Changed

feat: support cache & tp for wan vace by @DefTruth in #406
feat: support mochi-1-preview Tensor Parallelism by @gameofdimension in #408
chore: Update README.md by @DefTruth in #409
feat: support HunyuanDiT Tensor Parallelism by @gameofdimension in #411
bugfix: fix summary stats from dict by @DefTruth in #412
bugfix: fix strify error while no-cache by @DefTruth in #414
feat: support wan vace context parallel by @DefTruth in #415
chore: Update README.md by @DefTruth in #416
feat: support Wan2.1-VACE Tensor Parallelism by @gameofdimension in #417
misc: use dummy blocks for flux by default by @DefTruth in #418

Full Changelog: v1.0.14...v1.0.15

Contributors

DefTruth and gameofdimension


11 Nov 04:36

DefTruth

v1.0.14

What's Changed

chore: maybe empty cache after parallelism by @DefTruth in #402
hotfix for _attention_dispatch check by @DefTruth in #404
fix nunchaku deps import errors by @DefTruth in #405

Full Changelog: v1.0.13...v1.0.14

Contributors

DefTruth


07 Nov 11:29

DefTruth

v1.0.13

What's Changed

feat: support pixart context parallel by @DefTruth in #399
feat: support DiT-XL context parallel by @DefTruth in #400

Full Changelog: v1.0.12...v1.0.13

Contributors

DefTruth


07 Nov 04:15

DefTruth

v1.0.12

What's Changed

chore: update support matrix by @DefTruth in #383
fix hyvideo cp monkey patch by @DefTruth in #384
chore: Update README.md by @DefTruth in #385
chore: update bench configs by @DefTruth in #386
parallel: auto select backend if needed by @DefTruth in #387
feat: support cogvideox context parallel by @DefTruth in #388
feat: support Tensor Parallelism for Chroma by @gameofdimension in #389
feat: support cogview4 & consisid cp by @DefTruth in #391
chore: Update CP/TP docs by @DefTruth in #392
feat: support chroma-hd cp by @DefTruth in #393
feat: support Kandinsky5 Tensor Parallelism by @gameofdimension in #394
feat: fully support nunchaku flux cp by @DefTruth in #396
feat: fully support nunchaku qwen-image cp by @DefTruth in #397
hotfix for qwen-image-lightning nunchaku + cp by @DefTruth in #398

Full Changelog: v1.0.11...v1.0.12

Contributors

DefTruth and gameofdimension


05 Nov 03:12

DefTruth

v1.0.11

What's Changed

chore: update parallelism docs by @DefTruth in #346
chore: add params modifier docs by @DefTruth in #347
chore: Update cache-dit desc by @DefTruth in #348
chore: Update cache-dit desc by @DefTruth in #349
chore: Update DBCache docs by @DefTruth in #350
refactor: move cache_factory to caching by @DefTruth in #351
fix example utils key error by @DefTruth in #352
feat: add wan2.2 cp example by @DefTruth in #354
feat: add ltxvideo cp example by @DefTruth in #357
bugfix: fix ltxvideo context parallel noise result by @DefTruth in #361
chore: Update README.md by @DefTruth in #362
chore: Update README.md by @DefTruth in #363
chore: Update README.md by @DefTruth in #364
chore: Update README.md by @DefTruth in #365
chore: Update README.md by @DefTruth in #366
feat: make cache compatible with block-level cp by @DefTruth in #369
feat: use template cp for native attn by @DefTruth in #371
feat: add Qwen-Image-Lightning CP/TP example by @DefTruth in #372
feat: support TP for hunyuanimage2.1/video by @gameofdimension in #373
chore: Update Tensor Parallelism docs by @DefTruth in #376
chore: Update examples by @DefTruth in #377
misc: add custom attn dispatch env by @DefTruth in #378
feat: add diffusers' hyimage21 adapter by @DefTruth in #379
feat: support hyimage21 context parallel by @DefTruth in #381
feat: support hyvideo context parallel by @DefTruth in #382

Full Changelog: v1.0.10...v1.0.11

Contributors

DefTruth and gameofdimension


30 Oct 12:29

DefTruth

v1.0.10

What's Changed

feat: mark check forward pattern default false by @DefTruth in #321
feat: support transformer-only api by @DefTruth in #323
chore: cache-dit recommended by HelloGithub by @DefTruth in #328
chore: Update News by @DefTruth in #329
feat: support FLUX.1 w/ Tensor Parallelism by @gameofdimension in #326
feat: support longcat-video w/ 4/8-bits by @DefTruth in #325
chore: Update News by @DefTruth in #334
feat: support Qwen-Image tensor parallelism by @gameofdimension in #331
chore: update badges by @DefTruth in #336
feat: refactor tensor parallelism dispatch by @DefTruth in #337
chore: update qwen-image tp example by @DefTruth in #338
refactor context parallelism dispatch by @DefTruth in #340
feat: Support Wan series tensor parallelism by @gameofdimension in #341
docs: add quantize api docs by @DefTruth in #343
chore: update quantize api docs by @DefTruth in #344
docs: add parallelism docs by @DefTruth in #345

Full Changelog: v1.0.9...v1.0.10

Contributors

DefTruth and gameofdimension


24 Oct 02:43

DefTruth

v1.0.9

What's Changed

chore: Update News by @DefTruth in #314
chore: Update README.md by @DefTruth in #315
feat: allow empty cache_config w/ parallelism by @DefTruth in #317
feat: flux nunchaku + context parallelism by @DefTruth in #318
chore: add perf mode to example by @DefTruth in #319

Full Changelog: v1.0.8...v1.0.9

Contributors

DefTruth


22 Oct 08:12

DefTruth

v1.0.8

What's Changed

deps: minimum install dependencies by @DefTruth in #312
feat: refactor parallelism backends by @DefTruth in #313

Full Changelog: v1.0.7...v1.0.8

Contributors

DefTruth


22 Oct 03:53

DefTruth

v1.0.7

What's Changed

fix some typo & parallel config check by @DefTruth in #299
remove wrong comments by @DefTruth in #301
feat: Qwen-Image context parallelism example by @DefTruth in #303
feat: support kandinsky5 pipeline by @DefTruth in #304
feat: support Photoroom/PRX pipeline by @DefTruth in #308
chore: add Kandinsky-5 cache assets by @DefTruth in #310
chore: add Kandinsky-5 cache assets by @DefTruth in #311

Full Changelog: v1.0.6...v1.0.7

Contributors

DefTruth


20 Oct 10:44

DefTruth

⚡️Hybrid Context Parallelism

cache-dit is compatible with context parallelism. It currently supports a Hybrid Cache + Context Parallelism scheme through the NATIVE_DIFFUSER parallelism backend, so context parallelism can be combined with caching to further speed up inference. For more details, see 📚examples/parallelism.

import cache_dit
from cache_dit import DBCacheConfig, ParallelismConfig

# pipe_or_adapter is the pipeline or adapter created beforehand.
cache_dit.enable_cache(
    pipe_or_adapter,
    cache_config=DBCacheConfig(...),
    # Set ulysses_size > 1 to enable Ulysses-style context parallelism.
    parallelism_config=ParallelismConfig(ulysses_size=2),
)
# Then launch the script with torchrun:
# torchrun --nproc_per_node=2 parallel_cache.py
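
For readers unfamiliar with the term, the snippet below is a rough single-process sketch of the data layout behind Ulysses-style context parallelism: each rank starts with a slice of the sequence and all attention heads, an all-to-all exchange regroups the data so each rank holds the full sequence for a slice of heads, and a second exchange restores the original layout after attention. It is not cache-dit's implementation; the function names are hypothetical, and plain Python lists stand in for tensors and collective communication.

```python
# Illustrative only: simulates the Ulysses data layout in one process.
# All function names are hypothetical and not part of cache-dit.

def shard_sequence(num_tokens, num_heads, world_size):
    """Each rank starts with a contiguous slice of tokens and every head."""
    assert num_tokens % world_size == 0
    per_rank = num_tokens // world_size
    return [
        [(t, h)
         for t in range(r * per_rank, (r + 1) * per_rank)
         for h in range(num_heads)]
        for r in range(world_size)
    ]


def all_to_all_seq_to_head(shards, num_heads):
    """Regroup so each rank holds every token but only a slice of heads."""
    world_size = len(shards)
    assert num_heads % world_size == 0
    heads_per_rank = num_heads // world_size
    out = [[] for _ in range(world_size)]
    for shard in shards:
        for t, h in shard:
            out[h // heads_per_rank].append((t, h))
    return [sorted(s) for s in out]


def all_to_all_head_to_seq(shards, num_tokens):
    """Inverse regrouping: back to a slice of tokens with every head."""
    world_size = len(shards)
    per_rank = num_tokens // world_size
    out = [[] for _ in range(world_size)]
    for shard in shards:
        for t, h in shard:
            out[t // per_rank].append((t, h))
    return [sorted(s) for s in out]


# Entries are (token_index, head_index) pairs; 2 ranks mirrors ulysses_size=2.
seq_sharded = shard_sequence(num_tokens=4, num_heads=2, world_size=2)
head_sharded = all_to_all_seq_to_head(seq_sharded, num_heads=2)
restored = all_to_all_head_to_seq(head_sharded, num_tokens=4)

print(seq_sharded[0])   # rank 0 before the exchange: tokens 0-1, both heads
print(head_sharded[0])  # rank 0 after the exchange: all tokens, head 0 only
```

In this layout, attention for a given head can run on a single rank over the full sequence, which is why the head count needs to be divisible by the number of ranks in the sketch.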
