@@ -98,7 +98,6 @@ Nothing here yet. I am working through the Seen list on hardware; entries gradua
 GB10's Blackwell architecture supports NVFP4 (4-bit floating point) in hardware. It runs faster than INT4 at similar quality.
 
 - [AEON-7/Gemma-4-26B-A4B-it-Uncensored-NVFP4](https://github.com/AEON-7/Gemma-4-26B-A4B-it-Uncensored-NVFP4) - NVFP4 Gemma 4 26B MoE on DGX Spark with DFlash speculative decoding, 39-155 tok/s single-stream.
-- [AEON-7/Gemma-4-31B-DECKARD-HERETIC-Uncensored-NVFP4](https://github.com/AEON-7/Gemma-4-31B-DECKARD-HERETIC-Uncensored-NVFP4) - NVFP4-quantized Gemma 4 31B for DGX Spark with deployment guide.
 - [AEON-7/Gemma-4-31B-Uncensored-NVFP4-DFlash](https://github.com/AEON-7/Gemma-4-31B-Uncensored-NVFP4-DFlash) - vLLM image for DGX Spark serving NVFP4 Gemma 4 31B (Deckard Heretic) with z-lab DFlash speculative decoding.
 - [AEON-7/Nemotron-3-Nano-Omni-AEON-Ultimate-Uncensored](https://github.com/AEON-7/Nemotron-3-Nano-Omni-AEON-Ultimate-Uncensored) - Source-built vLLM image for DGX Spark serving abliterated Nemotron-3-Nano-Omni multimodal in BF16 and NVFP4.
 - [AEON-7/Qwen3.6-27B-AEON-Ultimate-Uncensored-DFlash](https://github.com/AEON-7/Qwen3.6-27B-AEON-Ultimate-Uncensored-DFlash) - Prebuilt vLLM container for DGX Spark with abliterated Qwen3.6-27B (NVFP4 + DFlash), sm_121a-patched for 37.6 tok/s vs 10.5 raw.