Can this be used for Qwen3, QwQ, or DeepSeek v3 / R2? #233

BradKML · 2025-06-06T06:30:28Z

It would be really great if 16/32/64GB of RAM could handle SOTA models.

subrat243 · 2026-04-18T16:38:09Z

Short answer: partially yes. Here's a breakdown by model:

Qwen / Qwen2.5 → ✅ Officially Supported

As of v2.11.0, AirLLM officially supports the Qwen and Qwen2.5 model families. Just use AutoModel:

from airllm import AutoModel
model = AutoModel.from_pretrained("Qwen/Qwen2.5-72B-Instruct")

Qwen3 → ⚠️ Likely works, not officially confirmed

Qwen3 uses a similar transformer architecture to Qwen2.5. AutoModel automatically detects the model architecture from config.json, so Qwen3 dense models (e.g. Qwen/Qwen3-32B) will likely work as a drop-in. That said, the MoE variants (Qwen3-235B-A22B) use a different routing architecture that hasn't been explicitly tested with AirLLM's layer-sharding approach.
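If you want to check which bucket a checkpoint falls into before downloading the weights, here is a minimal sketch that reads the model's config.json and sorts it into dense vs. MoE. This is not AirLLM's own detection logic. The architecture names and the expert-count keys (`num_experts`, `n_routed_experts`) are assumptions based on how these model families are typically published on Hugging Face, and `classify_config` / `classify_config_file` are hypothetical helper names.

```python
import json

# Assumed architecture strings as they typically appear in the "architectures"
# field of a Hugging Face config.json. Illustrative only, not from AirLLM.
DENSE_ARCHS = {"Qwen2ForCausalLM", "Qwen3ForCausalLM"}
MOE_ARCHS = {"Qwen3MoeForCausalLM", "DeepseekV3ForCausalLM"}

# Assumed config keys that indicate an expert-routed (MoE) model.
EXPERT_KEYS = ("num_experts", "n_routed_experts")


def classify_config(config: dict) -> str:
    """Return 'moe', 'dense', or 'unknown' for a parsed config.json dict."""
    archs = set(config.get("architectures") or [])
    has_experts = any(config.get(key) for key in EXPERT_KEYS)
    if archs & MOE_ARCHS or has_experts:
        return "moe"
    if archs & DENSE_ARCHS:
        return "dense"
    return "unknown"


def classify_config_file(path: str) -> str:
    """Same check, reading config.json from a local path."""
    with open(path, encoding="utf-8") as f:
        return classify_config(json.load(f))
```

A "dense" result only means the checkpoint is in the group that is likely to load, and "moe" means you are in untested territory. Neither is a guarantee either way.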

QwQ-32B → ⚠️ Should work (it's Qwen2.5-based)

QwQ-32B uses the Qwen2.5 architecture under the hood, so it should be compatible with AirLLM v2.11.0+. Try:

model = AutoModel.from_pretrained("Qwen/QwQ-32B")

DeepSeek V3 / R2 → ❌ Not officially supported

DeepSeek V3 is a 671B MoE model with a custom architecture (Multi-head Latent Attention + expert routing). AirLLM currently doesn't list it as supported, and its MoE design makes layer-by-layer sharding non-trivial. You'd likely hit errors. For DeepSeek, tools like llama.cpp with GGUF or Ollama are more practical right now.

About RAM (16/32/64GB):

AirLLM's bottleneck is disk I/O and VRAM, not system RAM. A 70B model needs ~140GB of disk space after sharding at 16-bit precision (70B parameters × 2 bytes each). 16–64GB of system RAM is fine as a buffer, but for GPU inference you still need at least 4–8GB of VRAM. CPU-only inference is supported since v2.10.1 but is very slow.
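The ~140GB figure is just parameter count times bytes per parameter. Here is a rough sketch for estimating it for other model sizes. It assumes 16-bit weights by default and ignores tokenizer files and any per-shard overhead, so treat the result as a lower bound. `weight_size_gb` is a hypothetical helper, not part of AirLLM.

```python
def weight_size_gb(num_params: float, bits_per_param: int = 16) -> float:
    """Approximate on-disk size of the weights in decimal gigabytes."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


print(weight_size_gb(70e9))  # 140.0 (70B parameters at 16 bits)
print(weight_size_gb(32e9))  # 64.0  (a 32B model such as QwQ-32B)
```

So a 32B model needs roughly 64GB of free disk at 16-bit precision, before counting the original download if you keep both copies.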
