revert: remove instanttensor loader by lightseek-bot · Pull Request #462 · lightseekorg/tokenspeed

lightseek-bot · 2026-06-16T08:17:11Z

Summary

Revert d3b285b
Remove the InstantTensor load format, docs, CI args, and runtime tests added by that commit

Tests

.venv/bin/python -m compileall -q python/tokenspeed
PRE_COMMIT_HOME=/tmp/pre-commit pre-commit run --all-files

Signed-off-by: lightseek-bot <243258330+lightseek-bot@users.noreply.github.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3863362906

chatgpt-codex-connector · 2026-06-16T08:22:38Z

-                    normal_weights.append((name, weight))
+        for name, weight in weights:
+            if ".experts" in name:
+                mxfp4_weights.append((name, weight))


Keep GPT-OSS MXFP4 expert loading streamed

Here weights is the loader generator, but this loop now appends every .experts tensor to a list before anything is copied into model parameters. For MXFP4 checkpoints such as openai/gpt-oss-120b, the expert tensors dominate the checkpoint. With the default safetensors iterator, buffering them pins every loaded shard in host memory until the list is handed to _load_mxfp4_experts_weights, so large-model loads can OOM instead of streaming each expert into its preallocated slot.
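The difference between buffering and streaming can be sketched in plain Python. This is a minimal illustration, not the tokenspeed implementation: fake_shard_iterator, load_buffered, load_streamed, trace, and the copy callbacks are hypothetical names, and lists of floats stand in for tensors. The log records when each tensor is read from the iterator and when it is copied, which shows how long each one must stay alive.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Weight = List[float]  # stand-in for a tensor; a real loader would yield torch.Tensor
NamedWeight = Tuple[str, Weight]
CopyFn = Callable[[str, Weight], None]


def fake_shard_iterator(names: List[str], log: List[str]) -> Iterator[NamedWeight]:
    """Hypothetical stand-in for the lazy safetensors weight iterator."""
    for name in names:
        log.append(f"read {name}")
        yield name, [0.0] * 4


def load_buffered(weights: Iterable[NamedWeight], copy_expert: CopyFn, copy_normal: CopyFn) -> None:
    """Pattern flagged in the review: collect every expert tensor, copy later."""
    mxfp4_weights: List[NamedWeight] = []
    for name, weight in weights:
        if ".experts" in name:
            mxfp4_weights.append((name, weight))  # keeps the tensor alive
        else:
            copy_normal(name, weight)
    for name, weight in mxfp4_weights:
        copy_expert(name, weight)


def load_streamed(weights: Iterable[NamedWeight], copy_expert: CopyFn, copy_normal: CopyFn) -> None:
    """Streaming alternative: copy each tensor as soon as it is read."""
    for name, weight in weights:
        if ".experts" in name:
            copy_expert(name, weight)
        else:
            copy_normal(name, weight)


def trace(loader: Callable[..., None]) -> List[str]:
    """Run a loader over three fake tensors and return the read/copy order."""
    log: List[str] = []
    names = [
        "layers.0.mlp.experts.w1",
        "layers.0.attn.q_proj",
        "layers.1.mlp.experts.w1",
    ]
    loader(
        fake_shard_iterator(names, log),
        copy_expert=lambda n, w: log.append(f"copy {n}"),
        copy_normal=lambda n, w: log.append(f"copy {n}"),
    )
    return log


if __name__ == "__main__":
    print("buffered:", trace(load_buffered))
    print("streamed:", trace(load_streamed))
```

In the streamed trace every read is followed immediately by its copy, so at most one tensor is held at a time. In the buffered trace the expert copies happen only after the iterator is exhausted, so all expert tensors are alive at once.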

chatgpt-codex-connector · 2026-06-16T08:22:38Z

+                vision_weights.append((name, loaded_weight))
+            else:
+                name = name.replace("language_model.", "")
+                language_weights.append((name, loaded_weight))


Stream Kimi language weights into the LM

Appending every non-vision tensor means that all Kimi language weights are materialized before DeepseekV3ForCausalLM.load_weights is invoked. For large checkpoints such as nvidia/Kimi-K2.5-NVFP4 (still used by the CI configs changed here), the safetensors iterator then keeps the whole language checkpoint alive in host RAM rather than freeing shards as DeepSeek consumes them, which can OOM large Kimi loads. Pass a generator to language_model.load_weights and collect only the vision weights.
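One way to follow this suggestion is a generator that yields language weights lazily and stashes vision weights in a list as a side effect. The sketch below is an assumption-laden illustration rather than the project's code: is_vision_weight and its "vision_tower." prefix check, language_weight_stream, FakeLanguageModel, and load_kimi_like are all hypothetical, and lists of floats stand in for tensors. Only the name.replace("language_model.", "") step comes from the diff above.

```python
from typing import Iterable, Iterator, List, Tuple

Weight = List[float]  # stand-in for a tensor
NamedWeight = Tuple[str, Weight]


def is_vision_weight(name: str) -> bool:
    # Assumption: vision tensors are recognizable by prefix; the real check may differ.
    return name.startswith("vision_tower.")


def language_weight_stream(
    weights: Iterable[NamedWeight], vision_weights: List[NamedWeight]
) -> Iterator[NamedWeight]:
    """Yield language weights one at a time; collect only the vision weights."""
    for name, loaded_weight in weights:
        if is_vision_weight(name):
            vision_weights.append((name, loaded_weight))
        else:
            # Same prefix stripping as in the reviewed diff.
            yield name.replace("language_model.", ""), loaded_weight


class FakeLanguageModel:
    """Hypothetical stand-in for a language model with a load_weights method."""

    def __init__(self) -> None:
        self.loaded: List[str] = []

    def load_weights(self, weights: Iterable[NamedWeight]) -> None:
        for name, _ in weights:
            self.loaded.append(name)  # a real model would copy into parameters here


def load_kimi_like(weights: Iterable[NamedWeight]) -> Tuple[List[str], List[str]]:
    vision_weights: List[NamedWeight] = []
    language_model = FakeLanguageModel()
    # The language model pulls from the generator, so no language list is built.
    language_model.load_weights(language_weight_stream(weights, vision_weights))
    # vision_weights is complete only after the generator has been exhausted.
    return language_model.loaded, [name for name, _ in vision_weights]
```

The vision list is only complete once the language model has drained the generator, so any vision loading has to run after language_model.load_weights returns.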

revert: remove instanttensor loader

3863362

Signed-off-by: lightseek-bot <243258330+lightseek-bot@users.noreply.github.com>

lightseek-bot requested a review from a team as a code owner June 16, 2026 08:17

chatgpt-codex-connector Bot reviewed Jun 16, 2026

Merge branch 'main' into bot/revert-instanttensor-loader

69fd8bf

torchspec-bot approved these changes Jun 16, 2026

torchspec-bot merged commit ec60f40 into main Jun 16, 2026
69 of 73 checks passed

torchspec-bot deleted the bot/revert-instanttensor-loader branch June 16, 2026 20:58

torchspec-bot merged 2 commits into main from bot/revert-instanttensor-loader
