fix(deepseek-v4): release superseded interior continuation-state snapshots #460
Changes from 3 commits
bae5df6
ed895f5
b2d853b
4ad2e0f
b9efe77
1dd77d3
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1501,7 +1501,47 @@ void HybridPrefixCache::CommitChunk(const std::string& request_id, TreeNode* ter | |
| last_committed = target; | ||
| } | ||
|
|
||
| (void)commitTerminalContinuationSnapshot(tables, terminal, chunk_depth); | ||
| const bool terminal_state_committed = commitTerminalContinuationSnapshot(tables, terminal, chunk_depth); | ||
|
|
||
| // Release superseded interior continuation-state snapshots. | ||
| // | ||
| // A continuation-state restore resumes from the deepest matching terminal | ||
| // (Match). Each turn's terminal becomes an interior ancestor on the next | ||
| // turn, but nothing released its now-superseded trailing-window state | ||
| // snapshot, so these pinned pages accumulate one window per turn and | ||
| // exhaust the small State pools (e.g. v4.c128a.compressor_state). Release an | ||
| // ancestor's State portion (keeping its History chain) only when it is | ||
| // provably unreferenced, which requires BOTH: | ||
| // (1) the owning request's sliding window has advanced past the ancestor | ||
| // (node_depth + window <= chunk_depth), so ReleaseSkipped has already | ||
| // dropped those pages from this request's own borrowed set; and | ||
| // (2) no OTHER request references the ancestor. Each request holds exactly | ||
| // one DeviceNodeRef that Locks its whole path to root (NodeRef::Lock), | ||
| // so Device().RefCount() == 1 means this committing request is the | ||
| // sole referencer and no other request can be borrowing the node's | ||
| // continuation-state window. When shared (RefCount > 1, e.g. a second | ||
| // request whose prefix runs through this node), keep the snapshot so | ||
| // the sharer's continuation-state resume stays valid; it is released | ||
| // on a later commit once the sharer's ref drops. | ||
| // Gate on a complete terminal snapshot so a resume anchor always remains. | ||
| if (terminal_state_committed) { | ||
| std::int32_t max_state_window = 0; | ||
| for (const auto& gid : paged_cache_continuation_state_groups_) { | ||
| auto alloc_it = paged_cache_allocators_.find(gid); | ||
| if (alloc_it != paged_cache_allocators_.end() && alloc_it->second != nullptr) { | ||
| max_state_window = | ||
| std::max(max_state_window, alloc_it->second->Config().sliding_window_tokens.value_or(0)); | ||
| } | ||
| } | ||
| for (TreeNode* cur = terminal->Parent(); cur != nullptr && !cur->IsRoot(); cur = cur->Parent()) { | ||
| if (!cur->HasPagedCacheSnapshot()) continue; | ||
| if (static_cast<std::int32_t>(cur->DepthInTokens()) + max_state_window > chunk_depth) { | ||
| continue; | ||
| } | ||
| if (!cur->OnDevice() || cur->Device().RefCount() != 1) continue; | ||
| DetachStateSnapshotFromNode(cur); | ||
|
When this downgrades an interior node to a history-only snapshot, a later request that branches before the next state-complete boundary can no longer commit past that node: |
||
| } | ||
| } | ||
| } | ||
|
|
||
| } // namespace tokenspeed | ||
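The release gate described in the code comment above can be read as a single predicate over each ancestor. The following is a minimal sketch of that reading, not the project's code: `NodeView` and `CanReleaseStateSnapshot` are hypothetical names introduced here, and the struct fields only mirror the accessors the diff calls (`DepthInTokens()`, `HasPagedCacheSnapshot()`, `OnDevice()`, `Device().RefCount()`).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the TreeNode fields the diff reads.
struct NodeView {
    std::int32_t depth_in_tokens;   // mirrors cur->DepthInTokens()
    bool has_snapshot;              // mirrors cur->HasPagedCacheSnapshot()
    bool on_device;                 // mirrors cur->OnDevice()
    std::int32_t device_ref_count;  // mirrors cur->Device().RefCount()
};

// Returns true when the loop in the diff would detach the node's State snapshot.
// Hypothetical helper: the diff inlines these checks instead of naming them.
bool CanReleaseStateSnapshot(const NodeView& n,
                             std::int32_t max_state_window,
                             std::int32_t chunk_depth) {
    assert(max_state_window >= 0);
    if (!n.has_snapshot) return false;
    // Condition (1): the sliding window has advanced past this ancestor.
    if (n.depth_in_tokens + max_state_window > chunk_depth) return false;
    // Condition (2): the committing request is the only device referencer.
    if (!n.on_device || n.device_ref_count != 1) return false;
    return true;
}
```

Under this reading, an ancestor at depth 100 with a 128-token window first becomes eligible when `chunk_depth` reaches 228, and only if its device refcount is exactly 1.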
When a shared-prefix request has been retracted, it keeps only a `HostNodeRef` and its paged-cache table is not released until recovery, so it no longer contributes to `Device().RefCount()`. If another request sharing this ancestor commits with this count equal to 1, this branch deletes the ancestor's continuation-state snapshot even though the retracted request may later recover through `Match(...StateRecovery)` / `AdmitChunkFromRetracted`; that can leave recovery without the saved state pages (or with stale borrowed ids until release). Device refcount alone therefore does not prove no other live request still depends on the snapshot.
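The scenario in this comment can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyNode`, `Retract`, `DeviceOnlyGate`, and `HostAwareGate` are hypothetical, and the separate `host_refs` counter is an assumption made for illustration; the source does not say how (or whether) host-side references are counted.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy node: tracks device and host references separately.
struct ToyNode {
    std::int32_t device_refs = 0;  // requests holding a device-side ref
    std::int32_t host_refs = 0;    // assumed: retracted requests holding only a host-side ref
};

// Models retraction as described in the comment: the request stops
// contributing to the device refcount but still depends on the node.
void Retract(ToyNode& n) {
    assert(n.device_refs > 0);
    --n.device_refs;
    ++n.host_refs;
}

// The gate as written in the diff: device refcount only.
bool DeviceOnlyGate(const ToyNode& n) { return n.device_refs == 1; }

// One possible stricter gate (an assumption, not the PR's fix):
// also require that no retracted request still references the node.
bool HostAwareGate(const ToyNode& n) {
    return n.device_refs == 1 && n.host_refs == 0;
}
```

With two sharers and one of them retracted, `DeviceOnlyGate` passes while `HostAwareGate` does not, which is the gap the comment points at.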