fix(workflows): restore interactive-loop cross-gate session continuity (reverts #1923) #2005
ianstantiate wants to merge 1 commit into
Reverts the change from coleam00#1923, which forced a fresh Claude session on the first iteration after every interactive-loop approval gate by adding `(isLoopResume && i === startIteration)` to `needsFreshSession` in `dag-executor.ts`. That broke cross-gate conversation continuity: every turn after a human gate began a context-free session carrying only `$LOOP_USER_INPUT` (with `$LOOP_PREV_OUTPUT` empty), making multi-turn interactive loops (interviews, iterative refinement) impossible.

Restores the released behavior `needsFreshSession = loop.fresh_context || i === 1`, reinstates the test assertion guarding cross-gate continuity (`sessionArg === 'loop-session-1'`), and removes coleam00#1923's regression test that encoded the fresh-session behavior.

coleam00#1291's fail-loud `isError` handling is left untouched. Confirming and fixing the original coleam00#1208 crash is separate work.

Closes coleam00#2004
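To make the difference between the two predicates concrete, here is a minimal sketch. The interface shapes, the function names, and the `resolveResumeSessionId` helper are illustrative assumptions, not the actual code in `dag-executor.ts`; only the two boolean expressions are taken from the description above.

```typescript
// Hypothetical, simplified shapes; the real types in dag-executor.ts may differ.
interface LoopConfig {
  fresh_context?: boolean;
}

interface IterationContext {
  loop: LoopConfig;
  i: number; // 1-based iteration index
  startIteration: number; // first iteration executed in this run
  isLoopResume: boolean; // true when the run resumes after an approval gate
}

// Restored (released) behavior, as quoted in the PR description.
function needsFreshSessionRestored(ctx: IterationContext): boolean {
  return Boolean(ctx.loop.fresh_context) || ctx.i === 1;
}

// Behavior under #1923: the extra term forces a fresh session on the
// first iteration after every gate.
function needsFreshSessionWith1923(ctx: IterationContext): boolean {
  return (
    Boolean(ctx.loop.fresh_context) ||
    ctx.i === 1 ||
    (ctx.isLoopResume && ctx.i === ctx.startIteration)
  );
}

// Assumption: a fresh session means no session ID is handed to the provider.
function resolveResumeSessionId(
  fresh: boolean,
  storedSessionId: string | undefined
): string | undefined {
  return fresh ? undefined : storedSessionId;
}

// Second iteration, resumed after a human gate, no fresh_context.
const resumedAfterGate: IterationContext = {
  loop: {},
  i: 2,
  startIteration: 2,
  isLoopResume: true,
};

const restored = resolveResumeSessionId(
  needsFreshSessionRestored(resumedAfterGate),
  'loop-session-1'
);
const reverted = resolveResumeSessionId(
  needsFreshSessionWith1923(resumedAfterGate),
  'loop-session-1'
);

console.log(restored); // 'loop-session-1'
console.log(reverted); // undefined
```

In this sketch the restored predicate keeps the stored gate session on the first post-gate iteration, while the #1923 variant drops it, which is the loss of continuity the description refers to.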
Thanks @ianstantiate, good catch, and the right call. #1923 traded away the single most valuable interactive-loop behavior (cross-gate session continuity) to "fix" #1208, and this restores it cleanly. It also lines up with the original #1208 investigation:

I ran a multi-agent review over the diff. Summary below.

✅ Confirmed solid
Recommended hardening (non-blocking — small, just to lock it in)
One follow-up worth filing (pre-existing, not introduced here)
For non-Claude providers (Codex/Pi/Copilot/OpenCode), a cold-resume falls back to a fresh session and emits a

Net: the core revert is good to go. Happy for you to take the I1–I3 polish if you'd like (they're small), or if you'd rather, hand it over and I'll finish it off. Whichever's easier. Either way, thanks for catching this.
Taking this over to get it across the line. Thanks @ianstantiate! I rebased your fix onto current
Summary
- `$LOOP_USER_INPUT`.
- `(isLoopResume && i === startIteration)` term from `needsFreshSession` in `dag-executor.ts`, restoring `loop.fresh_context || i === 1`; reinstated the resume test assertion (`sessionArg === 'loop-session-1'`); removed #1923's regression test that encoded the fresh-session behavior; reverted the `loop-nodes.md` doc note.
- `isError` handling, `fresh_context: true` loops, non-interactive loops, and `persist_session` loops are all untouched. This PR does not attempt to diagnose or fix the original #1208 crash; that is separate work.
UX Journey
Before
After
Architecture Diagram
Before
After
Connection inventory:
- `dag-executor.ts::executeLoopNode` → `aiClient.sendQuery`: `resumeSessionId` again threads the stored gate session on the first resumed iteration (was forced to `undefined` by #1923)
- `dag-executor.ts::executeLoopNode` → `loopGateMeta.sessionId`: `resumeSessionId` again (was read then discarded)
- `isError` fail-loud throw, `errors[]`
Label Snapshot
- `risk: low`
- `size: XS`
- `workflows` (+ `docs`, `tests`)
- `workflows:dag-executor` (`executeLoopNode`)
Change Metadata
- `bug` (revert of a regression)
- `workflows`
Linked Issue
Validation Evidence (required)
- `bun run validate` was run on this branch. Per-step results:
- `bun run validate` step passes. The full per-package test suite is 5144 pass / 0 fail.
- `format:check` note: the tree-wide `prettier --check .` only warns on untracked local files that are not part of this PR and don't exist in a clean checkout/CI. The 3 files this PR changes are Prettier-clean (`bun x prettier --check` on them → "All matched files use Prettier code style!"), and the pre-commit hook (lint-staged: prettier + eslint) ran clean on them at commit time.
- `dag-executor.ts` and `loop-nodes.md`, and was verified byte-identical to a true `git revert` of #1923.
Security Impact (required)
Compatibility / Migration
- dev-only and unreleased.
Human Verification (required)
- `needsFreshSession` term removed, test assertion restored to `'loop-session-1'`, #1923 regression test removed, docs note reverted); type-check/lint/loop-test green; #1291's "fails loudly on `error_during_execution`" test still passes.
- `fresh_context: true` and non-interactive loop paths are logically unchanged (their `needsFreshSession` terms are untouched); #1291 fail-loud path unchanged.
- `bun run validate` and per-package test suite (5144 pass / 0 fail) were run; see Validation Evidence.
Side Effects / Blast Radius (required)
- `executeLoopNode`); any `interactive: true` loop that resumes after a gate.
- `error_during_execution` will once again attempt a session resume on the first post-gate iteration. With #1291 in place this now fails loudly with the real SDK `errors[]` rather than being masked, which is the intended behavior and the data needed to actually diagnose #1208.
- `loop_node.iteration_sdk_error` log + `loop_iteration_failed` event surface any genuine resume failure with the SDK error strings.
Rollback Plan (required)
- `error_during_execution` failure on the first post-gate iteration (with `errors[]` detail), not a silent regression.
Risks and Mitigations
- `error_during_execution` (suspected environmental: Docker/VPS, Slack batch-streaming, `CLAUDE_CODE_OAUTH_TOKEN` refresh, MCP/tool/network) could resurface for affected users once session resume is restored.
- `fresh_context: true` (every iteration fresh) until #1208 is properly diagnosed.
Summary by CodeRabbit
Bug Fixes
- `fresh_context` is explicitly enabled.
Documentation
- `fresh_context` behavior during iterations.
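The fail-loud behavior referenced under Side Effects (a resume failure surfacing with the SDK `errors[]` instead of being masked) could look roughly like the sketch below. The `IterationResult` shape, the `LoopIterationError` class, the `assertIterationSucceeded` helper, and the sample error string are all hypothetical placeholders for illustration; the actual #1291 implementation and the SDK result type may differ.

```typescript
// Hypothetical result shape; the real SDK result type may differ.
interface IterationResult {
  isError: boolean;
  subtype?: string; // e.g. 'error_during_execution'
  errors?: string[];
  output?: string;
}

// Error that carries the SDK error strings so callers can log or emit them.
class LoopIterationError extends Error {
  readonly iteration: number;
  readonly sdkErrors: string[];

  constructor(iteration: number, subtype: string, sdkErrors: string[]) {
    const detail = sdkErrors.length > 0 ? sdkErrors.join('; ') : 'no error details';
    super(`Loop iteration ${iteration} failed (${subtype}): ${detail}`);
    this.name = 'LoopIterationError';
    this.iteration = iteration;
    this.sdkErrors = sdkErrors;
  }
}

// Throw on an error result instead of treating it as empty output.
function assertIterationSucceeded(iteration: number, result: IterationResult): string {
  if (result.isError) {
    throw new LoopIterationError(iteration, result.subtype ?? 'unknown', result.errors ?? []);
  }
  return result.output ?? '';
}

// Placeholder failure on the first post-gate iteration.
let message = '';
try {
  assertIterationSucceeded(2, {
    isError: true,
    subtype: 'error_during_execution',
    errors: ['session not found'], // made-up error string
  });
} catch (err) {
  message = err instanceof Error ? err.message : String(err);
}

console.log(message);
// Loop iteration 2 failed (error_during_execution): session not found
```

The point of the design is that a failed resume produces an error carrying the provider's own diagnostics, which is the information the PR says is needed to investigate #1208.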