Skip to content

fix(workflows): restore interactive-loop cross-gate session continuity (reverts #1923)#2005

Closed
ianstantiate wants to merge 1 commit into
coleam00:devfrom
ianstantiate:fix/revert-1923-loop-session
Closed

fix(workflows): restore interactive-loop cross-gate session continuity (reverts #1923)#2005
ianstantiate wants to merge 1 commit into
coleam00:devfrom
ianstantiate:fix/revert-1923-loop-session

Conversation

@ianstantiate

@ianstantiate ianstantiate commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Problem: #1923 forced a fresh Claude session on the first iteration after every interactive-loop approval gate, so the agent lost all memory of its prior turns across the gate (tracked in #2004).
  • Why it matters: It breaks the single most valuable use of interactive loops — a multi-turn conversation / interview / iterative refinement that a human steers across gates. Every post-gate turn started with no memory of the prior turns, seeing only $LOOP_USER_INPUT.
  • What changed: Removed the (isLoopResume && i === startIteration) term from needsFreshSession in dag-executor.ts, restoring loop.fresh_context || i === 1; reinstated the resume test assertion (sessionArg === 'loop-session-1'); removed fix(workflows): interactive loop resume uses fresh session on first iteration #1923's regression test that encoded the fresh-session behavior; reverted the loop-nodes.md doc note.
  • What did NOT change (scope boundary): #1291's fail-loud isError handling, fresh_context: true loops, non-interactive loops, and persist_session loops are all untouched. This PR does not attempt to diagnose or fix the original #1208 crash — that is separate work.

UX Journey

Before

User                     Archon (interactive loop)          Claude SDK
────                     ─────────────────────────          ──────────
approve gate "feedback" ▶ resume at iteration 2
                          needsFreshSession = TRUE           (isLoopResume && i === startIteration)
                          resumeSessionId = undefined ──────▶ FRESH session — no memory of iter 1
sees context-free reply ◀─ agent only has $LOOP_USER_INPUT

After

User                     Archon (interactive loop)          Claude SDK
────                     ─────────────────────────          ──────────
approve gate "feedback" ▶ resume at iteration 2
                          [needsFreshSession = false]        (fresh_context=false, i>1)
                          [resumeSessionId = gate session] ─▶ RESUMES the iteration-1 session
sees context-aware reply◀─ agent retains prior turns + the new $LOOP_USER_INPUT

Architecture Diagram

Before

dag-executor.ts :: executeLoopNode  (isLoopResume=true, startIteration=2)
  needsFreshSession = fresh_context || i===1 || (isLoopResume && i===startIteration)  ← TRUE on resume
  resumeSessionId   = undefined
  aiClient.sendQuery(prompt, cwd, undefined, opts)   ← fresh session, prior context lost

After

dag-executor.ts :: executeLoopNode  (isLoopResume=true, startIteration=2)
  [~] needsFreshSession = fresh_context || i===1        ← false on resume
  resumeSessionId       = currentSessionId (gate sessionId)
  aiClient.sendQuery(prompt, cwd, gateSessionId, opts)  ← threads the stored session

Connection inventory:

From To Status Notes
dag-executor.ts::executeLoopNode aiClient.sendQuery modified resumeSessionId again threads the stored gate session on the first resumed iteration (was forced to undefined by #1923)
dag-executor.ts::executeLoopNode loopGateMeta.sessionId modified now passed through as resumeSessionId again (was read then discarded)
loop iteration result handler #1291 isError fail-loud throw unchanged a genuine resume failure still throws loudly with errors[]

Label Snapshot

  • Risk: risk: low
  • Size: size: XS
  • Scope: workflows (+ docs, tests)
  • Module: workflows:dag-executor (executeLoopNode)

Change Metadata

  • Change type: bug (revert of a regression)
  • Primary scope: workflows

Linked Issue

Validation Evidence (required)

bun run validate was run on this branch. Per-step results:

bun run check:bundled         # ✅ pass
bun run check:bundled-skill   # ✅ pass
bun run check:bundled-schema  # ✅ pass
bun run check:pi-vendor-map   # ✅ pass (OK)
bun run type-check            # ✅ exit 0 — all 10 packages
bun run lint --max-warnings 0 # ✅ exit 0
bun run format:check          # ✅ changed files clean (see note)
bun run test                  # ✅ 5144 pass / 0 fail — all packages

Security Impact (required)

  • New permissions/capabilities? No
  • New external network calls? No
  • Secrets/tokens handling changed? No
  • File system access scope changed? No
  • This is a behavior revert in the workflow loop executor; no security surface is touched.

Compatibility / Migration

  • Backward compatible? Yes — restores the behavior shipped in the latest tagged release; the reverted behavior was dev-only and unreleased.
  • Config/env changes? No
  • Database migration needed? No

Human Verification (required)

Side Effects / Blast Radius (required)

Rollback Plan (required)

Risks and Mitigations

Summary by CodeRabbit

  • Bug Fixes

    • Fixed loop node session management during interactive resume: session IDs are now properly preserved when resuming iterations, unless fresh_context is explicitly enabled.
  • Documentation

    • Updated loop node guide to clarify fresh_context behavior during iterations.

Reverts the change from coleam00#1923, which forced a fresh Claude session on the
first iteration after every interactive-loop approval gate by adding
`(isLoopResume && i === startIteration)` to `needsFreshSession` in
dag-executor.ts. That broke cross-gate conversation continuity: every turn
after a human gate began a context-free session carrying only
$LOOP_USER_INPUT (with $LOOP_PREV_OUTPUT empty), making multi-turn
interactive loops (interviews, iterative refinement) impossible.

Restores the released behavior `needsFreshSession = loop.fresh_context ||
i === 1`, reinstates the test assertion guarding cross-gate continuity
(sessionArg === 'loop-session-1'), and removes coleam00#1923's regression test that
encoded the fresh-session behavior. coleam00#1291's fail-loud isError handling is
left untouched. Confirming and fixing the original coleam00#1208 crash is separate
work.

Closes coleam00#2004
@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 25855e56-a5ef-45a8-a9aa-e9e1cd000398

📥 Commits

Reviewing files that changed from the base of the PR and between e77a338 and d6cb519.

📒 Files selected for processing (3)
  • packages/docs-web/src/content/docs/guides/loop-nodes.md
  • packages/workflows/src/dag-executor.test.ts
  • packages/workflows/src/dag-executor.ts

📝 Walkthrough

Walkthrough

Reverts a regression introduced in PR #1923: the needsFreshSession expression in executeLoopNode is simplified to loop.fresh_context || i === 1, removing the clause that forced a fresh AI session on the first iteration after an interactive loop resume. The test is updated to assert the stored gate session id is reused, and the corresponding docs note is trimmed.

Changes

Interactive Loop Session Threading Regression Fix

Layer / File(s) Summary
Loop session freshness logic and test
packages/workflows/src/dag-executor.ts, packages/workflows/src/dag-executor.test.ts
needsFreshSession is changed to loop.fresh_context || i === 1, dropping the isLoopResume && i === startIteration clause. The test assertion for the resumed iteration is updated to expect the stored gate session id (loop-session-1) instead of undefined.
fresh_context docs cleanup
packages/docs-web/src/content/docs/guides/loop-nodes.md
Removes the sentence describing the always-fresh first-resumed-iteration behavior and the $LOOP_USER_INPUT note, matching the restored runtime behavior.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

  • coleam00/Archon#1923: Directly introduced the isLoopResume && i === startIteration clause in executeLoopNode that this PR reverts, along with the docs wording and test expectations being changed here.
  • coleam00/Archon#1367: Also modifies executeLoopNode interactive loop resume first-iteration behavior in dag-executor.ts/dag-executor.test.ts, specifically the $LOOP_PREV_OUTPUT substitution on the same resumed iteration path.

Poem

🐇 Hop, hop, the session lives on,
No fresh start when the gate is gone!
The context flows from turn to turn,
A regression reversed — lesson learned.
The loop remembers every word I've said,
No more context lost, no blank slate dread! 🎉

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and specifically describes the main change: reverting PR #1923 to restore session continuity in interactive loops after approval gates.
Description check ✅ Passed The description comprehensively covers all template sections: problem, impact, changes made, scope boundaries, UX journey, architecture diagrams, metadata, validation evidence, security, compatibility, human verification, side effects, rollback plan, and risks.
Linked Issues check ✅ Passed The PR fully satisfies all four coding objectives from issue #2004: removes the (isLoopResume && i === startIteration) clause, restores session continuity with stored gate session IDs, preserves the original behavior with fresh_context handling intact, and clearly maintains scope boundaries.
Out of Scope Changes check ✅ Passed All three file changes are directly in-scope: dag-executor.ts session logic revert, dag-executor.test.ts regression test removal, and loop-nodes.md documentation revert. No unrelated changes detected.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Wirasm

Wirasm commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator

Thanks @ianstantiate — good catch, and the right call. #1923 traded away the single most valuable interactive-loop behavior (cross-gate session continuity) to "fix" #1208, and this restores it cleanly. It also lines up with the original #1208 investigation: error_during_execution arrives as a yielded isError: true result (environmental — Docker/VPS, OAuth refresh, etc.), not the thrown stale-session rejection — so forcing a fresh session on resume was never actually addressing the root cause.

I ran a multi-agent review over the diff. Summary below.

✅ Confirmed solid

Recommended hardening (non-blocking — small, just to lock it in)

  • I1 — the 2-call integration test never asserts the resumed session id. One-liner at that test: expect(resumeSessionArg).toBe('loop-session-1').
  • I2 — the re-exposed fix: interactive loop resume crashes with error_during_execution (stale session) #1208 path isn't tested on the resumed-interactive path (the existing loud-fail test only covers a non-interactive fresh run). A ~45-line test — resumed interactive loop + stale session → passes the stale id, fails loudly, one iteration only — would document the restored contract and guard against silent regression.
  • I3 — the bare // Session threading comment no longer records why resume threads the stored session (the fix(workflows): interactive loop resume uses fresh session on first iteration #1923 comment explaining the opposite was removed). A 1–2 line note would help, e.g. "on resume startIteration > 1 so i === 1 is false → the stored session threads forward intentionally; a stale session surfaces via the fail-loud guard below." Saves the next reader from guessing intent vs. oversight.
  • S1 — restore the deleted test's no-failure-events assertion into the kept test (~10 lines) — recovers its most valuable invariant.
  • S2 — docs: the ### interactive section doesn't mention that sessions thread across gates (the core restored UX), and there's no documented fresh_context: true stopgap for environments that still hit fix: interactive loop resume crashes with error_during_execution (stale session) #1208. Worth adding both.
  • S3 — heads up: the branch is ~10 commits behind dev and mergeability shows UNKNOWN; a rebase + re-check before merge.

One follow-up worth filing (pre-existing, not introduced here)

For non-Claude providers (Codex/Pi/Copilot/OpenCode), a cold-resume falls back to a fresh session and emits a ⚠️ system chunk — but executeLoopNode's stream loop has no msg.type === 'system' handler, so that warning is silently dropped (and #1842's cold-resume signal doesn't reach the loop path). So the observability net is real for Claude but absent for the others on the loop path. Not a blocker for this PR — just flagging it for a separate issue.


Net: the core revert is good to go. Happy for you to take the I1–I3 polish if you'd like (they're small) — or if you'd rather, hand it over and I'll finish it off. Whichever's easier. Either way, thanks for catching this.

@Wirasm

Wirasm commented Jun 27, 2026

Copy link
Copy Markdown
Collaborator

Taking this over to get it across the line — thanks @ianstantiate! I rebased your fix onto current dev (where dag-executor.ts has since changed via #1842 et al.), re-validated it (type-check + 286 dag-executor tests green), and opened it as #2046 with your commit authorship preserved. The fork-PR CI wasn't running here, so re-homing on origin lets the full suite run automatically. Closing in favor of #2046 (Closes #2004). Appreciate the clean, well-documented revert!

@Wirasm Wirasm closed this Jun 27, 2026
Wirasm added a commit that referenced this pull request Jun 27, 2026
…y (reverts #1923) (#2046)

Reverts #1923 so interactive loops re-thread the stored gate session on the first post-gate iteration, restoring cross-gate conversation continuity. Closes #2004. Takeover of #2005.

Co-authored-by: ianstantiate <251394470+ianstantiate@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Loops with 'interactive:true' always start a fresh AI session, losing previous-loop conversation continuity (regression introduced by #1923)

2 participants