feat(workflows): loop_group node — cross-node subgraph looping#2032
feat(workflows): loop_group node — cross-node subgraph looping#2032xbeark wants to merge 14 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds ChangesLoop Group Node Support
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
packages/workflows/src/dag-executor.ts (1)
3911-3925: 🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy liftPass the prefixed step name into normal node execution.
Loop-group body prompt/command nodes still call
executeNodeInternal()with the raw node, and that helper writesnode_started,node_completed,node_failed, and tool events usingstep_name: node.id. Those events will collide across loop iterations/body scopes despitestepNamePrefix; add an event-step-name parameter or equivalent prefixing path for the internal executor.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/workflows/src/dag-executor.ts` around lines 3911 - 3925, Pass the prefixed step name through normal node execution so loop/body nodes emit unique events. Update the call path into executeNodeInternal() and its event-writing logic to use the prefixed step name instead of node.id for node_started, node_completed, node_failed, and tool events, reusing the existing stepNamePrefix handling from the loop-group execution flow.packages/workflows/src/schemas/loop.ts (1)
18-41: 🎯 Functional Correctness | 🟠 Major | 🏗️ Heavy lift
loop_groupstill can't terminate viauntil_bash-only or fixed-count-only configs.Lines 21-23 make both
untilandmax_iterationsmandatory, so workflows using onlyuntil_bashor onlymax_iterationsare rejected before execution even though the PR contract describes those as valid completion modes. This needs a cross-field refinement instead of unconditional required fields.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/workflows/src/schemas/loop.ts` around lines 18 - 41, The loopControlSchema currently requires both loop.until and loop.max_iterations unconditionally, which blocks valid loop_group configs that should terminate via until_bash alone or via fixed-count only. Change the z.object definition in loopControlSchema to make until and max_iterations optional, then add superRefine cross-field validation to enforce at least one valid completion mode (until, max_iterations, or until_bash) while preserving the interactive/gate_message check. Use the existing loopControlSchema and superRefine logic to keep the schema constraints centralized.
🧹 Nitpick comments (2)
packages/workflows/src/dag-executor.test.ts (2)
10193-10242: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick winMake the
until_bashshort-circuit observable.This test says
until_bashshould not run, butuntil_bash: 'false'does not prove that. Use a side-effect sentinel/counter and assert it was not created when the AI signal is present.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/workflows/src/dag-executor.test.ts` around lines 10193 - 10242, The EDGE G test for loop_group short-circuiting does not actually prove that until_bash was skipped because using until_bash: 'false' still executes the command. Update the test around executeDagWorkflow and the loop_group setup to use a side-effect sentinel or counter tied to until_bash execution, then assert that it was never triggered when the until signal is emitted; keep the existing mockSendQueryDag flow and verify completion is still driven by the signal path.
10421-10512: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick winAdd a resume regression for
$LOOP_PREVin loop_group bodies.The resumed interactive test only checks
$LOOP_USER_INPUT; it should also cover$LOOP_PREV.<bodyNode>.outputsurviving the pause/resume boundary so the previous iteration’s body output is not lost.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/workflows/src/dag-executor.test.ts` around lines 10421 - 10512, The resumed interactive loop_group test in dag-executor.test.ts only verifies $LOOP_USER_INPUT, so extend the existing INTERACTIVE resumed loop_group case to also assert that $LOOP_PREV.work.output is preserved across the pause/resume boundary. Use the current loop_group setup around executeDagWorkflow, makeWorkflowRun, and the work node prompt to capture a previous-iteration body output before pausing, then resume with metadata.approval and verify the next iteration can still read that prior output via $LOOP_PREV.<bodyNode>.output.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/docs-web/src/content/docs/guides/loop-nodes.md`:
- Around line 428-439: Clarify the `persist_session` note in the `loop_group`
docs by separating it from `fresh_context`: `persist_session` is a node-level AI
field handled via `LOOP_GROUP_NODE_AI_FIELDS` and is ignored at runtime with a
warning, while `fresh_context` controls per-iteration context reset. Update the
unsupported-items section in `loop-nodes.md` to state the correct behavior using
those symbols so it no longer implies `persist_session` governs iteration
behavior.
- Around line 356-358: The `fresh_context` example in the loop-nodes guide has
an inverted explanation: in the `fresh_context` snippet, the comment currently
describes reset behavior while the value is `false`. Update the inline comment
near `fresh_context` to match the actual semantics in this section, or switch
the example to `true` if you want to demonstrate resetting body sessions each
iteration; use the `fresh_context` setting in the loop example as the anchor for
the edit.
In `@packages/workflows/src/dag-executor.ts`:
- Around line 2441-2455: The loop-group max-iteration failure path in
dag-executor should also emit/persist a node-level failure event before
returning. In the branch that builds errorMsg and logs
loop_group_node.max_iterations_reached, add the same node_failed handling used
by other failed exits for the loop_group node (using the node.id and failure
state) so UI/event consumers receive the failure even when state is 'failed'.
Keep the existing safeSendMessage and returned failure payload unchanged, and
update the loop-group failure flow consistently in dag-executor.
- Around line 2235-2236: The loop-group session threading in dag-executor is
resetting `lastSequentialSessionId` on every iteration, so `fresh_context:
false` is ignored. Update the loop iteration handling in the DAG executor to
keep a loop-level session cursor across iterations, and only set
`lastSequentialSessionId` back to undefined when `group.fresh_context` is true
or when starting the first iteration. Make the change in the loop-group
execution path where `lastSequentialSessionId` is assigned so the existing
session continues correctly between iterations.
- Around line 2127-2136: Persist the previous loop body outputs across
interactive resume so $LOOP_PREV remains resolvable after restarting the loop.
In dag-executor.ts, update the interactive resume path in the loop execution
flow around isLoopResume/startIteration and the later loop handling near the
second referenced block so loopPrevOutputs is restored from saved metadata or
reconstructed from the prior iteration’s namespaced body outputs instead of
always starting as undefined. Use the existing executeLoopNode/loop resume logic
and the node.id/iteration metadata to load the prior snapshot before continuing
iteration 2.
- Around line 2212-2222: Propagate the parent workflow settings into loop-group
body execution: `executeLoopGroupNode()` currently builds `iterCtx` with
`workflowModel`, `workflowLevelOptions`, `aiProfile`, and `workflowPreset` reset
to undefined, which drops inherited model aliases and workflow defaults. Update
`executeLoopGroupNode()` to accept and forward the caller’s `runLayers` values,
then construct `iterCtx` from those values so body nodes inherit
`workflowModel`, `workflowLevelOptions`, `aiProfile`, and `workflowPreset`
according to the existing priority rules.
- Around line 2268-2272: In the DAG executor’s group completion logic,
short-circuit the external until_bash execution when detectCompletionSignal has
already marked the group complete. Update the flow around signalDetected and the
until_bash branch in dag-executor’s group handling so the bash command only runs
when signalDetected is false, keeping the existing completion state intact and
avoiding unnecessary side effects.
- Around line 2475-2496: The bash-body substitution in applyLoopPrevToBodyNode
is injecting $LOOP_PREV values without shell escaping, so update the bash path
to preserve shell safety the same way substituteNodeOutputRefs(..., true) does.
Keep the existing loop_user_input handling, but make the bash branch use escaped
output for prior-loop references before calling sub() so values in node.bash
cannot introduce executable shell metacharacters.
- Around line 480-493: The loop-previous output handling in dag-executor should
treat skipped or pending prior body outputs as absent before any field lookup.
Update the nodeOutput path in the cross-iteration resolution logic so that,
alongside the existing no-prior-output check, it returns empty immediately when
the prior node’s state is skipped or pending instead of calling
resolveNodeOutputField(). Keep the behavior in the surrounding loop output
resolution code consistent with the existing empty-return handling for missing
prior outputs.
In `@packages/workflows/src/schemas/dag-node.ts`:
- Around line 710-713: The loop-group transform in dag-node.ts is dropping
supported AI override fields, so the DagNode built for loop_group loses model
and provider. Update the loop_group branch in the transform that returns
LoopGroupNode to carry through the AI override fields allowed by
LOOP_GROUP_NODE_AI_FIELDS, preserving them alongside loop_group when
constructing the parsed node. Make sure the fix is applied where the DagNode is
assembled so the loader, validator, and executor all receive the preserved
model/provider values.
---
Outside diff comments:
In `@packages/workflows/src/dag-executor.ts`:
- Around line 3911-3925: Pass the prefixed step name through normal node
execution so loop/body nodes emit unique events. Update the call path into
executeNodeInternal() and its event-writing logic to use the prefixed step name
instead of node.id for node_started, node_completed, node_failed, and tool
events, reusing the existing stepNamePrefix handling from the loop-group
execution flow.
In `@packages/workflows/src/schemas/loop.ts`:
- Around line 18-41: The loopControlSchema currently requires both loop.until
and loop.max_iterations unconditionally, which blocks valid loop_group configs
that should terminate via until_bash alone or via fixed-count only. Change the
z.object definition in loopControlSchema to make until and max_iterations
optional, then add superRefine cross-field validation to enforce at least one
valid completion mode (until, max_iterations, or until_bash) while preserving
the interactive/gate_message check. Use the existing loopControlSchema and
superRefine logic to keep the schema constraints centralized.
---
Nitpick comments:
In `@packages/workflows/src/dag-executor.test.ts`:
- Around line 10193-10242: The EDGE G test for loop_group short-circuiting does
not actually prove that until_bash was skipped because using until_bash: 'false'
still executes the command. Update the test around executeDagWorkflow and the
loop_group setup to use a side-effect sentinel or counter tied to until_bash
execution, then assert that it was never triggered when the until signal is
emitted; keep the existing mockSendQueryDag flow and verify completion is still
driven by the signal path.
- Around line 10421-10512: The resumed interactive loop_group test in
dag-executor.test.ts only verifies $LOOP_USER_INPUT, so extend the existing
INTERACTIVE resumed loop_group case to also assert that $LOOP_PREV.work.output
is preserved across the pause/resume boundary. Use the current loop_group setup
around executeDagWorkflow, makeWorkflowRun, and the work node prompt to capture
a previous-iteration body output before pausing, then resume with
metadata.approval and verify the next iteration can still read that prior output
via $LOOP_PREV.<bodyNode>.output.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6f91c473-f38c-4cdd-8585-4da4521c5b19
📒 Files selected for processing (12)
packages/docs-web/src/content/docs/book/dag-workflows.mdpackages/docs-web/src/content/docs/guides/loop-nodes.mdpackages/paths/src/telemetry.tspackages/workflows/src/dag-executor.test.tspackages/workflows/src/dag-executor.tspackages/workflows/src/loader.test.tspackages/workflows/src/loader.tspackages/workflows/src/schemas.test.tspackages/workflows/src/schemas/dag-node.tspackages/workflows/src/schemas/index.tspackages/workflows/src/schemas/loop.tspackages/workflows/src/validator.ts
| max_iterations: 5 | ||
| fresh_context: false # reset body sessions each iteration (default false) | ||
| nodes: # sealed sub-DAG body — repeats as a unit |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Fix fresh_context comment — meaning is inverted.
Line 357 says fresh_context: false with the comment "reset body sessions each iteration (default false)". This is backwards: fresh_context: true resets sessions each iteration; false (the default) preserves context across iterations. Update the comment to:
- fresh_context: false # reset body sessions each iteration (default false)
+ fresh_context: false # preserve body sessions across iterations (default: false)Or if the intent was to show the reset behavior, change the value to true:
- fresh_context: false # reset body sessions each iteration (default false)
+ fresh_context: true # reset body sessions each iteration📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| max_iterations: 5 | |
| fresh_context: false # reset body sessions each iteration (default false) | |
| nodes: # sealed sub-DAG body — repeats as a unit | |
| max_iterations: 5 | |
| fresh_context: false # preserve body sessions across iterations (default: false) | |
| nodes: # sealed sub-DAG body — repeats as a unit |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/docs-web/src/content/docs/guides/loop-nodes.md` around lines 356 -
358, The `fresh_context` example in the loop-nodes guide has an inverted
explanation: in the `fresh_context` snippet, the comment currently describes
reset behavior while the value is `false`. Update the inline comment near
`fresh_context` to match the actual semantics in this section, or switch the
example to `true` if you want to demonstrate resetting body sessions each
iteration; use the `fresh_context` setting in the loop example as the anchor for
the edit.
| // Detect interactive loop resume (mirrors executeLoopNode). | ||
| const rawApproval = workflowRun.metadata?.approval; | ||
| const loopGateMeta = isApprovalContext(rawApproval) ? rawApproval : undefined; | ||
| const isLoopResume = loopGateMeta?.type === 'interactive_loop' && loopGateMeta.nodeId === node.id; | ||
| const startIteration = isLoopResume ? (loopGateMeta.iteration ?? 0) + 1 : 1; | ||
| const loopUserInput = isLoopResume | ||
| ? ((workflowRun.metadata?.loop_user_input as string | undefined) ?? '') | ||
| : ''; | ||
|
|
||
| let loopPrevOutputs: Map<string, NodeOutput> | undefined; // undefined on iteration 1 |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | 🏗️ Heavy lift
Persist enough loop state to restore $LOOP_PREV after interactive resume.
On resume, startIteration advances, but loopPrevOutputs is reinitialized as undefined, so resumed iteration 2 cannot resolve $LOOP_PREV.<bodyNode>.output from iteration 1. Store/reload the previous body-output snapshot when pausing, or reconstruct it from namespaced body events before continuing.
Also applies to: 2418-2424
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/workflows/src/dag-executor.ts` around lines 2127 - 2136, Persist the
previous loop body outputs across interactive resume so $LOOP_PREV remains
resolvable after restarting the loop. In dag-executor.ts, update the interactive
resume path in the loop execution flow around isLoopResume/startIteration and
the later loop handling near the second referenced block so loopPrevOutputs is
restored from saved metadata or reconstructed from the prior iteration’s
namespaced body outputs instead of always starting as undefined. Use the
existing executeLoopNode/loop resume logic and the node.id/iteration metadata to
load the prior snapshot before continuing iteration 2.
| // Max iterations exceeded. | ||
| const errorMsg = `Loop-group node '${node.id}' exceeded max iterations (${String(group.max_iterations)}) without completion signal '${group.until}'`; | ||
| getLog().warn( | ||
| { nodeId: node.id, maxIterations: group.max_iterations, signal: group.until }, | ||
| 'loop_group_node.max_iterations_reached' | ||
| ); | ||
| await safeSendMessage(platform, conversationId, errorMsg, msgContext); | ||
| return { | ||
| state: 'failed', | ||
| output: lastIterationOutput, | ||
| error: errorMsg, | ||
| costUsd: loopTotalCostUsd, | ||
| ...(loopTotalTokens !== undefined ? { tokens: loopTotalTokens } : {}), | ||
| loopIterations: group.max_iterations, | ||
| }; |
There was a problem hiding this comment.
🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
Emit a node_failed event for loop-group failure exits.
The max-iteration path returns state: 'failed' without persisting/emitting a node_failed event for the loop_group node. UI/event consumers can miss the node-level failure even though the workflow later fails.
🧰 Tools
🪛 ESLint
[error] 2443-2443: Unsafe call of a type that could not be resolved.
(@typescript-eslint/no-unsafe-call)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/workflows/src/dag-executor.ts` around lines 2441 - 2455, The
loop-group max-iteration failure path in dag-executor should also emit/persist a
node-level failure event before returning. In the branch that builds errorMsg
and logs loop_group_node.max_iterations_reached, add the same node_failed
handling used by other failed exits for the loop_group node (using the node.id
and failure state) so UI/event consumers receive the failure even when state is
'failed'. Keep the existing safeSendMessage and returned failure payload
unchanged, and update the loop-group failure flow consistently in dag-executor.
Introduce the loop_group DAG node variant — a cross-node looping construct that repeats a multi-node sub-DAG body until a completion condition (`until` signal / `until_bash` exit code) or max_iterations. This extends today's intra-node `loop:` (single prompt) to subgraph iteration. Schema (T2): - Factor shared iteration-control fields (until, max_iterations, fresh_context, until_bash, interactive, gate_message) out of loopNodeConfigSchema into loopControlSchema; loop: now extends it. - Add loopGroupNodeConfigSchema = loopControlSchema + nodes: body (a recursive z.array of dagNodeSchema, via zod v4 getter pattern with explicit z.ZodType<LoopGroupNodeConfig> annotation to break the type-inference cycle). - Add LoopGroupNode type, loopGroupNodeSchema, isLoopGroupNode guard. - Wire loop_group into the flat dagNodeSchema (mutual-exclusivity superRefine, transform dispatch, DagNode union, retry-not-supported check, isPersistableNode exclusion, LOOP_GROUP_NODE_AI_FIELDS). - Export new symbols from schemas/index.ts; add 'loop_group' to WorkflowNodeType (paths/telemetry). Dispatch (T6, partial): - Add isLoopGroupNode branch in the DAG node dispatch (adjacent to isLoopNode), reusing resolveNodeProviderAndModel for provider/model forwarding. - executeLoopGroupNode is a stub that fails fast with a clear "not yet implemented" message — the per-iteration body runner (T5) lands in a follow-up. This keeps the branch compiling and lets a loop_group workflow validate/load even before full execution exists. No behavior change for existing node types. Existing schemas/loader/ dag-executor tests pass (workflows package test script, all 17 batches). Refs: .claude/archon/plans/subgraph-loop-group-node.plan.md (tasks T2, T6)
Extract the topological-layer execution loop out of executeDagWorkflow into
a reusable runLayers(ctx) helper. This is a pure, behavior-preserving
refactor that unblocks loop_group body iteration (T5), where the same
layer-walking logic runs over a subgraph per iteration.
- New RunLayersContext interface bundles run-level invariants (deps,
platform, run record, resolved provider/model/options, paths, config)
with per-subgraph mutable state (nodes, layers, nodeOutputs, session
threading cursor, usage accumulators, resume cache, stepNamePrefix).
- runLayers walks ctx.layers, executes each layer's nodes concurrently,
aggregates results into ctx.nodeOutputs, and accumulates usage into ctx.
Stops early on a non-running between-layer status (paused/cancelled/
deleted); the caller always proceeds to its own terminal tally.
- executeDagWorkflow now builds a top-level context (stepNamePrefix: '')
and awaits runLayers(ctx), then reads the mutated accumulators back.
nodeOutputs/layers remain local to executeDagWorkflow and are shared by
reference with the context, so the terminal tally sees runLayers' writes.
- The only structural addition vs. the former inline loop is
ctx.stepNamePrefix, prepended to every emitted event step_name. For the
top-level DAG it is '' so step_names are the raw node ids — identical to
pre-refactor behavior. (loop_group bodies will use '{groupId}.'.)
Verified behavior-preserving: all 285 dag-executor tests pass, plus the
full workflows package test suite (all 17 isolated batches). Repo-wide
type-check clean; lint zero warnings; prettier clean.
Refs: .claude/archon/plans/subgraph-loop-group-node.plan.md (task T1)
…(T4) Add the $LOOP_PREV.<nodeId>.output[.field] substitution that loop_group body nodes use to reference a sibling node's output from the *previous* iteration — the cross-node analog of the single-node loop's $LOOP_PREV_OUTPUT. - substituteLoopPrevRefs(prompt, loopPrevOutputs, escapedForBash?) mirrors substituteNodeOutputRefs' strict semantics: reuses resolveNodeOutputField for field access (declared-schema typo / schemaless non-JSON / missing key → throws OutputRefError, propagating to the consuming node's failure; only author-declared-optional resolves to ''). - A ref with no prior-iteration output (iteration 1, or the node was skipped last iteration) resolves to '' rather than throwing — absence on the first pass is expected, not an authoring error. - Fast-path returns the prompt unchanged when it contains no '$LOOP_PREV.' token; refs present on iteration 1 (empty/undefined map) still resolve to '' via the no-prior-output branch. Consumed by executeLoopGroupNode (T5). Verified with an inline smoke test: prior-iter ref resolves, missing-node and iter-1 refs resolve to ''. Refs: .claude/archon/plans/subgraph-loop-group-node.plan.md (task T4)
Replace the loop_group stub with the full per-iteration body runner,
mirroring executeLoopNode at subgraph granularity.
Per iteration:
- Emit loop_iteration_started (step_name = groupId).
- Pre-substitute $LOOP_PREV.* refs into body node prompt fields via the new
applyLoopPrevToBodyNode helper (handles prompt/bash/command/script/
approval.message/cancel + nested loop_group recursion). On iteration 1
the prior-iteration snapshot is undefined → refs resolve to ''.
- Build a fresh scoped nodeOutputs (seeded read-only with the outer DAG's
upstream outputs so body prompt refs to outer nodes still resolve).
- Run the body's pre-computed topological layers via runLayers with
stepNamePrefix '{groupId}.' so body node events are namespaced.
- Snapshot the iteration's body outputs as the next iteration's
$LOOP_PREV source.
- Completion gate: detectCompletionSignal on the body's terminal-node
output (first completed terminal node in definition order) and/or
until_bash exit 0. Max_iterations exceeded → failed.
- Interactive gate: pauseWorkflowRun({type:'interactive_loop', nodeId:
groupId, iteration}) + return completed (the outer between-layer check
sees 'paused' and halts). Resume re-runs the current iteration's body
in full (v1 — body priorCompletedNodes is undefined inside the group).
- Usage (cost/tokens) summed across iterations and returned on the final
NodeExecutionResult, so the outer runLayers aggregates the group as one
node's worth of usage.
$groupId.output (visible to the outer DAG) = the final iteration's
terminal-node output.
Tests (in dag-executor.test.ts):
- completes a loop_group when the until signal appears on iteration N
(2-iteration path: iter 1 no signal → iter 2 DONE → complete).
- fails when max_iterations is exceeded without the until signal (3 iters,
no signal → run failed, no terminal output).
Both pass; existing 285 dag-executor tests unchanged. Type-check clean,
lint zero warnings, prettier clean.
Refs: .claude/archon/plans/subgraph-loop-group-node.plan.md (tasks T5, T9)
validateDagStructure now recurses into each loop_group node's `nodes:` body, validating it as a sealed scoped sub-DAG (unique ids, depends_on resolves within the body, no cycles, valid $nodeId.output refs). The outer-DAG cycle/depends_on checks treat each loop_group as one outer node; body-internal edges don't count as outer edges (so a body a→b→a falsely flags only the body, not the outer DAG). Nested loop_groups recurse naturally. Errors are prefixed `loop_group '<id>' body: <inner error>` so the offending group is identifiable. Also adds loop_group to the non-AI-node AI-field-ignored warning set (LOOP_GROUP_NODE_AI_FIELDS — same as loop: since model/provider are forwarded to body AI nodes), alongside loop/approval/cancel/script/bash. Tests (loader.test.ts): - rejects a loop_group with a cyclic body. - rejects a body depends_on referencing an unknown node. - accepts a well-formed loop_group. All 134 loader tests pass. Refs: .claude/archon/plans/subgraph-loop-group-node.plan.md (task T3)
validateWorkflowResources now flattens top-level nodes together with every loop_group body (recursing into nested loop_groups) before running per-node resource checks (command files, MCP configs, skill dirs, named scripts). This ensures a loop_group body referencing a nonexistent command/mcp/skill is flagged at validate time, just like top-level nodes. ID-uniqueness and cycle detection remain the loader's responsibility (validateDagStructure); the validator only checks referenced resources exist, so flattening is safe here. All 55 validator tests pass. Refs: .claude/archon/plans/subgraph-loop-group-node.plan.md (task T8)
- loop-nodes.md: add a "Cross-Node Loops with loop_group" section covering syntax, how it works (sealed body, namespaced events, terminal output), $LOOP_PREV.<id>.output cross-iteration refs, shared config fields, resume semantics (re-runs whole iteration body), and the v1 NOT-supported list (retry, persist_session across iterations, per-body-node resume, $LOOP_PREV history indexing). - dag-workflows.md: bump node-type table to eight types and add the Loop Group row. Refs: .claude/archon/plans/subgraph-loop-group-node.plan.md (task T10)
…ering bug Add four end-to-end loop_group test instances (real engine, mocked provider) that exercise distinct topologies: 1. Multi-node body (implement→test→review): completes on iteration 2 when the AI review node emits the until signal. Asserts the 3-node sealed body runs as a unit per iteration and only the AI node calls sendQuery. 2. $LOOP_PREV cross-iteration ref: body prompt references $LOOP_PREV.work.output. Asserts iteration 1's prompt has no prior output (resolves to '') and iteration 2's prompt carries iteration 1's output. 3. until_bash deterministic gate: pure-bash body (no AI) increments a counter file; until_bash exits 0 once the counter reaches 2. Asserts the counter ends at 2 and the group output is the last iteration's body stdout. 4. Single-node body: degenerates like loop: — completes in 1 iteration when the until signal appears immediately. Bug fix uncovered by instance 2: body layers were computed once from the original group.nodes, but runLayers walks ctx.layers — so the $LOOP_PREV-substituted body nodes (iterBodyNodes) were never actually walked, and $LOOP_PREV refs stayed unresolved. Now re-layers from iterBodyNodes each iteration (depends_on shape is static, so layering is stable; only prompt text changes). The pre-computed bodyLayers is removed. All 291 dag-executor tests pass (285 prior + 2 earlier loop_group + 4 new).
Cover the semantic edge cases the instance tests didn't reach: - EDGE A: body node that fails every iteration → group runs to max_iterations and fails (failed body node isn't selected as terminal output, so no completion signal is ever detected). - EDGE B: body prompt references an outer-DAG upstream node via $setup.output — verifies outerNodeOutputs is seeded into the scoped map so outer context is reachable from the sealed body. - EDGE F (×2): $LOOP_PREV.<id>.output.<field> structured field access resolves from structuredOutput; a ref to a node absent from the prior iteration resolves to '' (not a throw). Unit-level against substituteLoopPrevRefs. - EDGE D: multi-terminal body (two parallel no-dependency AI nodes) — both run concurrently; the completion signal must appear in the first-completed terminal node (definition order) since that's the one selected as the group's terminal output. - EDGE C: fresh_context=true smoke — group completes normally with fresh-context sessions. - EDGE E: between-iteration cancellation — getWorkflowRunStatus flips to 'cancelled' after iteration 1, halting the loop (failed result). - EDGE G: until signal OR until_bash — signal alone completes even when until_bash always exits non-zero (short-circuit OR). - EDGE I: max_iterations=1 single-shot — completes in one iteration when the signal is present. - EDGE H: nested loop_group (loop_group body containing another loop_group) — smoke test that applyLoopPrevToBodyNode recurses and the inner group dispatches correctly. EDGE D surfaced a real behavior worth documenting: in a multi-terminal body, only the first-completed terminal node's output is checked for the until signal (it's the group's terminal output). Authors must place the signal on that node, or restructure. All 301 dag-executor tests pass (285 prior + 16 loop_group: 2 early + 4 instances + 10 edge).
…utput_type, schema Close the 4 remaining loop_group test gaps: Dimension 4 — interactive gate + resume (dag-executor.test.ts): - INTERACTIVE: fresh interactive loop_group pauses at the gate after iteration 1 (asserts pauseWorkflowRun called with type:'interactive_loop', nodeId:<groupId>, iteration:1, message:<gate_message>). - INTERACTIVE: resumed loop_group continues from the next iteration and completes — two-call pattern (call 1 pauses; call 2 resumes with metadata.approval carrying iter 1 + loop_user_input, iteration 2 emits the until signal). Asserts the resumed prompt carries $LOOP_USER_INPUT. This required a fix: $LOOP_USER_INPUT was not reaching loop_group body prompts (body executors call substituteWorkflowVariables with the run's user_message, not the loop's per-iteration input). applyLoopPrevToBodyNode now also substitutes $LOOP_USER_INPUT (resolved per-iteration from loopUserInput, '' except on the first resumed iteration of an interactive loop). The substitution now runs every iteration (not only when loopPrevOutputs is non-empty), since $LOOP_USER_INPUT must resolve on iteration 1 too. Dimension 3 — cost/token accumulation (dag-executor.test.ts): - COST: 2-iteration loop_group with per-iteration costs (0.01 + 0.02) accumulates to total_cost_usd: 0.03 on the run's completeWorkflowRun. Dimension 2 — output_type typed artifact (dag-executor.test.ts): - OUTPUT_TYPE: loop_group declaring output_type writes the sidecar (nodes/<groupId>.md) from the final iteration's terminal output. Dimension 1 — schema-layer unit tests (schemas.test.ts): - parses a valid loop_group with a recursive body; loop_group+prompt and loop_group+loop are mutually exclusive; retry is rejected on loop_group; body requires ≥1 node; until is required; nested loop_group body parses; LOOP_GROUP_NODE_AI_FIELDS mirrors LOOP_NODE_AI_FIELDS. All 305 dag-executor tests + 86 schemas tests pass. Type-check clean, lint zero warnings, prettier clean.
… rebase Rebase onto origin/dev (which added coleam00#2038 resolved-model/tier surfacing in node_started + coleam00#1842 cold-resume surfacing) left runLayers calling the updated executeNodeInternal/resolveNodeProviderAndModel without the new wiring, since runLayers was extracted from the pre-dev inline loop. - Destructure model: resolvedNodeModel + tier: resolvedTier from resolveNodeProviderAndModel in the AI-node path and forward them to executeNodeInternal (matches dev's node_started tier annotation). - Add the cold-resume surfacing block (provider reports resumed:false → warn + message the user; keep the fresh run, no replay) between the retry loop and the persist_session upsert. All 306 dag-executor tests + bun run validate pass on top of current dev.
a1950d5 to
0596f66
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (2)
packages/workflows/src/schemas/dag-node.ts (1)
710-713: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winPreserve loop-group
modelandproviderin the transform.This branch drops the group-level AI overrides even though they are intentionally allowed for forwarding. As-is, loader/validator/executor never see
loop_groupmodel/provider.Proposed fix
// loop_group — guaranteed by superRefine to be defined at this point if (data.loop_group !== undefined) { - return { ...base, loop_group: data.loop_group } as LoopGroupNode; + return { + ...base, + ...(data.model !== undefined ? { model: data.model } : {}), + ...(data.provider !== undefined ? { provider: data.provider } : {}), + loop_group: data.loop_group, + } as LoopGroupNode; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/workflows/src/schemas/dag-node.ts` around lines 710 - 713, The loop_group transform in dag-node.ts is discarding the group-level AI overrides, so preserve the loop_group model and provider when building the LoopGroupNode. Update the data.loop_group branch in the schema transform to forward those fields along with the existing loop_group value, and verify the LoopGroupNode shape and related transformer logic still accept the override properties.packages/workflows/src/dag-executor.ts (1)
2383-2401: 🎯 Functional Correctness | 🟠 Major | 🏗️ Heavy liftKeep workflow/group model options when entering loop-group bodies.
The body context resets
workflowModel, workflow-level options,aiProfile, andworkflowPreset, so body nodes lose inherited model aliases and workflow defaults. This remains the same unresolved issue previously flagged for loop-group body execution.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/workflows/src/dag-executor.ts` around lines 2383 - 2401, The loop-group body context in RunLayersContext is still resetting inherited workflow settings, causing body nodes to lose the parent model and defaults. Update the iterCtx construction in dag-executor so loop-group bodies preserve workflowModel, workflowLevelOptions, aiProfile, and workflowPreset from the incoming context instead of forcing them to undefined, while keeping the existing body-specific overrides intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/workflows/src/dag-executor.ts`:
- Around line 3576-3577: Thread the existing stepNamePrefix through delegated
node execution so event persistence uses the prefixed step name instead of raw
node.id. Update the runLayers path and the executeNodeInternal delegation for
normal AI nodes to accept and pass an event step name, and ensure
node_started/node_completed records emitted from executeNodeInternal use that
value consistently. This keeps loop-group body events distinct from top-level
nodes and avoids collisions in resume/event consumers.
- Around line 2422-2444: After runLayers(iterCtx) in executeLoopGroupNode, the
loop still proceeds into snapshot and completion handling even when a body
approval/cancel node has changed the workflow status. Add a
getWorkflowRunStatus() check immediately after runLayers and return early for
paused or any terminal state so loopPrevOutputs, terminal output selection, and
the next iteration are skipped when the workflow has already stopped.
- Around line 2654-2675: The loop substitution helper in applyLoopPrevToBodyNode
is inlining $LOOP_USER_INPUT into bash bodies via the same sub() path used for
prompts and messages, which can turn approval feedback into executable shell.
Update the isBashNode branch to use a shell-safe substitution path in
dag-executor.ts—keep raw user input out of direct string replacement and pass it
through env/quoted handling instead—while leaving loop/approval/script handling
unchanged.
---
Duplicate comments:
In `@packages/workflows/src/dag-executor.ts`:
- Around line 2383-2401: The loop-group body context in RunLayersContext is
still resetting inherited workflow settings, causing body nodes to lose the
parent model and defaults. Update the iterCtx construction in dag-executor so
loop-group bodies preserve workflowModel, workflowLevelOptions, aiProfile, and
workflowPreset from the incoming context instead of forcing them to undefined,
while keeping the existing body-specific overrides intact.
In `@packages/workflows/src/schemas/dag-node.ts`:
- Around line 710-713: The loop_group transform in dag-node.ts is discarding the
group-level AI overrides, so preserve the loop_group model and provider when
building the LoopGroupNode. Update the data.loop_group branch in the schema
transform to forward those fields along with the existing loop_group value, and
verify the LoopGroupNode shape and related transformer logic still accept the
override properties.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2a76445a-0e0a-4266-a67f-07b7fcf11118
📒 Files selected for processing (12)
packages/docs-web/src/content/docs/book/dag-workflows.mdpackages/docs-web/src/content/docs/guides/loop-nodes.mdpackages/paths/src/telemetry.tspackages/workflows/src/dag-executor.test.tspackages/workflows/src/dag-executor.tspackages/workflows/src/loader.test.tspackages/workflows/src/loader.tspackages/workflows/src/schemas.test.tspackages/workflows/src/schemas/dag-node.tspackages/workflows/src/schemas/index.tspackages/workflows/src/schemas/loop.tspackages/workflows/src/validator.ts
✅ Files skipped from review due to trivial changes (1)
- packages/docs-web/src/content/docs/book/dag-workflows.md
🚧 Files skipped from review as they are similar to previous changes (8)
- packages/workflows/src/schemas/index.ts
- packages/paths/src/telemetry.ts
- packages/workflows/src/validator.ts
- packages/workflows/src/loader.ts
- packages/workflows/src/schemas.test.ts
- packages/workflows/src/loader.test.ts
- packages/docs-web/src/content/docs/guides/loop-nodes.md
- packages/workflows/src/dag-executor.test.ts
| /** Prefix prepended to every `step_name` in emitted events ('' for top-level, '{groupId}.' for body). */ | ||
| stepNamePrefix: string; |
There was a problem hiding this comment.
🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift
Thread stepNamePrefix into delegated node event persistence.
runLayers documents prefixed body events, but Line 4097 delegates normal AI nodes to executeNodeInternal, which writes node_started/node_completed with raw step_name: node.id. Loop-group body events can collide with top-level node IDs and poison resume/event consumers. Pass an event step name into the delegated executors, or centralize event persistence behind runLayers.
Also applies to: 4097-4118
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/workflows/src/dag-executor.ts` around lines 3576 - 3577, Thread the
existing stepNamePrefix through delegated node execution so event persistence
uses the prefixed step name instead of raw node.id. Update the runLayers path
and the executeNodeInternal delegation for normal AI nodes to accept and pass an
event step name, and ensure node_started/node_completed records emitted from
executeNodeInternal use that value consistently. This keeps loop-group body
events distinct from top-level nodes and avoids collisions in resume/event
consumers.
Rebased onto origin/dev first (resolved the T1 runLayers conflict; ported upstream coleam00#2038 resolved-model/tier surfacing + coleam00#1842 cold-resume surfacing into runLayers so the extracted helper stays behavior-identical to dev's inline loop). Then addressed CodeRabbit's actionable findings: - coleam00#6 inherit workflow defaults in loop_group body: executeLoopGroupNode now accepts + forwards workflowModel/workflowLevelOptions/aiProfile/workflowPreset to the per-iteration runLayers ctx (was hardcoded undefined, dropping inherited model aliases + workflow options for body AI nodes). - coleam00#4 fresh_context:false session threading: replaced the per-iteration lastSequentialSessionId reset with a loop-level cursor that threads the body's final sequential session into the next iteration (reset only when fresh_context is true or on iteration 1) — so fresh_context:false actually resumes the prior iteration's conversation across iterations. - coleam00#8 shell-escape $LOOP_PREV in bash/script/cancel body fields: applyLoopPrevToBodyNode now passes escapedForBash=true for shell-bound fields (bash/script/cancel), matching substituteNodeOutputRefs' shell-safety contract. - coleam00#9 treat skipped/pending prior body output as absent: substituteLoopPrevRefs resolves to '' when the prior NodeOutput is skipped/pending (not just missing). - coleam00#7 short-circuit until_bash when the until-signal already completed: skips the until_bash subprocess when signalDetected is true (OR semantics). - coleam00#10 preserve model/provider AI overrides on loop_group parse: the transform now spreads aiOnly so group-level model/provider survive parsing (the executor forwards them to body AI nodes). Tests: - EDGE G (nitpick #N1): rewrote to use a sentinel file proving until_bash is short-circuited (not executed) when the until-signal completes. - Added INTERACTIVE resume test (nitpick #N2) locking the v1 behavior that $LOOP_PREV is NOT preserved across the pause/resume boundary (resolves to '' on the resumed iteration) — persisting it is a tracked follow-up (coleam00#5). Skipped (out of scope for this PR): - coleam00#1 namespaced step_name for executeNodeInternal body events (15 call sites; invasive — noted as known v1 limitation; runLayers' own skip/trigger/when events ARE namespaced). - coleam00#2 until/max_iterations optional for until_bash-only/fixed-count loops (existing loop: behavior; would change loop: semantics — separate PR). - coleam00#3 node_failed event on max-iter (loop: doesn't emit it either; consistent). - coleam00#5 persist $LOOP_PREV across interactive resume (tracked follow-up; test added locks current behavior). bun run validate green; 393 dag-executor+schemas tests pass.
|
Thanks for the thorough review — all addressed in Fixed
Rebase note — rebased onto current Deferred (out of scope for this PR)
|
…USER_INPUT Address the 2 new valid findings from CodeRabbit's second review (the other 3 were stale duplicates of already-fixed coleam00#6/coleam00#10, evaluated against the pre-rebase commit): - Post-body status check: after runLayers(iterCtx) returns, a body approval/ cancel node may have paused or cancelled the run mid-iteration. Re-check getWorkflowRunStatus before proceeding into snapshot/completion handling — a terminal/cancelled state stops the loop now (skipping terminal-output selection and the next iteration). `paused` is tolerated (mirrors executeLoopNode's between-iteration tolerance for sibling gates). - Shell-safe $LOOP_USER_INPUT in bash/script/cancel body fields: applyLoopPrevToBodyNode now shell-quotes loopUserInput before splicing it into shell-bound fields (escapedForBash=true path). User input is free-text; unquoted it could break or inject into the bash/script command. AI-bound fields (prompt/approval.message/command) keep the raw value. The namespaced-step_name finding (coleam00#1) remains deferred — 15 call sites in executeNodeInternal, invasive; runLayers' own events are already namespaced. bun run validate green; 307 dag-executor tests pass.
|
Second review processed in Already fixed (stale duplicates — evaluated against the pre-rebase commit):
Both were fixed in Newly fixed in
Still deferred:
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/workflows/src/dag-executor.ts`:
- Around line 2439-2446: In loop_group_node post-body handling inside
dag-executor.ts, the status check currently skips null so a deleted workflow can
keep running; update the getWorkflowRunStatus() result handling so null is
treated as a stop condition just like a deleted run. Use the existing
postBodyStatus, effectiveStatus, and getLog().info flow to make the early return
happen whenever the status is not running or paused, including null, so the node
exits before snapshot/completion handling.
- Around line 2692-2698: The `sub` helper in `dag-executor.ts` is resolving
`$LOOP_USER_INPUT` too early, which lets user-provided text be reprocessed by
`substituteLoopPrevRefs()`. Change the order so `$LOOP_PREV.*` placeholders are
resolved first and the loop user input is inserted or masked afterward, or
refactor to a single-pass resolver. Make sure the fix preserves the existing
`escapedForBash` behavior while keeping `loopUserInput` inert and unaffected by
workflow-ref substitution.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b0fd80eb-5788-4f65-84b0-5f7c513e2402
📒 Files selected for processing (1)
packages/workflows/src/dag-executor.ts
… splice order Address the 2 actionable nits from CodeRabbit's third review: - Post-body status check: treat a null getWorkflowRunStatus result (run row gone / deleted) as a stop condition, same as an explicit terminal state. Previously `postBodyStatus !== null && ...` let null fall through and the loop continued into snapshot handling for a deleted run. - $LOOP_USER_INPUT splice order: resolve $LOOP_PREV.* refs FIRST, then splice $LOOP_USER_INPUT into the result — so user input containing a literal "$LOOP_PREV." is not itself reprocessed as a workflow ref. Shell-quoting of user input (escapedForBash path) is preserved. bun run validate green; 307 dag-executor tests pass.
Summary
Describe this PR in 2-5 bullets:
loop:re-invokes its own single prompt per iteration). There is no construct that repeats a subgraph of multiple nodes (e.g.implement → test → review) until a condition is met.loop_groupDAG node variant that carries anodes:sub-DAG body and repeats the whole body per iteration until a completion condition (untilsignal and/oruntil_bashexit code) ormax_iterations. Mirrors the provenexecuteLoopNodepattern at subgraph granularity. The layer-execution loop is factored into a reusablerunLayers(...)helper (pure refactor, behavior-preserving). New cross-iteration ref$LOOP_PREV.<nodeId>.output.loop_groupnode. No native back-edges, no fixpoint scheduler. v1 resume re-runs the whole current iteration's body.persist_sessionacross iterations, per-body-node resume granularity,$nodeId.output[N]history indexing, and the Web WorkflowBuilder nested-nodes editor are explicitly deferred (see NOT Building in the plan).UX Journey
Before
After
Architecture Diagram
Before
After
Connection inventory:
executeDagWorkflowrunLayersrunLayersdispatchexecuteLoopGroupNodeisLoopGroupNodebranchexecuteLoopGroupNoderunLayersexecuteLoopGroupNodedetectCompletionSignalexecutor-sharedexecuteLoopGroupNodebuildTopologicalLayersLabel snapshot
risk: high(HIGH complexity; touches the core DAG executor; mitigated by pure-refactor spike first)size: XLworkflowsworkflows:executorChange Metadata
featureworkflowsLinked Issue
.claude/archon/research/2026-06-26-cross-node-looping.md; this PR implements.claude/archon/plans/subgraph-loop-group-node.plan.md)Validation Evidence (required)
All eight
bun run validatechecks pass (exit 0):check:bundled,check:bundled-skill,check:bundled-schema,check:pi-vendor-map, type-check, lint (--max-warnings 0), format check, and the full test suite.bun run validate # ✅ exit 0runLayersextraction is behavior-preserving — all 287 pre-existingdag-executortests pass unchanged (285 + 2 new loop_group tests).loop_groupcompletes in N iterations when theuntilsignal appears on iteration N; fails whenmax_iterationsis exceeded without the signal.depends_onto unknown node, accepts well-formedloop_group(3 new tests, 134 loader tests total).Security Impact (required)
No— new node variant; no new capability surface beyond existing AI/bash node execution already present in workflows.NoNoNo— body nodes use the same artifactsDir/logDir as the parent run; body node artifacts are namespaced{groupId}.{nodeId}.Compatibility / Migration
Yes— purely additive (new node variant). Existingloop:and all other node types unchanged. TherunLayersextraction is a behavior-preserving refactor covered by a golden test.NoNo— body node events reuse the existingremote_agent_workflow_eventstable with namespacedstep_name; no schema change (confirmed:getCompletedDagNodeOutputskeys onstep_nameonly).Human Verification (required)
To be completed as tasks land. Planned:
loop_groupcompletes in N iterations;$groupId.output= final iteration terminal output;$LOOP_PREV.<id>.outputresolves across iterations;max_iterations→ failed;interactivepause/resume;until_bashexit-code gate.loop:-equivalent); bodydepends_onto outer node id rejected; nestedloop_group(smoke-test only).loop_grouphardening (deferred); Web WorkflowBuilder nested editor (deferred).Side Effects / Blast Radius (required)
@archon/workflows(dag-executor, schemas, loader, validator, executor-shared).@archon/webonly if the recursivez.lazyschema affects OpenAPI type generation (verified in T2/T11).runLayersrefactor touches the single hottest code path in the workflow engine (every workflow run). Mitigation: it lands first as a pure refactor with a golden parity test, ideally as its own commit/PR-slice before any new node type.node_completedevents + outputs before/after refactor;bun run validategates CI.Rollback Plan (required)
git reverttheloop_groupcommits; therunLayersrefactor reverts independently (it's additive-factored, behavior-preserving). Removing theloop_groupvariant is a clean delete — no migration, no persisted state depends on it (body events are namespaced and naturally ignored if the variant is removed).loop_groupnode; workflows without it are unaffected.loop_groupnodes fail at load/validate if reverted; plain workflows unaffected.Risks and Mitigations
runLayersextraction introduces subtle behavior drift (session threading, usage accumulation, event ordering) in the engine's hottest path.z.lazy(() => dagNodeSchema)recursive schema breaks zod-openapi /@archon/webtype generation.bun run type-check+ webgenerate:types; fallback to depth-limited schema if recursive OpenAPI fails.loop:(body pauses mid-iteration vsloop:post-iteration).executeLoopNode:2065-2073carefully, adds an explicit resume test, documents the choice.Summary by CodeRabbit
until/until_bashandmax_iterations, including nested loop groups.$LOOP_PREV..., namespaced body events, and defined loop-group output as the terminal output from the final iteration.loop_group(examples, limitations, and resume behavior).