Skip to content

feat(phase-19): track D auto research lessons 54-57#203

Merged
rohitg00 merged 10 commits into
mainfrom
feat/phase-19-track-d2
May 27, 2026
Merged

feat(phase-19): track D auto research lessons 54-57#203
rohitg00 merged 10 commits into
mainfrom
feat/phase-19-track-d2

Conversation

@rohitg00

Copy link
Copy Markdown
Owner

Summary

Track D auto-research loop deep capstones (Phase 19, lessons 54-57). Four self-contained Python lessons that compose into a self-running research loop.

  • 54-paper-writer: LaTeX skeleton generator with figure injection, deterministic prose, and a manifest contract. Four validation gates fire before any file is written.
  • 55-critic-loop: Multi-turn revision over five fixed dimensions (clarity, novelty, evidence, methodology, related-work). Three stop conditions in priority order: target met, plateau detected, budget exhausted.
  • 56-iteration-scheduler: Async UCB1 scheduler with parallel slots, branch pruning, and fan-out triggers. Stdlib + numpy. Two budgets (max experiments, wall clock) enforced as hard limits.
  • 57-end-to-end-research-demo: Composes 54, 55, 56 through importlib path injection. Three seed hypotheses, six experiments, one best branch, one paper draft. Deterministic under fixed seed.

Stats

  • 4 lessons, 16 files added
  • 48 unit tests, all passing
  • 4 demos, all exit 0
  • 4 docs, 932-1056 words each, mermaid diagrams only
  • 4 quizzes, 6 questions each

Test plan

  • All four python3 code/main.py demos exit 0
  • python3 -m unittest discover -s code/tests passes per lesson (15 + 11 + 12 + 10 = 48)
  • Docs use mermaid only, all fences tagged, no em-dashes, no AI vocab
  • Atomic per-lesson commits
  • Stdlib + numpy only
  • No changes to site/, root README, or catalog.json

@coderabbitai

coderabbitai Bot commented May 26, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

This PR adds four complete capstone lessons (54–57) to Phase 19, building an end-to-end research automation pipeline. It includes a LaTeX paper generator, an iterative refinement loop with heuristic scoring, a parallel hypothesis scheduler with UCB branch selection, and an orchestrator composing all three. The catalog is updated to reflect new lesson and file counts, and prior capstone projects are refactored with expanded module structures.

Changes

Catalog and Prior Capstone Refactoring

Layer / File(s) Summary
Metadata and module structure updates
catalog.json
Totals updated (lessons 435→439, code_files 487→515), phase 19 lesson_count increased (17→21), and capstone projects 9–11 refactored from single-file entrypoints to expanded ts/src modules with corresponding test files. Four new capstone lesson entries (54–57) appended with main.py and Python test files.

Lesson 54: Paper Writer

Layer / File(s) Summary
LaTeX paper data models and validation
phases/19-capstone-projects/54-paper-writer/code/main.py
Defines BibEntry, Figure, Section, Paper dataclasses; implements _validate() enforcing non-empty title/abstract, unique figure IDs, citation key existence, and figure reference resolution. Provides LaTeX character escaping and render_latex() and render_bibtex() functions for document and bibliography generation.
Manifest ingestion and paper writing
phases/19-capstone-projects/54-paper-writer/code/main.py
Implements read_experiment_manifest() converting artifact dictionaries to stable Figure records with normalized paths; adds MockProseGenerator for deterministic prose from titles/outlines; provides PaperWriter class with fill_prose() and write() methods outputting paper.tex, references.bib, and manifest.json.
Tests, documentation, and quiz
phases/19-capstone-projects/54-paper-writer/code/tests/test_paper_writer.py, docs/en.md, quiz.json
Unit tests validating LaTeX skeleton, escaping, validation errors, manifest deduplication, and file I/O. Comprehensive documentation of contract, validation gates, and outputs. Quiz covering structure-first validation and expected behavior.

Lesson 55: Critic Loop

Layer / File(s) Summary
Loop data structures and protocols
phases/19-capstone-projects/55-critic-loop/code/main.py
Defines DIMENSIONS constant (five scoring axes), MiniSection and MiniPaper models, Suggestion and Critique structures with serialization, Critic and Reviser call-signature protocols, and LoopTrace and LoopResult containers.
CriticLoop convergence engine
phases/19-capstone-projects/55-critic-loop/code/main.py
Implements CriticLoop constructor with configuration defaults and run() method driving critic/reviser iteration with target-met convergence, plateau detection over sliding windows, per-round trace recording, and budget exhaustion.
Deterministic scorer, critic, and reviser
phases/19-capstone-projects/55-critic-loop/code/main.py
Implements deterministic_score() computing dimension scores from section metadata, deterministic_critic() emitting suggestions for low-scoring dimensions, deterministic_reviser() applying in-place mutations (expand sections, bump originality, add references), and demo() runner.
Tests, documentation, and quiz
phases/19-capstone-projects/55-critic-loop/code/tests/test_critic_loop.py, docs/en.md, quiz.json
Tests validate scoring bounds, suggestion coverage, monotone improvement, convergence types (target/plateau/budget), and reviser edits. Documentation specifies dimensions, convergence rules, and trace semantics. Quiz covers stop-condition priority and terminal verdicts.

Lesson 56: Iteration Scheduler

Layer / File(s) Summary
Data models and UCB scoring
phases/19-capstone-projects/56-iteration-scheduler/code/main.py
Defines Hypothesis, Result, BranchStats (with aggregates and flags), TraceEvent, and SchedulerReport dataclasses with serialization. Implements ucb_score() with special handling for zero-run branches.
IterationScheduler async orchestration
phases/19-capstone-projects/56-iteration-scheduler/code/main.py
Implements IterationScheduler with asyncio slot dispatch, UCB-based hypothesis selection, per-branch stats aggregation, paper-trigger thresholding (once per branch), pruning of low-yield branches, optional fan-out expansion, and dual budget enforcement (max experiments and wall-time deadline).
Deterministic runner and demo
phases/19-capstone-projects/56-iteration-scheduler/code/main.py
Implements make_deterministic_runner() computing bounded Gaussian rewards, deterministic_expander() generating follow-up hypotheses, and demo() orchestrating scheduler with seeded configuration.
Tests, documentation, and quiz
phases/19-capstone-projects/56-iteration-scheduler/code/tests/test_scheduler.py, docs/en.md, quiz.json
Tests validate UCB edge cases, scheduling correctness, paper-trigger semantics, pruning, and budget termination. Documentation covers queue/slot/result architecture, UCB formula, asyncio concurrency, and trace semantics. Quiz addresses UCB handling, paper triggers, and budget enforcement.

Lesson 57: End-to-End Research Demo

Layer / File(s) Summary
Cross-module imports and exception types
phases/19-capstone-projects/57-end-to-end-research-demo/code/main.py
Sets up sys.path and dynamically loads paper-writer, critic-loop, and iteration-scheduler modules; defines NoTriggerError and BestResultError exception types.
Hypothesis seeding and best-branch selection
phases/19-capstone-projects/57-end-to-end-research-demo/code/main.py
Implements make_seed_hypotheses() generating three seed branches (alpha, beta, gamma); pick_best_branch() selecting highest-mean triggered branch with alphabetical tie-breaking and typed error handling.
Paper model conversion and synthesis
phases/19-capstone-projects/57-end-to-end-research-demo/code/main.py
Implements build_mini_paper() constructing a MiniPaper from branch/reward with derived originality tag and targets; mini_to_full_paper() promoting MiniPaper to Paper by aggregating citations, synthesizing bibliography, adding results figures, and mapping section structures.
Full pipeline orchestration
phases/19-capstone-projects/57-end-to-end-research-demo/code/main.py
Implements run_demo() async function orchestrating five stages (seed → scheduler → picker → critic → paper writer), returning DemoReport with all upstream outputs. Provides synchronous demo() wrapper and __main__ JSON output.
Tests, documentation, and quiz
phases/19-capstone-projects/57-end-to-end-research-demo/code/tests/test_e2e.py, docs/en.md, quiz.json
Tests validate demo completion, paper file emission, determinism across seeds, error modes, validation propagation, seed shape, stop-reason membership, and scenario invariants. Documentation specifies five-stage composition, failure modes, and wiring. Quiz covers composition strategy and determinism contracts.

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant E2E as End-to-End<br/>Demo
  participant Sched as Iteration<br/>Scheduler
  participant Critic as Critic<br/>Loop
  participant Writer as Paper<br/>Writer
  User->>E2E: run_demo(out_dir, seed=11)
  E2E->>E2E: make_seed_hypotheses()
  E2E->>Sched: IterationScheduler.run(seeds)
  Sched->>Sched: UCB branch selection<br/>parallel experiments
  Sched-->>E2E: SchedulerReport(best_branch, triggers)
  E2E->>E2E: pick_best_branch(report)
  E2E->>E2E: build_mini_paper(branch, reward)
  E2E->>Critic: CriticLoop.run(mini_paper)
  Critic->>Critic: deterministic_score()<br/>critic/reviser rounds
  Critic-->>E2E: LoopResult(converged_paper)
  E2E->>E2E: mini_to_full_paper(converged, branch)
  E2E->>Writer: PaperWriter.fill_prose(paper)
  Writer->>Writer: MockProseGenerator.generate()
  E2E->>Writer: writer.write(paper, out_dir)
  Writer->>Writer: render_latex(), render_bibtex()
  Writer-->>E2E: manifest.json
  E2E-->>User: DemoReport(scheduler+critic+paper)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 9.16% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat(phase-19): track D auto research lessons 54-57' clearly and accurately describes the main change: adding four new capstone lessons (54-57) for Phase 19 in the Track D auto-research curriculum. It is specific, concise, and directly summarizes the primary content of the changeset.
Description check ✅ Passed The description is comprehensive and directly related to the changeset. It provides a clear summary of the four lessons being added (paper-writer, critic-loop, iteration-scheduler, end-to-end-research-demo), their individual purposes, file counts, test coverage, and documentation details. It also lists completed test plans and constraints, making it highly informative about the PR's scope and intent.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/phase-19-track-d2

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Nitpick comments (3)
phases/19-capstone-projects/54-paper-writer/code/tests/test_paper_writer.py (1)

70-99: ⚡ Quick win

Add a regression test for duplicate bibliography keys.

Given the validation gates, a dedicated test for repeated BibEntry.key values will prevent regressions in BibTeX integrity.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phases/19-capstone-projects/54-paper-writer/code/tests/test_paper_writer.py`
around lines 70 - 99, Add a new unit test in TestValidation that constructs a
Paper with two BibEntry objects sharing the same key (e.g., create a Paper via
_paper_min(), append two BibEntry(key="dup", ...)), then call render_latex and
assert it raises PaperValidationError (use assertRaisesRegex to match a message
like "duplicate bibliography key"); reference Paper, BibEntry, render_latex, and
PaperValidationError so the test mirrors the existing duplicate-figure and
unknown-citation tests to prevent regressions.
phases/19-capstone-projects/57-end-to-end-research-demo/code/main.py (1)

183-186: ⚡ Quick win

Avoid mutating MiniPaper during conversion.

On Line 185, sec.cites.append(...) mutates the input mini object in place. This converter should stay side-effect free and only shape the returned Paper.

♻️ Proposed fix
 def mini_to_full_paper(mini: MiniPaper, branch: str) -> Paper:
@@
-    if not bib:
-        bib = [BibEntry(
-            key=f"{branch}-baseline", entry_type="article",
-            fields={"title": "Baseline", "author": "Synthesised", "year": "2026"},
-        )]
-        for sec in mini.sections:
-            if sec.id == "intro":
-                sec.cites.append(f"{branch}-baseline")
-                break
+    baseline_key: str | None = None
+    if not bib:
+        baseline_key = f"{branch}-baseline"
+        bib = [BibEntry(
+            key=baseline_key, entry_type="article",
+            fields={"title": "Baseline", "author": "Synthesised", "year": "2026"},
+        )]
@@
     sections = []
     for s in mini.sections:
         figure_refs = [fig.id] if s.id == "results" else []
+        cites = list(s.cites)
+        if baseline_key is not None and s.id == "intro" and baseline_key not in cites:
+            cites.append(baseline_key)
         sections.append(Section(
             id=s.id, title=s.title, body=s.body,
-            cites=list(s.cites), figure_refs=figure_refs,
+            cites=cites, figure_refs=figure_refs,
         ))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phases/19-capstone-projects/57-end-to-end-research-demo/code/main.py` around
lines 183 - 186, The loop mutates the input MiniPaper by calling
sec.cites.append(...) on mini.sections; instead leave mini untouched and build
new Section objects (or shallow copies) for the output Paper, creating each
section's cites list as a copy of sec.cites plus the extra f"{branch}-baseline"
where needed; specifically, when iterating over mini.sections, do not call
sec.cites.append — construct a new section value (copying fields and using
sec.cites + [f"{branch}-baseline"] for the intro case) and add that to the
Paper.sections result so MiniPaper and mini remain side-effect free.
phases/19-capstone-projects/56-iteration-scheduler/code/tests/test_scheduler.py (1)

169-202: ⚡ Quick win

Add a regression test for trigger evaluation during drain.

Please add a deadline/max-budget scenario where a branch crosses paper_threshold only in drained results, then assert the trigger is still emitted.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@phases/19-capstone-projects/56-iteration-scheduler/code/tests/test_scheduler.py`
around lines 169 - 202, Add a new test (e.g., test_trigger_emitted_during_drain)
that constructs a seed with multiple Hypothesis objects and a slow runner that
delays long enough so some results finish only during the scheduler's drain
phase; configure IterationScheduler with a very short max_seconds (or a
max_experiments that causes early stop) and a paper_threshold such that the
target branch crosses the threshold only by results that complete during drain;
run sched.run(seed) and then assert the scheduler report contains the trigger
for that branch (for example by checking report.triggers or by asserting
any(t.branch == "b0" for t in report.triggers) — adapt the exact assertion to
the report field used in the codebase).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@phases/19-capstone-projects/54-paper-writer/code/main.py`:
- Around line 88-94: The code currently only checks that cited keys exist but
does not detect duplicate BibEntry.key values in paper.bibliography; add a
validation before the cite-check loop that scans paper.bibliography (e.g., using
bib_keys and a seen set or a Counter) to detect any duplicate BibEntry.key
values and raise PaperValidationError listing the duplicate key(s) (include the
offending key(s) in the error message) so duplicate BibTeX entries cannot be
emitted; keep this check alongside the existing cite validation that uses
bib_keys, and reference the same symbols (paper.bibliography, BibEntry.key,
bib_keys, PaperValidationError, paper.sections, sec.cites).

In `@phases/19-capstone-projects/55-critic-loop/code/main.py`:
- Around line 339-345: The branch handling sug.edit ==
"add-related-work-section" can no-op when a "Related Work" section exists but
has an empty body; change it to check for an existing section (search
paper.sections for s.title.lower() == "related work") and if found and s.body is
empty or whitespace, set s.body = "We survey adjacent work. " + ("x" * 200);
otherwise (no existing section) append a new MiniSection(id="related-work",
title="Related Work", body=...). This ensures the suggestion actually fills an
empty section instead of repeating without improving score.

In `@phases/19-capstone-projects/55-critic-loop/docs/en.md`:
- Around line 76-84: The docs incorrectly state that clarity is driven by
section count; update the description to match deterministic_score() by stating
that clarity is computed from average section body length (clarity grows when
the average/mean section body length crosses the configured threshold), and keep
the other signal descriptions (novelty, evidence, methodology, related-work)
as-is; reference the deterministic_score() function and the clarity signal when
making this edit so the prose matches the implementation.
- Around line 57-59: The doc's convergence flow incorrectly states the stop rule
as "mean score >= target" while the implementation in CriticLoop._target_met
requires every one of the five dimensions to meet or exceed target_score; update
the diagram/text (the A decision node and the related paragraphs) to state that
convergence requires all five dimensions to be >= target_score (or explicitly
list the five dimensions and the "all" condition), and make the same correction
for the other references around the same section (the lines referenced after
66–69) so the documentation matches CriticLoop._target_met.

In `@phases/19-capstone-projects/55-critic-loop/quiz.json`:
- Around line 43-52: The quiz option and explanation are incorrect:
Reviser.__call__ accepts (paper, suggestions), so update the option currently
reading "Only the list of Suggestion records" to something like "The paper and
the list of Suggestion records" and change the explanation to state that the
reviser receives both the paper and the Suggestion records (each Suggestion has
dimension, target section, and an edit instruction) and does not receive scores;
ensure the correct answer index points to the updated option and that the
wording references Reviser.__call__.

In `@phases/19-capstone-projects/56-iteration-scheduler/code/main.py`:
- Around line 209-211: The code indexes stats with result.branch without
guarding against unknown branches, causing crashes if a runner returns an unseen
branch; update both locations where you do bs = stats[result.branch] (the block
updating bs.runs and bs.reward_sum and the similar block at the later
occurrence) to first check if result.branch exists in stats and if not insert a
new BranchStats instance (or use stats.setdefault(result.branch,
BranchStats(...))) and then update bs.runs and bs.reward_sum; reference the
symbols stats, result.branch, bs.runs, bs.reward_sum, and BranchStats when
applying the fix.
- Around line 261-275: Loop draining in_flight tasks updates stats but never
runs paper.trigger, so valid triggers are lost; after successfully awaiting a
task (in the for task in list(in_flight.keys()) block) and after updating
bs.runs and bs.reward_sum, call the trigger evaluation on the associated paper
(e.g., paper.trigger(...) or the existing trigger-evaluation helper used
elsewhere) with the branch and updated stats/reward so trigger conditions are
evaluated and any resulting TraceEvent(s) appended to trace before removing the
task from in_flight; ensure to preserve the current TraceEvent("result.drain")
logic and handle exceptions exactly as before.

In `@phases/19-capstone-projects/56-iteration-scheduler/docs/en.md`:
- Around line 70-71: Update the docs to match the implementation: change wording
that currently states the scheduler posts Results to an asyncio.Queue and uses
random runner delay; instead explain that the scheduler creates tasks with
asyncio.create_task to run the async experiment runner (which yields a Result),
waits on the set of in-flight tasks using asyncio.wait(...,
return_when=asyncio.FIRST_COMPLETED), and then applies a fixed sleep delay
between loop iterations; reference the experiment runner, Result,
asyncio.create_task, the in-flight tasks set, and asyncio.FIRST_COMPLETED so
readers understand the actual control flow and timing.

In `@phases/19-capstone-projects/57-end-to-end-research-demo/docs/en.md`:
- Around line 80-81: Update the stale documentation to match the current tests
and API: remove or rewrite the claim that a test in test_e2e.py asserts an
empty-seed short-circuit to a typed exception (fix the sentence around the
original lines 80-81 to reflect that no such test exists), change the “one
figure per high-yield branch” wording to state that mini_to_full_paper() creates
one figure for the selected branch (replace the text around line 94), and
replace any reference to build_paper_from_result with the actual exposed
functions build_mini_paper and mini_to_full_paper (update references in the
block around lines 98 and the 94-100 range) so the docs accurately reflect
current function names and behavior.

---

Nitpick comments:
In `@phases/19-capstone-projects/54-paper-writer/code/tests/test_paper_writer.py`:
- Around line 70-99: Add a new unit test in TestValidation that constructs a
Paper with two BibEntry objects sharing the same key (e.g., create a Paper via
_paper_min(), append two BibEntry(key="dup", ...)), then call render_latex and
assert it raises PaperValidationError (use assertRaisesRegex to match a message
like "duplicate bibliography key"); reference Paper, BibEntry, render_latex, and
PaperValidationError so the test mirrors the existing duplicate-figure and
unknown-citation tests to prevent regressions.

In
`@phases/19-capstone-projects/56-iteration-scheduler/code/tests/test_scheduler.py`:
- Around line 169-202: Add a new test (e.g., test_trigger_emitted_during_drain)
that constructs a seed with multiple Hypothesis objects and a slow runner that
delays long enough so some results finish only during the scheduler's drain
phase; configure IterationScheduler with a very short max_seconds (or a
max_experiments that causes early stop) and a paper_threshold such that the
target branch crosses the threshold only by results that complete during drain;
run sched.run(seed) and then assert the scheduler report contains the trigger
for that branch (for example by checking report.triggers or by asserting
any(t.branch == "b0" for t in report.triggers) — adapt the exact assertion to
the report field used in the codebase).

In `@phases/19-capstone-projects/57-end-to-end-research-demo/code/main.py`:
- Around line 183-186: The loop mutates the input MiniPaper by calling
sec.cites.append(...) on mini.sections; instead leave mini untouched and build
new Section objects (or shallow copies) for the output Paper, creating each
section's cites list as a copy of sec.cites plus the extra f"{branch}-baseline"
where needed; specifically, when iterating over mini.sections, do not call
sec.cites.append — construct a new section value (copying fields and using
sec.cites + [f"{branch}-baseline"] for the intro case) and add that to the
Paper.sections result so MiniPaper and mini remain side-effect free.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1df3182b-0e2c-486c-aab4-e69bbca980b1

📥 Commits

Reviewing files that changed from the base of the PR and between c1374e1 and 49c52d4.

📒 Files selected for processing (17)
  • catalog.json
  • phases/19-capstone-projects/54-paper-writer/code/main.py
  • phases/19-capstone-projects/54-paper-writer/code/tests/test_paper_writer.py
  • phases/19-capstone-projects/54-paper-writer/docs/en.md
  • phases/19-capstone-projects/54-paper-writer/quiz.json
  • phases/19-capstone-projects/55-critic-loop/code/main.py
  • phases/19-capstone-projects/55-critic-loop/code/tests/test_critic_loop.py
  • phases/19-capstone-projects/55-critic-loop/docs/en.md
  • phases/19-capstone-projects/55-critic-loop/quiz.json
  • phases/19-capstone-projects/56-iteration-scheduler/code/main.py
  • phases/19-capstone-projects/56-iteration-scheduler/code/tests/test_scheduler.py
  • phases/19-capstone-projects/56-iteration-scheduler/docs/en.md
  • phases/19-capstone-projects/56-iteration-scheduler/quiz.json
  • phases/19-capstone-projects/57-end-to-end-research-demo/code/main.py
  • phases/19-capstone-projects/57-end-to-end-research-demo/code/tests/test_e2e.py
  • phases/19-capstone-projects/57-end-to-end-research-demo/docs/en.md
  • phases/19-capstone-projects/57-end-to-end-research-demo/quiz.json

Comment thread phases/19-capstone-projects/54-paper-writer/code/main.py Outdated
Comment thread phases/19-capstone-projects/55-critic-loop/code/main.py
Comment thread phases/19-capstone-projects/55-critic-loop/docs/en.md Outdated
Comment thread phases/19-capstone-projects/55-critic-loop/docs/en.md Outdated
Comment thread phases/19-capstone-projects/55-critic-loop/quiz.json
Comment thread phases/19-capstone-projects/56-iteration-scheduler/code/main.py Outdated
Comment thread phases/19-capstone-projects/56-iteration-scheduler/code/main.py
Comment thread phases/19-capstone-projects/56-iteration-scheduler/docs/en.md Outdated
Comment thread phases/19-capstone-projects/57-end-to-end-research-demo/docs/en.md Outdated
rohitg00 added 4 commits May 26, 2026 20:57
Reject duplicate bibliography keys during validation so render_bibtex
cannot emit duplicate BibTeX entries that would break downstream
compilation.
- add-related-work-section now fills an empty existing section instead
  of no-op'ing, so convergence cannot stall when the title already exists
- Convergence doc + diagram say all five dimensions must hit target_score,
  matching CriticLoop._target_met
- Deterministic-scoring prose says clarity tracks average section body
  length (matching deterministic_score), not section count
- Quiz: reviser receives (paper, suggestions), matching Reviser.__call__
- Result and drain paths now use stats.setdefault, so a runner that
  returns an unseen branch records the result instead of crashing
- Drain phase evaluates paper.trigger after updating stats, so valid
  triggers fired post-budget/deadline are no longer dropped
- Concurrency doc reflects the FIRST_COMPLETED in-flight wait + fixed
  delay_ms, matching the runner implementation
Align stale docs with current code/tests:
- Stage-failure paragraph now references the real picker tests
  (NoTriggerError / BestResultError) instead of a non-existent empty-seed
  short-circuit test
- mini_to_full_paper attaches one figure for the selected branch (not
  one per high-yield branch)
- Replace stale build_paper_from_result reference with the actual
  exposed pair build_mini_paper + mini_to_full_paper

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
phases/19-capstone-projects/56-iteration-scheduler/code/main.py (1)

261-277: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't block past the deadline stop.

If stop_reason is "deadline", Lines 261-263 still await every in-flight task to completion. That can push wall_seconds well past max_seconds, so the scheduler no longer honors the advertised hard wall-clock budget. Split the drain path so deadline stops cancel or skip outstanding work, and keep full draining only for non-deadline exits.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phases/19-capstone-projects/56-iteration-scheduler/code/main.py` around lines
261 - 277, The loop that awaits every task in in_flight regardless of
stop_reason can exceed the hard wall-clock budget; change the drain logic in
main.py so that when self.stop_reason == "deadline" you do not await all
outstanding tasks: iterate over in_flight keys and for each outstanding
asyncio.Task call task.cancel() (and optionally await them with
return_exceptions or simply pop them), removing them from the in_flight dict
immediately, whereas keep the existing await-and-process behavior only for
non-deadline exits so BranchStats updates, bs.runs/bs.reward_sum increments,
paper trigger checks (paper_threshold / bs.mean) and TraceEvent appends still
run in the normal draining path.
♻️ Duplicate comments (1)
phases/19-capstone-projects/57-end-to-end-research-demo/docs/en.md (1)

80-80: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix stale test names in the failure-mode paragraph.

The referenced test names do not match current code/tests/test_e2e.py (test_no_trigger_raises, test_orphan_trigger_raises_best_result_error). Please rename them here (and avoid claiming writer non-invocation is pinned unless a direct test exists).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phases/19-capstone-projects/57-end-to-end-research-demo/docs/en.md` at line
80, Update the failure-mode paragraph to use the current test names: replace
`test_no_triggers_raises_typed_error` with `test_no_trigger_raises` and
`test_best_picker_raises_when_no_triggers` with
`test_orphan_trigger_raises_best_result_error`, and remove or soften the
absolute claim that "the writer is never invoked" unless there is a direct test
asserting writer non-invocation (instead state that tests assert the picker
raises `NoTriggerError` / `BestResultError` when no branch fired a trigger).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@phases/19-capstone-projects/55-critic-loop/docs/en.md`:
- Line 76: The opening sentence incorrectly states "three signals" but the
critic actually scores five dimensions; update that phrase in the paragraph to
"five signals" (or "five scored signals") so it matches the subsequent list
(average section body length (clarity), figure count and citation count
(evidence), `originality_tag` (novelty), and the methodology/related-work
signals) — locate the sentence mentioning "three signals" near the description
of the shipped critic and revise it accordingly.

---

Outside diff comments:
In `@phases/19-capstone-projects/56-iteration-scheduler/code/main.py`:
- Around line 261-277: The loop that awaits every task in in_flight regardless
of stop_reason can exceed the hard wall-clock budget; change the drain logic in
main.py so that when self.stop_reason == "deadline" you do not await all
outstanding tasks: iterate over in_flight keys and for each outstanding
asyncio.Task call task.cancel() (and optionally await them with
return_exceptions or simply pop them), removing them from the in_flight dict
immediately, whereas keep the existing await-and-process behavior only for
non-deadline exits so BranchStats updates, bs.runs/bs.reward_sum increments,
paper trigger checks (paper_threshold / bs.mean) and TraceEvent appends still
run in the normal draining path.

---

Duplicate comments:
In `@phases/19-capstone-projects/57-end-to-end-research-demo/docs/en.md`:
- Line 80: Update the failure-mode paragraph to use the current test names:
replace `test_no_triggers_raises_typed_error` with `test_no_trigger_raises` and
`test_best_picker_raises_when_no_triggers` with
`test_orphan_trigger_raises_best_result_error`, and remove or soften the
absolute claim that "the writer is never invoked" unless there is a direct test
asserting writer non-invocation (instead state that tests assert the picker
raises `NoTriggerError` / `BestResultError` when no branch fired a trigger).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 03fd4c2e-6909-4d54-b61d-77756a8c0a4f

📥 Commits

Reviewing files that changed from the base of the PR and between 49c52d4 and 2556e28.

📒 Files selected for processing (7)
  • phases/19-capstone-projects/54-paper-writer/code/main.py
  • phases/19-capstone-projects/55-critic-loop/code/main.py
  • phases/19-capstone-projects/55-critic-loop/docs/en.md
  • phases/19-capstone-projects/55-critic-loop/quiz.json
  • phases/19-capstone-projects/56-iteration-scheduler/code/main.py
  • phases/19-capstone-projects/56-iteration-scheduler/docs/en.md
  • phases/19-capstone-projects/57-end-to-end-research-demo/docs/en.md
✅ Files skipped from review due to trivial changes (2)
  • phases/19-capstone-projects/56-iteration-scheduler/docs/en.md
  • phases/19-capstone-projects/55-critic-loop/quiz.json


## The deterministic critic in this lesson

The lesson does not call a model. The shipped critic is a callable that scores a draft based on three signals: average section body length (clarity), figure count and citation count (evidence), and an `originality_tag` field on the paper metadata (novelty). The reviser knows how to push each score upward.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix scoring-signal count for consistency.

This sentence says “three signals,” but the section documents five scored dimensions/signals right below (including methodology and related-work). Please align the count in this line to avoid confusion.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phases/19-capstone-projects/55-critic-loop/docs/en.md` at line 76, The
opening sentence incorrectly states "three signals" but the critic actually
scores five dimensions; update that phrase in the paragraph to "five signals"
(or "five scored signals") so it matches the subsequent list (average section
body length (clarity), figure count and citation count (evidence),
`originality_tag` (novelty), and the methodology/related-work signals) — locate
the sentence mentioning "three signals" near the description of the shipped
critic and revise it accordingly.

@rohitg00 rohitg00 merged commit 01f8413 into main May 27, 2026
5 checks passed
@rohitg00 rohitg00 deleted the feat/phase-19-track-d2 branch May 27, 2026 18:41
JerryC8080 pushed a commit to JerryC8080/ai-engineering-from-scratch-zh that referenced this pull request Jun 1, 2026
feat(phase-19): track D auto research lessons 54-57
JerryC8080 pushed a commit to JerryC8080/ai-engineering-from-scratch-zh that referenced this pull request Jun 1, 2026
feat(phase-19): track D auto research lessons 54-57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant