Generated by scripts/generate_experimental_scope_report.py on 2026-05-09.
This report reads saved configs, reports, and publication claim tables only. It does not call a model, rewrite prompts, or unlock any held-out split. Counts are VAE cost/selection proxies and should not be read as iid statistical sample sizes.

| Quantity | Count |
|---|---|
| 3D stability config files | 131 |
| 3D Qwen-3B scaffold-family config files | 109 |
| 3D output run-root directories | 134 |
| 3D report JSON files | 108 |
| 3D report Markdown files | 128 |
| Report files with candidate/family rows | 39 |
| Top-level family rows | 109 |
| Top-level candidate rows | 48 |
| Top-level manual-seed rows | 51 |
| Top-level reference-candidate rows | 25 |
| Top-level metric rows | 60 |

| Stage | Scope | Selection / interpretation |
|---|---|---|
| Full-seed directional runs | 1 | Directional background |
| Checkpoint examples | 64 selector-dev + 64 final-test | Held-out delta = 2 examples |
| 10-seed scaffold tournament | 10 | 6 frozen wins, 2 ties, 2 continued wins (60.0% win; 80.0% non-loss) |
| 10-seed route-cost proxies | 188 candidate prompts; 142 teacher calls; 446 student-eval calls | 20 final-test access events; 41727 seconds summed wall-clock |
| Post-selection fixed-artifact audit | 10 seeds x 256 examples | 8 wins, 1 tie, 1 loss (80.0% win; 90.0% non-loss) |
| Capacity audit | 3 student sizes | 1 positive unchanged-transfer size (33.3%) |
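
The win and non-loss percentages in the tournament and audit rows follow directly from the listed outcome counts. A minimal sketch of that arithmetic is below; the function name `outcome_rates` is hypothetical, and counting the tournament's "continued wins" as losses for the frozen scaffold is an assumption that happens to reproduce the reported 60.0% / 80.0%.

```python
def outcome_rates(wins: int, ties: int, losses: int) -> tuple[float, float]:
    """Return (win %, non-loss %) rounded to one decimal place.

    Hypothetical helper for illustration; not part of the report generator.
    """
    total = wins + ties + losses
    if total == 0:
        raise ValueError("at least one outcome is required")
    win_rate = 100.0 * wins / total
    non_loss_rate = 100.0 * (wins + ties) / total
    return round(win_rate, 1), round(non_loss_rate, 1)


# 10-seed scaffold tournament: 6 frozen wins, 2 ties, 2 continued wins
# (continued wins are treated as losses for the frozen scaffold; assumption).
tournament = outcome_rates(wins=6, ties=2, losses=2)

# Post-selection fixed-artifact audit: 8 wins, 1 tie, 1 loss.
audit = outcome_rates(wins=8, ties=1, losses=1)

print(tournament)  # (60.0, 80.0)
print(audit)       # (80.0, 90.0)
```

Both rates use the ten seeds as the denominator, so they describe seed-level outcomes rather than per-example accuracy.
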

| Stage | Count | Rate / status | Interpretation |
|---|---|---|---|
| Official 3D status rows | 10 | Surfaced unlocked rows plus seed 2903 blocked-before-final | Publication registry boundary |
| Unlocked held-out rows | 9 | 90.0% | Reached held-out metric table |
| Blocked-before-final rows | 1 | final_test locked | Protocol discipline, not a failed metric row |
| Claim-bearing rows | 8 | 88.9% | Excludes post-selection audit |
| Metric no-regression rows | 5 | 55.6% | No baseline-winning metric in seven-metric scorecard |
| Paper-clean held-out wins | 2 | 22.2% of unlocked; 25.0% of claim-bearing | Seeds 2801 and 4523 |
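
The rates in this table do not all share one denominator, which is easy to miss when reading the rows in sequence. The sketch below recomputes each percentage from the counts. The table states the denominators only for the paper-clean wins; the denominators used here for the unlocked, claim-bearing, and no-regression rows are inferred from the arithmetic and should be read as assumptions. The names `pct` and `rates` are hypothetical.

```python
def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place (hypothetical helper)."""
    return round(100.0 * numerator / denominator, 1)


# Counts copied from the table above.
official_rows = 10
unlocked_rows = 9
claim_bearing_rows = 8
no_regression_rows = 5
clean_wins = 2

rates = {
    # Denominator inferred: 9 / 10 matches the reported 90.0%.
    "unlocked / official": pct(unlocked_rows, official_rows),
    # Denominator inferred: 8 / 9 matches the reported 88.9%.
    "claim-bearing / unlocked": pct(claim_bearing_rows, unlocked_rows),
    # Denominator inferred: 5 / 9 matches the reported 55.6%.
    "no-regression / unlocked": pct(no_regression_rows, unlocked_rows),
    # Denominators stated in the table for these two.
    "clean wins / unlocked": pct(clean_wins, unlocked_rows),
    "clean wins / claim-bearing": pct(clean_wins, claim_bearing_rows),
}

for label, value in rates.items():
    print(f"{label}: {value}%")
```
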

The ETHICS track supplies the broadest prompt-search ledger: the 10-seed package records candidate prompts, teacher calls, student-evaluation calls, final-test access events, and wall-clock cost alongside frozen-vs-continued residual accuracy. The 3D track supplies the perturbation-stability ledger: the ten official paper-facing status rows comprise nine unlocked rows and one row (seed 2903) that was blocked before final as a matter of protocol discipline. Of the nine unlocked rows, eight are claim-bearing once the post-selection audit is excluded, and two (seeds 2801 and 4523) are paper-clean held-out wins. These counts strengthen rather than dilute the main claim: the selected prompt artifacts are surviving moral-attention scaffolds from a logged search route, not isolated wording anecdotes.

- paper/tables/publication_claim_tables.json
- reports/statistical_reporting_3d_2026-05-09.json
- reports/3d_ethics_scaffold_family_confirmatory_seed2903_2026-05-06.json
- ETHICS-only imported tables in paper/refined_prompt_shape_epiplexity_paper.tex