feat(rlasso): faithful hdm::rlassologit port (logistic rigorous Lasso) #277

Workflow file for this run

name: Parity guards
# Two CLAUDE.md §3 contracts enforced as required CI gates, independent of
# the main ci-cd.yml pipeline so they show up as their own required checks on
# PRs and never get diluted by the unrelated test matrix:
#
#   1. reference_parity: numerical alignment with R / Stata / paper anchors
#   2. registry_drift: registry public-function count + submodule count
#      stay at or above the README floor (1000+ / 80)
#
# The §10 zero-hallucination citation audit lives in its own workflow
# (.github/workflows/citation-audit.yml) — it used to be duplicated here as a
# third job, but running the network-dependent auditor twice per push only
# doubled the transient-flake surface. The dedicated workflow is the single
# source of truth: it carries the full 4-gate suite (auditor pytest + bib
# duplicate/coverage audits + the live citation audit) and an exit-2 soft-pass
# so an arXiv/Crossref outage warns instead of blocking.
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  workflow_dispatch:

concurrency:
  group: parity-${{ github.ref }}
  cancel-in-progress: true

jobs:
  reference-parity:
    name: Numerical reference parity
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v5
      - name: Set up Python
        uses: actions/setup-python@v6
        with:
          python-version: "3.10"
      - name: Cache pip
        uses: actions/cache@v5
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-parity-pip-${{ hashFiles('**/pyproject.toml') }}
          restore-keys: |
            ${{ runner.os }}-parity-pip-
      - name: Install
        run: |
          python -m pip install --upgrade pip
          pip install -e ".[dev]"
      - name: Run reference_parity suite
        run: |
          pytest tests/reference_parity/ -q --no-header

  registry-drift:
    name: Registry / API surface drift
    runs-on: ubuntu-latest
    # The runnable-examples ratchet executes 1000+ docstring snippets. Keep
    # enough headroom for GitHub-hosted runner variance after install/setup.
    timeout-minutes: 25
    steps:
      - uses: actions/checkout@v5
      - name: Set up Python
        uses: actions/setup-python@v6
        with:
          python-version: "3.10"
      # ``[dev]`` is required for pytest plus the parity / drift-check tooling
      # invoked below (registry_stats, dump_schemas, examples_coverage).
      - name: Install
        run: |
          python -m pip install --upgrade pip
          pip install -e ".[dev]"
      - name: registry_stats --check
        run: |
          python scripts/registry_stats.py --check
      - name: dump_schemas --check (MCP cold-start bundle drift)
        # The MCP server serves tools/list + resources from the committed
        # schemas/ bundle on its cold-start fast path, gated only on a
        # matching statspai_version. Without this guard, a registry change
        # within the same version (e.g. a refactor between releases) drifts
        # the bundle silently and the server serves a stale tool list. This
        # check fails the build until the bundle is regenerated + committed.
        run: |
          python scripts/dump_schemas.py --check
      - name: API surface consistency
        run: |
          pytest tests/test_api_surface_consistency.py -q --no-header
      - name: examples_coverage --check (docstring Examples ratchet)
        # Ratchet, not target: the budget is the number of registered
        # symbols whose docstring lacks an ``Examples`` section. The
        # 2026-06 examples campaign drove this from 661 (2026-06-12
        # baseline, 370/1031 covered) to 0: every registered function
        # carries a docstring Examples section (1031/1031). The scanner
        # resolves each registered name through its real source
        # (scripts/_resolve.py), so the intentionally submodule-scoped
        # functions (sp.causal_llm.echo_client,
        # sp.assimilation.particle_filter, ...) are measured against their
        # actual docstring rather than a top-level getattr that returns None.
        # Budget is now 0: a new public function that ships without a
        # docstring Examples section fails this gate (CLAUDE.md §4 requires
        # Parameters/Returns/Examples/References). It must NEVER be raised.
        run: |
          python scripts/examples_coverage.py --check --max-missing 0
      - name: check_example_execution (docstring Examples runnability)
        # Presence is necessary but not sufficient: an ``Examples`` block
        # that references an undefined ``df`` or a bare ``did(...)`` call
        # is documentation theatre. This gate extracts every example's
        # ``>>>`` source (dropping ``# doctest: +SKIP`` lines for
        # heavy-dep / external-data blocks) and executes it. The whole
        # campaign was verified to 1018 runnable / 0 failing; the ratchet
        # holds that line. A new example that does not run fails here.
        run: |
          python scripts/check_example_execution.py --quiet --max-failures 0
  # NOTE: the §10 citation audit job that used to live here was removed:
  # it duplicated .github/workflows/citation-audit.yml (which is more
  # complete and has an exit-2 soft-pass for transient arXiv/Crossref
  # outages). Running the network-dependent auditor twice per push only
  # doubled the flake surface. See that workflow for the single source of
  # truth.
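
The `check_example_execution` step comments describe extracting each example's `>>>` source, dropping `# doctest: +SKIP` lines, executing what remains, and failing once the failure count exceeds a budget of 0. The real `scripts/check_example_execution.py` is not shown on this page, so the following is only a minimal sketch of that idea built on the standard-library `doctest` parser. The helper names `runnable_source`, `count_failures`, and `gate` are hypothetical, and the sketch only checks that examples run without raising; it does not compare printed output.

```python
import doctest


def runnable_source(docstring: str) -> str:
    """Join the ``>>>`` source of a docstring, dropping ``+SKIP`` examples."""
    parser = doctest.DocTestParser()
    kept = [
        example.source  # each source is one statement ending in a newline
        for example in parser.get_examples(docstring)
        if not example.options.get(doctest.SKIP, False)
    ]
    return "".join(kept)


def count_failures(docstrings: dict[str, str]) -> int:
    """Execute each docstring's runnable source; count those that raise."""
    failures = 0
    for name, doc in docstrings.items():
        code = compile(runnable_source(doc), f"<examples:{name}>", "exec")
        try:
            exec(code, {})  # fresh namespace, so an undefined ``df`` fails
        except Exception:
            failures += 1
    return failures


def gate(failures: int, max_failures: int = 0) -> int:
    """Return a process exit code: 0 within budget, 1 over budget."""
    return 0 if failures <= max_failures else 1
```

Under these assumptions, a CI wrapper would call `sys.exit(gate(count_failures(...)))`, so any example that raises pushes the count over the zero budget and fails the job.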