Commit a656701

Release StatsPAI 1.15.5
Add agent-card coverage ratchet, baseline-card generation, and inherited registry metadata, then bump package/docs metadata for 1.15.5.

Co-authored-by: Codex <noreply@openai.com>
1 parent 73c30f0 commit a656701

29 files changed

Lines changed: 5580 additions & 116 deletions

CHANGELOG.md

Lines changed: 31 additions & 0 deletions
@@ -4,6 +4,37 @@ All notable changes to StatsPAI will be documented in this file.

 ## [Unreleased]

+## [1.15.5] — 2026-05-21
+
+### Added — Agent-card coverage ratchet and baseline enrichment
+
+- Added `scripts/agent_card_coverage.py`, `docs/agent_cards_spec.md`, and
+  `tests/test_agent_card_coverage.py` to make raw curated agent-card metadata
+  measurable and CI-ratcheted. The committed floor tracks 15 counters across
+  Tier-B, Tier-A, Tier-S, per-field coverage, and certified / validated
+  evidence counts.
+- Added generated `src/statspai/_baseline_cards.py` plus
+  `scripts/gen_baseline_cards.py` to fill empty Tier-B fields from docstrings
+  without overwriting curated registry entries. The baseline pass lifts tags to
+  100% of the 1,018-function registry and keeps examples / references limited
+  to mechanically extracted, auditable content.
+- Added `FunctionSpec.inherits_from` and inherited agent-card rendering for
+  canonical estimator variants. Child specs keep their own descriptions,
+  examples, parameters, references, validation status, and limitations, while
+  sharing parent assumptions, preconditions, failure modes, alternatives, and
+  `typical_n_min` where appropriate.
+
+### Changed — Registry and documentation refresh
+
+- Expanded validation evidence seeds for tested long-tail estimators so the
+  agent registry distinguishes stable APIs from functions with explicit unit,
+  regression, parity, or reference-test coverage.
+- Refreshed registry count, module statistics, and agent-platform positioning:
+  1,018 registered public functions across 80 submodules.
+- Updated DiD and agent-facing docs to mark `continuous_did(method='cgs')` and
+  `did_multiplegt_dyn` as experimental MVP paths rather than fully
+  paper-parity estimators.
+
 ## [1.15.4] — 2026-05-18

 ### Added — Auto-CJK font fallback on import
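
The coverage ratchet in the `[1.15.5]` changelog entry can be pictured with a minimal sketch. This is not the code in `scripts/agent_card_coverage.py`, which this diff does not show; the names `check_ratchet` and `raise_floor`, the counter names, and the numbers are all hypothetical. It assumes the committed floor maps each counter to a minimum value and that CI fails when any current count drops below its floor.

```python
from typing import Dict, List


def check_ratchet(floor: Dict[str, int], current: Dict[str, int]) -> List[str]:
    """Return one message per counter that fell below its committed floor.

    A counter missing from `current` is treated as a count of 0.
    """
    failures = []
    for name, minimum in sorted(floor.items()):
        value = current.get(name, 0)
        if value < minimum:
            failures.append(f"{name}: {value} < floor {minimum}")
    return failures


def raise_floor(floor: Dict[str, int], current: Dict[str, int]) -> Dict[str, int]:
    """Ratchet step: lift each floor to the current value, never lower it."""
    return {name: max(minimum, current.get(name, 0)) for name, minimum in floor.items()}


# Hypothetical counters and values; the changelog says the real floor tracks 15.
floor = {"tier_b_tags": 900, "tier_a_assumptions": 120}
current = {"tier_b_tags": 1018, "tier_a_assumptions": 118}

failures = check_ratchet(floor, current)
new_floor = raise_floor(floor, current)
```

Under these assumptions a change can raise coverage freely, but a drop in any single counter fails the check until the floor itself is edited in a reviewed commit.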

CITATION.cff

Lines changed: 2 additions & 2 deletions
@@ -16,8 +16,8 @@ authors:
     given-names: "Biaoyue"
     affiliation: "Stanford REAP, Stanford University"
     email: "brycew6m@stanford.edu"
-version: "1.15.3"
-date-released: "2026-05-17"
+version: "1.15.5"
+date-released: "2026-05-21"
 license: MIT
 # Zenodo *concept* DOI — always resolves to the latest archived version.
 # For a specific-version DOI (e.g. for replication packages), follow the
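
This commit updates the version string in three places: `CITATION.cff` and the BibTeX snippets in both READMEs, all moving from 1.15.3 to 1.15.5. A small consistency check could catch a file that gets missed in a future bump. The sketch assumes the two line formats seen in this diff are the only ones that matter; the helper names and the deliberately stale sample are hypothetical and not part of the repository.

```python
import re
from typing import Dict, Optional

# The two formats that appear in this commit (assumed to be the only ones).
CFF_VERSION = re.compile(r'^version:\s*"([^"]+)"', re.MULTILINE)
BIBTEX_VERSION = re.compile(r'^\s*version\s*=\s*\{([^}]+)\}', re.MULTILINE)


def extract_version(text: str) -> Optional[str]:
    """Return the first version string found in either format, else None."""
    for pattern in (CFF_VERSION, BIBTEX_VERSION):
        match = pattern.search(text)
        if match:
            return match.group(1)
    return None


def mismatched(files: Dict[str, str], expected: str) -> Dict[str, Optional[str]]:
    """Map each file whose declared version differs from `expected` to what it declares."""
    found = {name: extract_version(text) for name, text in files.items()}
    return {name: version for name, version in found.items() if version != expected}


# In-memory samples; a real check would read the files from disk.
files = {
    "CITATION.cff": 'version: "1.15.5"\ndate-released: "2026-05-21"\n',
    "README.md": "  version = {1.15.5},\n  doi = {10.5281/zenodo.19933900},\n",
    "README_CN.md": "  version = {1.15.3},\n",  # deliberately stale for the demo
}
stale = mismatched(files, "1.15.5")
```

Run as a test or pre-release step, a non-empty `stale` mapping would name the files still carrying an older version.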

README.md

Lines changed: 23 additions & 4 deletions
@@ -14,7 +14,7 @@
 [![status](https://joss.theoj.org/papers/9f1c837b1b1df7adfcdd538c3698e332/status.svg)](https://joss.theoj.org/papers/9f1c837b1b1df7adfcdd538c3698e332)
 [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.19933900.svg)](https://doi.org/10.5281/zenodo.19933900)

-StatsPAI is the **first agent-native** Python platform for causal inference and applied econometrics. One `import`, **950+ registered functions** across **80+ submodules** (live count: `python scripts/registry_stats.py`), covering the complete empirical research workflow — from classical econometrics to cutting-edge ML/AI causal methods to publication-ready tables in Word, Excel, and LaTeX.
+StatsPAI is the **first agent-native** Python platform for causal inference and applied econometrics. One `import`, **1,000+ registered functions** across **80 submodules** (live count: `python scripts/registry_stats.py`), covering the complete empirical research workflow — from classical econometrics to cutting-edge ML/AI causal methods to publication-ready tables in Word, Excel, and LaTeX.

 **Built for AI agents**: every function returns structured result objects with machine-readable schemas (`list_functions()`, `describe_function()`, `function_schema()`) and is numerically validated against R and Stata reference implementations — purpose-built for LLM-driven research workflows while remaining fully ergonomic for human researchers.

@@ -124,7 +124,26 @@ StatsPAI's focus is **causal inference** — and on this axis we aim to be the m

 **Legend**: 🏆 most complete across ecosystems · ✅ full coverage · ⚠️ partial / scattered / single algorithm · ❌ not available.

-**StatsPAI at a glance**: 950+ registered functions in the live agent registry · 80+ submodules · ~230k LOC (core) + ~70k LOC (tests). All four numbers are reproducible from the canonical generator (`python scripts/registry_stats.py`); the per-module table in [`docs/stats.md`](docs/stats.md) is regenerated from the same script. For the full coverage matrix (23 method families) and cross-ecosystem line-count comparison, see [`docs/stats.md`](docs/stats.md).
+**StatsPAI at a glance**: 1,018 registered functions in the live agent registry · 80 submodules · ~249k LOC (core) + ~86k LOC (tests). All four numbers are reproducible from the canonical generator (`python scripts/registry_stats.py`); the per-module table in [`docs/stats.md`](docs/stats.md) is regenerated from the same script. For the full coverage matrix (23 method families) and cross-ecosystem line-count comparison, see [`docs/stats.md`](docs/stats.md).
+
+**📦 v1.15.5 (2026-05-21) — Agent-card coverage ratchet**
+
+StatsPAI now ships a CI-ratcheted agent-card coverage audit, generated
+baseline cards for the 1,018-function registry, and inherited agent-card
+metadata for canonical estimator variants. This release is registry /
+metadata infrastructure: estimator numerical paths are unchanged. Full
+notes in [`CHANGELOG.md`](CHANGELOG.md) under `[1.15.5]`.
+
+---
+
+**📦 v1.15.4 (2026-05-18) — Auto-CJK plot font fallback**
+
+`import statspai as sp` now auto-registers a detected CJK font as a
+matplotlib fallback, so Chinese labels in plots render correctly on
+systems with standard CJK fonts installed. English-only plots keep the
+user's primary font unchanged, and users can opt out with
+`STATSPAI_NO_AUTO_CJK=1`. Full notes in [`CHANGELOG.md`](CHANGELOG.md)
+under `[1.15.4]`.

 ---

@@ -392,7 +411,7 @@ StatsPAI 1.4.0 is Sprint 2 of the 知识地图 v3 roadmap. Closes the four secon
 | **Particle-filter assimilation** | **`sp.assimilation.particle_filter`** — bootstrap-SIR particle filter with systematic resampling (Gordon-Salmond-Smith 1993; Douc-Cappé 2005). Non-Gaussian priors, heavy-tailed observation noise, nonlinear dynamics via pluggable callbacks. Agrees with exact Kalman to ~0.003 under Gaussian DGPs. **`sp.assimilative_causal(..., backend='particle')`** routes the end-to-end wrapper. |
 | **Documentation (v3 frontier guides)** | `docs/guides/synth_experimental.md` (Abadie-Zhao inverse-SC workflow), `docs/guides/harvest_did.md` (Borusyak-Hull-Jaravel harvesting DID), `docs/guides/assimilative_ci.md` (Nature Comms 2026 streaming CI, Kalman + particle backends). Wired into `mkdocs.yml` nav. |
 | **v1.3 stable foundation (carried forward)** | 11 2025-2026 frontier methods from Sprint 1: `synth_experimental_design`, `rdrobust(..., bootstrap='rbc')`, `evidence_without_injustice`, `target_trial.to_paper(fmt='jama'/'bmj')`, `harvest_did`, `bcf_ordinal`, `bcf_factor_exposure`, `causal_mas`, `shift_share_political`, `causal_kalman`. All v1.0 capstone surfaces (`sp.bridge`, `sp.fairness`, `sp.surrogate`, `sp.epi`, `sp.longitudinal`, `sp.question`, full MR suite, TARGET checklist) remain intact. |
-| **Agent-native platform** | `sp.list_functions()` / `sp.describe_function()` / `sp.function_schema()` expose OpenAI/Anthropic tool-calling schemas for 874+ registered estimators. 5 new hand-written `FunctionSpec` entries this release. `sp.agent.mcp_server` MCP scaffold lets external LLMs call every StatsPAI function via natural-language tool invocation. |
+| **Agent-native platform** | `sp.list_functions()` / `sp.describe_function()` / `sp.function_schema()` expose OpenAI/Anthropic tool-calling schemas for 1,018 registered public functions. 145 curated or explicitly inherited `FunctionSpec` entries carry at least one of assumptions, preconditions, failure modes, limitations, `typical_n_min`, and validation tiers for the flagship surface. `sp.agent.mcp_server` MCP scaffold lets external LLMs call every StatsPAI function via natural-language tool invocation. |
 | **CI/CD hygiene** | `tabulate` hard-dep from v1.3.0 carried forward. Deflaked `test_forest_ate_recovers_average_tau` by seeding the forest explicitly (`random_state=0`, `n_estimators=300`, larger `n`). 2 699+ tests passing across all OS × Python matrix entries. |

 **Previously in v0.9.2 — Decomposition Analysis**: **18 first-class decomposition methods across 13 modules (~6,200 LOC, 54 tests)**, unified under `sp.decompose(method=...)`. Mean (Blinder-Oaxaca/Gelbach/Fairlie/Bauer-Sinning/Yun), distributional (RIF/FFL/DFL/Machado-Mata/Melly/CFM), inequality (Theil/Atkinson/Dagum/Shapley/Lerman-Yitzhaki), demographic (Kitagawa/Das-Gupta), and causal (gap_closing/mediation_decompose/disparity_decompose). Closed-form influence functions for Theil/Atkinson, weighted O(n log n) Dagum Gini, cross-method consistency checks.
@@ -1375,7 +1394,7 @@ resolves to the latest version):
 author = {Wang, Biaoyue},
 title = {StatsPAI: The Agent-Native Causal Inference \& Econometrics Toolkit for Python},
 year = {2026},
-version = {1.15.3},
+version = {1.15.5},
 doi = {10.5281/zenodo.19933900},
 url = {https://doi.org/10.5281/zenodo.19933900},
 license = {MIT},
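
The inherited agent-card metadata mentioned in the v1.15.5 notes can be sketched as a field-level merge. The changelog states that child specs keep their own descriptions, examples, parameters, references, validation status, and limitations while sharing parent assumptions, preconditions, failure modes, alternatives, and `typical_n_min`. The `Spec` class here is a simplified stand-in rather than the real `FunctionSpec`, the estimator names are made up, and the rule that a non-empty child value takes precedence over the parent is an assumption.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Optional

# Fields the changelog lists as shared from the parent.
INHERITED = ("assumptions", "preconditions", "failure_modes", "alternatives", "typical_n_min")


@dataclass(frozen=True)
class Spec:
    """Simplified stand-in for a registry entry; not the real FunctionSpec."""
    name: str
    description: str = ""
    assumptions: List[str] = field(default_factory=list)
    preconditions: List[str] = field(default_factory=list)
    failure_modes: List[str] = field(default_factory=list)
    alternatives: List[str] = field(default_factory=list)
    typical_n_min: Optional[int] = None
    inherits_from: Optional[str] = None


def resolve(spec: Spec, registry: Dict[str, Spec]) -> Spec:
    """Fill empty inherited fields from the parent; child-owned fields are untouched.

    Cycles in `inherits_from` are not detected in this sketch.
    """
    if spec.inherits_from is None:
        return spec
    parent = resolve(registry[spec.inherits_from], registry)
    updates = {}
    for name in INHERITED:
        if not getattr(spec, name):  # empty list or None: take the parent's value
            updates[name] = getattr(parent, name)
    return replace(spec, **updates)


# Hypothetical registry with one parent and one variant.
registry = {
    "base_estimator": Spec(
        name="base_estimator",
        description="Parent estimator.",
        assumptions=["parallel trends"],
        typical_n_min=200,
    ),
    "base_estimator_variant": Spec(
        name="base_estimator_variant",
        description="Variant with its own description.",
        inherits_from="base_estimator",
    ),
}
card = resolve(registry["base_estimator_variant"], registry)
```

Resolving at render time, as sketched, keeps the stored child entry small and lets a later edit to the parent flow through to every variant.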

README_CN.md

Lines changed: 23 additions & 4 deletions
@@ -14,7 +14,7 @@
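 [![status](https://joss.theoj.org/papers/9f1c837b1b1df7adfcdd538c3698e332/status.svg)](https://joss.theoj.org/papers/9f1c837b1b1df7adfcdd538c3698e332)
 [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.19933900.svg)](https://doi.org/10.5281/zenodo.19933900)

-StatsPAI is the **first agent-native** Python platform for causal inference and applied econometrics. One `import`, **950+ registered functions** across **80+ submodules** (for the live count, run `python scripts/registry_stats.py`), covering the complete empirical research workflow, from classical econometrics to frontier ML/AI causal inference methods to publication-grade tables in Word, Excel, and LaTeX.
+StatsPAI is the **first agent-native** Python platform for causal inference and applied econometrics. One `import`, **1,000+ registered functions** across **80 submodules** (for the live count, run `python scripts/registry_stats.py`), covering the complete empirical research workflow, from classical econometrics to frontier ML/AI causal inference methods to publication-grade tables in Word, Excel, and LaTeX.

 **Built for AI agents**: every function returns a structured result object with machine-readable schemas (`list_functions()`, `describe_function()`, `function_schema()`) and is numerically validated against R and Stata reference implementations. It is designed for LLM-driven research workflows while remaining fully friendly to human researchers.
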
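@@ -46,7 +46,26 @@ StatsPAI focuses on **causal inference**: on this axis, our goal is to be

 **Legend**: 🏆 most complete across ecosystems · ✅ full coverage · ⚠️ partial / scattered / single algorithm · ❌ none.

-**StatsPAI at a glance**: 950+ registered functions in the live agent registry · 80+ submodules · ~230k lines of core code + ~70k lines of tests. All four numbers can be recomputed on the spot from the single generator (`python scripts/registry_stats.py`); the per-module table in [`docs/stats.md`](docs/stats.md) is written back by the same script. For the full coverage matrix (23 method families) and the cross-ecosystem line-count comparison, see [`docs/stats.md`](docs/stats.md).
+**StatsPAI at a glance**: 1,018 registered functions in the live agent registry · 80 submodules · ~249k lines of core code + ~86k lines of tests. All four numbers can be recomputed on the spot from the single generator (`python scripts/registry_stats.py`); the per-module table in [`docs/stats.md`](docs/stats.md) is written back by the same script. For the full coverage matrix (23 method families) and the cross-ecosystem line-count comparison, see [`docs/stats.md`](docs/stats.md).
+
+**📦 v1.15.5 (2026-05-21) — Agent-card coverage ratchet**
+
+StatsPAI now ships a CI-ratcheted agent-card coverage audit, automatically
+generated baseline cards for the 1,018 registered functions, and inherited
+agent-card metadata for canonical estimator variants. This release is a
+registry / metadata infrastructure update: estimator numerical paths are
+unchanged. Full release notes are in [`CHANGELOG.md`](CHANGELOG.md) under `[1.15.5]`.
+
+---
+
+**📦 v1.15.4 (2026-05-18) — Automatic CJK font fallback for plots**
+
+`import statspai as sp` now automatically registers a detected CJK font as a
+matplotlib fallback, so Chinese plot titles, axes, and annotations render
+correctly on common macOS / Windows / Linux desktops without an extra call to
+`sp.use_chinese()`. English plots keep the user's original primary font; to
+turn this off, set `STATSPAI_NO_AUTO_CJK=1` before importing. Full release
+notes are in [`CHANGELOG.md`](CHANGELOG.md) under `[1.15.4]`.

 ---
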
@@ -220,7 +239,7 @@ StatsPAI 1.4.0 is Sprint 2 of the 知识地图 v3 roadmap, closing the end-of-Sprint-1
 | **Particle-filter assimilation** | **`sp.assimilation.particle_filter`**: bootstrap-SIR particle filter with systematic resampling (Gordon-Salmond-Smith 1993; Douc-Cappé 2005). Non-Gaussian priors, heavy-tailed observation noise, and nonlinear dynamics via pluggable callbacks. Agrees with the exact Kalman filter to ~0.003 under Gaussian DGPs. **`sp.assimilative_causal(..., backend='particle')`** routes the end-to-end wrapper to the particle filter. |
 | **Documentation (v3 frontier guides)** | `docs/guides/synth_experimental.md` (Abadie-Zhao inverse-SC workflow), `docs/guides/harvest_did.md` (Borusyak-Hull-Jaravel harvest DID), `docs/guides/assimilative_ci.md` (Nature Comms 2026 streaming CI, Kalman + particle backends). Wired into the `mkdocs.yml` nav. |
 | **v1.3 stable foundation (carried forward)** | 11 2025-2026 frontier methods from Sprint 1: `synth_experimental_design`, `rdrobust(..., bootstrap='rbc')`, `evidence_without_injustice`, `target_trial.to_paper(fmt='jama'/'bmj')`, `harvest_did`, `bcf_ordinal`, `bcf_factor_exposure`, `causal_mas`, `shift_share_political`, `causal_kalman`. All v1.0 capstone surfaces (`sp.bridge`, `sp.fairness`, `sp.surrogate`, `sp.epi`, `sp.longitudinal`, `sp.question`, the full MR suite, the TARGET checklist) remain unchanged. |
-| **Agent platform** | `sp.list_functions()` / `sp.describe_function()` / `sp.function_schema()` provide OpenAI/Anthropic tool-calling schemas for 874+ estimators. This release adds 5 hand-written `FunctionSpec` entries. The `sp.agent.mcp_server` MCP scaffold lets external LLMs call every function in natural language. |
+| **Agent platform** | `sp.list_functions()` / `sp.describe_function()` / `sp.function_schema()` provide OpenAI/Anthropic tool-calling schemas for 1,018 registered public functions. 145 curated or explicitly inherited `FunctionSpec` entries carry assumptions, preconditions, failure modes, limitations, `typical_n_min`, or a validation tier. The `sp.agent.mcp_server` MCP scaffold lets external LLMs call every function in natural language. |
 | **CI/CD hygiene** | The `tabulate` hard dependency from v1.3.0 is carried forward. Fixed the `test_forest_ate_recovers_average_tau` flake with explicit seeding (`random_state=0`, `n_estimators=300`, larger `n`). 2 699+ tests pass across the full OS × Python matrix. |

 **New in v0.6**: `sp.interactive(fig)`, a WYSIWYG figure editor similar to Stata Graph Editor, with 29 academic themes, live preview, and automatic generation of reproducible code.
@@ -592,7 +611,7 @@ sp.__citation__ # equivalent to sp.citation("bibtex")
 author = {Wang, Biaoyue},
 title = {StatsPAI: The Agent-Native Causal Inference \& Econometrics Toolkit for Python},
 year = {2026},
-version = {1.15.3},
+version = {1.15.5},
 doi = {10.5281/zenodo.19933900},
 url = {https://doi.org/10.5281/zenodo.19933900},
 license = {MIT},
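
The baseline-card pass described in this release fills empty fields from docstrings without overwriting curated registry entries, and keeps examples limited to mechanically extracted content. A minimal sketch of that idea follows. It is not the code in `scripts/gen_baseline_cards.py`, which this diff does not show; the function names, the card layout as a plain dict, and the rule of keeping only `>>>` lines as examples are all assumptions.

```python
import inspect
from typing import Any, Callable, Dict


def baseline_card(func: Callable[..., Any]) -> Dict[str, Any]:
    """Mechanically extract a minimal card from a function's docstring."""
    doc = inspect.getdoc(func) or ""
    lines = doc.splitlines()
    summary = lines[0].strip() if lines else ""
    # Keep only doctest-style lines so every example traces back to the source.
    examples = [ln.strip() for ln in lines if ln.strip().startswith(">>>")]
    return {"description": summary, "examples": examples}


def fill_empty(curated: Dict[str, Any], baseline: Dict[str, Any]) -> Dict[str, Any]:
    """Copy baseline values only into fields the curated card leaves empty."""
    merged = dict(curated)
    for key, value in baseline.items():
        if not merged.get(key):
            merged[key] = value
    return merged


def toy_estimator(data):
    """Estimate a toy effect.

    >>> toy_estimator([1, 2, 3])
    2.0
    """
    return sum(data) / len(data)


# The curated description is kept; only the empty examples field is filled.
curated = {"description": "Hand-written description.", "examples": []}
card = fill_empty(curated, baseline_card(toy_estimator))
```

Because the merge only touches empty fields, rerunning the generator after a curator edits an entry would leave that edit in place.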
