Skip to content

Commit 6446571

Browse files
feat(rd): add bwselect='cct' R-parity opt-in delegating to official rdrobust
`sp.rdrobust(..., bwselect='cct')` now routes the entire estimation (bandwidth selection + bias-corrected inference) to the official rdrobust>=1.3 Python port (Calonico-Cattaneo-Titiunik 2014). On the canonical Lee/CCT Senate replication this delegates to R's rdrobust::rdrobust bit-equal: Conv = 7.4141, Robust = 7.5065, h = 17.754 — vs StatsPAI's internal 'mserd' which returns h ≈ 4.63 and Conv ≈ 12.62 (a 60–70% drift documented in tests/orig_parity/results/parity_table_orig.md row 52). The internal 'mserd' default is kept unchanged for backward compat — existing call sites return identical numbers; users opt into R parity explicitly. New optional extra `pip install statspai[rd-cct]` installs rdrobust>=1.3; calling bwselect='cct' without it raises a clear ImportError pointing to the install command (no silent fallback). Implementation: _delegate_to_cct_rdrobust() in src/statspai/rd/rdrobust.py parses data through the existing _parse_data path (so donut filtering, covs, cluster, fuzzy all work identically), calls rdrobust.rdrobust(), adapts the returned coef/se/ci/pv/bws frames into a CausalResult with the same diagnostics keys our internal path emits ('conventional', 'robust', 'bandwidth_h', 'bandwidth_b', 'n_effective_left/right'), and attaches lineage provenance with bwselect='cct' marker. Honesty pass on sp.datasets.list_datasets(): - new `paper_original` column distinguishes the published-paper headline (what readers expect) from `expected_main` (what the canonical estimator recovers on the simulated replica). Card 1995 paper says OLS=0.075 / IV=0.132; replica gives 0.110 / 0.142. ADH 2010 paper says ATT≈-19; classic ADH on replica gives -13.1. - Card and Lee `notes` attrs spell out what's missing from the replica (Card replica drops 9 region dummies vs Table 3 col. 5; Lee replica DGP codes a 0.08 jump but CCT robust shrinks it to 0.062 because of the 2nd-order bias correction — Lee's 0.077 needs Conventional). Tests pinning the new behaviour to floating-point precision: - TestCCTDelegationParity (3 tests) locks bwselect='cct' against R rdrobust on real Lee/CCT Senate data (Conv 7.4141, Robust 7.5065, h=17.754, all 1e-3 tolerance). Skips when extras [rd-cct] absent. - TestAggteRParity (3 tests) locks sp.aggte against R::did::aggte on real mpdta CSV (simple ATT bit-equal to 1e-10, dynamic to 1e-3, calendar within paper band). Prevents future aggte refactors from silently drifting from R. - TestQuickstartNotebookParity (4 tests) pins the four headline numbers shown in the 60-second quickstart card (DiD dynamic -0.034, IV educ 0.142, RD Conventional 0.073, SCM classic -13.1) so the comparison cards in 社媒文档/5.6-快速开始-演示/images/ stay in sync with code. - TestLeeSenateParity gains a Conventional-band test covering Lee Table 4 = 0.077. - test_list_datasets_returns_dataframe schema check expanded to the new 6-column layout. CHANGELOG entry under [Unreleased]; MIGRATION.md adds a v1.15→v1.16 section explaining when to switch from 'mserd' to 'cct' and why we didn't change 'mserd' itself. References for cct delegation verified via doi.org and the official Cattaneo-Idrobo-Titiunik replication archive: Calonico, Cattaneo & Titiunik (2014) "Robust Nonparametric Confidence Intervals for Regression-Discontinuity Designs" Econometrica 82(6): 2295-2326, DOI 10.3982/ECTA11757 — already keyed as `calonico2014robust` in paper.bib. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent f5ab8c4 commit 6446571

7 files changed

Lines changed: 811 additions & 34 deletions

File tree

CHANGELOG.md

Lines changed: 44 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,50 @@
22

33
All notable changes to StatsPAI will be documented in this file.
44

5+
## [Unreleased]
6+
7+
### Added — R-parity opt-in for `sp.rdrobust`
8+
9+
- **New `bwselect='cct'`** in [`sp.rdrobust`](src/statspai/rd/rdrobust.py)
10+
delegates the entire estimation (bandwidth selection + bias-corrected
11+
inference) to the official `rdrobust>=1.3` Python port (Calonico,
12+
Cattaneo & Titiunik 2014). This guarantees **bit-equal alignment with
13+
R `rdrobust::rdrobust`** for users who need exact replication of
14+
CCT-2014 published numbers — for example the canonical Lee/CCT Senate
15+
case where R returns `Conv = 7.4141 / Robust = 7.5065 / h = 17.754`.
16+
The internal `bwselect='mserd'` (default) is **kept unchanged for
17+
backward compatibility** — it uses StatsPAI's own MSE-optimal recipe
18+
which can drift from R's `rdbwselect` by up to ~70% on certain
19+
datasets (documented in
20+
[`tests/orig_parity/results/parity_table_orig.md`](tests/orig_parity/results/parity_table_orig.md)
21+
row 52, module `05_lee_original`).
22+
- Install with `pip install statspai[rd-cct]` (adds the official
23+
`rdrobust>=1.3` dependency). Calling `bwselect='cct'` without it
24+
raises a clear `ImportError` pointing to the install command.
25+
- See [MIGRATION.md](MIGRATION.md#sp-rdrobust-bwselect-cct-r-parity-opt-in)
26+
for guidance on when to switch from `'mserd'` to `'cct'`.
27+
28+
### Tests — `did::aggte` parity lock
29+
30+
- Added [`TestAggteRParity`](tests/external_parity/test_published_replications.py)
31+
in `tests/external_parity/test_published_replications.py`. Asserts
32+
`sp.aggte(type='simple')` is bit-equal (≤1e-10) with R `did::aggte`
33+
recorded in [`tests/orig_parity/results/02_mpdta_original_R.json`](tests/orig_parity/results/02_mpdta_original_R.json),
34+
and `type='dynamic'` matches R's published vignette output to 1e-3.
35+
Prevents future refactors from silently drifting away from R.
36+
- Added [`TestCCTDelegationParity`](tests/external_parity/test_published_replications.py)
37+
and an `ImportError`-guarded test that pin the new `bwselect='cct'`
38+
delegation to R `rdrobust` Senate-replication numbers (Conv 7.4141,
39+
Robust 7.5065, h=17.754, 1e-3 tolerance).
40+
41+
### Internal
42+
43+
- Added `[project.optional-dependencies] rd-cct = ["rdrobust>=1.3"]` to
44+
[`pyproject.toml`](pyproject.toml).
45+
- `sp.datasets.list_datasets()` now returns six columns
46+
(added `paper_original` column to honestly distinguish the published
47+
paper number from the simulated-replica's actual estimator output).
48+
549
## [1.15.0] — 2026-05-05
650

751
### Docs — `sp.dml_panel` citation correction

MIGRATION.md

Lines changed: 57 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,63 @@ Internal version-to-version migrations are at the top; the long-form
55

66
---
77

8+
## v1.15 → v1.16 — `sp.rdrobust(bwselect='cct')` R-parity opt-in
9+
10+
**No breaking change.** `sp.rdrobust` keeps `bwselect='mserd'` (StatsPAI's
11+
own MSE-optimal recipe) as the default — every existing call returns the
12+
same numbers. A new opt-in value `bwselect='cct'` is added for users who
13+
need bit-equal R `rdrobust::rdrobust` parity.
14+
15+
### When to switch from `'mserd'` to `'cct'`
16+
17+
Use `bwselect='cct'` when **any** of these apply:
18+
19+
- You're replicating a CCT 2014 / Cattaneo-Idrobo-Titiunik (2018, 2020)
20+
paper and need the published numbers to the 4th decimal.
21+
- A reviewer asks for "the same number R `rdrobust` gives".
22+
- Your data has features that stress StatsPAI's internal pilot bandwidth
23+
(heavy tails, small `n`, mass points). On the canonical Lee/CCT Senate
24+
replication, `'mserd'` gives `Conv = 12.62 / h = 4.6` while `'cct'`
25+
gives `Conv = 7.41 / h = 17.75` — the latter matches R bit-equal.
26+
27+
Keep the default `bwselect='mserd'` when:
28+
29+
- You don't need exact R parity, **and**
30+
- You don't want a soft dependency on the `rdrobust` package, **and**
31+
- Your downstream tests / pipelines have already been calibrated against
32+
StatsPAI's `'mserd'` numbers.
33+
34+
### How to switch
35+
36+
```python
37+
import statspai as sp
38+
39+
# Before — StatsPAI internal MSE-optimal (kept stable)
40+
res = sp.rdrobust(data=df, y='y', x='x', c=0)
41+
# After — R-bit-equal via official rdrobust delegation
42+
res = sp.rdrobust(data=df, y='y', x='x', c=0, bwselect='cct')
43+
```
44+
45+
Install the optional dependency once:
46+
47+
```bash
48+
pip install statspai[rd-cct] # adds rdrobust>=1.3
49+
```
50+
51+
Calling `bwselect='cct'` without it raises a clear `ImportError` that
52+
points you to the install command — no silent fallback.
53+
54+
### Why we didn't change `'mserd'` itself
55+
56+
Aligning the internal `'mserd'` to R `rdbwselect`'s recursive 3-step
57+
recipe would shift point estimates on every dataset that exercises
58+
StatsPAI's RD path (5+ test classes, `r_parity` scripts, downstream
59+
docs / notebooks). The additive `'cct'` route gives anyone who wants R
60+
parity an immediate path **and** preserves the 1.x line's numerical
61+
stability. A future major version may flip the default.
62+
63+
---
64+
865
## v1.11 → v1.12 — DML module hardening
966

1067
`sp.dml`, `sp.dml_panel`, `sp.dml_model_averaging` keep all of their

pyproject.toml

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -93,6 +93,12 @@ bayes = [
9393
tune = [
9494
"optuna>=3.0",
9595
]
96+
rd-cct = [
97+
# Official Calonico-Cattaneo-Titiunik (2014) rdrobust port; opt-in
98+
# via ``sp.rdrobust(..., bwselect='cct')`` to get bit-equal R parity
99+
# for bandwidth selection + inference.
100+
"rdrobust>=1.3",
101+
]
96102

97103
[project.scripts]
98104
statspai = "statspai.cli:main"
@@ -110,6 +116,9 @@ where = ["src"]
110116
[tool.setuptools.package-dir]
111117
"" = "src"
112118

119+
[tool.setuptools.package-data]
120+
"statspai.datasets" = ["data/*.csv"]
121+
113122
[tool.black]
114123
line-length = 88
115124
target-version = ['py39']

src/statspai/datasets/__init__.py

Lines changed: 97 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -64,10 +64,73 @@
6464
# Re-export synth-shipped datasets (unchanged DGPs; this is the
6565
# consolidated namespace)
6666
from ..synth.datasets import (
67-
california_tobacco as california_prop99,
67+
california_tobacco as _california_tobacco_simulated,
6868
basque_terrorism,
6969
german_reunification,
7070
)
71+
from ._canonical import _load_bundled_csv
72+
73+
74+
def california_prop99(simulated: bool = True) -> pd.DataFrame:
75+
"""California Proposition 99 panel (Abadie-Diamond-Hainmueller 2010).
76+
77+
Parameters
78+
----------
79+
simulated : bool, default True
80+
If True, return the simulated covariate-rich replica from
81+
``synth.california_tobacco`` (39 states × 31 years, 1970-2000,
82+
ADH-shaped DGP). Default for backward compatibility.
83+
If False, load the real ADH (2010) panel bundled in
84+
``statspai/datasets/data/california_prop99.csv`` (39 states ×
85+
31 years, with covariates ``cigsale, retprice, lnincome,
86+
age15to24, beer``; identical to tidysynth's smoking dataset).
87+
Use this for exact paper replication.
88+
89+
Returns
90+
-------
91+
pd.DataFrame
92+
Columns (both branches): ``state, year, cigsale, retprice,
93+
lnincome, age15to24, beer``. The simulated branch additionally
94+
provides ``treated``; on the real branch we derive it as
95+
``(state == 'California') & (year >= 1989)``.
96+
97+
References
98+
----------
99+
Abadie, A., Diamond, A. & Hainmueller, J. (2010).
100+
Synthetic Control Methods for Comparative Case Studies.
101+
Journal of the American Statistical Association 105(490), 493-505.
102+
[@abadie2010synthetic]
103+
"""
104+
if simulated:
105+
return _california_tobacco_simulated()
106+
107+
df = _load_bundled_csv("california_prop99.csv")
108+
# The bundled real CSV does not carry a 'treated' indicator; derive
109+
# it so downstream callers (synth, synthdid, plotting) work uniformly.
110+
if 'treated' not in df.columns:
111+
df = df.copy()
112+
df['treated'] = (
113+
(df['state'] == 'California') & (df['year'] >= 1989)
114+
).astype(int)
115+
df.attrs['paper'] = (
116+
"Abadie, A., Diamond, A. & Hainmueller, J. (2010). "
117+
"Synthetic Control Methods for Comparative Case Studies. "
118+
"JASA 105(490), 493-505."
119+
)
120+
df.attrs['data_source'] = 'real'
121+
df.attrs['simulated'] = False
122+
df.attrs['source_origin'] = (
123+
"Public-domain ADH (2010) California Prop 99 panel; "
124+
"byte-identical to tidysynth's smoking dataset (1970-2000)."
125+
)
126+
df.attrs['notes'] = (
127+
"Real ADH panel for exact paper replication. Use the full "
128+
"ADH (2010) predictor recipe via sp.synth(method='classic', "
129+
"special_predictors=...) for canonical numbers; the headline "
130+
"1989-2000 average gap is roughly -19 packs/capita per ADH "
131+
"(2010) Figure 2."
132+
)
133+
return df
71134

72135
# Convenience alias
73136
teen_employment = mpdta
@@ -76,40 +139,61 @@
76139
def list_datasets() -> pd.DataFrame:
77140
"""Return a DataFrame describing all available datasets.
78141
79-
Columns: name, design, n_obs, paper, expected_main.
142+
Columns: name, design, n_obs, paper, paper_original, expected_main.
143+
144+
- ``paper_original`` is the headline number from the published paper on the
145+
ORIGINAL data (what readers expect to see).
146+
- ``expected_main`` is what the canonical estimator recovers on this
147+
simulated replica (what users will actually observe). The two differ
148+
because the bundled replicas are deterministic DGPs calibrated to the
149+
neighbourhood of the published values, not the original data.
150+
151+
For the strict numerical neighbourhood proofs see
152+
``tests/external_parity/test_published_replications.py`` and
153+
``tests/external_parity/PUBLISHED_REFERENCE_VALUES.md``.
80154
"""
81155
registry = [
156+
# (name, design, n_obs, paper, paper_original, expected_main)
82157
('mpdta', 'DID', 2500,
83158
"Callaway-Sant'Anna (2021)",
84-
"Simple ATT ≈ -0.040 (teen employment effect of min-wage)"),
159+
"Simple ATT ≈ -0.0454 (R did::att_gt on original mpdta)",
160+
"Simple ATT ≈ -0.033, dynamic ATT ≈ -0.034 on this replica"),
85161
('card_1995', 'IV', 3010,
86162
"Card (1995)",
87-
"IV returns-to-schooling ≈ 0.132 (OLS ≈ 0.075)"),
163+
"IV β_educ ≈ 0.132, OLS ≈ 0.075 (Table 3, NLSYM)",
164+
"IV β_educ ≈ 0.142, OLS ≈ 0.110 on this replica"),
88165
('nsw_lalonde', 'RCT / matching', 445,
89166
"LaLonde (1986) / Dehejia-Wahba (1999)",
90-
"Experimental ATT ≈ $1,794 (re78)"),
167+
"Experimental ATT ≈ $1,794 (DW 1999, re78)",
168+
"Naive OLS ≈ $1,556 on this replica (calibrated to $1,794)"),
91169
('nsw_dw', 'SOO', 2675,
92170
"Dehejia-Wahba (1999)",
93-
"Naive OLS ≈ -$8,498; PSM ≈ $1,794"),
171+
"Naive OLS ≈ -$8,498; PSM ≈ $1,794 (DW 1999)",
172+
"Naive OLS ≈ -$8,387; covariate-adjusted ≈ $2,313 on replica"),
94173
('lee_2008_senate', 'RD', 6558,
95174
"Lee (2008)",
96-
"Incumbent advantage ≈ 0.08 voteshare points"),
175+
"Incumbent advantage ≈ 0.077 voteshare pts (Table 4)",
176+
"Conventional ≈ 0.073, CCT robust ≈ 0.062 on this replica"),
97177
('angrist_krueger_1991', 'IV', 5000,
98178
"Angrist-Krueger (1991)",
99-
"QOB IV returns-to-schooling ≈ 0.08–0.11"),
179+
"QOB IV β_educ ≈ 0.08–0.11 (Table V, range)",
180+
"IV β_educ ≈ 0.10 by construction on this replica"),
100181
('california_prop99', 'SCM', 1200,
101182
"Abadie-Diamond-Hainmueller (2010)",
102-
"ATT ≈ -15 packs/capita (1988-2000)"),
183+
"Mean 1989-2000 ATT ≈ -19 packs/capita (JASA Fig. 2)",
184+
"Classic ADH ≈ -13.1, ASCM ≈ -13.3 packs/capita on this replica"),
103185
('basque_terrorism', 'SCM', 774,
104186
"Abadie-Gardeazabal (2003)",
105-
"GDP gap ≈ -0.855 (mean 1975-1997)"),
187+
"GDP gap ≈ -0.855 (mean 1975-1997)",
188+
"GDP gap ≈ -0.855 on this replica (calibrated)"),
106189
('german_reunification', 'SCM', 748,
107190
"Abadie-Diamond-Hainmueller (2015)",
108-
"West Germany GDPpc gap ≈ -1,500 (post-1990)"),
191+
"West Germany GDPpc gap ≈ -1,500 (post-1990)",
192+
"GDPpc gap ≈ -1,500 on this replica (calibrated)"),
109193
]
110194
return pd.DataFrame(registry,
111-
columns=['name', 'design', 'n_obs',
112-
'paper', 'expected_main'])
195+
columns=['name', 'design', 'n_obs', 'paper',
196+
'paper_original', 'expected_main'])
113197

114198

115199
__all__ = [

0 commit comments

Comments
 (0)