Adapted from the verified Economic-Research-Journal-Skills/resources/code library (Stata 18 MP, 2026-06); translated to English and made venue-neutral.
A reproducible code skeleton for empirical causal-inference research, venue-neutral and ready to copy-and-adapt. It serves both as "ready-to-run" templates for each identification method and as a concrete layout for a one-click reproduction workflow.
Verified-on-Stata-18 (per the source library): the core command chains were run end-to-end on synthetic data on Stata 18 MP and ran successfully (2026-06):
reghdfe(TWFE),bacondecomp,csdid,did_imputation, the event-study + parallel-trends joint test,esttabthree-line-table export,ivreg2+ KP rk F,weakivtesteffective F,rdrobust, andrddensity. All command syntax has been checked. For version-sensitive points (see below), defer to the latest official documentation.
Stata do-file rule (verified): in do-file mode a loop body must NOT share a line with
{— a newline is required after the opening brace. This library follows that rule throughout (see theforvalues/foreachblocks).
- Organize the project as below, with the scripts in this directory under
code/:
project/
├── data/raw/ raw data (read-only)
├── data/clean/ cleaned data
├── code/ <- scripts in this directory
├── output/tables/
└── output/figures/
- Open
stata/00_master.do, changeglobal rootto your project root, and uncomment the dependency-install block on the first run. - Run
do code/00_master.dofor a one-click pass: clean -> describe -> identify -> mechanism -> robustness -> export tables/figures.
Throughout the scripts, illustrative names are kept generic: Y (outcome), D / treat (treatment), id (unit), year, gvar (first treated period), and a controls list (size lev roa age group).
| File | Purpose |
|---|---|
00_master.do |
One-click master: paths, dependencies, fixed random seed |
01_clean.do |
Raw -> analysis sample: merge, recorded filtering, variable construction, winsorize |
02_descriptive.do |
Descriptive-statistics table; treatment/control balance |
03_did_modern.do |
TWFE baseline -> Bacon decomposition -> CS/BJS/SA/dCDH -> event study |
04_iv.do |
IV: KP rk F + effective F (weakivtest) + AR-robust inference (weakiv) |
05_rdd.do |
RDD: rdrobust bias-corrected robust CI + rddensity manipulation test |
06_dml.do |
Double machine learning: ddml + pystacked multi-learner |
07_mechanism.do |
Mechanism: estimate D->M only; argue M->Y by theory |
08_robustness.do |
Robustness system organized by identification threat + placebo + wild bootstrap |
09_tables.do |
Three-line tables (esttab) and event-study figure export |
| File | Purpose |
|---|---|
python/clean_panel.py |
Large-scale micro-panel cleaning (pandas), outputs analysis.dta |
python/dml_doubleml.py |
DoubleML PLR, multi-learner robustness comparison |
python/event_study_plot.py |
Publication-grade event-study / parallel-trends plot (matplotlib, >= 300 dpi) |
| Method | Paper | Stata | Python / R |
|---|---|---|---|
| Staggered DID | Callaway & Sant'Anna (2021) | csdid |
R did::att_gt |
| Staggered DID | Sun & Abraham (2021) | eventstudyinteract |
R fixest::sunab |
| Staggered DID | Borusyak, Jaravel & Spiess (2024) | did_imputation |
R didimputation |
| Staggered DID | Gardner (2022) two-stage | did2s |
R did2s |
| Staggered DID | de Chaisemartin & D'Haultfoeuille | did_multiplegt_dyn |
R DIDmultiplegtDYN |
| Bias diagnosis | Goodman-Bacon (2021) | bacondecomp |
— |
| IV effective F | Montiel Olea & Pflueger (2013) | weakivtest |
— |
| Weak-IV-robust | Anderson-Rubin / CLR | weakiv |
— |
| RDD | Calonico-Cattaneo-Titiunik (2014) | rdrobust rddensity |
R/Python same-name packages |
| DML | Chernozhukov et al. (2018) | ddml pdslasso |
Python DoubleML |
| Few-cluster inference | Roodman et al. (2019) | boottest |
— |
| Selection-bias bound | Oster (2019) | psacalc |
— |
Install uniformly with
ssc install <pkg>, replace.
- de Chaisemartin & D'Haultfoeuille:
did_multiplegt_dynsupersedes the olderdid_multiplegt. Use the_dyncommand and confirm option names against current docs. - Weak-IV inference: Stata 19 ships built-in weak-IV-robust inference; on Stata 18 and earlier this library uses the community-contributed
weakivtest/weakiv. Prefer the built-in path where available. - Manipulation test:
rddensity(Cattaneo-Jansson-Ma) supersedes the older McCrary (2008) density test. - DID estimator choice: report at least one heterogeneity-robust estimator (CS / SA / BJS / dCDH) as the main result rather than TWFE alone; package option names evolve, so verify against the latest documentation.