MC Comparison: Run Stata phase and finalize results

## Status

The Monte Carlo comparison framework from #1 is committed in `montecarlo/`. Phase 1 (R) is complete. Phase 2 (Stata) needs to run on a faster machine.

## What's done

**Phase 1 (R MC) — COMPLETE** (`01_mc_R.R` ran locally, ~10 min on 7 cores)

- 500 sims per cell; 2 DGPs × 2 sample sizes (200, 500) × 3 deltas (0, 1, 2) × 2 methods (simple, frechet) = 24 sharp cells, plus 1 fuzzy cell
- All 500 of 500 replications valid in every cell
- R results summary:
  - Coverage: 0.98-0.99 (conservative relative to 0.95 nominal)
  - Size (delta=0): 0.07-0.12
  - Power (delta=2): 0.99-1.00
- Output saved to `montecarlo/output/` (gitignored):
  - `output/data/` — 1,300 CSV files (data for Stata)
  - `output/xi/` — 200 xi matrices
  - `output/results_R/` — 5,127 files (per-rep results + summary + .rds)

## What to do next

### Step 1: Generate R output on the server

The `output/` directory is gitignored, so you need to regenerate it:

```bash
cd montecarlo/
Rscript 01_mc_R.R
```

This takes ~10 min with 7+ cores. It writes all data/xi/results to `output/`.
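A quick sanity check after the run is to count what landed in `output/`. The sketch below assumes the server run reproduces the file counts reported for the local run (1,300 data CSVs, 200 xi matrices) and that the xi matrices are stored as `.csv` files. `count_files` is a hypothetical helper, not part of the repository.

```bash
# Hypothetical helper: count regular files under DIR whose names match PATTERN.
# Usage: count_files DIR PATTERN
count_files() {
  find "$1" -type f -name "$2" 2>/dev/null | wc -l | tr -d ' '
}

# Compare against the counts reported for the local run (assumed reproducible).
if [ -d output ]; then
  echo "data CSVs:   $(count_files output/data '*.csv') (local run: 1300)"
  echo "xi matrices: $(count_files output/xi '*.csv') (local run: 200)"
fi
```

If the counts differ, the R log is the first place to look before starting the much longer Stata phase.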

### Step 2: Run Stata MC

```bash
cd montecarlo/
stata-se -b do 02_mc_stata.do
# or: /path/to/StataSE -b do 02_mc_stata.do
```

This runs 100 sims × 16 sharp cells × 2 runs each (shared-bw + own-bw) + 100 fuzzy sims. Runtime estimate: 2-6 hours depending on the machine.
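The workload arithmetic can be spelled out as a small sketch. It assumes the fuzzy cell is run once per sim, reading the line above literally. The function names are illustrative only.

```bash
# Number of sharp cells = DGPs x sample sizes x deltas x methods.
sharp_cells() { echo $(( $1 * $2 * $3 * $4 )); }

# Total estimator calls = sims x cells x runs per cell + fuzzy sims.
total_runs() { echo $(( $1 * $2 * $3 + $4 )); }

# Stata grid: 2 DGPs, 2 sample sizes, 2 deltas (0 and 2), 2 methods.
cells=$(sharp_cells 2 2 2 2)
echo "sharp cells: $cells"
echo "Stata runs:  $(total_runs 100 "$cells" 2 100)"
```

This prints 16 sharp cells and 3,300 runs in total, which is what the 2-6 hour estimate has to cover.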

**Known issue fixed:** The original `capture noisily { r3d ... /// }` pattern broke in batch mode. The script now calls a helper `program` (`mc_run_and_save`) that avoids braces around `capture noisily`.

**If it still fails:** Check `02_mc_stata.log` in the working directory and `output/results_Stata/mc_stata.log`. Common issues:
- r3d package not found → `net install r3d, from("../stata_r3d/") replace`
- Mata not compiled → the script tries auto-compilation but may need `do ../stata_r3d/mata/r3d_mata.mata` manually first
- Path issues → script assumes `cwd = montecarlo/`
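Some of these failures can be caught before starting a multi-hour batch job. The following is a minimal pre-flight sketch, not part of the repository: `preflight` is a hypothetical function, and it assumes the Stata script needs `output/data` and `output/xi` from Step 1. It does not check the r3d installation or Mata compilation, which have to be verified inside Stata.

```bash
# Hypothetical pre-flight check; returns non-zero if any prerequisite is missing.
# Usage: preflight [DIR] [STATA_BIN]
preflight() {
  local dir=${1:-.} stata_bin=${2:-stata-se} status=0 sub
  # The do-file must be in the working directory (script assumes cwd = montecarlo/).
  if [ ! -f "$dir/02_mc_stata.do" ]; then
    echo "02_mc_stata.do not found in $dir: run from montecarlo/" >&2
    status=1
  fi
  # Inputs written by the R phase (assumed to be required by the Stata phase).
  for sub in output/data output/xi; do
    if [ ! -d "$dir/$sub" ]; then
      echo "$sub missing: run Step 1 (Rscript 01_mc_R.R) first" >&2
      status=1
    fi
  done
  # The Stata executable must be resolvable, by name or by full path.
  if ! command -v "$stata_bin" >/dev/null 2>&1; then
    echo "$stata_bin not found: pass the full path to the Stata executable" >&2
    status=1
  fi
  return "$status"
}
```

A typical use would be `preflight . stata-se && stata-se -b do 02_mc_stata.do`, so the batch job only starts when all checks pass.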

### Step 3: Run comparison

```bash
cd montecarlo/
Rscript 03_mc_compare.R
```

Produces:
- `output/comparison/mc_summary.csv` — main results table
- `output/comparison/mc_agreement.csv` — per-replication R vs Stata diffs
- `output/comparison/mc_tables.tex` — LaTeX table
- Console: PASS/FAIL assessment

### Step 4 (alternative): Run everything at once

```bash
cd montecarlo/
bash 00_run_all.sh
```

Runs phases 0-3 sequentially (phase 0 = equivalence tests), so it replaces Steps 1-3 above.

## Key design decisions

1. **Shared xi matrices**: R and Stata use identical N(0,1) multiplier matrices (saved as CSV) for deterministic bootstrap comparison.
2. **Two Stata runs per cell**: "shared-bw" (uses R's bandwidths) for apples-to-apples comparison, "own-bw" (Stata's bandwidth selection) for end-to-end comparison.
3. **Stata subset**: Only 100 of the 500 sims are run in Stata (runtime). R results use all 500.
4. **Delta=1 skipped in Stata**: Only delta=0 and delta=2 (null + large effect) to halve runtime.

## Success criteria

- Direct agreement (shared bw): tau diff p95 < 0.01 (frechet), < 0.05 (simple)
- Coverage (delta>0): both > 0.80
- Size (delta=0): both < 0.15
- R vs Stata power: within 10pp
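For reference, the thresholds above can be expressed as a small check. This is only a sketch of the criteria, not the logic in `03_mc_compare.R`: the function names are hypothetical, the values are passed as arguments because the column layout of `mc_summary.csv` is not specified here, and "within 10pp" is assumed to be inclusive. The numbers in the example call are made-up inputs, not results.

```bash
# Float comparisons go through awk, since bash arithmetic is integer-only.
lt() { awk -v a="$1" -v b="$2" 'BEGIN { exit !(a < b) }'; }
gt() { awk -v a="$1" -v b="$2" 'BEGIN { exit !(a > b) }'; }
within() {
  awk -v a="$1" -v b="$2" -v t="$3" \
    'BEGIN { d = a - b; if (d < 0) d = -d; exit !(d <= t) }'
}

# Usage: assess METHOD TAU_P95 COV_R COV_STATA SIZE_R SIZE_STATA POW_R POW_STATA
assess() {
  local tol=0.05 verdict=PASS
  if [ "$1" = frechet ]; then tol=0.01; fi
  lt "$2" "$tol"                    || verdict=FAIL  # direct agreement (shared bw)
  { gt "$3" 0.80 && gt "$4" 0.80; } || verdict=FAIL  # coverage, delta > 0
  { lt "$5" 0.15 && lt "$6" 0.15; } || verdict=FAIL  # size, delta = 0
  within "$7" "$8" 0.10             || verdict=FAIL  # power gap, inclusive (assumption)
  echo "$verdict"
}

# Illustrative inputs only.
assess frechet 0.004 0.98 0.97 0.08 0.10 0.99 1.00
```

Note that the same tau difference can pass for `simple` and fail for `frechet`, because the frechet tolerance is five times tighter.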
