This bundle uses both R (core MH-within-Gibbs sampler + waterfowl analysis) and Python (NUTS-HMC and PMMH baselines + plotting). Both stacks need to be installed before any of the make targets in README.md will work.
Time required: ~10 min for a clean install; less if R / Python are already set up.
The code has been run on R ≥ 4.2. Any reasonably recent R should work; check with:

    R.version.string

From an R prompt at the repo root:

    install.packages(c(
      "Rcpp",
      "RcppArmadillo",
      "truncnorm",
      "lpSolve",
      "lintools",
      "bridgesampling",
      "mvtnorm"
      # 'splines' is shipped with base R; no need to install separately
    ))

The first run of any sampling script will trigger Rcpp compilation of the C++ files in `R/src/cpp/`. Compilation requires a working C++ toolchain:

- macOS: `xcode-select --install` (Apple Clang), or install GCC via Homebrew.
- Linux: `sudo apt install r-base-dev build-essential` (or your distro's equivalent).
- Windows: install Rtools matching your R version.
OpenMP is used to parallelize the hit-and-run inner loop. If your toolchain doesn't support OpenMP, the code will still compile (the #pragma omp lines become no-ops) but will run single-threaded.
`make smoke` should complete in well under a minute and write outputs under `results/runs/mh_within_gibbs/smoke/` and `results/runs/data_analysis/duck_smoke/`.
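
To confirm that the smoke run wrote something to both locations, a check along the following lines can help. It is a minimal sketch: the two directory paths are the ones named above, while the helper name `missing_smoke_outputs` is made up for illustration, and "contains at least one file" is an assumed success criterion, not one the bundle defines.

```python
from pathlib import Path

# Output directories named in this guide, relative to the repo root.
SMOKE_DIRS = [
    "results/runs/mh_within_gibbs/smoke",
    "results/runs/data_analysis/duck_smoke",
]


def missing_smoke_outputs(repo_root=".", dirs=SMOKE_DIRS):
    """Return the listed directories that are absent or contain no files.

    Hypothetical helper: 'at least one file somewhere below the directory'
    is an assumed criterion for a successful smoke run.
    """
    root = Path(repo_root)
    missing = []
    for rel in dirs:
        d = root / rel
        has_file = d.is_dir() and any(p.is_file() for p in d.rglob("*"))
        if not has_file:
            missing.append(rel)
    return missing


if __name__ == "__main__":
    problems = missing_smoke_outputs()
    print("smoke outputs OK" if not problems else f"missing or empty: {problems}")
```

Run it from the repo root after `make smoke`; an empty result means both directories exist and hold at least one file.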
For exact replication of the paper's tables, capture the package versions you used:

    # from R, at the repo root
    renv::init()      # creates renv.lock
    renv::snapshot()  # records current package versions

Commit the resulting `renv.lock` alongside this file. Future reviewers can then run `renv::restore()` to install the exact set.
Tested on Python ≥ 3.10. Earlier versions may work but JAX increasingly requires 3.10+.
From the repo root:

    python -m venv .venv
    source .venv/bin/activate   # Windows: .venv\Scripts\activate
    pip install -r requirements.txt

This installs loose-pinned versions of:

- numpy, scipy, pandas, matplotlib
- jax, jaxlib, numpyro (for `python/hmc/`)
- arviz (for `python/pmmh/pmmh_diagnostics.py`)
- jupyter (to run the `.ipynb` entry points)
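
After the install, it can be useful to verify the interpreter version and see which of the packages above actually landed in the virtual environment. The sketch below uses only the standard library; the distribution names are the ones listed above, and the helper names `python_ok` and `installed_versions` are hypothetical, not part of the bundle.

```python
import sys
from importlib import metadata

# Distribution names as listed in this guide (assumed to match requirements.txt).
REQUIRED = [
    "numpy", "scipy", "pandas", "matplotlib",
    "jax", "jaxlib", "numpyro", "arviz", "jupyter",
]


def python_ok(minimum=(3, 10)):
    """True if the running interpreter is at least `minimum` (major, minor)."""
    return sys.version_info >= minimum


def installed_versions(names=REQUIRED):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    print("Python >= 3.10:", python_ok())
    for name, version in installed_versions().items():
        print(f"{name:12s} {version or 'NOT INSTALLED'}")
```

Run it with the virtual environment activated; any `NOT INSTALLED` line points back to the `pip install -r requirements.txt` step.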
GPU note: `pip install jax jaxlib` installs the CPU build. The paper's tables are reproducible on CPU; if you want GPU acceleration for large NUTS runs, see the JAX install instructions.
The committed `requirements.txt` uses `>=` floors, not exact versions. For an exact replication, freeze the working environment:

    pip freeze > requirements-lock.txt
    git add requirements-lock.txt
    git commit -m "build: add Python lockfile"

Reviewers can then install the exact set with:

    pip install -r requirements-lock.txt

The simplest end-to-end check is the PMMH driver:

    python python/pmmh/run_pmmh.py

This runs a small (n=1000, p=5, d=2, m=1) PMMH chain in a couple of minutes and writes outputs under `results/runs/pmmh/`.
For NUTS-HMC, open python/hmc/batch_run_nuts.ipynb in Jupyter and run all cells; expect ≈ 5–15 min depending on the loop's (d, m) grid.
| Symptom | Likely cause | Fix |
|---|---|---|
| `Rcpp::sourceCpp` fails with "no C++ compiler" | Missing toolchain | See §1.2. |
| `Error in is_totally_unimodular(A)` | `lintools` not installed | `install.packages("lintools")` |
| `ImportError: No module named jax` | Python deps not installed | See §2.2. |
| `make smoke` writes to a different folder | Stale `JASA_*` env vars in your shell | `unset $(env \|` … |
| Different RMSE numbers than the paper | RNG depends on `OMP_NUM_THREADS` (the C++ hit-and-run uses R's RNG inside an OpenMP region) | Set `OMP_NUM_THREADS=1` for byte-exact replication. |
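
For the stale `JASA_*` case, one way to see what is set before clearing anything is to list the matching variables. The sketch below is illustrative only: this guide does not name the individual variables, so the helper `stale_vars` simply matches the `JASA_` prefix. A Python process cannot change its parent shell, so the script prints `unset` lines to review and paste rather than unsetting anything itself.

```python
import os


def stale_vars(prefix="JASA_", environ=None):
    """Return the sorted names of environment variables starting with `prefix`.

    Hypothetical helper; `environ` defaults to the current process environment.
    """
    env = os.environ if environ is None else environ
    return sorted(name for name in env if name.startswith(prefix))


if __name__ == "__main__":
    # Print shell commands for review; nothing is modified by this script.
    for name in stale_vars():
        print(f"unset {name}")
```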
- A bit-exact replication of the paper's tables. The hit-and-run sampler's parallel section consumes R's RNG state in an order that depends on thread scheduling, so two runs with different `OMP_NUM_THREADS` (or different OpenMP runtimes) may produce slightly different `beta_samples`. For the strictest reproducibility, set `OMP_NUM_THREADS=1`.
- CI integration. The repo has no `.github/workflows/`; reviewers should run `make smoke` manually as a sanity check.
- An R `DESCRIPTION` file. The R package list in §1.2 is the canonical source.
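
Since single-threaded runs are the strictest reproducibility setting described above, a small launcher that pins `OMP_NUM_THREADS=1` for one command avoids relying on whatever the calling shell has exported. This is a sketch under assumptions: the helper names are hypothetical, and it only sets the environment variable for the child process; it makes no claim about how the bundle's own scripts read it.

```python
import os
import subprocess


def single_thread_env(base=None):
    """Return a copy of the environment with OpenMP pinned to one thread."""
    env = dict(os.environ if base is None else base)
    env["OMP_NUM_THREADS"] = "1"
    return env


def run_single_threaded(cmd):
    """Run `cmd` (a list of arguments) with OMP_NUM_THREADS=1.

    Returns the child's exit code. Hypothetical helper for illustration.
    """
    return subprocess.run(cmd, env=single_thread_env()).returncode


# Example use from the repo root (not executed here):
#   run_single_threaded(["make", "smoke"])
```

The environment is copied rather than modified in place, so the calling process keeps its own thread settings.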