Commit 8021e4f

docs(paper): revise JOSS manuscript wording and drop stale docx draft
Refine phrasing on scope, schema description, and AI-use disclosure; update submission date to 12 May 2026; ignore Paper-JOSS/ scratch dir.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 735cee7 commit 8021e4f

3 files changed

Lines changed: 24 additions & 19 deletions

.gitignore

Lines changed: 2 additions & 1 deletion
@@ -188,8 +188,9 @@ audit_report.json
 # Ad-hoc citation-audit scratch (not from the tool)
 .citation_audit/
 
-# Paper JSS
+# Paper JSS and JOSS submissions
 Paper-JSS/
+Paper-JOSS/
 
 # Temp files
 MORNING_REPORT_2026-04-29.md

StatsPAI_JOSS_draft_for_Scott.docx

-17 KB
Binary file not shown.

paper.md

Lines changed: 22 additions & 18 deletions
@@ -22,7 +22,7 @@ affiliations:
 ror: 00f54p054
 - name: StatsPAI Inc., United States
 index: 2
-date: 9 May 2026
+date: 12 May 2026
 bibliography: paper.bib
 ---

@@ -34,21 +34,23 @@ for estimating, diagnosing, comparing, and reporting models that are
 usually spread across many specialized packages or proprietary
 statistical environments. The package currently exposes more than 950
 registered functions across more than 80 submodules, covering classical
-regression, instrumental variables, panel data, difference-in-differences,
+regression, instrumental variable analysis, panel data, difference-in-differences,
 regression discontinuity, synthetic control, matching,
 stochastic frontier analysis, mixed-effects models, decomposition
 methods, sensitivity analysis, and modern machine-learning estimators
 for heterogeneous treatment effects.
 
-The package is designed for policy evaluation, social science research,
-and other empirical workflows where researchers must move between
-research design, estimation, diagnostics, robustness checks, and
-publication tables. A common result contract gives users `.summary()`,
+The package is designed for policy evaluation, social science and
+public health research, and other empirical workflows where researchers
+must move between research design, estimation, diagnostics, robustness
+checks, and publication tables. A common result contract gives users `.summary()`,
 `.plot()`, `.to_latex()`, `.to_docx()`, and `.cite()` methods where
 appropriate. `StatsPAI` is also agent-native: registered functions
-expose machine-readable schemas and structured failure metadata so that
-LLM-driven research assistants can discover estimators, choose among
-alternatives, and surface assumptions without parsing free-form prose.
+expose machine-readable schemas (structured descriptions of each
+function's arguments and outputs that programs can parse directly) and
+structured failure metadata so that LLM-driven research assistants can
+discover estimators, choose among alternatives, and surface assumptions
+without parsing free-form prose.
 The source code is available at
 [https://github.com/brycewang-stanford/statspai](https://github.com/brycewang-stanford/statspai).

@@ -73,7 +75,7 @@ through estimation, robustness, and publication output.
 `StatsPAI` addresses this gap for graduate students, applied
 economists, policy researchers, and data scientists who want a
 Python-native workflow without giving up the breadth of Stata or the
-methodological depth of R. Its goal is not to replace every specialized
+methodological depth of R. The goal of StatsPAI is not to replace every specialized
 implementation. Instead, it provides a coherent empirical workspace:
 shared formula conventions, common result objects, consistent export
 methods, citations attached to estimators, and validation metadata that
@@ -124,8 +126,9 @@ across implementations.
 The package is implemented mainly in Python on top of NumPy, SciPy,
 Pandas, statsmodels, scikit-learn, and linearmodels. This keeps the
 installation path familiar for Python users and supports Python 3.9 and
-newer. Optional accelerator backends are used only where they materially
-change the computation: PyTorch for neural causal estimators, JAX for
+newer versions of Python. Optional accelerator backends are used only
+where they materially change the computation: PyTorch for neural causal
+estimators, JAX for
 selected bootstrap and linear algebra workloads, and a Rust/PyO3 kernel
 for high-dimensional fixed-effect and cluster-variance routines. This
 keeps the default package inspectable while allowing heavy workloads to
@@ -149,9 +152,10 @@ The near-term research impact is a more reproducible empirical workflow
 for applied policy evaluation. Because methods share one interface,
 researchers can compare estimators on the same data, export tables with
 the same metadata, and record the citations and assumptions attached to
-each analysis. Early use in Stanford REAP research workflows has shown
-the value of the package for rapid policy-evaluation prototyping, while
-the agent-native registry supports a second use case: AI-assisted
+each analysis. Early use in research workflows of the Rural Education
+Action Program at Stanford University has shown the value of the
+package for rapid policy-evaluation prototyping, while the agent-native
+registry supports a second use case: AI-assisted
 replication and robustness analysis in which statistical tools are
 discovered and invoked through explicit schemas rather than informal
 prompts.
@@ -161,9 +165,9 @@ prompts.
 Generative AI tools, including Claude and OpenAI/Codex, were used to
 draft portions of the documentation, assist with code generation, and
 revise this manuscript. The corresponding author reviewed AI-generated
-text, checked citations and software claims against repository evidence,
-and retained responsibility for the correctness of the package and this
-paper.
+text and checked citations and software claims against repository
+evidence. All of the authors take responsibility for the correctness of
+the package and this paper.
 
 # Acknowledgements
