This repository is the public code-and-metadata companion for the paper:
Making Hospital Context Visible to Support Learning Health Systems for Acute Care: A Transparent Observed-Data Workflow Using Linked National Survey Data
This release is prepared for archiving on GitHub with a Zenodo DOI. Because the underlying source data are licensed and cannot be redistributed, public releases contain only code, metadata, mock templates, and non-disclosive manuscript outputs. Proprietary AHA source data and restricted row-level derived datasets are not redistributed.
Included here:
- scripts/: Scripts 13–17 used for the paper
- metadata/: variable manifest, file-layout crosswalk, and selected-variable templates
- mock/: empty templates showing expected input structure
- outputs/tables/: non-disclosive aggregate outputs used in the manuscript
- outputs/figures/: manuscript figures
- docs/: data availability and reproducibility notes
- environment.yml: a portable environment specification
- CITATION.cff and .zenodo.json: repository citation and Zenodo metadata
- LICENSE: license for the code/documentation in this repository
The underlying AHA Annual Survey Database (ASDB) and AHA Information Technology data products are proprietary/licensed and are not included here.
Also excluded:
- raw AHA data files
- row-level merged analysis files
- hospital-level profile assignment outputs
- system-identifiable outputs
- local path configuration files used in the author’s computing environment
The paper used the following script order:
- 13_rebuild_weight_and_mask_missingness_v4.R
- 14_rebuild_masked_smc_network_v4.R
- 15_rebuild_weighted_masked_clustering_v5.R
- 16_rebuild_system_nesting_and_validation_masked_v7.R
- 17_update_manuscript_tables_and_figures_v3.R
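If you hold licensed AHA data and have set up your own local paths, the script order can be derived from the numeric file-name prefixes. The following is a minimal Python sketch, not part of the repository: it only builds the `Rscript` command lines in order and prints them. The helper names `step_number` and `planned_commands` are hypothetical, and the `scripts/` location is assumed from the directory listing in this README.

```python
import re

# Script file names exactly as listed in this README.
SCRIPTS = [
    "13_rebuild_weight_and_mask_missingness_v4.R",
    "14_rebuild_masked_smc_network_v4.R",
    "15_rebuild_weighted_masked_clustering_v5.R",
    "16_rebuild_system_nesting_and_validation_masked_v7.R",
    "17_update_manuscript_tables_and_figures_v3.R",
]


def step_number(name):
    """Return the leading integer of a script file name (hypothetical helper)."""
    match = re.match(r"(\d+)_", name)
    if match is None:
        raise ValueError(f"no numeric prefix in {name!r}")
    return int(match.group(1))


def planned_commands(names, script_dir="scripts"):
    """Sort scripts by numeric prefix and build one Rscript command per script."""
    ordered = sorted(names, key=step_number)
    return [["Rscript", f"{script_dir}/{name}"] for name in ordered]


# Print the commands instead of running them: the scripts need licensed data.
for command in planned_commands(SCRIPTS):
    print(" ".join(command))
```

Printing rather than executing is deliberate, since the scripts will fail without the licensed inputs and a local path configuration.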
The workflow rests on four design choices:
- Observed-data workflow: no imputation of unreported capabilities
- Inverse-probability weighting: response weights estimated from consistently available structural variables
- Masking: pairwise similarity and score construction only from jointly observed values
- Public release constraint: only non-disclosive aggregate outputs are redistributed
The weighting script defines an IT response indicator that flags a hospital as responding when at least one selected IT variable is observed. It then fits a parsimonious response model on consistently available structural variables, such as state/region, bed-size-like variables, teaching-related variables, ownership/control variables, rural/urban indicators, and system/network-related variables, and derives inverse-probability weights from the fitted model. The resulting weights are trimmed at lower and upper quantiles before downstream use.
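The weighting steps described above can be illustrated with a short, self-contained Python sketch. It is not the repository's R code. Several details are assumptions made for illustration only: the response model is taken to be a logistic regression fitted by plain gradient ascent, covariates are assumed to be numeric and roughly standardized, missing values are represented as `None`, and the trimming quantiles (0.01 and 0.99) are placeholders, since this README does not state the values used in the paper.

```python
import math


def response_indicator(it_row):
    """Return 1 if at least one selected IT variable is observed (not None)."""
    return int(any(value is not None for value in it_row))


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict(beta, x_row):
    """Predicted response probability; beta[0] is the intercept."""
    return _sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], x_row)))


def fit_logistic(X, y, learning_rate=0.5, n_iter=500):
    """Fit a logistic regression by batch gradient ascent (assumed model class)."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for x_row, y_i in zip(X, y):
            resid = y_i - predict(beta, x_row)
            grad[0] += resid
            for j, x in enumerate(x_row):
                grad[j + 1] += resid * x
        beta = [b + learning_rate * g / n for b, g in zip(beta, grad)]
    return beta


def quantile(values, q):
    """Quantile by linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def trimmed_ipw(X, responded, lower_q=0.01, upper_q=0.99):
    """Inverse-probability weights for responders, trimmed at two quantiles.

    Non-responders receive None. The quantile defaults are placeholders.
    """
    beta = fit_logistic(X, responded)
    raw = {
        i: 1.0 / predict(beta, x_row)
        for i, (x_row, r) in enumerate(zip(X, responded))
        if r == 1
    }
    low = quantile(list(raw.values()), lower_q)
    high = quantile(list(raw.values()), upper_q)
    return [min(max(raw[i], low), high) if i in raw else None for i in range(len(X))]
```

Trimming caps the influence of hospitals with very low estimated response probability, which would otherwise carry very large weights.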
Masking means the workflow does not fill in missing capability values. Similarities and module scores are computed only from observed overlap. This preserves reported capabilities while avoiding synthetic values, at the cost of leaving some hospitals unassigned under the current completeness rule.
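As an illustration of masking, the Python sketch below computes a pairwise similarity and a module score from observed values only. It is not the repository's R code. The similarity shown is the simple matching coefficient on binary capabilities, which is an assumption (the abbreviation "smc" in one script name may refer to it, but this README does not say). The `min_overlap` and `min_complete` thresholds are hypothetical stand-ins for the completeness rule, and missing values are represented as `None`.

```python
def masked_similarity(a, b, min_overlap=1):
    """Share of jointly observed binary items on which two hospitals agree.

    Items missing (None) for either hospital are ignored rather than filled in.
    Returns None when fewer than `min_overlap` items are jointly observed.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(pairs) < min_overlap:
        return None
    return sum(x == y for x, y in pairs) / len(pairs)


def masked_module_score(values, min_complete=0.5):
    """Mean of the observed items in a module.

    Returns None (hospital left unscored) when the observed share of items
    falls below `min_complete`, a hypothetical completeness threshold.
    """
    observed = [v for v in values if v is not None]
    if not values or len(observed) / len(values) < min_complete:
        return None
    return sum(observed) / len(observed)
```

For example, `masked_similarity([1, 0, None, 1], [1, 1, 1, None])` compares only the first two items, which are the jointly observed ones, and returns 0.5.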
This package is suitable for:
- auditing the published workflow
- reviewing the variable curation logic
- reproducing manuscript figures and aggregate outputs
- adapting the code locally if you independently obtain authorized AHA access
It is not a turnkey rerun package unless you already have the licensed AHA data and can map them to the expected local layout.
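When mapping licensed data to the expected layout, one simple check is to compare the column names of a local file against the corresponding mock template. The Python sketch below assumes that both files are CSV files with a header row. That is an assumption for illustration, as this README does not specify the template format, and the function names are hypothetical.

```python
import csv


def read_header(path):
    """Return the first row of a CSV file as a list of column names."""
    with open(path, newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle), [])


def missing_columns(template_path, local_path):
    """List template columns that are absent from the local file's header."""
    local = set(read_header(local_path))
    return [col for col in read_header(template_path) if col not in local]
```

An empty result means every column named in the template is present locally. The check does not cover value types or coding.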
Please cite both:
- the associated manuscript, and
- the Zenodo DOI for this repository release.
See CITATION.cff for machine-readable citation metadata.