zixiaotan21/STA522-project1

STA 522 Project 1: Survey Sampling Analysis

Overview

This project analyzes U.S. college characteristics using stratified random sampling methodology. Two main research questions are addressed:

  1. Problem 1: Estimating total undergraduate enrollment across U.S. institutions
  2. Problem 2: Estimating the proportion of colleges offering undergraduate Statistics majors

Data Sources

  • Most-Recent-Cohorts-Institution_05192025.csv - Institution-level data
  • Most-Recent-Cohorts-Field-of-Study.csv - Program-level data
  • College Scorecard: https://collegescorecard.ed.gov/

Methodology

  • Sampling Design: Stratified random sampling (n=100)
  • Stratification: Public/Private × 2-year/4-year (4 strata)
  • Allocation: Proportional allocation
  • Estimation: Design-based estimators with 95% confidence intervals
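The design above can be illustrated with a small base-R sketch. It is not the code from project1.Rmd: the sampling frame is simulated, the column names (`stratum`, `enrollment`, `has_stats`) and the helper `strat_mean` are hypothetical, and the intervals use a normal critical value, which may differ from the choice made in the report.

```r
# Toy sketch of stratified sampling with proportional allocation.
# All data are simulated; nothing here comes from the College Scorecard files.
set.seed(522)

# Hypothetical frame: one row per institution, four strata.
strata_names <- c("public_2yr", "public_4yr", "private_2yr", "private_4yr")
frame <- data.frame(
  stratum    = rep(strata_names, times = c(300, 250, 150, 300)),
  enrollment = round(rexp(1000, rate = 1 / 3000)),  # made-up enrollment counts
  has_stats  = rbinom(1000, 1, 0.2)                 # made-up 0/1 indicator
)

n   <- 100                                           # total sample size
N_h <- sapply(split(frame$enrollment, frame$stratum), length)
n_h <- round(n * N_h / sum(N_h))                     # proportional allocation
# Note: with other stratum sizes, rounding may make sum(n_h) differ from n.

# Simple random sample without replacement within each stratum.
idx <- unlist(lapply(names(N_h), function(h) {
  sample(which(frame$stratum == h), n_h[[h]])
}))
sampled <- frame[idx, ]

# Stratified estimator of a population mean and its standard error,
# with a finite population correction in each stratum.
strat_mean <- function(y, stratum, N_h) {
  groups <- split(y, stratum)[names(N_h)]            # align order with N_h
  n_h    <- sapply(groups, length)
  ybar_h <- sapply(groups, mean)
  s2_h   <- sapply(groups, var)
  W_h    <- N_h / sum(N_h)
  est    <- sum(W_h * ybar_h)
  se     <- sqrt(sum(W_h^2 * (1 - n_h / N_h) * s2_h / n_h))
  c(estimate = est, se = se)
}

z <- qnorm(0.975)                                    # normal critical value (assumption)
N <- sum(N_h)

# Problem 1 analogue: estimated total = N * estimated mean.
est_mean  <- strat_mean(sampled$enrollment, sampled$stratum, N_h)
total_hat <- N * est_mean[["estimate"]]
total_se  <- N * est_mean[["se"]]
ci_total  <- total_hat + c(-1, 1) * z * total_se

# Problem 2 analogue: a proportion is the mean of a 0/1 indicator.
est_prop <- strat_mean(sampled$has_stats, sampled$stratum, N_h)
ci_prop  <- est_prop[["estimate"]] + c(-1, 1) * z * est_prop[["se"]]

print(n_h)
print(ci_total)
print(ci_prop)
```

Under proportional allocation each stratum's share of the sample matches its share of the population, so the stratified mean coincides with the plain sample mean whenever no rounding is needed. The variance is still computed stratum by stratum.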

Files

  • project1.Rmd - R Markdown source file with analysis and documentation
  • project1.pdf - Compiled report

Rendering

To compile the R Markdown file to PDF:

Rscript -e "rmarkdown::render('project1.Rmd', output_format = 'pdf_document')"

Or, to use the output format declared in the file's YAML header:

Rscript -e "rmarkdown::render('project1.Rmd')"

Requirements

  • R packages: dplyr, ggplot2, tidyr, gridExtra
  • LaTeX distribution (for PDF output)
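Before rendering, it can help to confirm that the requirements are in place. The following is a minimal sketch, not part of the project itself; it only reports what is missing and leaves installation to you. The check for `pdflatex` assumes a LaTeX distribution that puts that binary on the PATH.

```r
# Packages listed in the Requirements section, plus rmarkdown for rendering.
required <- c("dplyr", "ggplot2", "tidyr", "gridExtra", "rmarkdown")

# requireNamespace() returns FALSE, rather than raising an error, when a package is absent.
installed     <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
not_installed <- required[!installed]

if (length(not_installed) > 0) {
  message("Missing R packages: ", paste(not_installed, collapse = ", "))
  # install.packages(not_installed)  # uncomment to install from CRAN
} else {
  message("All required R packages are installed.")
}

# PDF output needs a LaTeX engine; Sys.which() returns "" if none is found.
if (!nzchar(Sys.which("pdflatex"))) {
  message("pdflatex not found on PATH; install a LaTeX distribution for PDF output.")
}
```

If no LaTeX distribution is available, one option is the tinytex package (`tinytex::install_tinytex()`), which installs a lightweight TeX Live setup.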

Authors

Zhihao Chen, Zixiao Tan

Date: 2025-10-14
