The Healthcare Claims Integrity & Provider Performance Analysis initiative was launched to mitigate financial leakage within the CMS Medicare provider network. By leveraging a high-volume dataset (3.15 GB), this project engineered a scalable ETL pipeline to automate the identification of billing variances between "Submitted Charges" and "Actual Medicare Reimbursements." This solution establishes a "Single Source of Truth," enabling data-driven decision-making for network leadership by benchmarking provider performance against national specialty and geographic standards.
The pipeline follows a professional Star Schema architecture:
- Ingestion: Raw CMS datasets are hosted in AWS S3 and ingested via Alteryx.
- Cleaning: Technical CMS variables are normalized and mapped to professional Business Semantic names.
- Modeling: Cleaned data is loaded into a SQL Server Star Schema (Fact & Dimension tables).
- Visualization: Insights are surfaced via Power BI.
- Financial Leakage: Quantifying the dollar-amount variance between
Avg_Billed_AmountandAvg_Paid_Amount. - Provider Specialty Benchmarking: Ranking specialties by service volume and payment variance.
- Geographic Analysis: Identifying regional trends in Medicare payment rates.
- Operational Efficiency: Analyzing impact of 'Facility' vs. 'Non-Facility' settings.
- ETL: Alteryx (Advanced Workflow Design)
- Database: SQL Server (Star Schema, Physical Modeling)
- Visualization: Power BI (DAX, Interactive Dashboards)
- Cloud Storage: AWS S3 ("Landing Zone" for raw data)
-
Alteryx Designer: (Required for ETL execution).
-
SQL Server 2022: (For database hosting).
-
Power BI Desktop: (For rendering final dashboard).
-
AWS S3 Access: (Configured as the primary data lake landing zone).
/docs: Project Charter and Data Dictionary./sql: DDL scripts for Star Schema creation./alteryx: .yxmd workflow files./power-bi: .pbix dashboard templates.