HR-Optimization

A data-driven solution for predicting employee churn at Seagate using machine learning and analytics. This project leverages Python, R, and Excel to clean, analyze, and model employee data, helping HR teams make informed workforce planning decisions.

Problem Statement

Employee turnover costs can range from 50% to 200% of an employee's annual salary. Our goal is to build a predictive model that identifies employees at risk of voluntary churn, enabling Seagate's HR team to:

Forecast a two-year hiring pipeline
Optimize workforce strategies
Minimize the financial impact of churn (~$43.6M in projected churn costs)
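As a rough illustration of the 50% to 200% cost range quoted above, the following minimal sketch estimates a low and high replacement cost for a group of departing employees. The function name `turnover_cost_range` and the example salaries are hypothetical and are not taken from the project data or notebook.

```python
def turnover_cost_range(salaries, low=0.5, high=2.0):
    """Return (low, high) estimated replacement cost for departing employees.

    The default multipliers are the 50% and 200% of annual salary
    mentioned in the problem statement.
    """
    total = sum(salaries)
    return total * low, total * high


# Hypothetical annual salaries for three employees flagged as likely to leave.
example_salaries = [60_000, 85_000, 120_000]
low_cost, high_cost = turnover_cost_range(example_salaries)
print(f"Estimated turnover cost: ${low_cost:,.0f} to ${high_cost:,.0f}")
```

Applying the same calculation to the salaries of employees a model flags as at risk gives a cost band rather than a single figure, which may be easier to defend in workforce planning.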

Project Highlights

✅ Cleaned and prepared a 25,995-row dataset spanning 27 countries and 82 locations.
✅ Handled missing data and outliers, and normalized compensation data.
✅ Engineered features from HR data (e.g., tenure, compa ratio, generation).
✅ Built multiple ML models:

Random Forest (Best model: 82% accuracy)
Logistic Regression
KNN-5, KNN-10
AdaBoost
Bagging Classifier
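The model comparison listed above could look roughly like the sketch below, which fits the same families of scikit-learn classifiers and ranks them by test accuracy. It runs on synthetic data from `make_classification`, so the scores it prints are not the project's results; the hyperparameters, split ratio, and random seeds are assumptions rather than the settings used in `SeagateCorr_4.ipynb`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    BaggingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the HR dataset (assumption: binary churn label).
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# One entry per model family named in the list above.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN-5": KNeighborsClassifier(n_neighbors=5),
    "KNN-10": KNeighborsClassifier(n_neighbors=10),
    "AdaBoost": AdaBoostClassifier(random_state=42),
    "Bagging": BaggingClassifier(random_state=42),
}

# Fit each model and record its accuracy on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

# Print models from most to least accurate.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<20} {acc:.3f}")
```

Churn labels are often imbalanced, so in practice it may be worth reporting precision and recall for the churn class alongside accuracy.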

✅ Delivered actionable insights:

Engineering teams → 114% higher churn risk
Asia-Pacific → 1.8x more likely to churn
Thailand → 31% higher churn than the USA
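To make the engineered features and the relative-risk figures above more concrete, here is a minimal pandas sketch. It assumes that a figure such as "114% higher" is a ratio of churn rates between a group and everyone else (so 114% higher corresponds to 2.14x). The column names, the snapshot date, the generation cut-offs, and the eight records are all hypothetical and do not come from the Seagate dataset.

```python
import pandas as pd

# Hypothetical records; column names are assumptions, not the project's schema.
df = pd.DataFrame({
    "department": ["Engineering"] * 4 + ["Sales"] * 4,
    "hire_date": pd.to_datetime([
        "2014-01-01", "2019-01-01", "2022-01-01", "2010-01-01",
        "2016-01-01", "2021-01-01", "2018-01-01", "2012-01-01",
    ]),
    "birth_year": [1985, 1992, 1999, 1970, 1988, 1998, 1979, 1962],
    "salary": [90_000, 80_000, 70_000, 120_000, 60_000, 55_000, 75_000, 95_000],
    "range_midpoint": [100_000, 100_000, 70_000, 100_000,
                       60_000, 60_000, 75_000, 100_000],
    "churned": [1, 1, 1, 0, 1, 0, 0, 0],
})
as_of = pd.Timestamp("2024-01-01")  # assumed snapshot date

# Tenure in years as of the snapshot date.
df["tenure_years"] = (as_of - df["hire_date"]).dt.days / 365.25

# Compa ratio: salary relative to the midpoint of the pay range.
df["compa_ratio"] = df["salary"] / df["range_midpoint"]

# Generation from birth year (cut-offs are one common convention).
df["generation"] = pd.cut(
    df["birth_year"],
    bins=[1945, 1964, 1980, 1996, 2012],
    labels=["Baby Boomer", "Gen X", "Millennial", "Gen Z"],
)


def relative_churn(frame, column, group):
    """Churn rate inside `group` divided by the churn rate of everyone else."""
    in_group = frame[column] == group
    rate_in = frame.loc[in_group, "churned"].mean()
    rate_out = frame.loc[~in_group, "churned"].mean()
    return rate_in / rate_out


ratio = relative_churn(df, "department", "Engineering")
print(f"Engineering churn is {ratio:.1f}x the rest "
      f"({(ratio - 1) * 100:.0f}% higher)")
```

The same helper could be pointed at a region or country column to produce comparisons like the Asia-Pacific and Thailand figures, provided the comparison group is stated explicitly.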

Tools & Technologies

📊 Data	🧠 Modeling	📈 Visualization
Excel	Python (scikit-learn, pandas)	Tableau
R	Random Forest	Python (matplotlib, seaborn)
Jupyter

Files in This Repository

File	Description
SeagateCorr_4.ipynb	Jupyter notebook for churn prediction models
DataCleanupReport.docx	Detailed data cleaning steps
PreliminaryDataReport.pdf	Initial exploration & insights
PresentationNotes.pdf	Final presentation with key takeaways
ProductBacklog.pdf	Agile product backlog
