This project explores historical wildfire and weather data across California to build predictive models of wildfire frequency and severity. Developed as part of an MSBA course, it combines exploratory data analysis (EDA) and machine learning to identify trends and forecast wildfire risk.
Analyze spatio-temporal wildfire patterns and environmental triggers in California, then use statistical and ML-based methods to answer:
- What are the distribution and trends of wildfires in California?
- Which weather and climate factors drive wildfire frequency and severity?
- Can we effectively forecast wildfire occurrence?
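
The first question, about distribution and trends, can be approached by aggregating fire records per year and fitting a simple trend line. The sketch below is only an illustration: the records are made-up toy values, and the column names `year` and `acres_burned` are assumptions, not the notebook's actual schema.

```python
import numpy as np
import pandas as pd

# Hypothetical toy records; the real dataset and its column names may differ.
fires = pd.DataFrame({
    "year": [2015, 2015, 2016, 2016, 2016, 2017, 2017, 2017, 2017],
    "acres_burned": [120.0, 300.0, 80.0, 450.0, 200.0, 500.0, 150.0, 900.0, 60.0],
})

# Aggregate to one row per year: number of fires and total burned acreage.
yearly = fires.groupby("year").agg(
    fire_count=("acres_burned", "count"),
    total_acres=("acres_burned", "sum"),
)

# Center the years before fitting so the least-squares problem is well conditioned.
years = yearly.index.to_numpy(dtype=float)
centered = years - years.mean()

# The degree-1 coefficient is the slope: change per year in each series.
count_slope = np.polyfit(centered, yearly["fire_count"].to_numpy(dtype=float), 1)[0]
acres_slope = np.polyfit(centered, yearly["total_acres"].to_numpy(dtype=float), 1)[0]

print(yearly)
print(f"fires per year, slope: {count_slope:.2f}")
print(f"acres per year, slope: {acres_slope:.2f}")
```

A positive slope in either series points to an upward trend, though a straight-line fit over a few years is only a first look and says nothing about statistical significance.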
| File/Folder | Description |
|---|---|
| WildfirePrediction.ipynb | EDA and ML modeling of historical wildfire data |
| WildfirePredictionReport.pdf | Summary of methods, performance, and impact |
| WildfirePredectionDeck.pdf | Slide deck of methodology and findings |
- Rising trends: Both wildfire frequency and burned acreage show upward trends over recent years.
- Climate correlations: Temperature, wind, and humidity strongly correlate with wildfire severity.
- Model performance: XGBoost outperformed the other classification and tree-based models.
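
A comparison between a boosted-tree classifier and a simpler baseline might look roughly like the sketch below. It is not the notebook's code: the data is synthetic, the feature names and the labeling rule are invented for illustration, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so the example needs only scikit-learn and NumPy. The printed scores say nothing about performance on the real wildfire data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Synthetic weather features (assumed units: deg C, km/h, percent).
temperature = rng.normal(25.0, 8.0, n)
wind = rng.gamma(2.0, 5.0, n)
humidity = rng.uniform(5.0, 95.0, n)
X = np.column_stack([temperature, wind, humidity])

# Invented labeling rule: hotter, windier, drier conditions raise fire probability.
logit = 0.15 * (temperature - 25.0) + 0.10 * (wind - 10.0) - 0.05 * (humidity - 50.0) - 1.0
prob = 1.0 / (1.0 + np.exp(-logit))
y = (rng.random(n) < prob).astype(int)

# Hold out a stratified test set so both models are scored on unseen rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Fit each model and record its ROC AUC on the held-out set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

for name, auc in scores.items():
    print(f"{name}: ROC AUC = {auc:.3f}")
```

Scoring every candidate on the same held-out split keeps the comparison fair; which model wins depends on the data, so the ranking here should not be read as evidence for the finding above.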
- Pandas, NumPy, Matplotlib, Seaborn, SHAP for EDA & visualization
- scikit-learn, TensorFlow / Keras for machine learning models