An end-to-end decision-support project for merchant operations:
- Forecast daily revenue for the next 30 days
- Recommend deal dates and spend intensity under multiple scenarios
- Translate model output into execution-ready, business-facing guidance
Merchants often schedule promotions without demand-aware timing. This project addresses that gap by combining short-term forecasting with scenario-based deal recommendation.
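As a rough illustration of how a daily revenue forecast could feed a scenario-based recommender, the sketch below scores every (day, scenario) pair and ranks the survivors under a growth, efficiency, or balanced objective. Everything in it is an assumption made for illustration: the `Candidate` class, the `build_candidates` and `recommend` helpers, the square-root response curve, and all scenario names and multipliers other than `light_push` (1.25x). It is not the logic in `deal_recommendations.py`.

```python
from dataclasses import dataclass

# Hypothetical spend-intensity scenarios. Only `light_push` (1.25x) is named
# in this README; the other names and multipliers are placeholders.
SCENARIOS = {"light_push": 1.25, "medium_push": 1.5, "heavy_push": 2.0}


@dataclass
class Candidate:
    day: int
    scenario: str
    incremental_revenue: float
    profit: float
    roi_proxy: float


def build_candidates(forecast, base_spend, uplift_per_x, margin):
    """Score every (day, scenario) pair against a daily revenue forecast.

    Assumed toy response model with diminishing returns: incremental revenue
    grows with the square root of the extra spend intensity.
    """
    candidates = []
    for day, revenue in enumerate(forecast):
        for name, intensity in SCENARIOS.items():
            extra_spend = base_spend * (intensity - 1.0)
            incremental = revenue * uplift_per_x * (intensity - 1.0) ** 0.5
            profit = incremental * margin - extra_spend
            candidates.append(
                Candidate(day, name, incremental, profit, profit / extra_spend)
            )
    return candidates


def recommend(candidates, objective="balanced", top_n=3):
    """Keep candidates with positive profit and ROI proxy, then rank them."""
    viable = [c for c in candidates if c.profit > 0 and c.roi_proxy > 0]
    keys = {
        "growth": lambda c: c.incremental_revenue,
        "efficiency": lambda c: c.roi_proxy,
        "balanced": lambda c: c.profit,  # placeholder compromise objective
    }
    return sorted(viable, key=keys[objective], reverse=True)[:top_n]


# Made-up three-day forecast, not project data.
candidates = build_candidates(
    [1000.0, 1200.0, 900.0], base_spend=100.0, uplift_per_x=0.3, margin=0.5
)
best = recommend(candidates, objective="efficiency", top_n=1)[0]
print(best.day, best.scenario)
```

Under a diminishing-returns assumption like this one, the efficiency objective tends to favor lighter pushes on high-demand days, while the growth objective favors heavier spend.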
- Revenue rolling backtest MAPE: 6.50%
- Revenue forecasts within ±10% of actuals: 83.33%
- Recommendation candidates after model filters: 60 of 120
- Execution-ready recommendation dates (positive profit + ROI proxy): 14
- Dominant strategy in top candidates: light_push (1.25x)
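For reference, the two accuracy figures above can be computed along these lines. This is a minimal sketch using the standard MAPE definition; the function names `mape` and `within_tolerance_rate` are illustrative and are not taken from the project code, and the sample values are made up.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumes no actual is zero)."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def within_tolerance_rate(actual, predicted, tolerance=0.10):
    """Percent of forecasts whose absolute percentage error is <= tolerance."""
    hits = [abs(a - p) / abs(a) <= tolerance for a, p in zip(actual, predicted)]
    return 100.0 * sum(hits) / len(hits)


# Made-up illustrative values, not project data.
actual = [100.0, 200.0, 400.0, 500.0]
predicted = [108.0, 190.0, 300.0, 500.0]
print(round(mape(actual, predicted), 2))
print(within_tolerance_rate(actual, predicted))
```

MAPE is undefined when an actual value is zero, so days with zero revenue would need separate handling.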
- Designed and implemented the XGBoost revenue forecasting pipeline
- Built feature engineering framework (calendar, lag, rolling, ad metrics)
- Implemented expanding-window rolling backtest with recursive forecast logic
- Built scenario-based deal recommender with growth, efficiency, and balanced scoring
- Packaged project for reproducible GitHub delivery
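The expanding-window backtest and recursive forecast listed above can be sketched as follows. This is a simplified illustration under stated assumptions: a trailing-mean placeholder stands in for the XGBoost model, and the function names, fold sizes, and toy series are hypothetical rather than taken from `backtest.py`.

```python
from statistics import mean


def recursive_forecast(history, horizon, lag=7):
    """Forecast `horizon` steps ahead, feeding each prediction back as history.

    Stand-in model: the mean of the last `lag` values. The project uses
    XGBoost; this placeholder only illustrates the recursion.
    """
    window = list(history)
    forecasts = []
    for _ in range(horizon):
        prediction = mean(window[-lag:])
        forecasts.append(prediction)
        window.append(prediction)  # later steps see predictions, not actuals
    return forecasts


def expanding_window_backtest(series, initial_train, horizon, step):
    """Return (train_end, actuals, forecasts) for each fold.

    The training window always starts at index 0 and grows by `step`
    observations per fold; the test window is the next `horizon` points.
    """
    folds = []
    train_end = initial_train
    while train_end + horizon <= len(series):
        train = series[:train_end]
        actuals = series[train_end:train_end + horizon]
        folds.append((train_end, actuals, recursive_forecast(train, horizon)))
        train_end += step
    return folds


# Synthetic toy series (trend plus weekend bump), not project data.
series = [100 + 0.5 * d + (15 if d % 7 in (5, 6) else 0) for d in range(120)]
folds = expanding_window_backtest(series, initial_train=60, horizon=30, step=15)
for train_end, actuals, forecasts in folds:
    fold_mape = 100 * mean(abs(a - f) / a for a, f in zip(actuals, forecasts))
    print(f"train size {train_end}: MAPE {fold_mape:.2f}%")
```

Because each step consumes earlier predictions as lag inputs, errors can compound over the horizon, which is why the backtest evaluates the full multi-step forecast rather than one-step-ahead predictions.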
- Marcie (Kaixuan) Ma
- Jisu Um
- Goyeun Yun
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python rebuild_xgboost_pipeline/build_features.py
python rebuild_xgboost_pipeline/backtest.py --target Revenue
python rebuild_xgboost_pipeline/forecast_30d.py --target Revenue
python rebuild_xgboost_pipeline/deal_recommendations.py --target Revenue
- Included raw input: data/raw/synthetic_ecommerce_data.csv
- Generated feature tables: data/processed/
- Generated outputs: outputs/backtests/, outputs/forecasts/, outputs/recommendations/
Generated outputs are excluded from version control by default to keep the repository lightweight and reproducible.
data/
  raw/
    synthetic_ecommerce_data.csv
  processed/
report/
  executive-summary.html
outputs/
  backtests/
  forecasts/
  recommendations/
rebuild_xgboost_pipeline/
  build_features.py
  backtest.py
  forecast_30d.py
  deal_recommendations.py
  deal_scheduling_recommender.py
  utils.py
- This repository focuses on the final delivery scope: Revenue forecasting + deal scheduling.
- Forecasting of orders and sessions was intentionally deferred due to timeline constraints.
- Current validation uses a synthetic proxy dataset; the next step is re-validation on production data.