SKU-Level Demand Forecasting with Neural Hierarchical Interpolation

This project forecasts monthly demand quantity at the SKU (Stock Keeping Unit) level for the 12 months of 2025. It uses the NHiTS deep learning model, adds indicators for Vietnamese seasonal events derived from the lunar calendar, and tunes hyperparameters with Bayesian optimization. The forecasts cover more than 48,000 distinct SKUs.

📌 Project Overview

Objective: Forecast the monthly demand quantity for each individual SKU for the 12 months of 2025.
Training Data: Historical monthly demand from 2015 to 2024 (120 months).
Test Data: 12 months of 2025 (from 2025-01-01 to 2025-12-01).
Scale: Concurrent forecasting for 48,093 SKUs (modeled as individual TimeSeries objects).

🛠️ Techniques Used

1. Forecasting Model: NHiTS

The project employs the NHiTS (Neural Hierarchical Interpolation for Time Series) model from the Darts library.

Why NHiTS: NHiTS builds on N-BEATS and adds two ideas: multi-rate pooling of the input and hierarchical interpolation of the output. Each block therefore models the series at its own time scale, which lowers training and inference cost relative to N-BEATS and is designed to improve accuracy on long-horizon forecasts (such as the 12-month window here).
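The two ideas can be illustrated without a deep learning library. The sketch below is not the Darts implementation: `max_pool` and `interpolate` are hypothetical helper names, and the kernel size and coefficient values are made up for illustration. It only shows how a block can read a coarser view of the input and emit a few coefficients that are stretched over the 12-month horizon.

```python
from typing import List


def max_pool(series: List[float], kernel: int) -> List[float]:
    """Downsample by taking the max over non-overlapping windows of `kernel` points."""
    return [max(series[i:i + kernel]) for i in range(0, len(series), kernel)]


def interpolate(coeffs: List[float], horizon: int) -> List[float]:
    """Linearly interpolate a short coefficient vector up to `horizon` points."""
    if len(coeffs) == 1 or horizon == 1:
        return [coeffs[0]] * horizon
    out = []
    for t in range(horizon):
        # Position of output step t on the coefficient grid [0, len(coeffs) - 1].
        pos = t * (len(coeffs) - 1) / (horizon - 1)
        lo = int(pos)
        hi = min(lo + 1, len(coeffs) - 1)
        frac = pos - lo
        out.append(coeffs[lo] * (1 - frac) + coeffs[hi] * frac)
    return out


# 24 months of toy history, viewed at a coarse rate (kernel of 6 months).
history = [float(m % 12) for m in range(24)]
coarse_view = max_pool(history, kernel=6)  # 4 values instead of 24

# A coarse block would predict only a few coefficients (here 3, made up)
# and interpolate them to the full 12-month horizon.
low_freq_forecast = interpolate([10.0, 20.0, 10.0], horizon=12)
```

In the actual model, several blocks with different pooling rates each produce such an interpolated output, and the outputs are summed to form the forecast.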

2. Feature Engineering & Preprocessing

Vietnamese Lunar Calendar (vnlunar): To capture seasonal demand shifts unique to Vietnam, solar dates are converted to lunar calendar months and aggregated monthly to construct specific event indicators:
- is_tet: Lunar New Year month (Lunar Month 1).
- is_pre_tet: Lunar New Year preparation month (Lunar Month 12).
- is_ghost: Ghost Month (Lunar Month 7).

These features are shifted by 12 months (shift(-12)), so that each month carries the event flags of the same month one year later, and are passed as past_covariates to help the model learn annual seasonal behavior.
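A minimal sketch of the flag construction and the 12-month shift, in plain Python. The solar-to-lunar conversion itself is not shown, since the source does not describe how vnlunar is called; the input here is assumed to be, for each solar month, the set of lunar months its days fall into. The "any day in the month" aggregation rule, the helper names, and the toy data are all assumptions.

```python
from typing import Dict, List, Optional, Set


def event_flags(lunar_months: Set[int]) -> Dict[str, int]:
    """Flag a solar month if any of its days falls in the relevant lunar month (assumed rule)."""
    return {
        "is_tet": int(1 in lunar_months),
        "is_pre_tet": int(12 in lunar_months),
        "is_ghost": int(7 in lunar_months),
    }


def shift_back(values: List[int], periods: int) -> List[Optional[int]]:
    """Mimic a shift of -periods: row t receives the value of row t + periods."""
    return values[periods:] + [None] * periods


# Toy input: 24 solar months, each overlapping two lunar months (made up).
lunar_by_month = [{(m + 11) % 12 + 1, m % 12 + 1} for m in range(24)]
flags = [event_flags(months) for months in lunar_by_month]

is_tet = [row["is_tet"] for row in flags]
# After the shift, each row holds the flag of the month 12 steps ahead;
# the last 12 rows have no future value and are left empty.
is_tet_shifted = shift_back(is_tet, 12)
```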
One-hot Column Compression: One-hot encoded vehicle features (feature_1h_vehicle_...) were grouped back into a single categorical vehicle model name feature (feature_model_name) to optimize memory footprint and dataset structure.
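The compression step can be sketched as follows, using plain dictionaries as rows. The column suffixes `alpha` and `beta` are hypothetical, as are the `sku` field, the helper name, and the `"unknown"` fallback for rows with no active column; only the `feature_1h_vehicle_` prefix and the `feature_model_name` target come from the description above.

```python
from typing import Any, Dict

PREFIX = "feature_1h_vehicle_"


def compress_one_hot(row: Dict[str, Any]) -> Dict[str, Any]:
    """Replace the one-hot vehicle columns with a single categorical column."""
    hot = [col for col, val in row.items() if col.startswith(PREFIX) and val == 1]
    compact = {col: val for col, val in row.items() if not col.startswith(PREFIX)}
    # Assumption: at most one column is active; fall back to "unknown" if none is.
    compact["feature_model_name"] = hot[0][len(PREFIX):] if hot else "unknown"
    return compact


row = {"sku": "A1", PREFIX + "alpha": 0, PREFIX + "beta": 1}
compact_row = compress_one_hot(row)
```

With many vehicle models, this replaces one column per model with a single column, which is where the memory saving comes from.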
Data Scaling: Min-Max scaling (via Darts' Scaler) was applied to normalize the target demand values to the range $[0, 1]$ before training, and reversed (inverse_transform) to yield final forecast quantities.
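For reference, min-max scaling and its inverse amount to the arithmetic below. This is a plain-Python sketch with hypothetical function names and toy values, not the Darts `Scaler` code; the handling of a constant series is an assumption.

```python
from typing import List, Tuple


def fit_min_max(values: List[float]) -> Tuple[float, float]:
    """Learn the scaling bounds from the training values."""
    return min(values), max(values)


def transform(values: List[float], lo: float, hi: float) -> List[float]:
    """Map values so that lo becomes 0 and hi becomes 1."""
    span = hi - lo
    if span == 0:
        # Assumption: a constant series is mapped to all zeros.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def inverse_transform(scaled: List[float], lo: float, hi: float) -> List[float]:
    """Map scaled values back to demand quantities."""
    return [s * (hi - lo) + lo for s in scaled]


train_demand = [0.0, 5.0, 10.0, 20.0]  # toy values
lo, hi = fit_min_max(train_demand)
scaled = transform(train_demand, lo, hi)
restored = inverse_transform(scaled, lo, hi)
```

Because the bounds are learned from the training data, a scaled forecast can fall outside $[0, 1]$; the inverse transform still maps it back to a quantity.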

3. Hyperparameter Optimization with Optuna

Bayesian optimization was conducted using Optuna on a representative sample of 5,000 SKUs. Tuning was performed on a Validation set (trained up to 2023-12-01, validated on 2024) with the objective of maximizing the business Pass Rate.

Search Space:
- input_chunk_length: $[24, 36, 48, 60]$ months
- batch_size: $[128, 256, 512, 1024, 2048]$
- learning_rate: log-uniform from $10^{-4}$ to $10^{-2}$
Best Parameters Found:
- input_chunk_length: 60
- batch_size: 128
- learning_rate: 0.0005359
- Optimal training epochs: 1 (due to Early Stopping with PyTorch Lightning)
- Best Pass Rate on Validation Set: 68.86%

📐 Business Evaluation Rules (PASS/FAIL)

Model performance is evaluated against predefined business requirements from the operations department. Both actual (Actual) and predicted (Forecast) values are rounded to the nearest integer before the check. SKUs are grouped and evaluated by part type (Part_Type).

📋 Dataset Distribution by Part_Type

PMC: Represents the vast majority of SKUs (68.75% of the dataset).
PAM: Represents 30.58% of SKUs.
P: Represents a tiny minority of SKUs (0.67% of the dataset).

⚖️ PASS/FAIL Criteria

Case 1: For PMC parts when actual demand is low ($\text{Actual} \le 20$):
- PASS if and only if the absolute difference between actual and forecast is 5 or less: $$\left| \text{Actual} - \text{Forecast} \right| \le 5$$
- FAIL otherwise.
Case 2: For all other cases (i.e., Part_Type is not PMC, or when $\text{Actual} > 20$):
- PASS if any of the following conditions are met:
  - Forecast = 0 and Actual = 0.
  - Forecast = 1 and Actual $\in \{1, 2\}$.
  - For small demands ($1 < \text{Forecast} < 10$), the Actual value is within $\pm 1$ of the Forecast: $$\text{Forecast} - 1 \le \text{Actual} \le \text{Forecast} + 1$$
  - For larger demands ($\text{Forecast} \ge 10$), the Actual value falls within a $\pm 10\%$ error margin of the Forecast: $$0.9 \le \frac{\text{Actual}}{\text{Forecast}} \le 1.1$$
- FAIL if none of the above conditions are met.
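The rules above can be written as a small function. This is a sketch under stated assumptions: the names `is_pass` and `pass_rate` are hypothetical, Python's built-in `round` is used (it rounds exact halves to the nearest even integer, and the rules do not say how ties are handled), and negative forecasts are treated as FAIL.

```python
from typing import Iterable, Tuple


def is_pass(part_type: str, actual: float, forecast: float) -> bool:
    """Apply the PASS/FAIL rules to one SKU-month."""
    a = round(actual)
    f = round(forecast)
    # Case 1: PMC parts with low actual demand.
    if part_type == "PMC" and a <= 20:
        return abs(a - f) <= 5
    # Case 2: every other part type, or Actual > 20.
    if f == 0:
        return a == 0
    if f == 1:
        return a in (1, 2)
    if 1 < f < 10:
        return f - 1 <= a <= f + 1
    if f >= 10:
        # 0.9 <= Actual / Forecast <= 1.1, kept in integers to avoid float error.
        return 9 * f <= 10 * a <= 11 * f
    return False  # assumption: a negative forecast never passes


def pass_rate(rows: Iterable[Tuple[str, float, float]]) -> float:
    """Percentage of PASS results over (part_type, actual, forecast) rows."""
    results = [is_pass(p, a, f) for p, a, f in rows]
    return 100.0 * sum(results) / len(results)


rows = [
    ("PMC", 18, 23),   # Case 1: |18 - 23| = 5, PASS
    ("PAM", 18, 23),   # Case 2: 18 / 23 is below 0.9, FAIL
    ("PAM", 2, 1),     # Case 2: Forecast = 1 and Actual in {1, 2}, PASS
    ("P", 111, 100),   # Case 2: 111 / 100 is above 1.1, FAIL
]
rate = pass_rate(rows)  # 50.0
```

The first two rows use the same numbers and differ only in part type, which shows how much more lenient Case 1 is for low-volume PMC parts.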

📊 Experimental Results (Test Set 2025)

After training the NHiTS model on the full 10-year training set (2015-2024), here are the results achieved on the 2025 test set:

📈 Average Monthly Pass Rate by Part_Type

| Part_Type | Average Monthly Pass Rate in 2025 |
|---|---|
| PAM | 72.39% |
| PMC | 66.87% |
| P | 32.82% |

📅 Detailed Monthly Pass Rates in 2025 (%)

| Month | PAM | PMC | P |
|---|---|---|---|
| 01/2025 | 75.58% | 68.62% | 35.60% |
| 02/2025 | 73.74% | 67.64% | 39.32% |
| 03/2025 | 73.71% | 69.24% | 34.37% |
| 04/2025 | 74.48% | 67.83% | 32.51% |
| 05/2025 | 74.55% | 68.67% | 36.84% |
| 06/2025 | 73.74% | 68.13% | 31.58% |
| 07/2025 | 71.78% | 68.02% | 28.17% |
| 08/2025 | 72.89% | 67.07% | 33.44% |
| 09/2025 | 72.93% | 66.54% | 30.96% |
| 10/2025 | 70.07% | 64.78% | 31.89% |
| 11/2025 | 69.09% | 64.14% | 30.96% |
| 12/2025 | 66.17% | 61.74% | 28.17% |
| Average | 72.39% | 66.87% | 32.82% |
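The Average row matches an unweighted mean of the twelve monthly rates, which the short check below reproduces from the table values. Whether the notebook computes it exactly this way is an assumption; the variable names are illustrative.

```python
from statistics import mean

# Monthly pass rates (%) for January to December 2025, copied from the table.
monthly_pass_rate = {
    "PAM": [75.58, 73.74, 73.71, 74.48, 74.55, 73.74,
            71.78, 72.89, 72.93, 70.07, 69.09, 66.17],
    "PMC": [68.62, 67.64, 69.24, 67.83, 68.67, 68.13,
            68.02, 67.07, 66.54, 64.78, 64.14, 61.74],
    "P": [35.60, 39.32, 34.37, 32.51, 36.84, 31.58,
          28.17, 33.44, 30.96, 31.89, 30.96, 28.17],
}

# Unweighted mean over months: every month counts equally.
average_pass_rate = {part: mean(rates) for part, rates in monthly_pass_rate.items()}

# Gap between the best and worst month, in percentage points.
spread = {part: max(rates) - min(rates) for part, rates in monthly_pass_rate.items()}
```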

Key Observations:

The PAM group shows the highest predictability, averaging a 72.39% monthly pass rate, with monthly values between 66.17% and 75.58%.
The PMC group (comprising over 68% of the SKUs) averages 66.87% and stays between 61.74% and 69.24% across the year. For all three groups the pass rate is at its lowest in December, the end of the 12-month horizon (tied with July for the P group).
The P group (only 0.67% of SKUs, typically with very volatile or sparse demand patterns) yields a much lower pass rate of 32.82%. This group would benefit from specialized intermittent demand forecasting algorithms in future iterations.
