
tuansondoan-mark/SKU-Level-Demand-Forecasting-with-Neural-Hierarchical-Interpolation

SKU-Level Demand Forecasting with Neural Hierarchical Interpolation

This project forecasts monthly demand quantity at the SKU (Stock Keeping Unit) level for the 12 months of 2025. It uses the NHiTS deep learning model, adds indicators for local Vietnamese seasonal events (Lunar Calendar), and tunes hyperparameters with Bayesian optimization to produce forecasts for 48,093 distinct SKUs.


📌 Project Overview

  • Objective: Forecast the monthly demand quantity for each individual SKU for the 12 months of 2025.
  • Training Data: Historical monthly demand from 2015 to 2024 (120 months).
  • Test Data: 12 months of 2025 (from 2025-01-01 to 2025-12-01).
  • Scale: Concurrent forecasting for 48,093 SKUs (modeled as individual TimeSeries objects).
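
The date ranges above imply a simple chronological split. The following is a minimal sketch of that split, assuming each observation is stamped with the first day of its month; the helper names `month_starts` and `split_by_date` are hypothetical and not taken from the project code.

```python
from datetime import date


def month_starts(first_year, last_year):
    """First day of every month from January of first_year to December of last_year."""
    return [date(y, m, 1) for y in range(first_year, last_year + 1) for m in range(1, 13)]


def split_by_date(months, test_start):
    """Chronological split: everything before test_start is training data."""
    train = [d for d in months if d < test_start]
    test = [d for d in months if d >= test_start]
    return train, test


months = month_starts(2015, 2025)
train_months, test_months = split_by_date(months, date(2025, 1, 1))
print(len(train_months), len(test_months))
```

The same helper could produce the tuning split by passing `date(2024, 1, 1)` and keeping only the 2015-2024 months.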

🛠️ Techniques Used

1. Forecasting Model: NHiTS

The project employs the NHiTS (Neural Hierarchical Interpolation for Time Series) model from the Darts library.

  • Why NHiTS: NHiTS builds on N-BEATS and adds multi-rate input pooling and hierarchical interpolation. Each stack sees the input at a different time scale and outputs a small number of forecast points that are interpolated up to the full horizon. This design aims to reduce training and inference cost while improving accuracy on long-horizon forecasts (such as the 12-month window here).
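
To make the two ideas concrete, here is a toy sketch of multi-rate pooling and hierarchical interpolation in plain Python. It is not the Darts implementation: the real model learns the forecast knots with MLP blocks, whereas the knots below are hard-coded stand-ins, and linear interpolation is an assumed simplification.

```python
def max_pool(series, kernel):
    """Downsample by taking the max over non-overlapping windows of size `kernel`."""
    return [max(series[i:i + kernel]) for i in range(0, len(series), kernel)]


def interpolate(knots, horizon):
    """Linearly stretch a short list of knots to `horizon` forecast points."""
    if len(knots) == 1 or horizon == 1:
        return [float(knots[0])] * horizon
    out = []
    for t in range(horizon):
        pos = t * (len(knots) - 1) / (horizon - 1)  # position on the knot axis
        lo = int(pos)
        hi = min(lo + 1, len(knots) - 1)
        frac = pos - lo
        out.append(knots[lo] * (1 - frac) + knots[hi] * frac)
    return out


# A coarse stack would see heavily pooled input and emit few knots (slow trend);
# a fine stack would see lightly pooled input and emit more knots (faster variation).
history = [float(v) for v in range(24)]           # made-up 24-month input window
coarse_view = max_pool(history, 6)                # 4 values
fine_view = max_pool(history, 2)                  # 12 values

coarse_knots = [10.0, 14.0]                       # stand-in for a learned output
fine_knots = [0.0, 1.0, -1.0, 0.5, -0.5, 0.0]     # stand-in for a learned output

forecast = [c + f for c, f in zip(interpolate(coarse_knots, 12),
                                  interpolate(fine_knots, 12))]
print(forecast)
```

The final forecast is the sum of the per-stack outputs, so each stack only needs to predict a handful of values rather than all 12 months.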

2. Feature Engineering & Preprocessing

  • Vietnamese Lunar Calendar (vnlunar): To capture seasonal demand shifts unique to Vietnam, solar dates are converted to lunar calendar months and aggregated monthly to construct specific event indicators:
    • is_tet: Lunar New Year month (Lunar Month 1).
    • is_pre_tet: Lunar New Year preparation month (Lunar Month 12).
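    • is_ghost: Ghost Month (Lunar Month 7).
    These indicators are shifted by 12 months (shift(-12)) and passed as past_covariates to help the model learn annual seasonal behavior. Assuming pandas semantics, shift(-12) moves each value 12 rows earlier, so the covariate at month t describes the lunar events of month t + 12.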
  • One-hot Column Compression: One-hot encoded vehicle features (feature_1h_vehicle_...) were grouped back into a single categorical vehicle model name feature (feature_model_name) to optimize memory footprint and dataset structure.
  • Data Scaling: Min-Max scaling (via Darts' Scaler) was applied to normalize the target demand values to the range $[0, 1]$ before training, and reversed (inverse_transform) to yield final forecast quantities.
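
The three preprocessing steps above can be sketched in plain Python as follows. This is an illustration under assumptions, not the project's code: the real pipeline uses vnlunar, pandas, and Darts' Scaler, the vehicle names and flag positions are made up, and fitting the scaler per series is assumed.

```python
ONE_HOT_PREFIX = "feature_1h_vehicle_"  # prefix named in the text above


def shift_back(values, periods, fill=None):
    """Mimic pandas Series.shift(-periods): row t receives the value of row t + periods."""
    n = len(values)
    return [values[t + periods] if t + periods < n else fill for t in range(n)]


def compress_one_hot(row, prefix=ONE_HOT_PREFIX, target="feature_model_name", missing="unknown"):
    """Collapse one-hot vehicle columns of a row (dict) into one categorical column."""
    hot = [k[len(prefix):] for k, v in row.items() if k.startswith(prefix) and v == 1]
    out = {k: v for k, v in row.items() if not k.startswith(prefix)}
    out[target] = hot[0] if len(hot) == 1 else missing  # ambiguous rows get a placeholder
    return out


def min_max_fit(series):
    """Return the (min, max) pair needed to scale and later unscale a series."""
    return min(series), max(series)


def min_max_transform(series, lo, hi):
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    span = hi - lo
    return [0.0 if span == 0 else (x - lo) / span for x in series]


def min_max_inverse(scaled, lo, hi):
    """Map scaled values back to the original demand units."""
    return [s * (hi - lo) + lo for s in scaled]


# Hypothetical 24-month is_tet flag with the event at index 13.
is_tet = [0] * 24
is_tet[13] = 1
shifted = shift_back(is_tet, 12)   # the flag now appears at index 1

row = {"sku": "A1", "feature_1h_vehicle_modelA": 0, "feature_1h_vehicle_modelB": 1}
compact = compress_one_hot(row)

demand = [10, 20, 30]
lo, hi = min_max_fit(demand)
scaled = min_max_transform(demand, lo, hi)
restored = min_max_inverse(scaled, lo, hi)
```

Note that the last 12 rows of a shifted indicator have no source value (`fill`), so a real pipeline would need lunar flags computed for the months beyond the end of the target series.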

3. Hyperparameter Optimization with Optuna

Bayesian optimization was conducted using Optuna on a representative sample of 5,000 SKUs. Tuning was performed on a Validation set (trained up to 2023-12-01, validated on 2024) with the objective of maximizing the business Pass Rate.

  • Search Space:
    • input_chunk_length: $\{24, 36, 48, 60\}$ months
    • batch_size: $\{128, 256, 512, 1024, 2048\}$
    • learning_rate: log-uniform from $10^{-4}$ to $10^{-2}$
  • Best Parameters Found:
    • input_chunk_length: 60
    • batch_size: 128
    • learning_rate: 0.0005359
    • Optimal training epochs: 1 (due to Early Stopping with PyTorch Lightning)
    • Record Pass Rate on Validation Set: 68.86%
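
The search space above maps directly onto Optuna's suggestion API. The sketch below assumes a hypothetical callable `train_and_score(params)` that trains on data up to 2023-12-01 and returns the 2024 pass rate; that callable, and the trial count, are not from the original project.

```python
def suggest_params(trial):
    """Sample one configuration from the search space listed above."""
    return {
        "input_chunk_length": trial.suggest_categorical(
            "input_chunk_length", [24, 36, 48, 60]),
        "batch_size": trial.suggest_categorical(
            "batch_size", [128, 256, 512, 1024, 2048]),
        # log=True samples the learning rate log-uniformly between the bounds.
        "learning_rate": trial.suggest_float("learning_rate", 1e-4, 1e-2, log=True),
    }


def run_study(train_and_score, n_trials=20):
    """Maximize the validation pass rate returned by the hypothetical train_and_score."""
    import optuna  # imported lazily so suggest_params can be used without Optuna installed

    study = optuna.create_study(direction="maximize")
    study.optimize(lambda trial: train_and_score(suggest_params(trial)), n_trials=n_trials)
    return study.best_params, study.best_value
```

Calling `run_study(train_and_score)` would return the best configuration and its pass rate. Optuna's default sampler is TPE, which is one way to obtain the Bayesian-style search described above.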

📐 Business Evaluation Rules (PASS/FAIL)

Model performance is evaluated based on strict, predefined business requirements from the operations department. Both actual (Actual) and predicted (Forecast) values are rounded to the nearest integer before verification. SKUs are grouped and evaluated by part type (Part_Type):

📋 Dataset Distribution by Part_Type

  • PMC: Represents the vast majority of SKUs (68.75% of the dataset).
  • PAM: Represents 30.58% of SKUs.
  • P: Represents a tiny minority of SKUs (0.67% of the dataset).

⚖️ PASS/FAIL Criteria

  1. Case 1: For PMC parts when actual demand is low ($\text{Actual} \le 20$):

    • PASS if and only if the absolute difference between actual and forecast is 5 or less: $$\left| \text{Actual} - \text{Forecast} \right| \le 5$$
    • FAIL otherwise.
  2. Case 2: For all other cases (i.e., Part_Type is not PMC, or when $\text{Actual} > 20$):

    • PASS if any of the following conditions are met:
      • Forecast = 0 and Actual = 0.
      • Forecast = 1 and Actual $\in \{1, 2\}$.
      • For small demands ($1 < \text{Forecast} < 10$), the Actual value is within $\pm 1$ of the Forecast: $$\text{Forecast} - 1 \le \text{Actual} \le \text{Forecast} + 1$$
      • For larger demands ($\text{Forecast} \ge 10$), the Actual value falls within a $\pm 10\%$ error margin of the Forecast: $$0.9 \le \frac{\text{Actual}}{\text{Forecast}} \le 1.1$$
    • FAIL if none of the above conditions are met.
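
The rules above can be written as a single function. This is a minimal sketch, not the project's evaluation code: the name `is_pass` is hypothetical, and rounding ties half-up is an assumption because the text only says "nearest integer". The 10% band is checked with integer arithmetic to avoid floating-point edge cases at the boundaries.

```python
import math


def round_half_up(x):
    """Round to the nearest integer, ties going up (assumed tie-breaking rule)."""
    return int(math.floor(x + 0.5))


def is_pass(part_type, actual, forecast):
    """Return True if a single SKU-month forecast passes the business rules."""
    a, f = round_half_up(actual), round_half_up(forecast)

    # Case 1: PMC parts with low actual demand use an absolute tolerance of 5.
    if part_type == "PMC" and a <= 20:
        return abs(a - f) <= 5

    # Case 2: every other part type, or PMC with actual demand above 20.
    if f == 0:
        return a == 0
    if f == 1:
        return a in (1, 2)
    if 1 < f < 10:
        return f - 1 <= a <= f + 1
    if f >= 10:
        # 0.9 <= a / f <= 1.1, rewritten without division or floats.
        return 9 * f <= 10 * a <= 11 * f
    return False  # e.g. a negative forecast matches none of the conditions


print(is_pass("PMC", 20, 25), is_pass("PAM", 7, 5), is_pass("P", 110, 100))
```

A monthly pass rate for a part type is then the share of its SKUs for which `is_pass` returns True in that month.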

📊 Experimental Results (Test Set 2025)

After training the NHiTS model on the full 10-year training set (2015-2024), here are the results achieved on the 2025 test set:

📈 Average Monthly Pass Rate by Part_Type

| Part_Type | Average Monthly Pass Rate in 2025 (%) |
|-----------|---------------------------------------|
| PAM       | 72.39%                                |
| PMC       | 66.87%                                |
| P         | 32.82%                                |

📅 Detailed Monthly Pass Rates in 2025 (%)

| Month   | PAM (%) | PMC (%) | P (%)  |
|---------|---------|---------|--------|
| 01/2025 | 75.58%  | 68.62%  | 35.60% |
| 02/2025 | 73.74%  | 67.64%  | 39.32% |
| 03/2025 | 73.71%  | 69.24%  | 34.37% |
| 04/2025 | 74.48%  | 67.83%  | 32.51% |
| 05/2025 | 74.55%  | 68.67%  | 36.84% |
| 06/2025 | 73.74%  | 68.13%  | 31.58% |
| 07/2025 | 71.78%  | 68.02%  | 28.17% |
| 08/2025 | 72.89%  | 67.07%  | 33.44% |
| 09/2025 | 72.93%  | 66.54%  | 30.96% |
| 10/2025 | 70.07%  | 64.78%  | 31.89% |
| 11/2025 | 69.09%  | 64.14%  | 30.96% |
| 12/2025 | 66.17%  | 61.74%  | 28.17% |
| Average | 72.39%  | 66.87%  | 32.82% |
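
As a quick consistency check, the averages can be recomputed from the monthly figures above. The snippet below copies the values from the table and also reports the January-to-December change per group; the variable names are illustrative only.

```python
# Monthly pass rates (%) for 01/2025 through 12/2025, copied from the table above.
monthly = {
    "PAM": [75.58, 73.74, 73.71, 74.48, 74.55, 73.74,
            71.78, 72.89, 72.93, 70.07, 69.09, 66.17],
    "PMC": [68.62, 67.64, 69.24, 67.83, 68.67, 68.13,
            68.02, 67.07, 66.54, 64.78, 64.14, 61.74],
    "P": [35.60, 39.32, 34.37, 32.51, 36.84, 31.58,
          28.17, 33.44, 30.96, 31.89, 30.96, 28.17],
}

# Unweighted mean over the 12 months, per part type.
averages = {name: sum(rates) / len(rates) for name, rates in monthly.items()}

# Change in pass rate from January to December, in percentage points.
drift = {name: rates[-1] - rates[0] for name, rates in monthly.items()}

for name in monthly:
    print(f"{name}: mean={averages[name]:.2f}%  Jan->Dec change={drift[name]:+.2f} pp")
```

All three groups end the year below where they started, which is consistent with forecast error growing toward the end of a 12-month horizon.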

Key Observations:

  • The PAM group shows the highest predictability, averaging a 72.39% monthly pass rate.
  • The PMC group (comprising over 68% of the SKUs) stays between 61.74% and 69.24% throughout the year, averaging 66.87%. It holds at 66.5% or above through September, then declines to 61.74% by December.
  • The P group (extremely low SKU count, typically with very volatile or sparse demand patterns) yields a lower pass rate of 32.82%. This group would benefit from specialized intermittent demand forecasting algorithms in future iterations.
