A Fortune 100 CPG company used AI to improve the accuracy of forecasting for UK tea shipments
The Big Picture:
The UK tea market is data-rich and has almost reached saturation in terms of growth. Despite this stability, the accuracy of the CPG company’s tea shipment forecasts, produced by its expert demand planner, varied hugely month on month: from 23% to 46% at the retailer level and from 45% to 71% at the SKU level. These large variations in forecast accuracy demonstrate how difficult it is to predict business metrics accurately, even in a data-rich and stable market.
The company wanted to understand how machine-learning algorithms could predict business metrics, such as shipments or sales, more accurately by capturing market dynamics that experts could not. Shipments of specific SKUs showed no consistent pattern at the retailer level, which made forecasting more difficult. For operational reasons, some input data was also missing, including some point-of-sale data and promotional data.
The goal was to build a model that consistently delivered on several KPIs: improved forecast accuracy and reduced forecast bias, volume error, phasing error, and extreme bias. Month on month, the model needed to predict retailer-level shipments reliably to improve promotion planning and reduce loss. The same was needed for ‘SKU per retailer’ level shipments to help optimize supply-chain operations and reduce safety stocks.
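To make two of these KPIs concrete, here is a minimal sketch of how forecast accuracy and forecast bias might be computed. The case study does not state its formulas, so both definitions below are assumptions (a common volume-weighted convention), and the shipment numbers are hypothetical. Volume error, phasing error, and extreme bias are left out because their definitions are not given.

```python
def forecast_accuracy(actuals, forecasts):
    """Assumed definition: 1 - (sum of absolute errors / sum of actuals), floored at 0."""
    total_actual = sum(actuals)
    if total_actual == 0:
        raise ValueError("actuals sum to zero; accuracy is undefined")
    abs_error = sum(abs(f - a) for a, f in zip(actuals, forecasts))
    return max(0.0, 1.0 - abs_error / total_actual)


def forecast_bias(actuals, forecasts):
    """Assumed definition: (sum of forecasts - sum of actuals) / sum of actuals."""
    total_actual = sum(actuals)
    if total_actual == 0:
        raise ValueError("actuals sum to zero; bias is undefined")
    return (sum(forecasts) - total_actual) / total_actual


# Hypothetical monthly shipments (in cases) for one retailer
actuals = [100, 120, 80, 100]
forecasts = [110, 100, 90, 100]

accuracy = forecast_accuracy(actuals, forecasts)
bias = forecast_bias(actuals, forecasts)
print(f"accuracy={accuracy:.2f}, bias={bias:.2f}")
```

In this toy example the over- and under-forecasts cancel out, so the bias is zero even though the accuracy is below 100%. That is why the two metrics are worth tracking separately.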
To solve the company’s challenges, the first step was to build a robust feature engineering pipeline. It automatically derived 199 features from five primary features, including shipment, promotion, EPOS, and calendar date. Of these 199 derived features, 21 were identified as influential. One important example was the cannibalization effect of promotions, which could be derived by analyzing competing SKUs running promos simultaneously.
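As an illustration of what deriving features from a handful of primary inputs can look like, the sketch below builds three kinds of derived feature: calendar features, lagged values, and a simple cannibalization proxy that counts competing SKUs on promotion in the same period. The function names, feature definitions, and data are hypothetical and are not taken from the company’s pipeline.

```python
from datetime import date


def calendar_features(d):
    """Calendar features derived from a date; the exact definitions are assumed."""
    return {
        "ordinal_date": d.toordinal(),
        "week_of_month": (d.day - 1) // 7 + 1,
        "month": d.month,
    }


def lag_feature(series, lag):
    """Shift a series by `lag` periods; the first `lag` values are unknown (None)."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    return ([None] * lag + list(series))[: len(series)]


def competing_promo_count(promo_flags, sku, period):
    """Count the other SKUs on promotion in the same period (a cannibalization proxy)."""
    return sum(
        1
        for other, flags in promo_flags.items()
        if other != sku and flags[period]
    )


# Hypothetical promotion calendar: one 0/1 flag per period for each SKU
promo_flags = {
    "tea_a": [0, 1, 1],
    "tea_b": [1, 1, 0],
    "tea_c": [0, 1, 0],
}

print(calendar_features(date(2020, 3, 17)))
print(lag_feature([10, 20, 30], 1))
print(competing_promo_count(promo_flags, "tea_a", 1))
```

Applying a few such transforms across several lags, windows, and SKU groupings is one way a small number of primary inputs can expand into a much larger set of derived features.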
A ‘greedy’ method was then used to identify the features that provided a high lift in accuracy. Models were tuned to find the set of features that was important for each ‘SKU per retailer’ level model. Features that consistently helped to improve forecast accuracy included EPOS sales, ordinal date, and week number within the month.
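The case study does not say which greedy procedure was used. One common variant is greedy forward selection, sketched below: start with no features, repeatedly add the single feature that gives the largest lift in a validation score, and stop when no remaining feature helps. The scoring function here is a hypothetical stand-in with made-up lifts; in practice it would train a model on the given features and return its validation accuracy.

```python
def greedy_forward_selection(candidates, score_fn, min_lift=0.0):
    """Add one feature at a time, always taking the one with the largest lift."""
    selected = []
    best_score = score_fn(selected)
    remaining = list(candidates)
    while remaining:
        # Score every remaining feature when added to the current set
        trials = [(score_fn(selected + [f]), f) for f in remaining]
        trial_score, trial_feature = max(trials, key=lambda t: t[0])
        if trial_score - best_score <= min_lift:
            break  # no remaining feature improves the score enough
        selected.append(trial_feature)
        remaining.remove(trial_feature)
        best_score = trial_score
    return selected, best_score


def toy_score(features):
    """Hypothetical stand-in for 'train a model and return validation accuracy'."""
    lifts = {
        "epos_sales": 0.10,
        "ordinal_date": 0.04,
        "week_of_month": 0.03,
        "noise": -0.01,
    }
    return 0.50 + sum(lifts[f] for f in features)


candidates = ["noise", "week_of_month", "epos_sales", "ordinal_date"]
selected, score = greedy_forward_selection(candidates, toy_score)
print(selected, round(score, 2))
```

Running a selection like this separately for each ‘SKU per retailer’ model would let each model keep its own feature set. The trade-off is that a greedy search is cheap but can miss features that only help in combination.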
The final model used three sub-models: one designed for accurate retailer-level shipment prediction and two designed for ‘SKU per retailer’ level shipment prediction. The sub-models were then ensembled using a separate two-layered neural network model with a dual objective function, which minimized prediction bias and maximized prediction accuracy.
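The idea of a dual objective can be sketched with something much simpler than the two-layered neural network described above. The example below swaps in a linear blend of two forecasts and picks the blend weight by grid search. The loss is an assumed form (mean absolute error plus a weighted penalty on the absolute mean signed error), and all names and numbers are hypothetical.

```python
def dual_objective(actuals, forecasts, bias_weight=1.0):
    """Assumed loss: mean absolute error plus a penalty on the absolute mean signed error."""
    n = len(actuals)
    errors = [f - a for a, f in zip(actuals, forecasts)]
    mae = sum(abs(e) for e in errors) / n  # accuracy term
    bias = abs(sum(errors)) / n            # bias term
    return mae + bias_weight * bias


def best_blend_weight(actuals, forecast_a, forecast_b, bias_weight=1.0, steps=20):
    """Grid-search w in [0, 1] for the blend w * a + (1 - w) * b with the lowest loss."""
    best_w, best_loss = 0.0, float("inf")
    for i in range(steps + 1):
        w = i / steps
        blended = [w * a + (1 - w) * b for a, b in zip(forecast_a, forecast_b)]
        loss = dual_objective(actuals, blended, bias_weight)
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w, best_loss


# Hypothetical sub-model outputs: one over-forecasts, the other under-forecasts
actuals = [100, 100, 100, 100]
forecast_a = [110, 110, 110, 110]
forecast_b = [90, 90, 90, 90]

weight, loss = best_blend_weight(actuals, forecast_a, forecast_b)
print(weight, loss)
```

A neural-network ensembler can learn a non-linear, input-dependent combination rather than a single fixed weight, but the principle is the same: the combiner is trained against a loss that penalizes both inaccuracy and systematic over- or under-forecasting.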
As a result of the engagement, the company’s final ensembled model has consistently improved on the demand planner’s forecast accuracy, month on month. The best model consistently produces high accuracy (>60%) and low bias. Overall, the engagement has improved forecast accuracy by 6% at the retailer level and 5% at the SKU level. It has also reduced bias metrics such as extreme bias, volume error, and phasing error.