Scalable probabilistic forecasting in retail with gradient boosted trees: A practitioner’s approach
Metadata
Author
Long, Xueying; Bui, Quang; Oktavian, Grady; F. Schmidt, Daniel; Bergmeir, Christoph Norbert; Godahewa, Rakshitha; Per Lee, Seong; Zhao, Kaifeng; Condylis, Paul
Publisher
Elsevier
Subject
Probabilistic forecasting; Gradient boosted trees; Global models
Date
2024-11-12
Bibliographic reference
Long, X. et al. Int. J. Production Economics 279 (2025) 109449. [https://doi.org/10.1016/j.ijpe.2024.109449]
Sponsor
María Zambrano (Senior) Fellowship by the Spanish Ministry of Universities; Next Generation funds from the European Union
Abstract
The recent M5 competition has advanced the state-of-the-art in retail forecasting. However, there are important
differences between the competition challenge and the challenges we face in a large e-commerce company. The
datasets in our scenario are larger (hundreds of thousands of time series), and e-commerce can afford to have
a larger stock assortment than brick-and-mortar retailers, leading to more intermittent data. To scale to larger
dataset sizes with feasible computational effort, we investigate a two-layer hierarchy, namely the decision
level with product unit sales and an aggregated level, e.g., through warehouse-product aggregation, reducing
the number of series and degree of intermittency. We propose a top-down approach to forecasting at the
aggregated level, and then disaggregate to obtain decision-level forecasts. Probabilistic forecasts are generated
under distributional assumptions. The proposed scalable method is evaluated on a large proprietary
dataset as well as on the publicly available Corporación Favorita and M5 datasets. We show the
differences in characteristics of the e-commerce and brick-and-mortar retail datasets. Notably, our top-down
forecasting framework enters the top 50 of the original M5 competition, even with models trained at a higher
level under a much simpler setting.
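
The top-down step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the aggregated forecast is split or which distribution is assumed, so the historical-share split, the Poisson assumption, and the names `disaggregate`, `poisson_quantile`, and `history` are all hypothetical choices made for illustration. The aggregated mean forecast is taken as given (in the paper it would come from a gradient boosted tree model).

```python
import math


def disaggregate(aggregate_forecast, history):
    """Split an aggregated mean forecast across series by historical sales share.

    Assumption: shares are proportional to total past sales per series.
    """
    total = sum(sum(sales) for sales in history.values())
    if total == 0:
        # No sales at all in the history: fall back to equal shares.
        share = 1.0 / len(history)
        return {key: aggregate_forecast * share for key in history}
    return {
        key: aggregate_forecast * sum(sales) / total
        for key, sales in history.items()
    }


def poisson_quantile(mean, q):
    """Smallest integer k with P(X <= k) >= q for X ~ Poisson(mean).

    Assumption: Poisson demand, chosen only because it needs a single
    parameter; intended for small means typical of intermittent series.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    if mean <= 0:
        return 0
    k = 0
    pmf = math.exp(-mean)  # P(X = 0)
    cdf = pmf
    while cdf < q:
        k += 1
        pmf *= mean / k  # P(X = k) from P(X = k - 1)
        cdf += pmf
    return k


# Hypothetical past unit sales for three products in one warehouse.
history = {"sku_a": [3, 0, 1], "sku_b": [0, 0, 1], "sku_c": [10, 4, 1]}

# Aggregated (warehouse-level) mean forecast, assumed to come from the model.
means = disaggregate(10.0, history)

# Decision-level quantile forecasts under the Poisson assumption.
quantiles = {
    key: [poisson_quantile(mean, q) for q in (0.5, 0.95)]
    for key, mean in means.items()
}
```

The disaggregated means sum back to the aggregated forecast by construction, which keeps the two levels of the hierarchy coherent; a more dispersed distribution than the Poisson may be a better fit for highly intermittent series.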