This seminar presents a comparative analysis of traditional time series forecasting techniques, such as Holt, Holt-Winters, and ARIMA, with modern machine learning and deep learning approaches, including XGBoost, Random Forest, CNNs, RNNs, LSTMs, and Transformers. As a practical application, we explore a case study focused on predicting monthly sales of a strategic health insurance product. The study evaluates the impact of incorporating exogenous macroeconomic indicators, including consumer price index variation and unemployment rates, on model accuracy. Particular emphasis is placed on identifying and correcting outliers, assessing their influence on both classical and machine learning models. Results show that the ARIMAX model applied to an outlier-adjusted dataset achieved the best forecasting performance, demonstrating the value of integrating external variables. While machine learning models performed competitively, their accuracy was notably lower when outliers remained untreated. Overall, the findings highlight the complementary strengths of classical statistical models and modern learning algorithms, encouraging a hybrid approach to enhance the robustness and reliability of sales forecasting in the insurance sector.
In parallel, we evaluate several robust strategies proposed in the literature, incorporating them into the Random Forest (RF) framework to assess whether, and to what extent, they enhance its stability and predictive accuracy under contamination. The aim of this study is to clarify the practical potential of such robust adaptations as complementary tools to the classical RF algorithm in routine genomic prediction workflows, and to identify a simple, practical robust strategy that improves prediction in the presence of outliers.