Title: Ensembles for Regression: A Statistical Analysis and Recommendations
Authors: Halcyon Davys Pereira de Carvalho, João Fausto L. de Oliveira & Roberta A de A Fagundes
Abstract: Ensemble techniques have proven effective in improving the performance of regression models. However, the decision between homogeneous and heterogeneous strategies is often made empirically, without considering the statistical properties of the data. This paper presents a framework that conducts a statistical analysis of datasets to recommend the most appropriate ensemble strategy. The framework leverages metrics such as coefficient of variation, skewness, kurtosis, correlation, and outlier presence to classify datasets into homogeneous or heterogeneous profiles. Ten real-world datasets with varying levels of complexity were evaluated. The results show that the proposed framework achieved the best performance in 80% of the cases, confirming that the statistical structure of the data should guide the selection of ensemble strategies. Wilcoxon statistical tests further validated the significance of the results. The proposed framework offers a systematic alternative to guide ensemble decisions in regression tasks, enhancing accuracy and reducing reliance on empirical trial-and-error procedures.
Keywords: Ensemble learning; Regression; Statistical analysis; Model recommendation; Machine learning.
Pages: 8
DOI: 10.21528/CBIC2025-1176303
Paper (PDF): CBIC_2025_paper1176303.pdf
BibTeX file:
CBIC_2025_1176303.bib
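
The abstract names the statistics the framework inspects (coefficient of variation, skewness, kurtosis, correlation, outlier presence) but not how they are computed or thresholded. The following is a minimal sketch of how such a dataset profile could be computed with NumPy. The function name `dataset_profile`, the choice to profile the target variable, the 1.5 x IQR outlier rule, and the use of mean absolute Pearson correlation are all illustrative assumptions, not the authors' method; the sketch deliberately stops short of a homogeneous/heterogeneous decision because the abstract gives no decision rule.

```python
import numpy as np


def dataset_profile(X, y):
    """Summary statistics of the kind listed in the abstract (hypothetical helper).

    X: 2-D array of numeric features, y: 1-D numeric regression target.
    Assumes y is not constant and its mean is non-zero.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Standardised target, used for the moment-based shape measures.
    z = (y - y.mean()) / y.std()

    # Outliers flagged with the 1.5 * IQR rule (an assumption, not from the paper).
    q1, q3 = np.percentile(y, [25, 75])
    iqr = q3 - q1
    outliers = (y < q1 - 1.5 * iqr) | (y > q3 + 1.5 * iqr)

    # Pearson correlation of each feature column with the target.
    corr = [np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])]

    return {
        "cv": float(y.std() / abs(y.mean())),            # coefficient of variation
        "skewness": float(np.mean(z ** 3)),              # third standardised moment
        "excess_kurtosis": float(np.mean(z ** 4) - 3.0), # fourth moment minus 3
        "mean_abs_corr": float(np.mean(np.abs(corr))),
        "outlier_fraction": float(outliers.mean()),
    }
```

A profile like this could feed whatever classification rule the full paper defines; any thresholds would need to be taken from the paper itself rather than from this sketch.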
