Enhancing Predictive Accuracy through Ensemble Learning Techniques in High-Dimensional Datasets

Authors

  • Maria Katja Jotuni, Senior Data Scientist, Nigeria

Keywords

Ensemble Learning, High-dimensional Data, Bagging, Boosting, Feature Selection, Model Fusion, Dimensionality Reduction, Predictive Accuracy

Abstract

In high-dimensional datasets, traditional machine learning models often suffer from overfitting, poor generalization, and computational inefficiency. Ensemble learning techniques such as bagging, boosting, and stacking offer a compelling alternative: they combine multiple models to improve prediction robustness and accuracy. This paper examines the efficacy of ensemble methods in high-dimensional feature spaces and reviews their performance across diverse domains, including bioinformatics, finance, and image recognition. We present comparative experimental results, a visual analysis of model behavior, and implications for future research.
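As a rough illustration of the bagging idea named in the abstract, the sketch below trains decision stumps on bootstrap samples with random feature subsets and combines them by majority vote. It is not the paper's experimental setup: the synthetic data, the stump learner, and every name and parameter here (`make_data`, `fit_bagged_stumps`, `N_FEATURES`, the 51 estimators) are assumptions chosen only to keep the example small and dependency-free.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

N_FEATURES = 50      # many features relative to the sample size (assumed)
N_INFORMATIVE = 5    # only the first few features carry signal (assumed)

def make_data(n_samples):
    """Synthetic binary data: the label is the sign of the informative features' sum."""
    X, y = [], []
    for _ in range(n_samples):
        row = [rng.gauss(0.0, 1.0) for _ in range(N_FEATURES)]
        X.append(row)
        y.append(1 if sum(row[:N_INFORMATIVE]) > 0 else 0)
    return X, y

def predict_stump(stump, row):
    """A stump thresholds one feature at zero; polarity flips the direction."""
    feature, polarity = stump
    return 1 if polarity * row[feature] > 0 else 0

def fit_stump(X, y, features):
    """Pick the (feature, polarity) pair with the most correct training predictions."""
    best_stump, best_correct = None, -1
    for feature in features:
        for polarity in (1, -1):
            stump = (feature, polarity)
            correct = sum(predict_stump(stump, row) == t for row, t in zip(X, y))
            if correct > best_correct:
                best_stump, best_correct = stump, correct
    return best_stump

def fit_bagged_stumps(X, y, n_estimators=51, n_sub_features=10):
    """Bagging: each stump sees a bootstrap sample and a random feature subset."""
    ensemble = []
    for _ in range(n_estimators):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # sample with replacement
        features = rng.sample(range(N_FEATURES), n_sub_features)
        ensemble.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], features))
    return ensemble

def predict_ensemble(ensemble, row):
    """Majority vote over all stumps (odd ensemble size avoids ties)."""
    votes = sum(predict_stump(s, row) for s in ensemble)
    return 1 if 2 * votes > len(ensemble) else 0

def accuracy(predict, X, y):
    return sum(predict(row) == t for row, t in zip(X, y)) / len(y)

X_train, y_train = make_data(200)
X_test, y_test = make_data(200)

single = fit_stump(X_train, y_train, range(N_FEATURES))
ensemble = fit_bagged_stumps(X_train, y_train)

single_acc = accuracy(lambda r: predict_stump(single, r), X_test, y_test)
ensemble_acc = accuracy(lambda r: predict_ensemble(ensemble, r), X_test, y_test)
print(f"single stump: {single_acc:.3f}  bagged stumps: {ensemble_acc:.3f}")
```

Drawing a random feature subset for each estimator borrows from the random subspace method and is one common way to decorrelate ensemble members when features far outnumber the informative ones. Boosting and stacking would need a different training loop and are not shown here.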

Published

2021-06-22

How to Cite

Maria Katja Jotuni. (2021). Enhancing Predictive Accuracy through Ensemble Learning Techniques in High-Dimensional Datasets. INTERNATIONAL JOURNAL OF ENGINEERING AND TECHNOLOGY RESEARCH & DEVELOPMENT, 2(1), 7-11. https://ijetrd.com/index.php/ijetrd/article/view/IJETRD_02_01_002