

A Pragmatic Ensemble Strategy for Missing Values Imputation in Health Records
Authors: Shivani Batra, Rohan Khurana, Mohammad Zubair Khan, Wadii Boulila, Anis Koubaa, Prakash Srivastava
Institution: 1. Department of Computer Science and Engineering, KIET Group of Institutions, Delhi-NCR, Ghaziabad 201206, India (S.B., R.K.); 2. Department of Computer Science and Information, Taibah University, Medina 42353, Saudi Arabia; 3. Robotics and Internet-of-Things Laboratory, Prince Sultan University, Riyadh 12435, Saudi Arabia (W.B., A.K.); 4. Department of Computer Science and Engineering, Graphic Era (Deemed to be University), Dehradun 248002, India
Abstract: Clean and trustworthy data are required for effective computer modelling in medical decision-making, yet medical data are frequently incomplete. Missing values may occur not only in training data but also in testing data, which might contain a single undiagnosed episode or participant. This study evaluates different imputation and regression procedures, selected on the basis of regressor performance and computational expense, to address missing values in both training and testing datasets. Several procedures for dealing with missing values have been introduced in healthcare, but there is still debate over which imputation strategies work better in specific cases. This research proposes an ensemble imputation model that is trained to combine simple mean imputation, k-nearest neighbour imputation, and iterative imputation, and that selects the most suitable of these strategies based on the attribute correlations of the features with missing values. We introduce this Ensemble Strategy for Missing Value to analyse healthcare data with a considerable share of missing values and to support unbiased and accurate predictive statistical modelling. Performance metrics were generated using the eXtreme gradient boosting regressor, random forest regressor, and support vector regressor. Experiments on real-world healthcare data, together with simulations of data with varying feature-wise missing frequencies, indicate that the proposed technique surpasses standard missing value imputation approaches, as well as the approach of dropping records that hold missing values, in terms of accuracy.
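The selection idea described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not state the selection rule, so the thresholds `low` and `high`, the function names, and the use of the strongest absolute Pearson correlation as the deciding signal are all assumptions made for illustration. A single least-squares fit on the most correlated feature stands in for full iterative imputation, and the nearest-neighbour distance is computed on unscaled features.

```python
import math
from statistics import mean

def column(rows, j):
    return [r[j] for r in rows]

def pearson(xs, ys):
    """Pearson correlation over rows where both values are observed."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    if len(pairs) < 3:
        return 0.0
    mx, my = mean(x for x, _ in pairs), mean(y for _, y in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def impute_mean(rows, j):
    """Fill gaps in column j with the mean of its observed values."""
    m = mean(v for v in column(rows, j) if v is not None)
    return {i: m for i, r in enumerate(rows) if r[j] is None}

def impute_knn(rows, j, k=3):
    """Fill gaps in column j with the average over the k nearest donor rows."""
    donors = [r for r in rows if r[j] is not None]
    filled = {}
    for i, r in enumerate(rows):
        if r[j] is not None:
            continue
        def dist(d, r=r):
            # Root-mean-square difference over features observed in both rows.
            shared = [(a - b) ** 2 for c, (a, b) in enumerate(zip(r, d))
                      if c != j and a is not None and b is not None]
            return math.sqrt(mean(shared)) if shared else math.inf
        filled[i] = mean(d[j] for d in sorted(donors, key=dist)[:k])
    return filled

def impute_regression(rows, j, p):
    """One least-squares fit of column j on column p (simplified stand-in
    for iterative imputation; falls back to the mean if p is also missing)."""
    pairs = [(r[p], r[j]) for r in rows if r[p] is not None and r[j] is not None]
    mx, my = mean(x for x, _ in pairs), mean(y for _, y in pairs)
    slope = (sum((x - mx) * (y - my) for x, y in pairs)
             / sum((x - mx) ** 2 for x, _ in pairs))
    return {i: my + slope * (r[p] - mx) if r[p] is not None else my
            for i, r in enumerate(rows) if r[j] is None}

def ensemble_impute(rows, low=0.3, high=0.7):
    """Pick one imputer per incomplete column from its strongest correlation.
    The thresholds are illustrative assumptions, not values from the paper."""
    n_cols = len(rows[0])
    result, chosen = [list(r) for r in rows], {}
    for j in range(n_cols):
        if all(r[j] is not None for r in rows):
            continue
        corrs = {p: abs(pearson(column(rows, j), column(rows, p)))
                 for p in range(n_cols) if p != j}
        best = max(corrs, key=corrs.get)
        if corrs[best] < low:
            chosen[j], filled = "mean", impute_mean(rows, j)
        elif corrs[best] < high:
            chosen[j], filled = "knn", impute_knn(rows, j)
        else:
            chosen[j], filled = "regression", impute_regression(rows, j, best)
        for i, v in filled.items():
            result[i][j] = v
    return result, chosen

# Toy data: column 1 is a linear function of column 0, column 2 is unrelated.
rows = [
    [1.0, 3.0, 8.0],
    [2.0, 5.0, 2.0],
    [3.0, None, 2.0],
    [4.0, 9.0, 8.0],
    [5.0, 11.0, None],
    [6.0, 13.0, 5.0],
]
completed, chosen = ensemble_impute(rows)
print(chosen)
print(completed[2][1], completed[4][2])
```

On this toy table the strongly correlated column is filled by the regression stand-in and the uncorrelated column by its mean. In practice the three base imputers would more likely come from an established library, and the selection rule would be tuned against downstream regressor accuracy rather than fixed in advance.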
Keywords: ensemble learning; health data; imputation methods; missing values; regression algorithms