Publication Date: 2023/10/30
Abstract: Finding ways to impute missing values in a data set is an essential part of research. Missingness is often unavoidable and may arise for natural or non-natural reasons. It is especially common in longitudinal and multilevel studies, where it can lead to biased estimates, loss of power, greater variability, and inaccurate results. For this study, a complete data set containing the resistance scores of intellectually disabled children who received behavioral skill training was used to compare various imputation techniques. The secondary data were longitudinal: the resistance score was recorded before the training and at four time points after it. Missing values were introduced at random into the complete data at varying percentages (5%, 10%, 15%, 20%, 30%) under the MAR mechanism. The imputed values were then compared with the full data using a linear mixed model. Various models were built using multiple imputation and machine learning techniques to impute the features used to predict the resistance score, using coefficients taken from the real data, and the same procedure was applied to simulated data. The machine learning methods were the best suited for imputing missing values and led to a significant improvement in prognosis accuracy compared with multiple imputation techniques using linear mixed models.
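The step of introducing MAR missingness at a fixed percentage can be illustrated with a minimal sketch. The abstract does not state how the MAR mechanism was implemented, so everything here is an assumption: the simulated scores, the function names `simulate_scores` and `induce_mar`, and the choice to make the chance of a missing follow-up score depend on the always-observed baseline score through a logistic weight.

```python
import math
import random


def simulate_scores(n_subjects=50, n_followups=4, seed=0):
    """Hypothetical complete data: one baseline and four follow-up scores per subject."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        baseline = rng.gauss(50.0, 10.0)
        followups = [baseline - 3.0 * t + rng.gauss(0.0, 4.0)
                     for t in range(1, n_followups + 1)]
        data.append({"baseline": baseline, "followups": followups})
    return data


def induce_mar(data, rate, seed=0):
    """Mask a fixed share of follow-up scores.

    The masking weight depends only on the baseline score, which is never
    masked, so the missingness is MAR (an assumed mechanism, not the study's).
    """
    rng = random.Random(seed)
    baselines = [row["baseline"] for row in data]
    mean = sum(baselines) / len(baselines)
    sd = math.sqrt(sum((b - mean) ** 2 for b in baselines) / len(baselines))

    cells = []
    for i, row in enumerate(data):
        # Higher baseline -> larger weight -> more likely to be masked.
        weight = 1.0 / (1.0 + math.exp(-(row["baseline"] - mean) / sd))
        for j in range(len(row["followups"])):
            # Weighted sampling without replacement: key = u ** (1 / weight),
            # then keep the cells with the largest keys.
            key = rng.random() ** (1.0 / weight)
            cells.append((key, i, j))

    n_missing = round(rate * len(cells))
    cells.sort(reverse=True)

    # Copy the data so the complete version stays available for comparison.
    masked = [{"baseline": r["baseline"], "followups": list(r["followups"])}
              for r in data]
    for _, i, j in cells[:n_missing]:
        masked[i]["followups"][j] = None
    return masked


complete = simulate_scores()
for rate in (0.05, 0.10, 0.15, 0.20, 0.30):
    masked = induce_mar(complete, rate)
    n_missing = sum(v is None for row in masked for v in row["followups"])
    print(f"target {rate:.0%}: {n_missing} of 200 follow-up scores masked")
```

Keeping the complete data alongside each masked copy is what allows imputed values to be compared against the true ones afterwards. The imputation and linear mixed model steps are not shown here.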
Keywords: Multiple Imputation, MAR Mechanisms, Machine Learning Techniques, Linear Mixed Effect Model.
DOI: https://doi.org/10.5281/zenodo.10053050
PDF: https://ijirst.demo4.arinfotech.co/assets/upload/files/IJISRT23OCT1169.pdf
REFERENCES