Abstract:
Missing values are a common problem in time series analysis and arise for many reasons. To estimate missing values accurately, an imputation method must be chosen to suit the type of missing data and the mechanism that generates it. The purpose of this study is to compare imputation methods for time series with missing data. The methods compared were mean imputation, last observation carried forward (LOCF), and the EM algorithm. The data were simulated under three percentages of missing data (10%, 20%, and 30%) and three levels of nonignorable missingness (none, medium, and high). The methods were compared by the size of the average mean absolute percentage error (AMAPE). The findings are as follows: i) for the first-order autoregressive model, mean imputation performs best when the sample size is small (n = 50, 100) and the autoregressive parameter equals 0.2; ii) the EM algorithm performs best when the autoregressive parameter equals 0.5; iii) LOCF performs best when the sample size is small (n = 50, 100) and the autoregressive parameter equals 0.8; iv) for the second-order autoregressive model, mean imputation performs best when the first-order and second-order autoregressive parameters both equal 0.1; v) mean imputation performs best when the sample size is small (n = 50) and both parameters equal 0.25; vi) the EM algorithm performs best when both parameters equal 0.4.
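The kind of comparison described above can be illustrated with a minimal Python sketch. It is not the study's simulation code, and it rests on several assumptions: values are deleted completely at random rather than under the nonignorable mechanisms used in the study; the series is given a hypothetical nonzero mean (`mu = 10`) so that percentage errors are well defined; AMAPE is taken to be the mean of MAPE over replications; and the EM algorithm is omitted. All function names are illustrative.

```python
import numpy as np


def simulate_ar1(n, phi, mu=10.0, burn_in=100, seed=0):
    """Simulate y_t = mu + x_t with x_t = phi * x_{t-1} + e_t, e_t ~ N(0, 1).

    mu is a hypothetical level, added only to keep MAPE away from division by ~0.
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n + burn_in)
    x = np.zeros(n + burn_in)
    for t in range(1, n + burn_in):
        x[t] = phi * x[t - 1] + eps[t]
    return mu + x[burn_in:]  # drop the burn-in so the start value has no influence


def mask_at_random(y, frac, seed=1):
    """Set a fraction of values to NaN completely at random (assumed mechanism).

    The first observation is always kept so LOCF has a value to carry forward.
    """
    rng = np.random.default_rng(seed)
    n_missing = int(round(frac * len(y)))
    idx = rng.choice(np.arange(1, len(y)), size=n_missing, replace=False)
    out = y.copy()
    out[idx] = np.nan
    return out


def mean_impute(y):
    """Replace each missing value with the mean of the observed values."""
    out = y.copy()
    out[np.isnan(out)] = np.nanmean(y)
    return out


def locf_impute(y):
    """Replace each missing value with the most recent earlier value."""
    out = y.copy()
    for t in range(1, len(out)):
        if np.isnan(out[t]):
            out[t] = out[t - 1]
    return out


def mape(actual, imputed, missing):
    """Mean absolute percentage error over the imputed positions only."""
    err = (actual[missing] - imputed[missing]) / actual[missing]
    return 100.0 * float(np.mean(np.abs(err)))


def amape(impute, n, phi, frac, reps=200):
    """Average MAPE over independent replications (assumed definition of AMAPE)."""
    scores = []
    for r in range(reps):
        y = simulate_ar1(n, phi, seed=2 * r)
        y_obs = mask_at_random(y, frac, seed=2 * r + 1)
        scores.append(mape(y, impute(y_obs), np.isnan(y_obs)))
    return float(np.mean(scores))


if __name__ == "__main__":
    for phi in (0.2, 0.5, 0.8):
        for name, method in (("mean", mean_impute), ("LOCF", locf_impute)):
            print(f"phi={phi} {name}: AMAPE={amape(method, 50, phi, 0.10):.2f}")
```

Because of the simplifying assumptions, the numbers this sketch prints are not comparable to the AMAPE values reported in the study; it only shows how the two simpler imputation methods and the error measure fit together.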