Abstract:
The objectives of this research were 1) to analyze the item quality of the National Test (NT) in three subjects, 2) to examine the differential item functioning (DIF) of the NT items using the HGLM, MIMIC, and Bayesian methods, and 3) to compare the DIF detection performance of the HGLM, MIMIC, and Bayesian methods. The study used secondary data: the NT item responses of 2,000 Grade 3 students in three subjects (literacy, numeracy, and reasoning ability) for academic year 2011. Item quality was analyzed in terms of three parameters using Xcalibre Version 4.2.2.
Results were as follows.
1. The NT items for Grade 3 students in the three subjects had fairly good discrimination parameter values (a), their difficulty level was assessed with the difficulty parameter (b), and their guessing parameter values (c) did not exceed .30.
2. The DIF examination of the NT items for Grade 3 students across the three subjects showed the following. For literacy, the HGLM method flagged the largest share of items as DIF (30%), followed by the Bayesian method (23.33%) and the MIMIC method (3.33%). For numeracy, the MIMIC method flagged the largest share (26.67%), followed by the Bayesian method (20%) and the HGLM method (16.67%). For reasoning ability, the HGLM method flagged the largest share (56.67%), followed by the MIMIC and Bayesian methods, each at 36.67%.
3. The comparison of DIF detection on the NT items for Grade 3 students across the three subjects indicated that the HGLM method performed better than the MIMIC method on literacy and reasoning ability, flagging 26.67 and 20 percentage points more DIF items, respectively. The HGLM method also performed better than the Bayesian method on reasoning ability and literacy, by 20 and 6.67 percentage points, respectively. The MIMIC method performed better than the Bayesian method on numeracy, by 6.67 percentage points, but worse than the Bayesian method on literacy, by 20 percentage points; on reasoning ability the two methods did not differ. On numeracy, the HGLM method performed worse than the MIMIC and Bayesian methods, by 10 and 3.33 percentage points, respectively.
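The pairwise differences in result 3 follow arithmetically from the DIF percentages in result 2. As a minimal sketch, assuming each difference is the simple gap in percentage points between two methods' DIF rates, they can be recomputed as below. The names `dif_rate` and `gap` are illustrative and not part of the study.

```python
# Percentage of NT items flagged as DIF by each method, as reported in result 2.
dif_rate = {
    "literacy":  {"HGLM": 30.00, "MIMIC": 3.33,  "Bayesian": 23.33},
    "numeracy":  {"HGLM": 16.67, "MIMIC": 26.67, "Bayesian": 20.00},
    "reasoning": {"HGLM": 56.67, "MIMIC": 36.67, "Bayesian": 36.67},
}

# Method pairs compared in result 3.
PAIRS = [("HGLM", "MIMIC"), ("HGLM", "Bayesian"), ("MIMIC", "Bayesian")]


def gap(subject, first, second):
    """Percentage-point difference in DIF rate (first minus second)."""
    rates = dif_rate[subject]
    return round(rates[first] - rates[second], 2)


for subject in dif_rate:
    for first, second in PAIRS:
        # A positive value means `first` flagged more DIF items than `second`.
        print(f"{subject:9s} {first} - {second}: {gap(subject, first, second):+.2f}")
```

A positive gap means the first method flagged a larger share of items than the second; a gap of zero corresponds to the two methods not differing on that subject.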