Abstract:
This research compared the efficiency of two item fit indices, PARSCALE G² and generalized S-X². Data were simulated under two polytomous item response theory models: the graded response model (GRM) and the generalized partial credit model (GPCM). Three factors were manipulated: 1) test length (10, 20, and 40 items), 2) sample size (500, 1000, and 2000), and 3) number of response categories (3, 5, 7, and 9). In total, 72 situations were analyzed. Type I error and power were the criteria used to evaluate the efficiency of each item fit index in 1) a comparison under Kang and Chen's (2008) condition, and 2) a comparison using two-way ANOVA. The results were as follows. 1. Under Kang and Chen's (2008) condition, generalized S-X² was more efficient than PARSCALE G² in most situations: its Type I error was lower than that of PARSCALE G² in 70 of the 72 situations (mean Type I error = 0.1535 for PARSCALE G² and 0.0216 for generalized S-X²). 2. In the two-way ANOVA comparison, the Type I error of generalized S-X² was lower than that of PARSCALE G² in 5 of 6 cases, and the power of PARSCALE G² was greater than that of generalized S-X² in all 6 cases. 3. Generalized S-X² was less likely than PARSCALE G² to flag fitting items as misfitting (it had the lower Type I error), whereas PARSCALE G² was more likely than generalized S-X² to flag misfitting items as misfitting (it had the higher power).
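
The simulation design described above can be sketched in a few lines of Python. This is only an illustration, assuming the 72 situations come from fully crossing the two models with the three manipulated factors (2 x 3 x 3 x 4 = 72). The names `design_cells` and `rejection_rate` are hypothetical, and the nominal level `alpha=0.05` is an assumption that the abstract does not state. The rejection rate is the usual empirical estimate: computed over items that truly fit, it estimates the Type I error; computed over items that truly misfit, it estimates power.

```python
from itertools import product

# Factor levels taken from the abstract.
MODELS = ("GRM", "GPCM")
TEST_LENGTHS = (10, 20, 40)
SAMPLE_SIZES = (500, 1000, 2000)
CATEGORY_COUNTS = (3, 5, 7, 9)


def design_cells():
    """Return every combination of model and manipulated factors (assumed fully crossed)."""
    return [
        {"model": m, "test_length": n_items, "sample_size": n, "categories": k}
        for m, n_items, n, k in product(
            MODELS, TEST_LENGTHS, SAMPLE_SIZES, CATEGORY_COUNTS
        )
    ]


def rejection_rate(p_values, alpha=0.05):
    """Proportion of item fit tests rejected at level alpha (alpha is an assumed value)."""
    p_values = list(p_values)
    if not p_values:
        raise ValueError("need at least one p-value")
    return sum(p < alpha for p in p_values) / len(p_values)


cells = design_cells()
print(len(cells))  # 72
```

Applying `rejection_rate` to the p-values from one fit index within each cell would give one Type I error (or power) estimate per situation, which is the quantity the two indices are compared on.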