Abstract:
The purpose of this study was to determine the lengths of criterion-referenced tests suitable for estimating the true domain scores of examinees, both at the group level and for each individual examinee. At the group level, the study sought shorter test lengths for which the true domain scores, estimated from base-adjusted scores on tests of that length, did not differ statistically from the true domain scores estimated by the full-length test. A length was considered justified when this condition held and the scores from tests of that length also correlated significantly with those from the full-length test. To estimate the true domain score of each examinee, a further analysis sought the test length that yielded a score which, when the full score was base-adjusted, did not differ from the full-length score by more than the confidence limit specified by the standard error of measurement.
The full-length test used in this study was an Item Analysis criterion-referenced test constructed according to the 1976 Teacher Council Curriculum. The items in the test were constructed from 14 item forms. Items were selected randomly in rounds to give 5 items per item form, and one more item was added to each of the sixth and seventh item forms, giving a final combined full-length test of 72 items. The subjects were 65 first-year students in the Bachelor of Education Program and 55 social studies students in the Higher Certificate of Education Program at Surin Teachers College, all in the 1985 academic year. A further 22 subjects, also from the 1985 academic year, were third-year students working toward a Bachelor of Science degree at Surin Vocational and Technology College.
The full-length test proved to be of adequate quality. It had discrimination validity. Because the variable measured by the test was continuous and the study aimed to measure the true scores of the examinees, reliability was estimated with Kuder-Richardson formula 20, and the coefficients ranged from .85 to .96.
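As an illustration of the reliability estimate named above, the following is a minimal sketch of Kuder-Richardson formula 20 for dichotomously scored (0/1) items. The function name `kr20`, the use of the population variance of total scores, and the small response matrix are assumptions made for the example; they are not taken from the study.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 reliability for a matrix of 0/1 item scores (rows = examinees)."""
    n_people = len(responses)
    n_items = len(responses[0])
    # Proportion of examinees answering each item correctly (p); q = 1 - p.
    p = [sum(row[j] for row in responses) / n_people for j in range(n_items)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    # Variance of total scores; the population variance is assumed here.
    total_var = pvariance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_var)


# Hypothetical data: 5 examinees by 4 items (not from the study).
demo = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(demo), 3))
```

Some texts use the sample variance of total scores instead; the choice shifts the coefficient slightly for small groups.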
The major findings were as follows:
When the Item Analysis Criterion-Referenced Test was used to estimate the true domain score of students studying under the 1976 Teacher Council Curriculum, or under another curriculum with an equivalent content domain, and the lessons were planned so that objectives, instructional activities, and evaluation were consistent, the results were as follows:
1. A test of 20 items was sufficient to estimate the group domain score. That is, the 20-item test yielded a score which, when the full score was base-adjusted, gave an estimate of the true domain score that was not statistically different from that estimated by the full-length test at the 0.05 level. The scores from the 20-item test also correlated significantly with those from the full-length test at the 0.001 level.
2. A test of 30 items was sufficient to estimate the true score of each examinee. That is, the proportion of examinees whose base-adjusted 30-item scores differed from their full-length scores by more than the confidence limit did not exceed the 5% expected error rate. This result was statistically significant at the 0.95 confidence level when the binomial test was applied.
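The individual-level check described in the findings above can be sketched as follows. This is only an illustration under stated assumptions: "base-adjusted" is taken to mean rescaling a short-test score to the 72-item base, the tolerance `limit` stands in for the confidence limit derived from the standard error of measurement, and the binomial test is framed as an upper-tail probability under a 5% error rate. The function names and the scores are hypothetical and do not come from the study.

```python
from math import comb

FULL_LENGTH = 72  # items in the full-length test, as stated in the abstract


def base_adjust(short_score, short_length, base=FULL_LENGTH):
    # Assumed meaning of "base-adjusted": rescale to the full-length base.
    return short_score * base / short_length


def count_exceedances(short_scores, full_scores, short_length, limit):
    # Number of examinees whose adjusted short-test score differs from
    # the full-length score by more than `limit`.
    return sum(
        abs(base_adjust(s, short_length) - f) > limit
        for s, f in zip(short_scores, full_scores)
    )


def binomial_upper_tail(k, n, p=0.05):
    # P(X >= k) for X ~ Binomial(n, p): the chance of seeing at least k
    # exceedances among n examinees if the true error rate were p.
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical scores for 5 examinees on a 30-item test and the full test.
short = [25, 20, 15, 28, 10]
full = [61, 47, 37, 66, 35]
k = count_exceedances(short, full, short_length=30, limit=6.0)
print(k, binomial_upper_tail(k, len(short)))
```

Under this framing, a large tail probability means the observed number of exceedances is consistent with an error rate of 5% or less; the exact hypothesis tested in the study may have been set up differently.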