A Comparison of the Three Adaptive Testing Strategies Using MicroCAT

Date

1989-06-??

Authors

何榮桂

Publisher

國立臺灣師範大學研究發展處
Office of Research and Development

Abstract

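This study examined the efficiency and the accuracy of ability estimation of three computerized adaptive testing strategies (Bayesian, modal-Bayesian, and maximum likelihood) under item banks of different sizes and different item parameters. All three strategies were given the same starting point (the mean and variance of the examinee's ability parameter), and three criterion variables were used for evaluation. The first was the number of items needed to reach the preset termination point of the test; the second was the mean of the error variance of the ability estimates. For both of these criteria, smaller values indicate greater efficiency. The third was the absolute difference between the expected ability (the ability parameter) and the observed (estimated) ability; the smaller the value a strategy obtained, the more accurate its ability estimates. The three criterion measures obtained by computer simulation were analyzed with a three-factor repeated-measures analysis of variance, leading to the following conclusions: 1. In terms of efficiency, the modal-Bayesian strategy performed best, but the Bayesian strategy gave the most accurate ability estimates. The maximum likelihood strategy performed rather inconsistently. 2. When the starting point was higher than the examinee's true ability, bank size had a slight effect on the results. In that case, an 86-item bank with 22 items as the termination point appeared insufficient to distinguish among the three strategies. 3. Ability estimates were more accurate when the starting point was equal to or slightly higher than the examinee's true ability. 4. Interactions existed between bank type (item parameters) and testing strategy, and between bank type and examinee ability level. 5. Across banks of different sizes and item parameters, the statistical characteristics (item parameters) of never-selected items and most-frequently-selected items differed very little.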
The purpose of this study was to investigate the empirical performance of three adaptive testing strategies (Bayesian, modal-Bayesian, and maximum likelihood). The following conclusions emerged from the data analyses: (1) The modal-Bayesian strategy was the most efficient of the strategies used. The Bayesian strategy, however, yielded more reliable ability estimates. The maximum likelihood strategy was found to be inconsistent under most testing situations. (2) The effect of bank size seemed minimal except at the low ability level, where the starting point was higher than the true ability of the examinee. Under this condition, a bank of 86 items with a maximum of 22 items as the termination criterion was insufficient to differentiate the benefits of the selected strategies. (3) More accurate ability estimates could be obtained if the starting point was equal to or lower than the true ability of the examinee. (4) There were interactions between bank type and adaptive testing strategy, and between bank type and ability level. With the bank containing the easiest items, the performance of the maximum likelihood strategy improved substantially, especially at the low ability level. However, the other strategies still performed better. (5) The differences between the statistical characteristics of never-selected items and frequently-selected items were relatively small.
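
As a rough illustration of how the three evaluation criteria named in the abstract (number of items administered, mean error variance of the ability estimate, and absolute difference between true and estimated ability) could be summarized from simulation output, here is a minimal Python sketch. The record layout, the field names, and the numbers are hypothetical; they are not taken from the study or from MicroCAT.

```python
from statistics import mean


def summarize_criteria(records):
    """Average the three criteria over simulated examinees.

    `records` is a hypothetical list of dicts, one per simulated examinee,
    with keys: n_items, error_var, true_theta, est_theta.
    """
    return {
        # Criterion 1: items needed to reach the termination point
        "mean_items": mean([r["n_items"] for r in records]),
        # Criterion 2: mean error variance of the ability estimates
        "mean_error_variance": mean([r["error_var"] for r in records]),
        # Criterion 3: mean |true ability - estimated ability|
        "mean_abs_error": mean(
            [abs(r["true_theta"] - r["est_theta"]) for r in records]
        ),
    }


# Made-up values for illustration only (not data from the study)
example_records = [
    {"n_items": 10, "error_var": 0.25, "true_theta": -1.0, "est_theta": -0.5},
    {"n_items": 20, "error_var": 0.75, "true_theta": 1.0, "est_theta": 1.5},
]
summary = summarize_criteria(example_records)
```

Under this reading, smaller values on the first two measures would indicate a more efficient strategy, and a smaller value on the third a more accurate one.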
