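In English language teaching in Taiwan, translation is not only listed as a core competency in the senior high school English curriculum guidelines but is also one of the item types on university entrance examinations. Translation is therefore a language skill that cannot be overlooked in teaching and learning. Raters play a critical role in translation items because they judge test takers' ability through the scores they assign. Rating, however, involves many complex factors, such as the rater's own academic knowledge, prior experience marking papers, and personal characteristics, any of which may affect how severe or lenient the ratings are. This study therefore applies the Many-Facet Rasch Measurement Model to examine the extent to which rater characteristics (rater experience) affect the scoring of translation items, and how the model reveals the interactions among four facets (test taker ability, rater severity, rater experience, and item difficulty). The participants were 225 third-year students from four senior high schools in northern Taiwan. The results show that rater experience does affect scoring to some extent: prior marking experience improves rating efficiency, makes raters more sensitive to test takers' answers, and gives them greater flexibility, among other effects. However, rating experience is not an indicator of rating quality. Even novice raters can improve the quality of their ratings if they read the marking instructions and model answers closely and mark the items carefully. The study offers considerable insight for secondary school English teachers, who need not only to teach but also to know how to mark translation items. In translation tests, judging students' translation ability and item difficulty from raw scores alone is not objective. If the Many-Facet Rasch Measurement Model is also used during testing to examine the four facets (test taker ability, raters, rater experience, and item difficulty), it should not only improve teaching quality and student learning but also allow teachers to adjust their own rating behavior at any time.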
In Taiwan, an EFL context, English is a core component of the national senior high school curriculum, and translation skill is one of its key objectives. Translation is also an item type on the Advanced Subjects Test and the General Scholastic Ability Test. Raters play a critical role in translation items because they judge test takers' ability by assigning scores. Rating is nonetheless a complicated process that involves the rater's own knowledge, prior rating experience, and personal characteristics, any of which may cause variance in performance ratings, that is, harshness or leniency. The current study therefore applies the Many-Facet Rasch Measurement Model (MFRM) to examine the extent to which rater characteristics, in terms of rating experience, affect scores on translation items, and to identify the interactions among four facets: rater severity, rater experience, test taker proficiency level, and item difficulty. Participants in this study were 225 third-year senior high school students from northern Taiwan. The results suggest that rater experience does cause differences in rater severity, but it is hard to draw a strong conclusion as to which group is more severe, and ratings differ even within groups. Although the two groups of raters have different prior knowledge, experts and novices can reach agreement on item scores when they adhere carefully to the scoring criteria. It is hoped that English teachers can gain some insight from this study. In translation tests, raw scores alone are not an objective basis for judging learners' ability or item difficulty. If teachers use MFRM to examine the relationships among the facets, it can help them understand more about their students, the items, and their own rating behavior.



Many-Facet Rasch Measurement Model, translation, rater severity

