臺灣與大陸英語學習者語料庫之介詞錯誤研究 (A Study of Prepositional Errors in Taiwanese and Mainland Chinese Learner Corpora of English)

黃鈺真; Yu-Chen Huang

Date

2011

Authors

黃鈺真

Yu-Chen Huang

Abstract

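This study aims to identify the common prepositional errors in a Taiwanese and a mainland Chinese corpus of English learner writing, to compare the errors found in the two corpora, and to classify them. The researcher focused on prepositional errors in three combinations (verb + preposition, preposition + noun, and adjective + preposition) and extracted the errors semi-automatically. Two learner corpora were examined: a Taiwanese learner corpus of more than 1.8 million words and a mainland Chinese learner corpus of more than 3.4 million words. The native-speaker corpora used as the baseline were the British National Corpus (BNC) and the New York Times corpus (NYT). The researcher first listed every possible ordering of the three target combinations and used Monoconc Pro to retrieve them from both the learner corpora and the baseline corpora. A Perl program then compared the two sets of extracted data and isolated the learners' suspicious prepositional combinations, which the researcher finally inspected manually. The results show 282 common misuses and 162 common extraneous uses for the Taiwanese learners, and 1,070 common misuses and 139 common extraneous uses for the mainland Chinese learners. Overall, the two groups' errors were very similar: more than half of the misuses and extraneous uses made by the Taiwanese learners also appeared in the mainland Chinese learner corpus, and four of the top five misuses were shared, namely *in campus, *in (the) Internet, *by…way and *in the other hand. When all misuses were classified, three of the five most error-prone categories were the same for both groups: Space, Abstract Space and Manner. The researcher suggests that the errors can largely be attributed to differences between the first language and the target language and to unfamiliarity with target-language rules. The errors found in this study can be used to improve automatic writing-error detection systems, in particular their detection of prepositional errors. The study also hopes to help English teachers understand more deeply the prepositional errors common among learners whose first language is Chinese and the reasons behind them, and to offer teachers and learners a concrete reference for the teaching of prepositions.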
This study investigates common English prepositional errors in a Taiwanese learner corpus and a Chinese learner corpus. Few previous error-analysis studies have discussed learners’ common prepositional errors comprehensively and in detail. Moreover, most researchers have identified errors manually, which is laborious and time-consuming. To address these limitations, this study adopted a semi-automatic method of extracting prepositional errors. The researcher focused on prepositional errors that occur in three combinations: V + Prep., Prep. + N and Adj. + Prep. The errors discovered in the two learner corpora were compared to identify similarities, and all the errors were classified to show which aspects of preposition use learners find difficult. The research material comprised a 1.8-million-word Taiwanese learner corpus (TW) and a 3.4-million-word Chinese learner corpus (CN). Two native-speaker corpora, the BNC and the NYT, served as reference corpora. The researcher first listed all the possible patterns of the three target combinations and entered them into the concordancing program Monoconc Pro to retrieve prepositional instances from the learner corpora and the reference corpora. A Perl program was then used to compare the extracted data and capture instances that appeared only in the learner corpora. Finally, the researcher manually inspected these suspicious instances and judged whether they contained real prepositional errors. In total, 282 tokens of common misuses and 162 common extraneous uses were found in the TW corpus, and 1,070 tokens of common misuses and 139 tokens of common extraneous uses were found in the CN corpus. The errors of the two groups of learners were highly similar. Over half of the misuses and extraneous uses found in the TW corpus also appeared in the CN corpus. Four of the top five misuses in each learner corpus were shared, namely *in campus, *in (the) Internet, *by…way and *in the other hand. When all the misuses were classified, three of the top five misuse categories in each learner corpus proved to be the same: space, abstract space and manner. The researcher suggests that the discovered errors can broadly be attributed to interlingual factors and to ignorance of rule restrictions. The prepositional errors discovered in this study can be used to improve automatic writing-error detection systems, in particular their checking of prepositional errors. It is also hoped that these errors can give English teachers some insight into the common prepositional errors made by learners with Chinese as their L1 and serve as a useful reference for English teachers and learners.
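The comparison step described in the abstract, in which combinations found in the learner data but absent from the reference corpora are flagged for manual inspection, can be sketched roughly as follows. This is only an illustration in Python, not the Perl program used in the study: the function name `filter_suspicious`, the `min_count` threshold and the tiny sample lists are all hypothetical, and the real input would be the concordance output retrieved with Monoconc Pro.

```python
from collections import Counter


def filter_suspicious(learner_hits, reference_hits, min_count=2):
    """Return learner combinations that never occur in the reference data.

    learner_hits / reference_hits: iterables of extracted word combinations
    (e.g. "in campus"). min_count is an assumed frequency cut-off for what
    counts as a "common" candidate; the study's actual criterion is not
    stated in the abstract.
    """
    reference = {hit.lower() for hit in reference_hits}
    counts = Counter(hit.lower() for hit in learner_hits)
    suspicious = [
        (combo, n)
        for combo, n in counts.items()
        if combo not in reference and n >= min_count
    ]
    # Most frequent first, ties broken alphabetically, ready for manual review.
    return sorted(suspicious, key=lambda item: (-item[1], item[0]))


# Hypothetical toy data standing in for the extracted concordance lines.
learner = ["in campus", "on campus", "in campus",
           "in the other hand", "in the other hand", "at night"]
reference = ["on campus", "on the other hand", "at night"]

print(filter_suspicious(learner, reference))
```

The output is only a list of candidates. As in the study, each flagged combination would still need to be checked by hand, because absence from a reference corpus does not by itself prove that a combination is an error.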

Keywords

preposition, error, learner corpus, semi-automatic data extraction

URI

http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22GN0697210624%22.&%22.id.&
http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/97797

Collections

Theses and dissertations (學位論文)
