A Study of the Chinese Vocabulary Size of Native Chinese Speakers: The Case of Taiwanese Test Participants
Date
2022
Abstract
This study measures the Chinese vocabulary size of native Chinese speakers in order to provide a reference baseline for teaching Chinese as a foreign language. The number of people learning Chinese has risen sharply since the beginning of this century, and vocabulary size is one of the main criteria for assessing language proficiency in both textbook compilation and proficiency testing. The field nevertheless lacks basic vocabulary-size research to support its graded word lists. It is therefore necessary to quantify the vocabulary size of native speakers first, as a reference for constructing ability indicators for second-language learners of Chinese.

The study starts from a word list compiled from the ten-million-character Academia Sinica Balanced Corpus, which contains 262,780 words. Content analysis was used to filter out non-Chinese items, such as entries mixing Chinese and foreign text, producing a formal test list of 151,096 distinct words, or 143,646 words once affixes are excluded. Recruited native Chinese speakers from Taiwan took the test using the checklist method: each participant went through the list and marked every word they did not know. To verify the participants' accuracy and to calibrate the data, the researcher also designed a post-test for each stage. A set proportion of words was drawn from each participant's test data, both randomly and by hand, and the participant was asked to demonstrate their understanding of those words. Their responses were then judged against a pre-established "word recognition standard."

Eight participants completed the entire test. Their average vocabulary size was 99,219 words before calibration and 94,196 words after calibration against the post-test results. In terms of word frequency, the recognition rate for words occurring more than one thousand times in the corpus was close to 100%, and even extremely low-frequency words occurring only once had an average recognition rate of close to 50%. The factors affecting participants' vocabulary size were their disciplinary background and their rigor in completing the test. Native speakers who had switched from the sciences to the humanities had the largest vocabularies, possibly because people with both backgrounds recognize a broader range of words. Participants with a purely humanities background came second, and those with a purely science background scored lower than both other groups.

On this basis, the study conservatively estimates that university-educated native speakers in Taiwan have a vocabulary size of more than 90,000 words. Because the C-level descriptors of the Common European Framework of Reference for Languages already describe native-like proficiency, these figures can serve as a reference for the C level, and the vocabulary counts currently assigned to the other levels need to be adjusted accordingly.
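As a rough illustration of the kind of arithmetic a post-test calibration might involve, the sketch below scales a participant's self-reported count by the share of post-test items that met the recognition standard. The abstract does not state the actual calibration formula, so the function `adjusted_vocabulary_size`, its parameters, and every sample figure except the list size are hypothetical.

```python
def adjusted_vocabulary_size(total_words, marked_unknown,
                             posttest_sampled, posttest_passed):
    """Hypothetical calibration: scale the self-reported count of known
    words by the proportion of post-test items that passed the standard."""
    if not 0 <= marked_unknown <= total_words:
        raise ValueError("marked_unknown must be between 0 and total_words")
    if posttest_sampled <= 0:
        raise ValueError("posttest_sampled must be positive")
    if not 0 <= posttest_passed <= posttest_sampled:
        raise ValueError("posttest_passed must be between 0 and posttest_sampled")

    # Checklist method: every word not marked as unknown counts as known.
    raw_known = total_words - marked_unknown
    # Discount the raw count by the post-test pass rate.
    return round(raw_known * posttest_passed / posttest_sampled)


# 151,096 is the test-list size given in the abstract; the other
# three numbers are invented purely for illustration.
estimate = adjusted_vocabulary_size(
    total_words=151_096,
    marked_unknown=51_096,
    posttest_sampled=200,
    posttest_passed=190,
)
print(estimate)
```

A single multiplicative correction is the simplest possible choice; a real calibration could just as well weight the correction by frequency band, since the abstract reports very different recognition rates for high-frequency and low-frequency words.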
Keywords
Chinese native speaker, quantity of word recognition, Taiwan, influencing factors, checklist test