Vocabulary Coverage in American Television Programs: A Corpus Analysis (美國影集的字彙涵蓋量-語料庫分析)
Date
2014
Abstract
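Where English is a foreign language, learners find it hard to have a rich target-language environment. Television programs combine reading and listening and are therefore a motivating learning resource for learners of English, yet few studies have treated them as authentic language-learning materials. Many studies indicate that media materials have great potential to stimulate vocabulary learning, so the researcher asked how much vocabulary learners need in order to understand the content of television programs.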
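This study investigates the vocabulary coverage needed to comprehend authentic American television programs. Its main purposes are: (1) to determine how many word families are needed to reach 95% and 98% coverage of American television programs, first from the word lists compiled from the British National Corpus (the BNC word lists) and then from the lists combining the British National Corpus (BNC) with the Corpus of Contemporary American English (COCA); (2) to determine the vocabulary size that different genres of television programs require to reach 95% and 98% coverage; and (3) to analyze the words that appear in American television programs but are not in the word lists, and to compare the two sets of lists (the BNC word lists and the BNC/COCA word lists).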
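The researcher collected sixty American television programs, comprising 7,279 episodes and 31,323,019 running words, and used Range to analyze the vocabulary size that each of the two sets of word lists requires for comprehending them. Through corpus analysis, the study further compares the coverage that the two sets of lists provide for American television programs.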
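As a rough illustration of the kind of calculation behind such coverage figures, the Python sketch below computes cumulative coverage by 1,000-word-family band and finds the smallest band that reaches a target. It is not the Range program and does not use the actual BNC or BNC/COCA lists: the `BAND_OF` mapping, the toy token list, and both function names are assumptions made for this example only.

```python
from collections import Counter

# Hypothetical toy mapping from word form to 1,000-word-family band.
# Band 0 stands in for proper nouns and marginal words, which are
# counted as known; a real analysis would load the full word lists.
BAND_OF = {
    "oh": 0,
    "the": 1, "doctor": 1, "is": 1, "here": 1,
    "patient": 2,
}


def cumulative_coverage(tokens, band_of, max_band):
    """Return {band: share of tokens covered by bands 0..band}."""
    # Words missing from every list are counted under the key None.
    counts = Counter(band_of.get(t.lower()) for t in tokens)
    total = len(tokens)
    running = counts.get(0, 0)
    coverage = {}
    for band in range(1, max_band + 1):
        running += counts.get(band, 0)
        coverage[band] = running / total
    return coverage


def bands_needed(coverage, target):
    """Return the smallest band whose cumulative coverage meets target."""
    for band in sorted(coverage):
        if coverage[band] >= target:
            return band
    return None  # the lists never reach the target


# Ten toy tokens; "Facebook" is deliberately absent from BAND_OF.
tokens = ["Oh", "the", "doctor", "is", "here",
          "the", "patient", "is", "here", "Facebook"]
cov = cumulative_coverage(tokens, BAND_OF, max_band=4)
print(cov)
print(bands_needed(cov, 0.90))
print(bands_needed(cov, 0.95))
```

In this toy input one token in ten is off-list, so cumulative coverage stops at 90% however many bands are added. This is how a set of lists can fail to reach a high target such as 98% at all.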
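The results show that, with proper nouns and marginal words included, 2,000 to 7,000 word families from the BNC word lists are needed to reach 95% coverage, while the BNC/COCA lists require 2,000 to 6,000 word families. To reach 98% coverage, both sets of lists require 5,000 or more word families.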
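Second, studies have suggested that adequate text comprehension requires 95% coverage (Laufer, 1989; Rodgers & Webb, 2011; Webb, 2010a, 2010b, 2010c; Webb & Rodgers, 2010a, 2010b). At that level, this study finds that serial dramas and serial supernatural dramas require the smallest vocabulary; procedurals and serial medical dramas are the most challenging because they require the largest; and sitcoms vary the most in the vocabulary they require.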
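Third, the words that appear in American television programs but are not in the word lists fall broadly into four types: (1) proper nouns; (2) marginal words; (3) transparent compounds; and (4) abbreviations. The two sets of lists are essentially complete, but the study shows that the vocabulary of a language is constantly renewed: newly coined words such as Facebook are not in the lists.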
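The study also summarizes the similarities and differences in the coverage that the two sets of lists provide for American television programs. To reach 95% coverage, 4,000 word families from the BNC lists plus knowledge of proper nouns and marginal words are needed, whereas the BNC/COCA lists require only 3,000 word families plus knowledge of proper nouns and marginal words. To reach 98% coverage, the combined lists require 10,000 word families plus knowledge of proper nouns and marginal words; the BNC word lists cannot supply enough vocabulary to cover 98% of American television programs.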
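The results indicate that 3,000 word families plus knowledge of proper nouns and marginal words are necessary for adequate comprehension of American television programs. Vocabulary coverage is an important indicator of how well American television programs can be understood, and it can help in selecting materials suited to learners, making language teaching with television programs more effective.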
In EFL contexts, learners of English are rarely exposed to ample language input. Television programs, which combine properties of both reading and listening materials, are a motivating source of language input for EFL learners. However, they have not been widely investigated as a source of authentic materials for language learning. While much research suggests great potential for learning vocabulary through media exposure, the researcher wished to determine how large a vocabulary learners need in order to comprehend television programs.
The study set out to investigate what vocabulary size is needed to comprehend authentic American television programs. Its purposes were: (1) to examine the vocabulary size needed to reach 95% and 98% coverage of authentic American television programs with the BNC lists and with the latest combination of the BNC lists with the COCA lists; (2) to investigate the vocabulary size needed for different genres to reach 95% and 98% coverage with each set of lists; and (3) to investigate the vocabulary not found in the BNC lists or the BNC/COCA lists. In addition, the results from the two sets of lists were compared.
The scripts of 7,279 episodes of sixty television programs, consisting of 31,323,019 running words, were analyzed using the Range program (Heatley et al., 2002) with the BNC lists and the BNC/COCA lists respectively. Qualitative analysis was also carried out to examine the differences in coverage between the two sets of lists.
The analysis yielded several findings. First, with the BNC lists, a vocabulary size ranging from the most frequent 2,000 to 7,000 word families plus proper nouns and marginal words provided 95% coverage. With the BNC/COCA lists, a vocabulary size ranging from 2,000 to 6,000 word families provided 95% coverage. To reach 98% coverage, a vocabulary size of 5,000 to over 14,000 word families plus proper nouns and marginal words was needed.
Second, as 95% coverage has been suggested to be sufficient for adequate comprehension (Laufer, 1989; Rodgers & Webb, 2011; Webb, 2010a, 2010b, 2010c; Webb & Rodgers, 2010a, 2010b), serial dramas and serial supernatural dramas were the least demanding genres. Procedurals and serial medical dramas were the most challenging, since they required a larger vocabulary size. Sitcoms depended on their topics and varied the most among all the genres in the study.
Third, the words not found in any list were mainly proper nouns, marginal words, transparent compounds, and abbreviations, which correspond to the four lists added on to the BNC/COCA lists. The two sets of lists were nearly complete; however, newly formed words such as Facebook were not found in them, which suggests that the vocabulary of a language is ever-growing.
The comparison showed that the BNC lists and the BNC/COCA lists provided 95% coverage at the 4,000- and 3,000-word-family levels respectively. The BNC/COCA lists, with proper nouns added, provided 98% coverage at the 10,000-word-family level, whereas the BNC lists could not provide 98% coverage. The findings also suggest that adequate comprehension could occur with the most frequent 3,000 word families plus proper nouns and marginal words. Based on these findings, pedagogical implications and possible directions for future studies are discussed in detail.
Keywords
vocabulary coverage, corpus analysis, corpus-driven study, L2 vocabulary learning, American television programs