A Computational Linguistics Study of Cognitive Basic-Level Nouns in English (以計算語言學方法研究英文的認知基本名詞)

Date

2010

Abstract

Prototype Theory, a celebrated theory in cognitive linguistics, faces a long-standing issue: studies of cognitive categorization have relied on a handful of classic cases such as ‘armchair’, ‘chair’, and ‘furniture’ (Rosch et al. 1976; Taylor 2003; Ungerer & Schmid 1996, 2006). To my knowledge, no study so far has attempted to determine the cognitive level (superordinate, basic, or subordinate) of all the lexical words in any language. This thesis provides empirical support for the cognitive categorization proposed by Rosch et al. (1976, 1978) through a comprehensive study of all lexical nouns in English, based on large electronic databases (WordNet, CELEX, BNC, CHILDES, ELP). I propose a computational algorithm that automatically identifies the cognitive level of each noun in WordNet by comparing its ability to form compounds with that of the other nouns in its hierarchical chain. The nouns extracted in this way show distinctive numerical features in lexical, semantic, and morphological respects, in accordance with predictions derived from the differences in cognitive salience among the three levels. In particular, multiple regression analysis shows that lexical decision latencies are highly correlated with the cognitive levels assigned by the algorithm. This evidence strongly supports both the validity of the level assignment and the credibility of Prototype Theory. First language acquisition data lead to the same conclusion. Young children acquire basic level words significantly faster and in strikingly larger numbers than words at the other two levels. Superordinate level words are particularly challenging for young learners, but once acquired, they are used very frequently. The results show that cognitive science and computer science can be closely linked and advance hand in hand.
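
The abstract does not spell out the level-assignment criterion, so the following Python sketch is only a toy illustration of the general idea: compare how readily each noun in a hypernym chain serves as the head of compounds, and treat the most productive noun as the basic level. The chain, the small lexicon, the helper names `compound_count` and `label_chain`, and the "most compounds wins" rule are all assumptions made for illustration, not the thesis's actual data or method.

```python
# Toy illustration only: the chain, the lexicon, and the decision rule
# below are assumptions, not the thesis's actual data or criterion.

def compound_count(word, lexicon):
    """Count multiword lexicon entries whose final token (head) is `word`."""
    return sum(
        1
        for entry in lexicon
        if entry != word and entry.split()[-1] == word
    )


def label_chain(chain, lexicon):
    """Label each noun in a hypernym chain ordered from general to specific.

    The noun heading the most compounds is labeled "basic"; nouns above it
    are "superordinate" and nouns below it are "subordinate".
    """
    counts = [compound_count(word, lexicon) for word in chain]
    basic_index = counts.index(max(counts))  # first maximum wins on ties
    labels = {}
    for i, word in enumerate(chain):
        if i < basic_index:
            labels[word] = "superordinate"
        elif i == basic_index:
            labels[word] = "basic"
        else:
            labels[word] = "subordinate"
    return labels


# Hypothetical hypernym chain and hand-made mini lexicon.
chain = ["furniture", "chair", "armchair"]
lexicon = {
    "furniture", "chair", "armchair",
    "rocking chair", "folding chair", "high chair",
    "garden furniture",
}

labels = label_chain(chain, lexicon)
for word in chain:
    print(word, "->", labels[word])
```

In a fuller version, the chains would presumably come from WordNet's hypernym hierarchy and the compound counts from its multiword entries, but how the thesis weighs those counts is not stated in the abstract.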

Keywords

prototype theory, cognitive linguistics, computational linguistics, basic level nouns, WordNet, English Lexicon Project