








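After colonial rule by several powers and many years of misguided language policy, Taiwan's local languages have suffered a severe break in transmission. The urgent task is to recover the language of everyday life and to create vocabulary for a new era. English teaching materials compiled abroad all rest on rich corpora, so textbook writers can design materials according to word frequency and the difficulty of sentence patterns. Current Taiwanese Southern Min teaching materials, by contrast, lack basic research on parts of speech and word frequency; they can only be compiled according to editors' subjective judgment and have little scientific system to speak of. The few existing Southern Min corpora have not yet been annotated with grammatical information such as parts of speech, and they are still small. Because building a tagged corpus is very time-consuming and labor-intensive, this thesis first draws on rich digital resources, including an online Taiwanese-Mandarin translation dictionary, the Academia Sinica Balanced Corpus of Modern Chinese, and 楊允言's "Taiwanese word segmentation and part-of-speech tagging system" (台語文斷詞、詞性標示系統), to transfer part-of-speech tags onto Taiwanese text. This speeds up the construction of an annotated Taiwanese corpus and adds to the body of archived corpora. The corpus analyzed in this thesis is built from the six volumes of Southern Min stories in the Taoyuan County Folk Literature Series (桃園縣民間文學集), edited by Professor 胡萬川 and published by the Cultural Affairs Bureau of the Taoyuan County Government. Together with the historical text Dongfanji (東番記), it is used to investigate parts of speech and word frequency, using the old to verify the new and the new to interpret the old. The corpus will be merged into the Taiwan folk literature corpus of the Min and Hakka Language Archives project at the Institute of Linguistics, Academia Sinica, helping to expand Taiwanese corpora and serving as a reference for compiling Taiwanese teaching materials.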
Languages of Taiwan other than Mandarin are declining because of the monolingual language policy that its rulers pursued for several decades. It is therefore urgent to safeguard local languages and maintain linguistic diversity. Linguistic corpora are important for compiling language teaching materials. For example, the order in which vocabulary and sentence patterns appear in reading materials for Western classrooms is usually based on corpus surveys of word frequency, parts of speech, and collocations. Current Taiwanese Southern Min textbooks, however, are compiled according to editors' subjective judgment, because basic research on parts of speech and word frequency is lacking. Existing Taiwanese Southern Min corpora are relatively small, and none of them is tagged. This study is a corpus-based investigation of parts of speech and word frequency in Taiwanese Southern Min story texts. We built a tagged corpus consisting of 134 Southern Min stories from the Taoyuan County Folk Literature Series. We then addressed the relevant issues and compared the results against T. Chen's Dongfanji 'The Aborigines of Taiwan', published in 1603. Our corpus will be integrated into a large-scale Min corpus at Academia Sinica. We hope the results will be useful for editing and revising Taiwanese teaching materials.



Taiwanese Southern Min stories, language corpus, parts of speech, word frequency, Dongfanji





