Optimizing the Lexicon for Recognizing Taiwanese Southern Min Speech (台灣閩南語語音字典改良)

dc.contributor: 甯俐馨 [zh_TW]
dc.contributor: Ning, Li-Hsin [en_US]
dc.contributor.author: 陳晏淇 [zh_TW]
dc.contributor.author: Chen, Yan-Chi [en_US]
dc.date.accessioned: 2020-10-19T06:48:58Z
dc.date.available: 2025-08-18
dc.date.available: 2020-10-19T06:48:58Z
dc.date.issued: 2020
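dc.description.abstract: This study aims to improve a Taiwanese Southern Min pronunciation lexicon that covers everyday vocabulary. As Taiwan's aging population keeps growing, building a Taiwanese Southern Min speech database becomes increasingly important for diverse future applications, such as improving speech technology and preserving the language. However, because Taiwanese Southern Min is a low-resource language, available corpora are scarce; this study therefore collects online Taiwanese Southern Min conversational data and segments and annotates it manually. Using the collected data, the study examines how well the existing Taiwanese Southern Min dictionary covers that data and finds pronunciation entries and new words that the dictionary does not yet include. The study compiles these missing entries and classifies them into three categories: pronunciation variation (PV), multiple pronunciation (MP), and new word (NW). The thesis presents an in-depth analysis of the collected data and a discussion based on the observations, focusing on a general summary of those observations. We expect the results to reflect how some Taiwanese Southern Min words are used in actual conversation and to help improve existing Taiwanese Southern Min speech recognition systems. [zh_TW]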
dc.description.abstract: This thesis aims to optimize a Taiwanese Southern Min (TSM) lexicon that covers everyday TSM vocabulary. Given Taiwan's growing aging population, a database of TSM words is increasingly needed for applications such as speech technology and language preservation. However, TSM is a low-resource language, and little data is available for TSM research. To address this scarcity, this thesis collected online TSM conversational speech, segmented its content, and annotated the data manually. It then measured the word coverage of an existing TSM dictionary against the collected data and found that some TSM pronunciations and words have not yet been included in the dictionary. The missing entries were sorted into three categories according to their type of variation: pronunciation variation (PV), multiple pronunciation (MP), and new word (NW). An in-depth description of the data analysis is followed by a discussion that generalizes the observations. The findings are expected to give a glimpse of how TSM is used in daily conversation. We hope the results will help optimize the lexicon for the Taiwanese Southern Min speech recognition (TSMSR) system under development and benefit future TSM-related studies. [en_US]
dc.description.sponsorship: Department of English (英語學系) [zh_TW]
dc.identifier: G060521035L
dc.identifier.uri: http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22G060521035L%22.&%22.id.&
dc.identifier.uri: http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/110959
dc.language: English
dc.subject: Taiwanese Southern Min (台灣閩南語) [zh_TW]
dc.subject: pronunciation variation (發音變異) [zh_TW]
dc.subject: multiple pronunciation (多種發音) [zh_TW]
dc.subject: new word (新詞) [zh_TW]
dc.subject: dictionary (字典) [zh_TW]
dc.subject: Taiwanese Southern Min (TSM) [en_US]
dc.subject: pronunciation variation (PV) [en_US]
dc.subject: multiple pronunciation (MP) [en_US]
dc.subject: new word (NW) [en_US]
dc.subject: lexicon [en_US]
dc.title: Improving the Taiwanese Southern Min Pronunciation Dictionary (台灣閩南語語音字典改良) [zh_TW]
dc.title: Optimizing the Lexicon for Recognizing Taiwanese Southern Min Speech [en_US]
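
The coverage check described in the abstract can be illustrated with a minimal sketch. It is not the thesis's actual procedure: the lexicon layout, the placeholder words and pronunciations, and the function names `classify` and `coverage_report` are all assumptions made for illustration. The abstract does not say how PV and MP entries are told apart, so the sketch only flags a known word with an unlisted pronunciation and leaves that distinction to manual review.

```python
from collections import Counter

# Hypothetical toy lexicon: word -> set of listed pronunciations.
# Words and pronunciations are placeholders, not real TSM data.
lexicon = {
    "word_a": {"pron_a1"},
    "word_b": {"pron_b1", "pron_b2"},
}

# Hypothetical annotated tokens: (word, pronunciation observed in speech).
tokens = [
    ("word_a", "pron_a1"),
    ("word_a", "pron_a9"),
    ("word_b", "pron_b2"),
    ("word_c", "pron_c1"),
]

def classify(word, pron, lexicon):
    """Label one annotated token against the lexicon."""
    if word not in lexicon:
        return "NW"  # word absent from the lexicon: new-word candidate
    if pron in lexicon[word]:
        return "covered"
    # Known word, unlisted pronunciation: a PV or MP candidate.
    # Choosing between PV and MP is assumed to need manual judgment.
    return "missing_pronunciation"

def coverage_report(tokens, lexicon):
    """Return (fraction of covered tokens, per-label counts)."""
    counts = Counter(classify(w, p, lexicon) for w, p in tokens)
    coverage = counts["covered"] / len(tokens) if tokens else 0.0
    return coverage, counts

coverage, counts = coverage_report(tokens, lexicon)
print(f"coverage = {coverage:.2f}")
print(dict(counts))
```

Counting tokens rather than unique words weights frequent words more heavily; a type-level figure could be obtained by passing `set(tokens)` instead.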
