台灣閩南語語音字典改良 Optimizing the Lexicon for Recognizing Taiwanese Southern Min Speech

dc.contributor 甯俐馨 zh_TW
dc.contributor Ning, Li-Hsin en_US
dc.contributor.author 陳晏淇 zh_TW
dc.contributor.author Chen, Yan-Chi en_US
dc.date.accessioned 2020-10-19T06:48:58Z
dc.date.available 2025-08-18
dc.date.available 2020-10-19T06:48:58Z
dc.date.issued 2020
dc.description.abstract 本研究旨在改良囊括日常用語的台灣閩南語語音字典。有鑒於台灣的老化人口日益增加,建置台灣閩南語語音資料庫於未來多元應用更趨重要,如:語音科技改良與語言保存。然而,由於台灣閩南語為低資源語言(low-resource language)之一,目前可取得的台灣閩南語語料相當稀少;本研究蒐集線上台灣閩南語對話語料,並予以人工分詞與標記。本研究以蒐集的語料探究現有的台灣閩南語字典之於蒐集語料的涵蓋率,並發現尚有未被收錄於台灣閩南語字典中的台灣閩南語發音詞條與台灣閩南語新詞條。本研究將整理未被收錄於字典中的詞條,蒐集的詞條將被分類為三個類別;其分別為:發音變異(pronunciation variation, PV)、多種發音(multiple pronunciation, MP) 與新詞(new word, NW) 。本文將呈現針對蒐集語料的深入分析,並基於觀察的結果進行討論。討論重點將著眼於總括性的觀察性統整。我們期望此研究結果能夠反映部分台灣閩南語語詞在實際台灣閩南語對話中的使用情形,並協助改良現有台灣閩南語語音辨識系統。 zh_TW
dc.description.abstract This thesis aims to optimize a Taiwanese Southern Min (TSM) lexicon that accommodates the daily use of TSM words. In light of Taiwan's growing aging population, it may be necessary to build a database of TSM words for diverse applications such as speech technologies and language preservation. However, because TSM is a low-resource language, little data is available for TSM research. To address this scarcity, this thesis prepared TSM data by gathering online TSM conversational speech, segmenting its content, and annotating the data manually. Next, this thesis investigated the word coverage of the existing TSM dictionary and found that some TSM pronunciations and TSM words have yet to be included in the dictionary. Entries not found in the dictionary were then sorted into three categories based on their pronunciation variation types: pronunciation variation (PV), multiple pronunciation (MP), and new word (NW). Following an in-depth description of the data analysis, a discussion based on our observations is presented. The discussion focuses on generalizations drawn from our findings. We expect the findings to offer a glimpse of the daily use of TSM. We hope our results will help optimize the lexicon for the Taiwanese Southern Min speech recognition (TSMSR) system currently in development and benefit future TSM-related studies. en_US
dc.description.sponsorship 英語學系 zh_TW
dc.identifier G060521035L
dc.identifier.uri http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22G060521035L%22.&%22.id.&
dc.identifier.uri http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/110959
dc.language English
dc.subject 台灣閩南語 zh_TW
dc.subject 發音變異 zh_TW
dc.subject 多種發音 zh_TW
dc.subject 新詞 zh_TW
dc.subject 字典 zh_TW
dc.subject Taiwanese Southern Min (TSM) en_US
dc.subject pronunciation variation (PV) en_US
dc.subject multiple pronunciation (MP) en_US
dc.subject new word (NW) en_US
dc.subject lexicon en_US
dc.title 台灣閩南語語音字典改良 zh_TW
dc.title Optimizing the Lexicon for Recognizing Taiwanese Southern Min Speech en_US