A Study of Speech Synthesis Techniques for Computer-Assisted Pronunciation Training

dc.contributor: 陳柏琳 [zh_TW]
dc.contributor: Chen, Berlin [en_US]
dc.contributor.author: 蔡孟庭 [zh_TW]
dc.contributor.author: TSAI, Meng-Ting [en_US]
dc.date.accessioned: 2024-12-17T03:37:21Z
dc.date.available: 2024-02-05
dc.date.issued: 2024
dc.description.abstract: 在語言學習領域中,聽力與口語訓練的方法可概分為跟讀法、聆聽並重複法與回聲法三種方式。這些方法的核心概念均是在聆聽目標語言的發音後,進而模仿其語調並覆述相同內容。然而在實際的學習環境中,要找到可作為學習對象的母語 (First Language, L1) 語者存在諸多限制,例如偏鄉地區師資缺乏、成本高昂、不利於個人化進度安排等。另外研究也指出,當第二語言 (English as a Second Language / Second Language, ESL / L2) 學習者在聆聽與模仿標準語音時,若語音標的的語者特徵與自己較為接近,對於發音技巧的訓練更為有益。這種結合 L2 學習者語者特性與 L1 語者口音特性的語音段落稱為「黃金語者 (Golden Speaker)」。為了解決上述問題,本研究選擇英語作為合成目標語言,以產生 L2 英語學習者的黃金標準語音。除嘗試改進合成結果並提出適用於發音學習情境的合成語音評估框架,也證實合成語音可以改善錯誤發音。研究並將此合成語音應用於電腦輔助發音訓練領域,驗證 L2 學習者原始語音與合成語音之間動態時間校正差異量可有效作為發音評估的預測特徵,並藉由合成語音提高自動發音評估的準確率,進而促進學習者與教學者在電腦輔助發音訓練情境的學習及工作效益。 [zh_TW]
dc.description.abstract: In language learning, listening and speaking training methods can be broadly grouped into three approaches: shadowing, listen-and-repeat, and echoing. All three rest on the same core idea: the learner listens to the pronunciation of the target language, then imitates its intonation and repeats the same content. In practice, however, finding native (first-language, L1) speakers to serve as learning targets is subject to many constraints, such as a lack of qualified instructors in rural areas, high costs, and difficulty accommodating individual learning schedules. Research also indicates that when second-language (L2) learners, including English as a Second Language (ESL) learners, listen to and imitate standard pronunciations, pronunciation training is more beneficial when the speaker characteristics of the model speech are close to the learner's own. A speech segment that combines the speaker characteristics of the L2 learner with the accent characteristics of an L1 speaker is called a "golden speaker." To address these problems, this study chooses English as the synthesis target language and generates golden-standard speech for L2 English learners. Besides attempting to improve the synthesis results and proposing an evaluation framework for synthesized speech suited to pronunciation-learning scenarios, the study confirms that synthesized speech can improve erroneous pronunciations. The study further applies the synthesized speech to computer-assisted pronunciation training, verifying that the dynamic time warping cost between an L2 learner's original speech and the synthesized speech can serve as an effective predictive feature for pronunciation assessment. By using synthesized speech, the study aims to improve the accuracy of automatic pronunciation assessment, thereby improving learning and work efficiency for both learners and instructors in computer-assisted pronunciation training. [en_US]
dc.description.sponsorship: 資訊工程學系 [zh_TW]
dc.identifier: 61047011S-44809
dc.identifier.uri: https://etds.lib.ntnu.edu.tw/thesis/detail/7ebad777c171949790edc1075e56d9a6/
dc.identifier.uri: http://rportal.lib.ntnu.edu.tw/handle/20.500.12235/123696
dc.language: Chinese
dc.subject: 語音合成 [zh_TW]
dc.subject: 電腦輔助發音訓練 [zh_TW]
dc.subject: 黃金語音 [zh_TW]
dc.subject: 自動發音評估 [zh_TW]
dc.subject: text-to-speech [en_US]
dc.subject: computer-assisted pronunciation training [en_US]
dc.subject: golden speaker [en_US]
dc.subject: automatic pronunciation assessment [en_US]
dc.title: 語音合成技術應用於電腦輔助發音訓練之研究 [zh_TW]
dc.title: A Study of Speech Synthesis Techniques for Computer-Assisted Pronunciation Training [en_US]
dc.type: 學術論文 (academic thesis)

Files

Original bundle

Name: 202400044809-107139.pdf
Size: 2.56 MB
Format: Adobe Portable Document Format
Description: 學術論文 (academic thesis)

Collections