A Study of Enhanced Rich Context Modeling Techniques for Mandarin Speech Synthesis (改善豐富文脈模型於中文語音合成之研究)

dc.contributor: 陳柏琳 [zh_TW]
dc.contributor.author: 陳黃威 [zh_TW]
dc.date.accessioned: 2019-09-05T11:20:16Z
dc.date.available: 2014-8-25
dc.date.available: 2019-09-05T11:20:16Z
dc.date.issued: 2014
dc.description.abstract: 本論文中,我們首先回顧三種不同的合成技術:串接式語音合成(Concatenative Speech Synthesis)、統計模型式語音合成(Statistical Model-Based Speech Synthesis)以及混合式語音合成(Hybrid-Based Speech Synthesis)。本論文以統計模型式語音合成做為主要研究方向,並介紹兩種技術:基於隱藏式馬可夫模型之語音合成(Hidden Markov Model-Based Speech Synthesis, HMM-Based Speech Synthesis)與使用豐富文脈模型(Rich Context Model-Based)之隱藏式馬可夫模型語音合成。本論文將上述兩種技術應用至中文語音合成當中,並將針對豐富文脈模型之語音合成進行改良,提出使用潛藏語意分析(Latent Semantic Analysis, LSA)分析出文脈(Context)的潛藏韻律,希望藉由其潛藏的韻律從訓練語料庫當中選擇韻律上相似的模型,以便獲得較為優良的起始語音參數向量序列(Initial Speech Parameter Vector Sequence),並使用語音參數產生演算法(Speech Parameter Generation Algorithm)來產生目標語句之語音參數向量序列,並用於實際合成。本論文實驗將使用新釋出的台北科技大學中文電子書語音資料庫(NTUT-AB01-CH)作為語音合成之訓練資料,實驗結果將以一系列的主觀與客觀測驗來評斷在統計式語音合成架構下,本論文所提出之方法與既有方法之長處。 [zh_TW]
dc.description.abstract: In this thesis, we first briefly review three mainstream frameworks for speech synthesis: concatenative speech synthesis, statistical model-based speech synthesis, and hybrid speech synthesis. We then focus on the statistical model-based framework, comparing two of its important instantiations, the hidden Markov model (HMM)-based method and the rich context model-based method, and applying both to Mandarin Chinese speech synthesis. In addition, we explore the use of latent semantic analysis (LSA) to discover both lexical and prosodic cues inherent in the contextual descriptions of training speech utterances, in the hope that these cues can then be used to obtain a good initialization for estimating the observation vector sequence of an utterance to be synthesized. A series of subjective and objective evaluations is conducted on the newly released NTUT-AB01-CH corpus to assess the relative merits of these methods within the statistical model-based framework. [en_US]
dc.description.sponsorship: Department of Computer Science and Information Engineering (資訊工程學系) [zh_TW]
dc.identifier: GN060147059S
dc.identifier.uri: http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22GN060147059S%22.&%22.id.&
dc.identifier.uri: http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106617
dc.language: Chinese (中文)
dc.subject: 基於隱藏式馬可夫模型之語音合成 [zh_TW]
dc.subject: 豐富文脈模型之語音合成 [zh_TW]
dc.subject: 起始語音參數序列 [zh_TW]
dc.subject: 潛藏語意分析 [zh_TW]
dc.subject: 空間向量模型 [zh_TW]
dc.subject: Hidden Markov Model Based Speech Synthesis [en_US]
dc.subject: Rich Context Models Based Speech Synthesis [en_US]
dc.subject: Initial Speech Parameter Sequence [en_US]
dc.subject: Latent Semantic Analysis [en_US]
dc.subject: Vector Space Model [en_US]
dc.title: 改善豐富文脈模型於中文語音合成之研究 [zh_TW]
dc.title: A Study of Enhanced Rich Context Modeling Techniques for Mandarin Speech Synthesis [en_US]

Files

Original bundle

Name: n060147059s01.pdf
Size: 2.15 MB
Format: Adobe Portable Document Format
