A Study of Enhanced Rich Context Modeling Techniques for Mandarin Speech Synthesis

陳黃威

dc.contributor	陳柏琳	zh_TW
dc.contributor.author	陳黃威	zh_TW
dc.date.accessioned	2019-09-05T11:20:16Z
dc.date.available	2014-08-25
dc.date.available	2019-09-05T11:20:16Z
dc.date.issued	2014
dc.description.abstract	本論文中，我們首先回顧三種不同的合成技術：串接式語音合成(Concatenative Speech Synthesis)、統計模型式語音合成(Statistical Model-Based Speech Synthesis)以及混合式語音合成(Hybrid-Based Speech Synthesis)。本論文以統計模型式語音合成做為主要研究方向，並介紹兩種技術：基於隱藏式馬可夫模型之語音合成(Hidden Markov Model-Based Speech Synthesis, HMM-Based Speech Synthesis)與使用豐富文脈模型(Rich Context Model-Based)之隱藏式馬可夫模型語音合成。本論文將上述兩種技術應用至中文語音合成當中，並針對豐富文脈模型之語音合成進行改良，提出使用潛藏語意分析(Latent Semantic Analysis, LSA)分析出文脈(Context)的潛藏韻律，希望藉由其潛藏的韻律從訓練語料庫當中選擇韻律上相似的模型，以便獲得較為優良的起始語音參數向量序列(Initial Speech Parameter Vector Sequence)，並使用語音參數產生演算法(Speech Parameter Generation Algorithm)來產生目標語句之語音參數向量序列，用於實際合成。本論文實驗將使用新釋出的台北科技大學中文電子書語音資料庫(NTUT-AB01-CH)作為語音合成之訓練資料，實驗結果將以一系列的主觀與客觀測驗來評斷在統計式語音合成架構下，本論文所提出之方法與既有方法之長處。	zh_TW
dc.description.abstract	In this thesis, we first briefly review three mainstream frameworks for speech synthesis: concatenative speech synthesis, statistical model-based speech synthesis, and hybrid-based speech synthesis. We then focus on two important instantiations of the statistical model-based framework, the hidden Markov model-based method and the rich context model-based method, and compare their application to Mandarin Chinese speech synthesis. In addition, we explore the use of latent semantic analysis (LSA) to discover lexical and prosodic cues inherent in the contextual descriptions of training speech utterances, in the hope that these cues can be used to obtain a good initialization for estimating the observation vector sequence of an utterance to be synthesized. A series of subjective and objective evaluations is conducted on the newly released NTUT-AB01-CH corpus to assess the relative merits of these methods within the statistical model-based framework.	en_US
dc.description.sponsorship	資訊工程學系	zh_TW
dc.identifier	GN060147059S
dc.identifier.uri	http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22GN060147059S%22.&%22.id.&
dc.identifier.uri	http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106617
dc.language	中文
dc.subject	基於隱藏式馬可夫模型之語音合成	zh_TW
dc.subject	豐富文脈模型之語音合成	zh_TW
dc.subject	起始語音參數序列	zh_TW
dc.subject	潛藏語意分析	zh_TW
dc.subject	空間向量模型	zh_TW
dc.subject	Hidden Markov Model Based Speech Synthesis	en_US
dc.subject	Rich Context Models Based Speech Synthesis	en_US
dc.subject	Initial Speech Parameter Sequence	en_US
dc.subject	Latent Semantic Analysis	en_US
dc.subject	Vector Space Model	en_US
dc.title	改善豐富文脈模型於中文語音合成之研究	zh_TW
dc.title	A Study of Enhanced Rich Context Modeling Techniques for Mandarin Speech Synthesis	en_US

Files

Original bundle

Name: n060147059s01.pdf
Size: 2.15 MB
Format: Adobe Portable Document Format

Collections

學位論文