英文連續語音辨識之初步研究

許庭瑋; TingWei Hsu

dc.contributor	陳柏琳	zh_TW
dc.contributor	Berlin Chen	en_US
dc.contributor.author	許庭瑋	zh_TW
dc.contributor.author	TingWei Hsu	en_US
dc.date.accessioned	2019-09-05T11:24:30Z
dc.date.available	2007-08-10
dc.date.available	2019-09-05T11:24:30Z
dc.date.issued	2007
dc.description.abstract	本論文為英文連續語音辨識之初步研究。我們實作英文連續語音辨識器，並探討其主要組成，包含語音特徵擷取、聲學模型及語言模型等。首先，針對語音特徵擷取，我們比較傳統式梅爾倒頻譜係數(Mel-frequency Cepstral Coefficients, MFCC)與線性鑑別分析(Linear Discriminant Analysis, LDA)和異質性線性鑑別分析(Heteroscedastic Linear Discriminant Analysis, HLDA)之效能。再者，針對聲學模型，我們探討詞內三連音素模型(Intra-word Triphone Models)、狀態連結(State-Tying)技術、音素模糊矩陣(Phone Confusion Matrix)與非監督式聲學模型訓練(Unsupervised Acoustic Model Training)的使用，以提升語音辨識率。最後，針對語言模型，在語音辨識過程中分別利用詞頻數混合法(Count Merging)與模型插補法(Model Interpolation)，結合背景與同領域語言模型訓練語料，以達到較佳之詞發生預測。本論文實驗是以美國之音與台灣腔英文語料為題材，並有一些初步的觀察及發現。	zh_TW
dc.description.abstract	This thesis presents a preliminary study on English continuous speech recognition. An English continuous speech recognizer was implemented, and its major components, including speech feature extraction, acoustic modeling, and language modeling, were extensively investigated. First, for speech feature extraction, we compared the performance of linear discriminant analysis (LDA) and heteroscedastic linear discriminant analysis (HLDA) with that of conventional Mel-frequency cepstral coefficients (MFCC). Second, for acoustic modeling, we explored the use of intra-word triphone models, the state-tying scheme, and the phone confusion matrix, as well as unsupervised training of acoustic models, to improve speech recognition results. Finally, for language modeling, the count-merging and model-interpolation approaches were each exploited to combine the background and in-domain language model training corpora, enabling better prediction of word occurrences during the speech recognition process. The experiments were conducted on the Voice of America (VOA) and English Across Taiwan (EAT) corpora.	en_US
dc.description.sponsorship	資訊工程學系	zh_TW
dc.identifier	GN0694470027
dc.identifier.uri	http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22GN0694470027%22.&%22.id.&
dc.identifier.uri	http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106657
dc.language	中文
dc.subject	連續語音辨識	zh_TW
dc.subject	詞內三連音素模型	zh_TW
dc.subject	狀態連結	zh_TW
dc.subject	音素模糊矩陣	zh_TW
dc.subject	Continuous Speech Recognition	en_US
dc.subject	Intra-word Triphone	en_US
dc.subject	State tying	en_US
dc.subject	Confusion Matrix	en_US
dc.title	英文連續語音辨識之初步研究	zh_TW
dc.title	An Initial Study on English Continuous Speech Recognition	en_US