An Initial Study on English Continuous Speech Recognition (英文連續語音辨識之初步研究)

dc.contributor: 陳柏琳 [zh_TW]
dc.contributor: Berlin Chen [en_US]
dc.contributor.author: 許庭瑋 [zh_TW]
dc.contributor.author: TingWei Hsu [en_US]
dc.date.accessioned: 2019-09-05T11:24:30Z
dc.date.available: 2007-8-10
dc.date.available: 2019-09-05T11:24:30Z
dc.date.issued: 2007
dc.description.abstract: 本論文為英文連續語音辨識之初步研究。我們實作英文連續語音辨識器,並探討其主要組成,包含語音特徵擷取、聲學模型及語言模型等。首先,針對語音特徵擷取,我們比較傳統式梅爾倒頻譜係數(Mel-frequency Cepstral Coefficients, MFCC)與線性鑑別分析(Linear Discriminant Analysis, LDA)和異質性線性鑑別分析(Heteroscedastic Linear Discriminant Analysis, HLDA)之效能。再者,針對聲學模型,我們探討詞內三連音素模型(Intra-word Triphone Models)、狀態連結(State-Tying)技術、音素模糊矩陣(Phone Confusion Matrix)與非監督式聲學模型訓練(Unsupervised Acoustic Model Training)的使用,以提升語音辨識率。最後,針對語言模型,在語音辨識過程中分別利用詞頻數混合法(Count Merging)與模型插補法(Model Interpolation),結合背景與同領域語言模型訓練語料,以達到較佳之詞發生預測。本論文實驗是以美國之音與台灣腔英文語料為題材,並有一些初步的觀察及發現。 [zh_TW]
dc.description.abstract: This thesis presents a preliminary study of English continuous speech recognition. An English continuous speech recognizer was implemented, and its major constituents, including speech feature extraction, acoustic modeling and language modeling, were extensively investigated. First, for speech feature extraction, we compared the performance of linear discriminant analysis (LDA) and heteroscedastic linear discriminant analysis (HLDA) with that of the conventional Mel-frequency cepstral coefficients (MFCC). Second, for acoustic modeling, we explored the use of intra-word triphone models, the state-tying scheme and the phone confusion matrix, as well as the unsupervised training of acoustic models, to improve speech recognition results. Finally, for language modeling, the count-merging and model-interpolation approaches were each exploited to combine the background and in-domain language model training corpora, enabling better prediction of word occurrences during the speech recognition process. The experiments were conducted on the Voice of America (VOA) and the English Across Taiwan (EAT) corpora. [en_US]
dc.description.sponsorship: 資訊工程學系 (Department of Computer Science and Information Engineering) [zh_TW]
dc.identifier: GN0694470027
dc.identifier.uri: http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22GN0694470027%22.&%22.id.&
dc.identifier.uri: http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106657
dc.language: 中文 (Chinese)
dc.subject: 連續語音辨識 [zh_TW]
dc.subject: 詞內三連音素模型 [zh_TW]
dc.subject: 狀態連結 [zh_TW]
dc.subject: 音素模糊矩陣 [zh_TW]
dc.subject: Continuous Speech Recognition [en_US]
dc.subject: Intra Triphone [en_US]
dc.subject: State tying [en_US]
dc.subject: Confusion Matrix [en_US]
dc.title: 英文連續語音辨識之初步研究 [zh_TW]
dc.title: An Initial Study on English Continuous Speech Recognition [en_US]

Files

Original bundle

Now showing 1 - 5 of 6
Name: n069447002701.pdf, Size: 188.26 KB, Format: Adobe Portable Document Format
Name: n069447002702.pdf, Size: 194.4 KB, Format: Adobe Portable Document Format
Name: n069447002703.pdf, Size: 239.24 KB, Format: Adobe Portable Document Format
Name: n069447002704.pdf, Size: 112.62 KB, Format: Adobe Portable Document Format
Name: n069447002705.pdf, Size: 249.71 KB, Format: Adobe Portable Document Format
