資料選取方法於鑑別式聲學模型訓練之研究 (Training Data Selection for Discriminative Training of Acoustic Models)

朱芳輝; Fang-Hui, Chu

dc.contributor	陳柏琳	zh_TW
dc.contributor	Berlin, Chen	en_US
dc.contributor.author	朱芳輝	zh_TW
dc.contributor.author	Fang-Hui, Chu	en_US
dc.date.accessioned	2019-09-05T11:25:35Z
dc.date.available	2008-2-25
dc.date.available	2019-09-05T11:25:35Z
dc.date.issued	2008
dc.description.abstract	本論文旨在研究使用各種訓練資料選取方法來改善以最小化音素錯誤為基礎的鑑別式聲學模型訓練，並應用於中文大詞彙連續語音辨識。首先，我們汲取Boosting演算法中強調被錯誤分類的訓練樣本之精神，修改最小化音素錯誤訓練中每一句訓練語句之統計值權重，以提高易傾向於被辨識錯誤的語句對於聲學模型訓練之貢獻。同時，我們透過多種方式來結合在不同訓練資料選取機制下所訓練出的多個聲學模型，進而降低語音辨識錯誤率。其次，我們亦提出一個基於訓練語句詞圖之期望音素正確率(Expected Phone Accuracy)定義域上的訓練資料選取方法，分別藉由在語句與音素段落兩種不同單位上的訓練資料選取，以提供最小化音素錯誤訓練更具鑑別資訊的訓練樣本。再者，我們嘗試結合本論文所提出的訓練資料選取方法及前人所提出以正規化熵值為基礎之音框層次訓練資料選取方法、以及音框音素正確率函數，冀以提升最小化音素錯誤訓練之成效。最後，本論文以公視新聞語料作為實驗平台，實驗結果初步驗證了本論文所提出方法之可行性。	zh_TW
dc.description.abstract	This thesis investigates training data selection approaches for improving minimum phone error (MPE) based discriminative training of acoustic models for Mandarin large vocabulary continuous speech recognition (LVCSR). First, inspired by the AdaBoost algorithm, which places more emphasis on training samples misclassified by the already-trained classifier, the statistics accumulated for each training utterance are reweighted during MPE training so that utterances prone to being misrecognized contribute more to acoustic model training. In addition, multiple speech recognition systems, whose acoustic models are trained under different training data selection criteria, are combined at different recognition stages to improve recognition accuracy. Second, a novel data selection approach operating in the expected phone accuracy domain of the word lattices of the training utterances is explored. It selects more discriminative training instances, at the level of either utterances or phone arcs, for MPE training. Third, this approach is integrated with a previously proposed frame-level data selection approach based on normalized entropy, and with a frame-level phone accuracy function, to further improve MPE training. All experiments were performed on the Mandarin broadcast news corpus (MATBN), and the results provide preliminary evidence of the feasibility of the proposed training data selection approaches.	en_US
dc.description.sponsorship	資訊工程學系	zh_TW
dc.identifier	GN0694470144
dc.identifier.uri	http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22GN0694470144%22.&%22.id.&
dc.identifier.uri	http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106666
dc.language	中文
dc.subject	資料選取	zh_TW
dc.subject	鑑別式訓練	zh_TW
dc.subject	聲學模型	zh_TW
dc.subject	語音辨識	zh_TW
dc.subject	Data Selection	en_US
dc.subject	Discriminative Training	en_US
dc.subject	Acoustic Models	en_US
dc.subject	Speech Recognition	en_US
dc.title	資料選取方法於鑑別式聲學模型訓練之研究	zh_TW
dc.title	Training Data Selection for Discriminative Training of Acoustic Models	en_US

Collections

學位論文 (Theses and Dissertations)
