資料選取方法於鑑別式聲學模型訓練之研究 (Training Data Selection for Discriminative Training of Acoustic Models)

dc.contributor: 陳柏琳 [zh_TW]
dc.contributor: Berlin, Chen [en_US]
dc.contributor.author: 朱芳輝 [zh_TW]
dc.contributor.author: Fang-Hui, Chu [en_US]
dc.date.accessioned: 2019-09-05T11:25:35Z
dc.date.available: 2008-2-25
dc.date.available: 2019-09-05T11:25:35Z
dc.date.issued: 2008
dc.description.abstract: 本論文旨在研究使用各種訓練資料選取方法來改善以最小化音素錯誤為基礎的鑑別式聲學模型訓練,並應用於中文大詞彙連續語音辨識。首先,我們汲取Boosting演算法中強調被錯誤分類的訓練樣本之精神,修改最小化音素錯誤訓練中每一句訓練語句之統計值權重,以提高易傾向於被辨識錯誤的語句對於聲學模型訓練之貢獻。同時,我們透過多種方式來結合在不同訓練資料選取機制下所訓練出的多個聲學模型,進而降低語音辨識錯誤率。其次,我們亦提出一個基於訓練語句詞圖之期望音素正確率(Expected Phone Accuracy)定義域上的訓練資料選取方法,分別藉由在語句與音素段落兩種不同單位上的訓練資料選取,以提供最小化音素錯誤訓練更具鑑別資訊的訓練樣本。再者,我們嘗試結合本論文所提出的訓練資料選取方法及前人所提出以正規化熵值為基礎之音框層次訓練資料選取方法、以及音框音素正確率函數,冀以提升最小化音素錯誤訓練之成效。最後,本論文以公視新聞語料作為實驗平台,實驗結果初步驗證了本論文所提出方法之可行性。 [zh_TW]
dc.description.abstract: This thesis investigates training data selection approaches for improving minimum phone error (MPE) based discriminative training of acoustic models for Mandarin large vocabulary continuous speech recognition (LVCSR). First, inspired by the AdaBoost algorithm, which places more emphasis on the training samples misclassified by the already-trained classifier, the weights of the statistics accumulated for each training utterance during MPE training are adjusted so that utterances prone to being incorrectly recognized contribute more to acoustic model training. Meanwhile, multiple speech recognition systems whose acoustic models were trained under different training data selection criteria are combined at different recognition stages to improve recognition accuracy. Second, a novel data selection approach operating on the expected phone accuracy domain of the word lattices of training utterances is explored. It selects more discriminative training instances, at the level of either utterances or phone arcs, for better model discrimination. Moreover, this approach is further integrated with a previously proposed frame-level data selection approach based on normalized entropy, and with a frame-level phone accuracy function, to improve MPE training. All experiments were performed on the Mandarin broadcast news corpus (MATBN), and the results provide preliminary evidence for the feasibility of the proposed training data selection approaches. [en_US]
dc.description.sponsorship: 資訊工程學系 (Department of Computer Science and Information Engineering) [zh_TW]
dc.identifier: GN0694470144
dc.identifier.uri: http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22GN0694470144%22.&%22.id.&
dc.identifier.uri: http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106666
dc.language: 中文 (Chinese)
dc.subject: 資料選取 [zh_TW]
dc.subject: 鑑別式訓練 [zh_TW]
dc.subject: 聲學模型 [zh_TW]
dc.subject: 語音辨識 [zh_TW]
dc.subject: Data Selection [en_US]
dc.subject: Discriminative Training [en_US]
dc.subject: Acoustic Models [en_US]
dc.subject: Speech Recognition [en_US]
dc.title: 資料選取方法於鑑別式聲學模型訓練之研究 [zh_TW]
dc.title: Training Data Selection for Discriminative Training of Acoustic Models [en_US]
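
Note on the abstract: the normalized-entropy frame-level data selection it mentions can be illustrated with a minimal sketch. This record does not give the thesis's exact formulation, so everything below is an assumption for illustration only: the function names `normalized_entropy` and `select_frames`, the use of per-frame posterior lists as input, the direction of the comparison, and the threshold value 0.5 are all hypothetical. The idea shown is only that a frame whose posterior distribution over competing hypotheses is close to uniform (normalized entropy near 1) is more ambiguous, and may therefore be more informative for discriminative training, than a frame where one hypothesis dominates.

```python
import math


def normalized_entropy(posteriors):
    """Entropy of a posterior distribution divided by log(N).

    `posteriors` is a list of N probabilities summing to 1. The result lies
    in [0, 1]: 0 for a one-hot distribution, 1 for a uniform one.
    """
    n = len(posteriors)
    if n < 2:
        # A single hypothesis carries no uncertainty.
        return 0.0
    entropy = -sum(p * math.log(p) for p in posteriors if p > 0.0)
    return entropy / math.log(n)


def select_frames(frame_posteriors, threshold=0.5):
    """Return indices of frames whose normalized entropy is >= threshold.

    The threshold is a hypothetical tuning parameter, not a value taken
    from the thesis.
    """
    return [
        t
        for t, posteriors in enumerate(frame_posteriors)
        if normalized_entropy(posteriors) >= threshold
    ]


if __name__ == "__main__":
    # Three toy frames, each with posteriors over two competing hypotheses.
    frames = [[1.0, 0.0], [0.5, 0.5], [0.9, 0.1]]
    print(select_frames(frames))
```

In this toy input only the middle frame is maximally ambiguous, so it is the one retained; the confident frames are skipped under the assumed threshold.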
