Improved discriminative training for Mandarin continuous speech recognition (改善鑑別式聲學模型訓練於中文連續語音辨識之研究)

dc.contributor 陳柏琳教授 zh_TW
dc.contributor Berlin Chen en_US
dc.contributor.author 劉士弘 zh_TW
dc.contributor.author Shih-Hung Liu en_US
dc.date.accessioned 2019-09-05T11:23:36Z
dc.date.available 2007-7-18
dc.date.available 2019-09-05T11:23:36Z
dc.date.issued 2007
dc.description.abstract 本論文探討改善鑑別式聲學模型於中文大詞彙連續語音辨識之研究。首先,本論文提出一個新的時間音框層次音素正確率函數來取代最小化音素錯誤訓練的原始音素正確率函數,此新的音素正確率函數在某種程度上能充分地懲罰刪除錯誤。其次,本論文提出一個新的以時間音框層次正規化熵值為基礎的資料選取方法來改進鑑別式訓練,其正規化熵值是由訓練語料所產生之詞圖中高斯分布之事後機率所求得。此資料選取方法可以讓鑑別式訓練更集中在那些離決定邊界較近的訓練樣本所收集的統計值,以達到較佳的鑑別力。此資料選取方法更進一步地應用到非監督鑑別式聲學模型訓練上。最後,本論文也嘗試修改鑑別式訓練的目標函數,以收集不同的統計值來改進最小化音素錯誤鑑別式訓練。所使用的實驗題材是公視新聞語料。由初步的實驗結果來看,結合時間音框層次的資料選取方法和新的音素正確率函數在前幾次的迭代訓練中確實有些微且一致的進步。 zh_TW
dc.description.abstract This thesis investigates improved discriminative training of acoustic models for Mandarin large vocabulary continuous speech recognition (LVCSR). First, we present a new phone accuracy function based on the frame-level accuracy of hypothesized phone arcs, which replaces the raw phone accuracy function of minimum phone error (MPE) training and can, to some extent, adequately penalize deletion errors. Second, we explore a novel data selection approach for discriminative training, based on the normalized frame-level entropy of Gaussian posterior probabilities obtained from the word lattice of each training utterance. It makes the training algorithm focus on the training statistics of frame samples that lie close to the decision boundary, for better discrimination. The proposed data selection approach is further applied to unsupervised discriminative training of acoustic models. Finally, we investigate a few other modifications of the training objective functions, as well as of the lattice structures, for the accumulation of MPE training statistics. Experiments on the Mandarin broadcast news corpus (MATBN) collected in Taiwan show that combining the frame-level data selection with the new phone accuracy function achieves slight but consistent improvements over conventional MPE training in the early training iterations. en_US
dc.description.sponsorship 資訊工程學系 zh_TW
dc.identifier GN0693470185
dc.identifier.uri http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22GN0693470185%22.&%22.id.&
dc.identifier.uri http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106651
dc.language Chinese
dc.subject 鑑別式聲學模型訓練 zh_TW
dc.subject 大詞彙連續語音辨識 zh_TW
dc.subject 時間音框正確率函數 zh_TW
dc.subject 資料選取 zh_TW
dc.subject Discriminative training en_US
dc.subject Large vocabulary continuous speech recognition en_US
dc.subject time frame accuracy function en_US
dc.subject data selection en_US
dc.title 改善鑑別式聲學模型訓練於中文連續語音辨識之研究 zh_TW
dc.title Improved discriminative training for Mandarin continuous speech recognition en_US
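
The frame-level data selection described in the abstract can be illustrated with a minimal sketch. It assumes that each frame comes with a list of Gaussian posterior probabilities taken from the lattice, and that a frame is kept when its normalized entropy reaches a threshold. The function names, the normalization by log N, and the threshold value of 0.5 are illustrative assumptions, not details taken from the thesis.

```python
import math


def normalized_entropy(posteriors):
    """Entropy of a posterior distribution, scaled to the range [0, 1].

    Assumption: normalization divides by log(N), where N is the number
    of Gaussians active in the frame. The thesis may define it differently.
    """
    n = len(posteriors)
    total = sum(posteriors)
    if n < 2 or total <= 0.0:
        # A single candidate (or no probability mass) carries no uncertainty.
        return 0.0
    entropy = 0.0
    for p in posteriors:
        q = p / total  # renormalize so the values sum to 1
        if q > 0.0:
            entropy -= q * math.log(q)
    return entropy / math.log(n)


def select_frames(frame_posteriors, threshold=0.5):
    """Return indices of frames whose normalized entropy is >= threshold.

    High entropy means the posteriors are spread over several Gaussians,
    i.e. the frame is likely to lie near a decision boundary.
    The threshold value is a hypothetical choice for illustration.
    """
    return [
        t
        for t, posteriors in enumerate(frame_posteriors)
        if normalized_entropy(posteriors) >= threshold
    ]


if __name__ == "__main__":
    # Toy posteriors for four frames (made-up numbers, not experimental data).
    frames = [
        [0.98, 0.01, 0.01],  # confident frame, low entropy
        [0.40, 0.35, 0.25],  # ambiguous frame, high entropy
        [1.0],               # only one Gaussian survives pruning
        [0.5, 0.5],          # maximally ambiguous between two Gaussians
    ]
    print(select_frames(frames))
```

In this toy input only the two ambiguous frames pass the threshold, so training statistics would be accumulated from them alone; in practice the threshold would need tuning on held-out data.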

Files

Original bundle

Now showing 1 - 5 of 7
Name: n069347018501.pdf
Size: 262.01 KB
Format: Adobe Portable Document Format

Name: n069347018502.pdf
Size: 294.5 KB
Format: Adobe Portable Document Format

Name: n069347018503.pdf
Size: 191.03 KB
Format: Adobe Portable Document Format

Name: n069347018504.pdf
Size: 203.14 KB
Format: Adobe Portable Document Format

Name: n069347018505.pdf
Size: 357.09 KB
Format: Adobe Portable Document Format

Collections