Improved Neural Network Based Acoustic Modeling Leveraging Multi-task Learning and Ensemble Learning for Meeting Speech Recognition

楊明翰; Yang, Ming-Han

dc.contributor	陳柏琳	zh_TW
dc.contributor	Chen, Berlin	en_US
dc.contributor.author	楊明翰	zh_TW
dc.contributor.author	Yang, Ming-Han	en_US
dc.date.accessioned	2019-09-05T11:12:26Z
dc.date.available	2016-08-31
dc.date.available	2019-09-05T11:12:26Z
dc.date.issued	2016
dc.description.abstract	本論文旨在研究如何融合多任務學習(multi-task learning, MTL)與整體學習(ensemble learning)技術於聲學模型之參數估測，藉以改善會議語音辨識(meeting speech recognition)之準確性。我們的貢獻主要有三點：1)我們進行了實證研究以充分利用各種輔助任務來加強多任務學習在會議語音辨識的表現。此外，我們還研究多任務與不同聲學模型像是深層類神經網路(deep neural networks, DNN)聲學模型及摺積神經網路(convolutional neural networks, CNN)結合的協同效應，期望增加聲學模型建模之一般化能力(generalization capability)。2)由於訓練多任務聲學模型的過程中，調整不同輔助任務之貢獻(權重)的方式並不是最佳的，因此我們提出了重新調適法，以減輕這個問題。3)我們對整體學習技術進行研究，有系統地整合多任務學習所培訓的各種聲學模型(weak learner)。我們基於歐盟所錄製的擴增多方互動會議語料(augmented multi-party interaction, AMI)及在台灣所收錄的華語會議語料庫(Mandarin meeting recording corpus, MMRC)建立了一系列的實驗。與數種現有的基礎實驗相比，實驗結果揭示了我們所提出的方法之有效性。	zh_TW
dc.description.abstract	This thesis explores the use of multi-task learning (MTL) and ensemble learning techniques for more accurate estimation of the parameters of neural network based acoustic models, so as to improve the accuracy of meeting speech recognition. Our main contributions are three-fold. First, we conduct an empirical study that leverages various auxiliary tasks to enhance the performance of multi-task learning on meeting speech recognition. We also study the synergistic effect of combining multi-task learning with disparate acoustic models, such as deep neural network (DNN) and convolutional neural network (CNN) based acoustic models, with the aim of increasing the generalization ability of acoustic modeling. Second, because the way the contributions (weights) of different auxiliary tasks are modulated during acoustic model training is far from optimal and is in practice a matter of heuristic judgment, we propose a simple model adaptation method to alleviate this problem. Third, we investigate an ensemble learning method that systematically integrates the various acoustic models (weak learners) trained with multi-task learning. A series of experiments has been carried out on the augmented multi-party interaction (AMI) and Mandarin meeting recording (MMRC) corpora; the results suggest that the proposed methods are effective relative to several existing baselines.	en_US
dc.description.sponsorship	資訊工程學系	zh_TW
dc.identifier	G060347034S
dc.identifier.uri	http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22G060347034S%22.&%22.id.&
dc.identifier.uri	http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106414
dc.language	Chinese
dc.subject	多任務學習	zh_TW
dc.subject	整體學習	zh_TW
dc.subject	深層學習	zh_TW
dc.subject	類神經網路	zh_TW
dc.subject	會議語音辨識	zh_TW
dc.subject	multi-task learning	en_US
dc.subject	ensemble learning	en_US
dc.subject	deep learning	en_US
dc.subject	neural network	en_US
dc.subject	meeting speech recognition	en_US
dc.title	改善類神經網路聲學模型經由結合多任務學習與整體學習於會議語音辨識之研究	zh_TW
dc.title	Improved Neural Network Based Acoustic Modeling Leveraging Multi-task Learning and Ensemble Learning for Meeting Speech Recognition	en_US

Files

Original bundle

Name: 060347034s01.pdf
Size: 3.22 MB
Format: Adobe Portable Document Format

Collections

Theses and Dissertations