Improved Neural Network Based Acoustic Modeling Leveraging Multi-task Learning and Ensemble Learning for Meeting Speech Recognition
dc.contributor | 陳柏琳 | zh_TW |
dc.contributor | Chen, Berlin | en_US |
dc.contributor.author | 楊明翰 | zh_TW |
dc.contributor.author | Yang, Ming-Han | en_US |
dc.date.accessioned | 2019-09-05T11:12:26Z | |
dc.date.available | 2016-08-31 | |
dc.date.available | 2019-09-05T11:12:26Z | |
dc.date.issued | 2016 | |
dc.description.abstract | 本論文旨在研究如何融合多任務學習(multi-task learning, MTL)與整體學習(ensemble learning)技術於聲學模型之參數估測,藉以改善會議語音辨識(meeting speech recognition)之準確性。我們的貢獻主要有三點:1)我們進行了實證研究以充分利用各種輔助任務來加強多任務學習在會議語音辨識的表現。此外,我們還研究多任務與不同聲學模型像是深層類神經網路(deep neural networks, DNN)聲學模型及摺積神經網路(convolutional neural networks, CNN)結合的協同效應,期望增加聲學模型建模之一般化能力(generalization capability)。2)由於訓練多任務聲學模型的過程中,調整不同輔助任務之貢獻(權重)的方式並不是最佳的,因此我們提出了重新調適法,以減輕這個問題。3)我們對整體學習技術進行研究,有系統地整合多任務學習所培訓的各種聲學模型(weak learner)。我們基於歐盟所錄製的擴增多方互動會議語料(augmented multi-party interaction, AMI)及在台灣所收錄的華語會議語料庫(Mandarin meeting recording corpus, MMRC)建立了一系列的實驗。與數種現有的基礎實驗相比,實驗結果揭示了我們所提出的方法之有效性。 | zh_TW |
dc.description.abstract | This thesis explores the use of multi-task learning (MTL) and ensemble learning techniques for estimating the parameters of neural network based acoustic models, with the goal of improving the accuracy of meeting speech recognition. Our contributions are three-fold. First, we conduct an empirical study of how various auxiliary tasks can be leveraged to enhance the performance of multi-task learning on meeting speech recognition. We also study the synergistic effect of combining multi-task learning with different acoustic models, such as deep neural network (DNN) and convolutional neural network (CNN) based acoustic models, with the aim of improving the generalization ability of acoustic modeling. Second, because the way the contributions (weights) of different auxiliary tasks are set during acoustic model training is far from optimal and is largely a matter of heuristic judgment, we propose a simple model adaptation method to alleviate this problem. Third, we investigate an ensemble learning method that systematically integrates the various acoustic models (weak learners) trained with multi-task learning. A series of experiments was carried out on the augmented multi-party interaction (AMI) corpus recorded in the EU and the Mandarin meeting recording corpus (MMRC) collected in Taiwan. The results indicate that the proposed methods are effective in comparison with several existing baselines. | en_US |
dc.description.sponsorship | 資訊工程學系 | zh_TW |
dc.identifier | G060347034S | |
dc.identifier.uri | http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22G060347034S%22.&%22.id.& | |
dc.identifier.uri | http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106414 | |
dc.language | Chinese | |
dc.subject | 多任務學習 | zh_TW |
dc.subject | 整體學習 | zh_TW |
dc.subject | 深層學習 | zh_TW |
dc.subject | 類神經網路 | zh_TW |
dc.subject | 會議語音辨識 | zh_TW |
dc.subject | multi-task learning | en_US |
dc.subject | ensemble learning | en_US |
dc.subject | deep learning | en_US |
dc.subject | neural network | en_US |
dc.subject | meeting speech recognition | en_US |
dc.title | 改善類神經網路聲學模型經由結合多任務學習與整體學習於會議語音辨識之研究 | zh_TW |
dc.title | Improved Neural Network Based Acoustic Modeling Leveraging Multi-task Learning and Ensemble Learning for Meeting Speech Recognition | en_US |
Files
Original bundle
- Name: 060347034s01.pdf
- Size: 3.22 MB
- Format: Adobe Portable Document Format
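
Illustrative sketch (not taken from the thesis): the abstract describes weighting the contributions of auxiliary tasks during multi-task training and combining several acoustic models in an ensemble. The Python below is a minimal sketch of those two ideas under stated assumptions. The function names, the additive weighted-loss form, and the use of plain posterior averaging as the ensemble rule are all assumptions made for illustration; the record does not say which loss formulation or ensemble method the thesis actually uses.

```python
import math


def cross_entropy(posteriors, target):
    """Negative log-probability assigned to the target class for one frame."""
    return -math.log(posteriors[target])


def multitask_loss(primary_loss, auxiliary_losses, auxiliary_weights):
    """Primary-task loss plus a weighted sum of auxiliary-task losses.

    Assumed form: L = L_primary + sum_k w_k * L_k. The weights w_k are the
    hand-tuned quantities the abstract calls a matter of heuristic judgment.
    """
    if len(auxiliary_losses) != len(auxiliary_weights):
        raise ValueError("need exactly one weight per auxiliary loss")
    weighted = sum(w * l for w, l in zip(auxiliary_weights, auxiliary_losses))
    return primary_loss + weighted


def average_posteriors(model_posteriors):
    """Combine models by averaging their class posteriors for one frame.

    Assumption: simple averaging stands in for whatever ensemble rule the
    thesis applies to its multi-task-trained models.
    """
    n_models = len(model_posteriors)
    n_classes = len(model_posteriors[0])
    return [
        sum(p[c] for p in model_posteriors) / n_models
        for c in range(n_classes)
    ]
```

For example, `multitask_loss(1.0, [2.0, 4.0], [0.5, 0.25])` adds the primary loss to two down-weighted auxiliary losses, and `average_posteriors` takes one posterior vector per model for the same frame and returns their element-wise mean.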