Mispronunciation Detection Using Evaluation-Metric-Related Training Criteria
Date
2016
Abstract
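Mispronunciation detection and mispronunciation diagnosis are components of computer-assisted pronunciation training systems; they help second-language learners accurately locate the mispronounced segments of an utterance and thereby improve their speaking proficiency. This thesis continues earlier research, and its contributions fall into three main points: 1) We estimate the parameters of the deep neural network acoustic model, as well as the parameters of the pronunciation detection decision function, by optimizing a training criterion related to the evaluation metric. 2) We find that, once the acoustic model has been trained with our method, performance on the subsequent mispronunciation diagnosis task also improves. 3) We treat mispronunciation diagnosis as a classification task and use information-rich features proposed in earlier work to improve diagnosis. A series of experiments is conducted on Mandarin mispronunciation detection and diagnosis tasks, from which the advantages of the proposed methods can be observed.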
Mispronunciation detection and diagnosis are integral parts of a computer-assisted pronunciation training (CAPT) system; together they help second-language (L2) learners pinpoint erroneous pronunciations in a given utterance so as to improve their spoken proficiency. This thesis continues this general line of research, and its major contributions are three-fold. First, we propose a training approach that estimates the deep neural network (DNN) based acoustic models involved in the mispronunciation detection process by optimizing an objective directly linked to the ultimate evaluation metric. Second, we investigate the extent to which the subsequent mispronunciation diagnosis can benefit from these specifically trained acoustic models. Third, we recast mispronunciation diagnosis as a classification problem and leverage a rich set of features to make this formulation effective. A series of experiments on a Mandarin mispronunciation detection and diagnosis task indicates the performance merits of the proposed methods.
Keywords
computer-assisted pronunciation training, mispronunciation detection, mispronunciation diagnosis, acoustic models, deep neural networks