A Noise-Classification-Based Deep Learning Noise Reduction Method to Improve Speech Intelligibility for Cochlear Implant Users
Abstract
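The cochlear implant (CI) is currently the only technology that can help profoundly deaf patients hear sound again. Previous studies have shown that CIs effectively improve patients' speech understanding in quiet communication environments. In noisy environments, however, their benefit still leaves considerable room for improvement, and more effective signal processing is needed to raise user satisfaction. Recently, a noise reduction (NR) method based on deep learning, the deep denoising autoencoder (DDAE), was proposed. That work showed that DDAE-based NR significantly improves speech intelligibility in CI simulation tests. For real CI users, however, there is as yet no research evidence of the benefit of DDAE. This thesis therefore improves on the DDAE NR method and proposes a new NR method called noise classification + DDAE (NC+DDAE). The proposed method is also clinically validated with real CI users. Objective acoustic and electrical measures and speech recognition tests show that, in noisy environments, NC+DDAE yields better speech intelligibility than two common conventional NR methods (logMMSE and KLT), especially when the noise is known. More specifically, when the noise condition is known, NC+DDAE improves speech intelligibility by up to 41.5% over the other methods across the test conditions; when the noise condition is unknown, it improves speech intelligibility by up to 17.5% over the other methods. These results show that the NC+DDAE NR method proposed in this thesis can effectively improve the listening benefit of CI users in noisy conditions.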
The cochlear implant (CI) is currently the only technology that can help profoundly deaf individuals hear sound again. Previous studies demonstrate that CI technology has enabled many CI users to enjoy a high level of speech understanding in quiet; however, for most CI users, listening under noisy conditions remains challenging, and more effective signal processing is needed to overcome this issue. More recently, a deep learning-based noise reduction (NR) approach, called the deep denoising autoencoder (DDAE), has been proposed and confirmed to be effective in various NR tasks. In addition, a previous study indicated that the DDAE-based NR approach yielded higher intelligibility scores than those obtained with conventional NR techniques in CI simulation; however, the efficacy of the DDAE NR approach for real CI recipients remains unevaluated. In view of this, this study evaluates the performance of DDAE-based NR in real CI subjects. In addition, a new DDAE-based NR model, called noise classification + DDAE (NC+DDAE), is proposed in this study to further improve intelligibility for CI users. The results of the objective evaluation and listening tests indicate that, under challenging listening conditions, the proposed NC+DDAE NR approach yields higher intelligibility scores than two classical NR techniques (i.e., logMMSE and KLT), especially under matched training conditions. More specifically, NC+DDAE improves speech recognition by up to 41.5% when the test noise was included in the training phase and by up to 17.5% when it was not. The present study demonstrates that, under challenging listening conditions, the proposed NC+DDAE NR approach improves speech recognition more effectively than conventional NR techniques. Furthermore, the results show that NC+DDAE has superior noise suppression capabilities and introduces less distortion of speech envelope information for Mandarin CI recipients than conventional techniques.
Therefore, the proposed NC+DDAE NR approach can potentially be integrated into existing CI processors to overcome the degradation of speech perception caused by noise.
Keywords
cochlear implant, noise reduction, deep denoising autoencoder (DDAE), noise classification, noise classifier, deep learning