Applying Discriminative Language Models to Reranking of M-best Speech Recognition Results

劉鳳萍

dc.contributor	陳柏琳	zh_TW
dc.contributor.author	劉鳳萍	zh_TW
dc.date.accessioned	2019-09-05T11:32:43Z
dc.date.available	2009-8-25
dc.date.available	2019-09-05T11:32:43Z
dc.date.issued	2009
dc.description.abstract	語言模型代表語言的規律性，在語音辨識中，它可用以減輕聲學特徵混淆所造成的問題，引導辨識器在多個候選字串中作搜尋，並量化辨識器產生的最終辨識結果字串的可接受度高低。然而，隨著時空及領域的不同，語言產生差異，固定不變的語言模型無法符合實際需求。語言模型調適提供了一個解決之道，使用少量同時期或同領域的調適語料對語言模型進行調整，以增進效能。鑑別式語言模型為語言模型調適方法之一，它首先取得一些特徵(Feature)，每一個特徵各有其對應之權重(Feature Weight)，以代表語言中的句子或字串，並以這些特徵及其相關權重為基礎，構建出一套評分機制，用以對基礎辨識器(Baseline Recognizer)所產生的多個辨識結果進行重新排序(Reranking)，以期最正確的詞序列可以成為最終辨識結果。本文提出以關鍵詞自動擷取方法所得結果，增加鑑別式語言模型之特徵。關鍵詞自動擷取方法是透過計算字或詞在語料庫中同時重複出現的次數以擷取出關鍵詞，其優點為可以在不依賴詞典(Lexicon)的情況下，擷取出新生詞彙或不存在詞典裡的語彙，這樣的特性也許會對鑑別式訓練有所助益，但實驗結果顯示未有顯著之改善效果。	zh_TW
dc.description.abstract	A language model (LM) represents the regularity of a given language. In speech recognition, it can be used to constrain the acoustic analysis, guide the search through multiple candidate word strings, and quantify the acceptability of the final word string output by the recognizer. However, the regularity of a language changes over time and across domains, so a static language model cannot meet practical demands. Language model adaptation offers a solution: a small amount of contemporaneous or in-domain data is used to adapt the original language model for better performance. The discriminative language model is one of the representative approaches to language model adaptation in speech recognition. It first derives a set of indicative features, each with its own weight, to characterize sentences or word strings in a language, and then builds a sentence scoring mechanism on the basis of these features and their associated weights. This mechanism is used to rerank the M-best recognition results produced by the baseline recognizer, in the expectation that the most correct candidate word string will be ranked first. This thesis proposes taking the results of a fast keyword extraction method as additional features for the discriminative model. The method extracts keywords by counting how often characters or words repeatedly co-occur in the speech corpus, so that the keywords may capture the regularity of the language being used. A useful property is that it extracts keywords without the need for a lexicon, so it can extract new keywords and keywords that do not exist in the lexicon or that contain words from it. This property may be useful for discriminative language modeling; however, experiments show that it yields no significant improvement.	en_US
dc.description.sponsorship	資訊工程學系	zh_TW
dc.identifier	GN0696470643
dc.identifier.uri	http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22GN0696470643%22.&%22.id.&
dc.identifier.uri	http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106754
dc.language	Chinese
dc.subject	鑑別式語言模型	zh_TW
dc.subject	語言模型調適	zh_TW
dc.subject	關鍵詞自動擷取方法	zh_TW
dc.subject	discriminative language model	en_US
dc.subject	language model adaptation	en_US
dc.subject	Boosting	en_US
dc.subject	Perceptron	en_US
dc.subject	Minimum Sample Risk	en_US
dc.subject	keyword extraction	en_US
dc.title	使用鑑別式語言模型於語音辨識結果重新排序	zh_TW
dc.title	Applying Discriminative Language Models to Reranking of M-best Speech Recognition Results	en_US
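
The reranking idea described in the abstract can be illustrated with a minimal sketch. The record lists Boosting, Perceptron, and Minimum Sample Risk as subjects; the example below uses a plain perceptron update only, and it is not the thesis's implementation. The unigram/bigram features, the function names (features, score, rerank, train_perceptron), the base_weight parameter, and the toy data format are all assumptions made for illustration.

```python
from collections import Counter


def features(words):
    """Count unigram and bigram features of one candidate word sequence.

    The choice of n-gram features is an assumption for this sketch.
    """
    feats = Counter(words)               # unigram counts
    feats.update(zip(words, words[1:]))  # bigram counts, keyed by tuples
    return feats


def score(cand, weights, base_weight=1.0):
    """Baseline recognizer score plus the weighted sum of feature counts."""
    words, base_score = cand
    feat_score = sum(weights.get(f, 0.0) * c
                     for f, c in features(words).items())
    return base_weight * base_score + feat_score


def rerank(cands, weights):
    """Return the highest-scoring candidate from an M-best list."""
    return max(cands, key=lambda c: score(c, weights))


def train_perceptron(data, epochs=5):
    """Learn feature weights with the standard perceptron update.

    data is a list of (candidates, oracle_index) pairs, where each
    candidate is (word_list, baseline_score) and oracle_index marks the
    candidate closest to the reference transcript (hypothetical format).
    """
    weights = {}
    for _ in range(epochs):
        for cands, oracle_idx in data:
            best = rerank(cands, weights)
            oracle = cands[oracle_idx]
            if best is not oracle:
                # Reward features of the oracle, penalize those of the
                # wrongly top-ranked candidate.
                for f, c in features(oracle[0]).items():
                    weights[f] = weights.get(f, 0.0) + c
                for f, c in features(best[0]).items():
                    weights[f] = weights.get(f, 0.0) - c
    return weights
```

Under these assumptions, keyword-based features such as those proposed in the thesis would simply be extra entries returned by features(), each receiving its own learned weight.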

Files

Original bundle

Name: n069647064301.pdf
Size: 1.07 MB
Format: Adobe Portable Document Format

Collections

學位論文