探索虛擬關聯回饋技術和鄰近資訊於語音文件檢索與辨識之改進

陳憶文

dc.contributor	陳柏琳	zh_TW
dc.contributor	Berlin Chen	en_US
dc.contributor.author	陳憶文	zh_TW
dc.date.accessioned	2019-09-05T11:44:33Z
dc.date.available	2013-08-15
dc.date.available	2019-09-05T11:44:33Z
dc.date.issued	2013
dc.description.abstract	虛擬關聯回饋(Pseudo-Relevance Feedback)為目前最常見的查詢重建(Query Reformulation)典範。它假設預檢索(Initial-round of Retrieval)排名前端的文件都是相關的，所以可全用於查詢擴展(Query Expansion)。然而，預檢索所獲得的文件中，極可能同時包含重複性(Redundant)資訊和非關聯(Non-relevant)資訊，使得重新建立的查詢不能有良好檢索效能。有鑑於此，本論文探討運用不同資訊以在預檢索獲得的語音文件中挑選適當的關聯文件來建立查詢表示，讓語音文件檢索結果可以更準確。另一方面，關聯模型(Relevance Model)雖然可藉由詞袋(Bag-of-words)假設來簡化模型推導和估測，卻可能因此過度簡化問題，特別是用於語音辨識的語言模型。為了調適關聯模型，本論文有兩個貢獻。其一，本論文提出詞鄰近資訊使用於關聯模型以改善詞袋(Bag-of-words)假設於語音辨識的不適。其二，本論文也進一步探討主題鄰近資訊以強化鄰近關聯模型的架構。實驗結果證明本論文所提出之方法，不論在語音文件檢索還是語音辨識方面皆可有效改善現有方法的效能。	zh_TW
dc.description.abstract	Pseudo-relevance feedback is by far the most commonly used paradigm for query reformulation in spoken document retrieval. It assumes that a small number of top-ranked feedback documents obtained from the initial retrieval are relevant and can be used for query expansion. Nevertheless, simply taking all of the top-ranked feedback documents from the initial retrieval for query modeling does not necessarily work well, especially when those documents contain much redundant or non-relevant information. In view of this, we explore different kinds of information cues for selecting helpful feedback documents to further improve retrieval. On the other hand, the relevance model (RM), which relies on the “bag-of-words” assumption to simplify its derivation and estimation, may be oversimplified for language modeling in speech recognition. Hence, we also enhance RM in two respects. First, the “bag-of-words” assumption of RM is relaxed by incorporating word proximity information into the RM formulation. Second, topic-based proximity information is additionally explored to further enhance the proximity-based RM framework. Experiments conducted on both a spoken document retrieval task and a speech recognition task indicate that our approaches effectively improve the performance of existing methods.	en_US
dc.description.sponsorship	資訊工程學系	zh_TW
dc.identifier	GN0699470462
dc.identifier.uri	http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22GN0699470462%22.&%22.id.&
dc.identifier.uri	http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106914
dc.language	English
dc.subject	語音文件檢索	zh_TW
dc.subject	語音辨識	zh_TW
dc.subject	語言模型	zh_TW
dc.subject	虛擬關聯回饋	zh_TW
dc.subject	鄰近資訊	zh_TW
dc.subject	Spoken document retrieval	en_US
dc.subject	Speech Recognition	en_US
dc.subject	Language Modeling	en_US
dc.subject	Pseudo-Relevance Feedback	en_US
dc.subject	Proximity	en_US
dc.title	探索虛擬關聯回饋技術和鄰近資訊於語音文件檢索與辨識之改進	zh_TW
dc.title	Exploring Effective Pseudo-Relevance Feedback and Proximity Information for Speech Retrieval and Transcription	en_US

Files

Original bundle

Name: n069947046201.pdf
Size: 2.84 MB
Format: Adobe Portable Document Format

Collections

學位論文