探索虛擬關聯回饋技術和鄰近資訊於語音文件檢索與辨識之改進

dc.contributor陳柏琳zh_TW
dc.contributorBerlin Chenen_US
dc.contributor.author陳憶文zh_TW
dc.date.accessioned2019-09-05T11:44:33Z
dc.date.available2013-8-15
dc.date.available2019-09-05T11:44:33Z
dc.date.issued2013
dc.description.abstract虛擬關聯回饋(Pseudo-Relevance Feedback)為目前最常見的查詢重建(Query Reformulation)典範。它假設預檢索(Initial-round of Retrieval)排名前端的文件都是相關的,所以可全用於查詢擴展(Query Expansion)。然而,預檢索所獲得的文件中,極可能同時包含重複性資訊(Redundant)和非關聯(Non-relevant)資訊,使得重新建立的查詢不能有良好檢索效能。有鑑於此,本論文探討運用不同資訊以在預檢索獲得的語音文件中挑選適當的關聯文件來建立查詢表示,讓語音文件檢索結果可以更準確。另一方面,關聯模型(Relevance Model)雖然可藉由詞袋(Bag-of-words)假設來簡化模型推導和估測,卻可能因此過度簡化問題,特別是用於語音辨識的語言模型。為了調適關聯模型,本論文有兩個貢獻。其一,本論文提出詞鄰近資訊使用於關聯模型以改善詞袋(Bag-of-words)假設於語音辨識的不適。其二,本論文也進一步探討主題鄰近資訊以強化鄰近關聯模型的架構。實驗結果證明本論文所提出之方法,不論在語音文件檢索還是語音辨識方面皆可有效改善現有方法的效能。zh_TW
dc.description.abstractPseudo-relevance feedback is by far the most commonly used paradigm for query reformulation in spoken document retrieval. It assumes that a small number of top-ranked feedback documents obtained from the initial retrieval are relevant and can be used for query expansion. Nevertheless, simply taking all of the top-ranked feedback documents from the initial retrieval for query modeling does not necessarily work well, especially when those documents contain many redundant or non-relevant cues. In view of this, we explore different kinds of information cues for selecting helpful feedback documents to further improve spoken document retrieval. On the other hand, the relevance model (RM), which relies on the “bag-of-words” assumption to facilitate derivation and estimation, may be oversimplified for the task of language modeling in speech recognition. Hence, we also enhance RM in two significant aspects. First, the “bag-of-words” assumption of RM is relaxed by incorporating word proximity information into the RM formulation. Second, topic-based proximity information is additionally explored to further enhance the proximity-based RM framework. Experiments conducted on both a spoken document retrieval task and a speech recognition task indicate that our approaches can effectively improve the performance of existing methods.en_US
dc.description.sponsorship資訊工程學系zh_TW
dc.identifierGN0699470462
dc.identifier.urihttp://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22GN0699470462%22.&%22.id.&
dc.identifier.urihttp://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106914
dc.languageEnglish
dc.subject語音文件檢索zh_TW
dc.subject語音辨識zh_TW
dc.subject語言模型zh_TW
dc.subject虛擬關聯回饋zh_TW
dc.subject鄰近資訊zh_TW
dc.subjectSpoken document retrievalen_US
dc.subjectSpeech Recognitionen_US
dc.subjectLanguage Modelingen_US
dc.subjectPseudo-Relevance Feedbacken_US
dc.subjectProximityen_US
dc.title探索虛擬關聯回饋技術和鄰近資訊於語音文件檢索與辨識之改進zh_TW
dc.titleExploring Effective Pseudo-Relevance Feedback and Proximity Information for Speech Retrieval and Transcriptionen_US

Files

Original bundle

Name: n069947046201.pdf
Size: 2.84 MB
Format: Adobe Portable Document Format
