Sentence Modeling Techniques for Extractive Spoken Document Summarization (探究語句模型技術應用於摘錄式語音文件摘要)

dc.contributor: 陳柏琳 [zh_TW]
dc.contributor: Ber-lin Chen [en_US]
dc.contributor.author: 張皓欽 [zh_TW]
dc.contributor.author: Hao-Chin Chang [en_US]
dc.date.accessioned: 2019-09-05T11:44:53Z
dc.date.available: 2013-2-21
dc.date.available: 2019-09-05T11:44:53Z
dc.date.issued: 2013
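dc.description.abstract: Extractive speech summarization selects a number of important sentences from a spoken document, according to a predefined summarization ratio, to produce a concise summary that represents the gist or topics of the original document; it has become a very popular research topic in recent years. In particular, methods that select important sentences by combining the language modeling (LM) framework with the Kullback-Leibler (KL) divergence have shown good performance on several text and spoken document summarization tasks. This thesis extends this approach and makes three main contributions. First, building on the notion of relevance, we explore novel sentence modeling techniques. Sentence models built with indexing units at different levels (for example, words or syllables) can be compared against the document model to estimate the relationship between a candidate summary sentence and the spoken document. Second, we use not only the lexical information contained in the spoken document but also its latent topical information to build various sentence models. Finally, to address the relevance modeling (RM) approach's need for an initial retrieval pass, this thesis proposes word relevance modeling (WRM). The speech summarization experiments were conducted on Mandarin Chinese broadcast news; compared with other unsupervised summarization methods, the proposed methods appear to yield some performance improvement. [zh_TW]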
dc.description.abstract: Extractive speech summarization, which aims to select an indicative set of sentences from a spoken document so as to concisely represent the most important aspects of the document, has emerged as an attractive area of research and experimentation. A recent school of thought employs the language modeling (LM) framework along with the Kullback-Leibler (KL) divergence measure for important sentence selection, which has shown preliminary promise for extractive speech summarization. Our work continues this general line of research in three significant aspects. First, we explore a novel sentence modeling approach built on the notion of relevance, where the relationship between a candidate summary sentence and the spoken document to be summarized is discovered through various granularities of semantic context for relevance modeling. Second, not only lexical but also topical cues inherent in the spoken document are exploited for sentence modeling. Third, to counteract a shortcoming of the relevance modeling (RM) approach, namely its need to resort to a time-consuming retrieval procedure, we present a word relevance modeling (WRM) approach. Experiments on broadcast news summarization suggest that our methods compare favorably with several existing unsupervised methods. [en_US]
dc.description.sponsorship: 資訊工程學系 (Department of Computer Science and Information Engineering) [zh_TW]
dc.identifier: GN0699470503
dc.identifier.uri: http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22GN0699470503%22.&%22.id.&
dc.identifier.uri: http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106917
dc.language: Chinese
dc.subject: 語音摘要 [zh_TW]
dc.subject: 語句模型 [zh_TW]
dc.subject: 語言模型 [zh_TW]
dc.subject: 庫爾貝克-萊伯勒差異量 [zh_TW]
dc.subject: Speech summarization [en_US]
dc.subject: sentence modeling [en_US]
dc.subject: language modeling [en_US]
dc.subject: Kullback-Leibler divergence [en_US]
dc.title: 探究語句模型技術應用於摘錄式語音文件摘要 [zh_TW]
dc.title: Sentence Modeling Techniques for Extractive Spoken Document Summarization [en_US]
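
The abstract describes selecting sentences by comparing a sentence language model with the document model under the KL divergence. The following is a minimal sketch of that general idea only, not the thesis's implementation: it assumes unigram models over pre-tokenized words, Jelinek-Mercer smoothing with a hypothetical weight `lam`, the document itself standing in for a background collection, and the direction KL(document || sentence), where a lower value is taken to mean a more representative sentence. The function names are illustrative.

```python
import math
from collections import Counter


def sentence_model(tokens, vocab, background, lam=0.5):
    """Unigram sentence model, interpolated with a background model.

    lam is an assumed smoothing weight (Jelinek-Mercer interpolation).
    """
    counts = Counter(tokens)
    total = len(tokens)
    return {
        w: lam * (counts[w] / total if total else 0.0) + (1.0 - lam) * background[w]
        for w in vocab
    }


def kl_divergence(p, q):
    """KL(p || q) over a shared vocabulary; q must be non-zero wherever p is."""
    return sum(pw * math.log(pw / q[w]) for w, pw in p.items() if pw > 0.0)


def rank_sentences(sentences):
    """Return sentence indices ordered from lowest to highest KL(document || sentence)."""
    doc_tokens = [w for s in sentences for w in s]
    vocab = set(doc_tokens)
    doc_counts = Counter(doc_tokens)
    n = len(doc_tokens)
    # Maximum-likelihood document model; also reused as the background model (assumption).
    doc_model = {w: doc_counts[w] / n for w in vocab}
    scored = []
    for i, tokens in enumerate(sentences):
        model = sentence_model(tokens, vocab, doc_model)
        scored.append((kl_divergence(doc_model, model), i))
    return [i for _, i in sorted(scored)]


if __name__ == "__main__":
    toy_document = [["a", "b"], ["a"], ["b"]]
    print(rank_sentences(toy_document))
```

A summary at a given ratio would then take sentences from the front of this ranking until the length budget is reached. The relevance-based and topic-based sentence models mentioned in the abstract would replace `sentence_model` and are not sketched here.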

Files

Original bundle

Name: n069947050301.pdf
Size: 1.43 MB
Format: Adobe Portable Document Format