Sentence Modeling Techniques for Extractive Spoken Document Summarization

張皓欽; Hao-Chin Chang

dc.contributor	陳柏琳	zh_TW
dc.contributor	Ber-lin Chen	en_US
dc.contributor.author	張皓欽	zh_TW
dc.contributor.author	Hao-Chin Chang	en_US
dc.date.accessioned	2019-09-05T11:44:53Z
dc.date.available	2013-02-21
dc.date.available	2019-09-05T11:44:53Z
dc.date.issued	2013
dc.description.abstract	摘錄式語音摘要是根據事先定義的摘要比例，從語音文件中選取一些重要的語句來產生簡潔的摘要以代表原始文件的主旨或主題，在近幾年已成為一項非常熱門的研究議題。其中，使用語言模型(Language Modeling)架構結合庫爾貝克-萊伯勒差異量(Kullback-Leibler Divergence)來進行重要語句選取的方法，在一些文字與語音文件摘要任務上已展現不錯的效能。本論文延伸此一方法而有三個主要貢獻。首先，基於所謂關聯性(Relevance)的概念，我們探索新穎的語句模型技術。透過不同層次(例如詞或音節)索引單位的使用所建立的語句模型能與文件模型進行比對，來估算候選摘要語句與語音文件的關係。再者，我們不僅使用了語音文件中所含有的語彙資訊(Lexical Information)，也使用了語音文件中所隱含的主題資訊(Topical Information)來建立各種語句模型。最後，為了改善關聯模型(Relevance Modeling)需要初次檢索的問題，本論文提出了詞關聯模型(Word Relevance Modeling)。語音摘要實驗是在中文廣播新聞上進行；相較於其它非監督式摘要方法，本論文所提出的摘要方法似乎能有一定的效能提升。	zh_TW
dc.description.abstract	Extractive speech summarization, which aims to select an indicative set of sentences from a spoken document so as to concisely represent the most important aspects of the document, has emerged as an attractive area of research and experimentation. A recent school of thought is to employ the language modeling (LM) framework along with the Kullback-Leibler (KL) divergence measure for important sentence selection, which has shown preliminary promise for extractive speech summarization. Our work continues this general line of research in three significant aspects. First, we explore a novel sentence modeling approach built on the notion of relevance, where the relationship between a candidate summary sentence and the spoken document to be summarized is discovered through various granularities of semantic context for relevance modeling. Second, not only lexical but also topical cues inherent in the spoken document are exploited for sentence modeling. Third, to counteract a shortcoming of the relevance modeling (RM) approach, namely its need to resort to a time-consuming retrieval procedure, we present a word relevance modeling (WRM) approach. Experiments on broadcast news summarization seem to demonstrate the performance merits of our methods when compared to several existing unsupervised methods.	en_US
dc.description.sponsorship	資訊工程學系	zh_TW
dc.identifier	GN0699470503
dc.identifier.uri	http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22GN0699470503%22.&%22.id.&
dc.identifier.uri	http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106917
dc.language	Chinese
dc.subject	語音摘要	zh_TW
dc.subject	語句模型	zh_TW
dc.subject	語言模型	zh_TW
dc.subject	庫爾貝克-萊伯勒差異量	zh_TW
dc.subject	Speech summarization	en_US
dc.subject	sentence modeling	en_US
dc.subject	language modeling	en_US
dc.subject	Kullback-Leibler divergence	en_US
dc.title	探究語句模型技術應用於摘錄式語音文件摘要	zh_TW
dc.title	Sentence Modeling Techniques for Extractive Spoken Document Summarization	en_US
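
The abstract describes selecting important sentences by combining a language modeling framework with the Kullback-Leibler divergence, under a predefined summarization ratio. The following is a minimal sketch of that baseline idea only, not the thesis's relevance-based sentence models. It assumes sentences are already tokenized into index units (words or syllables), uses additively smoothed unigram models, and ranks sentences by KL(document || sentence). The function names, the smoothing constant `alpha`, and the default `ratio` are illustrative assumptions that do not come from the record.

```python
import math
from collections import Counter


def unigram_model(tokens, vocab, alpha=0.1):
    """Additively smoothed unigram distribution over vocab.

    alpha is an assumed smoothing constant, not a value from the thesis.
    """
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)


def rank_sentences(sentences, ratio=0.1):
    """Return indices of the selected sentences, best match first.

    A sentence whose model diverges least from the document model is
    treated as most representative. ratio is the summarization ratio.
    """
    doc_tokens = [w for s in sentences for w in s]
    vocab = sorted(set(doc_tokens))
    doc_model = unigram_model(doc_tokens, vocab)
    order = sorted(
        range(len(sentences)),
        key=lambda i: kl_divergence(doc_model, unigram_model(sentences[i], vocab)),
    )
    keep = max(1, round(ratio * len(sentences)))
    return order[:keep]
```

With a toy document such as `[["a", "b"], ["c"], ["a", "b", "c"]]`, the third sentence has the same unigram distribution as the whole document, so it is the one selected when a single sentence is kept.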

Files

Original bundle

Name: n069947050301.pdf
Size: 1.43 MB
Format: Adobe Portable Document Format

Collections

Theses and Dissertations (學位論文)