A Study on Applying a Two-Stage Generative Model to Meeting Summarization (應用兩階段生成模型於會議摘要之研究)
Date
2023
Abstract
In recent years, the pandemic and the spread of remote work have made online meetings and video communication platforms far more widely used. Meeting transcripts, however, tend to contain scattered information, which makes it difficult to extract and understand the key points from a large volume of conversation. As meetings become more frequent, participants must also grasp the main points in limited time in order to make informed decisions within busy schedules. Techniques that automatically identify and summarize the key information in meeting transcripts have therefore become increasingly important.

Automatic document summarization falls into two main approaches: extractive and abstractive. Extractive summarization computes an importance score for each sentence in the original document and combines the highest-scoring sentences into a summary. Abstractive summarization rewrites sentences based on an understanding of the original document, producing a concise summary that captures its core content. Because utterances in dialogue are often disfluent and their information is scattered, extractive methods tend to select incomplete sentences, which reduces readability. Meeting summarization therefore currently relies mainly on abstractive summarization, which can rewrite the original utterances. Although many related studies have been proposed, abstractive methods for meeting summarization still face several common limitations: input length constraints, complex dialogue structure, a lack of training data, and factual inconsistency. Addressing these issues is key to improving the performance of meeting summarization models.

This thesis focuses on input length constraints and dialogue structure, and proposes an extract-then-generate architecture for meeting summarization. In the extraction stage, three methods are designed to select important text segments: a heterogeneous graph neural network model, dialogue discourse parsing, and text similarity (cosine similarity). In the generation stage, advanced generative pre-trained models are used. Experimental results show that the proposed approach, by fine-tuning the baseline model, improves performance.
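The extract-then-generate approach described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it covers only the cosine-similarity variant of the extraction stage, and the bag-of-words representation, the query string, the word budget `max_words`, and the `stub_summarizer` placeholder are all assumptions made for illustration. In practice the second stage would be a fine-tuned generative pre-trained model.

```python
import math
import re
from collections import Counter


def bow(text):
    """Lowercased bag-of-words counts (an assumed, deliberately simple representation)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def extract(utterances, query, max_words):
    """Stage 1: keep the utterances most similar to `query` within a word budget."""
    q = bow(query)
    ranked = sorted(range(len(utterances)),
                    key=lambda i: cosine(bow(utterances[i]), q),
                    reverse=True)
    chosen, used = [], 0
    for i in ranked:
        n = len(utterances[i].split())
        if used + n <= max_words:  # skip anything that would exceed the budget
            chosen.append(i)
            used += n
    return [utterances[i] for i in sorted(chosen)]  # restore dialogue order


def generate(segments, summarizer):
    """Stage 2: pass the shortened transcript to any abstractive model."""
    return summarizer("\n".join(segments))


def stub_summarizer(text):
    """Hypothetical placeholder for a fine-tuned generative model."""
    return text


meeting = [
    "A: Good morning everyone, can you hear me?",
    "B: The remote control budget is twelve euros per unit.",
    "C: I had pasta for lunch yesterday.",
    "A: So the remote control design must stay within that budget.",
]
selected = extract(meeting, "remote control budget", max_words=25)
summary = generate(selected, stub_summarizer)
```

With this toy transcript, only the two utterances about the budget fit the 25-word limit, so the generator sees a much shorter input than the full meeting. This is how an extraction stage can relieve the input length constraint before generation.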
Keywords
Meeting Summarization, Automatic Document Summarization, Natural Language Processing, Heterogeneous Graph Neural Network, Dialogue Discourse Parsing, Generative Model