兩個專有詞彙關聯句自動擷取之研究 Associated Sentences Retrieval for Two Domain-Specific Terms

dc.contributor 柯佳伶 zh_TW
dc.contributor.author 鍾昇宏 zh_TW
dc.contributor.author Sheng-Hong Chung en_US
dc.date.accessioned 2019-09-05T11:42:02Z
dc.date.available 2017-8-23
dc.date.available 2019-09-05T11:42:02Z
dc.date.issued 2012
dc.description.abstract 本論文之研究目的是針對可信文字資料來源,根據使用者所輸入的兩個專有詞彙,依照詞彙不同的關係,由資料來源中自動找出關聯句組或是關聯句,幫助使用者比較兩個專有詞彙概念。我們將詞彙關係分成兩大類:包含關係和非包含關係。本系統利用網路搜尋引擎分別搜尋兩個查詢詞彙,蒐集包含個別查詢詞彙的前K名網頁摘要,統計兩個查詢詞彙在彼此網頁摘要中出現的機率作為特徵,依據詞彙關係分類模型進行自動分類。兩個查詢詞彙若被分類為”包含”關係,系統會取出同時包含兩個查詢詞彙之句子作為關聯句集,比對關聯句型規則模型,並計算與查詢詞彙之語意關聯度,選出關聯分數最高的句子當作關聯句。查詢詞彙若被分類為 ”非包含” 關係,系統則取出包含任一查詢詞彙的句子作為關聯句集,從中找出對兩個查詢詞彙有高度關聯的共同概念詞,將句子依照共同概念詞進行分群,評估句子與共同概念詞以及句子間兩兩配對的語意相關分數,挑選分數最高的兩個句子形成關聯句組。實驗結果顯示本研究所提出的方法能有效對查詢字組的關係自動分類;考慮句型和語意關聯度分數找出的關聯句有助於使用者了解查詢詞彙的關聯性;而利用句組分數篩選出的關聯句組亦大多可以幫助使用者釐清兩個查詢詞彙在某些概念上相同相異的比較。 zh_TW
dc.description.abstract This thesis studies strategies for automatically extracting associated sentences or sentence pairs for two domain-specific query terms from a reliable text data source, according to the relationship between the terms. The goal is to help users compare the two query terms using the retrieved results. Two categories of relationship between query terms are defined: contained and not-contained. The system uses a web search engine to search for each of the two query terms and collects the top-k snippets for each. The probabilities of one query term appearing in the top-k snippets of the other are used as features to train a classifier of query-pair relationships. If the two query terms have a contained relationship, the sentences containing both terms are retrieved as candidate sentences. For each candidate sentence, an association score is evaluated by matching its lexical pattern against the associated-sentence rule model and computing its degree of semantic relatedness with the query terms. The sentence with the highest association score is selected as the associated sentence. If the relationship is not-contained, common concept terms, which have high semantic relatedness with both query terms, are extracted from the sentences containing either of the two query terms. The common concept terms are used to group the sentences. Within each group, the representation score of each candidate sentence pair is evaluated by computing its semantic relatedness with the concept terms and the semantic relatedness between the two sentences of the pair. The sentence pair with the highest representation score is selected as the associated sentence pair. The experimental results show that the proposed method can effectively classify the relationships of query terms. Moreover, the retrieved associated sentences help users understand the semantic relationship between two query terms, and the discovered associated sentence pairs also effectively help users clarify the similar and dissimilar concepts between the two query terms. en_US
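The snippet-based features described in the abstract can be illustrated with a minimal sketch. It assumes the top-k snippets for each query term have already been collected as plain strings, and it estimates each probability as the fraction of one term's snippets that mention the other term (case-insensitive substring match). The function name `cooccurrence_features` and this particular estimate are assumptions made for illustration only; the record does not specify how the thesis computes the features or which classifier consumes them.

```python
from typing import List, Tuple


def cooccurrence_features(
    term_a: str,
    term_b: str,
    snippets_a: List[str],
    snippets_b: List[str],
) -> Tuple[float, float]:
    """Return two features for a query pair (hypothetical helper).

    First value:  fraction of term_a's snippets that mention term_b.
    Second value: fraction of term_b's snippets that mention term_a.
    """

    def fraction_containing(term: str, snippets: List[str]) -> float:
        # An empty snippet list gives no evidence, so report 0.0.
        if not snippets:
            return 0.0
        needle = term.lower()
        hits = sum(1 for snippet in snippets if needle in snippet.lower())
        return hits / len(snippets)

    return (
        fraction_containing(term_b, snippets_a),
        fraction_containing(term_a, snippets_b),
    )
```

A strongly asymmetric pair of values (one term appears often in the other's snippets, but not the reverse) is the kind of signal a relationship classifier could learn from. Any decision rule or threshold would have to come from training data, not from this sketch.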
dc.description.sponsorship 資訊工程學系 zh_TW
dc.identifier GN0699470084
dc.identifier.uri http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22GN0699470084%22.&%22.id.&
dc.identifier.uri http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106889
dc.language 中文
dc.subject 專有詞彙 zh_TW
dc.subject 問題分類 zh_TW
dc.subject 句型樣式 zh_TW
dc.subject 語意關聯度 zh_TW
dc.subject 關聯句 zh_TW
dc.subject 關聯句組 zh_TW
dc.subject domain-specific term en_US
dc.subject query classification en_US
dc.subject lexical pattern en_US
dc.subject relatedness degree en_US
dc.subject associated sentence en_US
dc.subject associated sentence pair en_US
dc.title 兩個專有詞彙關聯句自動擷取之研究 zh_TW
dc.title Associated Sentences Retrieval for Two Domain-Specific Terms en_US
Files
Name: n069947008401.pdf
Size: 1.19 MB
Format: Adobe Portable Document Format