Query Term Suggestion for Filtering Web Search Results
Date
2014
Abstract
This thesis aims to select query suggestion terms from the large set of results returned by a search engine, so that users can filter those results with the suggested terms and spend less effort browsing them. It proposes a two-level query term suggestion method called M_PhRank. The first level offers broad topic terms that cover as many search results as possible; the second level offers subtopic terms that have a more specific meaning and little overlap among the objects they cover.

The method has three parts: selecting topic terms, scoring the semantic specificity of individual words, and selecting subtopic terms. First, for each word that remains after preprocessing, the number of search-result objects it covers is computed as its novelty score, and the topic terms for the first level are selected by this score. Second, a graph is built over words that occur near one another (a co-occurrence graph), and a random walk algorithm on this graph estimates a semantic score for each word, reflecting how specific its meaning is within the search results. Keywords consisting of 2-3 non-topic words form the candidate subtopic terms, and each candidate's semantic score is computed from the semantic scores of its component words. Finally, given the total number of suggestions requested, the number of subtopic terms under each topic term is set in proportion to that topic term's coverage, and the selected subtopic terms are attached to their topic terms to complete the two-level hierarchy.

Experiments show that, compared with the baseline method, M_PhRank suggests more semantically specific terms, covers more of the highly relevant result objects, and limits the growth in overlap as coverage increases. In a user evaluation, the hierarchy of suggestions built by M_PhRank received high satisfaction ratings and provided better query assistance.
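The word-scoring step described in the abstract can be illustrated with a minimal sketch. It assumes a PageRank-style random walk with uniform restart over a windowed co-occurrence graph; the abstract does not give the exact walk or edge weighting used by M_PhRank, so the function names, the `window` and `damping` parameters, and the toy documents below are all hypothetical.

```python
from collections import defaultdict


def cooccurrence_graph(docs, window=2):
    """Link words that appear within `window` positions of each other.

    `docs` is a list of tokenized search results (lists of words).
    Edge weights count how often two words occur near one another.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for tokens in docs:
        for i, word in enumerate(tokens):
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    graph[word][other] += 1.0
                    graph[other][word] += 1.0
    return graph


def random_walk_scores(graph, damping=0.85, iterations=50):
    """Power iteration for a PageRank-style walk (an assumed stand-in
    for the random walk mentioned in the abstract)."""
    nodes = list(graph)
    if not nodes:
        return {}
    n = len(nodes)
    scores = {w: 1.0 / n for w in nodes}
    # Every node has at least one edge, so these totals are positive.
    out_weight = {w: sum(graph[w].values()) for w in nodes}
    for _ in range(iterations):
        new = {w: (1.0 - damping) / n for w in nodes}
        for w in nodes:
            for neighbor, weight in graph[w].items():
                new[neighbor] += damping * scores[w] * weight / out_weight[w]
        scores = new
    return scores


# Toy tokenized search results (hypothetical data).
docs = [
    ["apple", "iphone", "price"],
    ["apple", "iphone", "review"],
    ["apple", "pie", "recipe"],
]
scores = random_walk_scores(cooccurrence_graph(docs))
```

In this sketch the scores form a probability distribution over words, and a multi-word candidate could then be scored from the scores of its component words, for example by summing or averaging them; which combination the thesis uses is not stated in the abstract.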
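The proportional allocation of subtopic terms can also be sketched. The example below counts how many results each topic term covers and then splits a fixed budget of second-level suggestions in proportion to those counts. The largest-remainder rounding is an assumption made here so the quotas sum exactly to the budget; the abstract only says the allocation is proportional to coverage. All names and data are hypothetical.

```python
def coverage_counts(results, terms):
    """Count, for each term, how many tokenized results contain it."""
    return {t: sum(1 for tokens in results if t in tokens) for t in terms}


def allocate_subtopic_slots(topic_coverage, total_slots):
    """Split `total_slots` across topic terms in proportion to coverage.

    Uses largest-remainder rounding (an assumed choice) so that the
    integer quotas add up to exactly `total_slots`.
    """
    total = sum(topic_coverage.values())
    if total == 0 or total_slots <= 0:
        return {t: 0 for t in topic_coverage}
    exact = {t: total_slots * c / total for t, c in topic_coverage.items()}
    slots = {t: int(q) for t, q in exact.items()}
    leftover = total_slots - sum(slots.values())
    # Give the remaining slots to the terms with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda t: exact[t] - slots[t], reverse=True)
    for t in by_remainder[:leftover]:
        slots[t] += 1
    return slots


# Toy tokenized search results and topic terms (hypothetical data).
results = [
    ["apple", "iphone", "price"],
    ["apple", "iphone", "review"],
    ["apple", "pie", "recipe"],
]
topic_coverage = coverage_counts(results, ["apple", "iphone"])
slots = allocate_subtopic_slots(topic_coverage, total_slots=5)
```

Here "apple" covers three results and "iphone" two, so a budget of five second-level suggestions is split three to two.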
Keywords
query suggestions, hierarchical suggestions, random walk