柯佳伶 (Koh, Jia-ling)
徐毓雯 (Hsu, Yu-wen)
2019-09-05
2016-2-21
2019-09-05
2011
http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22GN0697470743%22.&%22.id.&
http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106818
In most current opinion mining research, product feature terms are either assigned manually or selected by term frequency, and they must be specified again for each new kind of product. To reduce the human effort this takes, this thesis studies how to extract product feature terms automatically and effectively from forum articles by applying several measures of term importance. We use two corpora: a forum corpus and a corpus of camera introduction articles (expert commentaries). Within each corpus, the nouns appearing in the documents are taken as candidate feature terms. For each candidate term, we count its frequency in the documents discussing each brand, which reflects commonly discussed features. We compute the divergence of the term's probability across brands, which reflects features particular to a brand. We also compute the correlation between a term and a brand, which reflects features associated with that brand. In addition, we compute the divergence of a term's probability between the two corpora to identify product feature terms characteristic of the forum and camera articles, and then use a list of common words to filter out general colloquial terms. Finally, we propose a term importance measure function that combines the scores from these analyses and serves as the basis for extracting product feature terms. The experimental results show that ranking terms by the proposed importance measure function extracts product feature terms automatically and effectively.
feature terms of products; automatic extraction; importance measure function of terms; opinion mining
產品評論特徵自動擷取之研究 (Automatic Feature Terms Extraction for Product Opinions)
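
The final step described in the abstract, combining several per-term scores into one ranking and filtering common words, might be sketched as follows. The abstract does not give the form of the importance measure function, so everything here is an assumption for illustration: the min-max normalization, the weighted sum, the weights, and the names `normalize` and `rank_terms` are hypothetical and are not taken from the thesis.

```python
def normalize(scores):
    """Min-max scale a {term: score} dict to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: no term stands out on this measure.
        return {term: 0.0 for term in scores}
    return {term: (s - lo) / (hi - lo) for term, s in scores.items()}


def rank_terms(component_scores, weights, common_words=frozenset()):
    """Combine several {term: score} dicts into one ranked list.

    ASSUMPTION: a weighted sum of normalized scores stands in for the
    thesis's importance measure function, whose real form is not given.
    Terms in `common_words` are dropped, mirroring the common-word filter.
    """
    normalized = [normalize(c) for c in component_scores]
    terms = set().union(*normalized) - set(common_words)
    combined = {
        term: sum(w * n.get(term, 0.0) for w, n in zip(weights, normalized))
        for term in terms
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy, made-up scores for two of the measures named in the abstract.
frequency = {"lens": 30, "battery": 20, "thing": 40}
brand_divergence = {"lens": 0.5, "battery": 0.1, "thing": 0.0}
ranked = rank_terms([frequency, brand_divergence], [0.5, 0.5],
                    common_words={"thing"})
```

In this toy input, "thing" has the highest raw frequency but is removed by the common-word filter, which is the role the abstract assigns to the common-word list.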