Applying Chinese Grammatical Parsing to Opinion and Sentiment Classification of Movie Reviews (中文文法剖析應用於電影評論之意見情感分類)
Date
2012
Abstract
Now that the internet is pervasive, reviews in almost every domain are within easy reach, and people routinely consult online reviews before making a purchase. This is especially true for movies: apart from the fragments shown in the trailers released by distributors, a film cannot be sampled beforehand, and the ticket cannot be refunded afterwards. Moviegoers therefore place particular weight on online reviews before buying a ticket.
In this thesis, we collect reviews written by moviegoers on movie review websites and analyze them with natural language processing techniques. The goal is to produce an overall recommendation score for each movie, together with high-frequency opinion words for several movie elements (such as plot, cast, and special effects), so that users can choose the movies that suit them.
The study focuses on movie reviews written in Chinese. We introduce the Chinese parser developed by Academia Sinica (the CKIP Chinese Parser) into the conventional opinion classification pipeline for movie reviews, and develop a procedure that pairs opinion words with feature words according to dependency grammar graphs. This parsing-based approach is intended to give more accurate analysis and scoring for lengthy reviews. The final score is presented on a five-grade scale.
Experimental results show that, when a deviation of one grade is tolerated, the system's scores reach an accuracy of 70.7%, with an overall MRR of 0.61. When the five grades are collapsed into "recommended" and "not recommended", the F-scores are 74.3% and 51.4%, respectively. These results indicate that the system, by aggregating a large number of online reviews, achieves the intended effect of helping users judge how strongly a movie is recommended.
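The three figures reported in the abstract (accuracy within one grade, MRR, and per-class F-score) can be illustrated with a minimal sketch. It uses common textbook definitions of each metric. The function names, the toy data, and the label strings are hypothetical, and the abstract does not state exactly how the thesis ranks candidates for MRR or maps the five grades to the two recommendation classes, so this is not a reproduction of the author's evaluation code.

```python
from typing import Sequence


def accuracy_within(predicted: Sequence[int], actual: Sequence[int],
                    tolerance: int = 1) -> float:
    """Share of predicted grades that differ from the reference by at most `tolerance`."""
    hits = sum(abs(p - a) <= tolerance for p, a in zip(predicted, actual))
    return hits / len(actual)


def mean_reciprocal_rank(first_correct_ranks: Sequence[int]) -> float:
    """Mean of 1/rank, where rank is the 1-based position of the first correct item."""
    return sum(1.0 / r for r in first_correct_ranks) / len(first_correct_ranks)


def f_score(predicted: Sequence[str], actual: Sequence[str], positive: str) -> float:
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == positive and a == positive for p, a in pairs)
    fp = sum(p == positive and a != positive for p, a in pairs)
    fn = sum(p != positive and a == positive for p, a in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical, not taken from the thesis).
system_grades = [5, 3, 1, 4]
reference_grades = [4, 3, 3, 2]
system_labels = ["recommend", "recommend", "not", "recommend"]
reference_labels = ["recommend", "not", "not", "recommend"]

print(accuracy_within(system_grades, reference_grades))
print(mean_reciprocal_rank([1, 2, 4]))
print(f_score(system_labels, reference_labels, "recommend"))
print(f_score(system_labels, reference_labels, "not"))
```

Reporting a separate F-score for each class, as the abstract does, shows when a system is noticeably weaker on one class than on the other.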
Keywords
Chinese parser, opinion mining, sentiment classification, movie review