Title: | Applying Opinion Mining to Chinese Movie Reviews (意見探勘在中文電影評論之應用) |
Authors: | Ciou, Hong-Da (邱鴻達); Liang, Tyne (梁婷); Institute of Computer Science and Engineering (資訊科學與工程研究所) |
Keywords: | Opinion Mining; Auto Scoring |
Issue Date: | 2010 |
Abstract: | With the rapid growth of Web 2.0, user opinions are no longer limited to word of mouth. Through platforms such as Mobile01, Books.com.tw (博客來), Yahoo! Kimo, and Amazon, they have become a force that cannot be ignored, and both industry and internet users need to access large volumes of opinions quickly. This thesis studies how to integrate multiple reviews of a movie and compute a credible score for users' reference. We implement a movie evaluation system for a Chinese review corpus that comprises corpus processing, manual acquisition and classification of attribute words, opinion word acquisition, opinion word score calculation, and movie rating. Attribute words are first collected manually and then expanded with Tongyici Cilin (同義詞詞林) to obtain a more complete set of movie attribute words. The opinion word set is expanded with a method based on part-of-speech combination sequences. Attribute words are then paired with their corresponding opinion words. Unlike other studies, we also handle sentences that contain an opinion word but no attribute word: a support vector machine (SVM) with five kinds of features classifies such opinion words into one of four categories of movie attributes. Finally, we propose a weighted scoring function over the four attribute categories and evaluate it on a review set covering 226 movies. The best F-score is 83%, and the overall accuracy is 79%. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT079855553 http://hdl.handle.net/11536/48288 |
Appears in Collections: | Thesis |