Title: 基於使用者情緒關鍵詞彙之臉書粉絲專頁評論分類與評分系統
An Opinion Classification and Ranking System for Comments of Facebook Fan Pages based on Users’ Emotion Keywords
Author: 謝佩庭
Hsieh, Pei-Ting
陳玲慧
Chen, Ling-Hwei
Institute of Multimedia Engineering
Keywords: 意見探勘;語意分析;情緒分類;自動評分;社群關係;Opinion mining;Sentiment analysis;Emotion classification;Opinion auto ranking;Social network
Issue Date: 2014
Abstract: With the rise of Internet services, people not only retrieve information from the Internet but also share their own opinions publicly. This large volume of personal opinions can be used to understand product evaluations, market trends, and customer feedback, and is valuable to both individuals and enterprises. Opinion mining techniques that can extract value from large volumes of messages are therefore increasingly important. Moreover, in today's highly active social networks, the person who publishes an opinion is an essential part of that opinion, so opinion mining should consider the characteristics of the commenter as well as the text of the comment.

The growth of network technology and social networks has made microblogging popular, and a growing number of emotion analysis studies target microblog text. Unlike long articles, microblog posts must convey their content in a small, limited number of words. They are therefore mostly written in informal Internet language without complete structure and are easily taken out of context, which makes emotions harder to detect in short microblog sentences than in long articles.

In this thesis, we propose an opinion classification and ranking system for comments on Facebook fan pages, consisting of a training part and a classification part. The training part has two stages: building the emotion lexicon and semantic tags, and building the model. First, the training data is segmented into words by the Chinese Lexical Analysis System. Emotionally representative words and emoticons, together with the positive and negative emotion word lists from HowNet, form the emotion keyword lexicon, and words in the training documents that express negation, intensification, or weakening are used to build negation and degree semantic tags. Next, word combination rules are generated from bigram frequencies and the proper nouns of the fan pages, and the segmented data is re-merged according to these rules. Finally, based on the emotion lexicon and semantic tags, emotion features are extracted from the training documents and converted into feature vectors, which are used to train a Support Vector Machine emotion classification model.

In the classification part, the system first tokenizes each comment and extracts its emotion features. It then uses the trained Support Vector Machine to classify the comment as positive, neutral, or negative, and calculates the comment's score according to all the emotion words that the commenter has used.

The comments in our experiments come from the Facebook fan pages of nine Taiwanese chain restaurants. The experimental results show that the system still classifies emotions reasonably well for Facebook fan page comments, whose context is limited. Adjusting the comment scores reduces the influence of different users' wording habits, so that the scores better reflect the emotional intensity of the comments.
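The abstract does not give the scoring formula, so the following Python sketch only illustrates the general idea of lexicon-based scoring with negation and degree tags, followed by a per-user adjustment. Everything in it is an assumption made for illustration: the toy English lexicon, the modifier tables, the function names, and in particular the choice to divide a comment's raw score by the average strength of the emotion words its author has used. It is not the method of the thesis.

```python
from statistics import mean

# Hypothetical toy tables; the thesis builds its lexicon from training data and HowNet.
EMOTION_WORDS = {"good": 1.0, "great": 2.0, "bad": -1.0, "awful": -2.0}
NEGATIONS = {"not"}
DEGREES = {"very": 1.5, "slightly": 0.5}


def raw_score(tokens):
    """Sum lexicon polarities; negation and degree tokens modify the next emotion word."""
    score, sign, weight = 0.0, 1.0, 1.0
    for tok in tokens:
        if tok in NEGATIONS:
            sign = -sign
        elif tok in DEGREES:
            weight *= DEGREES[tok]
        elif tok in EMOTION_WORDS:
            score += sign * weight * EMOTION_WORDS[tok]
            sign, weight = 1.0, 1.0  # reset modifiers once they have been applied
    return score


def user_baseline(comments):
    """Average absolute strength of every emotion word the user has used (assumed measure)."""
    strengths = [
        abs(EMOTION_WORDS[tok])
        for tokens in comments
        for tok in tokens
        if tok in EMOTION_WORDS
    ]
    return mean(strengths) if strengths else 1.0


def adjusted_score(tokens, baseline):
    """Scale the raw score by the user's habitual emotion strength."""
    return raw_score(tokens) / baseline


# Two hypothetical users with different wording habits.
effusive = [["great"], ["great", "great"]]
reserved = [["good"], ["good"]]

print(raw_score(["not", "very", "good"]))
print(adjusted_score(["great"], user_baseline(effusive)))
print(adjusted_score(["good"], user_baseline(reserved)))
```

Under this assumed normalization, a user who habitually writes "great" and a user who habitually writes "good" receive the same adjusted score for their typical comment, which is one way to read the abstract's claim that the adjustment reduces the effect of individual wording habits.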
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT070156629
http://hdl.handle.net/11536/76010
Appears in Collections: Thesis