Title: | 近體詩自動分類研究 The Study of Chinese Jintishi Categorization |
Authors: | 劉博榮 Liu, Po-Jung; 梁婷 Liang, Tyne; Institute of Computer Science and Engineering |
Keywords: | Text Classification;Word Sense Disambiguation;Poetry Classification;Feature Selection |
Issue Date: | 2010 |
Abstract: | Jintishi is an important cultural heritage of Chinese societies. Many poems, however, contain metaphors, which makes their meaning hard for high school students to grasp. This thesis presents several effective methods for categorizing Jintishi automatically, with the aim of helping learners understand the poems. Semantic tagging is performed with a rule-based method built on Tongyici Cilin, and poems are categorized with an SVM-based model. Seven kinds of features are mined from the poem corpus, and the best feature set is chosen with the forward sequential selection algorithm. In an experiment that categorizes 217 five-character quatrains into six types of Jintishi, the approach achieves 72.35% accuracy. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT079755635 http://hdl.handle.net/11536/45980 |
Appears in Collections: | Thesis |
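
The abstract names forward sequential selection as the feature-selection step but gives no detail. The sketch below shows the general greedy procedure only; it is not the thesis's implementation. The feature names (`f1` to `f4`) and `toy_score` are hypothetical. In the setting the abstract describes, the scoring function would presumably be the accuracy of an SVM trained on the candidate feature subset, which this record does not specify.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple


def forward_sequential_selection(
    features: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
) -> Tuple[List[str], float]:
    """Greedy forward selection: start empty, add the best feature each round."""
    selected: List[str] = []
    best_score = float("-inf")
    remaining = list(features)
    while remaining:
        # Score every candidate set formed by adding one unused feature.
        trials = [(score(frozenset(selected + [f])), f) for f in remaining]
        trial_score, trial_feature = max(trials, key=lambda t: t[0])
        # Stop when no single addition improves on the current best score.
        if trial_score <= best_score:
            break
        selected.append(trial_feature)
        remaining.remove(trial_feature)
        best_score = trial_score
    return selected, best_score


# Hypothetical stand-in for "accuracy of a classifier on this feature subset".
def toy_score(subset: FrozenSet[str]) -> float:
    gains = {"f1": 0.30, "f2": 0.20, "f3": 0.05}
    penalty = 0.10 if "f4" in subset else 0.0  # f4 behaves like a noisy feature
    return sum(gains.get(f, 0.0) for f in subset) - penalty


chosen, accuracy = forward_sequential_selection(["f1", "f2", "f3", "f4"], toy_score)
print(chosen, round(accuracy, 2))
```

With this toy scorer the procedure adds `f1`, `f2`, and `f3` in turn and stops before `f4`, because adding it lowers the score. Greedy selection needs far fewer evaluations than trying every subset, at the cost of possibly missing a better combination.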