Full metadata record
DC Field | Value | Language
dc.contributor.author | 劉博榮 | en_US
dc.contributor.author | Liu, Po-Jung | en_US
dc.contributor.author | 梁婷 | en_US
dc.contributor.author | Liang, Tyne | en_US
dc.date.accessioned | 2014-12-12T01:43:51Z | -
dc.date.available | 2014-12-12T01:43:51Z | -
dc.date.issued | 2010 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#GT079755635 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/45980 | -
dc.description.abstract | 近體詩是華人社會中一項重要的文化資產,然而很多詩作中皆含有隱喻,使得近體詩對於學生而言不容易了解其中含義。在本論文中,我們提出幾個有效的方法來做近體詩的自動分類,藉以幫助學習者對於詩作的理解。我們利用法則式的方法搭配同義詞詞林來做語意標記,以及SVM的分類模型來做詩作分類。並從詩作的語料中探勘七種特徵來做為分類特徵,再利用Forward Sequential Selection Algorithm來做為選取特徵的演算法,而我們所提出的方法經過217首的五言絕句來做六個類別近體詩的詩作分類實驗,可達到72.35%的正確率。 | zh_TW
dc.description.abstract | Jintishi is an important cultural heritage in Chinese societies. However, many poems contain metaphors, which makes Jintishi hard for high school students to understand. In this thesis, we present an approach to automatic Jintishi categorization, with the aim of facilitating poem comprehension. We use a rule-based method built on Tongyici Cilin for semantic labeling and an SVM-based model for poem categorization. The categorization employs seven kinds of features mined from the poem corpus, and the best feature set is chosen with a forward sequential selection algorithm. In an experiment categorizing 217 five-character quatrains into six types of Jintishi, the approach achieves 72.35% accuracy. | en_US
dc.language.iso | zh_TW | en_US
dc.subject | 文件分類 | zh_TW
dc.subject | 語意消歧 | zh_TW
dc.subject | 詩作分類 | zh_TW
dc.subject | 特徵選擇 | zh_TW
dc.subject | Text Classification | en_US
dc.subject | Word Sense Disambiguation | en_US
dc.subject | Poetry Classification | en_US
dc.subject | Feature Selection | en_US
dc.title | 近體詩自動分類研究 | zh_TW
dc.title | The Study of Chinese Jintishi Categorization | en_US
dc.type | Thesis | en_US
dc.contributor.department | 資訊科學與工程研究所 | zh_TW
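
Note on the abstract: it names a forward sequential selection algorithm for choosing the feature set but gives no further detail. The following is a minimal sketch of that general technique only, not the thesis's implementation. The function name, the integer feature indices, and `toy_score` are all hypothetical; `toy_score` stands in for the classifier accuracy that would be measured for each candidate feature subset.

```python
from typing import Callable, List, Sequence


def forward_sequential_selection(
    n_features: int,
    score: Callable[[Sequence[int]], float],
) -> List[int]:
    """Greedily grow a feature subset, adding one feature per round.

    Each round tries every unused feature and keeps the one that raises
    the score the most. Selection stops when no addition improves it.
    """
    selected: List[int] = []
    best_score = float("-inf")
    remaining = set(range(n_features))

    while remaining:
        best_candidate = None
        best_candidate_score = best_score
        for feature in sorted(remaining):
            s = score(selected + [feature])
            if s > best_candidate_score:  # strict: ties are not added
                best_candidate, best_candidate_score = feature, s
        if best_candidate is None:
            break  # no remaining feature improves the score
        selected.append(best_candidate)
        remaining.remove(best_candidate)
        best_score = best_candidate_score

    return selected


def toy_score(subset: Sequence[int]) -> float:
    # Hypothetical additive gains; a real scorer would train and
    # evaluate a classifier restricted to the features in `subset`.
    gains = {0: 0.30, 1: 0.05, 2: 0.20, 3: -0.10}
    return sum(gains[f] for f in subset)


chosen = forward_sequential_selection(4, toy_score)
print(chosen)
```

With these made-up gains, feature 3 is never selected because adding it always lowers the score. In practice the scoring function dominates the cost, since it is called once per candidate feature in every round.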
Appears in Collections:Thesis


Files in This Item:

  1. 563501.pdf
