Title: 唐詩之詩風探勘
Style mining for Tang Poetry
Authors: 王迺仁
Nai-Jun Wang
Shian-Shyong Tseng
Keywords: Tang Poetry; style mining; concept hierarchy; clustering; association rule mining
Issue Date: 2005
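Abstract:   Poetry is a great literary creation of Chinese culture, and it flourished most during the Tang Dynasty. Poetry is a form of verse that emphasizes harmonious rhyme. It is written in classical Chinese, which differs greatly from the modern vernacular, and Chinese words frequently pose problems of synonymy and ambiguity. Because the concepts behind the words are intricate, mining the style of poems from their text is difficult even for expert scholars. Literary critics classify Tang poems into different styles by author, but the conditions or rules behind that classification are not explicit, and deriving style-classification rules from the text of the verses is harder still.   This thesis applies association rule mining, a data mining technique, to the works of High Tang poets in order to find the noun-combination rules that each poet favored when composing. The first stage is noun extraction: the words poets use to describe scenery and changing things are mostly nouns, so nouns are extracted from the poems as the basic data for analysis. The second stage is noun concept generalization: this thesis proposes a Tang poetry noun concept hierarchy that generalizes nouns with intricate meanings into a compact set of noun categories, from which a Tang poetry noun category set is built. The third stage is clustering by differences in noun usage: poems are clustered by comparing the dissimilarity of their noun-category usage, which yields clusters of poems in a poet's styles. The fourth stage is association rule mining on noun usage: association rule mining is used to analyze the combinations of noun categories used in each style cluster, and style rules for a poet's noun usage are found according to confidence and support. The analysis shows that the experimental method divides Wang Wei's poems into groups such as a frontier poem cluster and a landscape poem cluster, which corroborates the commentary of poetry experts. In addition, landscape-school Tang poems compiled by literary critics were put through noun extraction and noun concept generalization and compared with the experimental results; they are close to the landscape poem cluster and follow the same noun-usage rules.   This experimental method can objectively find rules that discriminate poem styles by differences in noun usage. The rules can help poetry researchers analyze or classify poems, and help learners understand the stylistic meaning that noun usage conveys in a poem.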
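The second and third stages can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the `CONCEPT_OF` table and its category names, the sample poems, the use of Jaccard distance as the dissimilarity measure, the 0.5 threshold, and the greedy seed-based grouping. The abstract does not state which hierarchy, measure, or clustering algorithm the thesis actually uses.

```python
# Hypothetical noun -> category table standing in for a concept hierarchy.
CONCEPT_OF = {
    "山": "landscape", "水": "landscape",
    "松": "plant", "竹": "plant",
    "月": "celestial", "日": "celestial",
    "馬": "military", "劍": "military",
    "關": "frontier", "沙": "frontier",
}

# Hypothetical poems, each reduced to the nouns extracted in stage one.
POEM_NOUNS = {
    "a": ["山", "水", "月"],
    "b": ["山", "松", "月"],
    "c": ["馬", "劍", "關"],
    "d": ["沙", "馬", "日"],
}


def generalize(nouns):
    """Map nouns to their categories; nouns missing from the table are dropped."""
    return {CONCEPT_OF[n] for n in nouns if n in CONCEPT_OF}


def dissimilarity(a, b):
    """Jaccard distance between two category sets (assumed measure)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster(poems, threshold=0.5):
    """Greedy grouping: a poem joins the first cluster whose seed poem
    (its first member) is within the threshold, else it starts a new cluster."""
    clusters = []
    for poem_id, categories in poems.items():
        for members in clusters:
            if dissimilarity(categories, poems[members[0]]) <= threshold:
                members.append(poem_id)
                break
        else:
            clusters.append([poem_id])
    return clusters


POEMS = {poem_id: generalize(nouns) for poem_id, nouns in POEM_NOUNS.items()}
CLUSTERS = cluster(POEMS)
```

With this toy data, the two poems that share landscape and celestial nouns fall into one group and the two that share military and frontier nouns fall into another. A real run would need a far larger hierarchy and a clustering method chosen for the corpus.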
Poetry is a great creation of Chinese literature, and it flourished especially in the Tang Dynasty. It is a form of rhymed verse that emphasizes harmonious rhyme. Analyzing Tang poetry is challenging because the classical Chinese used in the poems differs from modern Chinese, and because Chinese vocabulary contains many synonyms and ambiguous words. Analyzing the style of Tang poetry is therefore difficult. In this thesis, we propose a data mining approach to discover the poetic style of a poet. First, in the noun retrieval process, the nouns in each poem are treated as features of its style and are retrieved for analysis. Second, in noun concept generalization, we map each noun to its corresponding concept using a Tang noun concept hierarchy, and we construct the Tang concept hierarchy table from the transformed results. Third, in style clustering, we cluster the poems by comparing the dissimilarity of their nouns. Finally, in noun association rule mining, we apply association rule mining to discover the noun associations within the poems. Through these steps, we obtain the noun sets of the poems and the noun association rules. The results are verified experimentally, and the association rules can be used to distinguish poetic styles. In the future, the noun association rules discovered objectively by style mining can support further research on Tang poetry.
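The final stage, association rule mining, can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the transactions, category names, and thresholds are hypothetical, rules are limited to single-item pairs, and the search is a brute-force scan where a real system would more likely use an algorithm such as Apriori. Support is the fraction of poems containing an itemset, and confidence is the support of the pair divided by the support of the rule's left-hand side.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def mine_pair_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return (lhs, rhs, support, confidence) for every single-item rule
    lhs -> rhs that meets both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for x, y in combinations(items, 2):
        pair_support = support({x, y}, transactions)
        if pair_support < min_support:
            continue
        for lhs, rhs in ((x, y), (y, x)):
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


# Hypothetical noun-category sets for the poems of one style cluster.
CLUSTER_TRANSACTIONS = [
    {"landscape", "celestial"},
    {"landscape", "plant", "celestial"},
    {"landscape", "plant"},
    {"landscape", "celestial"},
]

RULES = mine_pair_rules(CLUSTER_TRANSACTIONS)
```

On this toy cluster, the rules that survive are those whose left-hand category always co-occurs with the right-hand one, for example "celestial" implying "landscape". Lowering `min_confidence` would admit weaker rules such as "landscape" implying "celestial".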
Appears in Collections: Thesis

Files in This Item:

  1. 350202.pdf
