Title: A Study of Temporal Relation Recognition in Chinese Texts (中文篇章中時間關係的辨識研究)
Authors: Hui-Ting Huang (黃慧庭)
Tyne Liang (梁婷)
Institute of Computer Science and Engineering
Keywords: Discourse Relation; Rhetorical Relation; Coherence Relation; Temporal Relation; Automatic Tagging; Automatic Recognition
Issue Date: 2007
Abstract: Discourse segments in running text are linked by various coherence relations, such as disjunctive, elaborative, conditional, and cause-and-effect relations. This thesis addresses the temporal relation, which is commonly expressed in Chinese written texts. It presents a statistical method for recognizing temporal relations among discourse segments and studies two learners, a Naive Bayes classifier and a C4.5 decision tree. Recognition relies on features mined from a corpus: position, whether the subject head noun is human, the main verb pair, temporal words, verbs preceding the main verb (verbs taking a verbal object), adverbs before and after the main verb, tense/aspect markers, the part-of-speech pair of the main verbs, and the consistency of clause focus. Experimental results show that every feature contributes positively to recognition under both approaches. For intra-sentential recognition, the C4.5 decision tree achieves a 95.39% F-score on a constructed corpus of 22842 positive and 15269 negative instances.
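As a rough illustration of the kind of feature-based classification the abstract describes, the following is a minimal sketch of a categorical Naive Bayes classifier with add-one smoothing, plus the usual F-score formula. The class name, feature names (`has_temporal_word`, `aspect_marker`), feature values, and toy training rows are all hypothetical and are not taken from the thesis; this record does not give the thesis's actual feature encoding, corpus, or implementation.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Naive Bayes over categorical features with add-one (Laplace) smoothing."""

    def fit(self, rows, labels):
        self.label_counts = Counter(labels)
        self.total = len(labels)
        # value_counts[label][feature][value] -> number of training rows
        self.value_counts = defaultdict(lambda: defaultdict(Counter))
        # distinct values seen per feature, used as the smoothing denominator
        self.values = defaultdict(set)
        for row, label in zip(rows, labels):
            for feature, value in row.items():
                self.value_counts[label][feature][value] += 1
                self.values[feature].add(value)
        return self

    def log_scores(self, row):
        """Return log P(label) + sum of log P(value | label) for each label."""
        scores = {}
        for label, n in self.label_counts.items():
            score = math.log(n / self.total)
            for feature, value in row.items():
                count = self.value_counts[label][feature][value]
                vocab = len(self.values[feature]) + 1  # +1 slot for unseen values
                score += math.log((count + 1) / (n + vocab))
            scores[label] = score
        return scores

    def predict(self, row):
        scores = self.log_scores(row)
        return max(scores, key=scores.get)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: each row describes one pair of discourse segments.
train_rows = [
    {"has_temporal_word": "yes", "aspect_marker": "le"},
    {"has_temporal_word": "yes", "aspect_marker": "none"},
    {"has_temporal_word": "no", "aspect_marker": "none"},
    {"has_temporal_word": "no", "aspect_marker": "none"},
]
train_labels = ["temporal", "temporal", "other", "other"]

model = CategoricalNaiveBayes().fit(train_rows, train_labels)
prediction = model.predict({"has_temporal_word": "yes", "aspect_marker": "le"})
print(prediction)
```

A C4.5 decision tree would consume the same kind of categorical rows; only the learner changes, which is presumably why the two approaches can be compared on one feature set.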
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009555581
http://hdl.handle.net/11536/39533
Appears in Collections: Thesis


Files in This Item:

  1. 558101.pdf
  2. 558102.pdf
  3. 558103.pdf

If a file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.