Title: 應用基因演算法於類別描述子之研究
A Genetic Approach to Class Descriptors
Authors: 郭純亨
Kuo Chun-Heng
梁婷
Liang Tyne
Institute of Computer Science and Engineering
Keywords: Class descriptors; Genetic algorithm; Information retrieval
Issue Date: 1998
Abstract: Document classification is essential for managing large document collections efficiently, and one factor that affects classification performance is the choice of class descriptors. The traditional method forms a class's descriptors by taking the union of the descriptors of all documents in that class. This can produce a large number of class descriptors, low similarity between class descriptors and document descriptors, and long computation time. We therefore propose a GA-based model that combines concepts from information retrieval (term weighting, similarity, and hit ratio) with the characteristics of genetic algorithms (exploration and exploitation) to extract suitable class descriptors. The experimental results indicate that, with the first fitness function, the proposed model extracts class descriptors that have higher similarity to the document descriptors and require less storage space.
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT870394073
http://hdl.handle.net/11536/64216
Appears in Collections: Thesis