Title: | Variable Precision Rough Sets Theory and Its Application (变数精准粗略集之理论与应用) |
Authors: | Jyh-Hwa Hsu (许志华); Chao-Ton Su (苏朝墩); Department of Industrial Engineering and Management |
Keywords: | data mining; Rough Set Theory (RST); β-reduct; discretization; Chi2 algorithm |
Issue date: | 2003 |
Abstract: | Variable Precision Rough Sets (VPRS) theory is an important data mining tool and has been widely applied to knowledge acquisition in many domains. However, VPRS theory cannot be applied directly to classification problems whose data contain continuous attributes; a discretization method is needed to pre-process the data. VPRS theory also lacks a suitable method for determining the value of the precision parameter (β) that controls the choice of β-reducts. This study first proposes a new algorithm, the extended Chi2 algorithm, which builds on the Chi2 algorithm and improves it by calculating the pre-defined misclassification rate (δ) from the training data itself. The study also proposes a method for selecting β-reducts. First, the precision parameter value is determined from the least upper bound of the data misclassification error and used to obtain subsets of the information system. Next, the quality of classification of each subset is measured and used to remove redundant attributes from it; each subset that remains after this removal is a β-reduct. Five numerical data sets are analyzed with the decision tree software See5, and the results show that the extended Chi2 algorithm performs better than the Chi2 algorithm. A simple example illustrates how the proposed β-reduct selection method is carried out, and a real-world medical case is analyzed and compared with a neural network approach; the proposed method shows better performance. Finally, a real application from the communication industry is analyzed: the modified VPRS theory is applied to reduce the Radio Frequency (RF) test items in mobile phone manufacturing. The results show that the number of test items is significantly reduced, and that inspection accuracy using the remaining test items is very close to that of the original test procedure. VPRS also performs better than the decision tree approach. Keywords: data mining, Rough Set Theory (RST), β-reduct, discretization, Chi2 algorithm. |
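The abstract's notion of measuring the quality of classification and dropping redundant attributes can be illustrated with a minimal sketch. It is not the thesis's procedure: it assumes the common VPRS convention in which β is an admissible misclassification error with 0 ≤ β < 0.5, and the function name `beta_quality`, the attribute names, and the toy records are all hypothetical.

```python
from collections import Counter, defaultdict


def beta_quality(rows, cond_attrs, decision, beta):
    """Share of objects lying in condition classes whose majority
    decision is wrong for at most a `beta` fraction of the class.

    Assumption: beta is an admissible error level in [0, 0.5).
    """
    if not 0.0 <= beta < 0.5:
        raise ValueError("beta must satisfy 0 <= beta < 0.5")

    # Group objects that are indiscernible on the chosen condition attributes.
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in cond_attrs)
        classes[key].append(row[decision])

    covered = 0
    for decisions in classes.values():
        majority_count = Counter(decisions).most_common(1)[0][1]
        error = 1 - majority_count / len(decisions)
        if error <= beta:
            covered += len(decisions)
    return covered / len(rows)


# Hypothetical toy information system (not data from the thesis).
rows = [
    {"a": 0, "b": 0, "d": "pass"},
    {"a": 0, "b": 0, "d": "pass"},
    {"a": 0, "b": 0, "d": "pass"},
    {"a": 0, "b": 0, "d": "fail"},
    {"a": 1, "b": 0, "d": "fail"},
    {"a": 1, "b": 1, "d": "fail"},
]

full = beta_quality(rows, ["a", "b"], "d", 0.25)
without_b = beta_quality(rows, ["a"], "d", 0.25)
# If dropping an attribute leaves the quality unchanged, that attribute
# is redundant at this beta in the sense used by this sketch.
b_is_redundant = (without_b == full)
```

In this toy data the quality at β = 0.25 is the same with and without attribute `b`, so `b` would count as redundant under the sketch's assumptions; at β = 0 the mixed class is excluded and the quality drops.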
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT009033809 http://hdl.handle.net/11536/38802 |
Appears in Collections: | Thesis |
Files in This Item:
- 380901.pdf
- 380902.pdf
- 380903.pdf
- 380904.pdf
- 380905.pdf
- 380906.pdf
- 380907.pdf
- 380908.pdf
- 380909.pdf
- 380910.pdf
- 380911.pdf
- 380912.pdf
- 380913.pdf
- 380914.pdf
- 380915.pdf
If it is a zip file, please download the file and unzip it, then open index.html in a browser to view the full text content.