Title: A Study of an Efficient Indexing Technology for Knowledge Systems (知識系統中快速索引機制之研究)
Authors: Wei-Chou Chen (陳威州)
Shian-Shyong Tseng (曾憲雄)
Institute of Computer Science and Engineering
Keywords: Knowledge Discovery; Bit-wise Indexing; Data Mining; Pattern Match; Feature Selection; Knowledge Acquisition
Issue Date: 2004
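Abstract: In recent years, knowledge discovery systems (systems for Knowledge Discovery in Databases) have drawn growing attention as information technology has advanced and spread, and related applications and research have followed. The aim is to use knowledge discovery from databases to take the transaction and manufacturing data that enterprises accumulate and, through data mining, find business intelligence and behavior patterns, thereby building up enterprise knowledge. Because the volume of data accumulated during business operations is very large, performing data mining in a timely manner and producing useful knowledge has become an important issue. In this thesis, we propose a data indexing technology suited to knowledge systems and databases: Bit-wise Indexing Technology. Within this technology we propose three indexing methods, namely the Simple Bit-wise Indexing Method, the Encapsulated Bit-wise Indexing Method, and the Compacted Bit-wise Indexing Method, which can handle both continuous and non-continuous data. We also propose binary index encoding and data search algorithms to reduce the processing time needed to search large amounts of data. To verify the efficiency, flexibility, and practical usability of the proposed technology, we apply it to four knowledge system domains: reinforcement learning, pattern learning, supervised learning, and unsupervised-learning data mining knowledge systems. The four practical systems are a genetic-algorithm-based manufacturing defect detection system for product defects caused by process-time problems in manufacturing; the mining and matching of intrusion patterns in a network intrusion detection system, to improve system flexibility and efficiency; a data-driven rough-set feature selection technique used in a knowledge acquisition system, to save execution time; and a data mining system for defect detection in semiconductor manufacturing, to improve system performance. The data mining system for defect detection in semiconductor manufacturing has been formally incorporated by Taiwan Semiconductor Manufacturing Company into the yield improvement subsystem of its intelligent engineering data analysis system, to make yield improvement more efficient. The data-driven rough-set feature selection technique has been applied in a customer relationship management project at an international life insurer to extract a candidate list of features of returning policy-loan customers, in order to increase revenue.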
Knowledge Discovery in Databases (KDD) has grown rapidly in recent years as IT and AI technologies have become widely discussed and researched. Related research, applications, and tool development are increasingly common in business, science, government, and academia. In worldwide enterprises in particular, KDD systems are applied to discover useful business intelligence and customer behavior patterns through data mining. However, because the quantity of data in such enterprises grows continuously and rapidly, discovering useful information correctly and efficiently is becoming a significant issue. In this thesis, we propose an efficient indexing technology for knowledge and database systems, called Bit-wise Indexing Technology. It comprises three indexing models: the Simple Bit-wise Indexing Method, the Encapsulated Bit-wise Indexing Method, and the Compacted Bit-wise Indexing Method. The corresponding indexing and matching algorithms for these models are also proposed. To demonstrate the suitability, flexibility, and efficiency of the proposed indexing methods, we apply them to four kinds of KDD applications: reinforcement learning, pattern matching, supervised learning, and unsupervised-learning data mining. To enhance system performance, the simple bit-wise indexing method is applied to the time aspect of the manufacturing defect detection problem (MDDP-t) in manufacturing domains. To improve flexibility and accuracy, the encapsulated bit-wise indexing method is applied to the pattern matching module of an Internet intrusion detection system. To reduce processing time, the compacted bit-wise indexing method is applied to data-driven rough-set-based feature selection.
Additionally, the proposed feature selection method was adopted in a knowledge acquisition (KA) project to discover the feature sets needed to construct a CBR system for the loan promotion function of a worldwide financial group's customer relationship management system. In the last application, the three proposed methods are applied in combination to the data mining module of a defect detection mechanism in a semiconductor manufacturing system to improve accuracy and usability. The proposed method was officially employed in the Yield Explorer function of the Intelligent Engineering Data Analysis (iEDA) system at Taiwan Semiconductor Manufacturing Company (TSMC) for root cause detection of manufacturing defects and yield enhancement.
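The abstract does not spell out how the three indexing methods encode data, so the following is only a generic illustration of the bit-wise indexing idea, not the thesis's algorithm. It assumes one bit vector per distinct attribute value (stored as a Python integer), with bit i set when record i holds that value; the function names and the sample records are hypothetical.

```python
from collections import defaultdict


def build_bitmap_index(records, attribute):
    """Map each distinct value of `attribute` to an int used as a bit vector.

    Bit i of the vector is set when record i holds that value.
    (Hypothetical helper, not taken from the thesis.)
    """
    index = defaultdict(int)
    for row_id, record in enumerate(records):
        index[record[attribute]] |= 1 << row_id
    return dict(index)


def matching_rows(bitmap):
    """Decode a bit vector back into an ascending list of row ids."""
    rows = []
    row_id = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row_id)
        bitmap >>= 1
        row_id += 1
    return rows


# Made-up sample data for illustration only.
records = [
    {"machine": "A", "defect": "yes"},
    {"machine": "B", "defect": "no"},
    {"machine": "A", "defect": "no"},
    {"machine": "A", "defect": "yes"},
]

machine_index = build_bitmap_index(records, "machine")
defect_index = build_bitmap_index(records, "defect")

# A conjunctive query becomes a single bitwise AND over two bit vectors.
hits = machine_index["A"] & defect_index["yes"]
print(matching_rows(hits))
```

In this sketch, a query such as "machine A and defective" is answered by one AND operation instead of a scan over every record, which is the kind of saving a bit-wise index aims for on large data sets.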
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT008823805
http://hdl.handle.net/11536/64223
Appears in Collections: Thesis


Files in This Item:

  1. 380501.pdf

If the file is a zip archive, download and unzip it, then open index.html in the extracted folder with a browser to view the full text.