Title: 植基於樹狀結構之XML資料查詢之索引方法
Tree Structure-based XML Data Query Using a Novel Index Method
Authors: 呂平祥
Ping-Hsiang Lu
李素瑛
Yuh-Yin Lee
Institute of Computer Science and Engineering
Keywords: indexing mechanism; semi-structured data; XML; indexing; structural join
Issue Date: 2003
Abstract: Semistructured databases, unlike relational and object-oriented databases, do not have a fixed schema that is known in advance and stored separately from the data. Generally speaking, semistructured data is self-describing and can model heterogeneous data more naturally than either the relational or the object-oriented data model. XML documents, in which tags embedded in the text describe the data, are one example. With the growing importance of XML in information exchange, much recent research has focused on flexible query facilities for extracting data from structured XML documents, and indexing XML documents has become an important issue for efficient XML querying. Some index methods decompose a query into multiple sub-queries and then join the results of these sub-queries to produce the final answer. Such approaches build indexes on paths or nodes in DTD (Document Type Definition) trees. Another approach uses tree structures as the basic unit of query, which avoids expensive join operations. We extend one of the methods that use tree structures as the basic unit of query, propose a new index mechanism to assist query processing, and modify its query algorithm to speed up retrieval. Experimental results confirm that the index mechanism speeds up queries.
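For readers unfamiliar with the decompose-and-join style of XML query processing that the abstract contrasts with tree-based querying, the following is a minimal sketch of a structural join over interval-numbered nodes. It does not reproduce the index proposed in the thesis. The numbering scheme, the function names region_encode and structural_join, and the sample document are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def region_encode(root):
    """Number elements depth-first and return {tag: [(start, end), ...]}.

    Element a is an ancestor of element d exactly when
    a.start < d.start and d.end < a.end. Lists are in document order.
    (Hypothetical helper, not taken from the thesis.)
    """
    index = defaultdict(list)
    counter = 0

    def visit(elem):
        nonlocal counter
        entry = [counter, None]          # start position assigned on entry
        counter += 1
        index[elem.tag].append(entry)    # pre-order append keeps lists sorted by start
        for child in elem:
            visit(child)
        entry[1] = counter               # end position assigned on exit
        counter += 1

    visit(root)
    return {tag: [tuple(e) for e in entries] for tag, entries in index.items()}


def structural_join(ancestors, descendants):
    """Return all (ancestor, descendant) pairs with one merge pass.

    Both inputs must be sorted by start position. The stack always holds
    a chain of nested ancestor intervals that are still open.
    """
    result, stack, a = [], [], 0
    for d in descendants:
        # Push every ancestor candidate that starts before d.
        while a < len(ancestors) and ancestors[a][0] < d[0]:
            while stack and stack[-1][1] < ancestors[a][0]:
                stack.pop()              # closed before the new candidate opens
            stack.append(ancestors[a])
            a += 1
        # Discard candidates that closed before d opens.
        while stack and stack[-1][1] < d[0]:
            stack.pop()
        result.extend((anc, d) for anc in stack)
    return result


# Sample document (an assumption for illustration only).
doc = ET.fromstring(
    "<lib><book><title>A</title><author><name>X</name></author></book>"
    "<book><title>B</title></book></lib>"
)
index = region_encode(doc)

# The path query book//name becomes two sub-queries (one per tag)
# whose results are combined by the join.
pairs = structural_join(index["book"], index["name"])
```

Each step of a path expression costs one such join, which is the overhead that methods treating a whole tree structure as the unit of query aim to avoid.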
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009117605
http://hdl.handle.net/11536/50469
Appears in Collections: Thesis