Title: 應用Trie結構動態時間扭曲法於台灣股市分鐘資料行為分析
Using Trie-structure Dynamic Time Warping on the Taiwan Stock Intra-day Index Time Series Analysis
Authors: 葉志彥
Yeh Chih-Yen
陳安斌
Chen An-Pin
Institute of Information Management
Keywords: Pattern Recognition; Time Series; Distance Measures; TAIEX; Cluster Analysis; Dynamic Time Warping Method
Issue Date: 2003
Abstract: Research recognized by the 2003 Nobel Prize in Economics shows that financial time series exhibit certain behaviors and regularities. These can be discovered not only with traditional statistical and mathematical tools but also with pattern recognition methods. Pattern recognition covers image processing, speech recognition, and trend analysis of time series, among other areas. Image processing and speech recognition are widely applied in business and entertainment, and advances in information technology, such as high-speed computation and high-volume storage, have made pattern recognition techniques more mature. Distance measures are an indispensable function in image processing and speech recognition, and the way a distance measure is defined often determines how well a pattern recognition technique performs. As a basic measurement tool, a distance measure is applied to clustering and similarity search. Dynamic time warping (DTW) is one such distance measure; because of its strong performance in speech recognition, Berndt and Clifford introduced it to data mining in 1994. Although DTW performs better in data analysis than traditional distance measures such as Euclidean distance, the time complexity of its algorithm makes the distance computation expensive, which limits its use in real-time processing. This research proposes an improved DTW, trie-structure DTW, and applies it together with hierarchical clustering to behavior prediction on intra-day minute data of the TAIEX (Taiwan Stock Exchange Capitalization Weighted Stock Index). Its prediction accuracy and processing time are compared with those of two other distance measures, classic DTW and Euclidean distance. The experiments show that trie-structure DTW predicts more accurately than Euclidean distance and about as accurately as classic DTW, while taking less processing time than classic DTW. Trie-structure DTW can therefore be applied to real-time financial prediction.
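For readers unfamiliar with the two baseline measures named in the abstract, the following is a minimal sketch of classic DTW next to Euclidean distance. It is not the thesis's trie-structure variant, which the abstract does not describe. The function names `euclidean` and `dtw`, the absolute-difference local cost, and the toy series are assumptions made for illustration only.

```python
import math


def euclidean(a, b):
    """Point-by-point Euclidean distance; both series must have equal length."""
    if len(a) != len(b):
        raise ValueError("Euclidean distance needs series of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Classic dynamic time warping distance (assumed local cost: |x - y|).

    Fills an (n+1) x (m+1) table, so time and memory grow as O(n * m).
    This quadratic cost is the kind of overhead the abstract refers to.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed warping moves.
            cost[i][j] = d + min(cost[i - 1][j],      # advance in a only
                                 cost[i][j - 1],      # advance in b only
                                 cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


if __name__ == "__main__":
    # Hypothetical toy series: the same bump, shifted by one time step.
    a = [0, 0, 1, 2, 1, 0]
    b = [0, 1, 2, 1, 0, 0]
    print("Euclidean:", euclidean(a, b))
    print("DTW:      ", dtw(a, b))
```

On these two toy series, Euclidean distance penalizes the one-step shift, while DTW aligns the two bumps and reports a distance of zero. That tolerance to time shifts is why DTW can match shapes better, at the price of the quadratic table.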
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009134512
http://hdl.handle.net/11536/58090
Appears in Collections: Thesis


Files in This Item:

  1. 451201.pdf
