Title: 以多維族群式共生進化演算法為基礎的改良式Q學習
Improved Q-learning Based on Multi-populations Symbiotic Evolution Algorithm
Authors: 張邱仁
林昇甫
Institute of Electrical and Control Engineering
Keywords: reinforcement learning; Q-learning; symbiotic evolution algorithm; TSK fuzzy model
Issue Date: 2007
Abstract: This thesis proposes an improved Q-learning method based on a multi-populations symbiotic evolution algorithm (IQ-MSE). Multi-populations symbiotic evolution (MSE) is derived from symbiotic evolution; unlike traditional symbiotic evolution, it maintains several populations. Each population is a set of chromosomes and represents one fuzzy rule. Learning in each population proceeds in six steps: initialization, fitness assignment, sorting, elite-based reproduction, crossover, and mutation. These steps are intended to remedy the shortcomings of traditional symbiotic evolution. In intelligent learning systems, exact training data for many real-world applications are expensive or even impossible to obtain, which makes reinforcement learning especially important. Q-learning is one of the well-known reinforcement learning algorithms, and the improved Q-learning (IQ), a two-stage Q-learning scheme, extends traditional Q-learning: it simplifies the Q-value look-up table and strengthens the mechanism that judges how well the system is being controlled, in order to obtain better performance. Simulations on an inverted pendulum system and a tandem pendulum system, compared against other methods, indicate that the proposed IQ-MSE performs better.
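The six steps named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract gives no operator details, so the real-valued chromosome encoding, the random-combination fitness averaging (borrowed from the general symbiotic evolution idea), the one-point crossover, the Gaussian mutation, and every function and parameter name below (`init_populations`, `assign_fitness`, `evolve_population`, `elite_count`, `mutation_rate`) are assumptions made for illustration.

```python
import random


def init_populations(num_rules, pop_size, chrom_len, rng):
    """Step 1, initialization: one population per fuzzy rule (assumed encoding:
    each chromosome is a list of real-valued rule parameters)."""
    return [[[rng.uniform(-1.0, 1.0) for _ in range(chrom_len)]
             for _ in range(pop_size)]
            for _ in range(num_rules)]


def assign_fitness(populations, evaluate, trials, rng):
    """Step 2, fitness assignment (assumed scheme): build complete rule sets by
    picking one chromosome from each population, score each rule set with the
    caller-supplied `evaluate`, and average the scores per chromosome."""
    totals = [[0.0] * len(p) for p in populations]
    counts = [[0] * len(p) for p in populations]
    for _ in range(trials):
        picks = [rng.randrange(len(p)) for p in populations]
        score = evaluate([p[i] for p, i in zip(populations, picks)])
        for k, i in enumerate(picks):
            totals[k][i] += score
            counts[k][i] += 1
    # Chromosomes that were never picked get a fitness of 0.0.
    return [[t / c if c else 0.0 for t, c in zip(ts, cs)]
            for ts, cs in zip(totals, counts)]


def evolve_population(population, fitness, elite_count=2,
                      mutation_rate=0.1, rng=None):
    """Steps 3 to 6 for a single population; returns a new population.
    Requires at least two chromosomes of length two or more."""
    rng = rng or random.Random()
    # Step 3, sorting: highest fitness first.
    ranked = [c for _, c in sorted(zip(fitness, population),
                                   key=lambda pair: pair[0], reverse=True)]
    # Step 4, elite-based reproduction: copy the best chromosomes unchanged.
    next_gen = [list(c) for c in ranked[:elite_count]]
    parents = ranked[:max(2, len(ranked) // 2)]
    while len(next_gen) < len(population):
        # Step 5, crossover: one-point crossover of two distinct parents.
        a, b = rng.sample(parents, 2)
        point = rng.randrange(1, len(a))
        child = a[:point] + b[point:]
        # Step 6, mutation: small Gaussian perturbation of some genes.
        child = [g + rng.gauss(0.0, 0.1) if rng.random() < mutation_rate else g
                 for g in child]
        next_gen.append(child)
    return next_gen
```

In a full system, `evaluate` would run the controller built from the selected fuzzy rules on the pendulum simulation and return its reinforcement signal; here it is left as a caller-supplied function because the abstract does not specify it.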
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009512615
http://hdl.handle.net/11536/38323
Appears in Collections: Thesis