Full metadata record
DC Field | Value | Language
dc.contributor.author | 何長安 | en_US
dc.contributor.author | Ho, Chang-An | en_US
dc.contributor.author | 林昇甫 | en_US
dc.contributor.author | Lin, Sheng-Fuu | en_US
dc.date.accessioned | 2014-12-12T01:27:33Z | -
dc.date.available | 2014-12-12T01:27:33Z | -
dc.date.issued | 2008 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#GT079612518 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/41837 | -
dc.description.abstract | 本論文係利用循序搜尋之概念進行類神經網路架構中的所有權值進行添加擾動量的動作,提出了一循序權值擾動於安全性增強式學習架構。並於擾動量添加完成後,對於添加擾動量前後進行優劣性評價,藉此達至權值更新動作。避免傳統擾動學習演算法易落入局部解或於解空間中某解附近產生振盪現象,而導致學習速度趨緩之問題。此外,於增強式學習架構中,利用受控體的能量概念定義學習目標狀態集合,透過此設計可大幅降低傳統增強式學習於解空間中過度搜尋較佳解之時間,即能迅速將受控體狀態控制於目標狀態集合中。於測試模擬中,利用n質量單擺系統模型進行人型機器人模擬測試,藉此證實本論文所提出的學習演算法效能表現較為彰顯。 | zh_TW
dc.description.abstract | This thesis proposes a sequential perturbation learning architecture based on safe reinforcement learning (SRL-SP), which uses the concept of sequential (linear) search to apply a perturbation to each weight of the neural network in turn. After the perturbations are applied, the network is evaluated before and after perturbation, and the weights are updated according to which performs better. This avoids the tendency of conventional perturbation learning algorithms to fall into local solutions or to oscillate around a solution in the solution space, both of which slow learning. In addition, within the reinforcement learning structure, Lyapunov design methods are used to set the learning objective as a predefined set of goal states. This design greatly reduces learning time; in other words, it rapidly guides the plant’s state into the goal-state set. In simulation, an n-mass inverted pendulum model is used to emulate a humanoid robot, and the results indicate that the proposed method learns more effectively. | en_US
dc.language.iso | zh_TW | en_US
dc.subject | 安全性增強式學習 | zh_TW
dc.subject | 權值擾動 | zh_TW
dc.subject | 循序搜尋 | zh_TW
dc.subject | safe reinforcement learning | en_US
dc.subject | weight-perturbation | en_US
dc.subject | sequential search | en_US
dc.title | 基於安全性增強式學習之循序擾動學習演算法 | zh_TW
dc.title | Safe Reinforcement Learning based Sequential Perturbation Learning Algorithm | en_US
dc.type | Thesis | en_US
dc.contributor.department | 電控工程研究所 | zh_TW
Appears in Collections: Thesis
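
The abstract describes perturbing each network weight in turn and keeping a change only if the perturbed network evaluates better. The following is a minimal sketch of that general idea, not the thesis's actual algorithm. The fixed step size `delta`, the try-plus-then-minus rule, and the stand-in `quadratic_cost` function (used here in place of a reinforcement-learning value evaluation) are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def sequential_perturbation_sweep(
    weights: Sequence[float],
    cost: Callable[[Sequence[float]], float],
    delta: float = 0.1,
) -> Tuple[List[float], float]:
    """Visit each weight in order; keep a perturbation only if it lowers the cost."""
    current = list(weights)
    best = cost(current)
    for i in range(len(current)):
        # Assumption: try a positive step first, then a negative one.
        for step in (delta, -delta):
            candidate = list(current)
            candidate[i] += step
            candidate_cost = cost(candidate)
            if candidate_cost < best:
                # The perturbed weights evaluate better, so accept them.
                current, best = candidate, candidate_cost
                break
    return current, best


def quadratic_cost(w: Sequence[float]) -> float:
    """Hypothetical stand-in for evaluating the network; the minimum is at TARGET."""
    TARGET = (1.0, -2.0, 0.5)
    return sum((wi - ti) ** 2 for wi, ti in zip(w, TARGET))


initial_cost = quadratic_cost([0.0, 0.0, 0.0])
weights = [0.0, 0.0, 0.0]
history = []
for _ in range(50):
    weights, value = sequential_perturbation_sweep(weights, quadratic_cost, delta=0.1)
    history.append(value)
```

Because a perturbation is accepted only when the cost strictly decreases, the values recorded in `history` never increase from one sweep to the next. The goal-state set and the Lyapunov-based safety design mentioned in the abstract are not modeled in this sketch.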