Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Lai, Tung-Yi | en_US |
dc.contributor.author | Hsueh, Chu-Hsuan | en_US |
dc.contributor.author | Lin, You-Hsuan | en_US |
dc.contributor.author | Chu, Yeong-Jia Roger | en_US |
dc.contributor.author | Hsueh, Bo-Yang | en_US |
dc.contributor.author | Wu, I-Chen | en_US |
dc.date.accessioned | 2020-05-05T00:02:00Z | - |
dc.date.available | 2020-05-05T00:02:00Z | - |
dc.date.issued | 2019-01-01 | en_US |
dc.identifier.isbn | 978-1-7281-4666-9 | en_US |
dc.identifier.issn | 2376-6816 | en_US |
dc.identifier.uri | http://hdl.handle.net/11536/154061 | - |
dc.description.abstract | This paper proposes a deep reinforcement learning algorithm for robotic tasks such as grasping objects. The algorithm combines cross-entropy optimization (CE) with deep deterministic policy gradient (DDPG). Specifically, whereas the CE method first samples from a Gaussian distribution with an initial mean of zero, we set the initial mean to DDPG's output instead. The resulting algorithm is referred to as the DDPG-CE method. Next, to negate the effects of bad samples, we improve on DDPG-CE by replacing the CE component with a weighted CE method, resulting in the DDPG-WCE algorithm. Experiments show that DDPG-WCE achieves a higher success rate on grasping previously unseen objects than other approaches, such as supervised learning, DDPG, CE, and DDPG-CE. | en_US |
dc.language.iso | en_US | en_US |
dc.subject | reinforcement learning | en_US |
dc.subject | robotics | en_US |
dc.subject | object grasping | en_US |
dc.subject | deep deterministic policy gradient | en_US |
dc.subject | cross-entropy method | en_US |
dc.title | Combining Deep Deterministic Policy Gradient with Cross-Entropy Method | en_US |
dc.type | Proceedings Paper | en_US |
dc.identifier.journal | 2019 INTERNATIONAL CONFERENCE ON TECHNOLOGIES AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE (TAAI) | en_US |
dc.citation.spage | 0 | en_US |
dc.citation.epage | 0 | en_US |
dc.contributor.department | 資訊工程學系 | zh_TW |
dc.contributor.department | Department of Computer Science | en_US |
dc.identifier.wosnumber | WOS:000524126200080 | en_US |
dc.citation.woscount | 0 | en_US |
Appears in Collections: | Conference Papers |
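
The abstract describes seeding the cross-entropy (CE) search with DDPG's output rather than a zero mean, and then weighting the samples. A minimal sketch of that idea follows, under stated assumptions: the record does not give the paper's weighting scheme, sample counts, or scoring function, so the function `cem_refine`, its parameters, the softmax-style weights, and the toy score below are all hypothetical illustrations and not the authors' implementation. In practice the score might come from a learned critic and the initial mean from an actor network; here both are stand-ins.

```python
import numpy as np

def cem_refine(score_fn, init_mean, sigma=0.5, n_samples=64, n_elite=8,
               n_iters=5, weighted=False, seed=0):
    """Refine an action with the cross-entropy method.

    init_mean plays the role of DDPG's output (assumption: a plain vector).
    weighted=True uses score-based elite weights (an assumed scheme).
    """
    rng = np.random.default_rng(seed)
    mean = np.asarray(init_mean, dtype=float)
    std = np.full_like(mean, sigma)
    for _ in range(n_iters):
        # Sample candidate actions around the current mean.
        samples = rng.normal(mean, std, size=(n_samples, mean.size))
        scores = np.array([score_fn(s) for s in samples])
        # Keep the highest-scoring candidates (the elites).
        elite_idx = np.argsort(scores)[-n_elite:]
        elites, elite_scores = samples[elite_idx], scores[elite_idx]
        if weighted:
            # Assumed weighting: softmax over elite scores.
            w = np.exp(elite_scores - elite_scores.max())
            w /= w.sum()
        else:
            # Plain CE: every elite counts equally.
            w = np.full(n_elite, 1.0 / n_elite)
        # Refit the Gaussian to the (weighted) elites.
        mean = w @ elites
        std = np.sqrt(w @ (elites - mean) ** 2) + 1e-6
    return mean

# Toy usage: the score peaks at a hypothetical best action.
best_action = np.array([0.3, -0.2])
actor_output = np.array([0.25, -0.1])  # stand-in for DDPG's output
toy_score = lambda a: -float(np.sum((a - best_action) ** 2))
print(cem_refine(toy_score, actor_output, weighted=True))
```

Starting the search at the actor's output instead of zero means the first batch of samples is already concentrated near a plausible action, which is the motivation the abstract gives for DDPG-CE.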