Abstract
In this paper we present a new concept of self-reflection learning to support a deep reinforcement learning model. The self-reflective process occurs offline, between episodes, and helps the agent learn to navigate towards a goal location, boosting its online performance. In particular, the best experience so far is recalled and compared with similar but suboptimal episodes, reemphasizing worthy decisions and deemphasizing unworthy ones using eligibility and learning traces. At the same time, relatively bad experience is forgotten to remove its confusing effect. We set up a layer-wise deep actor-critic architecture and apply the self-reflection process to help train it. We show that the self-reflective model works well, and initial experimental results on a real robot show that the agent achieves a good success rate in reaching a goal location.
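The between-episode mechanism described in the abstract can be illustrated with a minimal sketch. This is a hypothetical example, not the paper's method: it assumes a small 5x5 gridworld, a tabular actor-critic with eligibility traces instead of the layer-wise deep architecture, and illustrative hyperparameters; the environment (`step`), the reflection routine (`self_reflect`), and all constants are made up for illustration. After each online episode, the best trajectory so far is replayed against weaker ones to strengthen its decisions and weaken conflicting ones, and the worst episodes are forgotten.

```python
import numpy as np

np.random.seed(0)

N_STATES, N_ACTIONS = 25, 4          # hypothetical 5x5 gridworld, 4 moves
GOAL = 24
ALPHA, BETA, GAMMA, LAMBDA = 0.1, 0.05, 0.95, 0.9   # illustrative hyperparameters

H = np.zeros((N_STATES, N_ACTIONS))  # actor: action preferences (softmax policy)
V = np.zeros(N_STATES)               # critic: state values
episodes = []                        # memory of (trajectory, return) pairs

def step(s, a):
    """Deterministic gridworld transition; stands in for the robot environment."""
    r, c = divmod(s, 5)
    if a == 0:   r = max(r - 1, 0)
    elif a == 1: r = min(r + 1, 4)
    elif a == 2: c = max(c - 1, 0)
    else:        c = min(c + 1, 4)
    s2 = r * 5 + c
    return s2, (1.0 if s2 == GOAL else -0.01), s2 == GOAL

def policy(s):
    p = np.exp(H[s] - H[s].max())
    p /= p.sum()
    return np.random.choice(N_ACTIONS, p=p)

def run_episode(max_steps=100):
    """Online actor-critic episode with eligibility traces."""
    global H, V
    s, traj, ret = 0, [], 0.0
    e_v, e_h = np.zeros_like(V), np.zeros_like(H)
    for _ in range(max_steps):
        a = policy(s)
        s2, r, done = step(s, a)
        delta = r + (0.0 if done else GAMMA * V[s2]) - V[s]
        e_v *= GAMMA * LAMBDA; e_v[s] += 1.0
        e_h *= GAMMA * LAMBDA; e_h[s, a] += 1.0
        V += ALPHA * delta * e_v
        H += BETA * delta * e_h
        traj.append((s, a)); ret += r; s = s2
        if done:
            break
    episodes.append((traj, ret))
    return ret

def self_reflect():
    """Offline self-reflection between episodes (sketch of the idea only):
    replay the best episode so far against weaker ones, reemphasize its
    decisions, deemphasize conflicting ones, then forget bad episodes."""
    if len(episodes) < 2:
        return
    best_traj, best_ret = max(episodes, key=lambda e: e[1])
    best_actions = dict(best_traj)                   # state -> action taken in best run
    for traj, ret in episodes:
        if ret >= best_ret:
            continue
        for s, a in traj:
            if s in best_actions and a != best_actions[s]:
                H[s, best_actions[s]] += BETA        # reemphasize the better decision
                H[s, a] -= BETA                      # deemphasize the conflicting one
    episodes.sort(key=lambda e: e[1], reverse=True)  # forget the worst experiences
    del episodes[10:]

for _ in range(200):
    run_episode()
    self_reflect()
print("final return:", run_episode())
```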
Original language | English |
---|---|
Pages | 4565 - 4570 |
DOIs | |
Publication status | Published - 3 Nov 2016 |
Event | 2016 International Joint Conference on Neural Networks, Vancouver, Canada. Duration: 24 Jul 2016 → 29 Jul 2016 |
Conference
Conference | 2016 International Joint Conference on Neural Networks |
---|---|
Abbreviated title | IJCNN |
Country/Territory | Canada |
City | Vancouver |
Period | 24/07/16 → 29/07/16 |
Bibliographical note
© 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Keywords
- robot navigation
- self-reflective deep reinforcement learning
- deep learning
- actor-critic
- neural networks