Abstract
This work describes a fast-learning, goal-aware robot navigation model that employs both gradient and conjugate gradient Temporal Difference (TD, TD-conj) methods. It builds on the fact that TD-conj was proven to be equivalent, under certain conditions, to a gradient TD method with a variable λ. Based on a straightforward feature extraction process combined with goal-aware capabilities provided by a whole-image measure, the model solves what we call the u-turn-homing benchmark problem without using landmarks. Only one goal snapshot was used, taken with the agent facing the goal directly. A novel threshold stopping formula, which is less sensitive to the agent-goal orientation problem, was therefore used to recognize the goal. Unlike other models, this model refrains from artificially manipulating the environment or assuming a priori knowledge about it, two constraints that widely restrict the applicability of existing models in realistic scenarios. An on-line control method was used to train a set of neural networks. With the aid of variable and fixed eligibility traces, these networks approximate the agent’s action-value function, allowing it to take close-to-optimal actions to reach its home. The effectiveness of the model was experimentally verified on an agent.
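As a rough illustration of the kind of on-line update the abstract refers to, the following minimal sketch performs one gradient Sarsa(λ) step with an accumulating eligibility trace. It rests on several assumptions that are not taken from the paper: a linear action-value estimate stands in for the neural networks, the function name `sarsa_lambda_step` and all parameter names are hypothetical, and the per-step `lam` argument only indicates where a variable λ would enter. The rule by which TD-conj determines λ is not reproduced here.

```python
def sarsa_lambda_step(w, z, phi, phi_next, reward, gamma, alpha, lam, done):
    """One on-line Sarsa(lambda) update for a linear estimate q(s, a) = w . phi(s, a).

    Hypothetical helper for illustration only; a linear model replaces the
    neural networks mentioned in the abstract.
    """
    # Current and next action-value estimates (next value is 0 at episode end).
    q = sum(wi * fi for wi, fi in zip(w, phi))
    q_next = 0.0 if done else sum(wi * fi for wi, fi in zip(w, phi_next))

    # TD error for the observed transition.
    delta = reward + gamma * q_next - q

    # Accumulating eligibility trace; lam may differ from step to step
    # (variable lambda) or stay constant (fixed lambda).
    z = [gamma * lam * zi + fi for zi, fi in zip(z, phi)]

    # Gradient step on the weights along the trace.
    w = [wi + alpha * delta * zi for wi, zi in zip(w, z)]
    return w, z, delta


# Two transitions with different lambda values on a two-feature toy problem.
w, z = [0.0, 0.0], [0.0, 0.0]
w, z, delta = sarsa_lambda_step(w, z, [1.0, 0.0], [0.0, 1.0], 1.0, 0.9, 0.5, 0.8, False)
w, z, delta = sarsa_lambda_step(w, z, [0.0, 1.0], [0.0, 0.0], 2.0, 0.9, 0.5, 0.3, True)
```

Passing `lam` per call keeps the fixed-trace and variable-trace cases in one function, so the two can be compared by changing only the schedule of λ values.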
Original language | English |
---|---|
Pages | 1534-1541 |
Publication status | Published - Jul 2014 |
Event | Neural Networks, 2014 International Joint Conference - Beijing, China. Duration: 6 Jul 2014 → 11 Jul 2014 |
Conference
Conference | Neural Networks, 2014 International Joint Conference |
---|---|
Abbreviated title | IJCNN |
Country/Territory | China |
City | Beijing |
Period | 6/07/14 → 11/07/14 |
Bibliographical note
© 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Keywords
- TD-conj
- Home Aware
- Variable λ TD
- U-Turn-Homing
- Orientation Insensitive Thresholding