Learning reactive behaviors with constructive neural networks in mobile robotics

University dissertation from Örebro : Örebro universitetsbibliotek

Abstract: This thesis investigates a learning system for acquiring robot behaviors by mapping sensor information directly to motor actions. It addresses the integration of three learning paradigms: unsupervised learning (UL), supervised learning (SL), and reinforcement learning (RL). The approach is characterized by the use of constructive artificial neural networks (ANNs). The sensor-motor mappings acquired by the learning system form part of a tight "sense-learn-act" cycle, as opposed to "sense-plan-act", thus allowing the robot to learn concepts within its own sensorimotor experience while avoiding anthropomorphic bias.

Novel techniques for robot learning using constructive radial basis function (RBF) networks are introduced. These lead to a self-organizing, incremental, and local construction of the sensorimotor space, so that different behaviors can be learned with the same basic architecture, which greatly simplifies the engineering design of the ANN's structure. Integration of the different learning paradigms takes place in a two-layer learning architecture.

The lower layer, with the UL and SL paradigms, is used to quickly construct a controller for the required behavior. The upper layer, with the RL paradigm, is used to tune and refine the controller resulting from the lower layer (or a controller obtained from other prior knowledge) to further improve the robustness and performance of the behavior. Both layers apply constructive RBF networks, taking into account the different requirements of the respective learning paradigms.

The learning system is verified by a number of experiments on a real robot. We begin our experiments with the lower layer together with a teaching-by-demonstration approach for acquiring different behaviors. The experimental results show that the lower layer can learn a wide range of robot behaviors, thus demonstrating the task non-specific nature of the architecture.

We then demonstrate the necessity of the layered learning architecture for more complex behaviors using a docking behavior that requires precise positioning at a goal location. The results show that learning with only the lower layer cannot achieve robust performance and optimal trajectories, while learning with only the upper layer is impractical and even infeasible on the real robot due to the slow learning process of the RL paradigm. With layered learning, however, the upper layer is sped up by bootstrapping from the learned controller in the lower layer, and a robust and time-optimal controller can be obtained.
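
As a rough illustration of what "constructive" can mean for an RBF network that maps sensor readings to motor commands, the following minimal sketch adds a new unit whenever no existing unit responds strongly to the current input, and otherwise adjusts the outputs of nearby units locally. It is not the algorithm of the thesis: the class name `ConstructiveRBF`, the novelty threshold, the fixed Gaussian width, the delta-rule update, and the two-sensor, two-motor example data are all assumptions made for this example.

```python
import math


class ConstructiveRBF:
    """Toy constructive RBF network (illustrative assumption, not the thesis's method)."""

    def __init__(self, width=0.5, novelty_threshold=0.6, learning_rate=0.3):
        self.width = width                          # shared Gaussian width (assumed fixed)
        self.novelty_threshold = novelty_threshold  # below this activation, input counts as novel
        self.learning_rate = learning_rate
        self.centers = []  # one sensor-space center per unit
        self.weights = []  # one motor-output vector per unit

    def _activations(self, x):
        # Gaussian response of every unit to the sensor vector x.
        return [
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, c)) / (2 * self.width ** 2))
            for c in self.centers
        ]

    def predict(self, x):
        # Normalized weighted sum of unit outputs; None if no unit responds at all.
        acts = self._activations(x)
        total = sum(acts)
        if total == 0.0:
            return None
        dim = len(self.weights[0])
        return [sum(a * w[i] for a, w in zip(acts, self.weights)) / total
                for i in range(dim)]

    def train(self, x, target):
        # Constructive step: allocate a unit when the input is not yet covered.
        acts = self._activations(x)
        if not acts or max(acts) < self.novelty_threshold:
            self.centers.append(list(x))
            self.weights.append(list(target))
            return
        # Local step: move the outputs of responding units toward the demonstrated target.
        total = sum(acts)
        pred = self.predict(x)
        for a, w in zip(acts, self.weights):
            share = a / total
            for i in range(len(w)):
                w[i] += self.learning_rate * share * (target[i] - pred[i])


# Hypothetical demonstration data: (left, right) range readings -> (left, right) wheel speeds.
net = ConstructiveRBF()
net.train([0.0, 0.0], [1.0, 1.0])  # first sample creates a unit
net.train([2.0, 0.0], [0.2, 1.0])  # far from the first unit, so a second unit is created
net.train([0.1, 0.0], [1.0, 1.0])  # close to the first unit, so only a local update happens
print(len(net.centers))
```

Because units are created only where demonstrations actually occur, the network's size follows the robot's own sensorimotor experience and does not have to be fixed in advance, which is the property the abstract emphasizes.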
