Data-Efficient Representation Learning for Grasping and Manipulation

Abstract: General-purpose robots must adapt to environmental variation and therefore need effective representations for programming them. A common way to acquire such representations is through machine learning, which has shown great success in computer vision, natural language processing, reinforcement learning, and robotics. However, this success relies on the availability of large datasets, and data are hard to generate for most robotic tasks; robotics therefore requires data efficiency. This licentiate thesis presents our initial progress on learning effective representations for robotics from small amounts of data. Specifically, we use neural fields as our choice of generative model and show that we can learn local surface models for grasp synthesis and joint scene-motion models for motion generation. In the former work, we use local surface models as correspondences for grasp transfer. Unlike previous work, our method can transfer grasps demonstrated on a single object to novel objects, including ones from unseen categories, while achieving higher spatial accuracy. In the latter work, we show that scenes and motions can be modeled as smooth joint functions of shared embeddings. Unlike previous work, our approach requires fewer expert demonstrations yet still generates precise motion trajectories.