Study and implementation of stereo vision systems for robotic applications

University dissertation from Democritus University of Thrace

Abstract: Natural selection has favored stereo vision as the most common way to estimate the depth of objects. A pair of two-dimensional images is enough to recover the third dimension of the observed scene. The method matters not only to living creatures but also to sophisticated machine systems. In recent years robotics has made significant progress, and the state of the art now concerns achieving autonomous behaviors. For robots to move and act autonomously, accurate representations of their environments are required. Both fields, stereo vision and autonomous robotic behavior, are at the center of this PhD thesis. The use of machine stereo vision by robots is not a new topic. The number and standing of the researchers involved, as well as the publication rate of relevant scientific papers, indicate a topic that is interesting and still open to solutions and fresh ideas, rather than a banal, solved one. The motivation of this PhD thesis is the observation that stereo vision and autonomous robots are usually combined simplistically, as two independent technologies used side by side. This is because the two technologies have evolved independently, within different scientific communities. Stereo vision has mainly evolved within the field of computer vision, whereas autonomous robots are a branch of robotics and mechatronics. Methods proposed within computer vision are generally not satisfactory for robotic applications, because an autonomous robot places strict constraints on the required speed of calculation and the available computational resources. Moreover, their inefficiency is commonly due to factors related to the environments and conditions of operation.
As a result, the algorithms used, in this case the stereo vision algorithms, should take these factors into consideration during their development. The required compromises have to retain the functionality of the integrated system. The objective of this PhD thesis is the development of stereo vision systems customized for use in autonomous robots. Initially, a literature survey was conducted on stereo vision algorithms and the corresponding robotic applications. The survey revealed the state of the art in the field and pointed out issues that had not yet been answered satisfactorily. Afterwards, novel stereo vision algorithms were developed that satisfy the demands posed by robotic systems and propose solutions to the open issues identified by the survey. Finally, systems that embody the proposed algorithms and address open issues in robotic applications were developed. In this dissertation, various computational tools and ideas originating from different scientific fields are used for the first time and combined in a novel way. These include biologically and psychologically inspired methods, such as the logarithmic response law (Weber-Fechner law) and the gestalt laws of perceptual organization (proximity, similarity and continuity). They also include sophisticated computational methods, such as 2D and 3D cellular automata and fuzzy inference systems for computer vision applications. Additionally, ideas from the field of video coding have been incorporated into stereo vision applications. The resulting methods have been applied both to basic computer vision depth extraction applications and to advanced autonomous robotic behaviors. In more detail, the feasibility of effective, hardware-implementable stereo correspondence algorithms has been investigated.
Specifically, an algorithm is presented that combines rapid execution, a simple and straightforward structure, and high-quality results. These features make it an ideal candidate for hardware implementation and for real-time applications. The algorithm uses Gaussian aggregation weights and 3D cellular automata to achieve high-quality results. This algorithm formed the basis of a multi-view stereo vision system. The final depth map is produced by a certainty assessment procedure. Moreover, a new hierarchical correspondence algorithm is presented, inspired by motion estimation techniques originally used in video encoding. The algorithm performs a 2D correspondence search using a similar hierarchical search pattern, and the intermediate results are refined by 3D cellular automata. This algorithm can process uncalibrated and non-rectified stereo image pairs while keeping the computational load within reasonable levels. It is well known that non-ideal environmental conditions, such as viewpoint-dependent differences in illumination, heavily affect the performance of stereo algorithms. In this PhD thesis a new illumination-invariant pixel dissimilarity measure is presented that can substitute for the established intensity-based ones. The proposed measure can be adopted by almost any existing stereo algorithm, lending it robustness. The algorithm using the proposed dissimilarity measure outperformed all the other examined algorithms, exhibiting tolerance to illumination differences and robust behavior. Moreover, a novel stereo correspondence algorithm is presented that incorporates many biologically and psychologically inspired features into an adaptive weighted sum of absolute differences framework. In addition to ideas already exploited, such as the use of color information and the gestalt laws of proximity and similarity, new ones have been adopted.
The algorithm introduces the use of circular support regions, the gestalt law of continuity, and the psychophysically based logarithmic response law. All the aforementioned perceptual tools act complementarily inside a straightforward computational algorithm. Furthermore, stereo correspondence algorithms have been exploited as the basis of more advanced robotic behaviors. Vision-based obstacle avoidance algorithms for autonomous mobile robots are presented. These algorithms avoid computationally complex processes as much as possible. The only sensor required is a stereo camera. The algorithms consist of two building blocks. The first is a stereo algorithm able to provide reliable depth maps of the scene at frame rates suitable for a robot to move autonomously. The second is either a simple decision-making algorithm or a fuzzy logic-based one, which analyzes the depth maps and deduces the most appropriate direction for the robot to avoid any existing obstacles. Finally, a visual Simultaneous Localization and Mapping (SLAM) algorithm suitable for indoor applications is proposed. The algorithm focuses on computational efficiency, and the only sensor used is a stereo camera mounted onboard a moving robot. The algorithm processes the acquired images, calculating the depth of the scene, detecting occupied areas and progressively building a map of the environment. The stereo vision-based SLAM algorithm embodies a custom-tailored stereo correspondence algorithm, the robust scale- and rotation-invariant "Speeded Up Robust Features" (SURF) feature detection and matching method, a computationally efficient v-disparity image calculation scheme, a novel map-merging module, and a sophisticated cellular automata-based enhancement stage.
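
As an illustration of the v-disparity image mentioned above, the following minimal sketch shows the textbook construction: one histogram of disparity values per image row. It is not the computationally efficient scheme developed in the thesis, whose details the abstract does not give. The function name `v_disparity`, the `max_disparity` parameter and the toy disparity map are assumptions made for this example only.

```python
from typing import List


def v_disparity(disparity_map: List[List[int]], max_disparity: int) -> List[List[int]]:
    """Build a v-disparity image (hypothetical helper, textbook definition).

    Entry [v][d] counts the pixels in image row v whose disparity equals d.
    """
    image = []
    for row in disparity_map:
        counts = [0] * (max_disparity + 1)
        for d in row:
            # Ignore invalid disparities (negative or out of range).
            if 0 <= d <= max_disparity:
                counts[d] += 1
        image.append(counts)
    return image


# Toy 4x6 disparity map (assumed data): a ground plane whose disparity
# grows towards the bottom rows, plus an obstacle with constant
# disparity 3 occupying the two middle columns.
toy_map = [
    [0, 0, 3, 3, 0, 0],
    [1, 1, 3, 3, 1, 1],
    [2, 2, 3, 3, 2, 2],
    [3, 3, 3, 3, 3, 3],
]

vd = v_disparity(toy_map, max_disparity=3)
for counts in vd:
    print(counts)
```

In the resulting image the ground plane appears as a diagonal line (one dominant disparity per row, increasing with the row index), while the obstacle appears as a vertical line at disparity 3. This separation is what makes the representation convenient for detecting occupied areas.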
