Motion and Structure Estimation From Video

University dissertation from Linköping : Linköping University Electronic Press

Abstract: Digital camera equipped cell phones were introduced in Japan in 2001; they quickly became popular and by 2003 outsold the entire stand-alone digital camera market. In 2010 sales passed one billion units and the market is still growing. Another trend is the rising popularity of smartphones, which has led to rapid development of the processing power on a phone, and many units sold today bear close resemblance to a personal computer. The combination of a powerful processor and a camera that is easily carried in your pocket opens up a large field of interesting computer vision applications.

The core contribution of this thesis is the development of methods that allow an imaging device such as the cell phone camera to estimate its own motion and to capture the observed scene structure. One of the main focuses of this thesis is real-time performance, where a real-time constraint does not only result in shorter processing times, but also allows for user interaction.

In computer vision, structure from motion refers to the process of estimating camera motion and 3D structure by exploring the motion in the image plane caused by the moving camera. This thesis presents several methods for estimating camera motion. Given the assumption that a set of images has known camera poses associated with them, we train a system to solve the camera pose very fast for a new image. For the cases where no a priori information is available, a fast minimal case solver is developed. The solver uses five points in two camera views to estimate the cameras' relative position and orientation. This type of minimal case solver is usually used within a RANSAC framework. In order to increase accuracy and performance, a refinement to the random sampling strategy of RANSAC is proposed. It is shown that the new scheme doubles the performance for the five point solver used on video data.
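The role of a minimal case solver inside a RANSAC framework can be illustrated with a small sketch. It is not the method of the thesis: a two-point line fit stands in for the five point solver, samples are drawn uniformly (the refined sampling strategy proposed in the thesis is not reproduced), and the names `fit_line` and `ransac_line` as well as the iteration count and threshold are assumptions made for illustration only.

```python
import random


def fit_line(p, q):
    """Minimal solver (stand-in): line a*x + b*y + c = 0 through two points.

    The normal (a, b) is scaled to unit length so that |a*x + b*y + c|
    is the point-to-line distance.
    """
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    if norm == 0.0:
        return None  # degenerate sample: both points coincide
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)


def ransac_line(points, iterations=200, threshold=0.05, seed=0):
    """Plain RANSAC with uniform sampling (hypothetical parameter values)."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        # 1. Draw a minimal sample and hypothesize a model from it.
        p, q = rng.sample(points, 2)
        model = fit_line(p, q)
        if model is None:
            continue
        a, b, c = model
        # 2. Score the hypothesis by counting points that agree with it.
        inliers = [pt for pt in points
                   if abs(a * pt[0] + b * pt[1] + c) < threshold]
        # 3. Keep the hypothesis with the largest consensus set.
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

In a structure from motion setting the sample would instead be five point correspondences, the model a relative camera pose, and the score a reprojection or epipolar error. The loop structure stays the same, which is why the cost of the minimal solver and the quality of the sampling strategy dominate overall performance.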
For larger systems of cameras, a new Bundle Adjustment method is developed which is able to handle video from cell phones.

Demands for reduction in size, power consumption and price have led to a redesign of the image sensor. As a consequence the sensors have changed from a global shutter to a rolling shutter, where a rolling shutter image is acquired row by row. Classical structure from motion methods are modeled on the assumption of a global shutter, and a rolling shutter can severely degrade their performance. One of the main contributions of this thesis is a new Bundle Adjustment method for cameras with a rolling shutter. The method accurately models the camera motion during image exposure with an interpolation scheme for both position and orientation.

The developed methods are not restricted to cell phones only, but are rather applicable to any type of mobile platform that is equipped with cameras, such as an autonomous car or a robot. The domestic robot comes in many flavors, everything from vacuum cleaners to service and pet robots. A robot equipped with a camera that is capable of estimating its own motion while sensing its environment, like the human eye, can provide an effective means of navigation for the robot. Many of the presented methods are well suited for robots, where low latency and real-time constraints are crucial in order to allow them to interact with their environment.
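The idea of interpolating the camera pose during a rolling shutter exposure can be sketched as follows. The abstract does not state which interpolation scheme is used, so this sketch assumes linear interpolation for position and spherical linear interpolation (SLERP) of unit quaternions for orientation. It further assumes that rows are read out uniformly in time between two known poses. The function names and the `(w, x, y, z)` quaternion layout are hypothetical.

```python
import math


def lerp(p0, p1, t):
    """Linear interpolation between two 3D positions."""
    return tuple((1.0 - t) * a + t * b for a, b in zip(p0, p1))


def slerp(q0, q1, t):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:
        # q and -q encode the same rotation; flip to take the shorter arc.
        q1 = tuple(-c for c in q1)
        dot = -dot
    if dot > 0.9995:
        # Nearly identical rotations: fall back to normalized lerp
        # to avoid dividing by a sine close to zero.
        q = tuple((1.0 - t) * a + t * b for a, b in zip(q0, q1))
        n = math.sqrt(sum(c * c for c in q))
        return tuple(c / n for c in q)
    theta = math.acos(dot)
    s0 = math.sin((1.0 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return tuple(s0 * a + s1 * b for a, b in zip(q0, q1))


def pose_for_row(row, num_rows, pose_start, pose_end):
    """Camera pose at the time a given image row is read out.

    Assumption: row 0 is captured at pose_start and the readout
    progresses uniformly towards pose_end.
    """
    t = row / num_rows
    (p0, q0), (p1, q1) = pose_start, pose_end
    return lerp(p0, p1, t), slerp(q0, q1, t)
```

With a global shutter every row would share one pose. Here each row gets its own, so a point observed in row `r` is projected with the pose returned by `pose_for_row(r, ...)`, which is how a bundle adjustment cost function can account for motion during the exposure.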