Biologically-Based Interactive Neural Network Models for Visual Attention and Object Recognition

University dissertation from Linköping : Linköping University Electronic Press

Abstract: The main focus of this thesis is to develop biologically-based computational models for object recognition. A series of models for attention and object recognition was developed in order of increasing functionality and complexity. These models are based on information processing in the primate brain, and are inspired in particular by the theory of visual information processing along the two parallel pathways of the primate visual cortex. To capture the incremental, constraint-satisfaction style of processing in the visual system, the models were implemented as interactive neural networks. Results from eye-tracking studies on the relevant visual tasks, as well as our hypotheses regarding information processing in the primate visual system, were implemented in the models and tested with simulations.

As a first step, a model based on the ventral pathway was developed to recognize single objects. Through systematic testing, the structural and algorithmic parameters of this model were fine-tuned so that it performed its task optimally. In the second step, the model was extended to include the dorsal pathway, which enables simulation of visual attention as an emergent phenomenon. The extended model was then investigated on visual search tasks. In the last step, we focussed on the recognition of occluded and overlapped objects. A couple of eye-tracking studies were conducted in this regard, and on the basis of their results we formulated hypotheses regarding information processing in the primate visual system. The models were then further advanced along the lines of these hypotheses and simulated on the tasks of occluded and overlapped object recognition.

On the basis of the results and analysis of our simulations, we found that the generalization performance of interactive hierarchical networks improves when a small amount of Hebbian learning is added to otherwise purely error-driven learning. We also concluded that the size of the receptive fields in our networks is an important parameter for the generalization task and depends on the object of interest in the image. Our results show that networks using hard-coded feature extraction perform better than networks that use Hebbian learning to develop feature detectors. We have successfully demonstrated the emergence of visual attention within an interactive network, as well as the role of context in the search task. Simulation results with occluded and overlapped objects support our extended interactive processing approach to the segmentation-recognition issue, which combines the interactive and top-down approaches. Furthermore, the simulation behavior of our models is in line with known human behavior on similar tasks.

In general, the work in this thesis will improve the understanding and performance of biologically-based interactive networks for object recognition and provide a biologically plausible solution to the recognition of occluded and overlapped objects. Moreover, our models offer some suggestions regarding the neural mechanisms and strategies underlying biological object recognition.
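
As an illustration of what "a small amount of Hebbian learning" added to error-driven learning can look like, the sketch below mixes the two weight changes with a single coefficient. It is not taken from the thesis: the abstract does not state the learning rule, so the Leabra-style formulation used here (a Hebbian term that moves the weight toward the sending activation, plus a contrastive error-driven term computed from an expectation "minus" phase and an outcome "plus" phase) is an assumption, and the function name, the parameter names `k_hebb` and `lrate`, and their default values are hypothetical.

```python
def mixed_weight_delta(w, x_minus, y_minus, x_plus, y_plus,
                       k_hebb=0.01, lrate=0.1):
    """Weight change for one connection (sender x, receiver y).

    Assumed, illustrative rule; not the thesis's documented method.
    x_minus, y_minus: activations in the expectation (minus) phase.
    x_plus,  y_plus:  activations in the outcome (plus) phase.
    k_hebb: fraction of the update that is Hebbian (small, e.g. 0.01).
    """
    # Hebbian term: pull the weight toward the sender's activation,
    # gated by how active the receiver is.
    hebb = y_plus * (x_plus - w)
    # Error-driven term: difference of sender-receiver coproducts
    # between the plus and minus phases (contrastive Hebbian form).
    err = x_plus * y_plus - x_minus * y_minus
    # Weighted mix of the two; k_hebb = 0 gives pure error-driven learning.
    return lrate * (k_hebb * hebb + (1.0 - k_hebb) * err)


if __name__ == "__main__":
    # Compare pure error-driven learning with a small Hebbian admixture.
    pure = mixed_weight_delta(0.5, 1.0, 0.2, 1.0, 0.8, k_hebb=0.0)
    mixed = mixed_weight_delta(0.5, 1.0, 0.2, 1.0, 0.8, k_hebb=0.01)
    print(pure, mixed)
```

With `k_hebb` set to 0 the update reduces to pure error-driven learning, which makes it straightforward to compare the two regimes on the same network.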
