Exploring Biologically-Inspired Interactive Networks for Object Recognition

University dissertation from Linköping : Linköping University Electronic Press

Abstract: This thesis deals with biologically-inspired interactive neural networks for the task of object recognition. Such networks offer an interesting alternative to traditional image processing techniques. Although the networks are very powerful classification tools, they are difficult to handle because of their bidirectional interactivity. This is one of the main reasons why these networks do not generalize well to novel objects. Generalization is a very important property for any object recognition system, as it is impractical for a system to learn all instances of an object class before classifying. In this thesis, we have investigated the workings of an interactive neural network by fine-tuning different structural and algorithmic parameters. The performance of the networks was evaluated by analyzing the generalization ability of the trained network to novel objects. Furthermore, the interactivity of the network was utilized to simulate focus of attention during object classification. Selective attention is an important visual mechanism for object recognition and provides an efficient way of using the limited computational resources of the human visual system. Unlike most previous work in the field of image processing, this thesis considers attention an integral part of object processing. Attention focus, in this work, is computed within the same network and in parallel with object recognition.

As a first step, a study into the efficacy of Hebbian learning as a feature extraction method was conducted. In a second study, the receptive field size in the network, which controls the size of the extracted features as well as the number of layers in the network, was varied and analyzed to find its effect on generalization. In a continuation study, a comparison was made between learnt (Hebbian learning) and hard-coded feature detectors. In the last study, attention focus was computed using the interaction between bottom-up and top-down activation flow, with the aim of handling multiple objects in the visual scene.

On the basis of the results and analysis of our simulations, we have found that the generalization performance of the bidirectional hierarchical network improves when a small amount of Hebbian learning is added to otherwise error-driven learning. We also conclude that the optimal size of the receptive fields in our network depends on the object of interest in the image. Moreover, each receptive field must contain some part of the object in the input image. We have also found that networks using hard-coded feature extraction perform better than networks that use Hebbian learning to develop feature detectors. In the last study, we have successfully demonstrated the emergence of visual attention within an interactive network that handles more than one object in the input field. Our simulations demonstrate how bidirectional interactivity directs attention focus towards the required object by using both bottom-up and top-down effects.

In general, the findings of this thesis will improve understanding of how biologically-inspired interactive networks work. Specifically, the studied effects of the structural and algorithmic parameters that are critical for generalization will help develop these and similar networks and lead to improved performance on object recognition tasks. The results from the attention simulations can be used to increase the ability of networks to deal with multiple objects efficiently and effectively.
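
The abstract does not state the learning rule used, so the following is only a minimal sketch of what "a small amount of Hebbian learning added to otherwise error-driven learning" could look like for a single receiving unit. The function name `mixed_weight_update`, the mixing parameter `k_hebb`, the delta-rule error term, and the particular Hebbian term are all illustrative assumptions, not the thesis's implementation.

```python
def mixed_weight_update(w, x, y, target, lr=0.01, k_hebb=0.01):
    """Return updated weights for one receiving unit (illustrative sketch).

    w       -- list of current weights, one per sending unit
    x       -- list of sending-unit activations
    y       -- activation of the receiving unit
    target  -- desired activation of the receiving unit
    lr      -- learning rate (assumed value)
    k_hebb  -- fraction of the update taken from the Hebbian term;
               a small value means mostly error-driven learning (assumption)
    """
    new_w = []
    for w_i, x_i in zip(w, x):
        # Hebbian term (assumed form): when the unit is active,
        # move the weight toward the sending activation.
        hebb = y * (x_i - w_i)
        # Error-driven term (assumed form): simple delta rule.
        err = (target - y) * x_i
        # Weighted mix of the two terms, scaled by the learning rate.
        dw = lr * (k_hebb * hebb + (1.0 - k_hebb) * err)
        new_w.append(w_i + dw)
    return new_w


if __name__ == "__main__":
    weights = [0.5, 0.5]
    inputs = [1.0, 0.0]
    print(mixed_weight_update(weights, inputs, y=0.5, target=1.0, lr=0.1))
```

Setting `k_hebb` to 0 reduces the sketch to purely error-driven learning, and setting it to 1 gives purely Hebbian learning, which makes it easy to see how varying the mix could be explored in a simulation.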