Low and Medium Level Vision Using Channel Representations

University dissertation from Linköping : Linköping University Electronic Press

Abstract: This thesis introduces and explores a new type of representation for low and medium level vision operations called channel representation. The channel representation is a more general way to represent information than e.g. as numerical values, since it allows incorporation of uncertainty, and simultaneous representation of several hypotheses. More importantly it also allows the representation of “no information” when no statement can be given. A channel representation of a scalar value is a vector of channel values, which are generated by passing the original scalar value through a set of kernel functions. The resultant representation is sparse and monopolar. The word sparse signifies that information is not necessarily present in all channels. On the contrary, most channel values will be zero. The word monopolar signifies that all channel values have the same sign, e.g. they are either positive or zero. A zero channel value denotes “no information”, and for non-zero values, the magnitude signifies the relevance.In the thesis, a framework for channel encoding and local decoding of scalar values is presented. Averaging in the channel representation is identified as a regularised sampling of a probability density function. A subsequent decoding is thus a mode estimation technique.'The mode estimation property of channel averaging is exploited in the channel smoothing technique for image noise removal. We introduce an improvement to channel smoothing, called alpha synthesis, which deals with the problem of jagged edges present in the original method. Channel smoothing with alpha synthesis is compared to mean-shift filtering, bilateral filtering, median filtering, and normalized averaging with favourable results.A fast and robust blob-feature extraction method for vector fields is developed. The method is also extended to cluster constant slopes instead of constant regions. The method is intended for view-based object recognition and wide baseline matching. It is demonstrated on a wide baseline matching problem.A sparse scale-space representation of lines and edges is implemented and described. The representation keeps line and edge statements separate, and ensures that they are localised by inhibition from coarser scales. The result is however still locally continuous, in contrast to non-max-suppression approaches, which introduce a binary threshold.The channel representation is well suited to learning, which is demonstrated by applying it in an associative network. An analysis of representational properties of associative networks using the channel representation is made.Finally, a reactive system design using the channel representation is proposed. The system is similar in idea to recursive Bayesian techniques using particle filters, but the present formulation allows learning using the associative networks.

  This dissertation MIGHT be available in PDF-format. Check this page to see if it is available for download.