On Hardware Implementation of Discrete-Time Cellular Neural Networks

University dissertation from Department of Electrical and Information Technology, Lund University

Abstract: Cellular Neural Networks (CNNs) are characterized by simplicity of operation. The network consists of a large number of nonlinear processing units, called cells, that are evenly distributed in space. Each cell performs a simple function (a sequence of multiply-add operations followed by a single discrimination): it takes an element of a topographic map and then interacts, through direct connections, with all cells within a specified sphere of interest. Owing to their intrinsic parallel computing power, CNNs have attracted the attention of a wide variety of scientists in fields such as image and video processing, robotics and higher brain functions.
Simplicity of operation together with local connectivity gives CNNs first-hand advantages for tiled VLSI implementations with very high speed and complexity. The first VLSI implementation was based on analogue technology, but it was small and suffered from parasitic capacitances and resistances that led to undesired behaviour. Later implementations focus on larger networks and a higher level of robustness. Mixed full-custom chips are the best known and are widely regarded as a roadmap for advanced realizations. Their digital counterparts have focused on emulating the functionality of the CNN rather than on providing real-time performance. Furthermore, they depend entirely on a host PC to function properly. Fully digital implementations would be less sensitive to parasitic noise and fabrication artefacts and would provide quasi-infinite accuracy, yet they are still not available. A stand-alone, fully digital approach is therefore highly desirable, and this thesis aims to provide one.
Macro-enriched Field-Programmable Gate Arrays (FPGAs) are used to realize such systems on silicon. At first glance a pipelined approach, based on circuit switching, seems promising. Two variants are investigated, Spatial and Temporal, of which the former is preferable. Subsequently, in order to overcome design limitations and thus enhance performance, the benefits of packet-based switching are explored. Although circuit switching is still employed, the enhancement is achieved by adopting the concept of Network-on-Chip (NoC), where packets are transmitted in a predefined communication pattern. The choice is between Serialized and Switched broadcasting schemes. The Switched broadcasting scheme is implemented digitally on a Xilinx Virtex-II Pro P30, and its advantages over the pipelined approach are discussed in terms of clock rate, area utilization and memory considerations. A serial communication approach shows, however, that the network size can be increased further because the communication interface becomes markedly smaller. The thesis illustrates the power of the different implementations experimentally. It is shown how the digital CNN can be used to estimate velocity from images or to facilitate authentication by means of vein feature extraction. Furthermore, the issue of robustness is discussed from a different point of view: the limited accuracy is compensated by gradual adjustment of the operative parameters, i.e. the template coefficients. Finally, the thesis discusses the main ingredients of a system architecture that achieves the goal of a stand-alone, fully digital design.
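
The cell operation described in the abstract (a sequence of multiply-add operations followed by a single discrimination, applied over a local neighbourhood) can be illustrated with a minimal software sketch. This is not the hardware design from the thesis. It assumes the commonly used discrete-time CNN formulation with a feedback template, a control template and a bias, followed by a hard-limiting sign function. The names `dtcnn_step`, `A`, `B`, `bias` and `r`, the zero-valued boundary and the convention that a zero sum maps to +1 are all assumptions made for illustration.

```python
def sign(v):
    # Discrimination step: hard limiter producing +1 or -1.
    # Mapping v == 0 to +1 is an assumed convention.
    return 1 if v >= 0 else -1


def dtcnn_step(y, u, A, B, bias, r=1):
    """One synchronous update of all cells in a 2-D grid.

    y    : current cell outputs (+1/-1), list of rows
    u    : cell inputs (the topographic map), same shape as y
    A, B : (2r+1) x (2r+1) feedback and control templates (assumed names)
    bias : constant offset added to every cell
    r    : neighbourhood radius (the "sphere of interest")
    Cells outside the grid contribute 0 (an assumed boundary condition).
    """
    rows, cols = len(y), len(y[0])
    new_y = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = bias
            # Multiply-add over every neighbour within radius r.
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < rows and 0 <= nj < cols:
                        acc += A[di + r][dj + r] * y[ni][nj]
                        acc += B[di + r][dj + r] * u[ni][nj]
            # Single discrimination on the accumulated sum.
            new_y[i][j] = sign(acc)
    return new_y
```

Because every cell reads only its neighbours within radius `r`, all cells can in principle be evaluated in parallel, which is the property that makes tiled hardware realizations attractive.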