Trellis Decoding: From Algorithm to Flexible Architectures

University dissertation from Department of Electroscience, Lund University

Abstract: Trellis decoding is a popular method to recover encoded information corrupted during transmission over a noisy channel. Prominent members of this class of decoding algorithms are the Viterbi algorithm, which provides maximum likelihood estimates, and the BCJR algorithm, which is a maximum a posteriori estimator commonly used in iterative decoding. In this thesis, the Viterbi algorithm is chosen since it provides a good trade-off between achievable coding gain and implementation complexity. This is the basis for considerations on simplified, hybrid, and, most importantly, flexible VLSI architectures. Algorithm simplifications are necessary to reduce the computational burden laid on an implementation platform. In our work on trellis decoding blocks, a simplification that lowers the number of arithmetic operations is derived and evaluated. By using a complementary code property, the arithmetic complexity of the main part on the Viterbi algorithm is reduced by 17%. Synthesized blocks show varying savings for cell area and estimated power consumption. A comparison to a competing simplification shows the advantage in a hardware implementation of our work for the important class of rate 1/2 convolutional codes. Hybrid architectures try to combine benefits of several approaches to lower the drawbacks of the individual contributors. For survivor path processing in Viterbi decoders, a new hybrid approach is proposed. A low-latency algorithm, whose implementation complexity quickly increases with the number of trellis states, is combined with a scalable RAM-based method. As a result, the developed hybrid architecture exhibits a better latency-complexity behavior compared to other hybrid approaches. Flexible VLSI architectures to cover several communication standards become increasingly important as fabrication costs for microchips rise rapidly with every new process generation. In the context of flexible trellis decoding, earlier work mostly concentrated on varying encoder memory and thus the number of trellis states. This work studies the effects on hardware size and throughput introduced by flexibility if the code rate is varied. The investigation of a decoder for bandwidth-efficient codes, which was fabricated in a 0.13 um digital CMOS process and verified for functionality, distinguishes between task- and rate-flexibility. A comparison is carried out between flexible designs, which decode both convolutional and TCM codes and provide two or three transmission rates. It is concluded that the larger number of rates is more beneficial from a cost--flexibility viewpoint.