MIMO Decoding Algorithm and Implementation

University dissertation from the Department of Electroscience, Lund University

Abstract: Multiple-input multiple-output (MIMO) systems have been among the most actively pursued technologies for future wireless communications, since they can increase capacity (or, alternatively, coverage or link quality) at no cost in frequency spectrum. This doctoral thesis investigates improvements to MIMO decoding algorithms and presents VLSI implementation results. The main contributions consist of four algorithms and three VLSI implementations. For sphere decoding in MIMO systems, a low-complexity Schnorr-Euchner sphere decoding algorithm based on depth-first search is first proposed, together with two techniques that reduce its complexity. A K-best Schnorr-Euchner sphere decoding algorithm is then proposed and shown to be suitable for VLSI implementation of hard-output MIMO decoding. A modified K-best Schnorr-Euchner sphere decoding algorithm is further proposed to improve on the performance of the original algorithm for soft-output MIMO decoding. Moreover, a VLSI architecture is proposed for both the hard-output K-best Schnorr-Euchner algorithm and the soft-output modified K-best Schnorr-Euchner algorithm. The proposed hard-output and soft-output decoders are implemented for 4x4 16-QAM MIMO decoding in 0.35-µm and 0.13-µm CMOS technologies, respectively. The implemented hard-output decoder chip core has 91 K gates (NAND2 equivalent). Its decoding throughput is up to 53.3 Mb/s, with a core power consumption of 626 mW at a 100 MHz clock frequency and a 2.8 V supply. The implemented soft-output decoder chip achieves a decoding throughput of more than 100 Mb/s with 97 K core gates.

For sub-optimal MIMO decoding, a modified sub-optimal decoding algorithm is proposed that approaches the performance of the sphere decoding algorithm with comparable complexity and much higher decoding throughput. Moreover, a low-complexity VLSI architecture is proposed for the square-root algorithm. This architecture is implemented for a 4x4 QPSK MIMO system in a 0.35-µm CMOS technology. The chip core has 190 K gates. The decoding throughput of the chip depends on the length of the received symbol packet. For packets of 100 bytes or more, it achieves a maximum throughput of 128-160 Mb/s at the maximum clock frequency of 80 MHz. The core power consumption, measured at 2.7 V and room temperature, is about 608 mW for a 160 Mb/s data rate at 80 MHz, and 81 mW for 20 Mb/s at 10 MHz.
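To give a feel for the K-best idea mentioned in the abstract, the following is a minimal Python sketch of a breadth-first K-best search. It is not the dissertation's algorithm or architecture. It assumes the channel has already been reduced to a real-valued upper-triangular system y = R s + n (for example by QR preprocessing), and it expands every child symbol and sorts, rather than using Schnorr-Euchner enumeration. The names k_best_decode, exhaustive_decode, R, y, alphabet and K are illustrative choices, not taken from the thesis.

```python
from itertools import product


def k_best_decode(R, y, alphabet, K):
    """Breadth-first K-best search for a small ||y - R s||^2, s in alphabet^n.

    R: n x n upper-triangular real matrix (list of lists), assumed given.
    y: length-n received vector after the same preprocessing.
    Returns (symbols, squared distance) of the best surviving candidate.
    """
    n = len(y)
    # Each survivor is (partial distance, symbols decided so far for layers i..n-1).
    survivors = [(0.0, [])]
    for i in range(n - 1, -1, -1):  # the last layer is detected first
        children = []
        for ped, tail in survivors:
            # Interference from the already decided layers i+1..n-1.
            interference = sum(R[i][i + 1 + j] * s for j, s in enumerate(tail))
            for s in alphabet:
                err = y[i] - interference - R[i][i] * s
                children.append((ped + err * err, [s] + tail))
        # Keep only the K children with the smallest partial distance.
        children.sort(key=lambda c: c[0])
        survivors = children[:K]
    best_dist, best_symbols = survivors[0]
    return best_symbols, best_dist


def exhaustive_decode(R, y, alphabet):
    """Reference search over every candidate vector (practical for small n only)."""
    n = len(y)
    best = None
    for cand in product(alphabet, repeat=n):
        dist = sum(
            (y[i] - sum(R[i][j] * cand[j] for j in range(i, n))) ** 2
            for i in range(n)
        )
        if best is None or dist < best[1]:
            best = (list(cand), dist)
    return best
```

With K = 1 the search reduces to plain successive interference cancellation, and with K at least as large as the number of candidate vectors nothing is pruned, so the result matches the exhaustive search. Intermediate values of K trade detection performance against a fixed, predictable amount of work per layer, which is the property that makes breadth-first searches of this kind attractive for hardware.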
