Towards flexible audio coding

University dissertation from Stockholm : Signaler, sensorer och system

Abstract: This thesis is about audio coding and improving its flexibility. Audio coding is used to reduce the bit rate needed to represent audio signals in digital format. When the available bit rate is low, parametric coding methods, which use models to describe perceptually important features of audio signals, have proven efficient. We introduce several improvements to sinusoidal coding, a parametric coding method based on sinusoidal modeling of audio. Flexibility is important because new applications bring a need for coders that can operate over a large range of bit rates and can represent different types of audio material. Flexibility can be obtained by combining into one coder a set of subcoders that efficiently represent different types or features of audio signals and properly complement each other at different rates. For true flexibility, methods that allow fast coder adaptation should be deployed. We develop methods for rate-distortion optimal real-time design of quantizers in audio coding.

This thesis consists of seven research papers. In paper A, we introduce a signal pre-processing method that facilitates removal of pre-echo artifacts when coding signals containing transients (sharp attacks). Papers B-F are devoted to sinusoidal audio coding. In papers B and C, we present improvements to the matching-pursuit sinusoidal estimation method. In papers D, E, and F, we consider quantization of sinusoidal parameters. We apply high-rate quantization theory to find the asymptotically optimal rate distribution between sinusoids and the corresponding asymptotically optimal quantizers for sinusoidal parameters, such that a perceptual distortion measure is minimized under a given rate constraint. The quantizers are derived analytically, which allows a coder to adapt quickly to changing bit-rate requirements. Paper G is devoted to multistage audio coding, a coding method in which subcoders are cascaded. We use high-rate theory to develop a flexible analytical framework for the asymptotically optimal rate distribution between subcoders and the design of the corresponding asymptotically optimal quantizers.
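
To illustrate why an analytical rate distribution permits fast adaptation, the following is a minimal sketch of the classical textbook high-rate bit allocation for independent components under a mean-squared-error criterion. It is not the method of the thesis, which minimizes a perceptual distortion measure; the function name `high_rate_allocation` and the example variances are assumptions made purely for illustration.

```python
import math


def high_rate_allocation(variances, total_bits):
    """Textbook high-rate allocation (illustrative, not the thesis's method).

    Each component i receives
        b_i = B/N + 0.5 * log2(var_i / geometric_mean(variances)),
    which equalizes the high-rate distortion term var_i * 2**(-2 * b_i)
    across components while the b_i sum to the budget B.
    """
    n = len(variances)
    # log2 of the geometric mean of the variances
    log_gm = sum(math.log2(v) for v in variances) / n
    return [total_bits / n + 0.5 * (math.log2(v) - log_gm) for v in variances]


# Hypothetical component variances and a hypothetical budget of 12 bits.
variances = [16.0, 4.0, 1.0, 0.25]
bits = high_rate_allocation(variances, total_bits=12.0)

# High-rate distortion proxy per component; equal values indicate the
# allocation has balanced the distortion across components.
distortions = [v * 2.0 ** (-2.0 * b) for v, b in zip(variances, bits)]
```

Because the allocation is a closed-form expression, changing the bit budget only requires re-evaluating the formula rather than retraining or searching over quantizers. The sketch does not handle components whose allocation would become negative at low budgets; a practical scheme would need to exclude such components and redistribute the rate.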
