Very low bitrate facial video coding based on principal component analysis

University dissertation from Umeå : Tillämpad fysik och elektronik

Abstract: This thesis introduces a scheme for very low bitrate video coding based on principal component analysis. The principal information in a person's facial mimic can be extracted and stored in an Eigenspace. Entire video frames of this person's face can then be compressed with the Eigenspace to only a few projection coefficients. Principal component video coding encodes entire frames at once and, unlike in standard coding schemes, a larger frame size does not increase the bitrate needed for encoding. This enables video communication with high frame rate, spatial resolution and visual quality at very low bitrates. No standard video coding technique provides these four features at the same time.

Theoretical bounds for using principal components to encode facial video sequences are presented. Two different bounds are derived: one that describes the minimum distortion when a certain number of Eigenimages is used, and one that describes the minimum distortion when a minimum number of bits is used.

We investigate how the reconstruction quality of the coding scheme is affected when the Eigenspace, mean image and coefficients are compressed to enable efficient transmission. The Eigenspace and mean image are compressed with JPEG, while the coefficients are quantized. We show that high compression ratios can be used almost without any decrease in reconstruction quality.

Different ways of re-using the Eigenspace extracted for a person from one video sequence to encode other video sequences are examined. The most important factor is the positioning of the facial features in the video frames.

Through a user test we find that it is extremely important to consider secondary workloads, and how users make use of video, when experimental setups are designed.
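The coding idea summarized in the abstract can be illustrated with a minimal sketch. It assumes grayscale frames held as NumPy arrays and computes the Eigenspace with a singular value decomposition. The function names (`fit_eigenspace`, `encode`, `decode`, `quantize`, `dequantize`) and the uniform scalar quantizer are illustrative assumptions, not the implementation used in the thesis, which states only that the coefficients are quantized.

```python
import numpy as np

def fit_eigenspace(frames, k):
    """Return the mean image and the first k Eigenimages of a frame stack.

    frames: array of shape (n_frames, height, width).
    """
    n = frames.shape[0]
    X = frames.reshape(n, -1).astype(np.float64)   # one row per frame
    mean = X.mean(axis=0)
    # Rows of vt are orthonormal principal directions (Eigenimages),
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]

def encode(frame, mean, eigenimages):
    """Project one frame onto the Eigenspace: k projection coefficients."""
    return eigenimages @ (frame.reshape(-1).astype(np.float64) - mean)

def decode(coeffs, mean, eigenimages, shape):
    """Reconstruct a frame from its projection coefficients."""
    return (mean + coeffs @ eigenimages).reshape(shape)

def quantize(coeffs, step):
    """Uniform scalar quantization (an assumed, illustrative quantizer)."""
    return np.round(coeffs / step).astype(np.int32)

def dequantize(q, step):
    """Map quantization indices back to coefficient values."""
    return q.astype(np.float64) * step

# Synthetic stand-in for a facial video: 40 frames of 16x16 pixels that
# vary along only 3 underlying directions.
rng = np.random.default_rng(0)
basis = rng.normal(size=(3, 16 * 16))
frames = (rng.normal(size=(40, 3)) @ basis).reshape(40, 16, 16)

mean, eig = fit_eigenspace(frames, k=3)
coeffs = encode(frames[0], mean, eig)            # 3 numbers per frame
restored = decode(dequantize(quantize(coeffs, 0.5), 0.5), mean, eig, (16, 16))
print(coeffs.shape, float(np.abs(restored - frames[0]).max()))
```

In this sketch each frame is transmitted as `k` coefficients regardless of the frame dimensions, which is the property the abstract refers to when it says that a larger frame size does not increase the bitrate. The mean image and Eigenimages would have to be sent once beforehand.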
