Methods for Objective and Subjective Video Quality Assessment and for Speech Enhancement

Abstract: The rapidly growing use of multimedia services has raised consumers' awareness of quality. Both service providers and consumers are interested in the level of perceptual quality delivered. The perceptual quality of an original video signal can be degraded by compression and by transmission over a lossy network. Video quality assessment (VQA) is performed to gauge the level of video quality. Generally, it can be carried out with subjective methods, where a panel of human observers judges the quality of a video, or with objective methods, where a computational model yields an estimate of the quality. Objective methods, and specifically No-Reference (NR) and Reduced-Reference (RR) methods, are preferable because they are practical to implement in real-time scenarios. This doctoral thesis begins with a review of existing approaches to NR image and video quality assessment, in which recently proposed methods of visual quality assessment are classified into three categories. The subsequent chapters describe studies on the development of NR and RR methods and on conducting subjective VQA experiments. In the NR methods, the required features are extracted from the coded bitstream of a video; in the RR methods, additional pixel-based information is used. Specifically, the NR methods are developed using regression techniques based on artificial neural networks and least-squares support vector machines. In a later study, linear regression techniques are used to elaborate on the interpretability of the NR and RR models with respect to the selection of perceptually significant features. The studies on subjective experiments are performed using laboratory-based and crowdsourcing platforms.
In the laboratory-based experiments, the focus has been on using standardized methods to generate datasets that can be used to validate objective methods of VQA. The subjective experiments performed through crowdsourcing investigate non-standard methods for determining the perceptual preference among various adaptation scenarios in the context of adaptive streaming of high-definition videos. Lastly, the use of an adaptive gain equalizer in the modulation frequency domain for speech enhancement has been examined. To this end, two methods of demodulating speech signals have been studied, namely spectral center of gravity carrier estimation and convex optimization.