On Prosodic Modification of Speech

University dissertation from Stockholm : KTH

Author: Barbara Resch; KTH; [2006]

Abstract: Prosodic modification has become a topic of major theoretical and practical interest in speech processing research over the last decades. Algorithms for time and pitch scaling are used both for speech modification and for speech synthesis. The thesis consists of an introduction, which provides an overview and discussion of existing techniques for time and pitch scaling, and three research papers in this area. Paper A presents a system for time synchronization of speech. It aligns two utterances of the same sentence, modifying the time scale of one utterance so that it is synchronized with the other. The system is based on Dynamic Time Warping (DTW) and the Waveform Similarity Overlap and Add (WSOLA) method, a technique for time scaling of speech signals. Papers B and C complement each other and present a novel speech representation system that facilitates both time and pitch scaling of speech signals. Paper B describes a method to warp a signal with time-varying pitch to a signal with constant pitch. This requires an accurate continuous pitch track, which is described as a B-spline expansion with coefficients selected to maximize a periodicity criterion. The warping to a constant pitch corresponds to the first stage of the system presented in paper C, which describes a two-stage transform that exploits long-term periodicity to obtain a sparse representation of speech. The new system facilitates a decomposition into a voiced and an unvoiced component.
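
The alignment step that the abstract attributes to DTW can be illustrated with a minimal sketch. This is not the system from paper A: it assumes one scalar feature per frame, an absolute-difference local cost, and the standard three-step recursion, all of which are illustrative choices. The function name `dtw_path` is hypothetical.

```python
def dtw_path(a, b):
    """Align two 1-D feature sequences with textbook DTW.

    Returns (total_cost, path), where path is a list of (i, j) index
    pairs meaning frame a[i] is matched to frame b[j].
    Assumption: scalar features and absolute difference as local cost.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # advance in both sequences
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
            )
    # Backtrack from the end to recover the warping path.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


total, path = dtw_path([0, 1, 2, 3], [0, 1, 1, 2, 3])
```

In this toy input the second sequence repeats one frame, so the path maps index 1 of the first sequence to both index 1 and index 2 of the second. In a synchronization system of the kind the abstract describes, such a path would indicate where one utterance has to be locally stretched or compressed; the time scaling itself would then be carried out by a method such as WSOLA, which this sketch does not cover.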