Soft-Constrained Subband Beamforming for Speech Enhancement
Abstract: New speech acquisition applications are emerging as a result of advances in technology and the prevalence of mobile communication. While today voice control of consumer equipment is becoming a reality, communication technology has extended voice connectivity to personal computers and mobile communication devices with the aim of enabling natural communication in a variety of environments such as cars, restaurants and offices. The comfort and flexibility provided through the hands-free acquisition of speech in mobile telephony, speech recognition and hearing aids require robust techniques to deal with environmental noise, reverberation, acoustic feedback and other interfering sounds which corrupt the received speech. For mobile environments, speech enhancement techniques should also be able to adapt to speaker motion with no perceptible degradation of the original speech. In this thesis, multi-microphone techniques for speech enhancement are developed. First, a framework for constrained beamforming is introduced. This framework allows us to control the trade-off between noise reduction, dereverberation and speech degradation. A constraint on the power minimization of the beamformer’s output is formulated to guarantee the integrity of the desired signal. It is shown that the robustness of the soft-constrained beamforming structure towards microphone mismatch is guaranteed by modeling the source as spatially spread. A subband recursive least-squares (RLS) beamformer is investigated and evaluated in real hands-free acoustical environments. The proposed methodology is defined to perform background noise and interference reduction, while a soft constraint built from calibration data in low-noise conditions guarantees the undistorted filtering of the desired signal. This adaptive structure tracks the noise characteristics so that the noise can be attenuated efficiently.
A subband beamforming structure is used to improve the performance of the system and reduce the computational complexity. A real-time DSP implementation is described and evaluated for dual-microphone speech enhancement. Furthermore, a novel blind soft-constrained beamforming approach for moving-source speech enhancement is presented. It is based on a soft constraint defined for a delay spread corresponding to a volume around the speech source location. A new speech-oriented time-delay estimation algorithm is combined with the beamformer to allow for speaker movement. The proposed method does not require any calibration data, knowledge of the array manifold or any other characteristics of the acoustical environment. Hence, it provides a means to blindly enhance a dominant speaker in adverse noise conditions. This structure is further developed to allow for the detection and enhancement of multiple dominant speakers in a mixture of interference and background noise. The use of a frequency-dependent constraint region opens the path for a trade-off between noise suppression and speech integrity.
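As an illustration of the calibrated soft-constraint idea, the following is a minimal batch sketch for a single subband and two microphones, not the implementation described in the thesis. It assumes one common least-squares formulation, w = (R_ss + R_nn)^-1 r_s, in which R_ss and r_s are estimated from source-only calibration data and R_nn from noise-only data. The thesis uses a recursive (RLS) update and a spatially spread source model; here a point source and a direct matrix solve are substituted for brevity. The steering vectors, signal levels and all function names are hypothetical.

```python
import numpy as np

def soft_constrained_weights(R_ss, r_s, R_nn):
    """Least-squares weights w = (R_ss + R_nn)^-1 r_s for one subband (assumed form)."""
    return np.linalg.solve(R_ss + R_nn, r_s)

def snr_db(signal, noise):
    """Ratio of mean signal power to mean noise power, in dB."""
    return 10 * np.log10(np.mean(np.abs(signal) ** 2) / np.mean(np.abs(noise) ** 2))

rng = np.random.default_rng(0)
n_frames = 4000

def cgauss(shape):
    """Unit-variance circular complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

# Hypothetical narrowband steering vectors for the source and one interferer.
d_src = np.array([1.0, np.exp(-1j * 0.3)])
d_int = np.array([1.0, np.exp(-1j * 2.0)])

s = cgauss(n_frames)                       # desired subband signal (reference)
X_src = np.outer(d_src, s)                 # source-only calibration data, shape (2, n_frames)
X_noise = np.outer(d_int, cgauss(n_frames)) + 0.1 * cgauss((2, n_frames))

# Sample estimates of the correlation quantities.
R_ss = X_src @ X_src.conj().T / n_frames   # source correlation matrix
r_s = X_src @ s.conj() / n_frames          # cross-correlation with the reference
R_nn = X_noise @ X_noise.conj().T / n_frames

w = soft_constrained_weights(R_ss, r_s, R_nn)

# Apply the beamformer y = w^H x to source and noise separately to measure each.
y_src = w.conj() @ X_src
y_noise = w.conj() @ X_noise

snr_in = snr_db(X_src[0], X_noise[0])      # SNR at the first microphone
snr_out = snr_db(y_src, y_noise)           # SNR at the beamformer output
distortion = np.mean(np.abs(y_src - s) ** 2)
print(f"SNR in: {snr_in:.1f} dB, SNR out: {snr_out:.1f} dB, distortion: {distortion:.4f}")
```

Because the source term enters the cost function rather than a hard linear constraint, the relative scaling of R_ss and R_nn in this sketch acts as the trade-off between noise suppression and speech distortion.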