Subband Beamforming for Speech Enhancement in Hands-Free Communication

University dissertation from Karlskrona : Blekinge Institute of Technology

Abstract: Speech enhancement by means of microphone array signal processing plays a major role in voice communication applications such as audio-conferencing, hands-free telephony, voice recognition and hearing aids. In these communication scenarios, the speaker is positioned at a distance from the microphones, so that environmental noise and interfering sound corrupt the received speech. Additionally, reverberation of the voice from walls and ceilings impairs the received speech signal. In duplex communication, acoustic feedback constitutes another disturbance for the talker, who hears his or her own voice echoed. Successful speech enhancement solutions should achieve speech dereverberation and efficient noise and interference reduction, and in mobile environments they should also be able to adapt to speaker motion. Microphone arrays spatially sample the sound pressure field. When combined with spatio-temporal filtering techniques known as beamforming, they can extract the sound source information from signals of which only a mixture is observed. This is based on the inherent ability of sensor arrays to exploit the spatial correlation of multiple received signals. A subband beamforming structure can be used to improve the performance of the time-domain filters and reduce their computational complexity. Each of the received signals is decomposed into a set of narrow-band signals, and the filtering operations of the beamformer are performed for each frequency band separately. The outputs of the subband beamformers are then used to reconstruct a full-band output signal. In this thesis, an adaptive subband RLS beamforming approach is investigated and evaluated in real hands-free acoustical environments. The proposed method is designed to reduce background noise and acoustic coupling while producing an undistorted filtered version of the signal originating from a desired location.
The beamformer recursively minimizes a Least Squares error based on the continuously received data. This adaptive structure tracks the noise characteristics so that the noise can be attenuated efficiently. A soft constraint built from calibration data gathered in low-noise conditions guarantees the integrity of the desired signal without the need for any speech detection. Additionally, a new spatial filter bank design method for beamforming applications is suggested, which constrains the filter bank to pass the signal from one position and reject signals from other, undesired positions. Furthermore, to allow for tracking of source mobility, a soft constrained beamforming approach with built-in speaker localization is proposed. The source of interest is modelled as a cluster of point sources, and source motion is accommodated by revising the point source cluster. Real speech signals are used in the simulations, and the results show accurate tracking of speaker movement with noise and interference suppression maintained at about 10-15 dB when using a four-microphone array.
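
The recursive Least Squares idea described in the abstract can be illustrated for a single subband with a textbook exponentially weighted RLS update. This is a minimal sketch under stated assumptions, not the algorithm of the thesis: the function names (rls_update, steering, cgauss, demo), the forgetting factor, the half-wavelength linear array geometry and the source angles are all hypothetical, and the clean source sample is used as the reference signal in place of the calibration-based soft constraint, which the abstract does not specify in detail.

```python
import cmath
import math
import random


def rls_update(w, P, x, d, lam=0.99):
    """One exponentially weighted RLS step for a single subband.

    w: complex weights (one per microphone); the output is y = w^H x
    P: running estimate of the inverse correlation matrix
    x: microphone snapshot in this subband, d: reference sample
    lam: forgetting factor (assumed value, not taken from the thesis)
    """
    n = len(x)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(x[i].conjugate() * Px[i] for i in range(n))
    k = [v / denom for v in Px]  # gain vector
    # A priori error between the reference and the current beamformer output.
    e = d - sum(w[i].conjugate() * x[i] for i in range(n))
    w = [w[i] + k[i] * e.conjugate() for i in range(n)]
    xHP = [sum(x[i].conjugate() * P[i][j] for i in range(n)) for j in range(n)]
    P = [[(P[i][j] - k[i] * xHP[j]) / lam for j in range(n)] for i in range(n)]
    return w, P, e


def steering(angle_deg, n_mics=4):
    """Narrowband steering vector of a half-wavelength-spaced linear array."""
    phase = math.pi * math.sin(math.radians(angle_deg))
    return [cmath.exp(-1j * phase * m) for m in range(n_mics)]


def cgauss(rng, sigma=1.0):
    """Circular complex Gaussian sample with variance sigma**2."""
    s = sigma / math.sqrt(2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def demo(num_snapshots=500, seed=0):
    """Adapt on synthetic data; return the gains toward source and interferer."""
    rng = random.Random(seed)
    n = 4
    a = steering(0.0, n)   # assumed direction of the desired talker
    b = steering(40.0, n)  # assumed direction of the interferer
    w = [0j] * n
    P = [[(100.0 if i == j else 0.0) + 0j for j in range(n)] for i in range(n)]
    for _step in range(num_snapshots):
        s, v = cgauss(rng), cgauss(rng)
        # Snapshot: desired source + interferer + weak sensor noise.
        x = [a[m] * s + b[m] * v + cgauss(rng, 0.1) for m in range(n)]
        w, P, _err = rls_update(w, P, x, s)

    def gain(h):
        return sum(w[m].conjugate() * h[m] for m in range(n))

    return gain(a), gain(b)
```

Calling demo() returns the complex response of the adapted weights toward the desired and the interfering direction. In this synthetic setting the first should settle near one and the second near zero. In a subband beamformer, one such weight vector would be adapted independently in every frequency band before the full-band signal is reconstructed.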
