Articulatory-Acoustic Relationships in Swedish Vowel Sounds
Abstract: The goal of this work was to evaluate the performance of a classical method for predicting vocal tract cross-sectional areas from cross-distances, with a view to implementing it in speaker-specific articulatory modelling. The evaluation was based on magnetic resonance (MR) images of the vocal tract combined with simultaneous audio and video recordings. These data were collected from one female and one male speaker. The speech materials consisted of sustained articulation of each of the nine Swedish long vowels together with two short allophonic qualities. The data acquisition and processing involved, among other things, the development of a method for dental integration in the MR images and a refined sound recording technique required by the particular experimental conditions. Articulatory measurements were made of cross-distances and cross-sectional areas in the speakers' larynx, pharynx, oral cavity and lip section, together with estimates of the vocal tract termination points. Acoustic and auditory analyses were made of the sound recordings, including an evaluation of the influence of the noise from the MR machine on the vowel productions. Cross-distance to cross-sectional area conversion rules were established from the articulatory measurements. The evaluation of these rules involved quantitative as well as qualitative dimensions. The articulatory evaluation gave rise to a vowel-dependent extension of the method under investigation, allowing more geometrical freedom for articulatory configurations along the vocal tract. The extended method proved more successful in predicting cross-sectional areas, particularly in the velar region. The acoustic evaluation, based on area functions derived from the proposed rules, did not, however, show significant differences in formant patterns between the classical and the extended method.
This was interpreted as evidence that the classical method has higher acoustic than physiological validity for the present materials. For application and extrapolation in articulatory modelling, however, the extended method may perform better in both articulation and acoustics, given its physiologically more fine-tuned foundation. Research funded by the NIH (R01 DC02014) and Stockholm University (SU 617-0230-01).