Radius vector-driven 3-D Mandarin vocal tract model
YAO Yun1, WU Xiyu2, KONG Jiangping2
1. College of Chinese Language and Literature, Henan University, Kaifeng 475001, China;
2. Department of Chinese Language and Literature, Peking University, Beijing 100871, China
Abstract: Analyses of vocal tract resonance characteristics require accurate models of the vocal tract shape. This article presents a three-dimensional Mandarin vocal tract model built from vocal tract shape data and midsagittal radius vector data extracted from MRI images of seven sustained Mandarin vowels: [a], [o], [r], [i], [u], [y], and [e]. The vocal tract images were divided into 36 sections at equal intervals along the midline of the vocal tract. The model for each section is then driven by the length of the radius vector in the corresponding cross-sectional image. Speech synthesized with this model sounds very close to natural speech.
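The equal-interval sectioning described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the vocal tract midline is available as a 2-D polyline of (x, y) points, and the function name `resample_midline` and its inputs are hypothetical names introduced here. It returns the centre point of each of the 36 sections, spaced evenly by arc length along the midline.

```python
import math


def resample_midline(points, n_sections=36):
    """Split a 2-D polyline into n_sections pieces of equal arc length.

    points: list of (x, y) tuples tracing the midline (assumed input format).
    Returns (centres, step): the centre point of each section and the
    arc length of one section.
    """
    if len(points) < 2:
        raise ValueError("need at least two midline points")

    # Cumulative arc length at each vertex of the polyline.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))

    step = cum[-1] / n_sections
    centres = []
    seg = 0  # index of the polyline segment containing the current target
    for k in range(n_sections):
        target = (k + 0.5) * step  # arc length at the centre of section k
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        # Linear interpolation inside the segment (t = 0 for a degenerate one).
        t = 0.0 if seg_len == 0 else (target - cum[seg]) / seg_len
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        centres.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return centres, step
```

In a full pipeline, a cross-sectional plane perpendicular to the midline would be placed at each returned centre point, and the radius vector lengths would be measured in that plane; those steps depend on details the abstract does not give and are not sketched here.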