Effect of speaking rate on the formant dynamics of triphthongs
CAO Honglin1,2,3, WANG Yujing4, LI Jingyang3,5
1. Collaborative Innovation Center of Judicial Civilization, Beijing 100088, China;
2. Key Laboratory of Evidence Science (China University of Political Science and Law), Ministry of Education, Beijing 100088, China;
3. Key Laboratory of Intelligent Speech Technology, Ministry of Public Security, Beijing 100038, China;
4. Control Commission of Chaoyang District, Beijing 100026, China;
5. Ministry of Public Security Evidence Identification Center, Beijing 100038, China
Abstract: This study investigates individual differences in the formant dynamics of four Chinese triphthongs, /iau/, /iou/, /uai/ and /uei/, produced by thirty male speakers aged 18 to 28 at three speaking rates (fast, normal and slow). The formant dynamics are described by cubic polynomial fits, and the aim is to discriminate between speakers. The results show that discrimination performance varies with speaking rate: fast speech gives the best discrimination (76.7%), compared with 69.5% for normal speech and 70.3% for slow speech. Discrimination performance decreases when the speaking rates differ, with the "fast + slow" combination giving the worst result (48.0%). In all cases, /iau/ discriminates speakers better than the other three triphthongs. The formant dynamics of triphthongs are therefore most useful for distinguishing speakers when the speech samples have the same or similar speaking rates.
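As a rough illustration of what describing a formant trajectory by a cubic polynomial fit can look like, the sketch below fits f(t) = c0 + c1 t + c2 t^2 + c3 t^3 to a single formant track by ordinary least squares. It is a minimal sketch under stated assumptions, not the authors' procedure: the function name `fit_cubic`, the normalised time axis from 0 to 1, the eleven sampling points and the synthetic F2 values are all hypothetical, since the abstract does not specify how the tracks were sampled or fitted.

```python
def fit_cubic(times, freqs):
    """Least-squares fit of f(t) = c0 + c1*t + c2*t**2 + c3*t**3.

    Returns the coefficients [c0, c1, c2, c3]. Hypothetical helper,
    written for illustration only.
    """
    n = 4  # a cubic has four coefficients
    # Build the normal equations (A^T A) c = A^T y, where A is the
    # Vandermonde matrix with columns 1, t, t^2, t^3.
    ata = [[sum(t ** (i + j) for t in times) for j in range(n)]
           for i in range(n)]
    aty = [sum((t ** i) * f for t, f in zip(times, freqs))
           for i in range(n)]

    # Solve by Gaussian elimination with partial pivoting.
    m = [row + [rhs] for row, rhs in zip(ata, aty)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]

    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (m[i][n] - tail) / m[i][i]
    return coeffs


# Synthetic F2 track in Hz (made-up values, not data from the study),
# sampled at 11 equally spaced points on a normalised time axis.
times = [i / 10 for i in range(11)]
f2_track = [2200 - 1800 * t + 900 * t ** 2 + 100 * t ** 3 for t in times]

f2_coeffs = fit_cubic(times, f2_track)
```

The four coefficients per formant then serve as a compact description of the trajectory's shape; coefficients from several formants could be concatenated into one feature vector per token before any speaker discrimination step. Normalising time to the interval from 0 to 1 keeps the normal equations well conditioned and makes tokens of different durations comparable, which is a design choice of this sketch.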
CAO Honglin, WANG Yujing, LI Jingyang. Effect of speaking rate on the formant dynamics of triphthongs[J]. Journal of Tsinghua University (Science and Technology), 2017, 57(9): 958-962. (in Chinese)