Journal of Tsinghua University(Science and Technology)    2017, Vol. 57 Issue (9) : 958-962     DOI: 10.16511/j.cnki.qhdxxb.2017.26.047
ELECTRONIC ENGINEERING
Effect of speaking rate on the formant dynamics of triphthongs
CAO Honglin (1,2,3), WANG Yujing (4), LI Jingyang (3,5)
1. Collaborative Innovation Center of Judicial Civilization, Beijing 100088, China;
2. Key Laboratory of Evidence Science (China University of Political Science and Law), Ministry of Education, Beijing 100088, China;
3. Key Laboratory of Intelligent Speech Technology, Ministry of Public Security, Beijing 100038, China;
4. Control Commission of Chaoyang District, Beijing 100026, China;
5. Ministry of Public Security Evidence Identification Center, Beijing 100038, China
Abstract  This study investigates individual differences in the formant dynamics of four Chinese triphthongs, /iau/, /iou/, /uai/ and /uei/, produced by thirty male speakers aged 18 to 28 at three speaking rates (fast, normal and slow). The formant dynamics are described by cubic polynomial fits, and the objective is to discriminate between speakers. The results show that discrimination ability varies with speaking rate: fast speech gives the best discrimination (76.7%), while normal (69.5%) and slow (70.3%) speech score lower. Discrimination ability decreases when the speaking rates being compared differ, with the "fast + slow" combination giving the worst discrimination (48.0%). In all cases, /iau/ identifies different speakers more easily than the other three triphthongs. Therefore, the formant dynamics of triphthongs produced at the same or similar speaking rates can be used to distinguish speakers more easily.
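The cubic polynomial description mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `fit_cubic` and `normalise_time`, the mapping of each token onto a 0 to 1 time axis, and the synthetic F2 values are all assumptions made for illustration. The fit uses only the Python standard library (normal equations solved by Gaussian elimination); in practice a numerical library routine such as `numpy.polyfit` would normally be used instead.

```python
"""Cubic polynomial description of one formant track (illustrative sketch)."""


def fit_cubic(times, freqs):
    """Least-squares fit of f(t) = c0 + c1*t + c2*t**2 + c3*t**3.

    Returns the coefficients [c0, c1, c2, c3].
    """
    n = 4  # a cubic has four coefficients
    # Normal equations A c = b, with A[i][j] = sum(t**(i+j)), b[i] = sum(f * t**i).
    a = [[sum(t ** (i + j) for t in times) for j in range(n)] for i in range(n)]
    b = [sum(f * t ** i for t, f in zip(times, freqs)) for i in range(n)]

    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]

    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def normalise_time(n_points):
    """Map n_points equally spaced samples onto [0, 1] (assumed normalisation)."""
    return [i / (n_points - 1) for i in range(n_points)]


if __name__ == "__main__":
    # Synthetic F2 track in Hz, invented for this example; not data from the study.
    t = normalise_time(11)
    f2 = [2200 - 1500 * x + 300 * x ** 2 - 200 * x ** 3 for x in t]
    print(fit_cubic(t, f2))
```

Under these assumptions, each formant track of a triphthong token is reduced to four coefficients, and the coefficients from several formants could then serve as the input features for a discriminant analysis across speakers.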
Keywords: triphthong; formant dynamics; polynomial fitting; discriminant analysis; speaking rate
ZTFLH: H017; DF794
Issue Date: 15 September 2017
Cite this article:   
CAO Honglin, WANG Yujing, LI Jingyang. Effect of speaking rate on the formant dynamics of triphthongs[J]. Journal of Tsinghua University (Science and Technology), 2017, 57(9): 958-962.
URL:  
http://jst.tsinghuajournals.com/EN/10.16511/j.cnki.qhdxxb.2017.26.047     OR     http://jst.tsinghuajournals.com/EN/Y2017/V57/I9/958