Journal of Tsinghua University (Science and Technology)  2017, Vol. 57 Issue (9): 958-962    DOI: 10.16511/j.cnki.qhdxxb.2017.26.047
Electronic Engineering
Effect of speaking rate on the formant dynamics of triphthongs
CAO Honglin1,2,3, WANG Yujing4, LI Jingyang3,5
1. Collaborative Innovation Center of Judicial Civilization, Beijing 100088, China;
2. Key Laboratory of Evidence Science (China University of Political Science and Law), Ministry of Education, Beijing 100088, China;
3. Key Laboratory of Intelligent Speech Technology, Ministry of Public Security, Beijing 100038, China;
4. Supervisory Commission of Chaoyang District, Beijing 100026, China;
5. Ministry of Public Security Evidence Identification Center, Beijing 100038, China
Abstract: This study investigates individual differences in the formant dynamics of four Mandarin Chinese triphthongs (/iau/, /iou/, /uai/ and /uei/) produced by thirty male speakers aged 18 to 28 at three speaking rates (fast, normal and slow). The dynamic trajectories of the first three formants are described by cubic polynomial fits, and the fitted coefficients serve as the predictor variables in a discriminant analysis of speakers. When samples at the same speaking rate are compared, discrimination varies with the rate: fast speech discriminates best (76.7% on average), while normal and slow speech discriminate less well (69.5% and 70.3%, respectively). When samples at different speaking rates are compared, discrimination falls for every triphthong, and the "fast + slow" combination performs worst (48.0% on average). Under all speaking-rate conditions, /iau/ is the triphthong that discriminates speakers best. The formant dynamics of triphthongs can therefore distinguish speakers effectively when the speaking rates are the same or similar.
Key words: triphthong; formant dynamics; polynomial fitting; discriminant analysis; speaking rate
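The cubic polynomial description mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `fit_cubic` and `token_features`, the normalized time axis, the 11 sample points per track, and the formant values are all assumptions made up for illustration. The sketch fits each of three formant tracks by least squares and concatenates the coefficients into the kind of vector that could then be passed to a discriminant analysis.

```python
def fit_cubic(times, freqs):
    """Least-squares cubic fit.

    Returns [c0, c1, c2, c3] such that freq ~ c0 + c1*t + c2*t**2 + c3*t**3.
    """
    n_coef = 4
    # Normal equations (A^T A) c = A^T y, with A[i][k] = times[i] ** k.
    ata = [[sum(t ** (r + c) for t in times) for c in range(n_coef)]
           for r in range(n_coef)]
    aty = [sum((t ** r) * f for t, f in zip(times, freqs))
           for r in range(n_coef)]
    aug = [row + [rhs] for row, rhs in zip(ata, aty)]

    # Forward elimination with partial pivoting.
    for col in range(n_coef):
        pivot = max(range(col, n_coef), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n_coef):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n_coef + 1):
                aug[r][k] -= factor * aug[col][k]

    # Back substitution.
    coefs = [0.0] * n_coef
    for r in range(n_coef - 1, -1, -1):
        tail = sum(aug[r][k] * coefs[k] for k in range(r + 1, n_coef))
        coefs[r] = (aug[r][n_coef] - tail) / aug[r][r]
    return coefs


def token_features(times, formant_tracks):
    """Concatenate the cubic coefficients of every formant track."""
    features = []
    for track in formant_tracks:
        features.extend(fit_cubic(times, track))
    return features


# Synthetic, made-up tracks in Hz on normalized time 0..1 (assumption:
# these are NOT measurements from the study).
times = [i / 10 for i in range(11)]
f1_track = [700 + 300 * t - 900 * t**2 + 400 * t**3 for t in times]
f2_track = [2100 - 3500 * t + 2500 * t**2 - 300 * t**3 for t in times]
f3_track = [2900 - 600 * t + 500 * t**2 + 100 * t**3 for t in times]

# 3 formants x 4 coefficients per token.
features = token_features(times, [f1_track, f2_track, f3_track])
print(len(features))
```

Normalizing time to the interval 0..1 is one plausible way to make tokens of different durations comparable across speaking rates; whether the study normalized time this way is not stated in the abstract.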
Received: 2016-05-12      Published: 2017-09-15
CLC number: H017; DF794
Corresponding author: LI Jingyang, Research Fellow, E-mail: lijy@263.net
Cite this article:
CAO Honglin, WANG Yujing, LI Jingyang. Effect of speaking rate on the formant dynamics of triphthongs[J]. Journal of Tsinghua University (Science and Technology), 2017, 57(9): 958-962.
Link to this article:
http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2017.26.047  or  http://jst.tsinghuajournals.com/CN/Y2017/V57/I9/958
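  Fig. 1 Wideband spectrogram of the triphthong /iau/ with its onset and offset points
  Fig. 2 Formant trajectories of /iau/ for speaker M1 (fast rate)
  Fig. 3 Formant trajectories of /iau/ for speaker M3 (fast rate)
  Fig. 4 Formant trajectories of /iau/ for speaker M3 (normal rate)
  Fig. 5 Discrimination performance for same-rate comparisons
  Fig. 6 Discrimination performance for different-rate comparisons
  Fig. 7 Mean discrimination performance for different-rate comparisons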
Copyright © Editorial Office of Journal of Tsinghua University (Science and Technology)