Journal of Tsinghua University (Science and Technology)  2017, Vol. 57 Issue (9): 945-951    DOI: 10.16511/j.cnki.qhdxxb.2017.26.045
Electronic Engineering
Radius vector-driven 3-D Mandarin vocal tract model
YAO Yun 1, WU Xiyu 2, KONG Jiangping 2
1. College of Chinese Language and Literature, Henan University, Kaifeng 475001, China;
2. Department of Chinese Language and Literature, Peking University, Beijing 100871, China
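Abstract (translated from the Chinese): To obtain more accurate vocal tract resonance characteristics, researchers are increasingly concerned with the structure of the vocal tract and how its shape changes during speech. This paper extracts vocal tract contours, midlines, and radius vector data from 3-D MRI image data for the seven Mandarin monophthongs [a], [o], [r], [i], [u], [y], and [e], and cuts 36 equally spaced cross-sections along the vocal tract midline from the lips to the glottis. For each cross-section, the shape at a given position is made to transition linearly according to the magnitude of the radius vector, which yields a radius vector-driven 3-D Mandarin vocal tract model. Formants of the model were computed and speech samples were synthesized; in a listening test against natural speech, the model achieved fairly good synthesis results.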
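The abstract states that formants were computed from the model but does not say how. As a minimal sketch of one standard approach (not necessarily the authors'), the 36-section area function can be treated as a chain of lossless cylindrical tubes, closed at the glottis and ideally open at the lips. Formants are then the frequencies at which the D element of the overall chain matrix vanishes. The function name, the constants `C` and `RHO`, and the uniform test tube are illustrative assumptions, not values from the paper.

```python
import numpy as np

C = 350.0    # assumed speed of sound in warm, moist air (m/s)
RHO = 1.14   # assumed air density (kg/m^3); it cancels in the result

def formants_from_area(areas_lips_to_glottis, section_length,
                       f_max=5000.0, df=1.0):
    """Estimate formants (Hz) of a lossless concatenated-tube model.

    areas_lips_to_glottis : section areas in m^2, ordered lips -> glottis
    section_length        : length of each section in m
    """
    # The chain matrix is accumulated from the glottis towards the lips.
    areas = np.asarray(areas_lips_to_glottis, dtype=float)[::-1]
    f = np.arange(df, f_max, df)
    k = 2.0 * np.pi * f / C
    cs, sn = np.cos(k * section_length), np.sin(k * section_length)

    # Overall matrix [[m11, j*m12], [j*m21, m22]] with real m's,
    # stored per frequency and initialised to the identity.
    m11, m12 = np.ones_like(f), np.zeros_like(f)
    m21, m22 = np.zeros_like(f), np.ones_like(f)
    for area in areas:
        z = RHO * C / area  # characteristic impedance of this section
        m11, m12, m21, m22 = (
            m11 * cs - m12 * sn / z,
            m11 * z * sn + m12 * cs,
            m21 * cs + m22 * sn / z,
            m22 * cs - m21 * z * sn,
        )

    # With zero pressure at the lips, U_lips / U_glottis = 1 / m22,
    # so formants are the zero crossings of m22 along the frequency axis.
    d0, d1 = m22[:-1], m22[1:]
    idx = np.nonzero((d0 != 0) & (d0 * d1 <= 0))[0]
    # Linear interpolation between the two grid points around each zero.
    return f[idx] + df * m22[idx] / (m22[idx] - m22[idx + 1])

# Usage: a uniform 17.5 cm tube (hypothetical, not data from the paper).
uniform = np.full(36, 3.0e-4)
print(formants_from_area(uniform, 0.175 / 36))
```

For a uniform tube the chain matrix reduces to that of a quarter-wavelength resonator, so under the assumed constants the printed values should fall close to 500, 1500, 2500 Hz and so on, which gives a quick sanity check. The sketch ignores wall losses, lip radiation, and side branches, so it yields formant frequencies only, not bandwidths.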
Abstract: Analyses of vocal tract resonance characteristics require accurate models of the vocal tract shape. This article presents a three-dimensional Mandarin vocal tract model built from vocal tract shape data and midsagittal radius vector data extracted from MRI images of seven sustained Mandarin vowels: [a], [o], [r], [i], [u], [y], and [e]. The vocal tract images were cut into 36 equally spaced sections along the midline of the vocal tract. At each section, the cross-sectional shape of the model is driven by the length of the radius vector. In listening tests against natural speech, speech synthesized with the model sounded very much like natural speech.
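The idea of driving a section's shape by the radius vector length can be sketched as a linear blend between reference contours. This is an illustration under assumptions, not the authors' code: each section position is assumed to have at least two reference contours (for example, one per vowel), all sampled with the same number of points in the same order and each tagged with its radius vector length. `interpolate_section` and `polygon_area` are hypothetical helper names.

```python
import numpy as np

def interpolate_section(ref_radii, ref_contours, r):
    """Blend reference contours at one tract position for a target radius r.

    ref_radii    : (K,) radius vector lengths of the K reference shapes
    ref_contours : (K, N, 2) contour points, consistently sampled
    r            : target radius vector length (clamped to the reference range)
    """
    radii = np.asarray(ref_radii, dtype=float)
    contours = np.asarray(ref_contours, dtype=float)
    if radii.size < 2:
        raise ValueError("need at least two reference contours")

    # Sort references by radius so the bracketing pair can be located.
    order = np.argsort(radii)
    radii, contours = radii[order], contours[order]

    r = float(np.clip(r, radii[0], radii[-1]))
    hi = int(np.clip(np.searchsorted(radii, r, side="left"), 1, radii.size - 1))
    lo = hi - 1

    span = radii[hi] - radii[lo]
    w = 0.0 if span == 0 else (r - radii[lo]) / span
    # Linear transition between the two neighbouring reference shapes.
    return (1.0 - w) * contours[lo] + w * contours[hi]

def polygon_area(contour):
    """Area enclosed by a closed contour (shoelace formula)."""
    x, y = contour[:, 0], contour[:, 1]
    return 0.5 * abs(np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y))

# Usage with made-up circular contours of radius 1 and 3 (arbitrary units).
theta = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
unit_circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
section = interpolate_section([1.0, 3.0], [unit_circle, 3.0 * unit_circle], 2.0)
print(polygon_area(section))
```

Applying `polygon_area` to the interpolated contour at each of the 36 positions would give an area function for any intermediate radius vector configuration. Clamping the target radius to the reference range is a design choice of this sketch; extrapolating beyond the measured vowels would need separate justification.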
Key words: Mandarin; vocal tract model; radius vector
Received: 2016-05-06      Published: 2017-09-15
CLC number: H017
Corresponding author: KONG Jiangping, Professor, E-mail: jpkong@pku.edu.cn
Cite this article:
YAO Yun, WU Xiyu, KONG Jiangping. Radius vector-driven 3-D Mandarin vocal tract model[J]. Journal of Tsinghua University (Science and Technology), 2017, 57(9): 945-951.
Link to this article:
http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2017.26.045  or  http://jst.tsinghuajournals.com/CN/Y2017/V57/I9/945
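  Table 1 Scanning parameters of the MRI equipment
  Fig. 1 Vocal tract cross-sections of the Mandarin vowel [a]
  Fig. 2 3-D vocal tracts of the Mandarin monophthongs [a], [i], and [u]
  Table 2 Radius vectors of the 36 cross-sections of the Mandarin vocal tract from the lips to the glottis
  Fig. 3 Cross-section edge contours at different positions along the vocal tract
  Fig. 4 3-D mesh model of the Mandarin vocal tract
  Fig. 5 Vocal tract area functions and transfer functions for the Mandarin transition from [a] to [i]
  Table 3 Formant data of natural speech and computationally simulated speech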