Modeling of the tongue tip in Standard Chinese using MRI

doi:10.16511/j.cnki.qhdxxb.2017.22.008


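Abstract: The shape and motion of the tongue tip were modeled by analyzing magnetic resonance imaging (MRI) data of Standard Chinese (Putonghua). An MRI articulatory database of Standard Chinese was built, containing 9 monophthongs and 75 consonant variants. The contours of the articulators in the midsagittal plane were extracted. Principal component analysis of the tongue contours showed that modeling the tongue tip and the tongue body separately is more parsimonious. For tongue tip articulation, two articulatory parameters, tongue tip protrusion (TTP) and tongue tip raising (TTR), control the shape and movement of the tongue tip, and together they define an articulatory model of the tongue tip.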

Keywords: magnetic resonance imaging, Standard Chinese, tongue tip, articulatory model

Abstract: The tongue tip motion in Standard Chinese was modeled based on articulatory data from magnetic resonance imaging (MRI). An MRI articulatory database was developed for Standard Chinese, including 9 vowels and 75 consonant variants. Principal component analysis (PCA) of the tongue shape was then used to find articulatory factors. The results show that the tongue should be divided into the tongue tip and the tongue body, which are then modeled separately for more precise results. The tongue tip motion is modeled with two articulatory parameters, tongue tip protrusion (TTP) and tongue tip raising (TTR), which represent the protruding/advancing and raising/retroflexing movements of the tongue tip.

Key words: magnetic resonance imaging (MRI); Standard Chinese; tongue tip; articulatory model
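The abstract describes running PCA on tongue contours to find articulatory factors. A minimal sketch of that kind of analysis follows, assuming each midsagittal contour is resampled to a fixed number of points and flattened into one row vector. The function name `pca_contours`, the point count, and the synthetic data are illustrative assumptions; the paper's actual contour sampling and data are not given here.

```python
import numpy as np

def pca_contours(shapes):
    """PCA of flattened contours.

    shapes: array of shape (n_shapes, 2 * n_points); each row is one
    contour stored as (x1..xn, y1..yn). This layout is an assumption.
    Returns the mean contour, the principal directions (as rows), and
    the fraction of total variance carried by each component.
    """
    shapes = np.asarray(shapes, dtype=float)
    mean = shapes.mean(axis=0)
    centered = shapes - mean
    # SVD of the centered data: rows of vt are the principal directions,
    # and squared singular values are proportional to the variances.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance = s ** 2
    ratio = variance / variance.sum()
    return mean, vt, ratio

# Synthetic stand-in data (NOT the paper's MRI contours):
# 84 shapes (9 vowels + 75 consonant variants), 30 points per contour,
# generated from two hidden factors plus a little noise.
rng = np.random.default_rng(0)
n_shapes, n_points = 84, 30
latent = rng.normal(size=(n_shapes, 2)) * [3.0, 1.0]
basis = rng.normal(size=(2, 2 * n_points))
shapes = latent @ basis + 0.05 * rng.normal(size=(n_shapes, 2 * n_points))

mean, components, ratio = pca_contours(shapes)
# Cumulative contribution of the first 10 components.
cumulative = np.cumsum(ratio[:10])
print(np.round(cumulative, 3))
```

Running the same function separately on the tongue tip points and the tongue body points is one way to compare how many components each region needs, which is the kind of comparison the abstract reports.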

Received: 2016-06-23  Published: 2017-02-15

CLC number: H017

Cite this article:

WANG Gaowu, DANG Jianwu, KONG Jiangping. Modeling of the tongue tip in Standard Chinese using MRI[J]. Journal of Tsinghua University (Science and Technology), 2017, 57(2): 158-163.

Link to this article:

http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2017.22.008 or http://jst.tsinghuajournals.com/CN/Y2017/V57/I2/158

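Fig. 1 MRI image of the Standard Chinese vowel "a" and the articulator shapes obtained after processing

Fig. 2 Distribution of tongue shapes in the static MRI data and contribution rates of the first 10 principal components

Fig. 3 Schematic of the movement trends of the first 4 principal components of the tongue shape

Fig. 4 Schematic of the oral cavity (tongue tip and tongue body analyzed separately)

Fig. 5 Schematic of the movement trends of the first 2 principal components of the tongue tip

Fig. 6 Oral cavity schematics of the 4 consonants with the largest tongue tip reconstruction errors

Table 1 Principal component reconstruction errors for the whole tongue, the tongue body, and the tongue tip

Fig. 7 Effects of the tongue tip protrusion and tongue tip raising parameters on the tongue tip shape in the midsagittal plane (tongue viewed from the left side of the body)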
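The captions for Table 1 and Fig. 7 refer to reconstruction errors and to the effect of two parameters on the tongue tip shape. One way to picture this is a linear model in which a tip contour is the mean contour plus two weighted axes. The sketch below treats the two weights as scores on the first two principal directions and labels them `ttp` and `ttr` purely for illustration. How the paper actually defines TTP and TTR is not stated here, and all function names and data are hypothetical.

```python
import numpy as np

def fit_tip_model(tip_shapes):
    """Return the mean tip contour and two orthonormal axes (as rows)."""
    tip_shapes = np.asarray(tip_shapes, dtype=float)
    mean = tip_shapes.mean(axis=0)
    # Rows of vt are the principal directions of the centered contours.
    _, _, vt = np.linalg.svd(tip_shapes - mean, full_matrices=False)
    return mean, vt[:2]

def encode(shape, mean, axes):
    """Project one contour onto the two axes, giving (ttp, ttr)."""
    return axes @ (np.asarray(shape, dtype=float) - mean)

def synthesize(params, mean, axes):
    """Rebuild a contour from the two parameter values."""
    return mean + np.asarray(params, dtype=float) @ axes

def rms_error(shape, recon):
    """Root-mean-square difference between a contour and its rebuild."""
    diff = np.asarray(shape, dtype=float) - recon
    return float(np.sqrt(np.mean(diff ** 2)))

# Synthetic stand-in tip contours (NOT the paper's data): 84 shapes,
# 10 points each, driven by two hidden factors plus small noise.
rng = np.random.default_rng(1)
n_shapes, n_points = 84, 10
latent = rng.normal(size=(n_shapes, 2)) * [2.0, 1.0]
basis = rng.normal(size=(2, 2 * n_points))
tip_shapes = latent @ basis + 0.05 * rng.normal(size=(n_shapes, 2 * n_points))

mean, axes = fit_tip_model(tip_shapes)
errors = np.array([
    rms_error(s, synthesize(encode(s, mean, axes), mean, axes))
    for s in tip_shapes
])
print("mean RMS reconstruction error:", errors.mean())
```

Sweeping one parameter while holding the other at zero, for example `synthesize([t, 0.0], mean, axes)` for a range of `t`, gives the kind of shape family that a figure like Fig. 7 would display.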

