Hybrid feature extraction from ultrasound images for a silent speech interface

doi:10.16511/j.cnki.qhdxxb.2017.26.060


Abstract: Ultrasound-based silent speech interfaces usually extract features from ultrasound images of the tongue with principal component analysis (PCA) or the discrete cosine transform (DCT). To preserve the key information in the images, this paper proposes three hybrid feature extraction methods: PCA applied to discrete wavelet transform coefficients (Wavelet PCA), block DCT followed by PCA (block DCT-PCA), and block Walsh-Hadamard transform followed by PCA (block WHT-PCA). In the second and third methods, an appropriate number of DCT or WHT coefficients is selected according to their energy, and PCA then extracts features from the selected coefficients. Experimental results show that the proposed hybrid methods outperform standalone PCA or DCT, and that block DCT-PCA gives the best result among all the methods.

Key words: silent speech interface, ultrasound, tongue, principal component analysis, discrete cosine transform, Walsh-Hadamard transform
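The block DCT-PCA and block WHT-PCA pipelines summarized in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the 8 x 8 block size, the rule of ranking coefficients by their mean squared value over the training images, the random stand-in data, and all function names (`dct_matrix`, `hadamard_matrix`, `block_transform`, `hybrid_features`) and parameter values are hypothetical choices made for the example.

```python
import numpy as np


def dct_matrix(n):
    """Return the n x n orthonormal DCT-II basis matrix."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    i = np.arange(n)[None, :]   # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # the DC row uses a different scale factor
    return m


def hadamard_matrix(n):
    """Return the n x n orthonormal Walsh-Hadamard matrix (natural order)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])  # Sylvester construction
    return h / np.sqrt(n)


def block_transform(image, basis):
    """Apply the 2-D transform basis @ tile @ basis.T to each square tile."""
    b = basis.shape[0]
    h, w = image.shape
    if h % b or w % b:
        raise ValueError("image sides must be multiples of the block size")
    # Split the image into non-overlapping b x b tiles: (rows, cols, b, b).
    tiles = image.reshape(h // b, b, w // b, b).transpose(0, 2, 1, 3)
    return np.einsum("ij,abjk,lk->abil", basis, tiles, basis)


def hybrid_features(images, basis, n_coeffs, n_components):
    """Block transform, energy-based coefficient selection, then PCA."""
    coeffs = np.stack([block_transform(img, basis).ravel() for img in images])
    # Assumed selection rule: keep the coefficients with the highest
    # mean energy (mean squared value) across the training images.
    energy = np.mean(coeffs ** 2, axis=0)
    keep = np.argsort(energy)[::-1][:n_coeffs]
    selected = coeffs[:, keep]
    # PCA via SVD of the mean-centered data; rows of vt are principal axes.
    centered = selected - selected.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


rng = np.random.default_rng(0)
frames = rng.random((20, 32, 32))  # random stand-in for ultrasound frames
dct_feats = hybrid_features(frames, dct_matrix(8), n_coeffs=50, n_components=5)
wht_feats = hybrid_features(frames, hadamard_matrix(8), n_coeffs=50, n_components=5)
print(dct_feats.shape, wht_feats.shape)
```

Passing the basis matrix as a parameter lets the same sketch cover both variants: only the per-block transform differs, while the energy-based selection and the PCA step are shared.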

Received: 2017-02-23; Published: 2017-11-15

CLC number: TP391.4

Corresponding author: WANG Jianrong, associate professor, E-mail: wrj@tju.edu.cn

Cite this article:

LU Wenhuan, QU Yuexin, YANG Yalong, WANG Jianrong, DANG Jianwu. Hybrid feature extraction from ultrasound images for a silent speech interface. Journal of Tsinghua University (Science and Technology), 2017, 57(11): 1159-1162,1169.

Link to this article:

http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2017.26.060 or http://jst.tsinghuajournals.com/CN/Y2017/V57/I11/1159

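Fig. 1 Visual feature extraction process

Fig. 2 One-level Haar wavelet transform of an ultrasound image

Fig. 3 Feature extraction process of the block DCT-PCA method

Table 1 Recognition rates of different feature extraction methods

Fig. 4 Recognition rates using DCT and WHT coefficients of different dimensions

Fig. 5 Confusion matrix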
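Wavelet PCA, which starts from a one-level Haar wavelet transform of each ultrasound image, can be illustrated with a short NumPy sketch. It is only an illustration under assumptions: the source does not state which sub-bands are passed to PCA, so this example keeps only the approximation sub-band, and the function names `haar_level1` and `wavelet_pca`, the image size, and the number of components are hypothetical.

```python
import numpy as np


def haar_level1(image):
    """One-level 2-D Haar transform of an image with even height and width.

    Returns the approximation sub-band and a tuple of three detail sub-bands.
    """
    image = np.asarray(image, dtype=float)
    s = np.sqrt(2.0)
    # Pairwise sums and differences of adjacent rows.
    low = (image[0::2, :] + image[1::2, :]) / s
    high = (image[0::2, :] - image[1::2, :]) / s
    # Same operation on adjacent columns of each result.
    approx = (low[:, 0::2] + low[:, 1::2]) / s
    d1 = (low[:, 0::2] - low[:, 1::2]) / s
    d2 = (high[:, 0::2] + high[:, 1::2]) / s
    d3 = (high[:, 0::2] - high[:, 1::2]) / s
    return approx, (d1, d2, d3)


def wavelet_pca(images, n_components):
    """PCA on the flattened approximation sub-band (an assumed choice)."""
    approx = np.stack([haar_level1(img)[0].ravel() for img in images])
    centered = approx - approx.mean(axis=0)
    # PCA via SVD; rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


rng = np.random.default_rng(1)
frames = rng.random((20, 32, 32))  # random stand-in for ultrasound frames
features = wavelet_pca(frames, n_components=5)
print(features.shape)
```

The approximation sub-band has a quarter of the original pixels, so the PCA step operates on a smaller input than PCA applied to the raw image.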

