Journal of Tsinghua University (Science and Technology)  2017, Vol. 57 Issue (11): 1159-1162,1169    DOI: 10.16511/j.cnki.qhdxxb.2017.26.060
Computer Science and Technology
Hybrid feature extraction from ultrasound images for a silent speech interface
LU Wenhuan1, QU Yuexin1, YANG Yalong1, WANG Jianrong2, DANG Jianwu2
1. School of Computer Software, Tianjin University, Tianjin 300350, China;
2. School of Computer Science and Technology, Tianjin University, Tianjin 300350, China
Abstract: In ultrasound-based silent speech interfaces, features are usually extracted from ultrasound images of the tongue with principal component analysis (PCA) or the discrete cosine transform (DCT). To preserve the critical information in the images, this paper proposes three hybrid feature extraction methods: PCA applied to discrete wavelet transform coefficients (Wavelet PCA), block DCT followed by PCA (block DCT-PCA), and block Walsh-Hadamard transform followed by PCA (block WHT-PCA). In the second and third methods, an appropriate number of DCT or WHT coefficients is selected according to their energy, and PCA then extracts features from the selected coefficients. Tests show that the proposed hybrid methods outperform standalone PCA or DCT, and that block DCT-PCA gives the best result among all the methods.
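The block DCT-PCA pipeline summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the block size, the number of retained coefficients, the number of principal components, and the rule "keep the coefficients with the highest mean energy over the training images" are all hypothetical choices, and the random arrays merely stand in for tongue ultrasound frames. The helper names (`dct_matrix`, `block_dct`, `select_by_energy`, `fit_pca`, `block_dct_pca`) are invented for this sketch.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    i = np.arange(n)[None, :]   # sample index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)
    return mat


def block_dct(image, block=8):
    """2-D DCT of each non-overlapping block; returns (n_blocks, block*block)."""
    h, w = image.shape
    h, w = h - h % block, w - w % block        # crop to a multiple of the block size
    d = dct_matrix(block)
    tiles = image[:h, :w].reshape(h // block, block, w // block, block)
    tiles = tiles.transpose(0, 2, 1, 3)        # (block rows, block cols, block, block)
    coeffs = d @ tiles @ d.T                   # separable 2-D DCT applied to every tile
    return coeffs.reshape(-1, block * block)


def select_by_energy(coeff_matrix, keep):
    """Indices of the `keep` coefficients with the highest mean energy (assumed rule)."""
    energy = np.mean(coeff_matrix ** 2, axis=0)
    return np.sort(np.argsort(energy)[::-1][:keep])


def fit_pca(data, n_components):
    """PCA via SVD; returns the data mean and the top principal axes."""
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_components]


def block_dct_pca(images, block=8, keep=64, n_components=10):
    """Block DCT, energy-based coefficient selection, then PCA projection."""
    coeffs = np.stack([block_dct(img, block).ravel() for img in images])
    idx = select_by_energy(coeffs, keep)
    selected = coeffs[:, idx]
    mean, axes = fit_pca(selected, n_components)
    return (selected - mean) @ axes.T


# Synthetic stand-in data: 20 random 32 x 32 "frames" (not real ultrasound images).
rng = np.random.default_rng(0)
images = rng.random((20, 32, 32))
features = block_dct_pca(images)               # one 10-dimensional vector per frame
```

Under the same assumptions, a block WHT-PCA variant would replace `dct_matrix` with a Hadamard matrix of matching size and leave the selection and PCA steps unchanged.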
Key words: silent speech interface    ultrasound    tongue    principal component analysis    discrete cosine transform    Walsh-Hadamard transform
Received: 2017-02-23      Published: 2017-11-15
Chinese Library Classification (CLC):  TP391.4
Corresponding author: WANG Jianrong, associate professor, E-mail: wrj@tju.edu.cn
Cite this article:
LU Wenhuan, QU Yuexin, YANG Yalong, WANG Jianrong, DANG Jianwu. Hybrid feature extraction from ultrasound images for a silent speech interface. Journal of Tsinghua University (Science and Technology), 2017, 57(11): 1159-1162,1169.
Link to this article:
http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2017.26.060  or  http://jst.tsinghuajournals.com/CN/Y2017/V57/I11/1159
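  Fig. 1 Visual feature extraction process
  Fig. 2 One-level Haar wavelet transform of an ultrasound image
  Fig. 3 Feature extraction process of the block DCT-PCA method
  Table 1 Recognition rates of the different feature extraction methods
  Fig. 4 Recognition rates using DCT and WHT coefficients of different dimensions
  Fig. 5 Confusion matrix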