Automatic speech recognition with robot noise

doi:10.16511/j.cnki.qhdxxb.2017.22.007


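Abstract: Whenever a robot moves any part of its body, it inevitably generates ego noise. This noise comes from the body joints and from other hardware such as fans. Because the noise sources are closer to the robot's microphones than the target sound source, the noise is picked up more readily than the target speech. Based on the types of robot ego noise, this paper proposes a method that combines spectral subtraction, joint noise template subtraction, labeled-region cepstral mean subtraction, and multi-condition training to estimate and suppress ego noise. A series of experiments shows that the proposed method effectively reduces the influence of ego noise and improves the robustness of speech recognition.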

Keywords: robot, speech recognition, speech enhancement

Abstract: Robots inevitably produce noise when they move any part of their body. Such noise is caused by the body joint motors as well as the CPU cooling fans. Moreover, this noise is easily captured by the robot's microphones because its sources are closer to the microphones than the target speech source. This paper presents a de-noising method that combines spectral subtraction, joint noise template subtraction, labeled area cepstral mean subtraction, and multi-condition training to estimate and suppress robot noise. Tests show that this method significantly reduces the effect of robot noise and thereby improves automatic speech recognition.
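As an illustration of the first component named in the abstract, the following is a minimal sketch of magnitude-domain spectral subtraction, assuming the fan noise is roughly stationary and that noise-only frames are available for estimating its spectrum. The function names, the over-subtraction factor `alpha`, and the spectral floor `beta` are hypothetical choices for this sketch and are not taken from the paper.

```python
from typing import List, Sequence


def estimate_noise_spectrum(noise_frames: Sequence[Sequence[float]]) -> List[float]:
    """Average the magnitude spectra of noise-only frames, bin by bin."""
    n_frames = len(noise_frames)
    n_bins = len(noise_frames[0])
    return [sum(frame[k] for frame in noise_frames) / n_frames for k in range(n_bins)]


def spectral_subtract(
    frames: Sequence[Sequence[float]],
    noise_mag: Sequence[float],
    alpha: float = 1.0,   # over-subtraction factor (assumed value)
    beta: float = 0.01,   # spectral floor as a fraction of the noisy magnitude (assumed value)
) -> List[List[float]]:
    """Subtract the noise magnitude estimate from every frame.

    Each bin is clamped to beta * (noisy magnitude) so that the result
    never becomes negative.
    """
    cleaned = []
    for frame in frames:
        out = []
        for mag, noise in zip(frame, noise_mag):
            residual = mag - alpha * noise
            out.append(max(residual, beta * mag))
        cleaned.append(out)
    return cleaned


# Toy example: two noise-only frames and one noisy speech frame, two bins each.
noise = estimate_noise_spectrum([[1.0, 2.0], [3.0, 2.0]])
cleaned = spectral_subtract([[5.0, 1.0]], noise)
```

In practice the magnitude frames would come from a short-time Fourier transform of the microphone signal, and the cleaned magnitudes would be recombined with the noisy phase before feature extraction. A noise template for joint motion could in principle be passed in place of `noise_mag`, although how the paper selects and aligns its templates is not described in the abstract.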

Key words: robot, speech recognition, speech enhancement

Received: 2016-06-20; Published: 2017-02-15

CLC number: TP242; TN912.34

Corresponding author: LU Wenhuan, associate professor, E-mail: wenhuan@tju.edu.cn

Cite this article:

WANG Jianrong, ZHANG Ju, LU Wenhuan, WEI Jianguo, DANG Jianwu. Automatic speech recognition with robot noise[J]. Journal of Tsinghua University (Science and Technology), 2017, 57(2): 153-157.

Link to this article:

http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2017.22.007 or http://jst.tsinghuajournals.com/CN/Y2017/V57/I2/153

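Fig. 1 Speech sample in the robot ego-noise environment

Fig. 2 CPU cooling fan noise at different amplitudes

Fig. 3 The NAO robot from Aldebaran Robotics

Table 1 Speech recognition results for individual techniques and their combinations under different types of ego noise (distance 120 cm)

Fig. 4 Before and after spectral subtraction of fan noise

Fig. 5 Before and after joint noise template subtraction

Fig. 6 Comparison of labeled-region cepstral mean subtraction and global cepstral mean subtraction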

