Journal of Tsinghua University (Science and Technology)  2018, Vol. 58 Issue (5): 516-522    DOI: 10.16511/j.cnki.qhdxxb.2018.25.026
Computer Science and Technology
A robust time-delay estimation and dereverberation algorithm based on the coherence function
FANG Yi¹, CHEN Youyuan², MOU Hongyu², FENG Haihong²
1. The Institute of Acoustics, University of Chinese Academy of Sciences, Beijing 100190, China;
2. Shanghai Acoustics Laboratory, Chinese Academy of Sciences, Shanghai 200815, China
Abstract: Conventional generalized cross-correlation (GCC) methods, which rely on the correlation peak, degrade sharply in reverberant environments. Precedence-effect models have been proposed to improve their performance, but these models are computationally complex and sensitive to the choice of thresholds. This paper first uses the eigenvalues of the covariance matrix to update the coherence function of the speech and the coherence function of the noise separately. The speech coherence function is then matched against the ideal coherence function to estimate the time delay. The estimated time delay and the noise coherence function are used to estimate the coherent-to-diffuse power ratio (CDR), and the CDR estimated in real time is finally used for reverberation suppression. Experiments show that the localization error of this method is clearly lower than that of conventional methods, and that the perceptual evaluation of speech quality (PESQ) scores after dereverberation are higher than those of the compared algorithms.
Keywords: precedence effect; time-delay estimation; dereverberation; coherence function
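The matching and CDR steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the speech and noise coherence functions are already available (the eigenvalue-based update is not reproduced), it models the ideal coherence as a free-field plane wave exp(-j2πfτ) without head shadowing, it uses a simple coherence-ratio CDR estimator, and it applies a Wiener-like gain with a floor. All function names, the sign convention of the delay, and the parameter `gain_floor` are assumptions made for illustration.

```python
import numpy as np


def ideal_coherence(freqs_hz, tau_s):
    """Free-field coherence of a single plane wave with inter-microphone delay tau_s."""
    return np.exp(-2j * np.pi * freqs_hz * tau_s)


def estimate_delay(speech_coh, freqs_hz, candidate_taus_s):
    """Return the candidate delay whose ideal coherence best matches speech_coh."""
    # Rows index candidate delays, columns index frequency bins.
    templates = ideal_coherence(freqs_hz[None, :], candidate_taus_s[:, None])
    # The real part of the inner product is largest when the phases line up.
    scores = np.real(templates.conj() @ speech_coh)
    return candidate_taus_s[np.argmax(scores)]


def cdr_from_coherence(mix_coh, noise_coh, tau_s, freqs_hz):
    """Estimate the CDR per frequency bin from the mixture and noise coherence.

    Assumes mix_coh = (CDR * target + noise_coh) / (CDR + 1), which gives
    CDR = (noise_coh - mix_coh) / (mix_coh - target).
    """
    target = ideal_coherence(freqs_hz, tau_s)
    denom = mix_coh - target
    # Avoid division by zero where the mixture is fully coherent.
    denom = np.where(np.abs(denom) < 1e-12, 1e-12, denom)
    cdr = np.real((noise_coh - mix_coh) / denom)
    return np.maximum(cdr, 0.0)


def suppression_gain(cdr, gain_floor=0.1):
    """Wiener-like gain CDR / (CDR + 1), limited from below by gain_floor (assumed value)."""
    return np.maximum(cdr / (cdr + 1.0), gain_floor)
```

In use, `estimate_delay` would be called once per frame on the current speech coherence estimate, and the resulting delay passed to `cdr_from_coherence` together with the noise coherence, for example the diffuse-field model `np.sinc(2 * freqs_hz * d / c)` for microphone spacing `d` and sound speed `c`. The gain from `suppression_gain` is then applied to the short-time spectrum of each channel.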
Received: 2017-09-26      Published: 2018-05-15
CLC number: TP242; TN912.34
Funding: National Natural Science Foundation of China (11474309)
Corresponding author: CHEN Youyuan, associate research fellow, E-mail: chenyouyuan@mail.ioa.ac.cn
About the author: FANG Yi (1990-), male, PhD candidate.
Cite this article:
FANG Yi, CHEN Youyuan, MOU Hongyu, FENG Haihong. A robust time-delay estimation and dereverberation algorithm based on the coherence function. Journal of Tsinghua University (Science and Technology), 2018, 58(5): 516-522.
Link to this article:
http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2018.25.026  or  http://jst.tsinghuajournals.com/CN/Y2018/V58/I5/516
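  Fig. 1 System block diagram of the proposed algorithm
  Fig. 2 True and estimated noise coherence functions
  Fig. 3 Block diagram of the time-delay estimation
  Fig. 4 Localization results for a single sound source
  Fig. 5 Localization results for three sound sources
  Fig. 6 Signals before and after processing by the dereverberation algorithm
  Table 1 Root-mean-square error (RMS error) of the time-delay estimation algorithms
  Fig. 7 Average PESQ scores at different angles at 1 m
  Fig. 8 Average PESQ scores at different angles at 2 m
  Fig. 9 Average PESQ scores at different angles at 3 m
  Table 2 Average PESQ scores at different distances
  Table 3 Average segSNR at different distances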