Journal of Tsinghua University(Science and Technology)    2018, Vol. 58 Issue (5) : 516-522     DOI: 10.16511/j.cnki.qhdxxb.2018.25.026
COMPUTER SCIENCE AND TECHNOLOGY
A robust time-delay estimation and dereverberation algorithm based on the coherence function
FANG Yi (1), CHEN Youyuan (2), MOU Hongyu (2), FENG Haihong (2)
1. The Institute of Acoustics, University of Chinese Academy of Sciences, Beijing 100190, China;
2. Shanghai Acoustics Laboratory, Chinese Academy of Sciences, Shanghai 200815, China
Abstract  Traditional time-delay estimation methods based on cross-correlation degrade sharply in reverberant environments. Precedence effect models built on cross-correlation functions have been proposed to address this, but they are highly parameter-sensitive and require complex front-end processing. This paper describes a method that first updates the coherence functions of the speech and the noise based on the eigenvalues of the covariance matrix. Next, the speech coherence function is matched to the ideal coherence function to estimate the time delay. Finally, the estimated time delay and the noise coherence function are applied to a coherent-to-diffuse power ratio (CDR) estimator to suppress reverberation. Tests show that this scheme localizes sources more accurately than traditional methods and achieves higher PESQ (perceptual evaluation of speech quality) scores than other CDR estimators.
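The matching step in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the coherence is estimated here by plain Welch-style averaging rather than by the eigenvalue-based update the paper describes, the ideal coherence is assumed to be that of a single plane wave, exp(j2πfτ), and the function names (`estimate_coherence`, `estimate_delay`), the frame sizes, and the candidate grid are all hypothetical choices made for illustration.

```python
import numpy as np


def estimate_coherence(x1, x2, nfft=512, hop=256):
    """Welch-style estimate of the complex coherence between two channels.

    Assumption: simple frame averaging stands in for the paper's
    eigenvalue-based update, which the abstract does not specify.
    """
    win = np.hanning(nfft)
    n_bins = nfft // 2 + 1
    s11 = np.zeros(n_bins)
    s22 = np.zeros(n_bins)
    s12 = np.zeros(n_bins, dtype=complex)
    n_frames = 1 + (len(x1) - nfft) // hop
    for m in range(n_frames):
        seg = slice(m * hop, m * hop + nfft)
        f1 = np.fft.rfft(win * x1[seg])
        f2 = np.fft.rfft(win * x2[seg])
        s11 += np.abs(f1) ** 2          # auto-spectrum, channel 1
        s22 += np.abs(f2) ** 2          # auto-spectrum, channel 2
        s12 += f1 * np.conj(f2)         # cross-spectrum
    return s12 / np.sqrt(s11 * s22 + 1e-12)


def estimate_delay(coh, freqs, candidates):
    """Pick the candidate delay whose ideal coherence best matches `coh`.

    Assumed ideal model: channel 2 lags channel 1 by tau seconds, so the
    ideal coherence at frequency f is exp(j * 2 * pi * f * tau).
    """
    ideal = np.exp(2j * np.pi * np.outer(candidates, freqs))
    score = np.real(ideal.conj() @ coh)  # correlation with each candidate
    return candidates[np.argmax(score)]


# Synthetic check: white noise, channel 2 delayed by 3 samples.
rng = np.random.default_rng(0)
fs = 16000
nfft = 512
true_delay_samples = 3
s = rng.standard_normal(fs)
n = fs - true_delay_samples
x1 = s[true_delay_samples:] + 0.1 * rng.standard_normal(n)
x2 = s[:-true_delay_samples] + 0.1 * rng.standard_normal(n)

coh = estimate_coherence(x1, x2, nfft=nfft)
freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
candidates = np.arange(-8, 8.25, 0.25) / fs   # delays in seconds
tau_hat = estimate_delay(coh, freqs, candidates)
print("estimated delay in samples:", tau_hat * fs)
```

A grid search keeps the sketch short; a practical system would restrict the candidates to the physically possible delays for the microphone spacing and could refine the peak by interpolation.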
Keywords: precedence effect; time-delay estimation; dereverberation; coherence function
ZTFLH: TP242; TN912.34
Issue Date: 15 May 2018
Cite this article:   
FANG Yi, CHEN Youyuan, MOU Hongyu, et al. A robust time-delay estimation and dereverberation algorithm based on the coherence function[J]. Journal of Tsinghua University(Science and Technology), 2018, 58(5): 516-522.
URL:  
http://jst.tsinghuajournals.com/EN/10.16511/j.cnki.qhdxxb.2018.25.026     OR     http://jst.tsinghuajournals.com/EN/Y2018/V58/I5/516
Copyright © Journal of Tsinghua University(Science and Technology), All Rights Reserved.