Speech enhancement algorithm that combines EEMD and <em>K</em>-SVD dictionary training

doi:10.16511/j.cnki.qhdxxb.2017.26.011





Abstract: This paper presents a speech enhancement algorithm that combines ensemble empirical mode decomposition (EEMD) with K-singular value decomposition (K-SVD) dictionary training. The noisy speech is first decomposed by EEMD into intrinsic mode function (IMF) components. Cross-correlation and autocorrelation analyses of each IMF are then used to identify and remove the noise IMF components. The transition IMF components are decomposed again with EEMD to remove the noise components they contain. The transition IMFs and the remaining IMFs are superimposed to give the pre-denoised speech. An over-complete dictionary is trained on clean speech with the K-SVD dictionary-training algorithm. The pre-denoised speech is sparsely represented over this dictionary, and the enhanced speech is reconstructed from the sparse coefficient vectors. Tests show that the algorithm de-noises better than traditional spectral subtraction, wavelet threshold de-noising, and K-SVD dictionary training alone at both low and high signal-to-noise ratios (SNRs).

Key words: speech enhancement; ensemble empirical mode decomposition (EEMD); K-singular value decomposition (K-SVD); correlation
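The sparse-representation step described in the abstract can be illustrated with a minimal sketch. It assumes orthogonal matching pursuit (OMP) as the sparse coder, which the abstract does not specify, and it uses a random unit-norm dictionary as a stand-in for a K-SVD-trained one. The names `omp` and `reconstruct_frame`, the frame length, and the sparsity level `k` are hypothetical choices made for illustration only.

```python
import numpy as np


def omp(D, x, k):
    """Greedy OMP: approximate x with at most k columns (atoms) of D.

    Assumption: the columns of D have unit norm. OMP is an illustrative
    choice of sparse coder; the abstract does not name one.
    """
    coef = np.zeros(D.shape[1])
    residual = x.astype(float).copy()
    support = []
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:
            break  # no new atom was selected, so stop early
        support.append(idx)
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        coef[:] = 0.0
        coef[support] = sol
        residual = x - D @ coef
    return coef


def reconstruct_frame(D, frame, k):
    """Sparse-code one frame over D and rebuild it from the coefficients."""
    return D @ omp(D, frame, k)


# Stand-in dictionary: random atoms scaled to unit norm (NOT K-SVD-trained).
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)

# Hypothetical "pre-denoised" frame: three atoms plus a little residual noise.
clean = D[:, [5, 40, 200]] @ np.array([1.5, -2.0, 1.0])
frame = clean + 0.05 * rng.standard_normal(64)

estimate = reconstruct_frame(D, frame, k=3)
print("error before:", np.linalg.norm(frame - clean))
print("error after: ", np.linalg.norm(estimate - clean))
```

In a real pipeline the dictionary would be learned from clean speech frames, and the sparsity level would be tuned; restricting the representation to a few atoms is what lets the reconstruction discard components that the dictionary cannot represent compactly.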

Received: 2016-06-23 Published: 2017-03-15

CLC number: TN912.35

Corresponding author: YANG Hongwu, professor, E-mail: yanghw@nwnu.edu.cn

Cite this article:

GAN Zhenye, CHEN Hao, YANG Hongwu. Speech enhancement algorithm that combines EEMD and K-SVD dictionary training[J]. Journal of Tsinghua University (Science and Technology), 2017, 57(3): 286-292.

Link to this article:

http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2017.26.011 or http://jst.tsinghuajournals.com/CN/Y2017/V57/I3/286

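Fig. 1 Flowchart of the algorithm

Fig. 2 Correlation coefficients of the IMF components for noisy speech at an SNR of -5 dB

Fig. 3 Correlation coefficients of the IMF components for noisy speech at an SNR of 5 dB

Fig. 4 Autocorrelation analysis of IMF1 for noisy speech at an SNR of -5 dB

Fig. 5 Autocorrelation analysis of IMF1 for noisy speech at an SNR of 5 dB

Fig. 6 SNR after enhancement by each algorithm at different input SNRs

Fig. 7 PESQ after enhancement by each algorithm at different input SNRs

Fig. 8 Waveforms enhanced by the four algorithms with white noise added at -5 dB

Fig. 9 Waveforms enhanced by the four algorithms with colored noise added at -5 dB

Table 1 DRT scores of the four algorithms with different added noises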
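Figs. 2 to 5 concern the correlation coefficients of the IMF components and the autocorrelation of IMF1. As a rough illustration of how such quantities can be computed, the sketch below uses synthetic stand-in components rather than real EEMD output. The function names, the lag window `max_lag`, and the threshold `0.2` are assumptions made for illustration, not values taken from the paper.

```python
import numpy as np


def correlation_coefficients(components, signal):
    """Pearson correlation of each component with the reference signal."""
    return np.array([np.corrcoef(c, signal)[0, 1] for c in components])


def normalized_autocorrelation(x):
    """Biased autocorrelation for lags >= 0, scaled so that lag 0 equals 1."""
    x = x - np.mean(x)
    r = np.correlate(x, x, mode="full")[len(x) - 1:]
    return r / r[0]


def looks_like_noise(x, max_lag=50, threshold=0.2):
    """Assumed heuristic: a white-noise-like signal has an autocorrelation
    that collapses to near zero immediately after lag 0."""
    r = normalized_autocorrelation(x)
    return float(np.mean(np.abs(r[1:max_lag + 1]))) < threshold


# Synthetic stand-ins for decomposed components (not real EEMD output).
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(2000) / fs
tone = np.sin(2 * np.pi * 50 * t)           # structured low-frequency stand-in
noise = 0.5 * rng.standard_normal(t.size)   # white-noise stand-in
noisy = tone + noise

components = [noise, tone]
print("correlation with noisy signal:",
      correlation_coefficients(components, noisy))
print("noise-like flags:", [looks_like_noise(c) for c in components])
```

The design choice here is to separate the two measurements: the correlation coefficient ranks components by how much of the noisy signal they carry, while the autocorrelation shape indicates whether a single component behaves like uncorrelated noise.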

