Speaker verification based on SVM and total variability

doi:10.16511/j.cnki.qhdxxb.2017.26.003


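Abstract (translated from the Chinese): Algorithms based on total variability factor analysis and probabilistic linear discriminant analysis (PLDA) are currently the mainstream approach to text-independent speaker verification. This paper combines total variability analysis with a support vector machine (SVM): the low-dimensional total variability factors serve as the input features of the SVM, and a cosine kernel function is used to enhance the discriminability of these low-dimensional features. The resulting system achieves performance comparable to that of the current mainstream algorithms. Furthermore, fusing the scores of this system with those of the PLDA system yields a clear performance gain. On the female part of the common test conditions of the NIST 2012 speaker recognition evaluation, the detection cost function of the fused system in conditions 1 and 3 is 25.1% and 25.2% lower, respectively, than that of the best single system.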
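The cosine kernel mentioned in the abstract can be illustrated with a minimal sketch. It assumes the kernel is the plain cosine similarity between two i-vectors; the abstract does not give the exact formulation or any normalization applied beforehand, and the function names and toy vectors below are hypothetical.

```python
import math

def cosine_kernel(x, y):
    """Cosine similarity between two vectors, usable as an SVM kernel value."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def gram_matrix(vectors):
    """Pairwise kernel values, the input a precomputed-kernel SVM trainer expects."""
    return [[cosine_kernel(u, v) for v in vectors] for u in vectors]

# Toy 3-dimensional stand-ins for i-vectors (assumption: real i-vectors
# typically have a few hundred dimensions).
ivectors = [
    [1.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],  # same direction as the first vector, different length
    [0.0, 3.0, 0.0],  # orthogonal to the first two vectors
]
K = gram_matrix(ivectors)
```

Because the kernel depends only on the angle between vectors, the first two toy vectors get the maximum kernel value even though their lengths differ, which is the length-invariance property that makes this kernel attractive for low-dimensional features.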

Authors: GUO Wu, ZHANG Sheng, XU Jie, HU Guoping, MA Xiaokong

Keywords (translated from the Chinese): speaker verification, total variability system, support vector machine, kernel function

Abstract: Algorithms based on the total variability factor extractor and probabilistic linear discriminant analysis (PLDA) are the state of the art for text-independent speaker verification. This study combines the total variability system with a support vector machine (SVM). The low-dimensional i-vectors of the total variability system are used as the inputs to the SVM, with a cosine kernel function used to achieve better discrimination. This method achieves performance comparable to that of the PLDA system. Furthermore, score fusion of the SVM and PLDA systems gives even better results. Tests were conducted on the female part of the interview section of the NIST 2012 core test corpus. The detection cost function (DCF) of the fusion system was reduced by 25.1% for common condition 1 and 25.2% for common condition 3 relative to the best single system.
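Score fusion of two systems can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not describe the fusion rule, so the sketch assumes each system's scores are z-normalized and then combined with a fixed linear weight. The function names, the weight, and the scores are all hypothetical.

```python
import statistics

def z_normalize(scores):
    """Shift and scale scores to zero mean and unit (population) standard deviation."""
    mean = statistics.mean(scores)
    std = statistics.pstdev(scores)
    return [(s - mean) / std for s in scores]

def fuse_scores(svm_scores, plda_scores, svm_weight=0.5):
    """Weighted sum of the two normalized score lists, one value per trial."""
    svm_z = z_normalize(svm_scores)
    plda_z = z_normalize(plda_scores)
    return [svm_weight * a + (1.0 - svm_weight) * b
            for a, b in zip(svm_z, plda_z)]

# Hypothetical scores for the same four verification trials.
svm_scores = [0.9, 0.1, 0.5, -0.3]
plda_scores = [12.0, -4.0, 6.0, -10.0]
fused = fuse_scores(svm_scores, plda_scores)
```

Normalizing first keeps the system with the larger raw score range (here the PLDA-style scores) from dominating the sum. In practice the fusion weight would be tuned on held-out development trials rather than fixed at 0.5.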

Key words: speaker verification, total variability, support vector machine, kernel function

Received: 2016-06-21 Published: 2017-03-15

CLC number (ZTFLH): TN912.34

Cite this article:

GUO Wu, ZHANG Sheng, XU Jie, HU Guoping, MA Xiaokong. Speaker verification based on SVM and total variability[J]. Journal of Tsinghua University (Science and Technology), 2017, 57(3): 240-243.

Link to this article:

http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2017.26.003 or http://jst.tsinghuajournals.com/CN/Y2017/V57/I3/240

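Table 1 Experimental comparison of different input feature vectors

Table 2 Comparison of a series of experiments with the SVM system

Table 3 Comparison of different normalization methods with the cosine-kernel SVM

Table 4 System performance before and after score fusion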
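The relative DCF reductions quoted in the abstract are the kind of figure that a before/after fusion comparison produces. A minimal sketch of the arithmetic, assuming the usual definition of relative reduction, is shown below. The DCF values are hypothetical placeholders, not the paper's results, and the function name is illustrative.

```python
def relative_reduction(baseline_dcf, fused_dcf):
    """Percentage by which the fused DCF is lower than the baseline DCF."""
    return (baseline_dcf - fused_dcf) / baseline_dcf * 100.0

# Hypothetical DCF values: best single system versus the fused system.
best_single_dcf = 0.400
fused_dcf = 0.300
reduction_percent = relative_reduction(best_single_dcf, fused_dcf)
```

With these placeholder values the fused system's DCF is about 25% lower than the baseline, comparable in size to the 25.1% and 25.2% reductions reported for conditions 1 and 3.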
