Journal of Tsinghua University (Science and Technology), 2017, Vol. 57, Issue 1: 28-32. DOI: 10.16511/j.cnki.qhdxxb.2017.21.006
COMPUTER SCIENCE AND TECHNOLOGY
Score regulation based on GMM token ratio similarity for speaker recognition
YANG Yingchun, DENG Licai
College of Computer Science & Technology, Zhejiang University, Hangzhou 310027, China
Abstract  This paper presents a score regulation approach for speaker recognition that judges the reliability of a test score from the GMM token ratio (GTR) similarity. In the GMM-UBM (Gaussian mixture model, universal background model) method, the GMM token of a frame is the index of the UBM component that gives the highest score for that frame. During both the training and testing phases, the tokens of an utterance are recorded frame by frame to form a vector called the GMM token ratio (GTR) of that utterance. In the test phase, the GTR of the test utterance is compared with the GTR of the target speaker's training utterance to compute a similarity. When the similarity falls below a threshold, the original likelihood score is multiplied by a penalty factor, and the product is used as the final score of the test utterance. Experiments on the MASC corpus show that this method improves speaker recognition performance.
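The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under several assumptions that the abstract does not specify: the UBM has diagonal covariances, the per-component "score" is the weighted log-likelihood, the GTR is the fraction of frames assigned to each component, the similarity is a histogram intersection, and the `threshold` and `penalty` values are placeholders. All function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def gmm_tokens(frames, weights, means, variances):
    """Return, for each frame, the index of the top-scoring UBM component.

    Assumption: diagonal-covariance UBM, score = weighted log-likelihood.
    frames: (T, D), weights: (M,), means: (M, D), variances: (M, D)
    """
    diff = frames[:, None, :] - means[None, :, :]            # (T, M, D)
    log_like = (np.log(weights)
                - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)
                - 0.5 * np.sum(diff ** 2 / variances, axis=2))  # (T, M)
    return np.argmax(log_like, axis=1)

def token_ratio(tokens, n_components):
    """Assumed GTR: fraction of frames assigned to each UBM component."""
    counts = np.bincount(tokens, minlength=n_components)
    return counts / counts.sum()

def gtr_similarity(gtr_a, gtr_b):
    """Assumed similarity measure: histogram intersection, in [0, 1]."""
    return float(np.minimum(gtr_a, gtr_b).sum())

def regulate_score(score, similarity, threshold=0.5, penalty=1.2):
    """Multiply the score by a penalty factor when similarity is low.

    Assumption: scores are negative log-likelihoods of the form used here,
    so a factor above 1 lowers the score; threshold and penalty are
    illustrative values, not those of the paper.
    """
    return score * penalty if similarity < threshold else score
```

In use, the training GTR of each target speaker would be computed once and stored, and `regulate_score` would be applied to the GMM-UBM score of every trial against that speaker. The choice of similarity measure and the penalty value would need to be tuned on development data.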
Keywords  speaker recognition; GMM token ratio (GTR); score regulation
CLC number (ZTFLH): TP391.43
Issue Date: 15 January 2017
Cite this article:   
YANG Yingchun, DENG Licai. Score regulation based on GMM token ratio similarity for speaker recognition[J]. Journal of Tsinghua University (Science and Technology), 2017, 57(1): 28-32.
URL:  
http://jst.tsinghuajournals.com/EN/10.16511/j.cnki.qhdxxb.2017.21.006     OR     http://jst.tsinghuajournals.com/EN/Y2017/V57/I1/28
  
  
  
  