Journal of Tsinghua University(Science and Technology)    2017, Vol. 57 Issue (2) : 147-152     DOI: 10.16511/j.cnki.qhdxxb.2017.22.006
INFORMATION ENGINEERING
Design and optimization of a low resource speech recognition system
ZHANG Pengyuan1, JI Zhe2, HOU Wei2, JIN Xin2, HAN Weisheng1
1. Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China;
2. National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China
Abstract  Wearable devices and smart home systems need speech recognition engines that use few resources and achieve high rejection rates. Traditional methods cannot meet both requirements. This paper presents decoding and rejection algorithms for a low resource speech recognition system. The decoding algorithm raises the rejection rate to 64.8% by changing the filler reentry, while increasing memory use by only 8.5 kB compared with the baseline system. The rejection algorithm computes a background probability and, during online decoding, compares it with similar probabilities calculated in advance. The resulting system achieves a rejection rate of 93.8% with little loss in the recognition rate. Memory use and computational speed are also optimized.
Keywords: speech recognition; low resource; confidence measure
ZTFLH:  TN912.34  
Issue Date: 15 February 2017
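The abstract describes rejection as a comparison between a background probability and probabilities computed in advance, but gives no formula. The following is only a minimal sketch of one common way such a comparison is done, a frame-normalized log-likelihood ratio tested against a threshold; it is not the paper's algorithm. The names `DecodeResult`, `confidence`, `should_reject`, and the threshold value are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class DecodeResult:
    """Hypothetical container for one decoded utterance (not from the paper)."""
    keyword_log_likelihood: float     # log-likelihood of the best in-vocabulary path
    background_log_likelihood: float  # log-likelihood under a background (filler) model
    num_frames: int                   # number of acoustic frames in the utterance


def confidence(result: DecodeResult) -> float:
    """Frame-normalized log-likelihood ratio of keyword path vs. background."""
    if result.num_frames <= 0:
        raise ValueError("num_frames must be positive")
    ratio = result.keyword_log_likelihood - result.background_log_likelihood
    return ratio / result.num_frames


def should_reject(result: DecodeResult, threshold: float) -> bool:
    """Reject the hypothesis when its confidence falls below the threshold."""
    return confidence(result) < threshold


if __name__ == "__main__":
    # Assumed threshold; in practice it would be tuned on held-out data.
    threshold = 0.5
    in_vocab = DecodeResult(-400.0, -500.0, 100)
    out_of_vocab = DecodeResult(-500.0, -490.0, 100)
    print(should_reject(in_vocab, threshold))
    print(should_reject(out_of_vocab, threshold))
```

Normalizing by the frame count keeps the score comparable across utterances of different lengths, which is why a single threshold can be used in this sketch.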
Cite this article:   
ZHANG Pengyuan,JI Zhe,HOU Wei, et al. Design and optimization of a low resource speech recognition system[J]. Journal of Tsinghua University(Science and Technology), 2017, 57(2): 147-152.
URL:  
http://jst.tsinghuajournals.com/EN/10.16511/j.cnki.qhdxxb.2017.22.006     OR     http://jst.tsinghuajournals.com/EN/Y2017/V57/I2/147
  
  
  
  
  
  
  
Copyright © Journal of Tsinghua University(Science and Technology), All Rights Reserved.