SVD-based DNN pruning and retraining

doi:10.16511/j.cnki.qhdxxb.2016.21.043


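Abstract: Deep neural networks (DNNs) have a very large number of parameters, which limits their use in applications where computing resources are constrained or speed matters. To reduce the parameter count, researchers have proposed pruning DNNs with singular value decomposition (SVD); however, that method lacks adaptivity because it prunes the same number of singular values from every hidden layer. This paper proposes a DNN pruning method based on the singular rate pruning factor (SRPF). The method computes an SRPF for each hidden layer in a data-driven manner and then prunes each hidden layer with its own pruning factor, which makes full use of the singular value distribution of each hidden layer's weight matrix. Compared with pruning a fixed number of singular values, the method is adaptive. Experiments show that, at the same degree of pruning, it causes a smaller performance loss to the DNN. The paper also proposes a retraining method suited to the pruned DNN.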


Keywords: speech recognition, deep neural network (DNN), singular value decomposition (SVD)

Abstract: Deep neural networks (DNNs) have many parameters, which restricts their use in scenarios with limited computing resources or where speed is a priority. Some researchers have proposed pruning DNNs using singular value decomposition (SVD). However, this method lacks adaptivity because it prunes the same number of singular values in every hidden layer. This paper presents a DNN pruning method based on the singular rate pruning factor (SRPF). The method first calculates an SRPF for each hidden layer from the data and then prunes each layer with its own pruning factor, making full use of the distribution of the singular values in each hidden layer. The method is more adaptive than pruning a fixed number of singular values, and experiments show that, at the same degree of pruning, a DNN pruned with it loses less performance. A retraining method adapted to the pruned DNN is also given.

Key words: speech recognition, deep neural network (DNN), singular value decomposition (SVD)
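The abstract describes pruning each hidden layer by a different amount, guided by that layer's singular value distribution, but it does not give the SRPF formula. The sketch below is therefore only an illustration under a stated assumption: each layer is given a hypothetical `keep_ratio`, read here as the fraction of the total sum of singular values to retain, and the weight matrix is replaced by two thin factors from a truncated SVD. The function names `rank_for_ratio`, `prune_layer`, and `prune_network` are made up for this example, and the data-driven step that would choose the per-layer ratios is not reproduced.

```python
import numpy as np


def rank_for_ratio(singular_values, keep_ratio):
    """Smallest k such that the top-k singular values hold at least
    keep_ratio of the total sum (an assumed reading of a per-layer factor)."""
    cumulative = np.cumsum(singular_values) / np.sum(singular_values)
    # First index whose cumulative share reaches keep_ratio, converted to a count.
    k = int(np.searchsorted(cumulative, keep_ratio, side="left")) + 1
    # Guard against rounding that leaves the last share slightly below 1.0.
    return min(k, len(singular_values))


def prune_layer(weight, keep_ratio):
    """Approximate an (m x n) weight matrix by a (m x k) and b (k x n)."""
    # np.linalg.svd returns singular values in descending order.
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    k = rank_for_ratio(s, keep_ratio)
    a = u[:, :k] * s[:k]  # fold the kept singular values into the left factor
    b = vt[:k, :]
    return a, b


def prune_network(weights, keep_ratios):
    """Prune every layer with its own ratio instead of one fixed rank."""
    return [prune_layer(w, r) for w, r in zip(weights, keep_ratios)]
```

With this factorization a layer of size m x n stores k(m + n) values instead of mn, so pruning only saves parameters when k is smaller than mn / (m + n). A layer whose singular values decay quickly reaches a given ratio at a small k, while a layer with a flat spectrum keeps more, which is the kind of per-layer adaptivity the abstract contrasts with removing the same number of singular values everywhere.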

Received: 2015-07-10 Published: 2016-07-15

CLC number: TN912.34

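Funding: National Natural Science Foundation of China (11461141004, 91120001, 61271426); National High Technology Research and Development Program of China (863 Program) (2012AA012503); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030100, XDA06030500); Key Deployment Project of the Chinese Academy of Sciences (KGZD-EW-103-2)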

Corresponding author: ZHANG Pengyuan, associate researcher, E-mail: zhangpengyuan@hccl.ioa.ac.cn

Cite this article:

XING Anhao, ZHANG Pengyuan, PAN Jielin, YAN Yonghong. SVD-based DNN pruning and retraining[J]. Journal of Tsinghua University (Science and Technology), 2016, 56(7): 772-776.

Link to this article:

http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2016.21.043 or http://jst.tsinghuajournals.com/CN/Y2016/V56/I7/772

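Fig. 1 Schematic of the DNN structure

Fig. 2 Comparison of the singular value distributions of different layers

Table 1 Comparison of the two pruning methods

Fig. 3 Performance curves during retraining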


