Journal of Tsinghua University (Science and Technology), 2020, Vol. 60, Issue (5): 415-421    DOI: 10.16511/j.cnki.qhdxxb.2020.26.006
Special Topic: Big Data
Text classification model based on multi-head attention capsule networks
JIA Xudong, WANG Li
College of Data Science, Taiyuan University of Technology, Taiyuan 030024, China
Full text: PDF (4184 KB)
Abstract: The importance of each word in a text sequence, and the dependencies between words, strongly influence the identification of text categories. Capsule networks cannot selectively focus on the important words in a text, and because they cannot encode long-distance dependencies, they are severely limited when classifying texts that contain semantic transitions. To address these problems, this paper proposes a capsule network model based on multi-head attention. The model encodes the dependencies between words, captures the important words in a text, and encodes the text semantics, thereby effectively improving performance on text classification tasks. Experimental results show that the proposed model clearly outperforms convolutional neural networks and capsule networks on text classification, performs even better on multi-label text classification, and benefits more from attention.
Key words: capsule networks; multi-head attention; natural language processing; text classification
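
The abstract combines two mechanisms: multi-head self-attention, which encodes pairwise dependencies between words and lets the model weight important words, and a capsule layer trained with dynamic routing (Sabour et al., 2017), whose output capsule lengths act as per-class confidences. The PyTorch code below is a minimal illustrative sketch of such a combination, not the architecture reported in the paper: the class name AttnCapsClassifier and every layer size, head count, and routing iteration count are assumptions chosen for demonstration.

# Illustrative sketch (assumed sizes, not the authors' exact architecture):
# word embeddings -> multi-head self-attention -> class capsules via
# dynamic routing (Sabour et al., 2017).
import torch
import torch.nn as nn
import torch.nn.functional as F

def squash(s, dim=-1, eps=1e-8):
    """Capsule non-linearity: keeps direction, maps the norm into (0, 1)."""
    sq_norm = (s ** 2).sum(dim=dim, keepdim=True)
    return (sq_norm / (1.0 + sq_norm)) * s / torch.sqrt(sq_norm + eps)

class CapsuleLayer(nn.Module):
    """Routes in_caps input capsules to num_caps output capsules by dynamic routing."""
    def __init__(self, in_caps, in_dim, num_caps, out_dim, iters=3):
        super().__init__()
        self.iters = iters
        # One transformation matrix per (input capsule, output capsule) pair.
        self.W = nn.Parameter(0.01 * torch.randn(1, in_caps, num_caps, out_dim, in_dim))

    def forward(self, x):                      # x: (B, in_caps, in_dim)
        u = x[:, :, None, :, None]             # (B, in_caps, 1, in_dim, 1)
        u_hat = (self.W @ u).squeeze(-1)       # predictions: (B, in_caps, num_caps, out_dim)
        b = torch.zeros(*u_hat.shape[:3], device=x.device)   # routing logits
        for _ in range(self.iters):
            c = F.softmax(b, dim=2)            # coupling coefficients over output capsules
            v = squash((c.unsqueeze(-1) * u_hat).sum(dim=1))  # (B, num_caps, out_dim)
            b = b + (u_hat * v.unsqueeze(1)).sum(-1)          # agreement update
        return v

class AttnCapsClassifier(nn.Module):
    """Hypothetical attention + capsule text classifier over fixed-length padded inputs."""
    def __init__(self, vocab, d_model=128, heads=8, seq_len=50,
                 num_classes=4, caps_dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        self.attn = nn.MultiheadAttention(d_model, heads, batch_first=True)
        self.caps = CapsuleLayer(seq_len, d_model, num_classes, caps_dim)

    def forward(self, tokens):                 # tokens: (B, seq_len)
        h = self.embed(tokens)
        h, _ = self.attn(h, h, h)              # self-attention encodes word dependencies
        v = self.caps(h)                       # one output capsule per class
        return v.norm(dim=-1)                  # capsule length serves as class confidence

model = AttnCapsClassifier(vocab=10000)
scores = model(torch.randint(0, 10000, (2, 50)))
print(scores.shape)   # torch.Size([2, 4])

Because each class capsule's length is scored independently rather than normalized by a softmax across classes, several capsules can be "long" at once, which is one reason capsule classifiers adapt naturally to the multi-label setting mentioned in the abstract.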
Received: 2019-09-02      Published: 2020-04-26
Corresponding author: WANG Li, professor, E-mail: wangli@tyut.edu.cn
Cite this article:
JIA Xudong, WANG Li. Text classification model based on multi-head attention capsule networks[J]. Journal of Tsinghua University (Science and Technology), 2020, 60(5): 415-421.
Link to this article:
http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2020.26.006  or  http://jst.tsinghuajournals.com/CN/Y2020/V60/I5/415