Journal of Tsinghua University (Science and Technology)  2018, Vol. 58 Issue (2): 122-130    DOI: 10.16511/j.cnki.qhdxxb.2018.22.015
Computer Science and Technology
Microblog sentiment analysis method based on a double attention model
ZHANG Yangsen (张仰森), ZHENG Jia (郑佳), HUANG Gaijuan (黄改娟), JIANG Yuru (蒋玉茹)
Institute of Intelligent Information Processing, Beijing Information Science and Technology University, Beijing 100101, China
Abstract: Microblog sentiment analysis is the basis for obtaining the opinions of microblog users. Most existing sentiment analysis methods keep deep learning models separate from emotion symbols. This paper proposes a microblog sentiment analysis method based on a double attention model. The method first uses existing sentiment knowledge bases to build a microblog emotion symbol library containing emotion words, degree adverbs, negation words, microblog emoticons, and common Internet slang. A bidirectional long short-term memory network and a fully connected network then encode the microblog text and the emotion symbols contained in the text, respectively. Attention models build separate semantic representations of the text and of the emotion symbols, and the two representations are fused into the final semantic representation of the microblog text, on which the sentiment classification model is trained. Combining the attention model with emotion symbols strengthens the ability to capture the sentiment semantics of microblog text and improves microblog sentiment classification performance. The model was evaluated on the public microblog sentiment evaluation datasets of the Natural Language Processing and Chinese Computing (NLPCC) conference, where it achieved the best results on several sentiment classification tasks. Relative to the best previously known model, the macro- and micro-averaged F1 values improved by 1.39% and 1.26% on the 2013 dataset and by 2.02% and 2.21% on the 2014 dataset.
Key words: sentiment analysis    double attention model    microblog    semantic representation    emotion symbol
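The encode, attend, and fuse steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Random vectors stand in for the BiLSTM outputs and the fully connected encodings of the emotion symbols. The attention scoring form (a tanh layer followed by a context vector), the fusion by concatenation, and all names and dimensions are assumptions made for illustration only.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(H, W, b, u):
    """Attention pooling over encoded vectors.

    H: list of T vectors (each of size d); W: d x k matrix; b, u: size-k vectors.
    Returns the attention weights (length T) and the weighted sum of H (size d).
    """
    scores = []
    for h in H:
        hidden = [
            math.tanh(sum(h[i] * W[i][j] for i in range(len(h))) + b[j])
            for j in range(len(b))
        ]
        scores.append(sum(hj * uj for hj, uj in zip(hidden, u)))
    alpha = softmax(scores)
    d = len(H[0])
    summary = [sum(alpha[t] * H[t][i] for t in range(len(H))) for i in range(d)]
    return alpha, summary


def random_matrix(rng, rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_text, d_sym, d_att, n_classes = 8, 4, 5, 3  # hypothetical sizes

# Stand-ins for the encoders: 6 token states and 2 emotion-symbol states.
H_text = random_matrix(rng, 6, d_text)
H_sym = random_matrix(rng, 2, d_sym)

# One attention module per input stream (the "double" attention).
a_text, v_text = attention_pool(
    H_text, random_matrix(rng, d_text, d_att), [0.0] * d_att,
    [rng.gauss(0.0, 1.0) for _ in range(d_att)])
a_sym, v_sym = attention_pool(
    H_sym, random_matrix(rng, d_sym, d_att), [0.0] * d_att,
    [rng.gauss(0.0, 1.0) for _ in range(d_att)])

# Fusion by concatenation (an assumption), then a linear softmax classifier.
fused = v_text + v_sym
W_out = random_matrix(rng, len(fused), n_classes)
logits = [sum(fused[i] * W_out[i][c] for i in range(len(fused)))
          for c in range(n_classes)]
probs = softmax(logits)
```

In a trained model the weight matrices and context vectors would be learned jointly with the encoders; here they are random, so only the shapes and the normalization of the weights are meaningful.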
Received: 2017-08-22      Published: 2018-02-15
CLC number:  TP391.1
Cite this article: ZHANG Yangsen, ZHENG Jia, HUANG Gaijuan, JIANG Yuru. Microblog sentiment analysis method based on a double attention model[J]. Journal of Tsinghua University (Science and Technology), 2018, 58(2): 122-130. (in Chinese)
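  Table 1 Microblog emotion symbol library
  Figure 1 Architecture of the double attention model
  Table 2 Number of microblog texts in the NLPCC 2013 and NLPCC 2014 datasets
  Table 3 Number of microblog texts in the binary sentiment classification datasets
  Table 4 Model parameter tuning list
  Table 5 Experimental results on NLPCC 2013
  Table 6 Experimental results on NLPCC 2014
  Figure 2 Comparison of positive/negative polarity classification results
  Figure 3 Comparison of subjective/objective classification results
  Figure 4 Performance comparison of emotion symbols in the evaluation task
  Figure 5 Performance comparison of emotion symbols in the positive/negative polarity classification task
  Figure 6 (Color online) Visualization of the analysis results for the 2014 evaluation task data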
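The abstract reports gains in macro- and micro-averaged F1. As a reminder of how the two averages differ, the following sketch computes both from gold and predicted labels using the standard definitions. The label names and the toy data are hypothetical, and the exact scoring protocol of the NLPCC evaluation may differ in detail.

```python
def f1(tp, fp, fn):
    """F1 from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_micro_f1(gold, pred, labels):
    """Return (macro F1, micro F1) over the given label set."""
    counts = {c: [0, 0, 0] for c in labels}  # per class: [tp, fp, fn]
    for g, p in zip(gold, pred):
        if g == p:
            if g in counts:
                counts[g][0] += 1
        else:
            if p in counts:
                counts[p][1] += 1
            if g in counts:
                counts[g][2] += 1
    # Macro: average the per-class F1 values, so every class weighs the same.
    macro = sum(f1(*counts[c]) for c in labels) / len(labels)
    # Micro: pool the counts first, so frequent classes dominate.
    tp = sum(counts[c][0] for c in labels)
    fp = sum(counts[c][1] for c in labels)
    fn = sum(counts[c][2] for c in labels)
    micro = f1(tp, fp, fn)
    return macro, micro


# Hypothetical toy example with three emotion classes.
gold = ["happy", "happy", "sad", "anger", "sad", "happy"]
pred = ["happy", "sad", "sad", "anger", "happy", "happy"]
macro, micro = macro_micro_f1(gold, pred, ["happy", "sad", "anger"])
```

Because the macro average weighs each class equally, it is more sensitive to performance on rare emotion classes than the micro average.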