Microblog sentiment analysis method based on a double attention model

doi:10.16511/j.cnki.qhdxxb.2018.22.015


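Abstract: Microblog sentiment analysis is the basis for obtaining the opinions of microblog users. Most existing sentiment analysis methods keep deep learning models separate from emotion symbols. To address this, this paper proposes a microblog sentiment analysis method based on a double attention model. The method uses existing sentiment knowledge bases to build a microblog emotion symbol library containing emotion words, degree adverbs, negation words, microblog emoticons, and common Internet slang. A bidirectional long short-term memory (BiLSTM) network and a fully connected network encode the microblog text and the emotion symbols it contains, respectively. Attention models then build separate semantic representations of the text and of the emotion symbols, and the two are fused into the final semantic representation of the microblog text, on which the sentiment classification model is trained. By combining attention models with emotion symbols, the method strengthens the capture of emotional semantics in microblog text and improves microblog sentiment classification performance. The model was evaluated on the public microblog sentiment evaluation datasets of the Natural Language Processing and Chinese Computing (NLPCC) conference. The results show that it achieves the best results on several sentiment classification tasks: relative to the best previously known model, the macro-averaged and micro-averaged F₁ scores improve by 1.39% and 1.26% on the 2013 dataset, and by 2.02% and 2.21% on the 2014 dataset.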
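The two-branch attention and fusion steps described in the abstract can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the additive (tanh) scoring function, fusion by concatenation, the dimensions, the three output classes, and the names `attention_pool`, `H_text`, and `H_sym`. Random matrices stand in for the outputs of the BiLSTM text encoder and the fully connected symbol encoder.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_pool(H, W, b, u):
    """Score each row of H with a small tanh layer, then return the
    attention-weighted sum of the rows together with the weights."""
    scores = np.tanh(H @ W + b) @ u   # shape (n,)
    alpha = softmax(scores)           # attention weights, sum to 1
    return alpha @ H, alpha           # pooled vector (d,), weights (n,)

rng = np.random.default_rng(0)
d, n_words, n_symbols, n_classes = 8, 6, 2, 3  # illustrative sizes only

# Stand-ins for encoder outputs (assumption: both branches share dimension d).
H_text = rng.normal(size=(n_words, d))    # would come from the BiLSTM
H_sym = rng.normal(size=(n_symbols, d))   # would come from the fully connected encoder

# Separate attention parameters for each branch: the "double" attention.
W_t, b_t, u_t = rng.normal(size=(d, d)), np.zeros(d), rng.normal(size=d)
W_s, b_s, u_s = rng.normal(size=(d, d)), np.zeros(d), rng.normal(size=d)

v_text, a_text = attention_pool(H_text, W_t, b_t, u_t)
v_sym, a_sym = attention_pool(H_sym, W_s, b_s, u_s)

# Fusion by concatenation (assumption; the abstract only says "fused").
v_final = np.concatenate([v_text, v_sym])

# A linear softmax classifier on top of the fused representation.
W_out = rng.normal(size=(2 * d, n_classes))
probs = softmax(v_final @ W_out)
```

In a trained model the attention weights `a_text` and `a_sym` would indicate which words and which emotion symbols contribute most to the predicted class; with random parameters, as here, they only demonstrate the shapes and the data flow.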


Keywords: sentiment analysis, double attention model, microblog, semantic representation, emotion symbol

Abstract: Microblog sentiment analysis is used to obtain users' points of view. Most sentiment analysis methods based on deep learning models do not use emotion symbols. This study uses a double attention model for microblog sentiment analysis. It first constructs a microblog emotion symbol knowledge base from existing emotional semantic resources, covering emotion words, degree adverbs, negation words, microblog emoticons, and common Internet slang. Then, a bidirectional long short-term memory network and a fully connected network are used to encode the microblog text and the emotion symbols in the text, respectively. After that, attention models construct the semantic representations of the microblog text and of the emotion symbols, which are combined into the final semantic representation of the microblog text. Finally, the emotion classification model is trained on these semantic representations. Combining the attention model with emotion symbols enhances the ability to capture emotional semantics and improves microblog sentiment classification. On the Natural Language Processing and Chinese Computing (NLPCC) microblog sentiment analysis task datasets, the model gives the best results on several sentiment classification tasks. Compared with the best known model, the macro-averaged and micro-averaged F₁-scores are 1.39% and 1.26% higher on the 2013 dataset and 2.02% and 2.21% higher on the 2014 dataset.
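The macro-averaged and micro-averaged F₁ scores quoted above differ in how they aggregate over classes. The sketch below uses the common textbook definitions (macro: mean of per-class F₁; micro: F₁ from pooled counts). The function names and the toy labels are illustrative assumptions, and the official NLPCC scorer may define or restrict the averages differently.

```python
from collections import Counter

def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to count."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_micro_f1(y_true, y_pred, labels):
    """Return (macro F1, micro F1) over the given label set."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1   # predicted class gains a false positive
            fn[t] += 1   # true class gains a false negative
    # Macro: average the per-class scores, so every class weighs the same.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts first, so frequent classes dominate.
    micro = f1(sum(tp[c] for c in labels),
               sum(fp[c] for c in labels),
               sum(fn[c] for c in labels))
    return macro, micro

# Toy example with hypothetical emotion labels.
y_true = ["happy", "happy", "sad", "anger"]
y_pred = ["happy", "sad", "sad", "happy"]
macro, micro = macro_micro_f1(y_true, y_pred, ["happy", "sad", "anger"])
print(f"macro F1 = {macro:.3f}, micro F1 = {micro:.3f}")
```

In the toy example the class "anger" is never predicted correctly, which pulls the macro average well below the micro average; this is why papers on imbalanced emotion classes usually report both.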

Key words: sentiment analysis, double attention model, microblog, semantic representation, emotion symbol

Received: 2017-08-22; Published: 2018-02-15

CLC number: TP391.1

Cite this article:

张仰森, 郑佳, 黄改娟, 蒋玉茹. 基于双重注意力模型的微博情感分析方法[J]. 清华大学学报（自然科学版）, 2018, 58(2): 122-130.
ZHANG Yangsen, ZHENG Jia, HUANG Gaijuan, JIANG Yuru. Microblog sentiment analysis method based on a double attention model[J]. Journal of Tsinghua University (Science and Technology), 2018, 58(2): 122-130.

Link to this article:

http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2018.22.015 or http://jst.tsinghuajournals.com/CN/Y2018/V58/I2/122

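Table 1 Microblog emotion symbol library

Fig. 1 Architecture of the double attention model

Table 2 Number of microblog texts in the NLPCC 2013 and NLPCC 2014 datasets

Table 3 Number of microblog texts in the binary sentiment classification datasets

Table 4 List of tuned model parameters

Table 5 Experimental results on NLPCC 2013

Table 6 Experimental results on NLPCC 2014

Fig. 2 Comparison of positive/negative polarity classification results

Fig. 3 Comparison of subjective/objective classification results

Fig. 4 Performance comparison of emotion symbols on the evaluation task

Fig. 5 Performance comparison of emotion symbols on the positive/negative polarity classification task

Fig. 6 (Color online) Visualization of analysis results on the 2014 evaluation task data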


