Abstract: Microblog sentiment analysis aims to identify the opinions that users express. Most deep-learning-based sentiment analysis methods do not make use of emotion symbols. This study presents a double attention model for microblog sentiment analysis. First, a microblog emotion symbol knowledge base is constructed from existing emotional semantic resources, including emotion words, degree adverbs, negation words, microblog emoticons, and common Internet slang. Then, a bidirectional long short-term memory (BiLSTM) network and a fully connected network encode the microblog text and the emotion symbols in the text. Next, an attention model builds semantic representations of the microblog text and of the emotion symbols, and these are combined into the final semantic representation of the microblog text. Finally, the emotion classification model is trained on these semantic representations. Combining the attention model with emotion symbols strengthens the model's ability to capture emotions and improves microblog sentiment classification. The model achieves the best accuracy on many of the sentiment classification tasks in the Natural Language Processing and Chinese Computing (NLPCC) microblog sentiment analysis datasets. On the 2013 NLPCC dataset, the macro-averaged and micro-averaged F1-scores are 1.39% and 1.26% higher than those of the best known model; on the 2014 dataset, they are 2.02% and 2.21% higher.
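The double attention step can be illustrated with a minimal sketch. The abstract does not specify the scoring function or how the two representations are combined, so everything below is an assumption made for illustration: attention scores are dot products with a query vector, the two pooled vectors are concatenated, and a text with no emotion symbols falls back to a zero vector. The function names (`attention_pool`, `double_attention`) and the hand-written state vectors are hypothetical stand-ins for the encoder outputs, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(states, query):
    """Pool a sequence of encoded vectors into one vector.

    Assumption: each state is scored by its dot product with `query`.
    Returns (pooled_vector, attention_weights).
    """
    if not states:
        # Assumption: no items to attend over yields a zero vector.
        return [0.0] * len(query), []
    scores = [sum(s * q for s, q in zip(state, query)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    pooled = [sum(w * state[d] for w, state in zip(weights, states))
              for d in range(dim)]
    return pooled, weights


def double_attention(text_states, symbol_states, text_query, symbol_query):
    """Attend separately over text and emotion-symbol encodings.

    Assumption: the two pooled vectors are combined by concatenation.
    """
    text_repr, text_weights = attention_pool(text_states, text_query)
    symbol_repr, symbol_weights = attention_pool(symbol_states, symbol_query)
    return text_repr + symbol_repr, text_weights, symbol_weights


# Hypothetical encoder outputs: three text positions, two emotion symbols.
text_states = [[0.1, 0.3], [0.7, -0.2], [0.0, 0.5]]
symbol_states = [[0.9, 0.1], [-0.4, 0.6]]

final_repr, text_w, symbol_w = double_attention(
    text_states, symbol_states, text_query=[1.0, 0.0], symbol_query=[0.0, 1.0]
)
print(len(final_repr), text_w, symbol_w)
```

In a trained model the query vectors and the encoders would be learned jointly with the classifier, which would take the concatenated vector as input. Here they are fixed only to keep the sketch self-contained.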