Combined self-attention mechanism for named entity recognition in social media

LI Mingyang, KONG Fang

Journal of Tsinghua University(Science and Technology) ›› 2019, Vol. 59 ›› Issue (6) : 461-467.

DOI: 10.16511/j.cnki.qhdxxb.2019.25.005
COMPUTER SCIENCE AND TECHNOLOGY

Abstract

Named entity recognition (NER) performs worse on Chinese social media than on standard news text, mainly because of the limited normalization of the text and the small size of the existing annotated corpus. Because the annotated corpus is small, recent research on NER in Chinese social media has tended to rely on external knowledge and joint training to improve performance. However, few studies have mined the characteristics of entity recognition in social media itself. This article focuses on named entity recognition in text using a neural network model that combines bi-directional long short-term memory (BiLSTM) with a self-attention mechanism. The model extracts context information from different dimensions to better understand and represent the sentence structure and thereby improve recognition performance. Tests on the released Weibo NER corpus show that the method outperforms previous approaches, achieving an F1-score of 58.76% without using external knowledge or joint learning.
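As a rough illustration of the self-attention step described in the abstract, the following minimal sketch applies scaled dot-product self-attention to a sequence of hidden vectors. The abstract does not give the exact formulation, so several things here are assumptions: the scaled dot-product form, the absence of learned projection matrices, the toy `hidden_states` values standing in for BiLSTM outputs, and the concatenation used to combine the two representations. It is not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(hidden):
    """Scaled dot-product self-attention with queries = keys = values = hidden.

    Assumption: no learned projections, for brevity only.
    Returns (context vectors, attention weights), one row per position.
    """
    d = len(hidden[0])
    context, weights = [], []
    for q in hidden:
        # Similarity of this position to every position in the sentence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in hidden]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of all hidden vectors.
        context.append([sum(wj * v[i] for wj, v in zip(w, hidden))
                        for i in range(d)])
    return context, weights


# Hypothetical stand-in for BiLSTM outputs of a 3-character sentence.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context, attn = self_attention(hidden_states)

# Assumption: combine the two views by concatenation before tag prediction.
combined = [h + c for h, c in zip(hidden_states, context)]
```

Each row of `attn` is a probability distribution over the positions of the sentence, so every character's context vector can draw on the whole sentence rather than only on its neighbors.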

Key words

named entity recognition (NER) / Chinese social media / self-attention mechanism

Cite this article

LI Mingyang, KONG Fang. Combined self-attention mechanism for named entity recognition in social media[J]. Journal of Tsinghua University(Science and Technology). 2019, 59(6): 461-467 https://doi.org/10.16511/j.cnki.qhdxxb.2019.25.005
