Combined self-attention mechanism for named entity recognition in social media

doi:10.16511/j.cnki.qhdxxb.2019.25.005

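Abstract (translated from the Chinese): Compared with named entity recognition (NER) on well-formed news text, NER on Chinese social media performs poorly, mainly because the text is non-standard and the annotated corpora are small. Recent work on NER for Chinese social media has mostly addressed the small corpus size, tending to use external knowledge or joint training to improve recognition performance. Few studies address the other problem: because social media text is non-standard, the features contained in the text itself are not sufficiently exploited. This paper focuses on the text itself and proposes an NER method that combines bidirectional long short-term memory with a self-attention mechanism. By capturing context-dependent information in multiple different subspaces, the method better understands and represents sentence structure, fully mines the features contained in the text itself, and ultimately improves entity recognition on non-standard text. Several groups of comparative experiments on the public Weibo NER corpus verify the effectiveness of the method. The results show that, without external resources or joint training, the method achieves an F₁ score of 58.76% for named entity recognition.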
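The idea of capturing context in several subspaces can be illustrated with a small multi-head self-attention sketch. This is a dependency-free illustration under stated assumptions, not the authors' implementation: the function names, the toy dimensions, and the randomly initialised projection matrices `w_q`, `w_k`, `w_v`, `w_o` are all hypothetical, and `h` merely stands in for the hidden states a bidirectional LSTM would produce.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(q, k, v):
    """softmax(Q K^T / sqrt(d_k)) V for a single head."""
    d_k = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


def columns(m, lo, hi):
    """Slice feature columns lo..hi-1 out of every row."""
    return [row[lo:hi] for row in m]


def multi_head_self_attention(h, w_q, w_k, w_v, w_o, num_heads):
    """Self-attention over h, with each head working in its own subspace."""
    d_head = len(h[0]) // num_heads
    q, k, v = matmul(h, w_q), matmul(h, w_k), matmul(h, w_v)
    head_outputs, head_weights = [], []
    for i in range(num_heads):
        lo, hi = i * d_head, (i + 1) * d_head
        ctx, w = scaled_dot_product_attention(
            columns(q, lo, hi), columns(k, lo, hi), columns(v, lo, hi)
        )
        head_outputs.append(ctx)
        head_weights.append(w)
    # Concatenate the heads position by position, then mix them with w_o.
    merged = [sum((head[t] for head in head_outputs), []) for t in range(len(h))]
    return matmul(merged, w_o), head_weights


def rand_matrix(rows, cols):
    """Hypothetical random initialisation, for illustration only."""
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
seq_len, d_model, num_heads = 5, 8, 2  # toy sizes, not the paper's settings
h = rand_matrix(seq_len, d_model)      # stand-in for BiLSTM hidden states
w_q, w_k, w_v, w_o = (rand_matrix(d_model, d_model) for _ in range(4))
out, attn = multi_head_self_attention(h, w_q, w_k, w_v, w_o, num_heads)
```

Each head attends over the whole sentence but only within its own slice of the feature dimension, so different heads are free to pick up different kinds of contextual relations before the final projection combines them.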

Abstract: Named entity recognition (NER) in Chinese social media is less effective than in standard news text, mainly because the text is poorly normalized and the existing annotated corpus is small. In recent years, research on NER in Chinese social media has tended to use external knowledge and joint training to compensate for the small annotated corpus. However, few studies have examined how to mine the features contained in the social media text itself. This article focuses on the text itself and presents a neural network model that combines bi-directional long short-term memory with a self-attention mechanism. The model extracts context information in multiple different subspaces to better understand and represent sentence structure and to improve recognition performance. Tests on the public Weibo NER corpus show that this method is more effective than previous approaches and achieves a 58.76% F₁-score without using external knowledge or joint learning.
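For readers unfamiliar with how an F₁-score for NER is typically obtained, the following sketch computes entity-level precision, recall, and F₁ from BIO tag sequences. It assumes exact-match scoring of (start, end, type) spans, which is a common convention but is not confirmed by the abstract; the function names and the `PER`/`LOC` labels in the example are hypothetical.

```python
def extract_entities(tags):
    """Return the set of (start, end, type) spans in a BIO tag sequence."""
    entities, start, etype = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        inside = tag.startswith("I-") and etype == tag[2:]
        if not inside:
            if start is not None:
                entities.add((start, i, etype))
                start, etype = None, None
            if tag.startswith("B-"):
                start, etype = i, tag[2:]
    return entities


def precision_recall_f1(gold_tags, pred_tags):
    """Entity-level scores: a prediction counts only if span and type both match."""
    gold, pred = extract_entities(gold_tags), extract_entities(pred_tags)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy example: the person span is right, the location span is cut short.
gold = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O"]
pred = ["B-PER", "I-PER", "O", "B-LOC", "O", "O"]
p, r, f1 = precision_recall_f1(gold, pred)
```

In this toy case one of the two predicted spans matches one of the two gold spans exactly, so precision, recall, and F₁ all come out at 0.5; a partially correct boundary earns no credit under exact-match scoring.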

Key words: named entity recognition (NER), Chinese social media, self-attention mechanism

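Received: 2018-10-15 Published: 2019-06-01

Funding: National Natural Science Foundation of China (61472264, 61876118); Artificial Intelligence Emergency Project (61751206); sub-project of the National Key Research and Development Program of China (2017YFB1002101)

Corresponding author: KONG Fang, professor, E-mail: kongfang@suda.edu.cn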

Cite this article:

LI Mingyang, KONG Fang. Combined self-attention mechanism for named entity recognition in social media[J]. Journal of Tsinghua University (Science and Technology), 2019, 59(6): 461-467.

Link to this article:

http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2019.25.005 or http://jst.tsinghuajournals.com/CN/Y2019/V59/I6/461

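Fig. 1 LSTM-Self_Att-CRF model

Fig. 2 LSTM unit structure

Fig. 3 Scaled dot-product attention mechanism

Fig. 4 Multi-head attention mechanism

Table 1 Structure of the Weibo NER dataset

Table 2 Distribution of the Weibo NER corpus

Table 3 Comparison of model experimental settings

Table 4 Experimental results for named entity recognition in Chinese social media

Table 5 Detailed comparison of NE recognition results

Table 6 Detailed comparison of NM recognition results

Table 7 Detailed comparison of NE recognition results

Table 8 Detailed comparison of NM recognition results

Table 9 Distribution of the three recognition cases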
