Multi-neural network collaboration for Chinese military named entity recognition
YIN Xuezhen1, ZHAO Hui2, ZHAO Junbao3, YAO Wanwei1, HUANG Zelin1
1. School of Software Engineering, East China Normal University, Shanghai 200062, China; 2. Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai 200062, China; 3. Beijing Remote Sensing Information Institute, Beijing 100085, China
Abstract: Web data contains a large amount of high-value military information and has become an important data source for open-source military intelligence. Military named entity recognition (NER) is a basic yet key task for information extraction, question answering, and knowledge graph construction in the military domain. Military NER faces challenges rarely seen in other domains: military entity boundaries are vague and difficult to define, military terms in Internet media lack standardization, casual abbreviations are widespread, and public military-oriented corpora are scarce. This paper presents an entity labeling strategy that accounts for fuzzy entity boundaries and, combined with domain expert knowledge, constructs MilitaryCorpus, a military-oriented corpus built from microblog data. A multi-neural-network collaborative named entity recognition model is then developed: character-level features are learned in a BERT (bidirectional encoder representations from transformers) based Chinese character embedding layer, context features are extracted in a BiLSTM (bi-directional long short-term memory) layer to form the feature matrix, and the optimal tag sequence is generated by a CRF (conditional random field) layer. Experiments show that the recall and F-score of the BERT-BiLSTM-CRF model are 28.48% and 18.65% higher than those of a CRF-based entity recognition model, 13.91% and 8.69% higher than those of a BiLSTM-CRF model, and 7.08% and 5.15% higher than those of a CNN (convolutional neural network)-BiLSTM-CRF model.
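The decoding step described above, where the CRF layer selects the optimal tag sequence over the BiLSTM's per-character scores, can be sketched with a minimal Viterbi decoder. The tag set, emission scores, and transition scores below are illustrative assumptions for exposition, not the paper's trained parameters:

```python
# Minimal Viterbi decoder for a CRF output layer: given per-character
# emission scores (e.g., produced by a BiLSTM) and tag-transition scores,
# return the highest-scoring BIO tag sequence. All values are illustrative.

def viterbi_decode(emissions, transitions, tags):
    """emissions: list of {tag: score} dicts, one per character.
    transitions: {(prev_tag, cur_tag): score}; missing pairs score 0.
    Returns the best-scoring tag sequence (list of tags)."""
    # scores[t] holds the best path score ending in tag t so far.
    scores = {t: emissions[0][t] for t in tags}
    backptrs = []  # one {cur_tag: best_prev_tag} dict per later position
    for em in emissions[1:]:
        step_ptr, new_scores = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: scores[p] + transitions.get((p, cur), 0.0))
            new_scores[cur] = scores[prev] + transitions.get((prev, cur), 0.0) + em[cur]
            step_ptr[cur] = prev
        scores = new_scores
        backptrs.append(step_ptr)
    # Backtrack from the best final tag to recover the full sequence.
    tag = max(tags, key=lambda t: scores[t])
    path = [tag]
    for ptr in reversed(backptrs):
        tag = ptr[tag]
        path.append(tag)
    return list(reversed(path))


# Hypothetical 3-character example with a weapon-name entity span.
tags = ["B-WEAPON", "I-WEAPON", "O"]
transitions = {
    ("B-WEAPON", "I-WEAPON"): 1.0,   # reward continuing an entity
    ("I-WEAPON", "I-WEAPON"): 0.5,
    ("O", "I-WEAPON"): -5.0,          # penalize I- without a preceding B-/I-
}
emissions = [
    {"B-WEAPON": 2.0, "I-WEAPON": 0.5, "O": 1.0},
    {"B-WEAPON": 0.2, "I-WEAPON": 1.8, "O": 0.5},
    {"B-WEAPON": 0.1, "I-WEAPON": 0.3, "O": 2.0},
]
path = viterbi_decode(emissions, transitions, tags)
print(path)  # ['B-WEAPON', 'I-WEAPON', 'O']
```

The transition scores are what distinguish a CRF from per-character softmax decoding: the penalty on ("O", "I-WEAPON") prevents ill-formed sequences in which an inside tag appears without a begin tag, which is exactly the kind of boundary constraint that matters for fuzzy military entity boundaries.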