Knowledge extraction and knowledge base construction method from industrial software packages
WANG Liping1,2, ZHANG Chao2, CAI Enlei2, SHI Huijie2, WANG Dong1
1. Department of Mechanical Engineering, Tsinghua University, Beijing 100084, China; 2. School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
Abstract: Industrial software is a key force supporting the development of domestic small and medium-sized enterprises. Industrial software packages contain a large amount of knowledge related to manufacturing processes, but little of this knowledge has been extracted and stored in a knowledge base. This paper presents a knowledge extraction model that combines neural networks and natural language processing. The model comprises text representation, entity recognition, and relationship extraction. The extracted entities and relationships are modeled as a knowledge graph, while related concepts in the software are defined through ontology modeling. The ontology concepts are mapped to the graph data to improve data retrieval and modeling capabilities, and the data can be stored in the knowledge base over the long term. The results show that this method can build an industrial software knowledge base that will play an important role in integrating manufacturing knowledge.
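To make the last two steps of the abstract concrete (modeling extracted entities and relationships as a graph, then mapping ontology concepts onto the graph data for retrieval), the following is a minimal sketch only, not the authors' implementation. The `KnowledgeGraph` class, the concept names (`Process`, `Parameter`, `Tool`), the relation names, and the sample entities are all hypothetical and stand in for whatever the entity recognition and relationship extraction steps would actually produce.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    """Toy in-memory triple store; each entity is tagged with one ontology concept."""
    # entity name -> ontology concept (hypothetical concept names)
    entity_concept: dict = field(default_factory=dict)
    # head entity -> list of (relation, tail entity)
    edges: dict = field(default_factory=lambda: defaultdict(list))

    def add_entity(self, name, concept):
        self.entity_concept[name] = concept

    def add_relation(self, head, relation, tail):
        # Only link entities that were already recognized and typed.
        if head not in self.entity_concept or tail not in self.entity_concept:
            raise KeyError("both entities must be added before linking them")
        self.edges[head].append((relation, tail))

    def entities_of(self, concept):
        # Concept-level retrieval: all entities mapped to one ontology concept.
        return sorted(e for e, c in self.entity_concept.items() if c == concept)

    def neighbors(self, head, relation=None):
        # Graph-level retrieval: tails reachable from head, optionally by relation.
        return [t for r, t in self.edges.get(head, []) if relation is None or r == relation]


# Hypothetical output of the entity recognition and relationship extraction steps.
entities = [("milling", "Process"), ("spindle speed", "Parameter"), ("end mill", "Tool")]
relations = [("milling", "has_parameter", "spindle speed"),
             ("milling", "uses_tool", "end mill")]

kg = KnowledgeGraph()
for name, concept in entities:
    kg.add_entity(name, concept)
for head, rel, tail in relations:
    kg.add_relation(head, rel, tail)

print(kg.entities_of("Process"))
print(kg.neighbors("milling", "uses_tool"))
```

Tagging each entity with a concept is what allows queries at the ontology level (all entities of one concept) as well as at the graph level (everything related to a given entity). A persistent knowledge base would presumably replace the in-memory dictionaries with a graph database.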
WANG Liping, ZHANG Chao, CAI Enlei, SHI Huijie, WANG Dong. Knowledge extraction and knowledge base construction method from industrial software packages[J]. Journal of Tsinghua University (Science and Technology), 2022, 62(5): 978-986.