Journal of Tsinghua University (Science and Technology)  2022, Vol. 62 Issue (5): 978-986    DOI: 10.16511/j.cnki.qhdxxb.2022.22.023
Mechanical Engineering
Knowledge extraction and knowledge base construction method from industrial software packages
WANG Liping1,2, ZHANG Chao2, CAI Enlei2, SHI Huijie2, WANG Dong1
1. Department of Mechanical Engineering, Tsinghua University, Beijing 100084, China;
2. School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
Abstract: Independently developed industrial software is one of the core forces supporting the innovative development of domestic small and medium-sized enterprises. Texts related to such software contain a large amount of manufacturing knowledge, but methods for extracting this knowledge and building a knowledge base from it are currently lacking. This paper proposes a knowledge extraction model based on neural networks and natural language processing. The model comprises three parts: text representation, entity recognition, and relation extraction. The extracted entities and relations are modeled as a knowledge graph: ontology modeling defines the concepts related to the software, and graph data modeling maps the concepts in the ontology model to graph data, which improves data retrieval and modeling capabilities. The data are then persisted in the knowledge base. Application results show that the method can be used to build an industrial software knowledge base and plays an important role in integrating manufacturing knowledge.
Key words: industrial software; neural network; entity recognition; relation extraction; knowledge graph
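The ontology-to-graph mapping described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the concept labels (Software, Module, Process), the relation types, the entity names, and the in-memory KnowledgeGraph class are hypothetical and are not taken from the paper, which persists its data in a knowledge base rather than in Python objects. The sketch only shows the general idea that extracted triples are checked against the ontology before they become nodes and edges.

```python
from dataclasses import dataclass, field

# Hypothetical ontology: concept labels and the relation types allowed between them.
# These names are illustrative assumptions, not taken from the paper.
ONTOLOGY = {
    "concepts": {"Software", "Module", "Process"},
    "relations": {
        "has_module": ("Software", "Module"),
        "supports": ("Module", "Process"),
    },
}


@dataclass
class KnowledgeGraph:
    nodes: dict = field(default_factory=dict)  # entity name -> concept label
    edges: set = field(default_factory=set)    # (head, relation, tail)

    def add_triple(self, head, head_type, relation, tail, tail_type):
        """Add one extracted triple if it conforms to the ontology."""
        if relation not in ONTOLOGY["relations"]:
            return False
        if ONTOLOGY["relations"][relation] != (head_type, tail_type):
            return False
        self.nodes[head] = head_type
        self.nodes[tail] = tail_type
        self.edges.add((head, relation, tail))
        return True

    def neighbors(self, head, relation):
        """Return the tails linked to `head` by `relation`, sorted by name."""
        return sorted(t for h, r, t in self.edges if h == head and r == relation)


# Toy stand-in for the output of entity recognition and relation extraction.
extracted = [
    ("CAM-X", "Software", "has_module", "ToolpathPlanner", "Module"),
    ("ToolpathPlanner", "Module", "supports", "Milling", "Process"),
    ("Milling", "Process", "has_module", "CAM-X", "Software"),  # violates the ontology
]

kg = KnowledgeGraph()
accepted = [kg.add_triple(*t) for t in extracted]
```

Here the third triple is rejected because the ontology only allows has_module from a Software concept to a Module concept, so the graph keeps three nodes and two edges. A graph database would replace the in-memory dictionaries in a real system.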
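Received: 2021-12-13      Published: 2022-04-26
Funding: National Key Research and Development Program of China (2020YFB1712303)
Corresponding author: WANG Dong, Assistant Researcher, E-mail: d-wang@tsinghua.edu.cn
About the author: WANG Liping (born 1967), male, professor.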
Cite this article:
WANG Liping, ZHANG Chao, CAI Enlei, SHI Huijie, WANG Dong. Knowledge extraction and knowledge base construction method from industrial software packages[J]. Journal of Tsinghua University (Science and Technology), 2022, 62(5): 978-986.
Link to this article:
http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2022.22.023  or  http://jst.tsinghuajournals.com/CN/Y2022/V62/I5/978