Journal of Tsinghua University (Science and Technology)  2018, Vol. 58 Issue (3): 272-278    DOI: 10.16511/j.cnki.qhdxxb.2018.25.010
Computer Science and Technology
Causal options in Chinese reading comprehension
WANG Yuanlong1, LI Ru1,2, ZHANG Hu1, WANG Zhiqiang1
1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China;
2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China
Abstract: This paper presents a causal relation support analysis method, based on a causal network, for judging causal options in multiple-choice Chinese reading comprehension questions. First, causal event pairs are extracted from the reading material using clue phrases, the strength of the causal association between the events in each pair is computed, and the pairs together with their strengths form a causal network. Second, the TF-IDF (term frequency-inverse document frequency) method, which weighs both the importance of each option word in the document and its discriminative power over the whole document, is used to retrieve from the source text the sentences related to the cause event and the effect event in the option. Finally, the causal support of the option is computed from the causal network and the retrieved sentences. The method was evaluated on a test set of 769 simulated materials and 13 Beijing college entrance examination Chinese papers (each including the source text and multiple-choice questions). The results show that its accuracy is about 11% higher than that of the Baseline method.
Key words: natural language processing    causality network    reading comprehension    semantic similarity
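The retrieval step in the abstract can be pictured with a small sketch. The code below is not the authors' implementation: it assumes each sentence of the source text is treated as one "document", uses a smoothed IDF and cosine similarity (both choices are assumptions, since the abstract only names TF-IDF), and works on whitespace-tokenized English toy sentences, whereas the paper's Chinese material would first need word segmentation. The names `build_idf`, `tfidf_vector`, `cosine`, `retrieve`, and `PASSAGE` are all hypothetical.

```python
import math
from collections import Counter


def build_idf(token_lists):
    """Smoothed inverse document frequency; each sentence counts as a document."""
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_vector(tokens, idf):
    """Map each known term to (relative term frequency) * idf."""
    counts = Counter(t for t in tokens if t in idf)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: (c / total) * idf[term] for term, c in counts.items()}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve(event_tokens, token_lists, top_k=1):
    """Return the top_k (sentence index, score) pairs for one event of an option."""
    idf = build_idf(token_lists)
    query = tfidf_vector(event_tokens, idf)
    scored = [(i, cosine(query, tfidf_vector(tokens, idf)))
              for i, tokens in enumerate(token_lists)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy passage (an assumption for illustration, not data from the paper).
PASSAGE = [s.split() for s in [
    "heavy rain fell on the city overnight",
    "the river flooded several streets",
    "residents were moved to higher ground",
]]

cause_hits = retrieve("heavy rain".split(), PASSAGE)        # cause event of an option
effect_hits = retrieve("streets flooded".split(), PASSAGE)  # effect event of an option
```

Querying the cause event and the effect event separately mirrors the abstract's wording that related sentences are retrieved for each of the two events in an option.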
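Received: 2017-08-26      Published: 2018-03-15
CLC number (ZTFLH):  TP391
Fund: National High Technology Research and Development Program of China ("863" Program) (2015AA015407); National Natural Science Foundation of China (61373082); Natural Science Foundation of Shanxi Province (201601D102030)
Corresponding author: LI Ru, Professor, E-mail: liru@sxu.edu.cn
About the author: WANG Yuanlong (born 1983), male, lecturer.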
Cite this article:
WANG Yuanlong, LI Ru, ZHANG Hu, WANG Zhiqiang. Causal options in Chinese reading comprehension[J]. Journal of Tsinghua University (Science and Technology), 2018, 58(3): 272-278.
Link to this article:
http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2018.25.010  or  http://jst.tsinghuajournals.com/CN/Y2018/V58/I3/272
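  Table 1 Proportion of causal options
  Figure 1 Template for extracting sentences that contain special clue phrases [19]
  Figure 2 Classification of causal clue words
  Figure 3 Example of a causal network
  Figure 4 The source text contains the cause and effect variables corresponding to the option (Case 1)
  Figure 5 The source text lacks the cause variable or the effect variable corresponding to the option (Case 2)
  Table 2 Data statistics
  Table 3 Recall of the proposed method on the test set
  Table 4 Comparison of the proposed method with related methods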
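Only the captions of the figures survive on this page. As a rough illustration of the kind of structure the Figure 3 caption names, the sketch below builds a small causal network from clue phrases. It is a sketch under stated assumptions, not the authors' method: the two English clue patterns in `CUE_PATTERNS` are hypothetical stand-ins for the paper's Chinese clue phrases, and the edge weight (the relative frequency of an effect among all effects extracted for the same cause) is an assumed placeholder, because the page does not say how the causal association strength is computed.

```python
import re
from collections import defaultdict

# Hypothetical English clue patterns; each exposes named groups "cause" and "effect".
CUE_PATTERNS = [
    re.compile(r"^(?P<effect>.+?) because (?P<cause>.+)$"),
    re.compile(r"^(?P<cause>.+?), so (?P<effect>.+)$"),
]


def extract_pairs(sentences):
    """Return (cause, effect) pairs from sentences that match a clue pattern."""
    pairs = []
    for sentence in sentences:
        text = sentence.strip().rstrip(".").lower()
        for pattern in CUE_PATTERNS:
            match = pattern.match(text)
            if match:
                pairs.append((match.group("cause").strip(),
                              match.group("effect").strip()))
                break  # use the first pattern that matches
    return pairs


def build_causal_network(pairs):
    """Build {cause: {effect: strength}} with an assumed relative-frequency strength."""
    counts = defaultdict(lambda: defaultdict(int))
    for cause, effect in pairs:
        counts[cause][effect] += 1
    network = {}
    for cause, effects in counts.items():
        total = sum(effects.values())
        network[cause] = {effect: n / total for effect, n in effects.items()}
    return network


# Toy sentences (an assumption for illustration, not data from the paper).
SENTENCES = [
    "The streets flooded because heavy rain fell.",
    "Heavy rain fell, so the river rose.",
    "The river rose, so the streets flooded.",
    "The festival was cancelled.",
]

pairs = extract_pairs(SENTENCES)
network = build_causal_network(pairs)
```

A nested dictionary keeps the sketch dependency-free; a graph library would serve equally well once path queries between cause and effect events are needed.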
[1] DANQI C, JASON B, CHRISTOPHER D M. A thorough examination of the CNN/Daily mail reading comprehension task[C]//Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Berlin, Germany:ACL, 2016:2359-2367.
[2] LIU Z Y, SUN M S, LIN Y K, et al. Knowledge representation learning:A review[J]. Journal of Computer Research and Development, 2016, 53(2):247-261. (in Chinese)
[3] MANDAR J, EUNSOL C, DANIEL S W, et al. TriviaQA:A large scale distantly supervised challenge dataset for reading comprehension[C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Vancouver, Canada:ACL, 2017:1601-1611.
[4] MRINMAYA S, AVINAVA D, ERIC P X, et al. Learning answer-entailing structures for machine comprehension[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics. Beijing, China:ACL, 2015:239-249.
[5] ADAM T, ZHENG Y, XINGDI Y. A parallel-hierarchical model for machine comprehension on sparse data[C]//Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Berlin, Germany:ACL, 2016:432-441.
[6] YIN W P, SEBASTIAN E, HINRICH S. Attention-based convolutional neural network for machine comprehension[C]//Proceedings of the NAACL Human-Computer Question Answering Workshop. San Diego, CA, USA:NAACL, 2016:15-21.
[7] CUI Y M, LIU T, CHEN Z P, et al. Consensus attention-based neural networks for Chinese reading comprehension[J]. arXiv:1607.02250, 2016a.
[8] CUI Y M, CHEN Z P, WEI S, et al. Attention-over-attention neural networks for reading comprehension[J]. arXiv:1607.04423, 2016b.
[9] KARL M H, TOMAS K, EDWARD G, et al. Teaching machines to read and comprehend[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems. Istanbul, Turkey:MIT Press, 2015:1693-1701.
[10] ALESSANDRO S, PHILLIP B, YOSHUA B. Iterative alternating neural attention for machine reading[J]. arXiv:1606.02245, 2016.
[11] RUDOLF K, MARTIN S, ONDREJ B, et al. Text understanding with the attention sum reader network[J]. arXiv:1603.01547, 2016.
[12] GUO S R, ZHANG H, QIAN Y L, et al. Semantic relevancy between sentences for Chinese reading comprehension on college entrance examinations[J]. Journal of Tsinghua University (Science and Technology), 2017, 57(6):575-579, 585. (in Chinese)
[13] MATTHEW R, CHRISTOPHER J, ERIN R. MCTest:A challenge dataset for the open-domain machine comprehension of text[C]//Proceedings of the Empirical Methods in Natural Language Processing. Seattle, Washington, USA:ACL, 2013:193-203.
[14] GARCIA D. An NLP system to locate expressions of actions connected by causality links[C]//Proceedings of the 10th European Workshop on Knowledge Acquisition, Modeling and Management. Catalonia, Spain:Springer, 1997:347-352.
[15] GIRJU R. Automatic detection of causal relations for question answering[C]//Proceedings of the 41st ACL Workshop on Multilingual Summarization and Question Answering. Sapporo, Japan:ACL, 2003:76-83.
[16] KHOO C, KORNFILT J, ODDY R. Automatic extraction of cause-effect information from newspaper text without knowledge-based inferencing[J]. Literary and Linguistic Computing, 1998, 13(4):177-186.
[17] MARCU D, ECHIHABI A. An unsupervised approach to recognizing discourse relations[C]//Proceedings of the 40th Annual Meeting on Association for Computational Linguistics. Philadelphia, USA:ACL, 2002:368-375.
[18] YANG J H, LIU Z T, LIU W, et al. Identify causality relationships based on semantic event[J]. Journal of Chinese Computer Systems, 2016, 37(3):433-437. (in Chinese)
[19] ZHANG Z C, ZHANG Y, LIU T, et al. Why-questions answering for reading comprehension based on topic and rhetorical identification[J]. Journal of Computer Research and Development, 2011, 48(2):216-223. (in Chinese)
[20] SENDONG Z, QUAN W, SEAN M. Constructing and embedding abstract event causality networks from text snippets[C]//Proceedings of the 10th International Conference on Web Search and Data Mining. Cambridge, UK:ACM, 2017:335-344.
[21] ZHANG Z C, ZHANG Y, LIU T, et al. Answer sentence extraction of reading comprehension based on shallow semantic tree kernel[J]. Journal of Chinese Information Processing, 2008, 22(1):80-86. (in Chinese)
[22] ZHU Z Y, SUN J H. Improved vocabulary semantic similarity calculation based on HowNet[J]. Journal of Computer Applications, 2013, 33(8):2276-2279. (in Chinese)