Journal of Tsinghua University (Science and Technology)  2018, Vol. 58 Issue (3): 272-278    DOI: 10.16511/j.cnki.qhdxxb.2018.25.010
Causal options in Chinese reading comprehension
WANG Yuanlong1, LI Ru1,2, ZHANG Hu1, WANG Zhiqiang1
1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China;
2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China
Abstract: For causal-relation options in multiple-choice reading comprehension questions, this paper proposes a causal support analysis method based on a causal network. First, causal event pairs are extracted from the reading material using clue phrases, and the causal association strength between the two events of each pair is computed; the extracted pairs and their association strengths together form the causal network. Second, taking into account both the importance of each option word in the document and its discriminative power across the whole document, the TF-IDF (term frequency-inverse document frequency) method is used to retrieve, from the source text, the sentences related to the cause event and to the effect event of the option, respectively. Finally, the causal support of the option is computed from the causal network and the retrieved sentences. The method was evaluated on a test set of 769 simulated exam materials and 13 Beijing college entrance examination Chinese-language papers (each including the source text and multiple-choice questions). The results show that its accuracy is about 11% higher than that of the Baseline method.
Key words: natural language processing; causality network; reading comprehension; semantic similarity
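The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several things here are assumptions made only for illustration: the clue phrases (`because`, `so`), the use of whitespace-tokenized English instead of segmented Chinese, the raw co-occurrence count used as a stand-in for the causal association strength, and the choice to treat each sentence as a document when computing IDF. The final support score is omitted because the abstract does not specify how it is computed.

```python
import math
from collections import Counter, defaultdict

# Hypothetical clue phrases, mapped to whether the cause precedes the clue.
# The paper's actual clue-phrase list is not reproduced here.
CLUE_PHRASES = {" because ": False, " so ": True}

def extract_causal_pairs(sentences):
    """Split each sentence at the first matching clue phrase into (cause, effect)."""
    pairs = []
    for s in sentences:
        for clue, cause_first in CLUE_PHRASES.items():
            if clue in s:
                left, right = (part.strip() for part in s.split(clue, 1))
                pairs.append((left, right) if cause_first else (right, left))
                break
    return pairs

def build_causal_network(pairs):
    """Directed graph cause -> effect. The edge weight is a raw count,
    an assumed stand-in for the paper's causal association strength."""
    net = defaultdict(Counter)
    for cause, effect in pairs:
        net[cause][effect] += 1
    return net

def tfidf_rank(query, sentences):
    """Rank sentences by the summed TF-IDF weight of the query terms.
    Each sentence is treated as one document (an assumption)."""
    docs = [s.lower().split() for s in sentences]
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    query_terms = set(query.lower().split())
    scores = []
    for i, d in enumerate(docs):
        tf = Counter(d)
        score = sum(
            (tf[t] / len(d)) * math.log(n / df[t])
            for t in query_terms if t in tf
        )
        scores.append((score, i))
    ranked = sorted(scores, key=lambda x: (-x[0], x[1]))
    return [sentences[i] for score, i in ranked if score > 0]

if __name__ == "__main__":
    text = [
        "the river flooded because it rained heavily",
        "the dam failed so the village was evacuated",
        "the festival was held in spring",
    ]
    network = build_causal_network(extract_causal_pairs(text))
    print(dict(network))
    print(tfidf_rank("dam failed", text))
```

In this sketch, the cause part and the effect part of an option would each be passed to `tfidf_rank` separately, mirroring the abstract's separate retrieval of cause-related and effect-related sentences.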
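Received: 2017-08-26      Published: 2018-03-15
Chinese Library Classification (CLC):  TP391  
Corresponding author: LI Ru, Professor,     E-mail:
About the author: WANG Yuanlong (1983-), male, lecturer.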
WANG Yuanlong, LI Ru, ZHANG Hu, WANG Zhiqiang. Causal options in Chinese reading comprehension[J]. Journal of Tsinghua University (Science and Technology), 2018, 58(3): 272-278.
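  Table 1 Proportion of causal-relation options
  Fig. 1 Templates for extracting sentences containing special clue phrases [19]
  Fig. 2 Classification of causal clue words
  Fig. 3 Example of a causal network
  Fig. 4 The source text contains the cause and effect variables corresponding to the option (Case 1)
  Fig. 5 The source text lacks the cause variable or the effect variable corresponding to the option (Case 2)
  Table 2 Data statistics
  Table 3 Recall of the proposed method on the test set
  Table 4 Comparison of the proposed method with related methods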
