Journal of Tsinghua University (Science and Technology)  2019, Vol. 59 Issue (6): 468-475    DOI: 10.16511/j.cnki.qhdxxb.2019.26.001
Eye movement prediction of individuals while reading based on deep neural networks
WANG Xiaoming, ZHAO Xinbo
National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China
Abstract: Traditional eye movement models are built on psychological assumptions and empirical data, so they cannot predict eye movements for previously unseen text and cannot account for individual differences between readers. To address this, this paper presents an eye movement model that uses a deep neural network to predict a reader's fixation points. Unlike traditional psychology-based eye movement models, the model is built not on empirical data sets but on a bi-directional long short-term memory-conditional random field (bi-LSTM-CRF) neural network. The model is trained on the eye movement data recorded while a reader reads, and it then predicts that reader's fixation points on other texts. Computer simulations show that the bi-LSTM-CRF model achieves prediction accuracy similar to that of existing machine learning models while using fewer data features, which makes the proposed model attractive for real-time human-computer interaction applications.
Key words: individual reading    eye tracking    eye movement model    deep neural networks
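The abstract names a bi-LSTM-CRF network but gives no implementation details, so the following is only a minimal sketch of the CRF decoding step, not the authors' code. It assumes each word receives two scores (label 0 = skipped, label 1 = fixated), which in a bi-LSTM-CRF would come from the bi-LSTM layer, and that a transition matrix scores adjacent label pairs. The function name `viterbi_decode` and all numbers below are hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring label sequence under a linear-chain CRF.

    emissions[t][k]   : score of label k for word t (assumed bi-LSTM output).
    transitions[j][k] : score of moving from label j to label k.
    """
    if not emissions:
        return []
    n_labels = len(transitions)
    score = list(emissions[0])      # best score of any path ending in each label
    backptr = []                    # backptr[t-1][k] = best previous label for k at word t
    for emit in emissions[1:]:
        new_score, ptrs = [], []
        for k in range(n_labels):
            best_prev = max(range(n_labels),
                            key=lambda j: score[j] + transitions[j][k])
            ptrs.append(best_prev)
            new_score.append(score[best_prev] + transitions[best_prev][k] + emit[k])
        score = new_score
        backptr.append(ptrs)
    # Trace the best path backwards from the best final label.
    last = max(range(n_labels), key=lambda k: score[k])
    path = [last]
    for ptrs in reversed(backptr):
        last = ptrs[last]
        path.append(last)
    path.reverse()
    return path


# Hypothetical scores for a four-word sentence (0 = skipped, 1 = fixated).
emissions = [[1.0, 0.2], [0.1, 1.5], [0.5, 0.6], [0.2, 1.2]]
# Hypothetical transition scores that discourage two fixations in a row.
transitions = [[0.0, 0.5], [0.5, -0.5]]
print(viterbi_decode(emissions, transitions))  # [0, 1, 0, 1]
```

Picking the higher emission score for each word independently would give `[0, 1, 1, 1]`. The transition scores change the third label, which illustrates why a CRF layer on top of per-word scores can differ from word-by-word classification.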
Received: 2018-09-10      Published: 2019-06-01
Corresponding author: ZHAO Xinbo, Professor,     E-mail:
WANG Xiaoming, ZHAO Xinbo. Eye movement prediction of individuals while reading based on deep neural networks[J]. Journal of Tsinghua University (Science and Technology), 2019, 59(6): 468-475.
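  Fig. 1 Eye movement trajectory of an adult reader while reading
  Fig. 2 Repeating unit in an LSTM network
  Fig. 3 Architecture of the eye movement model based on deep neural networks
  Fig. 4 Training algorithm for the bi-LSTM-CRF model
  Table 1 Hyperparameters used in the experiments
  Fig. 5 Number of parameters in each layer and all hyperparameters that require training
  Table 2 Baseline rate of fixated words in the test data
  Table 3 Prediction accuracy for fixated words on the test data
  Fig. 6 Comparison of fixation prediction accuracy using different features of the test data
  Table 4 Comparison of E-Z Reader, NN09, HMKA12, and the proposed model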
