Eye movement prediction of individuals while reading based on deep neural networks
WANG Xiaoming, ZHAO Xinbo
National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer, Northwestern Polytechnical University, Xi'an 710072, China
Abstract: Traditional eye movement models are built on psychological assumptions and empirical data; thus, they cannot predict eye movements on previously unseen text and cannot account for individual differences between readers. This paper presents an eye movement model that uses a deep neural network to predict a reader's fixation points. Unlike conventional psychology-based eye movement models, the model is not built on empirical data sets but on a bi-directional long short-term memory-conditional random field (bi-LSTM-CRF) neural network. The model is trained on eye movement data recorded while a reader reads, and then predicts that reader's fixation points on other texts. Computer simulations show that the bi-LSTM-CRF model achieves prediction accuracy similar to that of existing machine learning models while using fewer data features, which makes the model attractive for real-time human-computer interaction applications.
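As a rough illustration of the CRF half of a bi-LSTM-CRF tagger, the sketch below runs Viterbi decoding over per-word label scores to choose a "skip" or "fixate" label for each word. It is not the authors' implementation: the label set, the function name `viterbi_decode`, and all numeric scores are assumptions made up for this example. In a full model the emission scores would be produced by the bi-LSTM and the transition scores would be learned CRF parameters.

```python
from typing import List, Sequence

# Hypothetical label set for this sketch: index 0 = skip, index 1 = fixate.
LABELS = ("skip", "fixate")


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring sequence of label indices.

    emissions[t][k]   : score of label k for word t (bi-LSTM output in a real model).
    transitions[j][k] : score of moving from label j to label k (CRF parameters).
    """
    if not emissions:
        return []
    n_labels = len(transitions)
    score = list(emissions[0])   # best score of any path ending in each label
    backptr = []                 # backptr[t-1][k] = best previous label for k at word t
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for k in range(n_labels):
            best_prev = max(range(n_labels),
                            key=lambda j: score[j] + transitions[j][k])
            new_score.append(score[best_prev] + transitions[best_prev][k]
                             + emissions[t][k])
            pointers.append(best_prev)
        score = new_score
        backptr.append(pointers)
    # Trace back from the best final label to recover the full path.
    best = max(range(n_labels), key=lambda k: score[k])
    path = [best]
    for pointers in reversed(backptr):
        best = pointers[best]
        path.append(best)
    path.reverse()
    return path


# Made-up scores for a four-word sentence (illustrative only, not from the paper).
emissions = [[0.2, 1.0], [0.9, 0.8], [0.1, 1.5], [1.2, 0.3]]
transitions = [[-0.5, 0.4],   # skip -> skip, skip -> fixate
               [0.3, 0.0]]    # fixate -> skip, fixate -> fixate

decoded = viterbi_decode(emissions, transitions)
print([LABELS[i] for i in decoded])  # ['fixate', 'skip', 'fixate', 'skip']
```

Decoding the whole sentence jointly, rather than labeling each word independently, lets the transition scores express sequential regularities such as a skipped word tending to follow a fixated one.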
WANG Xiaoming, ZHAO Xinbo. Eye movement prediction of individuals while reading based on deep neural networks[J]. Journal of Tsinghua University (Science and Technology), 2019, 59(6): 468-475.