Journal of Tsinghua University (Science and Technology)  2019, Vol. 59 Issue (3): 169-177    DOI: 10.16511/j.cnki.qhdxxb.2018.25.050
Computer Science and Technology
Representation learning approach for medical activities enhanced by topical modeling
XU Xiao (徐啸), WANG Ying (王灜), JIN Tao (金涛), WANG Jianmin (王建民)
School of Software, Tsinghua University, Beijing 100084, China
Abstract: With the rapid accumulation of health and medical data, data-driven medical analyses are receiving increasing attention, and a proper representation of medical activities is crucial for such analyses. However, most existing representation methods do not account for the temporality and numerical sensitivity of medical data, which limits both the performance and the interpretability of the analyses. This paper presents a representation learning approach for medical activities in inpatient records that is enhanced by topic modeling. The approach uses the temporal relations between activities and their topic assignments to build an unsupervised multilayer perceptron model. Evaluations on a large-scale real inpatient data set show that the learned representations effectively improve three medical analysis tasks (disease clustering, subsequent activity prediction, and prediction of the remaining length of stay), and that the representations are medically interpretable.
Key words: representation learning; topic modeling; multilayer perceptron; medical analyses
Received: 2018-03-12      Published: 2019-03-19
Funding: National Natural Science Foundation of China (71690231)
Cite this article:
XU Xiao, WANG Ying, JIN Tao, WANG Jianmin. Representation learning approach for medical activities enhanced by topical modeling[J]. Journal of Tsinghua University (Science and Technology), 2019, 59(3): 169-177.
Link to this article:
http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2018.25.050  or  http://jst.tsinghuajournals.com/CN/Y2019/V59/I3/169
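  Fig. 1 TMMAR model architecture
  Table 1 Detailed statistics of the data set
  Table 2 Experimental results of disease clustering
  Fig. 2 Experimental results of subsequent activity prediction
  Fig. 3 Experimental results of remaining length of stay (LOS) prediction
  Table 3 Three dimensions of the treatment-day vectors generated by TMMAR
  Table 4 Three dimensions of the treatment-day vectors generated by Med2Vec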