Journal of Tsinghua University (Science and Technology)  2023, Vol. 63 Issue (6): 1003-1011    DOI: 10.16511/j.cnki.qhdxxb.2023.22.006
Public Safety
Prediction method of the pandemic trend of COVID-19 based on machine learning
REN Jianqiang (1, 2), CUI Yapeng (1, 2), NI Shunjiang (1, 2)
1. Institute of Public Safety Research, Department of Engineering Physics, Tsinghua University, Beijing 100084, China;
2. Beijing Key Laboratory of City Integrated Emergency Response Science, Beijing 100084, China
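Abstract (from the Chinese): Prevention and control measures strongly affect how an infectious disease spreads, so their impact must be taken into account when predicting the future trend of the COVID-19 epidemic. This paper proposes a three-step, machine-learning-based prediction model for COVID-19 that brings neural networks, random forests, long short-term memory (LSTM) networks, and sequence-to-sequence (seq2seq) models into epidemic forecasting. Unlike earlier prediction models, the proposed model accounts for changes in prevention and control measures over the course of the epidemic and can use testing data to predict the future number of confirmed cases and the actual infection scale. The results show that the predictions are broadly consistent with the actual data and that the model is highly reliable. The method can give government agencies a more accurate picture of how the epidemic is actually developing, help managers allocate medical resources more effectively, and provide a decision reference for COVID-19 prevention and control.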
Abstract:
[Objective] To estimate and predict the actual scale of COVID-19 infection in a population, this paper proposes a machine-learning-based method for predicting the pandemic trend. The method uses testing data to predict how the pandemic will develop, implicitly accounts for the impact of prevention and control measures, and can both predict the future number of confirmed cases and estimate the actual infection scale.
[Methods] A three-step prediction model based on machine learning (TSPM-ML) is proposed. Machine learning algorithms, including neural networks, random forests, long short-term memory (LSTM), and sequence to sequence (seq2seq), are applied to COVID-19 forecasting, with testing data used to predict the future number of confirmed cases and the actual infection scale. The TSPM-ML consists of three steps: (1) estimating the actual infection scale of COVID-19 from the testing data, (2) predicting the future trend of the actual infection scale from the results of the first step, and (3) predicting the future number of confirmed cases from the infection scale obtained in the second step. The TSPM-ML is applied to the pandemic in Germany, France, South Korea, the United States, Russia, and Finland.
[Results] The prediction error is largest for the United States, at 23.71 per million people, and smallest for South Korea, at 0.63 per million people. Overall, the TSPM-ML predictions are consistent with both the simulation data and the actual data, which supports the reliability of the model.
[Conclusions] The predicted results are consistent with the actual data, and the TSPM-ML is highly reliable. The prediction results can help government agencies understand the actual development trend of COVID-19 more accurately, allocate medical resources more effectively, and support decisions on COVID-19 prevention and control.
Key words: machine learning; prevention and control measures; epidemic trend prediction; public health emergencies
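The three steps listed in the Methods part of the abstract chain three models together: testing data to current infection scale, current infection scale to future infection scale, and future infection scale to future confirmed cases. The sketch below only illustrates that data flow. It is not the authors' implementation: each step uses a plain least-squares line as a stand-in for the neural network, random forest, LSTM, and seq2seq models named in the abstract, all function and variable names are hypothetical, the data are synthetic, and the per-million error (mean absolute error divided by population) is an assumed definition, since the abstract does not state how its error figures are computed.

```python
from typing import List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of y = a * x + b; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, mean_y - a * mean_x


def three_step_forecast(
    train_rate: Sequence[float],       # testing feature, e.g. positive rate (assumed)
    train_infected: Sequence[float],   # infection-scale labels for training (assumed available)
    train_confirmed: Sequence[float],  # confirmed cases on the same training days
    recent_rate: Sequence[float],      # testing feature for the most recent days
    horizon: int,                      # number of future days to predict
) -> List[float]:
    # Step 1: estimate the current actual infection scale from testing data.
    a1, b1 = fit_line(train_rate, train_infected)
    infected_now = [a1 * r + b1 for r in recent_rate]

    # Step 2: predict the future infection scale from the step-1 estimates,
    # using a one-lag autoregression rolled forward `horizon` times.
    a2, b2 = fit_line(infected_now[:-1], infected_now[1:])
    future_infected = []
    level = infected_now[-1]
    for _ in range(horizon):
        level = a2 * level + b2
        future_infected.append(level)

    # Step 3: map the predicted infection scale to future confirmed cases.
    a3, b3 = fit_line(train_infected, train_confirmed)
    return [a3 * x + b3 for x in future_infected]


def error_per_million(
    predicted: Sequence[float], actual: Sequence[float], population: int
) -> float:
    """Mean absolute error scaled to one million people (assumed metric)."""
    mae = sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
    return mae / population * 1_000_000


# Synthetic demonstration data (not from the paper): 20 days of growth.
infected = [100.0 * 1.1 ** t for t in range(20)]
rate = [x / 10_000 for x in infected]
confirmed = [0.3 * x + 5.0 for x in infected]

forecast = three_step_forecast(
    rate[:15], infected[:15], confirmed[:15], rate[10:15], horizon=5
)
print(forecast)
print(error_per_million(forecast, confirmed[15:], population=1_000_000))
```

Because each step consumes the previous step's output, any error in the step-1 estimate carries through to the confirmed-case forecast. Swapping in a more expressive model only requires replacing `fit_line` and the corresponding prediction line within that step.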
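Received: 2022-11-01      Published: 2023-05-12
Funding: General Program of the National Natural Science Foundation of China (72174104)
Corresponding author: NI Shunjiang, senior engineer, E-mail: sjni@tsinghua.edu.cn
About the author: REN Jianqiang (born 1999), male, master's student.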
Cite this article:
REN Jianqiang, CUI Yapeng, NI Shunjiang. Prediction method of the pandemic trend of COVID-19 based on machine learning. Journal of Tsinghua University (Science and Technology), 2023, 63(6): 1003-1011.
Link to this article:
http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2023.22.006  or  http://jst.tsinghuajournals.com/CN/Y2023/V63/I6/1003
Copyright © Editorial Office of Journal of Tsinghua University (Science and Technology)