PUBLIC SAFETY

Prediction method of the pandemic trend of COVID-19 based on machine learning
REN Jianqiang1,2, CUI Yapeng1,2, NI Shunjiang1,2
1. Institute of Public Safety Research, Department of Engineering Physics, Tsinghua University, Beijing 100084, China; 2. Beijing Key Laboratory of City Integrated Emergency Response Science, Beijing 100084, China

Abstract
[Objective] To estimate and predict the actual infection scale of COVID-19 in a population, a machine learning-based method for predicting the COVID-19 pandemic trend is proposed. The method uses detection data to predict how the pandemic will develop and implicitly accounts for the impact of prevention and control measures. It can both predict the future number of confirmed cases and estimate the actual infection scale of COVID-19.

[Methods] A three-step prediction model based on machine learning (TSPM-ML) is proposed. Machine learning algorithms, including neural networks, random forest, long short-term memory (LSTM), and sequence to sequence (seq2seq), are applied to COVID-19 trend prediction, and detection data are used to predict the future number of people diagnosed and the future actual infection scale. The TSPM-ML consists of three steps: (1) predicting the actual infection scale of COVID-19 from the detection data, (2) predicting the future trend of the actual infection scale from the results of the first step, and (3) predicting the future number of people diagnosed from the actual infection scale obtained in the second step. The TSPM-ML is used to predict the actual pandemic situation in Germany, France, South Korea, the United States, Russia, and Finland.

[Results] The largest prediction error is for the United States, at 23.71 per million people, and the smallest is for South Korea, at 0.63 per million people. Overall, the TSPM-ML predictions are consistent with both the simulation data and the actual data, which verifies the reliability of the model.

[Conclusions] The predicted results are consistent with the actual data, which supports the reliability of the TSPM-ML. The predictions can help government management departments understand the actual development trend of COVID-19 more accurately, allocate medical resources more effectively, and make better-supported decisions on COVID-19 prevention and control.
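
The three-step structure described in the Methods can be sketched as a chain of fitted models. The sketch below is an illustration only, not the authors' implementation: a one-variable least-squares line stands in for the neural network, random forest, LSTM, and seq2seq models named in the abstract, and all function names, parameter names, and inputs (`fit_line`, `tspm_sketch`, `train_detected`, `train_infected`, `observed_detected`, `horizon`) are assumptions made for this example. It also assumes that paired detection and infection values (for instance, from a simulation) are available for training.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (stand-in for an ML model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def predict(model, x):
    """Evaluate a fitted (slope, intercept) pair at x."""
    slope, intercept = model
    return slope * x + intercept


def tspm_sketch(train_detected, train_infected, observed_detected, horizon):
    """Hypothetical three-step chain mirroring the TSPM-ML outline."""
    # Step 1: detection data -> estimated actual infection scale.
    step1 = fit_line(train_detected, train_infected)
    infected = [predict(step1, d) for d in observed_detected]

    # Step 2: forecast the infection scale from its own history.
    # A one-step autoregression is assumed here in place of LSTM/seq2seq.
    step2 = fit_line(infected[:-1], infected[1:])
    future_infected = []
    last = infected[-1]
    for _ in range(horizon):
        last = predict(step2, last)
        future_infected.append(last)

    # Step 3: forecast infection scale -> future number of people diagnosed.
    step3 = fit_line(train_infected, train_detected)
    future_diagnosed = [predict(step3, i) for i in future_infected]
    return infected, future_infected, future_diagnosed


def mean_abs_error(predicted, actual):
    """Mean absolute error; per million people if the inputs are per million."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))
```

Calling `tspm_sketch(train_detected, train_infected, observed_detected, horizon=7)` with per-million series would return the estimated infection scale for the observed period, a 7-step infection forecast, and the corresponding forecast of diagnosed cases; `mean_abs_error` can then compare the last of these with reported data in the same per-million units used in the Results.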
Keywords
machine learning
prevention and control measures
epidemic trend prediction
public health emergencies
Issue Date: 12 May 2023