Investment estimation model for utility tunnels using machine learning and data-driven methods
Yanqiong DING, Xue WANG, Zhili TANG, Qianjun XU
Journal of Tsinghua University (Science and Technology), 2026, Vol. 66, Issue 5: 911-918.
Objective: Estimating the investment required for utility tunnels accurately and quickly is crucial for cost optimization and investment decision-making. Owing to the rapid development of artificial intelligence technology and the continuous accumulation of engineering investment databases, machine-learning-based engineering investment estimation has become an active research topic. However, existing studies on utility tunnel investment estimation suffer from small data samples, reliance on a single method, a lack of performance comparisons among multiple algorithms, low accuracy, and poor generalization. These issues lead to large prediction errors in practical applications, so the estimates fail to meet the needs of engineering practice. There is therefore an urgent need for a general-purpose investment estimation model for utility tunnels based on machine learning and data-driven approaches.

Methods: This study presents a systematic approach to constructing a utility tunnel investment estimation model, covering data collection, preprocessing, feature engineering, multi-algorithm comparison, hyperparameter optimization, performance evaluation, and model application. Six key factors affecting utility tunnel investment were selected as the input variables of the model: tunnel length, number of chambers, excavation depth, cross-sectional size, construction method, and construction city. The civil engineering cost of the utility tunnel was taken as the output variable. A dataset containing 98 utility tunnel investment samples was created. Three data preprocessing methods were adopted to scale the input variables of the dataset: Min-Max normalization, Z-Score standardization, and RobustScaler.
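The three scaling methods named above can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the function names, the choice of the population standard deviation, the inclusive quartile method, and the sample tunnel lengths are all assumptions made for illustration.

```python
from statistics import mean, median, pstdev, quantiles


def min_max_scale(values):
    """Min-Max normalization: map values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score_scale(values):
    """Z-Score standardization: subtract the mean, divide by the
    population standard deviation (an assumed convention)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def robust_scale(values):
    """Robust scaling: subtract the median, divide by the interquartile
    range, so a single extreme value does not set the scale."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    med = median(values)
    return [(v - med) / (q3 - q1) for v in values]


# Hypothetical tunnel lengths in km; the last value is a deliberate outlier.
lengths = [1.0, 2.0, 3.0, 4.0, 50.0]

print(min_max_scale(lengths))  # the outlier squeezes the other values near 0
print(z_score_scale(lengths))
print(robust_scale(lengths))   # the non-outlier values keep an even spread
```

With an outlier present, Min-Max normalization compresses the ordinary values into a narrow band, whereas median and interquartile-range scaling leaves them evenly spread. This is one plausible reason a robust scaler could suit a small dataset, although the abstract does not state why it performed best.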
Based on Pearson's correlation analysis between the input variables and civil engineering cost, together with the results of the feature importance analysis, nine feature combinations that play a decisive role in predicting civil engineering cost were screened out. For the multi-algorithm comparison, five classic machine learning algorithms were used to construct the utility tunnel investment estimation model: categorical boosting regression, gradient boosting decision tree, decision tree, extreme gradient boosting (XGB), and K-nearest neighbors. The Optuna hyperparameter optimization algorithm was used to tune the model hyperparameters, and the tuned models were compared with the same models without hyperparameter optimization. Model performance was evaluated using the coefficient of determination (R²) under three scenarios: three different preprocessing methods, nine different feature combinations, and with or without Optuna hyperparameter optimization. This evaluation determined the optimal data preprocessing method and feature combination and quantified the effect of Optuna hyperparameter optimization. Finally, the optimal estimation model was identified and used to conduct an empirical investment estimation analysis for two utility tunnels in Beijing.

Results: The RobustScaler method is the optimal data preprocessing method for the dataset and the five algorithm models in this paper. The F-1 feature combination yields the highest average R² (0.623) across the five algorithm models, making F-1 the optimal feature combination. Hyperparameter optimization with Optuna improves the performance of the five models by up to 40.4% compared with no optimization. The Optuna-XGB model performed best after optimization, with an R² of 0.843.
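For reference, the coefficient of determination used as the evaluation metric can be computed as in the sketch below. The helper `relative_improvement` reflects one possible reading of the reported improvement (relative change in R² between an untuned and a tuned model). That definition, the function names, and the sample numbers are assumptions, not details taken from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_improvement(r2_before, r2_after):
    """Percentage change in R² after tuning (assumed definition)."""
    return (r2_after - r2_before) / r2_before * 100.0


# Hypothetical civil engineering costs and model predictions (arbitrary units).
actual_costs = [10.0, 20.0, 30.0, 40.0]
predicted_costs = [12.0, 18.0, 33.0, 39.0]

print(r_squared(actual_costs, predicted_costs))
print(relative_improvement(0.5, 0.6))
```

An R² of 1 means the predictions reproduce the observed costs exactly, while a value near 0 means the model does no better than predicting the mean cost.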
For the best-performing model (Optuna-XGB), the prediction deviation rates for the two utility tunnels in Beijing are 5.63% and 6.50%, respectively, both well below the 10% deviation requirement.

Conclusions: This study presents a data-driven, machine-learning-based investment estimation model for the civil engineering cost of utility tunnels. The model's performance is examined with respect to the data preprocessing method, the feature combination, and the use of Optuna hyperparameter optimization. The optimal model proposed in this paper is highly accurate, which is significant for optimizing utility tunnel costs, supporting investment decisions, and ensuring the sustainable development of utility tunnels.
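The deviation check described in the results can be sketched as follows. The abstract does not define the deviation rate, so the absolute-relative-error formula, the function names, and the cost figures here are assumptions for illustration only.

```python
def deviation_rate(predicted, actual):
    """Absolute relative error in percent (assumed definition)."""
    return abs(predicted - actual) / actual * 100.0


def within_tolerance(predicted, actual, limit_pct=10.0):
    """True if the estimate deviates from the actual cost by less than limit_pct."""
    return deviation_rate(predicted, actual) < limit_pct


# Hypothetical predicted and actual civil engineering costs (arbitrary units).
print(deviation_rate(105.0, 100.0))
print(within_tolerance(105.0, 100.0))
print(within_tolerance(115.0, 100.0))
```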
utility tunnel / machine learning / data-driven / investment estimation model