Investment estimation model for utility tunnels using machine learning and data-driven methods

DING Yanqiong, WANG Xue, TANG Zhili, XU Qianjun

Journal of Tsinghua University (Science and Technology), 2026, Vol. 66, Issue 5: 911-918.

DOI: 10.16511/j.cnki.qhdxxb.2025.21.040
Construction Management


Summary

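Fast, accurate investment estimation for utility tunnels is crucial for cost optimization. Current estimation methods for utility tunnels rely on single techniques, have low accuracy, and generalize poorly. To address these problems, this paper proposes a general investment estimation model for utility tunnels based on machine learning and data-driven methods. First, six factors affecting utility tunnel investment estimation were selected as the model's input variables, and a dataset of 98 samples was built. Second, based on the data preprocessing results and a feature importance analysis, nine feature combinations were constructed. Third, five investment estimation models based on machine learning methods were built, and their hyperparameters were optimized with Optuna. Finally, model performance evaluation identified Optuna-XGB as the optimal model, with a coefficient of determination R² of 0.843. Its prediction deviation rates for the investment estimates of two utility tunnels in Beijing were 5.63% and 6.50%, indicating that the optimal model has high prediction accuracy.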

Abstract

[Objective] Fast, accurate investment estimation for utility tunnels is crucial for cost optimization and investment decision-making. With the rapid development of artificial intelligence and the continuous accumulation of engineering investment data, machine-learning-based engineering investment estimation has become an active research topic. However, existing studies on utility tunnel investment estimation suffer from small data samples, reliance on single methods, a lack of performance comparisons among multiple algorithms, low accuracy, and poor generalization. These issues cause large prediction errors in practical applications and fail to meet the needs of engineering practice. There is therefore an urgent need for a general investment estimation model for utility tunnels based on machine learning and data-driven approaches.

[Methods] This study presents a systematic approach to constructing a utility tunnel investment estimation model, covering data collection, preprocessing, feature engineering, multi-algorithm comparison, hyperparameter optimization, performance evaluation, and model application. Six key factors affecting utility tunnel investment estimation were selected as the model's input variables: tunnel length, number of chambers, excavation depth, cross-sectional size, construction method, and construction city. The civil engineering cost of the utility tunnel was taken as the output variable. A dataset containing 98 utility tunnel investment samples was created. Three data preprocessing methods were used to scale the input variables of the dataset: Min-Max normalization, Z-Score standardization, and RobustScaler.
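The three preprocessing methods named above can be illustrated on a single feature column. The following is a minimal sketch, not the authors' implementation: the function names and the sample tunnel lengths are hypothetical, and the quartile convention (linear interpolation) is an assumption chosen to mirror the common RobustScaler definition of centering on the median and dividing by the interquartile range.

```python
from statistics import mean, median, pstdev, quantiles


def min_max_scale(values):
    """Min-Max normalization: map the column linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score_scale(values):
    """Z-Score standardization: zero mean, unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def robust_scale(values):
    """Robust scaling: center on the median, divide by the interquartile range."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    med = median(values)
    return [(v - med) / (q3 - q1) for v in values]


# Hypothetical tunnel lengths (km); the last value is a deliberate outlier.
tunnel_length_km = [1.0, 2.0, 3.0, 4.0, 100.0]

min_max = min_max_scale(tunnel_length_km)
z_score = z_score_scale(tunnel_length_km)
robust = robust_scale(tunnel_length_km)

for name, scaled in [("min-max", min_max), ("z-score", z_score), ("robust", robust)]:
    print(name, [round(v, 3) for v in scaled])
```

With an outlier in the column, Min-Max normalization compresses the four typical values into a narrow band near 0, whereas robust scaling keeps them evenly spread around 0. This is one plausible reason a median/IQR-based scaler can suit a small, heterogeneous dataset. The sketch assumes a non-constant column, so no denominator is zero.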
Based on Pearson's correlation analysis between the input variables and the civil engineering cost, together with the results of the feature importance analysis, nine feature combinations that play a decisive role in predicting civil engineering cost were selected. For the multi-algorithm comparison, five classic machine learning algorithms were used to construct the utility tunnel investment estimation model: categorical boosting regression, gradient boosting decision tree, decision tree, extreme gradient boosting (XGB), and K-nearest neighbors. The Optuna hyperparameter optimization algorithm was used to tune the model hyperparameters, and the tuned models were compared with the untuned ones. Model performance was evaluated using the coefficient of determination (R²) across three comparisons: the three preprocessing methods, the nine feature combinations, and with versus without Optuna hyperparameter optimization. This evaluation determined the optimal data preprocessing method and feature combination, quantified the effect of Optuna hyperparameter optimization, and identified the optimal estimation model. Based on the optimal estimation model, an empirical prediction analysis of investment estimation was conducted for two utility tunnels in Beijing.

[Results] The results show that RobustScaler is the optimal data preprocessing method for the dataset and the five algorithm models in this paper. The F-1 feature combination yields the highest average R² value (0.623) across the five algorithm models, making F-1 the optimal feature combination. Hyperparameter optimization with Optuna improves the performance of the five models by up to 40.4% compared with no optimization. The Optuna-XGB model performs best after optimization, with an R² value of 0.843.
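Because every comparison above is scored by the coefficient of determination, it may help to see how R² is computed and used to rank candidate models. The sketch below uses the standard definition R² = 1 − SS_res / SS_tot. The cost values and the two model names are hypothetical placeholders, not data or results from the paper.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out civil engineering costs and two sets of predictions.
actual_cost = [3.0, 5.0, 7.0, 9.0]
predictions = {
    "model_a": [2.5, 5.5, 7.5, 8.5],  # hypothetical model output
    "model_b": [4.0, 4.0, 8.0, 8.0],  # hypothetical model output
}

# Score every candidate, then average and rank, as in a multi-model comparison.
scores = {name: r2_score(actual_cost, pred) for name, pred in predictions.items()}
average_r2 = sum(scores.values()) / len(scores)
best_model = max(scores, key=scores.get)

print(scores, round(average_r2, 3), best_model)
```

An R² of 1 means the predictions match the observations exactly, and an R² of 0 means the model does no better than always predicting the mean cost. The same scoring function can be applied unchanged to each preprocessing method, feature combination, or tuned versus untuned model.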
For the best-performing model, Optuna-XGB, the prediction deviation rates for the two utility tunnels in Beijing are 5.63% and 6.50%, respectively, both well below the 10% deviation requirement.

[Conclusions] This study presents a data-driven, machine-learning-based investment estimation model for the civil engineering cost of utility tunnels. It examines how data preprocessing methods, feature combinations, and Optuna hyperparameter optimization affect model performance. The optimal model proposed in this paper is highly accurate, which is significant for optimizing utility tunnel costs, supporting investment decisions, and ensuring the sustainable development of utility tunnels.
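The deviation rates reported in the results can be read as a relative error checked against a 10% limit. The abstract does not state the exact formula, so the sketch below assumes the common definition |predicted − actual| / actual × 100%. The function names and cost figures are hypothetical and serve only to illustrate the check.

```python
def deviation_rate(predicted, actual):
    """Relative prediction deviation in percent (assumed definition)."""
    return abs(predicted - actual) / actual * 100.0


def meets_requirement(predicted, actual, limit_percent=10.0):
    """True when the deviation rate is below the stated limit."""
    return deviation_rate(predicted, actual) < limit_percent


# Hypothetical (predicted, actual) civil engineering costs for two projects.
projects = {
    "tunnel_1": (188.0, 200.0),
    "tunnel_2": (330.0, 300.0),
}

for name, (predicted, actual) in projects.items():
    rate = deviation_rate(predicted, actual)
    print(f"{name}: {rate:.2f}% within limit: {meets_requirement(predicted, actual)}")
```

Taking the absolute value treats over- and under-estimation symmetrically. A signed version would additionally show whether a model tends to overestimate or underestimate costs.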

Key words

utility tunnel / machine learning / data-driven / investment estimation model

Cite this article

DING Yanqiong, WANG Xue, TANG Zhili, XU Qianjun. Investment estimation model for utility tunnels using machine learning and data-driven methods[J]. Journal of Tsinghua University (Science and Technology), 2026, 66(5): 911-918. https://doi.org/10.16511/j.cnki.qhdxxb.2025.21.040
CLC number: TU990.3

Funding

Major Program of the National Natural Science Foundation of China (52090084)
