Construction method and application of a data-sharing agent for large-scale hydropower projects

Chen YANG, Yiming LUO, Houlei XU, Yong XIA, Zhiwei ZHANG, Peng LIN

Journal of Tsinghua University (Science and Technology), 2026, Vol. 66, Issue 4: 702-711. DOI: 10.16511/j.cnki.qhdxxb.2026.26.014
Hydraulic and Hydropower Engineering

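Summary

Large-scale hydropower construction generates massive amounts of data that are often scattered across design, construction, supervision, and other business entities. Data types differ widely and professional barriers are high, which severely restricts efficient data sharing and circulation. To address these problems, this paper proposes a method for constructing a data-sharing agent for large-scale hydropower projects. First, a sharing framework centered on the data-sharing agent is proposed that handles the structured, semi-structured, and unstructured data produced over the full life cycle of a hydropower project. Second, external tools (a time-series database, a knowledge graph, and a text vector store) are developed for these three data types, and the data-sharing agent is built on them. Third, a tool-learning method for the agent is proposed: a tool-learning dataset is built, and the 1.5B and 7.0B versions of the DeepSeek-R1-Distill-Qwen model are fine-tuned with the standard language modeling objective as the loss function. Finally, taking a hydropower project as a case study, a data-sharing agent is built around environmental monitoring data, engineering codes, and literature and manuscripts, which achieves efficient data sharing among the project participants and promotes the mining of data value. The results show that, after fine-tuning, the planning accuracy of the 1.5B and 7.0B models improved by 270% and 104%, respectively, and the task success rates reached 65.83% and 90.83%, respectively. The findings help to fully mine and use the value of data and provide a reference for the intelligent construction of large-scale hydropower projects in high-altitude regions.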
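
The standard language modeling objective named above is not written out on this page. A common form is sketched here as an assumption, not as the authors' exact formulation: the token-level negative log-likelihood of a training sequence $x_1, \dots, x_T$ under a model with parameters $\theta$. The symbols are illustrative, and whether the loss is restricted to the response (tool-call) tokens is not stated in the source.

```latex
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_{\theta}\left(x_t \mid x_{<t}\right)
```

Minimizing this quantity over the tool-learning dataset raises the probability the model assigns to each reference token given the tokens before it.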

Abstract

Objective: Large-scale hydropower projects generate substantial amounts of heterogeneous data dispersed across design, construction, and supervision units. Interoperability among stakeholders is poor because data structures and professional contexts differ, so information sharing remains inefficient. Existing studies have typically focused on specific data types or lifecycle stages and lack a unifying framework for comprehensive, full-cycle data sharing. To address this issue, this study proposes a data-sharing agent tailored to the needs of hydropower engineering. The agent is designed to accommodate structured, semi-structured, and unstructured data, and it integrates external tools such as time-series databases, knowledge graphs, and text vector databases. This integration enables accurate, on-demand data retrieval. By enhancing the tool-learning capabilities of large language models, the agent bridges data silos, improves cross-domain collaboration, and lays a technical foundation for intelligent construction in complex hydropower projects.

Methods: The research begins with a systematic analysis of data-sharing requirements across the full lifecycle of hydropower projects, covering time-series monitoring data, technical documentation, and parametric design files. Based on this analysis, a comprehensive agent framework is designed to support multi-modal data interoperability. To ensure its practicality, a supporting tool system is constructed that integrates intelligent modules for database retrieval, knowledge graph querying, and rule-based inference. Furthermore, an action-planning dataset comprising over 4000 samples is developed to train the agent in decision-making and tool invocation. Two versions of the DeepSeek-R1-Distill-Qwen model (1.5B and 7.0B parameters) are fine-tuned on this dataset to strengthen structured parameter extraction, multi-step reasoning, and action planning. To assess performance, a benchmark test set comprising hundreds of real-world business queries derived from hydropower project workflows is established and manually annotated to ensure fairness and reproducibility.

Results: The fine-tuned models substantially improved planning and reasoning performance. Compared with their pre-fine-tuning counterparts, the 1.5B and 7.0B models achieved 270% and 104% improvements in planning accuracy, respectively. On the business query test set, the overall output accuracies were 65.83% and 90.83%, respectively, confirming that fine-tuning significantly improved model reliability and practical utility. The 7.0B model consistently outperformed the smaller version, which highlights the larger model's capacity to handle complex, multi-step reasoning tasks. The agent-based data-sharing platform was deployed for a real hydropower project in a representative watershed. Under static and structured data-sharing conditions, the agent maintained an average response time of less than 20 s. In contrast, dynamic monitoring scenarios involving high-frequency data streams exhibited average latencies exceeding 30 s, with peaks exceeding 60 s under intensive analytical loads.

Conclusions: This study proposes a comprehensive framework for constructing a data-sharing agent that addresses critical challenges in current hydropower data-sharing practices, particularly in high-altitude, data-scarce environments. By aligning agent design with engineering-specific requirements and integrating a fine-tuned large language model with a domain-oriented tool ecosystem, the proposed method significantly improves the efficiency, intelligence, and semantic interoperability of data sharing. The agent reduces cross-disciplinary access barriers, improves system responsiveness, and supports knowledge-driven decision-making. The results from field applications confirm its considerable potential for practical implementation in intelligent construction platforms. Furthermore, the findings provide a scalable, generalizable technical foundation for the future development of data-driven management and intelligent decision-support systems in complex hydropower projects.
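
The plan-then-invoke pattern described in the abstract (the model plans an action, then calls a time-series database, a knowledge graph, or a text vector database) can be illustrated with a minimal sketch. Everything below is hypothetical: the tool names, the `Action` type, and the keyword-based `plan` function are stand-ins introduced for illustration. In the paper the planning step is performed by the fine-tuned DeepSeek-R1-Distill-Qwen model, and the real tool interfaces are not given on this page.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Action:
    """A planned tool call: which tool to use and with what parameters."""
    tool: str
    params: Dict[str, str]


# Hypothetical tool stubs; real tools would query actual back ends.
def query_timeseries_db(params: Dict[str, str]) -> str:
    return f"[timeseries rows for: {params['query']}]"


def query_knowledge_graph(params: Dict[str, str]) -> str:
    return f"[graph triples for: {params['query']}]"


def search_text_vector_store(params: Dict[str, str]) -> str:
    return f"[similar passages for: {params['query']}]"


TOOLS: Dict[str, Callable[[Dict[str, str]], str]] = {
    "timeseries_db": query_timeseries_db,
    "knowledge_graph": query_knowledge_graph,
    "text_vector_store": search_text_vector_store,
}


def plan(query: str) -> Action:
    """Keyword-rule stand-in for the model's planning step (assumption)."""
    q = query.lower()
    if any(k in q for k in ("temperature", "wind speed", "monitoring")):
        return Action("timeseries_db", {"query": query})
    if any(k in q for k in ("code", "standard", "specification")):
        return Action("knowledge_graph", {"query": query})
    # Fall back to free-text retrieval for everything else.
    return Action("text_vector_store", {"query": query})


def run(query: str) -> Tuple[str, str]:
    """Plan an action for the query, invoke the chosen tool, return both."""
    action = plan(query)
    return action.tool, TOOLS[action.tool](action.params)


if __name__ == "__main__":
    for q in (
        "What was the air temperature at the dam site yesterday?",
        "Which design code covers concrete curing?",
        "Summarize the manuscript chapter on grouting.",
    ):
        print(run(q))
```

Keeping the planner separate from the tool registry means the rule-based `plan` function could be replaced by a model call that emits the same `Action` structure without touching the tools.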

Key words

hydropower projects / agent / data sharing / tool learning / knowledge graph

Cite this article

Chen YANG, Yiming LUO, Houlei XU, et al. Construction method and application of a data-sharing agent for large-scale hydropower projects[J]. Journal of Tsinghua University (Science and Technology), 2026, 66(4): 702-711. https://doi.org/10.16511/j.cnki.qhdxxb.2026.26.014
CLC number: TV72

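Funding

2023 Special Project for Innovation Capacity Building and Enhancement of Yunnan Provincial Engineering Research Centers
Yunnan Provincial Digital Water Engineering Technology Innovation Center Project (202305AK340003)
Research and Development Project on Internet of Things Information Fusion Technology, Equipment, and Platform for Cascade Hydropower Stations in River Basins (DJ-HXGG-2022-03)
Scientific Research Project of PowerChina Chengdu Engineering Corporation Limited (WRQ202311151)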
