Intelligent recognition of apparent defects in tunnel lining based on the YOLO framework

LIU Yong^1, WANG Yaqiong^1,2, WANG Zhifeng^1,2

Journal of Tsinghua University (Science and Technology), 2026, Vol. 66, Issue 6: 1224-1237. DOI: 10.16511/j.cnki.qhdxxb.2026.27.014

Vehicles and Transportation

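Abstract (translated from the Chinese)

To improve the efficiency and accuracy of intelligent detection of apparent tunnel-lining defects such as cracks, water leakage, peeling, and block spalling, this paper adopts the YOLO framework and combines backbone information sharing with an adaptive weighted feature-fusion mechanism to propose a lining apparent defect detection network (LADDNet). First, lightweight ShuffleNet and GhostNet are used to assist CSPNet in building a three-branch collaborative feature extraction network, so that information is passed and shared among the different backbones. Second, an attention-integrated multi-receptive field adaptive fusion (AMFAF) module is designed, which adaptively fuses the backbone output features through parallel multi-receptive-field convolutions and attention weighting. Finally, an attention-based intra-scale feature interaction (AIFI) module and an adaptively spatial feature fusion head (ASFF-Head) are introduced to strengthen the algorithm's representation of multi-scale features and to complete end-to-end detection for the overall framework. Experimental results show that LADDNet achieves an F1 score of 0.831, mAP@0.5 of 0.848, and mAP@0.5:0.95 of 0.595 in validation-set inference, accuracy metrics that exceed those of the compared models. Its inference time, parameter count, and floating-point operations are 9.2 ms, 14.07×10⁶, and 21.4×10⁹, respectively, giving a faster detection rate than RT-DETR.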
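The attention-weighted fusion idea described above can be illustrated with a minimal sketch. This is not the authors' AMFAF module: the function names `softmax` and `fuse_branches`, the plain-list feature vectors, and the caller-supplied attention scores are all assumptions made for illustration. In a real network the branch outputs would come from parallel convolutions with different receptive fields and the scores would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_branches(branches, scores):
    """Fuse equally sized feature vectors with softmax-normalized weights.

    branches: one feature vector per branch (hypothetical stand-ins for the
              outputs of parallel convolutions with different receptive fields).
    scores:   one raw attention score per branch (assumed to be given here;
              a trained network would predict them from the features).
    Returns the fused vector and the normalized weights.
    """
    if len(branches) != len(scores):
        raise ValueError("need exactly one score per branch")
    length = len(branches[0])
    if any(len(b) != length for b in branches):
        raise ValueError("all branches must have the same length")
    weights = softmax(scores)
    fused = [
        sum(w * b[i] for w, b in zip(weights, branches))
        for i in range(length)
    ]
    return fused, weights


if __name__ == "__main__":
    small_rf = [1.0, 0.0, 2.0]   # illustrative values only
    medium_rf = [0.5, 1.0, 1.0]
    large_rf = [0.0, 2.0, 0.0]
    fused, weights = fuse_branches([small_rf, medium_rf, large_rf], [2.0, 1.0, 0.0])
    print("weights:", weights)
    print("fused:", fused)
```

Because the weights are softmax-normalized, they sum to one, so the fused feature stays on the same scale as the inputs while the branch with the highest score contributes most.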

Abstract

[Objective] Detecting apparent defects is fundamental to assessing the health of structures and guides preventive maintenance aimed at mitigating engineering hazards. Deep learning-driven computer vision methods have recently gained prominence in intelligent defect identification. However, because defects vary substantially in spatial scale, existing algorithms miss small-scale targets and show unbalanced accuracy across defect categories. To overcome these limitations, this study introduces a detection approach that effectively captures deep network feature representations, mitigates the loss of semantic information for small targets, and maintains balanced recognition performance across multiple defect types.

[Methods] A lining apparent defect detection network (LADDNet) that integrates an information-sharing backbone with an adaptive and hierarchically organized feature fusion mechanism is proposed. First, a three-branch collaborative feature extraction architecture is constructed by incorporating lightweight ShuffleNet and GhostNet modules into a CSPNet framework. This design facilitates complementary feature learning across branches, thereby enhancing the stability of gradient propagation. Thereafter, an attention-integrated multi-receptive field adaptive fusion (AMFAF) module is developed. This module employs parallel convolutions with diverse receptive fields to extract multi-level spatial information and combines them via attention-based weighting, allowing the network to automatically emphasize discriminative features associated with cracks, seepage regions, and spalling contours. An attention-based intra-scale feature interaction (AIFI) module is also introduced to enhance semantic consistency within individual feature scales by promoting effective cross-channel communication and suppressing redundant responses. Finally, an adaptively spatial feature fusion detection head (ASFF-Head) is incorporated to refine multi-scale feature aggregation, improve localization precision, and reduce missed detections of small or low-contrast targets. By integrating these modules into a unified framework, the proposed network supports end-to-end training and inference.

[Results] LADDNet exhibits significant advantages in overall detection performance and inference efficiency. On the validation set, the model achieves F1, mAP@0.5, and mAP@0.5:0.95 scores of 0.831, 0.848, and 0.595, respectively. On the test set, the model attains an F1 score of 0.794, mAP@0.5 of 0.830, and mAP@0.5:0.95 of 0.579. Compared with a range of representative detection models, LADDNet achieves consistent improvements in both the F1 score and mAP@0.5. In terms of inference efficiency, LADDNet achieves real-time performance with a per-image latency of 9.2 ms, only 14.07×10⁶ parameters, and 21.4×10⁹ FLOPs, delivering substantially faster inference than RT-DETR. Furthermore, LADDNet remains robust when detecting defects in images that contain handwritten markings or interference from auxiliary tunnel facilities. For the identification of mesh cracks, water seepage, and spalling, the model demonstrates high confidence, low missed-detection rates, and precise localization.

[Conclusions] The proposed LADDNet model affords markedly enhanced intelligent detection of diverse tunnel-lining defects by integrating information sharing, multi-receptive-field feature extraction, adaptive fusion strategies, and intra-scale interaction mechanisms. It delivers notable gains in accuracy, robustness, and computational efficiency, effectively overcoming long-standing challenges in multi-scale and multi-type defect recognition. These advances position LADDNet as a reliable visual perception module for automated tunnel inspection, structural condition evaluation, and long-term operational monitoring. Overall, the approach shows strong promise for real-world engineering applications and broad deployment in next-generation intelligent maintenance systems. Its versatility further underscores its value for future infrastructure management initiatives.
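For readers unfamiliar with the reported metrics, the sketch below shows how an F1 score at a fixed IoU threshold can be computed for one image. It is an illustration under stated assumptions, not the paper's evaluation code: the box format `(x1, y1, x2, y2)`, the greedy confidence-ordered matching, and the names `iou` and `f1_at_threshold` are hypothetical choices. Under the common COCO convention, mAP@0.5 uses an IoU threshold of 0.5, and mAP@0.5:0.95 averages AP over the ten thresholds from 0.50 to 0.95 in steps of 0.05; full AP computation over a precision-recall curve is omitted here.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_at_threshold(predictions, ground_truths, iou_threshold=0.5):
    """Precision, recall, and F1 for one image at one IoU threshold.

    predictions:   list of (box, confidence) pairs.
    ground_truths: list of boxes.
    Matching is greedy and one-to-one, highest confidence first
    (a simplifying assumption made for this sketch).
    """
    matched = set()
    tp = 0
    for box, _conf in sorted(predictions, key=lambda p: p[1], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            tp += 1
    fp = len(predictions) - tp
    fn = len(ground_truths) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# IoU thresholds used by the mAP@0.5:0.95 convention: 0.50, 0.55, ..., 0.95.
COCO_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]
```

Raising the threshold makes matching stricter, which is why mAP@0.5:0.95 is always at most mAP@0.5 for the same model, as in the values reported above.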

Cite this article

LIU Yong, WANG Yaqiong, WANG Zhifeng. Intelligent recognition of apparent defects in tunnel lining based on the YOLO framework[J]. Journal of Tsinghua University (Science and Technology), 2026, 66(6): 1224-1237. https://doi.org/10.16511/j.cnki.qhdxxb.2026.27.014

CLC number: TP391.7

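Funding

General Program of the National Natural Science Foundation of China (52478384); Shaanxi Provincial Science Fund for Distinguished Young Scholars (2025JC-JCQN-026); Shaanxi Innovation Capability Support Program (2023-CX-TD-35); Shaanxi Qinchuangyuan "Scientists + Engineers" Team Building Project (2023KXJ-159)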

Received: 2025-08-25
Publication date: 2026-06-15
Release date: 2026-06-03
