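As bridges in China enter the operation and maintenance stage on a large scale, traditional bridge inspection still relies mainly on manual patrols and visual examination. Unmanned aerial vehicles (UAVs) are well suited to collecting images of bridge surfaces and are therefore increasingly used to detect surface defects. Bridge images captured by UAVs under traffic control are large and contain many small-target cracks, and current detection algorithms have difficulty extracting the features of such cracks, which leads to missed and false detections. To address these problems, this paper proposes a bridge small-target crack detection algorithm based on an improved YOLOv8. First, the YOLOv8 backbone is replaced with the efficient vision transformer (EfficientViT) to remove a large number of redundant parameters and strengthen local feature extraction. Second, the large selective kernel network (LSKNet) is introduced as an attention mechanism fused into the C2f module; it selectively adjusts the convolution kernel size, reducing the number of parameters and the computational complexity. Finally, a bidirectional feature pyramid network (BiFPN) is added and combined with a P2 small-target detection head, which strengthens the information coupling between feature maps at different scales and adds detection boxes for small-target cracks, enabling their accurate identification and extraction. The algorithm was validated on a crack dataset collected by UAV from a bridge. Compared with the YOLOv8 model, the proposed model improves precision by 3.7%, recall by 3.5%, F1 score by 3.5%, mean average precision at an IoU threshold of 0.5 (mAP50) by 3.9%, and mean average precision over IoU thresholds from 0.5 to 0.95 (mAP50-95) by 7.4%. The proposed algorithm improves the recognition accuracy for small-target cracks in bridge structures and offers a feasible solution for wide application in the surface health monitoring of bridges.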
Objective: The structural integrity of bridges is a critical concern as infrastructure ages, and reliable methods for detecting potential failures are needed. The identification of small target cracks is particularly important, because these cracks often grow undetected until they cause severe damage. Traditional inspection methods, such as manual visual inspection, are labor-intensive and susceptible to human error, and they often overlook small but significant defects. Recent advances in computer vision and deep learning offer new opportunities to improve the accuracy and efficiency of bridge inspections. This study introduces an approach for detecting small target cracks in bridge structures using an enhanced version of YOLOv8, a You Only Look Once (YOLO) object detection model known for its fast processing and high detection accuracy. The enhanced YOLOv8 model is tailored to detect small-scale cracks on bridge surfaces that are not easily identified by traditional inspection methods or earlier computer vision models.

Methods: The proposed algorithm modifies the standard YOLOv8 model to address the specific challenges of detecting small cracks on bridge surfaces. A key modification is the replacement of the YOLOv8 backbone with the efficient vision transformer (EfficientViT). EfficientViT is a transformer-based architecture that reduces redundant parameters and improves the extraction of local features from high-resolution images, enabling more precise detection of subtle crack features. This enhancement is important because small cracks often exhibit low contrast against their background and are easily overlooked by less capable models.
In addition to EfficientViT, the proposed algorithm incorporates the large selective kernel network (LSKNet) within the C2f module of YOLOv8. LSKNet employs a dynamic kernel selection mechanism that allows the model to adaptively adjust the size of the convolutional kernels based on the input features, making it well suited to detecting cracks of varying sizes, orientations, and morphological characteristics. This adaptability helps the model detect small cracks across a range of forms. Furthermore, the model uses a bidirectional feature pyramid network (BiFPN), combined with a P2 small-target detection head, to merge feature maps at different scales. Conventional models struggle to detect small targets because critical information is lost during downsampling. BiFPN mitigates this issue by preserving high-resolution feature information across multiple layers, enhancing the model's ability to detect small cracks that would otherwise be missed. Together, these modifications improve the accuracy of small target crack detection while maintaining computational efficiency.

Results: The proposed model was validated on a dataset of crack images from a single bridge, captured by unmanned aerial vehicles (UAVs). The UAVs provided detailed images of areas that are often difficult or dangerous to access with traditional inspection methods. The experimental results showed that the enhanced YOLOv8 model outperformed the original version on all key performance metrics. Specifically, the modified model achieved improvements of 3.7%, 3.5%, 3.5%, 3.9%, and 7.4% in precision, recall, F1 score, mAP50, and mAP50-95, respectively. These results indicate a substantial improvement in the model's ability to detect small cracks with low contrast and irregular shapes, which are typical characteristics of cracks on bridge surfaces.
Furthermore, compared with conventional methods, the proposed model detected cracks with higher precision and fewer false positives, making it a promising tool for improving the efficiency of bridge inspections.

Conclusions: The improved YOLOv8 algorithm introduced in this study advances the detection of small target cracks in bridge structures. The modifications made to the original YOLOv8 model, namely the integration of EfficientViT, LSKNet, and BiFPN, result in a more accurate and computationally efficient model for crack detection. This approach offers a practical and scalable solution for the widespread application of bridge health monitoring, particularly in areas that are difficult to inspect using traditional methods. By combining UAV image acquisition with deep-learning-based crack detection, this research contributes to modern methods for assessing the health of bridge structures and helps to ensure the safety and longevity of infrastructure systems.
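
For readers less familiar with the metrics reported in the Results, the following minimal sketch shows how precision, recall, and F1 score are conventionally computed from detection counts. It is an illustration under stated assumptions, not the evaluation code used in this study: the `DetectionCounts` class, the function names, and the example counts are hypothetical, and the step that matches predicted boxes to labeled cracks (typically by an IoU threshold) is assumed to have been done already.

```python
from dataclasses import dataclass


@dataclass
class DetectionCounts:
    """Hypothetical tallies obtained after matching predictions to labels."""
    true_positives: int   # predicted cracks matched to a labeled crack
    false_positives: int  # predicted cracks with no matching label (false detections)
    false_negatives: int  # labeled cracks that were not detected (missed detections)


def precision(c: DetectionCounts) -> float:
    """Fraction of predicted cracks that are correct."""
    denom = c.true_positives + c.false_positives
    return c.true_positives / denom if denom else 0.0


def recall(c: DetectionCounts) -> float:
    """Fraction of labeled cracks that were detected."""
    denom = c.true_positives + c.false_negatives
    return c.true_positives / denom if denom else 0.0


def f1_score(c: DetectionCounts) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


if __name__ == "__main__":
    # Made-up counts for illustration only; these are not results from the paper.
    counts = DetectionCounts(true_positives=90, false_positives=10, false_negatives=30)
    print(f"precision={precision(counts):.3f}")  # precision=0.900
    print(f"recall={recall(counts):.3f}")        # recall=0.750
    print(f"f1={f1_score(counts):.3f}")          # f1=0.818
```

In this convention, fewer false detections raise precision and fewer missed detections raise recall; F1 rewards a model only when both improve together. The mAP50 and mAP50-95 metrics additionally average precision over recall levels and IoU thresholds and are not covered by this sketch.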