An image-based smoke detection method integrating spatial perception and saliency modeling
Yanjun CHEN, Yinuo HUO, Chenglin YANG, Jingwu WANG, Zhiyong GUO, Lili ZHOU, Lei LI
Journal of Tsinghua University (Science and Technology), 2025, Vol. 65, Issue 11: 2168-2179.
Objective: Image-based smoke detection is a vital component of early fire warning systems. However, existing methods are often unreliable in environments with complex backgrounds, high noise levels, and low image contrast. In the early stages of a fire in particular, smoke tends to be small, thin, blurred, and irregular in shape, which further complicates detection. To address these challenges, this study proposes a smoke detection method that integrates spatial perception and saliency modeling. The aim is to improve the robustness, adaptability, and accuracy of smoke detection systems, providing reliable and effective solutions for real-world fire surveillance across diverse environments.
Methods: The proposed method consists of three key components: the multi-kernel parallel convolution module (MKPCM), the dynamic histogram axial interaction module (DHAIM), and the spatial decay residual block (SDRB). The MKPCM employs a parallel architecture with convolution kernels of varying sizes, enabling the network to capture features at multiple spatial scales simultaneously and thus to represent the variable dispersion scales of smoke. An embedded context anchor mechanism refines this process by assigning differentiated spatial weights, which strengthens the focus on relevant visual regions while suppressing background noise and irrelevant features. The DHAIM uses dynamic histogram-based segmentation to partition feature maps into high- and low-contrast areas, and then applies a hybrid attention mechanism tailored to each partition to improve semantic differentiation and the extraction of subtle smoke cues in low-contrast zones.
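The histogram-based partition described for the DHAIM can be illustrated with a small sketch. This is not the authors' implementation: the per-pixel contrast score, the bin count, and the choice of the histogram median as the split point are all assumptions made for illustration, and the function names `histogram_median_threshold` and `partition_by_contrast` are hypothetical. The threshold is recomputed from each input, which is the sense in which the split is "dynamic" here.

```python
from typing import List, Tuple

Grid = List[List[float]]
Mask = List[List[bool]]


def histogram_median_threshold(values: List[float], bins: int = 16) -> float:
    """Upper edge of the histogram bin where the cumulative count first
    reaches half of all values (an assumed, illustrative split rule)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input: there is nothing to separate.
        return lo
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value falls on the last bin's upper edge, so clamp it.
        counts[min(int((v - lo) / width), bins - 1)] += 1
    half = len(values) / 2
    running = 0
    for i, count in enumerate(counts):
        running += count
        if running >= half:
            return lo + (i + 1) * width
    return hi


def partition_by_contrast(contrast: Grid, bins: int = 16) -> Tuple[Mask, Mask, float]:
    """Split a 2D map of contrast scores into high- and low-contrast masks.

    `contrast` is a hypothetical per-pixel contrast score; how such a score
    is derived from a feature map is not specified in the abstract.
    """
    flat = [v for row in contrast for v in row]
    threshold = histogram_median_threshold(flat, bins)
    high = [[v > threshold for v in row] for row in contrast]
    low = [[not h for h in row] for row in high]
    return high, low, threshold


if __name__ == "__main__":
    contrast_map = [[0.0, 0.1, 0.9], [0.05, 0.8, 1.0]]
    high, low, t = partition_by_contrast(contrast_map, bins=4)
    print("threshold:", t)
    print("high-contrast mask:", high)
    print("low-contrast mask:", low)
```

In a full model, each mask would select the positions handled by its own attention branch; the sketch stops at producing the two masks.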
The SDRB introduces a spatial attention generation process based on Manhattan distance, in which attention weights decay as spatial distance increases. This reduces interference from distant pixels and improves feature consistency in regions with blurred boundaries. The three components are jointly optimized in an end-to-end learning framework to enhance the model's sensitivity to the complex spatial patterns and ambiguous edge transitions of smoke plumes.
Results: To evaluate the effectiveness of the proposed method, a multi-scene smoke detection dataset is constructed, encompassing various indoor and outdoor scenarios with diverse background complexities. Experimental results show that the proposed method achieves an average precision of 94.0%, outperforming the baseline real-time detection transformer (RT-DETR) model by 5.5%. The method delivers consistently high detection accuracy across different environmental conditions and remains robust to low contrast, occlusion, and scale variation. Ablation studies confirm the individual and combined contributions of the MKPCM, DHAIM, and SDRB to performance metrics such as precision, recall, and F1 score. In addition, the method demonstrates efficient inference and computational performance, making it suitable for real-time deployment in intelligent surveillance, early fire warning systems, and automated safety platforms.
Conclusions: This study presents a robust and efficient smoke detection method that integrates multi-scale spatial perception and contrast-adaptive saliency modeling. The experimental findings validate the method's ability to address key challenges in early fire smoke detection, especially in visually complex environments. With its strong detection performance and practical adaptability, the proposed method has significant potential for integration into real-world fire prevention infrastructure, enhancing early warning capabilities and contributing to improved public safety and emergency response.
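The distance-decayed attention described for the SDRB in the Methods can be sketched as follows. The abstract states only that attention weights decay with Manhattan distance; the exponential form `gamma ** distance`, the value of `gamma`, and the row normalization are assumptions for illustration, and the function names `manhattan_decay_mask` and `apply_decay` are hypothetical rather than taken from the paper.

```python
from typing import List

Matrix = List[List[float]]


def manhattan_decay_mask(height: int, width: int, gamma: float = 0.9) -> Matrix:
    """Pairwise decay weights for a height x width grid flattened row by row.

    mask[i][j] = gamma ** (|r_i - r_j| + |c_i - c_j|), so a position weights
    itself by 1 and weights other positions less the farther away they are.
    The exponential form is an assumption; the abstract does not give one.
    """
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must lie in (0, 1]")
    coords = [(r, c) for r in range(height) for c in range(width)]
    return [
        [gamma ** (abs(r1 - r2) + abs(c1 - c2)) for (r2, c2) in coords]
        for (r1, c1) in coords
    ]


def apply_decay(scores: Matrix, mask: Matrix) -> Matrix:
    """Scale non-negative attention scores by the decay mask, then
    renormalize each row so that its weights sum to 1."""
    out = []
    for score_row, mask_row in zip(scores, mask):
        weighted = [s * m for s, m in zip(score_row, mask_row)]
        total = sum(weighted)
        out.append([w / total for w in weighted] if total > 0 else weighted)
    return out


if __name__ == "__main__":
    mask = manhattan_decay_mask(2, 2, gamma=0.5)
    uniform = [[1.0] * 4 for _ in range(4)]
    for row in apply_decay(uniform, mask):
        print([round(w, 3) for w in row])
```

With uniform input scores, each output row simply reflects the mask: the query position receives the largest weight and the diagonally opposite corner the smallest, which is the suppression of remote pixels that the Methods describe.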
fire detection / image-based smoke detection / deep learning / attention mechanism / saliency modeling