Computer Science and Technology

Road segmentation using full convolutional neural networks with conditional random fields

  • SONG Qingsong ,
  • ZHANG Chao ,
  • CHEN Yu ,
  • WANG Xingli ,
  • YANG Xiaojun
  • School of Information Engineering, Chang'an University, Xi'an 710064, China

Received date: 2018-01-24

  Online published: 2018-08-15

Funding

Supported by the National Natural Science Foundation of China (61201406, 61473047) and the Fundamental Research Funds for the Central Universities (310824162022, 300102248201, 300102248401)

Cite this article

SONG Qingsong, ZHANG Chao, CHEN Yu, WANG Xingli, YANG Xiaojun. Road segmentation using full convolutional neural networks with conditional random fields[J]. Journal of Tsinghua University (Science and Technology), 2018, 58(8): 725-731. DOI: 10.16511/j.cnki.qhdxxb.2018.21.013

Abstract

Common road segmentation methods often lack robustness to environmental noise and produce segmentation edges that are not smooth. To address these problems, a road segmentation method is proposed that combines a fully convolutional neural network with a fully connected conditional random field (CRF). First, exploiting the strong feature representation ability of deep neural networks, road segmentation is treated as a binary classification problem, and a fully convolutional network based on the VGG-16 deep convolutional network is constructed to classify each road image end to end into road surface and background. Then, because a fully connected CRF can perform fine image segmentation, it is applied to smooth and refine the coarse edges produced by the binary classification. Tests on road segmentation benchmark datasets acquired in real environments show that the method achieves a segmentation accuracy of 98.13% at a processing speed of 0.84 s per image, which indicates that it is competitive with existing methods.
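The refinement stage described in the abstract can be illustrated with a minimal sketch of mean-field inference in a binary fully connected CRF. This is not the authors' implementation. It assumes grayscale intensities instead of color features, computes the pairwise kernel naively in O(N²) rather than with efficient filtering, and uses a Potts label compatibility. The function name, the parameter names, and all parameter values are hypothetical choices for illustration, and the coarse road probabilities stand in for the output of the fully convolutional network.

```python
import math


def refine_with_dense_crf(prob_road, gray, n_iters=5,
                          w_app=3.0, theta_pos=3.0, theta_int=0.1,
                          w_smooth=1.0, theta_smooth=1.5):
    """Naive mean-field inference for a binary fully connected CRF.

    prob_road: 2-D list with the coarse P(road) of every pixel.
    gray:      2-D list with pixel intensities in [0, 1].
    Returns a 2-D list with the refined P(road) of every pixel.
    All weights and bandwidths are illustrative assumptions.
    """
    h, w = len(prob_road), len(prob_road[0])
    pixels = [(r, c) for r in range(h) for c in range(w)]
    n = len(pixels)
    eps = 1e-6

    # Unary energies: negative log-probability of (background, road).
    unary = []
    for r, c in pixels:
        p = min(max(prob_road[r][c], eps), 1.0 - eps)
        unary.append((-math.log(1.0 - p), -math.log(p)))

    # Pairwise kernel between every pair of distinct pixels:
    # an appearance term (position + intensity) and a smoothness term (position).
    kernel = [[0.0] * n for _ in range(n)]
    for i, (ri, ci) in enumerate(pixels):
        for j, (rj, cj) in enumerate(pixels):
            if i == j:
                continue
            d2 = (ri - rj) ** 2 + (ci - cj) ** 2
            g2 = (gray[ri][ci] - gray[rj][cj]) ** 2
            appearance = w_app * math.exp(
                -d2 / (2 * theta_pos ** 2) - g2 / (2 * theta_int ** 2))
            smoothness = w_smooth * math.exp(-d2 / (2 * theta_smooth ** 2))
            kernel[i][j] = appearance + smoothness

    # q[i] is the current belief that pixel i is road; start from the unaries.
    q = [min(max(prob_road[r][c], eps), 1.0 - eps) for r, c in pixels]
    for _ in range(n_iters):
        new_q = []
        for i in range(n):
            msg_road = sum(kernel[i][j] * q[j] for j in range(n))
            msg_back = sum(kernel[i][j] * (1.0 - q[j]) for j in range(n))
            # Potts compatibility: a label is penalised by the kernel-weighted
            # belief mass that the other pixels place on the opposite label.
            e_back = unary[i][0] + msg_road
            e_road = unary[i][1] + msg_back
            diff = max(min(e_road - e_back, 50.0), -50.0)  # avoid overflow
            new_q.append(1.0 / (1.0 + math.exp(diff)))
        q = new_q  # synchronous update of all pixels

    return [[q[r * w + c] for c in range(w)] for r in range(h)]


# Toy input: dark "road" on the left, bright background on the right,
# with one road pixel that the coarse classifier got wrong.
gray = [[0.2, 0.2, 0.2, 0.8, 0.8, 0.8] for _ in range(4)]
coarse = [[0.8, 0.8, 0.8, 0.2, 0.2, 0.2] for _ in range(4)]
coarse[1][1] = 0.4
refined = refine_with_dense_crf(coarse, gray)
mask = [[1 if p > 0.5 else 0 for p in row] for row in refined]
```

In this toy case the misclassified pixel is pulled back to the road label because its neighbours are close in both position and intensity, while the sharp intensity change keeps the two regions apart. The quadratic cost makes the sketch suitable only for very small images.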
