Attributed object detection based on natural language processing

ZHANG Xu; WANG Shengjin

doi:10.16511/j.cnki.qhdxxb.2016.26.001

Journal of Tsinghua University (Science and Technology), 2016, Vol. 56, Issue 11: 1137-1142

Electronic Engineering

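Department of Electronic Engineering, State Key Laboratory of Intelligent Technology and Systems, National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084, China

Received date: 2016-06-02

Online published: 2016-11-15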

Cite this article

ZHANG Xu, WANG Shengjin. Attributed object detection based on natural language processing[J]. Journal of Tsinghua University (Science and Technology), 2016, 56(11): 1137-1142. DOI: 10.16511/j.cnki.qhdxxb.2016.26.001

Abstract

This paper addresses the problem of localizing attributed objects, such as an "abandoned car", in images. Because a single object may have tens or even hundreds of non-exclusive attributes, the main difficulty in training attributed object detectors is collecting training images and labeling bounding boxes for a large number of attributed objects. The proposed method obtains an attributed object detector by extending an object detector with an attributed object classifier. The attributed object classifier is trained on images mined from the Internet, with labeling information obtained from the object detector and a natural language processing tool. An attributed object detection dataset was built to evaluate the attributed object detectors. The results show that the mean average precision of the attributed object detectors is 30% higher, in relative terms, than that of the object detectors.

Key words: attributed object detection; object detection; natural language processing
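The idea summarized in the abstract, extending a generic object detector with an attributed object classifier, can be illustrated with a minimal sketch. The abstract does not say how the two confidences are combined, so the product rule below is an assumption, and the names `Detection`, `attributed_detections`, `toy_attribute_score` and `relative_gain` are hypothetical helpers introduced only for illustration. The mAP values passed to `relative_gain` are made-up numbers chosen to show what a 30% relative improvement means; they are not results from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass
class Detection:
    box: Box
    label: str    # category name, e.g. "car"
    score: float  # detector confidence in [0, 1]


def attributed_detections(
    detections: List[Detection],
    attribute_score: Callable[[Box], float],
    attribute: str,
    obj: str,
) -> List[Detection]:
    """Re-score generic detections of `obj` as detections of `attribute obj`.

    Assumption: detector and classifier confidences are multiplied.
    The source does not specify the combination rule.
    """
    rescored = []
    for det in detections:
        if det.label != obj:
            continue  # keep only boxes of the base object category
        combined = det.score * attribute_score(det.box)
        rescored.append(Detection(det.box, f"{attribute} {obj}", combined))
    # Highest combined confidence first.
    return sorted(rescored, key=lambda d: d.score, reverse=True)


def relative_gain(baseline_map: float, new_map: float) -> float:
    """Relative improvement of `new_map` over `baseline_map`."""
    return (new_map - baseline_map) / baseline_map


def toy_attribute_score(box: Box) -> float:
    """Hypothetical stand-in for a trained classifier: favours larger boxes."""
    area = (box[2] - box[0]) * (box[3] - box[1])
    return min(1.0, area / 225.0)


dets = [
    Detection((0, 0, 10, 10), "car", 0.9),
    Detection((5, 5, 20, 20), "car", 0.6),
    Detection((1, 1, 4, 4), "person", 0.8),
]
ranked = attributed_detections(dets, toy_attribute_score, "abandoned", "car")
for det in ranked:
    print(det.label, det.box, round(det.score, 3))

# Illustrative numbers only: going from 0.20 to 0.26 mAP is a 30% relative gain.
print(round(relative_gain(0.20, 0.26), 2))
```

In this toy run the second car box overtakes the first after re-scoring, because the stand-in classifier assigns it a higher attribute confidence. A real system would replace `toy_attribute_score` with a classifier trained on the web images and automatically derived labels that the abstract describes.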
