Abstract: 3-D Internet photo visualization reconstructs objects in 3-D from photo collections using structure-from-motion information, giving users the experience of moving around the object. However, because photographs on the Internet differ greatly in illumination, traditional reconstruction methods cannot produce a single point cloud; instead, they produce multiple independent point clouds. This paper describes a 3-D model registration framework, based on 3-D geometry, that merges the models reconstructed under different illumination conditions into a unified 3-D model. First, 3-D point cloud geometry is used in place of 2-D image features to overcome the influence of large illumination changes. Second, a scaled-PCA-ICP algorithm performs the registration, overcoming the large scale variance between the two point clouds. Tests on two datasets show the effectiveness of the method.
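The abstract does not spell out the steps of the scaled-PCA-ICP algorithm, so the following is only a minimal sketch of one plausible reading: a coarse alignment that matches the principal axes and overall scale of the two point clouds, followed by an ICP refinement that re-estimates a similarity transform (scale, rotation, translation) at each iteration via Umeyama's least-squares method. All function names are hypothetical, and the PCA stage ignores the sign ambiguity of principal axes; it is built on NumPy and SciPy only.

```python
import numpy as np
from scipy.spatial import cKDTree

def pca_prealign(src, dst):
    # Coarse stage (assumed): center both clouds, scale src so its spread
    # matches dst, and rotate src's principal axes onto dst's.
    # Note: PCA axis signs are ambiguous; this sketch ignores that.
    src_c, dst_c = src - src.mean(0), dst - dst.mean(0)
    s = np.sqrt((dst_c ** 2).sum() / (src_c ** 2).sum())  # global scale estimate
    _, _, Vs = np.linalg.svd(src_c, full_matrices=False)  # rows: src principal axes
    _, _, Vd = np.linalg.svd(dst_c, full_matrices=False)  # rows: dst principal axes
    R = Vd.T @ Vs                                         # axis-to-axis rotation
    if np.linalg.det(R) < 0:                              # force a proper rotation
        Vd[-1] *= -1
        R = Vd.T @ Vs
    return s * src_c @ R.T + dst.mean(0)

def umeyama(src, dst):
    # Least-squares similarity transform between matched point sets,
    # after Umeyama (1991): dst_i ~ s * R @ src_i + t.
    mu_s, mu_d = src.mean(0), dst.mean(0)
    cov = (dst - mu_d).T @ (src - mu_s) / len(src)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:          # reflection guard
        S[2, 2] = -1
    R = U @ S @ Vt
    s = np.trace(np.diag(D) @ S) / ((src - mu_s) ** 2).sum() * len(src)
    t = mu_d - s * R @ mu_s
    return s, R, t

def scaled_pca_icp(src, dst, iters=50, tol=1e-7):
    # Full pipeline (assumed): PCA/scale pre-alignment, then ICP that
    # refits a scaled transform at every iteration.
    cur = pca_prealign(src, dst)
    tree = cKDTree(dst)
    prev_err = np.inf
    for _ in range(iters):
        dists, idx = tree.query(cur)                      # nearest-neighbor matches
        s, R, t = umeyama(cur, dst[idx])
        cur = s * cur @ R.T + t
        err = (dists ** 2).mean()
        if abs(prev_err - err) < tol:                     # converged
            break
        prev_err = err
    return cur

# Hypothetical usage: cloud_a, cloud_b are (N, 3) arrays from two reconstructions.
# registered = scaled_pca_icp(cloud_a, cloud_b)
```

Estimating scale twice, coarsely in the PCA stage and again inside each ICP iteration, is one way to handle the large scale variance the abstract mentions, since point clouds reconstructed independently by structure from motion have no common metric scale.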