|
[1] P. A. Saurabh Gupta, Ross Girshick and J. Malik, “Learning rich features from rgb-d images for object detection and segmentation,” in ECCV, 2014. [2] R. G. J. S. Shaoqing Ren, Kaiming He, “Faster r-cnn: Towards real-time object detection with region proposal networks,” in NIPS, 2015. [3] J. Redmon and A. Farhadi, “Yolo9000: Better, faster, stronger,” in CVPR, 2017. [4] P. K. Nathan Silberman, Derek Hoiem and R. Fergus, “Indoor segmentation and support inference from rgbd images,” in ECCV, 2012. [5] A.Dai,A.X.Chang, M.Savva, M.Halber, T.Funkhouser, and M.Nießner,“Scan- net: Richly-annotated 3d reconstructions of indoor scenes,” in Proc. Computer Vision and Pattern Recognition (CVPR), IEEE, 2017. [6] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” in CVPR, 2016. [7] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, “Ssd: Single shot multibox detector,” in ECCV, 2016. [8] S. L. S. Song and J. Xiao, “Sun rgb-d: A rgb-d scene understanding benchmark suite,” in CVPR, 2015. [9] J. W. B. L. T. X. Xiaozhi Chen, Huimin Ma, “Multi-view 3d object detection network for autonomous driving,” in CVPR, 2017. [10] A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for autonomous driving? the kitti vision benchmark suite,” in Conference on Computer Vision and Pattern Recognition (CVPR), 2012. [11] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” in CVPR, 2014. [12] R. Girshick, “Fast r-cnn,” in ICCV, 2015. [13] P. D. Kaiming He, Georgia Gkioxari and R. Girshick, “Mask r-cnn,” in ICCV, 2017. [14] K. E. A. van de Sande, J. Uijlings, T. Gevers, and A. Smeulders, “Segmentation as selective search for object recognition,” in ICCV, 2011. 29 [15] S.R.J.S.KaimingHe,XiangyuZhang,“Spatialpyramidpoolingindeepconvolu- tional networks for visual recognition,” in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2015. [16] S.C.T.A.A.D.Erhan,D.,“Scalableobjectdetectionusingdeepneuralnetworks,” in CVPR, 2014. [17] R. G. K. H. B. H. Tsung-Yi Lin, Piotr Dollár and S. Belongie, “Feature pyramid networks for object detection,” in CVPR, 2017. [18] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, “The pascal visual object classes (voc) challenge,” in International Journal of Computer Vision, 2010. [19] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, “Microsoft coco: Common objects in context,” in ECCV, 2014. [20] J. M. Saurabh Gupta, Judy Hoffman, “Cross modal distillation for supervision transfer,” in CVPR, 2016. [21] S. Song and J. Xiao., “Sliding shapes for 3d object detection in depth images,” in ECCV, 2014. [22] S.SongandJ.Xiao.,“Deepslidingshapesforamodal3dobjectdetectioninrgb-d images,” in CVPR, 2016. [23] N. Dalal and B. Triggs., “Histograms of oriented gradients for human detection,” in CVPR, 2005. [24] R.ZhileandE.B.Sudderth.,“Three-dimensionalobjectdetec-tionandlayoutpredic- tion using clouds of oriented gradients,” in CVPR, 2016. [25] P.-T. J. B. J. M. F. M. J. Arbela ́ez, P., “Multiscale combinatorial grouping.,” in CVPR, 2014. [26] Z. D. L. J. Latecki, “Amodal detection of 3d objects: Inferring 3d bounding boxes from 2d ones in rgb-depth images,” in CVPR, 2017. [27] Z.Wang,W.Zhan,and M.Tomizuka,“Fusingbirdviewlidarpointcloudandfront view camera image for deep object detection,” in arXiv, 2017. [28] D. Lowe, “Distinctive image features from scale-invariant key-points,” in Int. J. Comput. Vis., vol. 60, no. 2, pp. 91 110, 2004. [29] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in NIPS, 2012. [30] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. S. Bernstein, A. C. Berg, and F. Li, “Imagenet: A large-scale hierarchical image database,” in CVPR, 2009. [31] J. . O. M. A. G. Basura Fernando, Efstratios Gavves and T. Tuytelaars, “Rank pooling for action recognition,” in PAMI, 2016. 30 [32] H. Bilen, B. Fernando, E. Gavves, and A. Vedaldi, “Dynamic image networks for action recognition,” in CVPR, 2016. [33] R. F. L. T. D. Tran, L. Bourdev and M. Paluri, “Learning spatiotemporal features with 3d convolutional networks„” in ICCV, 2015. [34] J. D. S. K. J. L. R. G. S. G. Y. Jia, E. Shelhamer, and T. Darrell, “Caffe: Convolutional architecture for fast feature embedding,” in arXiv, 2014. [35] S. S. T. L. R. S. A. Karpathy, G. Toderici and L. Fei-Fei, “Large-scale video classification with convolutional neural networks,” in CVPR, 2014. [36] R. B. Rusu and S. Cousins, “3d is here: Point cloud library (pcl),” in IEEE Inter- national Conference on Robotics and Automation (ICRA), 2011. [37] Y. J. J. T. B. M. F. K. S. A. Janoch, S. Karayev and T. Darrell., “A category-level 3-d objectdataset: Putting the kinect to work.,” in ICCV Workshop onConsumer Depth Cameras for Computer Vision, 2011. |