[1] T.-H. Wang, F.-E. Wang, J.-T. Lin, Y.-H. Tsai, W.-C. Chiu, and M. Sun, “Plug-and-play: Improve depth prediction via sparse data propagation,” in 2019 International Conference on Robotics and Automation (ICRA), pp. 5880–5886, May 2019. v

[2] T.-H. Wang, H.-N. Hu, C. H. Lin, Y.-H. Tsai, W.-C. Chiu, and M. Sun, “3d lidar and stereo fusion using stereo matching network with conditional cost volume normalization,” arXiv preprint arXiv:1904.02917, 2019. v

[3] F. Ma and S. Karaman, “Sparse-to-dense: Depth prediction from sparse depth samples and a single image,” in International Conference on Robotics and Automation (ICRA), 2018. xv, 1, 9, 19, 21, 23, 30, 31

[4] J. Uhrig, N. Schneider, L. Schneider, U. Franke, T. Brox, and A. Geiger, “Sparsity invariant cnns,” in International Conference on 3D Vision (3DV), 2017. 1, 4, 19, 20, 21, 23, 30

[5] M. Jaritz, R. De Charette, E. Wirbel, X. Perrotton, and F. Nashashibi, “Sparse and dense data with cnns: Depth completion and semantic segmentation,” arXiv preprint arXiv:1808.00769, 2018. 2, 9, 10, 21, 22, 23, 30

[6] Z. Chen, V. Badrinarayanan, G. Drozdov, and A. Rabinovich, “Estimating depth from rgb and sparse sensing,” arXiv preprint arXiv:1804.02771, 2018. 2

[7] N. Chodosh, C. Wang, and S. Lucey, “Deep convolutional compressed sensing for lidar depth completion,” arXiv preprint arXiv:1803.08949, 2018. 2, 9, 21, 23, 27, 30

[8] X. Cheng, P. Wang, and R. Yang, “Depth estimation via affinity learned with convolutional spatial propagation network,” in European Conference on Computer Vision (ECCV), 2018. 2, 7, 9, 14

[9] J. Yosinski, J. Clune, A. Nguyen, T. Fuchs, and H. Lipson, “Understanding neural networks through deep visualization,” in Deep Learning Workshop, International Conference on Machine Learning (ICML), 2015. 2

[10] A. Kurakin, I. Goodfellow, and S. Bengio, “Adversarial examples in the physical world,” arXiv preprint arXiv:1607.02533, 2016. 2, 11, 12

[11] N. Silberman, D. Hoiem, P. Kohli, and R. Fergus, “Indoor segmentation and support inference from rgbd images,” in European Conference on Computer Vision (ECCV), 2012. 3, 19, 20

[12] A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for autonomous driving? the kitti vision benchmark suite,” in Conference on Computer Vision and Pattern Recognition (CVPR), 2012. 3

[13] J. Zbontar and Y. LeCun, “Computing the stereo matching cost with a convolutional neural network,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. 4, 6, 24

[14] W. Luo, A. G. Schwing, and R. Urtasun, “Efficient deep learning for stereo matching,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. 4, 6

[15] A. Kendall, H. Martirosyan, S. Dasgupta, P. Henry, R. Kennedy, A. Bachrach, and A. Bry, “End-to-end learning of geometry and context for deep stereo regression,” in IEEE International Conference on Computer Vision (ICCV), 2017. 4, 6, 14, 22, 24, 25, 27, 29

[16] J. Pang, W. Sun, J. S. Ren, C. Yang, and Q. Yan, “Cascade residual learning: A two-stage convolutional neural network for stereo matching,” in IEEE International Conference on Computer Vision Workshops (ICCV Workshops), 2017. 4, 6

[17] J.-R. Chang and Y.-S. Chen, “Pyramid stereo matching network,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. 4, 6

[18] P.-H. Huang, K. Matzen, J. Kopf, N. Ahuja, and J.-B. Huang, “Deepmvs: Learning multi-view stereopsis,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. 4

[19] M. Menze and A. Geiger, “Object scene flow for autonomous vehicles,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. 4, 19, 20

[20] D. Eigen, C. Puhrsch, and R. Fergus, “Depth map prediction from a single image using a multi-scale deep network,” in Advances in Neural Information Processing Systems, pp. 2366–2374, 2014. 5, 19, 21, 23

[21] D. Eigen and R. Fergus, “Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture,” in Proceedings of the IEEE International Conference on Computer Vision, pp. 2650–2658, 2015. 5

[22] F. Liu, C. Shen, and G. Lin, “Deep convolutional neural fields for depth estimation from a single image,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5162–5170, 2015. 5

[23] I. Laina, C. Rupprecht, V. Belagiannis, F. Tombari, and N. Navab, “Deeper depth prediction with fully convolutional residual networks,” in 3D Vision (3DV), 2016 Fourth International Conference on, pp. 239–248, IEEE, 2016. 5, 10, 19, 21, 23

[24] H. Fu, M. Gong, C. Wang, K. Batmanghelich, and D. Tao, “Deep ordinal regression network for monocular depth estimation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2002–2011, 2018. 5, 10

[25] D. Scharstein and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” International Journal of Computer Vision (IJCV), 2002. 6, 14

[26] H. Hirschmuller, “Stereo processing by semiglobal matching and mutual information,” IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2008. 6, 7, 24, 25

[27] N. Mayer, E. Ilg, P. Hausser, P. Fischer, D. Cremers, A. Dosovitskiy, and T. Brox, “A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. 6

[28] F. Ma and S. Karaman, “Sparse-to-dense: Depth prediction from sparse depth samples and a single image,” in IEEE International Conference on Robotics and Automation (ICRA), 2018. 6, 15, 28, 29

[29] F. Ma, G. V. Cavalheiro, and S. Karaman, “Self-supervised sparse-to-dense: Self-supervised depth completion from lidar and monocular camera,” arXiv preprint arXiv:1807.00275, 2018. 6, 24, 25, 28, 29

[30] J. Uhrig, N. Schneider, L. Schneider, U. Franke, T. Brox, and A. Geiger, “Sparsity invariant cnns,” in International Conference on 3D Vision (3DV), 2017. 6

[31] Z. Huang, J. Fan, S. Yi, X. Wang, and H. Li, “Hms-net: Hierarchical multi-scale sparsity-invariant network for sparse depth completion,” arXiv preprint arXiv:1808.08685, 2018. 6

[32] A. Eldesokey, M. Felsberg, and F. S. Khan, “Confidence propagation through cnns for guided sparse depth regression,” arXiv preprint arXiv:1811.01791, 2018. 6, 24, 25

[33] W. Van Gansbeke, D. Neven, B. De Brabandere, and L. Van Gool, “Sparse and noisy lidar completion with rgb guidance and uncertainty,” arXiv preprint arXiv:1902.05356, 2019. 6, 24, 25

[34] X. Cheng, P. Wang, and R. Yang, “Depth estimation via affinity learned with convolutional spatial propagation network,” in European Conference on Computer Vision (ECCV), 2018. 6

[35] T.-H. Wang, F.-E. Wang, J.-T. Lin, Y.-H. Tsai, W.-C. Chiu, and M. Sun, “Plug-and-play: Improve depth estimation via sparse data propagation,” arXiv preprint arXiv:1812.08350, 2018. 6

[36] K. Nickels, A. Castano, and C. Cianci, “Fusion of lidar and stereo range for mobile robots,” in International Conference on Advanced Robotics (ICAR), 2003. 7

[37] D. Huber, T. Kanade, et al., “Integrating lidar into stereo for fast and improved disparity computation,” in International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission (3DIMPVT), 2011. 7

[38] V. Gandhi, J. Čech, and R. Horaud, “High-resolution depth maps based on tof-stereo fusion,” in IEEE International Conference on Robotics and Automation (ICRA), 2012. 7

[39] W. Maddern and P. Newman, “Real-time probabilistic fusion of sparse 3d lidar and dense stereo,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2016. 7, 20, 24, 25

[40] K. Park, S. Kim, and K. Sohn, “High-precision depth estimation with the 3d lidar and stereo fusion,” in IEEE International Conference on Robotics and Automation (ICRA), 2018. 7, 20, 24, 25

[41] E. Perez, H. De Vries, F. Strub, V. Dumoulin, and A. Courville, “Learning visual reasoning without strong priors,” arXiv preprint arXiv:1707.03017, 2017. 8, 15, 16

[42] H. De Vries, F. Strub, J. Mary, H. Larochelle, O. Pietquin, and A. C. Courville, “Modulating early visual processing by language,” in Advances in Neural Information Processing Systems (NIPS), 2017. 8, 15, 16

[43] E. Perez, F. Strub, H. De Vries, V. Dumoulin, and A. Courville, “Film: Visual reasoning with a general conditioning layer,” in AAAI Conference on Artificial Intelligence (AAAI), 2018. 8

[44] A. Radford, L. Metz, and S. Chintala, “Unsupervised representation learning with deep convolutional generative adversarial networks,” arXiv preprint arXiv:1511.06434, 2015. 8

[45] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, 1997. 8

[46] D. Ha, A. Dai, and Q. V. Le, “Hypernetworks,” arXiv preprint arXiv:1609.09106, 2016. 8

[47] C. H. Lin, C.-C. Chang, Y.-S. Chen, D.-C. Juan, W. Wei, and H.-T. Chen, “COCO-GAN: Conditional coordinate generative adversarial network,” 2019. 8

[48] Y. Liao, L. Huang, Y. Wang, S. Kodagoda, Y. Yu, and Y. Liu, “Parse geometry from a line: Monocular depth estimation with partial laser observation,” in Robotics and Automation (ICRA), 2017 IEEE International Conference on, pp. 5059–5066, IEEE, 2017. 9, 10

[49] Y. Zhang and T. Funkhouser, “Deep depth completion of a single rgb-d image,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018. 9

[50] Y. Cao, Z. Wu, and C. Shen, “Estimating depth from monocular images as classification using deep fully convolutional residual networks,” IEEE Transactions on Circuits and Systems for Video Technology, 2017. 10

[51] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning internal representations by error propagation,” in Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1: Foundations, 1986. 12

[52] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in International Conference for Learning Representations, 2014. 12

[53] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. 17, 28

[54] T. Zhou, M. Brown, N. Snavely, and D. G. Lowe, “Unsupervised learning of depth and ego-motion from video,” in CVPR, 2017. 21, 23

[55] G. Hinton, N. Srivastava, and K. Swersky, “Neural networks for machine learning, lecture 6a: Overview of mini-batch gradient descent.” 22

[56] G. Ros, L. Sellart, J. Materzynska, D. Vazquez, and A. M. Lopez, “The synthia dataset: A large collection of synthetic images for semantic segmentation of urban scenes,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016. 31