[1] K. Konda and R. Memisevic, “Learning visual odometry with a convolutional network,” in Proc. Int. Conf. on Computer Vision Theory and Applications (VISAPP), 2015.
[2] P. Agrawal, J. Carreira, and J. Malik, “Learning to see by moving,” in Proc. IEEE Int. Conf. on Computer Vision (ICCV), 2015.
[3] D. Jayaraman and K. Grauman, “Learning image representations tied to ego-motion,” in Proc. IEEE Int. Conf. on Computer Vision (ICCV), 2015.
[4] S. Wang, R. Clark, H. Wen, and N. Trigoni, “DeepVO: Towards end-to-end visual odometry with deep recurrent convolutional neural networks,” in Proc. IEEE Int. Conf. on Robotics and Automation (ICRA), 2017.
[5] F. Walch, C. Hazirbas, L. Leal-Taixé, T. Sattler, S. Hilsenbeck, and D. Cremers, “Image-based localization using LSTMs for structured feature correlation,” in Proc. IEEE Int. Conf. on Computer Vision (ICCV), 2017.
[6] Y. Xiang, T. Schmidt, V. Narayanan, and D. Fox, “PoseCNN: A convolutional neural network for 6D object pose estimation in cluttered scenes,” in Proc. Robotics: Science and Systems (RSS), 2018.
[7] V. Balntas, S. Li, and V. Prisacariu, “RelocNet: Continuous metric learning relocalisation using neural nets,” in Proc. European Conf. on Computer Vision (ECCV), 2018.
[8] Z. Laskar, I. Melekhov, S. Kalia, and J. Kannala, “Camera relocalization by computing pairwise relative poses using convolutional neural network,” in Proc. IEEE Int. Conf. on Computer Vision Workshop (ICCVW), 2017.
[9] I. Melekhov, J. Ylioinas, J. Kannala, and E. Rahtu, “Relative camera pose estimation using convolutional neural networks,” in Proc. Int. Conf. on Advanced Concepts for Intelligent Vision Systems (ACIVS), 2017.
[10] A. Kendall, M. Grimes, and R. Cipolla, “PoseNet: A convolutional network for real-time 6-DoF camera relocalization,” in Proc. IEEE Int. Conf. on Computer Vision (ICCV), pp. 2938–2946, 2015.
[11] A. Kendall and R. Cipolla, “Modelling uncertainty in deep learning for camera relocalization,” in Proc. IEEE Int. Conf. on Robotics and Automation (ICRA), 2016.
[12] M. Cai, C. Shen, and I. Reid, “A hybrid probabilistic model for camera relocalization,” in Proc. British Machine Vision Conf. (BMVC), 2018.
[13] A. Kendall and R. Cipolla, “Geometric loss functions for camera pose regression with deep learning,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 6555–6564, 2017.
[14] T. Sattler, Q. Zhou, M. Pollefeys, and L. Leal-Taixé, “Understanding the limitations of CNN-based absolute camera pose regression,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2019.
[15] S. Brahmbhatt, J. Gu, K. Kim, J. Hays, and J. Kautz, “Geometry-aware learning of maps for camera localization,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2018.
[16] I. Melekhov, J. Ylioinas, J. Kannala, and E. Rahtu, “Image-based localization using hourglass networks,” in Proc. IEEE Int. Conf. on Computer Vision Workshop (ICCVW), 2017.
[17] T. Naseer and W. Burgard, “Deep regression for monocular camera-based 6-DoF global localization in outdoor environments,” in Proc. IEEE Int. Conf. on Intelligent Robots and Systems (IROS), 2017.
[18] N. Radwan, A. Valada, and W. Burgard, “VLocNet++: Deep multitask learning for semantic visual localization and odometry,” IEEE Robotics and Automation Letters, 2018.
[19] A. Valada, N. Radwan, and W. Burgard, “Deep auxiliary learning for visual localization and odometry,” in Proc. IEEE Int. Conf. on Robotics and Automation (ICRA), 2018.
[20] J. Wu, L. Ma, and X. Hu, “Delving deeper into convolutional neural networks for camera relocalization,” in Proc. IEEE Int. Conf. on Robotics and Automation (ICRA), 2017.
[21] F. Xue, Q. Wang, X. Wang, W. Dong, J. Wang, and H. Zha, “Guided feature selection for deep visual odometry,” in Proc. Asian Conf. on Computer Vision (ACCV), 2018.
[22] N. Yang, L. von Stumberg, R. Wang, and D. Cremers, “D3VO: Deep depth, deep pose and deep uncertainty for monocular visual odometry,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2020.
[23] G. Costante, M. Mancini, P. Valigi, and T. A. Ciarfuglia, “Exploring representation learning with CNNs for frame-to-frame ego-motion estimation,” IEEE Robotics and Automation Letters, 2016.
[24] G. Costante and T. A. Ciarfuglia, “LS-VO: Learning dense optical subspace for robust visual odometry estimation,” IEEE Robotics and Automation Letters, 2018.
[25] P. Muller and A. Savakis, “Flowdometry: An optical flow and deep learning based approach to visual odometry,” in Proc. IEEE Winter Conf. on Applications of Computer Vision (WACV), 2017.
[26] B. Wang, C. Chen, C. X. Lu, P. Zhao, N. Trigoni, and A. Markham, “AtLoc: Attention guided camera localization,” arXiv preprint arXiv:1909.03557, 2019.
[27] H. Damirchi, R. Khorrambakht, and H. D. Taghirad, “Exploring self-attention for visual odometry,” arXiv preprint arXiv:2011.08634, 2020.
[28] E. Parisotto, D. S. Chaplot, J. Zhang, and R. Salakhutdinov, “Global pose estimation with an attention-based recurrent network,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition Workshop (CVPRW), pp. 237–246, 2018.
[29] C. Chen, S. Rosa, Y. Miao, C. X. Lu, W. Wu, A. Markham, and N. Trigoni, “Selective sensor fusion for neural visual-inertial odometry,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 10542–10551, 2019.
[30] T. A. Ciarfuglia, G. Costante, P. Valigi, and E. Ricci, “Evaluation of non-geometric methods for visual odometry,” Robotics and Autonomous Systems, vol. 62, no. 12, pp. 1717–1730, 2014.
[31] T. Zhang, X. Liu, K. Kühnlenz, and M. Buss, “Visual odometry for the autonomous city explorer,” in Proc. IEEE Int. Conf. on Intelligent Robots and Systems (IROS), pp. 3513–3518, 2009.
[32] X. Kuo, C. Liu, K. Lin, E. Luo, Y. Chen, and C. Lee, “Dynamic attention-based visual odometry,” in Proc. IEEE Int. Conf. on Intelligent Robots and Systems (IROS), pp. 5753–5760, 2020.
[33] M. Kaneko, K. Iwami, T. Ogawa, T. Yamasaki, and K. Aizawa, “Mask-SLAM: Robust feature-based monocular SLAM by masking using semantic segmentation,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition Workshop (CVPRW), pp. 258–266, 2018.
[34] B. Bescós, J. M. Fácil, J. Civera, and J. Neira, “DynaSLAM: Tracking, mapping, and inpainting in dynamic scenes,” IEEE Robotics and Automation Letters, vol. 3, no. 4, pp. 4076–4083, 2018.
[35] T. Sun, Y. Sun, M. Liu, and D. Yeung, “Movable-object-aware visual SLAM via weakly supervised semantic segmentation,” arXiv preprint arXiv:1906.03629, 2019.
[36] C. Chen, S. Rosa, Y. Miao, C. X. Lu, W. Wu, A. Markham, and N. Trigoni, “Selective sensor fusion for neural visual-inertial odometry,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2019.
[37] F. Gao, J. Yu, H. Shen, Y. Wang, and H. Yang, “Attentional separation-and-aggregation network for self-supervised depth-pose learning in dynamic scenes,” arXiv preprint arXiv:2011.09369, 2020.
[38] B. Li, S. Wang, H. Ye, X. Gong, and Z. Xiang, “Cross-modal knowledge distillation for depth privileged monocular visual odometry,” IEEE Robotics and Automation Letters, vol. 7, no. 3, pp. 6171–6178, 2022.
[39] S. Lee, F. Rameau, F. Pan, and I. S. Kweon, “Attentive and contrastive learning for joint depth and motion field estimation,” arXiv preprint arXiv:2110.06853, 2021.
[40] A. Kendall and Y. Gal, “What uncertainties do we need in Bayesian deep learning for computer vision?,” in Proc. Conf. on Neural Information Processing Systems (NeurIPS), 2017.
[41] M. Klodt and A. Vedaldi, “Supervising the new with the old: Learning SfM from SfM,” in Proc. European Conf. on Computer Vision (ECCV), 2018.
[42] H. Strasdat, J. M. M. Montiel, and A. J. Davison, “Real-time monocular SLAM: Why filter?,” in Proc. IEEE Int. Conf. on Robotics and Automation (ICRA), pp. 2657–2664, 2010.
[43] J. Engel, J. Sturm, and D. Cremers, “Semi-dense visual odometry for a monocular camera,” in Proc. IEEE Int. Conf. on Computer Vision (ICCV), pp. 1449–1456, 2013.
[44] X.-Y. Dai, Q.-H. Meng, and S. Jin, “Uncertainty-driven active view planning in feature-based monocular vSLAM,” Applied Soft Computing, vol. 108, p. 107459, 2021.
[45] G. Costante and M. Mancini, “Uncertainty estimation for data-driven visual odometry,” IEEE Trans. Robotics, vol. 36, no. 6, pp. 1738–1757, 2020.
[46] R. Mur-Artal and J. D. Tardós, “ORB-SLAM2: An open-source SLAM system for monocular, stereo and RGB-D cameras,” IEEE Trans. Robotics, 2017.
[47] G. Klein and D. W. Murray, “Parallel tracking and mapping for small AR workspaces,” in Proc. IEEE/ACM Int. Symposium on Mixed and Augmented Reality (ISMAR), 2007.
[48] A. Geiger, J. Ziegler, and C. Stiller, “StereoScan: Dense 3D reconstruction in real-time,” in Proc. IEEE Intelligent Vehicles Symposium (IV), 2011.
[49] J. Engel, V. Koltun, and D. Cremers, “Direct sparse odometry,” IEEE Trans. Pattern Analysis and Machine Intelligence (TPAMI), 2017.
[50] J. Engel, T. Schöps, and D. Cremers, “LSD-SLAM: Large-scale direct monocular SLAM,” in Proc. European Conf. on Computer Vision (ECCV), 2014.
[51] R. A. Newcombe, S. Lovegrove, and A. J. Davison, “DTAM: Dense tracking and mapping in real-time,” in Proc. IEEE Int. Conf. on Computer Vision (ICCV), 2011.
[52] R. Liu, J. Lehman, P. Molino, F. P. Such, E. Frank, A. Sergeev, and J. Yosinski, “An intriguing failing of convolutional neural networks and the CoordConv solution,” in Proc. Conf. on Neural Information Processing Systems (NeurIPS), pp. 9628–9639, 2018.
[53] M. A. Fischler and R. C. Bolles, “Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography,” Commun. ACM, vol. 24, pp. 381–395, 1981.
[54] D. J. Butler, J. Wulff, G. B. Stanley, and M. J. Black, “A naturalistic open source movie for optical flow evaluation,” in Proc. European Conf. on Computer Vision (ECCV), pp. 611–625, 2012.
[55] W. Wang, Y. Hu, and S. A. Scherer, “TartanVO: A generalizable learning-based VO,” in Proc. Conf. on Robot Learning (CoRL), 2020.
[56] K. Simonyan, A. Vedaldi, and A. Zisserman, “Deep inside convolutional networks: Visualising image classification models and saliency maps,” in Proc. Int. Conf. on Learning Representations Workshop (ICLRW), 2014.
[57] A. Dosovitskiy, P. Fischer, E. Ilg, P. Hausser, C. Hazirbas, V. Golkov, P. Van Der Smagt, D. Cremers, and T. Brox, “FlowNet: Learning optical flow with convolutional networks,” in Proc. IEEE Int. Conf. on Computer Vision (ICCV), pp. 2758–2766, 2015.
[58] E. Ilg, N. Mayer, T. Saikia, M. Keuper, A. Dosovitskiy, and T. Brox, “FlowNet 2.0: Evolution of optical flow estimation with deep networks,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 1647–1655, 2017.
[59] D. Sun, X. Yang, M.-Y. Liu, and J. Kautz, “Models matter, so does training: An empirical study of CNNs for optical flow estimation,” IEEE Trans. Pattern Analysis and Machine Intelligence (TPAMI), vol. 42, pp. 1408–1423, 2020.
[60] S. Zhao, Y. Sheng, Y. Dong, E. Chang, and Y. Xu, “MaskFlownet: Asymmetric feature matching with learnable occlusion mask,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 6277–6286, 2020.
[61] Z. Yin, T. Darrell, and F. Yu, “Hierarchical discrete distribution decomposition for match density estimation,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 6037–6046, 2019.
[62] J. Wulff, L. Sevilla-Lara, and M. J. Black, “Optical flow in mostly rigid scenes,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 6911–6920, 2017.
[63] A. Ranjan and M. J. Black, “Optical flow estimation using a spatial pyramid network,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 2720–2729, 2017.
[64] J. Hur and S. Roth, “Iterative residual refinement for joint optical flow and occlusion estimation,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 5747–5756, 2019.
[65] Z. Teed and J. Deng, “RAFT: Recurrent all-pairs field transforms for optical flow,” in Proc. European Conf. on Computer Vision (ECCV), 2020.
[66] G. Yang and D. Ramanan, “Volumetric correspondence networks for optical flow,” in Proc. Conf. on Neural Information Processing Systems (NeurIPS), 2019.
[67] A. Jaegle, S. Borgeaud, J.-B. Alayrac, C. Doersch, C. Ionescu, D. Ding, S. Koppula, D. Zoran, A. Brock, E. Shelhamer, O. Hénaff, M. M. Botvinick, A. Zisserman, O. Vinyals, and J. Carreira, “Perceiver IO: A general architecture for structured inputs and outputs,” in Proc. Int. Conf. on Learning Representations (ICLR), 2022.
[68] Z. Huang, X. Shi, C. Zhang, Q. Wang, K. C. Cheung, H. Qin, J. Dai, and H. Li, “FlowFormer: A transformer architecture for optical flow,” arXiv preprint arXiv:2203.16194, 2022.
[69] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” in Proc. Conf. on Neural Information Processing Systems (NeurIPS), 2017.
[70] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, “Vision meets robotics: The KITTI dataset,” The International Journal of Robotics Research, pp. 1231–1237, 2013.
[71] N. Mayer, E. Ilg, P. Häusser, P. Fischer, D. Cremers, A. Dosovitskiy, and T. Brox, “A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2016.