[1] H.-K. Yang, T.-C. Hsiao, T.-H. Liao, H.-S. Liu, L.-Y. Tsao, T.-W. Wang, S.-Y. Yang, Y.-W. Chen, H.-R. Liao, and C.-Y. Lee, “Investigation of factorized optical flows as mid-level representations,” arXiv preprint arXiv:2203.04927, 2022.
[2] A. Sax, J. O. Zhang, B. Emi, A. Zamir, S. Savarese, L. Guibas, and J. Malik, “Learning to navigate using mid-level visual priors,” arXiv preprint arXiv:1912.11121, 2019.
[3] B. Chen, A. Sax, G. Lewis, I. Armeni, S. Savarese, A. Zamir, J. Malik, and L. Pinto, “Robust policies via mid-level visual representations: An experimental study in manipulation and navigation,” arXiv preprint arXiv:2011.06698, 2020.
[4] L. Yen-Chen, A. Zeng, S. Song, P. Isola, and T.-Y. Lin, “Learning to see before learning to act: Visual pre-training for manipulation,” in 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 7286–7293, IEEE, 2020.
[5] M. Müller, A. Dosovitskiy, B. Ghanem, and V. Koltun, “Driving policy transfer via modularity and abstraction,” arXiv preprint arXiv:1804.09364, 2018.
[6] K. Arulkumaran, M. P. Deisenroth, M. Brundage, and A. A. Bharath, “A brief survey of deep reinforcement learning,” arXiv preprint arXiv:1708.05866, 2017.
[7] J. García and F. Fernández, “A comprehensive survey on safe reinforcement learning,” Journal of Machine Learning Research, vol. 16, no. 1, pp. 1437–1480, 2015.
[8] W. Zhao, J. P. Queralta, and T. Westerlund, “Sim-to-real transfer in deep reinforcement learning for robotics: A survey,” in 2020 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 737–744, IEEE, 2020.
[9] I. Higgins, A. Pal, A. Rusu, L. Matthey, C. Burgess, A. Pritzel, M. Botvinick, C. Blundell, and A. Lerchner, “DARLA: Improving zero-shot transfer in reinforcement learning,” in International Conference on Machine Learning, pp. 1480–1490, PMLR, 2017.
[10] J. Oh, S. Singh, H. Lee, and P. Kohli, “Zero-shot task generalization with multi-task deep reinforcement learning,” in International Conference on Machine Learning, pp. 2661–2670, PMLR, 2017.
[11] N. Koenig and A. Howard, “Design and use paradigms for Gazebo, an open-source multi-robot simulator,” in 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), vol. 3, pp. 2149–2154, IEEE, 2004.
[12] A. Juliani, V.-P. Berges, E. Teng, A. Cohen, J. Harper, C. Elion, C. Goy, Y. Gao, H. Henry, M. Mattar, et al., “Unity: A general platform for intelligent agents,” arXiv preprint arXiv:1809.02627, 2018.
[13] E. Coumans and Y. Bai, “PyBullet, a Python module for physics simulation for games, robotics and machine learning.” http://pybullet.org, 2016–2021.
[14] E. Todorov, T. Erez, and Y. Tassa, “MuJoCo: A physics engine for model-based control,” in 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5026–5033, IEEE, 2012.
[15] M. Quigley, K. Conley, B. P. Gerkey, J. Faust, T. Foote, J. Leibs, R. Wheeler, and A. Y. Ng, “ROS: An open-source robot operating system,” in ICRA Workshop on Open Source Software, 2009.
[16] J. P. Tobin, Real-World Robotic Perception and Control Using Synthetic Data. PhD thesis, University of California, Berkeley, 2019.
[17] L. Capito, U. Ozguner, and K. Redmill, “Optical flow based visual potential field for autonomous driving,” in 2020 IEEE Intelligent Vehicles Symposium (IV), pp. 885–891, IEEE, 2020.
[18] B. Zhou, P. Krähenbühl, and V. Koltun, “Does computer vision matter for action?,” Science Robotics, vol. 4, no. 30, p. eaaw6661, 2019.
[19] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. MIT Press, 2018.
[20] M. van Otterlo and M. Wiering, “Reinforcement learning and Markov decision processes,” in Reinforcement Learning, pp. 3–42, Springer, 2012.
[21] A. G. Barto, P. S. Thomas, and R. S. Sutton, “Some recent applications of reinforcement learning,” in Proceedings of the Eighteenth Yale Workshop on Adaptive and Learning Systems, 2017.
[22] E. Ipek, O. Mutlu, J. F. Martínez, and R. Caruana, “Self-optimizing memory controllers: A reinforcement learning approach,” ACM SIGARCH Computer Architecture News, vol. 36, no. 3, pp. 39–50, 2008.
[23] G. Tesauro, D. Gondek, J. Lenchner, J. Fan, and J. M. Prager, “Simulation, learning, and optimization techniques in Watson’s game strategies,” IBM Journal of Research and Development, vol. 56, no. 3/4, pp. 16:1–16:11, 2012.
[24] Z. Zainuddin and O. Pauline, “Function approximation using artificial neural networks,” WSEAS Transactions on Mathematics, vol. 7, no. 6, pp. 333–338, 2008.
[25] S. Levine, C. Finn, T. Darrell, and P. Abbeel, “End-to-end training of deep visuomotor policies,” The Journal of Machine Learning Research, vol. 17, no. 1, pp. 1334–1373, 2016.
[26] I. Osband, C. Blundell, A. Pritzel, and B. Van Roy, “Deep exploration via bootstrapped DQN,” Advances in Neural Information Processing Systems, vol. 29, 2016.
[27] H. van Hasselt, A. Guez, and D. Silver, “Deep reinforcement learning with double Q-learning,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 30, 2016.
[28] J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, “Proximal policy optimization algorithms,” arXiv preprint arXiv:1707.06347, 2017.
[29] T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, “Continuous control with deep reinforcement learning,” arXiv preprint arXiv:1509.02971, 2015.
[30] T. Haarnoja, A. Zhou, P. Abbeel, and S. Levine, “Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor,” in International Conference on Machine Learning, pp. 1861–1870, PMLR, 2018.
[31] A. G. Barto, R. S. Sutton, and C. W. Anderson, “Neuronlike adaptive elements that can solve difficult learning control problems,” IEEE Transactions on Systems, Man, and Cybernetics, no. 5, pp. 834–846, 1983.
[32] T. Haarnoja, A. Zhou, K. Hartikainen, G. Tucker, S. Ha, J. Tan, V. Kumar, H. Zhu, A. Gupta, P. Abbeel, et al., “Soft actor-critic algorithms and applications,” arXiv preprint arXiv:1812.05905, 2018.
[33] G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, and W. Zaremba, “OpenAI Gym,” arXiv preprint arXiv:1606.01540, 2016.
[34] D. Silver, G. Lever, N. Heess, T. Degris, D. Wierstra, and M. Riedmiller, “Deterministic policy gradient algorithms,” in International Conference on Machine Learning, pp. 387–395, PMLR, 2014.
[35] M. Vecerik, T. Hester, J. Scholz, F. Wang, O. Pietquin, B. Piot, N. Heess, T. Rothörl, T. Lampe, and M. Riedmiller, “Leveraging demonstrations for deep reinforcement learning on robotics problems with sparse rewards,” arXiv preprint arXiv:1707.08817, 2017.
[36] Z. Zhu, K. Lin, and J. Zhou, “Transfer learning in deep reinforcement learning: A survey,” arXiv preprint arXiv:2009.07888, 2020.
[37] F. Fernández and M. Veloso, “Probabilistic policy reuse in a reinforcement learning agent,” in Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 720–727, 2006.
[38] T. Haarnoja, H. Tang, P. Abbeel, and S. Levine, “Reinforcement learning with deep energy-based policies,” in International Conference on Machine Learning, pp. 1352–1361, PMLR, 2017.
[39] A. Garcia-Garcia, S. Orts-Escolano, S. Oprea, V. Villena-Martinez, and J. Garcia-Rodriguez, “A review on deep learning techniques applied to semantic segmentation,” arXiv preprint arXiv:1704.06857, 2017.
[40] C. Yu, J. Wang, C. Peng, C. Gao, G. Yu, and N. Sang, “BiSeNet: Bilateral segmentation network for real-time semantic segmentation,” in Proceedings of the European Conference on Computer Vision (ECCV), pp. 325–341, 2018.
[41] P. Wang, P. Chen, Y. Yuan, D. Liu, Z. Huang, X. Hou, and G. Cottrell, “Understanding convolution for semantic segmentation,” in 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1451–1460, IEEE, 2018.
[42] H. Zhang, K. Dana, J. Shi, Z. Zhang, X. Wang, A. Tyagi, and A. Agrawal, “Context encoding for semantic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7151–7160, 2018.
[43] D. Eigen, C. Puhrsch, and R. Fergus, “Depth map prediction from a single image using a multi-scale deep network,” Advances in Neural Information Processing Systems, vol. 27, 2014.
[44] D. Scharstein and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” International Journal of Computer Vision, vol. 47, no. 1, pp. 7–42, 2002.
[45] G. Zhou, L. Fang, K. Tang, H. Zhang, K. Wang, and K. Yang, “Guidance: A visual sensing platform for robotic applications,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 9–14, 2015.
[46] H.-Y. Kuo, H.-R. Su, S.-H. Lai, and C.-C. Wu, “3D object detection and pose estimation from depth image for robotic bin picking,” in 2014 IEEE International Conference on Automation Science and Engineering (CASE), pp. 1264–1269, IEEE, 2014.
[47] R. N. Elek, A. I. Károly, T. Haidegger, and P. Galambos, “Towards optical flow ego-motion compensation for moving object segmentation,” in ROBOVIS, pp. 114–120, 2020.
[48] K. M. Dawson-Howe and D. Vernon, “Simple pinhole camera calibration,” International Journal of Imaging Systems and Technology, vol. 5, no. 1, pp. 1–6, 1994.
[49] S. Zhang and R. S. Sutton, “A deeper look at experience replay,” arXiv preprint arXiv:1712.01275, 2017.
[50] A. Stooke and P. Abbeel, “Accelerated methods for deep reinforcement learning,” arXiv preprint arXiv:1803.02811, 2018.
[51] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller, “Playing Atari with deep reinforcement learning,” arXiv preprint arXiv:1312.5602, 2013.
[52] Y. A. LeCun, L. Bottou, G. B. Orr, and K.-R. Müller, “Efficient backprop,” in Neural Networks: Tricks of the Trade, pp. 9–48, Springer, 2012.
[53] N. S. Keskar, D. Mudigere, J. Nocedal, M. Smelyanskiy, and P. T. P. Tang, “On large-batch training for deep learning: Generalization gap and sharp minima,” arXiv preprint arXiv:1609.04836, 2016.
[54] L.-J. Lin, “Self-improving reactive agents based on reinforcement learning, planning and teaching,” Machine Learning, vol. 8, no. 3, pp. 293–321, 1992.
[55] X. Qian and D. Klabjan, “The impact of the mini-batch size on the variance of gradients in stochastic gradient descent,” arXiv preprint arXiv:2004.13146, 2020.
[56] L. Bottou et al., “Online learning and stochastic approximations,” On-Line Learning in Neural Networks, vol. 17, no. 9, p. 142, 1998.