[1] Ashvin Nair, Bob McGrew, Marcin Andrychowicz, Wojciech Zaremba, and Pieter Abbeel, “Overcoming Exploration in Reinforcement Learning with Demonstrations”, in IEEE International Conference on Robotics and Automation (ICRA), 2018
[2] Vinicius G. Goecks, Gregory M. Gremillion, Vernon J. Lawhern, John Valasek, and Nicholas R. Waytowich, “Integrating Behavior Cloning and Reinforcement Learning for Improved Performance in Dense and Sparse Reward Environments”, in International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2020
[3] Richard Li, Allan Jabri, Trevor Darrell, and Pulkit Agrawal, “Towards Practical Multi-Object Manipulation Using Relational Reinforcement Learning”, in IEEE International Conference on Robotics and Automation (ICRA), 2020
[4] Ivaylo Popov, Nicolas Heess, Timothy Lillicrap, Roland Hafner, Gabriel Barth-Maron, Matej Vecerik, Thomas Lampe, Yuval Tassa, Tom Erez, and Martin Riedmiller, “Data-efficient Deep Reinforcement Learning for Dexterous Manipulation”, in International Conference on Learning Representations (ICLR), 2018
[5] Rodrigo Toro Icarte, Toryn Q. Klassen, Richard Valenzano, and Sheila A. McIlraith, “Using Reward Machines for High-Level Task Specification and Decomposition in Reinforcement Learning”, in International Conference on Machine Learning (ICML), 2018
[6] Rodrigo Toro Icarte, Toryn Q. Klassen, Richard Valenzano, and Sheila A. McIlraith, “Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning”, in arXiv:2010.03950, 2020
[7] Andrej Karpathy, “REINFORCEjs: WaterWorld demo”, http://cs.stanford.edu/people/karpathy/reinforcejs/waterworld.html, 2015
[8] Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba, “OpenAI Gym”, 2016
[9] Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra, “Continuous Control with Deep Reinforcement Learning”, in arXiv:1509.02971, 2015
[10] Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, and Sergey Levine, “Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor”, in International Conference on Machine Learning (ICML), 2018
[11] Olivier Michel, “Cyberbotics Ltd. Webots™: Professional Mobile Robot Simulation”, in International Journal of Advanced Robotic Systems, 2004
[12] Jette Randløv and Preben Alstrøm, “Learning to Drive a Bicycle Using Reinforcement Learning and Shaping”, in International Conference on Machine Learning (ICML), 1998