[1] A. McGovern, R. S. Sutton, and A. H. Fagg. Roles of macro-actions in accelerating reinforcement learning. In Proc. Grace Hopper Celebration of Women in Computing, volume 1317, 1997.
[2] A. McGovern and R. S. Sutton. Macro-actions in reinforcement learning: An empirical analysis. Computer Science Department Faculty Publication Series, page 15, 1998.
[3] S. Xu, H. Y. Kuang, Z. Q. Zhuang, R. J. Hu, Y. Liu, and H. Y. Sun. Macro action selection with deep reinforcement learning in StarCraft, Dec. 2018.
[4] H. Onda and S. Ozawa. A reinforcement learning model using macro-actions in multi-task grid-world problems. In Proc. Int. Conf. Systems, Man and Cybernetics (SMC), pages 3088–3093, Oct. 2009.
[5] A. Braylan, M. Hollenbeck, E. Meyerson, and R. Miikkulainen. Frame skip is a powerful parameter for learning to play Atari. In Proc. Association for the Advancement of Artificial Intelligence (AAAI) Conf. Workshop, Jan. 2015.
[6] T. Yoshikawa and M. Kurihara. An acquiring method of macro-actions in reinforcement learning. In Proc. IEEE Int. Conf. Systems, Man, and Cybernetics (SMC), pages 4813–4817, Nov. 2006.
[7] M. A. H. Newton, J. Levine, M. Fox, and D. Long. Learning macro-actions for arbitrary planners and domains. In Proc. Int. Conf. Automated Planning and Scheduling (ICAPS), pages 256–263, Sep. 2007.
[8] I. P. Durugkar, C. Rosenbaum, S. Dernbach, and S. Mahadevan. Deep reinforcement learning with macro-actions. arXiv:1606.04615, Jun. 2016.
[9] Z. Zhao, Y. Liang, and X. Jin. Handling large-scale action space in deep Q network. In Proc. Int. Conf. Artificial Intelligence and Big Data (ICAIBD), pages 93–96, 2018.
[10] B. Baker, O. Gupta, N. Naik, and R. Raskar. Designing neural network architectures using reinforcement learning. In Proc. Int. Conf. Learning Representations (ICLR), Apr. 2017.
[11] B. Zoph and Q. V. Le. Neural architecture search with reinforcement learning. In Proc. Int. Conf. Learning Representations (ICLR), Apr. 2017.
[12] H. Liu, K. Simonyan, and Y. Yang. DARTS: Differentiable architecture search. arXiv:1806.09055, 2018.
[13] M. G. Bellemare, Y. Naddaf, J. Veness, and M. H. Bowling. The arcade learning environment: An evaluation platform for general agents. J. Artificial Intelligence Research (JAIR), 2013.
[14] R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, USA, 1st edition, 1998.
[15] R. S. Sutton, D. Precup, and S. Singh. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112(1-2):181–211, Aug. 1999.
[16] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, et al. Human-level control through deep reinforcement learning. Nature, 518:529–533, Feb. 2015.
[17] J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov. Proximal policy optimization algorithms. arXiv:1707.06347, 2017.
[18] A. Hill, A. Raffin, M. Ernestus, A. Gleave, R. Traore, et al. Stable Baselines. https://github.com/hill-a/stable-baselines, 2018.
[19] R. E. Korf. Macro-operators: A weak method for learning. Artificial Intelligence, 26(1):35–77, 1985.
[20] A. Botea, M. Enzenberger, M. Müller, and J. Schaeffer. Macro-FF: Improving AI planning with automatically learned macro-operators. J. Artificial Intelligence Research (JAIR), 24:581–621, Oct. 2005.
[21] M. Asai and A. Fukunaga. Solving large-scale planning problems by decomposition and macro generation. In Proc. Int. Conf. Automated Planning and Scheduling (ICAPS), pages 16–24, 2015.
[22] L. Chrpa and M. Vallati. Improving domain-independent planning via critical section macro-operators. In Proc. AAAI Conf. Artificial Intelligence, volume 33, pages 7546–7553, 2019.
[23] A. Botea, M. Müller, and J. Schaeffer. Learning partial-order macros from solutions. In Proc. Int. Conf. Automated Planning and Scheduling (ICAPS), pages 231–240, 2005.
[24] A. Botea, M. Müller, J. Schaeffer, et al. Fast planning with iterative macros. In Proc. Int. Joint Conf. Artificial Intelligence (IJCAI), pages 1828–1833, 2007.
[25] J. Randløv. Learning macro-actions in reinforcement learning. In Proc. Advances in Neural Information Processing Systems (NeurIPS), pages 1045–1051, Dec. 1999.
[26] A. Vezhnevets, V. Mnih, S. Osindero, A. Graves, O. Vinyals, J. Agapiou, et al. Strategic attentive writer for learning macro-actions. In Proc. Advances in Neural Information Processing Systems (NeurIPS), pages 3486–3494, Dec. 2016.
[27] A. S. Lakshminarayanan, S. Sharma, and B. Ravindran. Dynamic action repetition for deep reinforcement learning. In Proc. Thirty-First AAAI Conf. Artificial Intelligence (AAAI-17), Feb. 2017.
[28] S. Sharma, A. S. Lakshminarayanan, and B. Ravindran. Learning to repeat: Fine grained action repetition for deep reinforcement learning. In Proc. Int. Conf. Learning Representations (ICLR), Apr.–May 2017.
[29] H. Kim, M. Yamada, K. Miyoshi, and H. Yamakawa. Macro action reinforcement learning with sequence disentanglement using variational autoencoder. arXiv:1903.09366, May 2019.
[30] A. I. Coles and A. J. Smith. Marvin: A heuristic search planner with online macro-action learning. J. Artificial Intelligence Research (JAIR), 28:119–156, Jan. 2007.
[31] M. Newton, J. Levine, and M. Fox. Genetically evolved macro-actions in AI planning problems. In Proc. UK Planning and Scheduling Special Interest Group (PlanSIG) Workshop, pages 163–172, Jan. 2005.
[32] A. Dulac, D. Pellier, H. Fiorino, and D. Janiszek. Learning useful macro-actions for planning with n-grams. In Proc. Int. Conf. Tools with Artificial Intelligence (ICTAI), pages 803–810, Nov. 2013.
[33] M. Bellemare, S. Srinivasan, G. Ostrovski, T. Schaul, D. Saxton, and R. Munos. Unifying count-based exploration and intrinsic motivation. In Proc. Advances in Neural Information Processing Systems (NeurIPS), pages 1471–1479, 2016.