[1] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. A. Riedmiller, A. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, and D. Hassabis. Human-level control through deep reinforcement learning. Nature, vol. 518, no. 7540, pp. 529–533, Feb. 2015.
[2] M. Zhang, Z. McCarthy, C. Finn, S. Levine, and P. Abbeel. Learning deep neural network policies with continuous memory states. In Proc. Int. Conf. Robotics and Automation (ICRA), pages 520–527, May 2016.
[3] R. Houthooft, X. Chen, Y. Duan, J. Schulman, F. De Turck, and P. Abbeel. VIME: Variational information maximizing exploration. In Proc. Advances in Neural Information Processing Systems (NeurIPS), pages 1109–1117, Dec. 2016.
[4] M. G. Bellemare, S. Srinivasan, G. Ostrovski, T. Schaul, D. Saxton, and R. Munos. Unifying count-based exploration and intrinsic motivation. In Proc. Advances in Neural Information Processing Systems (NeurIPS), pages 1471–1479, Dec. 2016.
[5] G. Ostrovski, M. G. Bellemare, A. van den Oord, and R. Munos. Count-based exploration with neural density models. In Proc. Int. Conf. Machine Learning (ICML), pages 2721–2730, Jun. 2017.
[6] B. C. Stadie, S. Levine, and P. Abbeel. Incentivizing exploration in reinforcement learning with deep predictive models. arXiv:1507.00814, Nov. 2015.
[7] D. Pathak, P. Agrawal, A. A. Efros, and T. Darrell. Curiosity-driven exploration by self-supervised prediction. In Proc. Int. Conf. Machine Learning (ICML), pages 2778–2787, May 2017.
[8] Y. Burda, H. Edwards, D. Pathak, A. J. Storkey, T. Darrell, and A. A. Efros. Large-scale study of curiosity-driven learning. In Proc. Int. Conf. Learning Representations (ICLR), May 2019.
[9] Y. Burda, H. Edwards, A. Storkey, and O. Klimov. Exploration by random network distillation. In Proc. Int. Conf. Learning Representations (ICLR), May 2019.
[10] D. P. Kingma and M. Welling. Auto-encoding variational Bayes. arXiv:1312.6114, May 2014.
[11] T. Xue, J. Wu, K. L. Bouman, and B. Freeman. Visual dynamics: Probabilistic future frame synthesis via cross convolutional networks. In Proc. Advances in Neural Information Processing Systems (NeurIPS), pages 91–99, Dec. 2016.
[12] W. Lotter, G. Kreiman, and D. D. Cox. Deep predictive coding networks for video prediction and unsupervised learning. arXiv:1605.08104, Mar. 2017.
[13] S. Greydanus, A. Koul, J. Dodge, and A. Fern. Visualizing and understanding Atari agents. In Proc. Int. Conf. Machine Learning (ICML), pages 1787–1796, Jun. 2018.
[14] M. G. Bellemare, Y. Naddaf, J. Veness, and M. Bowling. The arcade learning environment: An evaluation platform for general agents. J. Artificial Intelligence Research (JAIR), vol. 47, pp. 253–279, May 2013.
[15] M. Wydmuch, M. Kempka, and W. Jaskowski. ViZDoom competitions: Playing Doom from pixels. IEEE Trans. Games, Oct. 2018.
[16] A. Dosovitskiy, P. Fischer, E. Ilg, et al. FlowNet: Learning optical flow with convolutional networks. In Proc. IEEE Int. Conf. Computer Vision (ICCV), pages 2758–2766, May 2015.
[17] E. Ilg, N. Mayer, T. Saikia, M. Keuper, A. Dosovitskiy, and T. Brox. FlowNet 2.0: Evolution of optical flow estimation with deep networks. In Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pages 1647–1655, Dec. 2017.
[18] S. Meister, J. Hur, and S. Roth. UnFlow: Unsupervised learning of optical flow with a bidirectional census loss. In Proc. AAAI Conf. Artificial Intelligence (AAAI), pages 7251–7259, Feb. 2018.
[19] T. Beier and S. Neely. Feature-based image metamorphosis. In Proc. Special Interest Group on Computer Graphics (SIGGRAPH), pages 35–42, Jul. 1992.
[20] J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov. Proximal policy optimization algorithms. arXiv:1707.06347, Jul. 2017.
[21] V. Mnih, A. P. Badia, M. Mirza, A. Graves, T. P. Lillicrap, T. Harley, D. Silver, and K. Kavukcuoglu. Asynchronous methods for deep reinforcement learning. In Proc. Int. Conf. Machine Learning (ICML), pages 1928–1937, Jun. 2016.