|
[1] B. M. Lake, T. D. Ullman, J. B. Tenenbaum, and S. J. Gershman, “Building machines that learn and think like people,” Behavioral and brain sciences, vol. 40, 2017. [2] P. Tsividis, T. Pouncy, J. L. Xu, J. B. Tenenbaum, and S. J. Gershman, “Human learning in atari,” in Proceedings of the AAAI Spring Symposia, 2017. [3] L. Kaiser, M. Babaeizadeh, P. Milos, B. Osinski, R. H. Campbell, K. Czechowski, D. Erhan, C. Finn, P. Kozakowski, S. Levine, et al., “Model-based reinforcemenst learning for atari,” arXiv preprint arXiv:1903.00374, 2019. [4] D. Yarats, A. Zhang, I. Kostrikov, B. Amos, J. Pineau, and R. Fergus, “Improving sample efficiency in model-free reinforcement learning from images,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 10674–10681, 2021. [5] D. Hafner, T. Lillicrap, I. Fischer, R. Villegas, D. Ha, H. Lee, and J. Davidson, “Learning latent dynamics for planning from pixels,” in Proceedings of the International conference on machine learning, pp. 2555–2565, PMLR, 2019. [6] D. Hafner, T. Lillicrap, J. Ba, and M. Norouzi, “Dream to control: Learning behaviors by latent imagination,” in Proceedings of the International Conference on Learning Representations, 2020. [7] A. X. Lee, A. Nagabandi, P. Abbeel, and S. Levine, “Stochastic latent actor-critic: Deep reinforcement learning with a latent variable model,” Advances in Neural Information Processing Systems, vol. 33, pp. 741–752, 2020. [8] M. Laskin, A. Srinivas, and P. Abbeel, “Curl: Contrastive unsupervised representations for reinforcement learning,” Proceedings of the 37th International Conference on Machine Learning, Vienna, Austria, PMLR 119, 2020. arXiv:2004.04136. [9] M. Laskin, K. Lee, A. Stooke, L. Pinto, P. Abbeel, and A. Srinivas, “Reinforcement learning with augmented data,” Advances in neural information processing systems, vol. 33, pp. 19884–19895, 2020. [10] I. Kostrikov, D. Yarats, and R. Fergus, “Image augmentation is all you need: Regularizing deep reinforcement learning from pixels,” arXiv preprint arXiv:2004.13649, 2020. [11] Y. Tassa, Y. Doron, A. Muldal, T. Erez, Y. Li, D. d. L. Casas, D. Budden, A. Abdolmaleki, J. Merel, A. Lefrancq, et al., “Deepmind control suite,” arXiv preprint arXiv:1801.00690, 2018. [12] E. Shelhamer, P. Mahmoudieh, M. Argus, and T. Darrell, “Loss is its own reward: Selfsupervision for reinforcement learning,” arXiv preprint arXiv:1612.07307, 2016 [13] T. Jakab, A. Gupta, H. Bilen, and A. Vedaldi, “Unsupervised learning of object landmarks through conditional image generation,” Advances in neural information processing systems, vol. 31, 2018. [14] Y. Zhang, Y. Guo, Y. Jin, Y. Luo, Z. He, and H. Lee, “Unsupervised discovery of object landmarks as structural representations,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2694–2703, 2018. [15] D. Lorenz, L. Bereska, T. Milbich, and B. Ommer, “Unsupervised part-based disentangling of object shape and appearance,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10955–10964, 2019. [16] T. Jakab, A. Gupta, H. Bilen, and A. Vedaldi, “Self-supervised learning of interpretable keypoints from unlabelled videos,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8787–8797, 2020. [17] J. J. Sun, S. Ryou, R. H. Goldshmid, B. Weissbourd, J. O. Dabiri, D. J. Anderson, A. Kennedy, Y. Yue, and P. Perona, “Self-supervised keypoint discovery in behavioral videos,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2171–2180, 2022. [18] M. H. Graham, “Confronting multicollinearity in ecological multiple regression,” Ecology, vol. 84, no. 11, pp. 2809–2815, 2003. [19] R. Jonschkowski, R. Hafner, J. Scholz, and M. Riedmiller, “Pves: Position-velocity encoders for unsupervised learning of structured state representations,” arXiv preprint arXiv:1705.09805, 2017. [20] W. Shang, X. Wang, A. Srinivas, A. Rajeswaran, Y. Gao, P. Abbeel, and M. Laskin, “Reinforcement learning with latent flow,” Advances in Neural Information Processing Systems, vol. 34, pp. 22171–22183, 2021. [21] R. Boney, A. Ilin, and J. Kannala, “End-to-end learning of keypoint representations for continuous control from images,” arXiv preprint arXiv:2106.07995, 2021. [22] T. Haarnoja, A. Zhou, K. Hartikainen, G. Tucker, S. Ha, J. Tan, V. Kumar, H. Zhu, A. Gupta, P. Abbeel, et al., “Soft actor-critic algorithms and applications,” arXiv preprint arXiv:1812.05905, 2018. [23] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, et al., “Human-level control through deep reinforcement learning,” nature, vol. 518, no. 7540, pp. 529–533, 2015. [24] J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, “Proximal policy optimization algorithms,” arXiv preprint arXiv:1707.06347, 2017. [25] M. Hessel, J. Modayil, H. Van Hasselt, T. Schaul, G. Ostrovski, W. Dabney, D. Horgan, B. Piot, M. Azar, and D. Silver, “Rainbow: Combining improvements in deep reinforcement learning,” in Proceedings of the Thirty-second AAAI conference on artificial intelligence, 2018. [26] A. v. d. Oord, Y. Li, and O. Vinyals, “Representation learning with contrastive predictive coding,” arXiv preprint arXiv:1807.03748, 2018. [27] K. He, H. Fan, Y. Wu, S. Xie, and R. Girshick, “Momentum contrast for unsupervised visual representation learning,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 9729–9738, 2020. [28] T. Chen, S. Kornblith, M. Norouzi, and G. Hinton, “A simple framework for contrastive learning of visual representations,” in Proceedings of the International conference on machine learning, pp. 1597–1607, PMLR, 2020. [29] T. D. Kulkarni, A. Gupta, C. Ionescu, S. Borgeaud, M. Reynolds, A. Zisserman, and V. Mnih, “Unsupervised learning of object keypoints for perception and control,” Advances in neural information processing systems, vol. 32, 2019. [30] B. Chen, P. Abbeel, and D. Pathak, “Unsupervised learning of visual 3d keypoints for control,” in Proceedings of the International Conference on Machine Learning, pp. 1539– 1549, PMLR, 2021. [31] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE transactions on image processing, vol. 13, no. 4, pp. 600–612, 2004.
|