[1] M. Everingham et al., “The PASCAL visual object classes challenge: A retrospective,” Int. J. Computer Vision, vol. 111, no. 1, pp. 98-136, Jan. 2015.
[2] M. Cordts et al., “The Cityscapes dataset for semantic urban scene understanding,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 3213-3223, Jun. 2016.
[3] B. Zhou et al., “Scene parsing through ADE20K dataset,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 5122-5130, Jul. 2017.
[4] T.-Y. Lin et al., “Microsoft COCO: Common objects in context,” in Proc. European Conf. Computer Vision (ECCV), pp. 740-755, Sep. 2014.
[5] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, “Semantic image segmentation with deep convolutional nets and fully connected CRFs,” in Proc. Int. Conf. Learning Representations (ICLR), May 2015.
[6] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, “DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs,” IEEE Trans. Pattern Analysis and Machine Intelligence (TPAMI), Apr. 2017.
[7] L.-C. Chen, G. Papandreou, F. Schroff, and H. Adam, “Rethinking atrous convolution for semantic image segmentation,” arXiv:1706.05587, Aug. 2017.
[8] H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia, “Pyramid scene parsing network,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 6230-6239, Jul. 2017.
[9] Z. Wu, C. Shen, and A. van den Hengel, “Wider or deeper: Revisiting the ResNet model for visual recognition,” arXiv:1611.10080, Nov. 2016.
[10] G. Lin, A. Milan, C. Shen, and I. Reid, “RefineNet: Multi-path refinement networks for high-resolution semantic segmentation,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 5168-5177, Jul. 2017.
[11] P. Wang et al., “Understanding convolution for semantic segmentation,” arXiv:1702.08502, Feb. 2017.
[12] G. Ghiasi and C. C. Fowlkes, “Laplacian pyramid reconstruction and refinement for semantic segmentation,” in Proc. European Conf. Computer Vision (ECCV), pp. 519-534, Oct. 2016.
[13] A. Dosovitskiy, G. Ros, F. Codevilla, A. López, and V. Koltun, “CARLA: An open urban driving simulator,” in Proc. Conf. on Robot Learning (CoRL), pp. 445-461, Nov. 2017.
[14] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Proc. Neural Information Processing Systems (NIPS), pp. 1097-1105, Dec. 2012.
[15] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in Proc. Int. Conf. Learning Representations (ICLR), May 2015.
[16] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 770-778, Jun. 2016.
[17] C. Szegedy et al., “Going deeper with convolutions,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 1-9, Jun. 2015.
[18] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi, “Inception-v4, Inception-ResNet and the impact of residual connections on learning,” in Proc. Association for the Advancement of Artificial Intelligence (AAAI), pp. 4278-4284, Feb. 2017.
[19] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 3431-3440, Jun. 2015.
[20] K. He, X. Zhang, S. Ren, and J. Sun, “Spatial pyramid pooling in deep convolutional networks for visual recognition,” IEEE Trans. Pattern Analysis and Machine Intelligence (TPAMI), vol. 37, no. 9, pp. 1904-1916, Sep. 2015.
[21] F. Yu and V. Koltun, “Multi-scale context aggregation by dilated convolutions,” in Proc. Int. Conf. Learning Representations (ICLR), May 2016.
[22] G. Lin, C. Shen, A. van den Hengel, and I. Reid, “Efficient piecewise training of deep structured models for semantic segmentation,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 3194-3203, Jun. 2016.
[23] H. Zhao, X. Qi, X. Shen, J. Shi, and J. Jia, “ICNet for real-time semantic segmentation on high-resolution images,” arXiv:1704.08545, Apr. 2017.
[24] L.-C. Chen, Y. Yang, J. Wang, W. Xu, and A. L. Yuille, “Attention to scale: Scale-aware semantic image segmentation,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 3640-3649, Jun. 2016.
[25] T. Pohlen, A. Hermans, M. Mathias, and B. Leibe, “Full-resolution residual networks for semantic segmentation in street scenes,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 3309-3318, Jul. 2017.
[26] S. Zagoruyko et al., “A multipath network for object detection,” arXiv:1604.02135, Aug. 2016.
[27] S. Zheng et al., “Conditional random fields as recurrent neural networks,” in Proc. IEEE Int. Conf. Computer Vision (ICCV), pp. 1529-1537, Dec. 2015.
[28] P. Krähenbühl and V. Koltun, “Efficient inference in fully connected CRFs with Gaussian edge potentials,” in Proc. Neural Information Processing Systems (NIPS), pp. 109-117, Dec. 2011.
[29] E. Shelhamer, K. Rakelly, J. Hoffman, and T. Darrell, “Clockwork convnets for video semantic segmentation,” in Proc. European Conf. Computer Vision (ECCV) Workshops, pp. 852-868, Oct. 2016.
[30] X. Zhu, Y. Xiong, J. Dai, L. Yuan, and Y. Wei, “Deep feature flow for video recognition,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 4141-4150, Jul. 2017.
[31] L. Wiskott and T. J. Sejnowski, “Slow feature analysis: Unsupervised learning of invariances,” Neural Computation, vol. 14, no. 4, pp. 715-770, Apr. 2002.
[32] D. Jayaraman and K. Grauman, “Slow and steady feature analysis: Higher order temporal coherence in video,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 3852-3861, Jun. 2016.
[33] B. K. P. Horn and B. G. Schunck, “Determining optical flow,” Artificial Intelligence, vol. 17, no. 1-3, pp. 185-203, Aug. 1981.
[34] A. Dosovitskiy et al., “FlowNet: Learning optical flow with convolutional networks,” in Proc. IEEE Int. Conf. Computer Vision (ICCV), pp. 2758-2766, Dec. 2015.
[35] E. Ilg et al., “FlowNet 2.0: Evolution of optical flow estimation with deep networks,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 1647-1655, Jul. 2017.
[36] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” in Proc. Machine Learning Research (PMLR), vol. 37, pp. 448-456, Jul. 2015.
[37] M. D. Zeiler, D. Krishnan, G. W. Taylor, and R. Fergus, “Deconvolutional networks,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 2528-2535, Jun. 2010.
[38] J. Weickert, A. Bruhn, T. Brox, and N. Papenberg, “A survey on variational optic flow methods for small displacements,” Mathematical Models for Registration and Applications to Medical Imaging, pp. 103-136, Oct. 2006.
[39] T. Brox and J. Malik, “Large displacement optical flow: Descriptor matching in variational motion estimation,” IEEE Trans. Pattern Analysis and Machine Intelligence (TPAMI), vol. 33, no. 3, pp. 500-513, May 2011.
[40] P. Weinzaepfel, J. Revaud, Z. Harchaoui, and C. Schmid, “DeepFlow: Large displacement optical flow with deep matching,” in Proc. IEEE Int. Conf. Computer Vision (ICCV), pp. 1385-1392, Dec. 2013.
[41] C. Bailer, B. Taetz, and D. Stricker, “Flow fields: Dense correspondence fields for highly accurate large displacement optical flow estimation,” in Proc. IEEE Int. Conf. Computer Vision (ICCV), pp. 4015-4023, Dec. 2015.
[42] J. Wulff and M. J. Black, “Efficient sparse-to-dense optical flow estimation using a learned basis and layers,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 120-130, Jun. 2015.
[43] A. Ranjan and M. J. Black, “Optical flow estimation using a spatial pyramid network,” arXiv:1611.00850, Nov. 2016.
[44] E. L. Denton, S. Chintala, R. Fergus, et al., “Deep generative image models using a Laplacian pyramid of adversarial networks,” in Proc. Neural Information Processing Systems (NIPS), pp. 1486-1494, Dec. 2015.
[45] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in Proc. Int. Conf. Learning Representations (ICLR), May 2015.
[46] M. D. Zeiler and R. Fergus, “Visualizing and understanding convolutional networks,” in Proc. European Conf. Computer Vision (ECCV), pp. 818-833, Sep. 2014.