|
[1] J. M. Alvarez, F. Lumbreras, A. M. Lopez, and T. Gevers. Understanding road scenes using visual cues and GPS information. In ECCV Workshops (3), volume 7585 of Lecture Notes in Computer Science, pages 635-638. Springer, 2012.
[2] V. Badrinarayanan, A. Kendall, and R. Cipolla. Segnet: A deep convolutional encoder-decoder architecture for image segmentation. CoRR, 2015.
[3] G. J. Brostow, J. Fauqueur, and R. Cipolla. Semantic object classes in video: A high-definition ground truth database. Pattern Recognition Letters, (2):88-97, 2009.
[4] L. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. Semantic image segmentation with deep convolutional nets and fully connected crfs. CoRR, abs/1412.7062, 2014.
[5] D. Eigen and R. Fergus. Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture. In ICCV, pages 2650-2658, 2015.
[6] A. Ess, T. Mueller, H. Grabner, and L. J. V. Gool. Segmentation-based urban traffic scene understanding. In BMVC, pages 1-11. British Machine Vision Association, 2009.
[7] K. He, X. Zhang, S. Ren, and J. Sun. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In ICCV, pages 1026-1034, 2015.
[8] W. Huang and X. Gong. Fusion based holistic road scene understanding. CoRR, abs/1406.7525, 2014.
[9] X. Huang, C. Shen, X. Boix, and Q. Zhao. SALICON: reducing the semantic gap in saliency prediction by adapting deep neural networks. In ICCV, pages 262-270, 2015.
[10] S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, JMLR Proceedings, pages 448-456, 2015.
[11] K. Jarrett, K. Kavukcuoglu, M. Ranzato, and Y. LeCun. What is the best multistage architecture for object recognition? In ICCV, pages 2146-2153, 2009.
[12] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014.
[13] S. S. Kruthiventi, K. Ayush, and R. V. Babu. Deepfix: A fully convolutional neural network for predicting human eye fixations. CoRR, abs/1510.02927, 2015.
[14] M. Kummerer, L. Theis, and M. Bethge. Deep gaze I: boosting saliency prediction with feature maps trained on imagenet. CoRR, abs/1411.1045, 2014.
[15] N. Liu, J. Han, D. Zhang, S. Wen, and T. Liu. Predicting eye fixations using convolutional neural networks. In CVPR, pages 362-370. IEEE Computer Society, 2015.
[16] J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. In CVPR, pages 3431-3440, 2015.
[17] H. Noh, S. Hong, and B. Han. Learning deconvolution network for semantic segmentation. In ICCV, pages 1520-1528, 2015.
[18] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. CoRR, 2014.
[19] P. Sturgess, K. Alahari, L. Ladicky, and P. H. S. Torr. Combining appearance and structure from motion features for road scene understanding. In BMVC, pages 1-11. British Machine Vision Association, 2009.
[20] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. E. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In CVPR, pages 1-9, 2015.
[21] C. L. Thomas. Opensalicon: An open source implementation of the salicon saliency model. Technical Report TR-2016-02, University of Pittsburgh, 2016.
[22] H. Yang, B. Lin, K. Chang, and C. Chen. Automatic age estimation from face images via deep ranking. In BMVC, pages 55.1-55.11, 2015.
[23] S. Zheng, S. Jayasumana, B. Romera-Paredes, V. Vineet, Z. Su, D. Du, C. Huang, and P. H. S. Torr. Conditional random fields as recurrent neural networks. In ICCV, pages 1529-1537. IEEE Computer Society, 2015.
|