[1] Y. Kim. 2014. Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1746–1751.
[2] Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, et al. 2016. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144.
[3] Karl Moritz Hermann, Tomas Kocisky, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, and Phil Blunsom. 2015. Teaching machines to read and comprehend. In Advances in Neural Information Processing Systems, 1693–1701.
[4] Ting-Hao K. Huang, Francis Ferraro, Nasrin Mostafazadeh, Ishan Misra, Jacob Devlin, Aishwarya Agrawal, Ross Girshick, Xiaodong He, Pushmeet Kohli, Dhruv Batra, et al. 2016. Visual storytelling. In the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2016).
[5] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. 2014. Microsoft COCO: Common objects in context. In European Conference on Computer Vision, 740–755. Springer.
[6] L. Yu, M. Bansal, and T. Berg. 2017. Hierarchically attentive RNN for album summarization and storytelling. In EMNLP.
[7] Q. Huang, Z. Gan, A. Celikyilmaz, D. Wu, J. Wang, and X. He. 2019. Hierarchically structured reinforcement learning for topically coherent visual story generation. In AAAI.
[8] X. Wang, W. Chen, Y.-F. Wang, and W. Y. Wang. 2018. No metrics are perfect: Adversarial reward learning for visual storytelling. In ACL.
[9] Cesc Chunseong Park and Gunhee Kim. 2015. Expressing an image stream with a sequence of natural sentences. In Advances in Neural Information Processing Systems, 73–81.
[10] Yunjae Jung, Dahun Kim, Sanghyun Woo, Kyungsu Kim, Sungjin Kim, and In So Kweon. 2020. Hide-and-Tell: Learning to bridge photo streams for visual storytelling. In AAAI.
[11] Junjie Hu, Yu Cheng, Zhe Gan, Jingjing Liu, Jianfeng Gao, and Graham Neubig. 2020. What makes a good story? Designing composite rewards for visual storytelling. In AAAI.
[12] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 311–318.
[13] Xin Wang, Wenhu Chen, Jiawei Wu, Yuan-Fang Wang, and William Yang Wang. 2018. Video captioning via hierarchical reinforcement learning. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Marc'Aurelio Ranzato, Sumit Chopra, Michael Auli, and Wojciech Zaremba. 2015. Sequence level training with recurrent neural networks. arXiv preprint arXiv:1511.06732.
[15] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. 2014. Generative adversarial nets. In Advances in Neural Information Processing Systems.
[16] L. Yu, W. Zhang, J. Wang, and Y. Yu. 2017. SeqGAN: Sequence generative adversarial nets with policy gradient. In AAAI.
[17] Chen Chen, Shuai Mu, Wanpeng Xiao, Zexiong Ye, Liesi Wu, and Qi Ju. 2019. Improving image captioning with conditional generative adversarial nets. In AAAI.
[18] K. He, X. Zhang, S. Ren, and J. Sun. 2016. Deep residual learning for image recognition. In CVPR.
[19] O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. 2015. Show and tell: A neural image caption generator. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3156–3164.
[20] S. Yan, F. Wu, J. S. Smith, W. Lu, and B. Zhang. 2018. Image captioning using adversarial networks and reinforcement learning. In 2018 24th International Conference on Pattern Recognition (ICPR), 248–253. doi: 10.1109/ICPR.2018.8545049.
[21] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In NAACL.