Detailed Record

Author (Chinese): 張晉豪
Author (English): Chang, Chin-Hao
Thesis Title (Chinese): 用於領域泛化之具有輔助語義學習的特徵空間中之風格替換
Thesis Title (English): Style Replacement in Feature Space with Auxiliary Semantic Learning for Domain Generalization
Advisor (Chinese): 許秋婷
Advisor (English): Hsu, Chiou-Ting
Committee Members (Chinese): 王聖智, 邵皓強
Committee Members (English): Wang, Sheng-Jyh; Shao, Hao-Chiang
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 108062614
Publication Year (ROC calendar): 110 (2021)
Graduation Academic Year: 109 (2020–2021)
Language: English
Number of Pages: 28
Keywords (Chinese): 領域泛化, 風格替換, 風格轉換, 自監督學習
Keywords (English): Domain generalization, Style replacement, Style transfer, Self-supervised learning
Abstract (Chinese): The generalization ability of neural network models when handling new data is an important issue in real-world applications. Domain generalization aims to use data from multiple distinct source domains to train a model that can generalize directly to any unknown target domain that is unavailable during training. In this thesis, we focus on domain generalization for the image classification task and propose an end-to-end and distinctive style replacement and semantic learning framework (called SRNet) built on two main ideas. The first is a novel style replacement method that encourages our network to extract style-invariant features. In addition, to further promote semantic feature learning, our method incorporates an auxiliary self-supervised task that predicts the transformation type of a transformed image. By combining style replacement with the auxiliary image transformation prediction task, we train a model that transfers cross-domain knowledge to unknown target domains by classifying images according to their high-level semantic features or global object shapes. Experimental results on two domain generalization benchmarks, PACS and VLCS, show that the proposed method is effective and performs better than previous methods.
Abstract (English): The generalization ability of deep neural network models is a crucial issue in real-world applications when dealing with new data. Domain generalization relies on data from multiple source domains to learn a model that generalizes well to any unknown target domain unavailable during training. In this thesis, we focus on domain generalization for the image recognition task and propose an end-to-end, multi-task learning framework (called SRNet) with two main ideas. First, we propose a novel style replacement method to encourage our model to extract style-invariant features. Second, to further boost semantic feature learning, we include an auxiliary task that predicts the transformation type of a transformed image in a self-supervised way. By combining the style replacement method with the auxiliary image transformation prediction task, we train a model that transfers cross-domain knowledge to unknown target domains by classifying images according to their high-level semantic features or global object shapes. Experimental results on two domain generalization benchmarks, PACS and VLCS, demonstrate that our proposed method is effective and attains superior performance over previous methods.
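To make the two ideas in the abstract concrete, the sketch below pairs a feature-statistics swap (in the spirit of AdaIN-style feature stylization) with an auxiliary head that predicts which transformation was applied to the input. The abstract does not specify SRNet's architecture, transformation set, or loss weighting, so the channel-wise mean/std replacement, the rotation-based auxiliary task, and the weight `lambda_aux` below are illustrative assumptions, not the authors' implementation.

```python
# Minimal PyTorch sketch of style replacement in feature space plus an
# auxiliary self-supervised transformation-prediction task (assumed details).
import torch
import torch.nn as nn
import torch.nn.functional as F


def replace_style(content_feat, style_feat, eps=1e-5):
    """Replace the channel-wise mean/std of `content_feat` with those of
    `style_feat`, so later layers see the same content under another style."""
    b, c = content_feat.shape[:2]
    c_mean = content_feat.view(b, c, -1).mean(dim=2).view(b, c, 1, 1)
    c_std = content_feat.view(b, c, -1).std(dim=2).view(b, c, 1, 1) + eps
    s_mean = style_feat.view(b, c, -1).mean(dim=2).view(b, c, 1, 1)
    s_std = style_feat.view(b, c, -1).std(dim=2).view(b, c, 1, 1) + eps
    return (content_feat - c_mean) / c_std * s_std + s_mean


class SRNetSketch(nn.Module):
    """Shared feature extractor with a class head and an auxiliary head that
    predicts which transformation (here: one of four rotations) was applied."""

    def __init__(self, backbone, feat_dim, num_classes, num_transforms=4):
        super().__init__()
        self.backbone = backbone                      # e.g. a truncated ResNet
        self.cls_head = nn.Linear(feat_dim, num_classes)
        self.aux_head = nn.Linear(feat_dim, num_transforms)

    def forward(self, x, style_x=None):
        feat = self.backbone(x)                       # B x C x H x W
        if style_x is not None:                       # swap style statistics with
            style_feat = self.backbone(style_x)       # samples from another domain
            feat = replace_style(feat, style_feat)
        pooled = F.adaptive_avg_pool2d(feat, 1).flatten(1)
        return self.cls_head(pooled), self.aux_head(pooled)


def training_step(model, x, y, style_x, lambda_aux=1.0):
    """One joint step: classify style-replaced features and, self-supervised,
    predict the rotation applied to the input."""
    k = torch.randint(0, 4, (1,)).item()              # sample a transformation type
    x_rot = torch.rot90(x, k, dims=(2, 3))
    cls_logits, _ = model(x, style_x)                 # classification on replaced style
    _, aux_logits = model(x_rot)                      # transformation prediction
    rot_target = torch.full((x.size(0),), k, dtype=torch.long, device=x.device)
    return F.cross_entropy(cls_logits, y) + lambda_aux * F.cross_entropy(aux_logits, rot_target)
```

In this sketch `style_x` simply stands in for a batch drawn from a different source domain; a full implementation would likely apply the replacement at an intermediate backbone layer and pair samples across domains during batching.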
Table of Contents
Acknowledgements  I
Chinese Abstract  II
Abstract  III
1 Introduction  2
2 Related Work  6
2.1 Domain Generalization  6
2.2 Style Transfer and Data Augmentation  7
2.3 Self-Supervised Learning  8
3 Proposed Method  9
3.1 Problem Statement and Motivations  9
3.2 The Architecture of Proposed Network  10
3.3 Style Replacement during Feature Extraction  11
3.4 Semantic Feature Learning with Auxiliary Task  14
4 Experiments  16
4.1 Datasets and Experimental Settings  16
4.2 Implementation Details  17
4.3 Ablation Study  18
4.4 Results on PACS Dataset  20
4.5 Results on VLCS Dataset  21
4.6 Discussion  22
5 Conclusion  24
References  25