
Detailed Record

Author (Chinese): 郭源哲
Author (English): Kuo, Yuan-Jhe
Title (Chinese): 提升穩健領域內與領域外泛化能力:利用原型對齊與協作注意力模組之對比式學習
Title (English): Towards Robust In-Domain and Out-of-Domain Generalization: Contrastive Learning with Prototype Alignment and Collaborative Attention
Advisor (Chinese): 許秋婷
Advisor (English): Hsu, Chiou-Ting
Committee Members (Chinese): 王聖智、陳煥宗
Committee Members (English): Wang, Sheng-Jyh; Chen, Hwann-Tzong
Degree: Master's
Institution: National Tsing Hua University (國立清華大學)
Department: Department of Computer Science (資訊工程學系)
Student ID: 109062507
Publication Year: 2022 (ROC year 111)
Graduation Academic Year (ROC): 111
Language: English
Pages: 26
Keywords (Chinese): 領域泛化、對比式學習、嘈雜標籤、度量學習
Keywords (English): Domain generalization; Contrastive learning; Noisy labels; Metric learning
Domain generalization aims to generalize a model learned from multiple source domains to unseen target domains. Assuming that the target domains are distributed differently from the source domains, most previous methods address out-of-domain generalization but pay little attention to in-domain performance on the source domains. Because the target domains are unseen and may be distributed similarly to the source domains, we believe that in-domain and out-of-domain performance are equally important. Model robustness is a further concern when the source domains contain inconsistent or noisy ground-truth labels. In this thesis, we therefore propose a contrastive learning framework with prototype alignment and collaborative attention to address robust in-domain and out-of-domain generalization for image classification. We first design a margin-based contrastive learning scheme that improves out-of-domain performance by pushing ambiguous classes at least a margin apart. Next, we propose prototype alignment, which supports in-domain performance by aligning the latent feature representation of each class to the corresponding class prototype. Finally, we propose a novel collaborative attention module that leverages the strengths of both positive and negative learning to enhance model robustness. Experimental results on two benchmarks show that our method achieves competitive in-domain performance and outperforms previous methods in the out-of-domain and noisy-label scenarios.
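
To make the margin idea concrete, the following is a minimal PyTorch sketch of a generic pairwise margin-based contrastive loss (in the style of Hadsell et al.); the function name, the pairwise form, and the default margin are illustrative assumptions, not the thesis's exact formulation.

import torch
import torch.nn.functional as F

def margin_contrastive_loss(z_a, z_b, same_class, margin=1.0):
    # z_a, z_b: (B, D) feature pairs; same_class: (B,) floats in {0, 1}.
    d = F.pairwise_distance(z_a, z_b)                      # Euclidean distance per pair
    pull = same_class * d.pow(2)                           # pull same-class pairs together
    push = (1.0 - same_class) * F.relu(margin - d).pow(2)  # penalize different-class pairs closer than `margin`
    return (pull + push).mean()

Same-class pairs are attracted while different-class pairs incur loss only when closer than the margin, which is one way to keep ambiguous classes at least a margin apart.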
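Prototype alignment can be sketched similarly: maintain one prototype per class and pull each sample's feature toward its class prototype. The exponential-moving-average update and squared-distance loss below are assumptions for illustration, not the thesis's published equations.

import torch

class PrototypeAlignment(torch.nn.Module):
    # Hypothetical sketch: one EMA prototype per class; the loss pulls features toward it.
    def __init__(self, num_classes, feat_dim, momentum=0.9):
        super().__init__()
        self.register_buffer("prototypes", torch.zeros(num_classes, feat_dim))
        self.momentum = momentum

    def forward(self, feats, labels):
        # feats: (B, D) features; labels: (B,) long tensor of class indices.
        with torch.no_grad():  # prototypes are updated, not trained through
            for c in labels.unique():
                batch_mean = feats[labels == c].mean(dim=0)
                self.prototypes[c].mul_(self.momentum).add_((1 - self.momentum) * batch_mean)
        # squared distance between each feature and its (detached) class prototype
        return (feats - self.prototypes[labels]).pow(2).sum(dim=1).mean()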
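For the negative-learning side of collaborative attention, a generic sketch of an NLNL-style complementary-label loss is shown below; the uniform sampling scheme is an assumption, and how the thesis couples this term with positive learning and the attention module is not reproduced here.

import torch
import torch.nn.functional as F

def negative_learning_loss(logits, labels):
    # Sample one complementary label per example, uniformly over classes other than
    # the given (possibly noisy) label, and penalize confidence on it: -log(1 - p).
    num_classes = logits.size(1)
    probs = F.softmax(logits, dim=1)
    comp = torch.randint(0, num_classes - 1, labels.shape, device=labels.device)
    comp = comp + (comp >= labels).long()                  # shift to skip the given label
    p_comp = probs.gather(1, comp.unsqueeze(1)).squeeze(1)
    return -torch.log((1.0 - p_comp).clamp_min(1e-6)).mean()

Because the complementary label is almost surely correct ("this image is not class k"), this term remains informative even when the given labels are noisy.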
Contents
Abstract (Chinese) i
Abstract ii
Acknowledgements
1 Introduction 1
2 Related Work 4
2.1 Domain Generalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2 Contrastive Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.3 Noisy-Label Representation Learning . . . . . . . . . . . . . . . . . . . . . . 6
3 Method 7
3.1 Margin-Based Contrastive Learning . . . . . . . . . . . . . . . . . . . . . . . 7
3.2 Prototype Alignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.3 Collaborative Attention . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.4 Total Loss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
4 Experiments 15
4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4.2 Datasets and Evaluation Metrics . . . . . . . . . . . . . . . . . . . . . . . . . 15
4.2.1 Implementation Details . . . . . . . . . . . . . . . . . . . . . . . . . . 16
4.3 Ablation Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4.3.1 Effectiveness of Collaborative Attention . . . . . . . . . . . . . . . . . 17
4.3.2 Effectiveness of Margin-Based Contrastive Learning . . . . . . . . . . 18
4.3.3 Effectiveness of Prototype Alignment . . . . . . . . . . . . . . . . . . 18
4.3.4 Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.4 Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.4.1 In-Domain Performance . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.4.2 Out-of-Domain Performance . . . . . . . . . . . . . . . . . . . . . . . 20
4.4.3 Model Robustness . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
5 Conclusion 22
References 23