
Detailed Record

Author (Chinese): 張睿
Author (English): Chang, Jui
Title (Chinese): 趨向目標:自我正規化漸進式學習於語義分割之無監督領域適應
Title (English): Towards the Target: Self-Regularized Progressive Learning for Unsupervised Domain Adaptation on Semantic Segmentation
Advisor (Chinese): 許秋婷
Advisor (English): Hsu, Chiou-Ting
Committee Members (Chinese): 林嘉文、林彥宇
Committee Members (English): Lin, Chia-Wen; Lin, Yen-Yu
Degree: Master's
Institution: National Tsing Hua University
Department: Computer Science
Student ID: 108062584
Year of Publication (ROC calendar): 110 (2021)
Academic Year of Graduation: 109
Language: English
Number of Pages: 25
Keywords (Chinese): 無監督領域適應、語義分割、漸進式學習、標註放寬、類別不平衡
Keywords (English): Unsupervised Domain Adaptation, Semantic Segmentation, Progressive Learning, Label Relaxation, Class Imbalance
Abstract: Unsupervised domain adaptation for semantic segmentation aims to transfer the knowledge learned from a labeled synthetic source domain to an unlabeled real-world target domain. The labeled data from the source domain capture rich semantics and are indispensable to unsupervised domain adaptation. However, because of the difference between the two domains, i.e., the so-called "domain gap", fitting strictly to the source data usually hinders performance on the target domain. Some recent efforts explore the domain-specific semantics by conducting within-domain adaptation using the predicted pseudo labels of the target data. The quality of the pseudo labels is therefore essential to the within-domain adaptation. In this thesis, we propose a unified framework to progressively facilitate the adaptation towards the target domain. First, we conduct the cross-domain adaptation through a novel source label relaxation, in which we relax the source-domain learning to facilitate generalizing the learned knowledge towards the target domain. Next, in the within-domain adaptation stage, we propose a dual-level self-regularization to regularize the pseudo-label learning and also to tackle the class-imbalance issue. The experimental results on two benchmarks, i.e., GTA5→Cityscapes and SYNTHIA→Cityscapes, show considerable improvement over a strong baseline and demonstrate the superiority of our framework over other methods.
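The within-domain stage described in the abstract relies on pseudo labels predicted for the unlabeled target images, and the abstract notes that pseudo-label quality and class imbalance are central concerns. As a hedged illustration only (the thesis's actual dual-level self-regularization is not reproduced here), the sketch below shows one common class-balanced way to generate pseudo labels for self-training: each class receives its own confidence threshold, so that rare classes are not drowned out by dominant ones. The function name, the `keep_fraction` scheme, and the use of NumPy arrays are all illustrative assumptions.

```python
import numpy as np

def class_balanced_pseudo_labels(probs, keep_fraction=0.5, ignore_index=255):
    """Generate pseudo labels from softmax probabilities of shape (C, H, W).

    For each class, keep only the top `keep_fraction` most confident pixels
    predicted as that class; all other pixels are marked `ignore_index` and
    excluded from the self-training loss. Per-class thresholding is one
    common remedy for class imbalance (illustrative, not the thesis's
    exact method).
    """
    num_classes, h, w = probs.shape
    pred = probs.argmax(axis=0)      # hard prediction per pixel
    conf = probs.max(axis=0)         # confidence of that prediction
    pseudo = np.full((h, w), ignore_index, dtype=np.int64)
    for c in range(num_classes):
        mask = pred == c
        if not mask.any():
            continue
        # Per-class threshold: the (1 - keep_fraction) quantile of the
        # confidences among pixels predicted as class c.
        thr = np.quantile(conf[mask], 1.0 - keep_fraction)
        pseudo[mask & (conf >= thr)] = c
    return pseudo
```

In a self-training loop, the resulting map would typically be used as the target of a cross-entropy loss that skips `ignore_index` pixels; low-confidence regions are simply left unsupervised in that round.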
Abstract (in Chinese) ii
Abstract iii
Acknowledgements iv
1 Introduction 1
2 Related Work 4
2.1 Adversarial Cross-Domain Adaptation 4
2.2 Self-Supervised Cross-Domain Adaptation 5
2.3 Within-Domain Adaptation 6
3 Proposed Method 7
3.1 Baseline for Cross-Domain Adaptation 8
3.2 Source Label Relaxation for Cross-Domain Adaptation 9
3.2.1 Pixel-Level Relaxation 9
3.2.2 Patch-Level Relaxation 9
3.3 Dual-Level Self-Regularization for Within-Domain Adaptation 10
3.3.1 Pixel-Level Self-Regularization 11
3.3.2 Image-Level Self-Regularization 11
3.4 Objectives and Optimization 12
4 Experiments 13
4.1 Datasets and Evaluation Metrics 13
4.2 Implementation Details 14
4.2.1 Data Pre-Processing 14
4.2.2 Network Architecture and Hyper-Parameter Settings 14
4.3 Comparisons 15
4.3.1 GTA5 → Cityscapes 15
4.3.2 SYNTHIA → Cityscapes 16
4.4 Ablation Study 16
4.4.1 Effectiveness of Source Label Relaxation 17
4.4.2 Effectiveness of Dual-Level Self-Regularization 18
4.4.3 Effectiveness of the Proposed Progressive Framework 18
4.5 Visualization 19
5 Conclusions 21
References 22