Detailed Record

Author (Chinese): 柯子逸
Author (English): Ke, Zi-Yi
Title (Chinese): 基於生成自我引導的密集標籤之弱監督語義分割
Title (English): Generating Self-Guided Dense Annotations for Weakly Supervised Semantic Segmentation
Advisor (Chinese): 許秋婷
Advisor (English): Hsu, Chiou-Ting
Committee Members (Chinese): 簡仁宗、陳煥宗
Committee Members (English): Chien, Jen-Tzung; Chen, Hwann-Tzong
Degree: Master's
University: National Tsing Hua University
Department: Institute of Information Systems and Applications
Student ID: 105065514
Publication Year: 2018 (R.O.C. year 107)
Graduation Academic Year: 106
Language: English
Number of Pages: 30
Keywords (Chinese): 弱監督、語義分割、自我引導
Keywords (English): Weakly Supervised, Semantic Segmentation, Self-Guided
Abstract:
Learning semantic segmentation models under image-level supervision is far more challenging than under a fully supervised setting. Without knowing the exact pixel-label correspondence, most weakly supervised methods rely on external models to infer pseudo pixel-level labels for training the segmentation model. In this thesis, we aim to train a semantic segmentation model with a single neural network, without resorting to any external models. We propose a novel self-guided strategy that fully utilizes features learned across multiple levels to progressively generate dense pseudo labels. First, we use high-level features as class-specific localization maps to roughly locate each class. Next, we propose an affinity-guided refinement that encourages each localization map to be consistent with the corresponding intermediate-level features. Third, we adopt the training image itself as guidance and propose a self-guided refinement that transfers the image's inherent structure into the maps. Finally, we derive pseudo pixel-level labels from these localization maps and use them as ground truth to train the semantic segmentation model. The proposed self-guided strategy is a unified framework built on a single network, and the training procedure alternates between updating the feature representation and refining the localization maps. Experimental results on the PASCAL VOC 2012 segmentation benchmark demonstrate that our method outperforms other weakly supervised methods under the same setting.
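The pipeline outlined in the abstract is multi-stage: class-specific localization maps are first obtained from high-level features, then refined so that they agree with intermediate-level feature affinities, then sharpened with the training image itself as guidance, and finally converted into pseudo pixel-level labels. The snippet below is a minimal, illustrative sketch of the affinity-guided refinement idea only, under the assumption that the affinity is cosine similarity between intermediate-level features and that the refinement is a few random-walk-style propagation steps; the function name, the affinity choice, and the iteration count are stand-ins for illustration, not the exact formulation used in the thesis.

```python
import numpy as np

def affinity_guided_refinement(loc_map, features, num_iters=3):
    """Illustrative sketch: diffuse a coarse class-specific localization map
    over an affinity graph built from intermediate-level features.

    loc_map  : (H, W)    coarse localization scores for one class
    features : (H, W, C) intermediate-level features, assumed to share the
                         map's spatial resolution (resize beforehand otherwise)
    """
    h, w, c = features.shape
    f = features.reshape(h * w, c).astype(np.float64)
    # Cosine affinity between every pair of spatial positions (an assumption).
    f /= np.linalg.norm(f, axis=1, keepdims=True) + 1e-8
    affinity = np.clip(f @ f.T, 0.0, None)            # keep only non-negative similarities
    affinity /= affinity.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
    # Random-walk-style propagation: positions with similar features
    # end up with similar localization scores.
    scores = loc_map.reshape(h * w).astype(np.float64)
    for _ in range(num_iters):
        scores = affinity @ scores
    return scores.reshape(h, w)

# Toy usage with random inputs, only to show the expected shapes.
rng = np.random.default_rng(0)
refined = affinity_guided_refinement(rng.random((28, 28)), rng.random((28, 28, 64)))
print(refined.shape)  # (28, 28)
```

In a full pipeline, such a refinement would be repeated per class, followed by the self-guided (image-guided) refinement, with the resulting maps thresholded into pseudo labels for training the segmentation network, as the abstract describes.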
Table of Contents:
Chinese Abstract I
Abstract II
1. Introduction 1
2. Related Work 6
2.1 Methods with External Models 6
2.2 Methods without External Models 7
3. Proposed Method 10
3.1 Class-Specific Localization Maps 10
3.2 Affinity-Guided Refinement 11
3.3 Self-Guided Refinement 13
3.4 Pseudo Pixel-Level Label 15
3.5 Training Strategy 15
4. Experimental Results 17
4.1 Experimental Settings 17
4.1.1 Dataset and Evaluation Metrics 17
4.1.2 Implementation 17
4.1.3 Parameter Setting 18
4.2 Evaluation of the Proposed Method 18
4.2.1 Effectiveness of Each Component 18
4.2.2 Evaluation of Affinity-Guided Refinement 19
4.2.3 Evaluation of Different Affinity Matrices 20
4.3 Comparisons with Existing Methods 22
5. Conclusion 26
6. References 27
