
Detailed Record

Author (Chinese): 陳浩雲
Author (English): Chen, Hao-Yun
Title (Chinese): 基於分層式互補目標的神經網絡訓練
Title (English): Learning with Hierarchical Complement Objective
Advisor (Chinese): 張世杰
Advisor (English): Chang, Shih-Chieh
Committee Members (Chinese): 王鈺強、吳凱強
Committee Members (English): Wang, Yu-Chiang; Wu, Kai-Chiang
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 107062506
Year of Publication (R.O.C. calendar): 109
Academic Year of Graduation: 108
Language: English
Number of Pages: 32
Keywords (Chinese): class label hierarchy; optimization; deep learning; classification; semantic segmentation
Keywords (English): category hierarchy; optimization; entropy; deep learning; image recognition; semantic segmentation
Abstract (Chinese): Hierarchical labels are widely used to annotate computer-vision tasks, from image classification, where the hierarchy is explicit, to semantic segmentation, where it is implicit. However, the best-performing models and methods on these tasks are trained with cross-entropy, which implicitly assumes that class labels carry no hierarchical structure. Motivated by the fact that certain similarities necessarily exist among class labels, we propose a new training objective, Hierarchical Complement Objective Training (HCOT), which effectively exploits the information in hierarchical labels when training deep neural networks. HCOT maximizes the probability of the ground-truth class while spreading the remaining probability mass according to the hierarchical relation between each of the other classes and the ground truth, allowing the model to benefit from the label hierarchy. We apply HCOT to both image classification and semantic segmentation. Experimental results confirm that HCOT outperforms the best existing models and training methods on CIFAR-100, ImageNet-2012, and PASCAL-Context. Our further study shows that HCOT can be applied to any computer-vision task with an implicit label hierarchy.
Abstract (English): Label hierarchies widely exist in many vision-related problems, ranging from the explicit label hierarchies found in image classification to the latent label hierarchies found in semantic segmentation. Nevertheless, state-of-the-art methods often deploy a cross-entropy loss, which implicitly assumes class labels to be exclusive and thus independent of each other. Motivated by the fact that classes from the same parental category usually share certain similarities, we design a new training paradigm called Hierarchical Complement Objective Training (HCOT) that leverages the information in the label hierarchy. HCOT maximizes the probability of the ground-truth class and, at the same time, neutralizes the probabilities of the rest of the classes in a hierarchical fashion, making the model take explicit advantage of the label hierarchy. The proposed HCOT is evaluated on both image classification and semantic segmentation tasks. Experimental results confirm that HCOT outperforms state-of-the-art models on CIFAR-100, ImageNet-2012, and PASCAL-Context. The study further demonstrates that HCOT can be applied to tasks with latent label hierarchies, a common characteristic of many machine learning tasks.
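The core idea in the abstract, neutralizing (flattening) the predicted probabilities over the non-ground-truth classes at each level of the label hierarchy, can be sketched in a few lines. The pure-Python sketch below is only an illustration, not the thesis's exact formulation (which is defined in Section 3.1): it computes a two-level hierarchical complement entropy that HCOT-style training would seek to maximize alongside the usual cross-entropy term. Both function names and the `groups` argument (a partition of fine classes into parent categories) are assumptions made for this illustration.

```python
import math

def complement_entropy(probs, gt):
    """Shannon entropy of the predicted distribution renormalized over
    the complement classes, i.e. all classes except the ground truth."""
    pg = probs[gt]
    rest = [p / (1.0 - pg) for i, p in enumerate(probs) if i != gt]
    return -sum(q * math.log(q) for q in rest if q > 0.0)

def hierarchical_complement_entropy(probs, gt, groups):
    """Complement entropy applied at two levels of a label hierarchy:
    once over the coarse parent categories, and once over the fine
    classes inside the ground truth's parent category."""
    # Coarse level: aggregate fine-class probabilities into parents.
    coarse = [sum(probs[i] for i in g) for g in groups]
    parent = next(k for k, g in enumerate(groups) if gt in g)
    coarse_term = complement_entropy(coarse, parent)
    # Fine level: renormalize within the ground truth's parent category.
    group = groups[parent]
    within = [probs[i] / coarse[parent] for i in group]
    fine_term = complement_entropy(within, group.index(gt))
    return coarse_term + fine_term
```

For a uniform prediction over nine classes split into three parent categories with ground truth 0, both the coarse and the within-group complement distributions are uniform over two entries, so the value is 2 ln 2, its maximum; peakier complement distributions score lower, which is the sense in which maximizing this quantity "neutralizes" the non-ground-truth probabilities.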
1 Introduction 1
2 Background 5
2.1 Learning Label Hierarchy 5
2.2 Explicit Label Hierarchy 6
2.3 Latent Label Hierarchy 7
3 Hierarchical Complement Objective Training 8
3.1 Hierarchical Complement Entropy 10
3.2 Optimization 11
4 Image Classification 13
4.1 CIFAR-100 14
4.1.1 Experimental Setup 14
4.1.2 Results 15
4.1.3 Results with Mixup and Cutout 16
4.1.4 Analysis on Coarse-level Labels 16
4.1.5 Embedding Space Visualization 17
4.1.6 Comparison with CNN-RNN 17
4.2 ImageNet-2012 19
4.2.1 Experimental Setup 19
4.2.2 Results 20
4.2.3 Ablation Study 20
5 Semantic Segmentation 22
5.0.1 Dataset 22
5.0.2 Experimental Setup 23
5.0.3 Evaluation 24
5.0.4 Results 24
5.0.5 Visualizations 25
6 Conclusion 27
References 28