
Detailed Record

Author (Chinese): 成家聲
Author (English): Cheng, Chia-Sheng
Title (Chinese): 基於自監督度量學習及任務導向轉換之少樣本影像分類
Title (English): Few-Shot Visual Classification with Improved Self-Supervised Metric Learning and Task-Aware Transformation
Advisor (Chinese): 林嘉文
Advisor (English): Lin, Chia-Wen
Committee (Chinese): 許秋婷、林彥宇、彭彥璁
Committee (English): Hsu, Chiu-Ting; Lin, Yen-Yu; Peng, Yan-Tsung
Degree: Master
University: National Tsing Hua University
Department: Department of Electrical Engineering
Student ID: 108061527
Year of Publication (ROC era): 110 (2021)
Graduation Academic Year: 110
Language: Chinese
Number of Pages: 29
Keywords (Chinese): 少樣本學習、自監督學習、度量學習
Keywords (English): Few-Shot Learning, Self-Supervised Learning, Metric Learning
Statistics:
  • Recommendations: 0
  • Views: 451
  • Rating: *****
  • Downloads: 0
  • Bookmarks: 0
With the rapid development of deep learning and the growing availability of datasets of all kinds, data-driven convolutional neural networks trained on large amounts of data have taken a substantial lead over traditional computer vision algorithms in image classification. However, collecting data is time-consuming and annotating it is costly, which causes considerable inconvenience when facing new applications. If conventional supervised learning is used to train a neural network with a huge number of parameters on only a few samples, severe over-fitting almost always results.
In recent years, researchers have therefore developed a new field, few-shot learning, which attempts to train reliable models even when data are limited. The dominant approach to few-shot learning is currently the meta-learning framework, which can be divided into gradient-based and metric-based meta-learning. More recent studies have focused on using self-supervised learning mechanisms to learn broader, more general features.
Most recently, some works have abandoned the meta-learning framework in favor of conventional supervised learning, concentrating on how to obtain higher-quality features, and have found that respectable performance is achievable even without meta-learning; a simple baseline can even beat some recent meta-learning studies with relatively complex architectures. This raises an intriguing question: if the two learning paradigms acquire different knowledge, can optimizing both objectives simultaneously during training further improve few-shot image classification?
In this work, we combine three learning objectives: feature-embedding meta-learning, standard supervised learning, and self-supervised learning, and we introduce a task-aware projection mechanism into the meta-learning stage that transforms sample feature embeddings into a more reliable space for classification. Experiments show a significant improvement in the over-fitting dilemma of few-shot classification: our method not only substantially surpasses traditional approaches but also performs competitively on mainstream datasets and under different scenario settings.

With the rapid development of deep learning and the emergence of various datasets, convolutional neural networks trained on large amounts of data hold a considerable lead over traditional computer vision algorithms in visual classification. However, data collection is time-consuming and labeling is costly, which makes implementing new applications inconvenient. If one follows standard supervised learning but trains a complicated neural network with only a small number of samples, severe overfitting is almost inevitable.
In recent years, scholars have begun to develop a new field, few-shot learning, which tries to train a reliable model even with limited data. At present, the main route of few-shot learning is the meta-learning architecture, which can be divided into gradient-based and metric-based meta-learning. Recently, some researchers have focused on training better-quality feature extractors, using self-supervised learning mechanisms to learn more general features.
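The metric-based route can be illustrated with a minimal, self-contained episode in the spirit of prototypical networks: support embeddings are averaged into one prototype per class, and each query is assigned to its nearest prototype. The function name, toy embeddings, and episode sizes below are illustrative assumptions, not the thesis implementation.

```python
import numpy as np

def prototype_classify(support, support_labels, query, n_way):
    """Nearest-prototype classification for one few-shot episode.

    support: (n_way * k_shot, d) array of support-set embeddings
    support_labels: (n_way * k_shot,) integer labels in [0, n_way)
    query: (n_query, d) array of query embeddings
    """
    # One prototype per class: the mean of that class's support embeddings.
    prototypes = np.stack(
        [support[support_labels == c].mean(axis=0) for c in range(n_way)]
    )
    # Euclidean distance from each query embedding to each prototype.
    dists = np.linalg.norm(query[:, None, :] - prototypes[None, :, :], axis=-1)
    # Each query takes the class of its nearest prototype.
    return dists.argmin(axis=1)

# Toy 2-way 2-shot episode in a 2-D embedding space (made-up numbers).
support = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.2, 5.0]])
labels = np.array([0, 0, 1, 1])
query = np.array([[0.1, 0.1], [5.1, 4.9]])
print(prototype_classify(support, labels, query, n_way=2))  # [0 1]
```

In metric-based meta-learning the embedding network is trained so that episodes like this classify well; a task-aware projection of the kind proposed here would additionally transform the embeddings before the distance computation.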
In recent few-shot classification work, researchers have started to rethink the necessity of meta-learning, using standard supervised learning instead, and have found that even a simple baseline can beat recent complex meta-learning methods.
In this work, we combine meta-learning, traditional supervised learning, and self-supervised learning, and further introduce a task-aware projection mechanism that transforms the original feature embeddings into a more reliable space for classification. Experiments show a significant improvement in the over-fitting dilemma of few-shot visual classification.
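One widely used self-supervised auxiliary objective in this line of work (the kind examined under "Self-Supervised Auxiliary Tasks" below) is rotation prediction, where the model must recognize how each input was rotated, so supervision comes for free from the data itself. The helper below is an illustrative sketch of how such a pretext batch can be built; `rotation_pretext` and the dummy data are assumptions for illustration, not the thesis code.

```python
import numpy as np

def rotation_pretext(images):
    """Build a rotation-prediction pretext batch: every image is rotated by
    0/90/180/270 degrees and labeled with the rotation index.

    images: (n, h, w, c) array; returns (4n, h, w, c) images and (4n,) labels.
    """
    rotated, labels = [], []
    for img in images:
        for k in range(4):                    # k quarter-turns
            rotated.append(np.rot90(img, k))  # rotates the (h, w) axes
            labels.append(k)
    return np.stack(rotated), np.array(labels)

# Two dummy 32x32 RGB "images".
imgs = np.random.rand(2, 32, 32, 3)
x, y = rotation_pretext(imgs)
print(x.shape, y.tolist())  # (8, 32, 32, 3) [0, 1, 2, 3, 0, 1, 2, 3]
```

A rotation-classification head trained on such batches is typically added alongside the main classification loss as a weighted auxiliary term.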
Acknowledgements ii
Abstract (Chinese) iii
Abstract iv
Content v
Chapter 1 Introduction 7
Chapter 2 Related Work 10
2.1 Metric-Based Meta-Learning 10
2.2 Whole-Classification Methods 11
2.3 Self-Supervised Feature Learning 12
Chapter 3 Proposed Method 13
3.1 Problem Formulation 13
3.2 Improved Self-Supervised Metric Learning 14
3.3 Task-Aware Projection Learning Process 15
3.4 Training and Implementation Details 17
Chapter 4 Experiments and Discussion 18
4.1 Datasets and Evaluation Metrics 18
4.2 Comparisons with State-of-the-Arts 19
4.3 Ablation Study 20
4.4 Self-Supervised Auxiliary Tasks 22
Chapter 5 Conclusion 23
References 24

 
 
 
 