Author (Chinese): 唐永承
Author (English): Tang, Yung-Chen
Title (Chinese): 深度學習鉗夾術:神經網絡校準的聯合輸入擾動和溫度縮放之方法
Title (English): Neural Clamping: Joint Input Perturbation and Temperature Scaling for Neural Network Calibration
Advisor (Chinese): 何宗易
Advisor (English): Ho, Tsung-Yi
Committee Members (Chinese): 陳宏明、林淑敏、彭冠舉
Committee Members (English): Chen, Hung-Ming; Lin, Shu-Min; Peng, Guan-Ju
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 109062592
Year of Publication (ROC era): 111 (2022)
Graduating Academic Year: 110
Language: English
Number of Pages: 28
Keywords (Chinese): 深度學習、神經網路校準
Keywords (English): Deep learning; Neural network calibration
Neural network calibration is an essential task in deep learning: it ensures consistency between the confidence of a model's predictions and their true correctness likelihood.
In this thesis, we propose a new post-processing calibration method called Neural Clamping, which employs a simple joint input-output transformation on a pre-trained classifier via a learnable universal input perturbation and an output temperature scaling parameter.
Moreover, we provide a theoretical explanation of why Neural Clamping is provably better than temperature scaling.
Evaluated on the CIFAR-100 and ImageNet image recognition datasets with a variety of deep neural network models, our empirical results show that Neural Clamping significantly outperforms state-of-the-art post-processing calibration methods.
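
Since this record contains only the abstract and table of contents, the PyTorch sketch below illustrates the transformation the abstract describes; it is not the authors' released implementation. The class and helper names (NeuralClampingWrapper, fit_calibration, expected_calibration_error) and the training hyperparameters are assumptions. It wraps a frozen pre-trained classifier with a learnable universal input perturbation delta and an output temperature T, fits both on a held-out calibration set, and includes the standard expected calibration error (ECE) that quantifies the confidence/correctness gap named in the abstract's first sentence.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class NeuralClampingWrapper(nn.Module):
        # Joint input-output calibration: x -> f(x + delta), logits -> logits / T.
        def __init__(self, classifier: nn.Module, input_shape):
            super().__init__()
            self.classifier = classifier.eval()
            for p in self.classifier.parameters():
                p.requires_grad_(False)              # the pre-trained model stays frozen
            self.delta = nn.Parameter(torch.zeros(input_shape))  # universal input perturbation
            self.log_t = nn.Parameter(torch.zeros(()))           # T = exp(log_t) stays positive

        def forward(self, x):
            logits = self.classifier(x + self.delta)  # learnable input transformation
            return logits / self.log_t.exp()          # output temperature scaling

    def fit_calibration(wrapper, calib_loader, epochs=10, lr=1e-2):
        # Cross-entropy is used here for brevity; per Sections 3.2-3.3 of the
        # table of contents, the thesis itself trains with a focal-loss objective.
        opt = torch.optim.Adam([wrapper.delta, wrapper.log_t], lr=lr)
        for _ in range(epochs):
            for x, y in calib_loader:
                opt.zero_grad()
                F.cross_entropy(wrapper(x), y).backward()
                opt.step()
        return wrapper

    def expected_calibration_error(probs, labels, n_bins=15):
        # Standard ECE: bin samples by confidence, then average each bin's
        # |accuracy - mean confidence| gap, weighted by the bin's sample share.
        conf, pred = probs.max(dim=1)
        correct = pred.eq(labels).float()
        edges = torch.linspace(0.0, 1.0, n_bins + 1)
        ece = torch.zeros(())
        for lo, hi in zip(edges[:-1], edges[1:]):
            in_bin = (conf > lo) & (conf <= hi)
            if in_bin.any():
                ece += in_bin.float().mean() * (correct[in_bin].mean() - conf[in_bin].mean()).abs()
        return ece.item()

After fitting, calibrated probabilities are wrapper(x).softmax(dim=1), and a lower ECE on held-out data indicates better agreement between confidence and accuracy.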
Abstract (Mandarin) ... I
Abstract ... II
Acknowledgements ... III
Contents ... IV
List of Figures ... VI
List of Tables ... VII

1 Introduction ... 1

2 Background and Related Work ... 4
2.1 Probabilistic Characterization of Neural Network Calibration ... 4
2.2 Calibration Metrics ... 5
2.3 Post-Processing Calibration Methods ... 6

3 Methodology ... 8
3.1 Joint Input-Output Calibration ... 8
3.2 Training Objective Function in Neural Clamping ... 10
3.3 How to Choose a Proper γ Value in Focal Loss for Neural Clamping? ... 10
3.4 Theoretical Justification on the Advantage of Neural Clamping ... 11

4 Performance Evaluation ... 13
4.1 Evaluation and Implementation Details ... 13
4.2 CIFAR-100 and ImageNet Results ... 15
4.3 Additional Analysis of Neural Clamping ... 17

5 Conclusions ... 19

6 Appendix ... 20
6.1 Algorithmic Descriptions of Neural Clamping ... 20
6.2 Proof for Lemma 1 ... 21
6.3 Proof for Theorem 1 ... 22

Bibliography ... 25
(The full text of this thesis has not been authorized for public access.)