Author (English): Tang, Yung-Chen
Thesis title (English): Neural Clamping: Joint Input Perturbation and Temperature Scaling for Neural Network Calibration
Advisor (English): Ho, Tsung-Yi
Oral defense committee (English): Chen, Hung-Ming
Lin, Shu-Min
Peng, Guan-Ju
Keywords (English): Deep learning; Neural network calibration
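Abstract (translated from Mandarin): Neural network calibration is an important task in deep learning for ensuring consistent predictions; its goal is to align a model's predicted confidence with the true likelihood of correctness. In this thesis, we propose a new post-processing calibration method, called Neural Clamping, which applies a simple joint input-output transformation to a pre-trained classifier, using a learnable universal input perturbation and an output temperature scaling parameter to calibrate the network. We also provide a theoretical explanation of Neural Clamping showing that it achieves better calibration than temperature scaling. In experiments on the CIFAR-100 and ImageNet recognition datasets across a range of deep neural network models, Neural Clamping significantly outperforms state-of-the-art post-processing calibration methods.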
Neural network calibration is an essential task in deep learning to ensure consistency between the confidence of model prediction and the true correctness likelihood.
In this thesis, we propose a new post-processing calibration method called Neural Clamping, which employs a simple joint input-output transformation on a pre-trained classifier via a learnable universal input perturbation and an output temperature scaling parameter.
Moreover, we provide theoretical explanations on why Neural Clamping is provably better than temperature scaling.
Evaluated on CIFAR-100 and ImageNet image recognition datasets and a variety of deep neural network models, our empirical results show that Neural Clamping significantly outperforms state-of-the-art post-processing calibration methods.
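The joint input-output transformation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the toy linear classifier, the function names (`make_toy_classifier`, `clamped_predict`), and the parameter values are all assumptions made for illustration, and the step that learns the perturbation and temperature on a calibration set is omitted. The sketch only shows the forward pass: the same perturbation is added to every input, and the resulting logits are divided by a temperature before the softmax.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def make_toy_classifier(num_features, num_classes, seed=0):
    # Hypothetical stand-in for a frozen pre-trained classifier:
    # a fixed random linear map from features to logits.
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(num_features, num_classes))
    return lambda x: x @ weights

def clamped_predict(classifier, x, delta, temperature):
    # Add one shared (universal) perturbation to every input,
    # then scale the logits by 1 / temperature before the softmax.
    logits = classifier(x + delta)
    return softmax(logits / temperature)

classifier = make_toy_classifier(num_features=8, num_classes=5)
x = np.random.default_rng(1).normal(size=(4, 8))

# Uncalibrated predictions of the frozen classifier.
baseline = softmax(classifier(x))

# Zero perturbation and unit temperature leave the predictions unchanged.
identity = clamped_predict(classifier, x, delta=np.zeros(8), temperature=1.0)

# A temperature above 1 softens the predicted distribution.
softened = clamped_predict(classifier, x, delta=np.zeros(8), temperature=2.0)
```

With a zero perturbation the transformation reduces to plain temperature scaling, so temperature scaling is a special case of this parameterization; the perturbation adds a second set of adjustable parameters on the input side.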
Abstract (Mandarin)
Abstract
Acknowledgements
Contents
List of Figures
List of Tables

1 Introduction

2 Background and Related Work
2.1 Probabilistic Characterization of Neural Network Calibration
2.2 Calibration Metrics
2.3 Post-Processing Calibration Methods

3 Methodology
3.1 Joint Input-Output Calibration
3.2 Training Objective Function in Neural Clamping
3.3 How to Choose a Proper γ Value in Focal Loss for Neural Clamping?
3.4 Theoretical Justification on the Advantage of Neural Clamping

4 Performance Evaluation
4.1 Evaluation and Implementation Details
4.2 CIFAR-100 and ImageNet Results
4.3 Additional Analysis of Neural Clamping

5 Conclusions

6 Appendix
6.1 Algorithmic Descriptions of Neural Clamping
6.2 Proof for Lemma 1
6.3 Proof for Theorem 1

Bibliography
