
Detailed Record

Author (Chinese): 梁肇宏
Author (English): Liang, Jhao-Hong
Title (Chinese): 引導式互補熵之增強神經網路架構於對抗性樣本防護性
Title (English): Improving Model Robustness on Adversarial Examples via Guided Complement Entropy
Advisor (Chinese): 張世杰
Advisor (English): Chang, Shih-Chieh
Committee members (Chinese): 陳煥宗、陳永昇
Committee members (English): Chen, Hwann-Tzong; Chen, Yong-Sheng
Degree: Master
University: National Tsing Hua University
Department: Computer Science
Student ID: 106062511
Year of publication (ROC calendar): 108 (2019)
Academic year of graduation: 107
Language: English
Number of pages: 32
Keywords (Chinese): 神經網路、機器學習、對抗式樣本
Keywords (English): Neural network, Machine learning, Adversarial example
Abstract (Chinese): Current neural networks are highly vulnerable to adversarial examples: adding slight perturbations to the features of an original image produces adversarial examples that severely degrade a model's classification accuracy, so defending neural network architectures against adversarial examples has become a very important issue. In this thesis, we propose a new loss function, "Guided Complement Entropy" (GCE). Models trained with GCE have two main properties: (a) the predicted probability distribution over all non-ground-truth classes is flattened, and (b) the predicted probability of the ground-truth class is maximized. With these two properties, models trained with GCE learn better representations that separate the different classes, thereby improving their robustness against adversarial examples. Moreover, under white-box attacks, and regardless of whether the network is trained normally or adversarially, models trained with GCE defend against adversarial examples better than models trained with cross-entropy; on standard image classification tasks, models trained with GCE also achieve higher accuracy than models trained with cross-entropy.
Abstract (English): Improving model robustness against adversarial attacks has been shown to be an essential issue: model predictions can be drastically misled by adding small adversarial perturbations to images. In this thesis, we propose a new training objective, "Guided Complement Entropy" (GCE), which has two desirable effects: (a) neutralizing the predicted probabilities of the incorrect (non-ground-truth) classes, and (b) maximizing the predicted probability of the ground-truth class, particularly once (a) is achieved. Training with GCE encourages models to learn latent representations in which samples of different classes form distinct clusters, which, we argue, improves model robustness against adversarial perturbations. Furthermore, compared with the same models trained with cross-entropy, models trained with GCE achieve significantly better robustness against white-box adversarial attacks, both with and without adversarial training. When no attack is present, models trained with GCE also outperform their cross-entropy counterparts in terms of accuracy.
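The abstract describes the dual effect of GCE but gives no formula. Purely as an illustration, the following is a minimal PyTorch-style sketch of a loss with those two properties: it weights the negative entropy of the renormalized non-ground-truth probabilities by the ground-truth probability raised to a power alpha. The function name `guided_complement_entropy`, the exponent `alpha`, and the normalization details are assumptions for illustration and may differ from the exact objective used in the thesis.

```python
# Hedged sketch of a GCE-style loss, NOT the thesis's exact objective.
import torch
import torch.nn.functional as F

def guided_complement_entropy(logits: torch.Tensor,
                              targets: torch.Tensor,
                              alpha: float = 0.2,
                              eps: float = 1e-12) -> torch.Tensor:
    """logits: (N, K) raw class scores; targets: (N,) ground-truth class indices."""
    probs = F.softmax(logits, dim=1)                # (N, K) predicted probabilities
    yg = probs.gather(1, targets.unsqueeze(1))      # (N, 1) ground-truth probability
    # Renormalize the remaining K-1 classes so they form a distribution,
    # then zero out the ground-truth entry.
    comp = probs / (1.0 - yg + eps)
    comp = comp.scatter(1, targets.unsqueeze(1), 0.0)
    # Negative entropy of the complement distribution (<= 0); flatter => more negative.
    neg_entropy = (comp * torch.log(comp + eps)).sum(dim=1)
    # Guiding factor yg^alpha: since neg_entropy is non-positive, a larger
    # ground-truth probability also lowers the loss.
    loss = (yg.squeeze(1).clamp_min(eps) ** alpha) * neg_entropy
    return loss.mean()

# Hypothetical usage:
#   logits = model(images)                          # (N, K)
#   loss = guided_complement_entropy(logits, labels)
#   loss.backward()
```

Because the inner entropy term is non-positive, minimizing the product simultaneously flattens the complement distribution and rewards a larger ground-truth probability, matching effects (a) and (b) above.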
Table of Contents
1 Introduction ..................................... 1
2 Related Work ..................................... 5
2.1 Adversarial Attacks ............................ 5
2.2 Adversarial Defenses ........................... 6
2.3 Complement Objective Training .................. 7
3 Guided Complement Entropy ........................ 10
3.1 Guided Complement Entropy ...................... 10
3.2 Synthetic Data Analysis ........................ 12
4 Experiments ...................................... 16
4.1 Balancing scaling objective .................... 16
4.2 Adversarial setting ............................ 17
4.3 Performance on natural examples ................ 20
4.4 Robustness to White-box attacks ................ 21
4.5 Robustness to adversarial training ............. 25
5 Conclusion ....................................... 28
References ......................................... 29