Author: Chen, Yi-Wei (陳逸瑋)
Title: Enhancing Black-box Watermark Robustness in Deep Learning Models through Adversarial Training
Title (Chinese): 透過對抗訓練增強深度學習模型中的黑盒浮水印抗性
Advisor: Sun, Hung-Min (孫宏民)
Committee Members: Hsu, Fu-Hau (許富皓); Huang, Yu-Lun (黃育綸)
Degree: Master's
Institution: National Tsing Hua University
Department: Institute of Information Security
Student ID: 110164520
Publication Year: 2024 (ROC year 113)
Graduation Academic Year: 112 (2023–2024)
Language: English
Pages: 48
Keywords: Deep Learning; Black-Box Watermarking; White-Box Watermarking; Model Modification; Model Extraction; Input Preprocessing; Adversarial Training
Abstract (Chinese): Since the advent of watermarking technology, it has gradually developed into an important tool for protecting creators' intellectual property. With the rapid development of machine learning models, protecting the intellectual property associated with these models has become a pressing problem. Watermarking techniques for deep neural networks (DNNs), which trace their origins to traditional digital watermarking, have therefore emerged. Alongside these techniques, however, many attacks designed to remove model watermarks have appeared, including input preprocessing, model modification, and model extraction. Devising a new watermarking scheme against every newly emerging threat is complex and challenging, so a method that strengthens the robustness of existing watermarks is preferable. Because adversarial training is known to improve model robustness, this thesis proposes an approach that applies adversarial training to harden black-box watermarks, yielding significant robustness gains for watermarks whose original accuracy lies between 20% and 95%, so that they can resist potential watermark-removal attacks.

Abstract: Since the advent of watermarking technology, it has gradually evolved into a vital tool for protecting the intellectual property rights of creators. With the rapid development of machine learning models, safeguarding the intellectual property associated with these models has become an urgent issue. Consequently, watermarking techniques specifically for Deep Neural Networks (DNNs) have emerged, tracing their origins to traditional digital watermarking. Alongside these techniques, however, numerous attacks targeting model watermarks have been devised, including input preprocessing, model modification, and model extraction. Developing a new watermarking strategy against each of these ever-evolving threats is complex and challenging, so an alternative method that enhances the robustness of existing watermarks is more desirable. Moreover, adversarial training is known to improve the robustness of models. Building on these observations, this thesis proposes an approach that uses adversarial training to enhance the robustness of black-box watermarks. The goal is to fortify watermarking techniques, yielding significant robustness gains for watermarks whose original accuracy lies between 20% and 95%, and thereby resisting potential watermark removal attacks.

Keywords: Deep Learning, Black-Box Watermarking, White-Box Watermarking, Model Modification, Model Extraction, Input Preprocessing, Adversarial Training
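
The thesis's implementation is not reproduced here, but the core idea lends itself to a short illustration. The following is a minimal sketch, assuming PyTorch, of PGD-based adversarial training in which the black-box watermark trigger set is mixed into every training batch, so the model also learns the trigger labels under worst-case input perturbations. The function names, the trigger_loader, and all hyperparameters (eps, alpha, steps) are illustrative assumptions, not the thesis's actual settings.

# Illustrative sketch only (not the thesis's actual code): PGD adversarial
# training with the watermark trigger set mixed into each batch.
# Assumes inputs are scaled to [0, 1].
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=7):
    # Search for a worst-case perturbation of x inside an L-infinity
    # ball of radius eps around the clean input.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()            # ascend the loss
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)  # project back into the ball
    return x_adv.detach()

def train_epoch(model, task_loader, trigger_loader, optimizer, device):
    # Each step trains on adversarial views of both the task data and the
    # watermark trigger data; learning the trigger labels under perturbation
    # is what hardens the watermark against removal attacks.
    model.train()
    for (x, y), (x_wm, y_wm) in zip(task_loader, trigger_loader):
        x = torch.cat([x, x_wm]).to(device)
        y = torch.cat([y, y_wm]).to(device)
        x_adv = pgd_attack(model, x, y)
        optimizer.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()
        optimizer.step()

Under this setup, watermark robustness can then be measured as the fraction of trigger inputs that still receive their assigned labels after a removal attack such as input preprocessing, model modification (fine-tuning or pruning), or model extraction.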
Abstract (Chinese)
Acknowledgements (Chinese)
Abstract
Contents
List of Figures
List of Tables
1 Introduction
1.1 DNN Watermark Classifications
1.2 Advanced Attack Strategies in Model Security
1.3 Exploring the Efficacy of Adversarial Training in Enhancing Watermark Robustness in Deep Neural Networks
2 Background and Related Works
2.1 Evolution of Watermarking Techniques
2.2 Overview of Deep Neural Networks (DNN)
2.2.1 Deep Neural Network Architecture
2.2.2 Characteristics and Applications of DNNs
2.3 Deep Neural Network Watermarking
2.3.1 Introduction to DNN Watermarking Techniques
2.3.2 Categories of Watermarking
2.3.3 Watermarking Typologies
2.3.4 Setting Watermarking Standards
2.3.5 Key Aspects of DNN Watermarking
2.4 Adversarial Objectives and Capabilities
2.4.1 Attacker's Objectives
2.4.2 Attacker's Capabilities
2.4.3 Advances in Watermarking Techniques
2.5 Watermarking Schemes and Techniques
2.5.1 Training Phase Watermarking: BlackMarks
2.5.2 Inference Phase Watermarking: DAWN
2.6 Adversarial Training
2.7 Attack Strategies to Remove Watermarks
2.8 Desirable Qualities in an Optimal Watermark
3 Methodology
3.1 Dataset and Model Architecture
3.2 Watermarking Techniques
3.3 Adversarial Training
3.4 Attack Simulations
3.5 Evaluation
4 Results
4.1 Training Phase Watermarking Strategy
4.2 Inference Phase Watermarking Strategy
4.3 Parameter Encoding Watermarking Strategy
4.4 Conclusion on Results
5 Conclusion
6 Experiment Setting
Bibliography
(The full text will be available to external users after 2028/01/03.)