
Detailed Record

Author (Chinese): 陳志傑
Author (English): Chen, Chih-Chieh
Title (Chinese): 通過調整零點進行後訓練量化
Title (English): Post-Training Quantization by Adjusting Zero Points
Advisor (Chinese): 張世杰
Advisor (English): Chang, Shih-Chieh
Committee members (Chinese): 何宗易, 謝明得
Committee members (English): Ho, Tsung-Yi; Shieh, Ming-Der
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 110062646
Publication year (ROC calendar): 112
Graduation academic year: 111
Language: Chinese
Number of pages: 30
Keywords (Chinese): 後訓練量化, 混合精度, 零點調整
Keywords (English): post-training quantization, mixed precision, zero point
Abstract (Chinese):
Quantization is a common model-compression technique; post-training quantization refers to quantizing a pre-trained model without any further training. In this thesis, we propose two novel post-training quantization methods. First, we perform mixed-precision quantization by comparing the similarity of activation values. Second, we introduce an effective zero-point adjustment method that further improves the accuracy of the quantized model. Experimental results show that our methods outperform previous approaches: when compressing a ResNet-18 model to the same size, our method improves accuracy by 1.7%, and on ResNet-50 it improves accuracy by 3%. These results highlight the effectiveness of our methods in improving the accuracy of quantized models.
Abstract (English):
Quantization is a common technique for model compression, where post-training quantization refers to quantizing a pre-trained model without further training. In this thesis, we propose two novel methods for post-training quantization. First, we perform mixed-precision quantization by comparing the similarity of each layer's output feature map (OFM). Second, we introduce an effective zero-point adjustment method to further enhance the accuracy of quantized models. The experimental results demonstrate the superiority of our approach compared to previous work: when compressing the ResNet-18 model to the same size, our method achieves 1.7% higher accuracy, and for the ResNet-50 model it achieves a 3% accuracy improvement. These results highlight the effectiveness of our methods in improving the accuracy of quantized models.
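The abstract only describes the two methods at a high level. The sketch below is a minimal, hypothetical illustration of the general ideas, not the thesis's actual algorithm: it assumes cosine similarity as the OFM-similarity metric, a candidate bit-width set of {4, 8}, and a simple grid search around the nominal zero point. The names quantize, ofm_similarity, assign_bits, and adjust_zero_point are invented for this sketch.

import torch
import torch.nn.functional as F

def quantize(x, num_bits, scale, zero_point):
    # Uniform asymmetric quantization followed by dequantization.
    qmin, qmax = 0, 2 ** num_bits - 1
    q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale

def ofm_similarity(layer, x, num_bits):
    # Cosine similarity between the full-precision and quantized output feature maps
    # of a single convolution layer, using a min/max scale and nominal zero point.
    w = layer.weight.data
    scale = (w.max() - w.min()) / (2 ** num_bits - 1)
    zp = torch.round(-w.min() / scale)
    w_q = quantize(w, num_bits, scale, zp)
    ofm_fp = F.conv2d(x, w, layer.bias, layer.stride, layer.padding)
    ofm_q = F.conv2d(x, w_q, layer.bias, layer.stride, layer.padding)
    return F.cosine_similarity(ofm_fp.flatten(), ofm_q.flatten(), dim=0)

def assign_bits(layers, inputs, threshold=0.99, candidates=(4, 8)):
    # Mixed precision: give each layer the lowest candidate bit-width whose
    # quantized OFM stays similar enough to the full-precision OFM.
    bits = {}
    for name, layer in layers.items():
        for b in sorted(candidates):
            if ofm_similarity(layer, inputs[name], b) >= threshold:
                bits[name] = b
                break
        else:
            bits[name] = max(candidates)
    return bits

def adjust_zero_point(w, num_bits, search_radius=8):
    # Zero-point adjustment: search around the nominal zero point for the offset
    # that minimizes the weight reconstruction error.
    scale = (w.max() - w.min()) / (2 ** num_bits - 1)
    zp0 = torch.round(-w.min() / scale)
    best_zp, best_err = zp0, float("inf")
    for dz in range(-search_radius, search_radius + 1):
        zp = zp0 + dz
        err = torch.mean((w - quantize(w, num_bits, scale, zp)) ** 2).item()
        if err < best_err:
            best_zp, best_err = zp, err
    return scale, best_zp

In this sketch, a calibration batch would supply the per-layer inputs, the similarity threshold trades model size against accuracy, and the zero-point search is a stand-in for whatever adjustment criterion the thesis actually uses.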
Contents
Acknowledgements (Chinese) I
Abstract (Chinese) III
Abstract IV
Contents V
List of Figures VII
List of Tables VIII
List of Algorithms IX
1 Introduction 1
2 Previous Works 4
2.1 Mixed Precision . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2 Clipping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3 Methodology 7
3.1 Mixed Precision-Output Feature Map Comparison . . . . . . . . . . 7
3.2 Zero Point Adjustment . . . . . . . . . . . . . . . . . . . . . . . . . 9
4 Experiments 14
4.1 Calibration data for post-training quantization . . . . . . . . . . . . 14
4.2 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.3 Experimental setting . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4.4 Experimental results . . . . . . . . . . . . . . . . . . . . . . . . . . 15
5 Conclusions 17
References 18