
Detailed Record

Author (Chinese): 蔣秉叡
Author (English): Chiang, Ping-Rui
Thesis Title (Chinese): 深度量化編碼網路
Thesis Title (English): Deep Quantizable Encoding Network
Advisor (Chinese): 黃之浩
Advisor (English): Huang, Chih-Hao
Committee Members (Chinese): 鍾偉和, 孫敏德
Committee Members (English): Chung, Wei-ho; Sun, Min-Te
Degree: Master's
University: National Tsing Hua University
Department: Institute of Communications Engineering
Student ID: 106064501
Publication Year (ROC): 111 (2022)
Graduation Academic Year: 110
Language: Chinese
Number of Pages: 117
Keywords: Texture Image Recognition, Deep Learning, Encoding, Acceleration, Quantization
Abstract:
Object recognition can substantially reduce the labor and salary costs of manual identification, so it has attracted considerable research attention, and because texture is an important object characteristic, many researchers in object recognition focus specifically on texture. Early work recognized textured objects with machine-learning algorithms, but these methods perform poorly when the textured objects are captured under light sources, lighting angles, shooting distances, and shooting angles that differ from the few preset conditions. Researchers therefore turned to deep-learning algorithms to recognize this type of textured object. However, deep-learning algorithms must store a large number of convolution weights and inputs and carry out a large amount of computation for the forward pass, backward pass, and weight updates, so they typically require high-capacity memory and processors with strong computing power and large storage, which common edge devices do not provide. This has motivated research on reducing the weights and inputs that deep-learning algorithms need to store and the amount of computation they need to perform. This thesis addresses these drawbacks of deep learning for texture recognition, namely its heavy computation and large memory requirements, while keeping recognition accuracy within 0.5% of the original deep-learning model.
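To make the storage and computation savings from quantization concrete, the following sketch shows a generic 8-bit uniform (asymmetric) post-training quantizer applied to a convolution weight tensor. It is only an illustration of the general technique the abstract refers to; the bit width, function names, and NumPy formulation are assumptions chosen for this example, not the quantizer actually proposed in the thesis.

import numpy as np

def quantize_uniform(w, num_bits=8):
    # Map a float tensor onto unsigned integers in [0, 2**num_bits - 1]
    # using one scale and zero point (asymmetric uniform quantization).
    qmin, qmax = 0, 2 ** num_bits - 1
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / (qmax - qmin) if w_max > w_min else 1.0
    zero_point = int(round(-w_min / scale))
    q = np.clip(np.round(w / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    # Recover an approximate float tensor from the integer codes.
    return (q.astype(np.float32) - zero_point) * scale

# Hypothetical example: a 64x3x3x3 convolution kernel stored at 1 byte per
# weight instead of 4 bytes, roughly a 4x reduction in weight storage.
w = np.random.randn(64, 3, 3, 3).astype(np.float32)
q, s, z = quantize_uniform(w)
print("max reconstruction error:", float(np.abs(w - dequantize(q, s, z)).max()))

Storing the 8-bit codes plus a single scale and zero point per tensor cuts weight memory by roughly a factor of four relative to 32-bit floats, and on hardware that supports it the integer codes can also be processed with faster low-precision arithmetic, which is the kind of saving for edge devices that the abstract describes.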
1. Introduction 1
1.1 Research Background and Motivation 1
1.2 Thesis Organization 2
2. Related Work 3
2.1 Machine Learning 3
2.2 Deep Learning 7
2.3 Deep Learning Model Acceleration 8
3. Introduction to the Algorithms 11
3.1 Deep Texture Encoding Network 11
3.2 Training-Free Model Quantization Algorithm 30
3.3 Deep Quantizable Encoding Network 53
4. Experimental Setup and Results 68
4.1 Datasets and Settings 68
4.2 Parameter Settings and Experimental Results of the Deep Quantizable Encoding Network (Training-Free Version) 70
4.3 Parameter Settings, Experimental Results, and Training Process of the Deep Quantizable Encoding Network (Training Version) 81
5. Conclusion 105
References 107
Appendix A 111
Appendix B 114
Electronic full text: open to external access after 2025/01/26
Chinese and English abstracts