Detailed Record
Author (Chinese): 呂賢鑫
Author (English): Lu, Shian-Shin
Title (Chinese): 用於提升人臉識別率的陰影移除生成對抗網路
Title (English): Identity-Preserving Face Deshading GAN for Boosting Face Recognition
Advisor (Chinese): 林嘉文
Advisor (English): Lin, Chia-Wen
Committee members (Chinese): 許秋婷, 邱維辰, 林彥宇
Committee members (English): Hsu, Chiou-Ting; Chiu, Wei-Chen; Lin, Yen-Yu
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Electrical Engineering
Student ID: 105061585
Year of publication (ROC calendar): 107 (2018)
Graduating academic year (ROC calendar): 107
Language: English
Pages: 33
Keywords (Chinese): 生成對抗網路, 圖像遷移, 孿生神經網路
Keywords (English): GANs; image-to-image translation; Siamese network
摘要 (Chinese abstract, translated to English):
Even deep-learning-based face recognition methods still suffer from variations in capture conditions such as facial expression, viewpoint, and image resolution. Illumination change is one such problem: when the resulting shadows occlude facial features, the recognizer's judgment can fail.
Prior work has approached the shadow-occlusion problem from several angles: transfer learning, which starts from a model pre-trained on a large dataset and fine-tunes it on other databases; regularization techniques during training; collecting more comprehensive training data; and augmenting the data with heuristics or generative models to improve the recognizer's generality. Another approach, and the one explored in this thesis, preprocesses the face image to remove the shadows occluding its features before recognition is performed. Because removing shadows from a face is itself an ill-posed problem, most previous methods rely on physical priors such as Retinex theory, while others achieve shadow removal via 3D face reconstruction. This thesis instead treats the task as an image-to-image translation problem: the illumination is manipulated while the facial structure is kept unchanged, producing a deshaded result with less modeling of, and fewer assumptions about, the illumination mechanism. The architecture adopts a generative adversarial network, which can produce comparatively sharp results. To ensure that identity information is preserved during deshading, a pre-trained face recognizer capable of extracting identity-bearing features is introduced into training; by penalizing the difference between a photo's features before and after shadow removal, the deshading model learns to preserve identity. The Siamese-network idea is further incorporated so that features of the same identity are pulled closer and those of different identities pushed apart, helping the model synthesize more identity-discriminative photos. Experimental results show that the proposed method produces deshaded images of relatively stable quality, and that these images also help improve recognition performance.
Abstract:
Even deep-learning-based face recognition may suffer when face images are captured in unconstrained environments with variations such as expression, pose, and resolution. Lighting change is one of the crucial issues, because shadows occlude distinctive facial features and can harm recognition accuracy.
Much research has addressed this problem, for example by fine-tuning the recognition model, applying regularization techniques, training on larger datasets, or augmenting the data. In this thesis we adopt another approach, face deshading, which removes shadow occlusion from face images before they are fed to the recognition model. Because deshading is an ill-posed problem, previous methods are usually based on physical priors such as Retinex theory, and some use 3D reconstruction. We instead treat face deshading as an image-to-image translation problem that changes the lighting while keeping the facial structure unchanged, which reduces the need to model or make assumptions about the lighting mechanism. Our model is based on generative adversarial networks (GANs) and is able to generate visually appealing results. To preserve identity information that may be eliminated in the deshading process, a pre-trained face recognition model that extracts identity-related features is introduced during training; under the constraint that a photo's features should not change after deshading, the shadow-removal model learns to preserve identity. Furthermore, a Siamese network is introduced to pull the features of the same identity closer and push those of different identities apart, so as to synthesize more discriminative photos. Experimental results show that our model generates deshaded images with stable visual quality, and that these images also improve the accuracy of the face recognition model.
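The two feature-space objectives described in the abstract, the identity-preserving term and the Siamese contrastive term, can be sketched in a few lines. This is a minimal NumPy illustration under assumed conventions (squared-L2 distances, a unit margin), not the thesis's actual implementation; the function names and embedding inputs are hypothetical.

```python
import numpy as np

def identity_preserving_loss(feat_deshaded, feat_reference):
    # Squared L2 distance between the recognizer embeddings of the
    # deshaded output and of its reference image; minimizing it keeps
    # the generator from altering identity-bearing features.
    return float(np.sum((feat_deshaded - feat_reference) ** 2))

def contrastive_loss(feat_a, feat_b, same_identity, margin=1.0):
    # Siamese contrastive loss: embeddings of the same identity are
    # pulled together, while embeddings of different identities are
    # pushed at least `margin` apart before the penalty vanishes.
    d = np.linalg.norm(feat_a - feat_b)
    if same_identity:
        return float(d ** 2)
    return float(max(0.0, margin - d) ** 2)
```

In training, terms like these would be weighted and summed with the adversarial, illumination-classification, and reconstruction losses listed in the table of contents to form the full objective.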
Table of Contents

摘要 (Chinese Abstract) i
Abstract ii
Content iii
Chapter 1 Introduction 1
Chapter 2 Related Work 4
2.1 Single image deshading 4
2.2 Generative adversarial networks 5
2.3 Image-to-image translation 5
2.4 Siamese network 6
Chapter 3 Proposed method 7
3.1 Overview 7
3.2 Objective 8
3.2.1 Adversarial Loss 8
3.2.2 Identity Preserving Loss 9
3.2.3 Contrastive Loss 10
3.2.4 Illumination Classification Loss 11
3.2.5 Reconstruction Loss 11
3.2.6 Full objective 12
3.3 Network architecture 12
Chapter 4 Experiments and Discussion 13
4.1 Experimental Setup 13
4.1.1 Dataset 13
4.1.2 Implementation details 14
4.1.3 Evaluation scenarios and metrics 15
4.2 Qualitative comparisons 18
4.3 Quantitative comparisons 19
4.3.1 Comparison with other methods 19
4.3.2 Comparison of individual illumination condition 22
4.3.3 Ablation study 22
4.3.4 Results of other face recognition model 24
4.4 Limitations 25
Chapter 5 Conclusion 26
References 27
Appendix 32
Model architecture 32