Detailed Record

Author (Chinese): 李昱慧
Author (English): Lee, Yu-Hui
Title (Chinese): 保持身分特徵之眼鏡移除以增進人臉辨識準確率
Title (English): ByeGlassesGAN: Identity Preserving Eyeglasses Removal for Improving Face Recognition Accuracy
Advisor (Chinese): 賴尚宏
Advisor (English): Lai, Shang-Hong
Committee Members (Chinese): 邱維辰、許秋婷、李哲榮
Degree: Master's
University: National Tsing Hua University
Department: Department of Computer Science
Student ID: 106062508
Publication Year: 108 (ROC calendar; 2019)
Graduation Academic Year: 107 (2018-2019)
Language: English
Number of Pages: 38
Keywords (Chinese): 生成對抗網路、人臉辨識、深度學習
Keywords (English): Generative Adversarial Network, Face Recognition, Deep Learning
Abstract:
Face recognition techniques are widely used in our daily lives. However, although state-of-the-art face recognition systems are accurate enough for practical applications, their accuracy degrades considerably when they recognize occluded faces, such as faces wearing eyeglasses.
In this thesis, we propose a novel image-to-image GAN framework for eyeglasses removal, called ByeGlassesGAN, which automatically detects the position of eyeglasses in a portrait and removes them. Our ByeGlassesGAN consists of an encoder, a face decoder, and a segmentation decoder. The encoder extracts information from the source face image, and the face decoder uses this information to generate the glasses-removed image. Since glasses removal can be regarded as a kind of face completion task, we also equip the generator with a segmentation decoder, which predicts the binary segmentation masks of the eyeglasses region and of the completed face region. The features generated by the segmentation decoder are shared with the face decoder, guiding it toward better reconstructions. ByeGlassesGAN produces visually appealing eyeglasses-removed images even for semi-transparent tinted lenses or lenses with glare. In our experiments, we demonstrate that applying our method as a pre-processing step effectively improves face recognition accuracy for faces with eyeglasses.
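The generator described above (one shared encoder, a face decoder, and a segmentation decoder whose features feed back into the face decoder) can be summarized in a short sketch. The following is a minimal PyTorch sketch of that multi-task layout only; every layer count, channel width, and name (conv_block, ByeGlassesGenerator, and so on) is an illustrative assumption rather than the thesis's actual architecture, and the discriminator and the adversarial, per-pixel, segmentation, and identity-preserving losses listed in Section 3.2 are omitted.

import torch
import torch.nn as nn

def conv_block(cin, cout):
    # Downsampling block: strided conv + instance norm + LeakyReLU.
    return nn.Sequential(
        nn.Conv2d(cin, cout, kernel_size=4, stride=2, padding=1),
        nn.InstanceNorm2d(cout),
        nn.LeakyReLU(0.2, inplace=True),
    )

def deconv_block(cin, cout):
    # Upsampling block: transposed conv + instance norm + ReLU.
    return nn.Sequential(
        nn.ConvTranspose2d(cin, cout, kernel_size=4, stride=2, padding=1),
        nn.InstanceNorm2d(cout),
        nn.ReLU(inplace=True),
    )

class ByeGlassesGenerator(nn.Module):
    # Hypothetical sketch: shared encoder + face decoder + segmentation
    # decoder, with the segmentation decoder's intermediate features
    # concatenated into the face decoder at matching spatial scales.
    def __init__(self):
        super().__init__()
        # Shared encoder: extracts features from the input portrait.
        self.enc1 = conv_block(3, 64)     # H/2
        self.enc2 = conv_block(64, 128)   # H/4
        self.enc3 = conv_block(128, 256)  # H/8
        # Segmentation decoder: predicts 2 binary masks
        # (eyeglasses region, completed face region).
        self.seg1 = deconv_block(256, 128)
        self.seg2 = deconv_block(128, 64)
        self.seg_out = nn.Sequential(
            nn.ConvTranspose2d(64, 2, kernel_size=4, stride=2, padding=1),
            nn.Sigmoid(),
        )
        # Face decoder: consumes encoder features plus the segmentation
        # decoder's features at each scale.
        self.face1 = deconv_block(256, 128)
        self.face2 = deconv_block(128 + 128, 64)
        self.face3 = deconv_block(64 + 64, 32)
        self.face_out = nn.Sequential(
            nn.Conv2d(32, 3, kernel_size=3, stride=1, padding=1),
            nn.Tanh(),
        )

    def forward(self, x):
        f = self.enc3(self.enc2(self.enc1(x)))
        s1 = self.seg1(f)
        s2 = self.seg2(s1)
        masks = self.seg_out(s2)                     # glasses + face masks
        d1 = self.face1(f)
        d2 = self.face2(torch.cat([d1, s1], dim=1))  # share seg features
        d3 = self.face3(torch.cat([d2, s2], dim=1))
        return self.face_out(d3), masks              # removed image, masks

# Example: a 256x256 RGB portrait normalized to [-1, 1].
g = ByeGlassesGenerator()
removed, masks = g(torch.randn(1, 3, 256, 256))
print(removed.shape, masks.shape)  # (1, 3, 256, 256), (1, 2, 256, 256)

Sharing the segmentation features gives the face decoder an explicit hint about where to inpaint, which is the intuition behind treating glasses removal as a face completion task.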
Table of Contents
1 Introduction
  1.1 Motivation
  1.2 Problem Statement
  1.3 Contributions
  1.4 Thesis Organization
2 Related Work
  2.1 Image-to-Image Transformation
  2.2 Face Attribute Manipulation
  2.3 Image Completion
3 ByeGlassesGAN
  3.1 Overview of ByeGlassesGAN
  3.2 Objective Function
    3.2.1 Adversarial Loss
    3.2.2 Per-pixel Loss
    3.2.3 Segmentation Loss
    3.2.4 Identity Preserving
    3.2.5 Overall Loss Function for Generator
  3.3 Network Architecture
4 Data Synthesis
5 Experimental Results
  5.1 Implementation Details
  5.2 Qualitative Results
  5.3 Quantitative Results
  5.4 Face Recognition Evaluation
  5.5 Ablation Study
  5.6 Demo System
6 Conclusions
References