
Detailed Record

Author (Chinese): 劉康郁
Author (English): Liu, Kang-Yu
Title (Chinese): 用於視角和光影不變性臉部識別的遷移式人臉資料擴增網絡
Title (English): DotFAN: A Domain-transferred Face Augmentation Network for Pose and Illumination Invariant Face Recognition
Advisor (Chinese): 林嘉文
Advisor (English): Lin, Chia-Wen
Committee (Chinese): 林彥宇, 胡敏君, 黃敬群
Committee (English): Lin, Yen-Yu; Hu, Min-Chun; Huang, Ching-Chun
Degree: Master
Institution: National Tsing Hua University
Department: Department of Electrical Engineering
Student ID: 106061522
Year of publication (ROC calendar): 109 (2020)
Graduation academic year: 108
Language: English
Number of pages: 40
Keywords (Chinese): 生成對抗網路、人臉轉正、人臉資料增強
Keywords (English): GANs, face frontalization, face augmentation
Abstract (Chinese, translated):
With the advances of deep learning, face recognition accuracy has improved markedly. Deep-learning-based face recognition, however, depends heavily on large amounts of labeled training data, and its accuracy also degrades under variations in the capture environment such as facial expression, viewing angle, illumination, and image resolution. Collecting training data of the same identity under different capture conditions is therefore an important issue.

This thesis proposes a 3D-model-assisted domain-transferred face augmentation network (DotFAN) that extracts effective features from pretrained models and, while preserving identity, generates face images with a specified illumination, expression, and viewing angle. The architecture is similar to StarGAN in that a single generative adversarial network generates face images of multiple domains, but it additionally employs two subnetworks: a face expert model (FEM) and a face shape regressor (FSR). The FSR learns 3D face shape and pose information, helping the method control the expression and viewing angle of the generated faces; the FEM extracts identity-related information to ensure the identity consistency of the generated images. With the aid of these two models, the method effectively generates a wide range of face variations and can be used for face data augmentation.

The experimental results show several advantages of the method. First, it generates images of stable quality with diverse face variations (pose, illumination, shape, and expression) while retaining the important identity features of the original image. In the face frontalization task, its face recognition accuracy surpasses other state-of-the-art methods on multiple datasets. Finally, as a face data augmentation tool, it increases within-class (identity) diversity, so that a better face recognition model can be learned from the augmented dataset.
Abstract (English):
The performance of convolutional neural network (CNN) based face recognition models relies mainly on the richness of labeled training data. In addition, face recognition still suffers under uncontrolled conditions, such as variations in illumination, expression, and pose. Collecting a training set covering large variations of each face identity under different poses and illumination conditions, however, is very expensive, making the within-class diversity of face images a critical issue in practice. In this thesis, we propose a 3D-model-assisted domain-transferred face augmentation network (DotFAN) that can generate a series of variants of an input face based on knowledge distilled from existing rich datasets collected in other domains. DotFAN is structurally a conditional StarGAN but has two additional subnetworks, namely a face expert model (FEM) and a face shape regressor (FSR). While the FSR aims to extract 3D face attributes, the FEM is designed to capture face identity. With their aid, DotFAN can learn a disentangled face representation and effectively generate face images with various facial attributes while preserving the identity of the augmented images. Experiments demonstrate several advantages of DotFAN. First, it can generate photorealistic images with diverse face variations (pose, illumination, shape, and expression) while preserving face identity. Second, in the face frontalization task, it outperforms state-of-the-art methods in face recognition accuracy on multiple datasets. Finally, DotFAN is beneficial for augmenting anemic face datasets: it improves their within-class diversity so that a better face recognition model can be learned from the augmented dataset.
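For readers who want a concrete picture of the data flow the abstracts describe, below is a minimal PyTorch sketch of how DotFAN-style conditioning can be wired up: an identity code from the FEM, a 3D shape/pose code from the FSR, and an illumination code are concatenated with a general facial code and decoded into a face variant. Everything here, including the Encoder stand-in, the code dimensions (dim_id, dim_shape, dim_illum), and the toy fully connected decoder, is an illustrative assumption, not the thesis's actual architecture.

# Minimal structural sketch of DotFAN-style conditioning (assumptions labeled).
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Stand-in CNN encoder; widths and depth are illustrative only."""
    def __init__(self, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, out_dim),
        )

    def forward(self, x):
        return self.net(x)

class DotFANGenerator(nn.Module):
    """Composes identity, shape/pose, and illumination codes, then decodes."""
    def __init__(self, dim_id=128, dim_shape=62, dim_illum=8, img_size=64):
        super().__init__()
        self.fem = Encoder(dim_id)     # face expert model: identity code
        self.fsr = Encoder(dim_shape)  # face shape regressor: 3D shape/pose code
        self.enc = Encoder(128)        # general facial encoder
        self.img_size = img_size
        in_dim = dim_id + dim_shape + dim_illum + 128
        # Toy decoder back to image space; the real generator is convolutional.
        self.dec = nn.Sequential(nn.Linear(in_dim, 3 * img_size * img_size), nn.Tanh())

    def forward(self, x, illum_code):
        # Disentangled representation: the identity code stays fixed while the
        # shape/pose and illumination codes can be swapped to make new variants.
        f_id = self.fem(x)
        f_shape = self.fsr(x)
        f_gen = self.enc(x)
        z = torch.cat([f_id, f_shape, illum_code, f_gen], dim=1)
        return self.dec(z).view(-1, 3, self.img_size, self.img_size)

g = DotFANGenerator()
face = torch.randn(1, 3, 64, 64)   # dummy input face
new_illum = torch.zeros(1, 8)      # hypothetical target illumination code
variant = g(face, new_illum)       # identity-preserving variant
print(variant.shape)               # torch.Size([1, 3, 64, 64])

In the actual method, the FEM and FSR carry the knowledge distilled from existing rich datasets, and the generator is trained adversarially against a discriminator in the StarGAN style; the sketch only illustrates the code-composition step.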
Table of Contents:
Abstract (Chinese)
Abstract
Contents
Chapter 1 Introduction
Chapter 2 Related Work
2.1 Generative Adversarial Network
2.2 Conditional Generative Adversarial Network
2.3 Face Normalization
2.4 Face Rotation
Chapter 3 Proposed Method
3.1 Overview
3.2 Disentangled Facial Representation
3.2.1 Face-Expert Model
3.2.2 Face Shape Regressor
3.2.3 General Facial Encoder and Illumination Code
3.3 Generator
3.3.1 Cycle-Consistency Loss
3.3.2 Pose-Symmetric Loss
3.3.3 Identity-Preserving Loss
3.3.4 Pose-Consistency Loss
3.4 Discriminator
3.5 Full Objective Function (see the sketch after this outline)
Chapter 4 Experiments
4.1 Dataset
4.2 Implementation Details
4.3 Face Synthesis
4.4 Face Augmentation
4.5 Image Interpolation
4.6 Ablation Study
Chapter 5 Conclusion
References
Appendix
Model Architecture
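The generator losses enumerated in Sections 3.3.1 through 3.3.4 are combined with the adversarial loss into the full objective of Section 3.5. As a hedged sketch only, the weighted-sum form below follows common StarGAN-style practice; the weights \lambda are hypothetical and the thesis's exact terms and coefficients may differ:

\mathcal{L}_{G} = \mathcal{L}_{adv} + \lambda_{cyc}\,\mathcal{L}_{cyc} + \lambda_{sym}\,\mathcal{L}_{sym} + \lambda_{id}\,\mathcal{L}_{id} + \lambda_{pose}\,\mathcal{L}_{pose}

Here \mathcal{L}_{adv} is the adversarial loss supplied by the discriminator of Section 3.4, and the remaining terms correspond one-to-one to the cycle-consistency, pose-symmetric, identity-preserving, and pose-consistency losses listed in the outline.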
