
Detailed Record

Author (Chinese): 蔡恩榮
Author (English): Tsai, En-Jung
Title (Chinese): 基於深度學習注意力模組強化特定類內變化之不變特徵轉換以多姿態人臉辨識為例
Title (English): PAM: Pose Attention Module for Pose-Invariant Face Recognition
Advisor (Chinese): 葉維彰
Advisor (English): Yeh, Wei-Chang
Committee Members (Chinese): 梁韵嘉, 賴智明
Committee Members (English): Liang, Yun-Chia; Lai, Chyh-Ming
Degree: Master
Institution: National Tsing Hua University
Department: Industrial Engineering and Engineering Management
Student ID: 107034574
Year of Publication (ROC calendar): 110 (2021)
Graduation Academic Year: 109
Language: English
Number of Pages: 66
Keywords (Chinese): 深度學習、人臉識別、類內不均衡、姿態不變特徵轉換、注意力機制
Keywords (English): Deep Learning, Face Recognition, Intra-class Imbalance, Pose-invariant Feature Transformation, Attention Mechanism
Abstract (Chinese): Deep learning-based face recognition has been one of the most active research areas since convolutional neural networks achieved phenomenal success in computer vision and pattern recognition tasks. In recent years, research has concentrated on designing loss functions that make feature representations more discriminative, and this line of work has substantially improved the performance of large-scale face recognition. However, because current large-scale face datasets still suffer from inter-class quantity imbalance and intra-class feature imbalance, existing methods degrade markedly in scenarios with large intra-class variations such as pose, age, and expression. This shows that learning invariant feature transformations for specific intra-class variations remains one of the most challenging problems in unconstrained face recognition. Among these variations, pose is the most influential: frontal and profile faces differ enormously in 2D images because of the capture angle, and the facial self-occlusion and lighting changes induced by that angle make cross-pose recognition the factor that most affects performance. This study therefore focuses on cross-pose face recognition, which exhibits high intra-class feature variation, and proposes a Pose Attention Module (PAM) to improve the robustness of pose-invariant feature learning. PAM is a lightweight, easy-to-implement attention module that can be integrated into any feed-forward convolutional neural network and trained end to end. Specifically, PAM uses a soft gate that lets different proportions of information pass through, learning the pose residual between frontal and profile faces in hierarchical feature space and thereby transforming features between the two poses. To the best of our knowledge, this is the first attempt to use an attention mechanism at the bottleneck before downsampling to enhance the transformation of specific semantic features. Extensive ablation studies verify the effectiveness of the PAM design, and its performance is validated on the most representative face recognition benchmarks, including LFW, CFP-FP, AgeDB-30, CPLFW, and CALFW. Experimental results show that our method outperforms state-of-the-art approaches: it improves the robustness of cross-pose invariant feature transformation, while the lightweight module design reduces the parameter count by a factor of 75. Notably, the method is not limited to cross-pose face recognition; by adapting PAM's soft gate to coefficients for other intra-class semantic features, this semantic attention module extends readily to other intra-class imbalance problems, including unconstrained settings with large variations in age, illumination, and expression.
Abstract (English): Face recognition has been one of the most active research areas since Convolutional Neural Networks (CNNs) achieved phenomenal success in computer vision and pattern recognition tasks. Previous research has focused on the design of loss functions for learning discriminative representations, which has greatly improved the performance of large-scale face recognition. However, existing methods suffer degraded performance when processing intra-class features with large variations, such as pose, age, and facial expression. This indicates that learning invariant features for intra-class variations remains a great challenge in unconstrained face recognition. In this thesis, we propose a Pose Attention Module (PAM) to enhance the robustness of pose-invariant feature learning for deep face recognition. PAM is a lightweight and easy-to-implement attention block that can be integrated into any feed-forward convolutional neural network and trained in an end-to-end manner. Specifically, PAM performs frontal-profile feature transformation in hierarchical feature space by learning residuals between pose variations with a soft gate mechanism. To the best of our knowledge, our method is the first attempt to employ hierarchical attention on specific semantic features at network bottlenecks. We validate the effectiveness of the PAM block design through extensive ablation studies and verify the performance of the proposed attention module on several popular benchmarks, including LFW, CFP-FP, AgeDB-30, CPLFW, and CALFW. Experimental results show that our method not only consistently outperforms state-of-the-art methods but also reduces memory requirements by a factor of more than 75. Notably, our method is not limited to face recognition with large pose variations: by adjusting the soft gate mechanism of PAM to a coefficient for a specific semantic feature, such a semantic attention block extends easily to other intra-class imbalance problems in face recognition, including large variations in age, illumination, and expression.
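
To make the gated residual transformation described in the abstract concrete, below is a minimal PyTorch sketch of a PAM-style block. It assumes a depthwise residual branch standing in for the thesis's DRM, a squeeze-and-excitation-style channel attention standing in for CAM, and a per-sample scalar soft gate; all class and layer names here are illustrative assumptions, and the actual module designs are specified in Chapter 3 of the thesis.

import torch
import torch.nn as nn

class PoseAttentionModule(nn.Module):
    """Illustrative sketch of a PAM-style block: a depthwise residual branch,
    channel attention, and a soft gate that scales how much of the learned
    frontal-profile pose residual is added back to the input features.
    The exact DRM/CAM wiring is an assumption, not the thesis's design."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Depthwise residual branch (stand-in for DRM): depthwise 3x3
        # followed by pointwise 1x1, as in depthwise-separable convolutions.
        self.residual = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels, bias=False),
            nn.BatchNorm2d(channels),
            nn.PReLU(channels),
            nn.Conv2d(channels, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )
        # Squeeze-and-excitation-style channel attention (stand-in for CAM).
        self.channel_attention = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.PReLU(channels // reduction),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Soft gate: predicts a scalar in [0, 1] per sample that controls how
        # much of the pose residual passes through (by assumption, near 0 for
        # frontal faces and near 1 for profile faces).
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = self.residual(x)                        # learned pose residual
        residual = residual * self.channel_attention(residual)
        g = self.gate(x).view(-1, 1, 1, 1)                 # per-sample soft gate
        return x + g * residual                            # gated residual transform

In a ResNet-style backbone, such a block would sit at a bottleneck before downsampling, e.g. features = pam(features), so that the gate can suppress the residual for frontal faces and apply it for profile faces.
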
Chinese Abstract --------------------------------------------------- i
Abstract ---------------------------------------------------------- ii
Acknowledgements -------------------------------------------------- iii
1 Introduction ---------------------------------------------------- 1
1.1 Background ---------------------------------------------------- 1
1.2 Motivation ---------------------------------------------------- 4
1.3 Research Aims ------------------------------------------------- 6
1.4 Overview of the Thesis ---------------------------------------- 7
2 Literature Review ----------------------------------------------- 9
2.1 Deep Learning-based Face Recognition -------------------------- 9
2.1.1 Face Detection ---------------------------------------------- 10
2.1.2 Face Processing --------------------------------------------- 10
2.1.3 Deep Face Recognition --------------------------------------- 12
2.2 The Development of Discriminative Loss Functions -------------- 15
2.2.1 Softmax Loss ------------------------------------------------ 16
2.2.2 Facenet: Triplet Loss --------------------------------------- 17
2.2.3 Center Loss ------------------------------------------------- 18
2.2.4 L-Softmax Loss ---------------------------------------------- 19
2.2.5 SphereFace: A-Softmax Loss ---------------------------------- 19
2.2.6 CosFace: Large Margin Cosine Loss --------------------------- 20
2.2.7 ArcFace: Additive Angular Margin Loss ----------------------- 21
2.3 Previous Works for Pose-Invariant Face Recognition ------------ 23
2.3.1 Face Frontalization ----------------------------------------- 23
2.3.2 Face Augmentation ------------------------------------------- 24
2.3.3 Pose-Invariant Representation Learning ---------------------- 25
2.3.4 Summary and Discussion -------------------------------------- 26
3 Methodology ----------------------------------------------------- 28
3.1 Hypothesis ---------------------------------------------------- 28
3.2 Proposed Method ----------------------------------------------- 30
3.2.1 Depthwise Residual Module (DRM) ----------------------------- 32
3.2.2 Soft Gate Mechanism ----------------------------------------- 35
3.2.3 Channel Attention Module (CAM) ------------------------------ 37
3.3 Novelties and Contributions ----------------------------------- 39
4 Experiments ----------------------------------------------------- 41
4.1 Datasets ------------------------------------------------------ 41
4.1.1 Training Set ------------------------------------------------ 41
4.1.2 Testing Set ------------------------------------------------- 42
4.2 Implementation Details ---------------------------------------- 44
4.2.1 Data Preprocessing ------------------------------------------ 44
4.2.2 Embedding Network ------------------------------------------- 44
4.2.3 Training Settings ------------------------------------------- 45
4.2.4 Testing Settings -------------------------------------------- 46
4.2.5 Environment Configuration ----------------------------------- 46
4.3 Ablation Study ------------------------------------------------ 47
4.3.1 Examining the Location for PAM ------------------------------ 47
4.3.2 Effectiveness of the Depthwise Convolution ------------------ 49
4.3.3 Effectiveness of the CAM block design ----------------------- 50
4.3.4 Effectiveness of the Soft Gate ------------------------------ 51
4.3.5 Effectiveness of the DRM and CAM ---------------------------- 52
4.4 Comparison with SOTA methods ---------------------------------- 54
4.4.1 Comparison with Open-sourced Face Recognition Models -------- 54
4.4.2 Comparison with DREAM Block --------------------------------- 55
5 Conclusion ------------------------------------------------------ 58
References -------------------------------------------------------- 61