Detailed Record

Author (Chinese): 李翊安
Author (English): Lee, Yi-An
Title (Chinese): 空間注意力機制同步化影片生成和遠程光體積變化描記圖法
Title (English): Simultaneous Video Generation and Remote Photoplethysmography Estimation with Spatial Attention Mechanism
Advisor (Chinese): 許秋婷
Advisor (English): Hsu, Chiou-Ting
Committee Members (Chinese): 王聖智、邵皓強
Committee Members (English): Wang, Sheng-Jyh; Shao, Hao-Chiang
Degree: Master's
Institution: National Tsing Hua University (國立清華大學)
Department: Computer Science (資訊工程學系)
Student ID: 107062576
Publication Year (ROC calendar): 109 (2020)
Graduation Academic Year: 109
Language: English
Pages: 37
Keywords (Chinese): 影片生成、遠程光體積變化描記圖法、計算機視覺、深度學習、人工智慧、心率偵測、生醫影像、資料強化、注意力機制
Keywords (English): Video generation; Remote photoplethysmography; Computer vision; Deep learning; Artificial intelligence; Heart rate estimation; Biomedical imaging; Data augmentation; Attention mechanism
Abstract (Chinese, translated): Remote photoplethysmography (rPPG) is a technique for measuring biomedical signals. By analyzing optical videos of human skin, rPPG can extract pulse information from the human body; compared with traditional heart-rate monitoring tools, it is non-invasive and contact-free. With the rapid development of deep learning and convolutional neural networks (CNNs) in recent years, these techniques have been applied to rPPG estimation. Nevertheless, accurately measuring rPPG remains very difficult. We argue that data augmentation has the potential to improve rPPG estimation accuracy. In this thesis, we propose two deep networks, an rPPG estimation network and an rPPG synthesizing network, which respectively estimate rPPG from facial videos and generate synthetic videos for training. Moreover, the attention module we design within the rPPG estimation network not only improves the training of the estimation network but also reduces artifacts in the synthesized videos. Experimental results show that our method generates nearly artifact-free synthetic videos that are almost indistinguishable from real ones, thereby improving the rPPG estimation network, and our results surpass all existing methods.
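To ground the pipeline the abstract describes, the following is a minimal rPPG baseline sketch in Python. It is not the network proposed in this thesis; it is the classic green-channel baseline: spatially average a skin region per frame, band-pass the trace to the human pulse band, and read the heart rate from the dominant spectral peak. The function name and the `frames`/`fps` inputs are illustrative assumptions.

```python
# A minimal rPPG baseline sketch, NOT the thesis's method: the classic
# green-channel approach. Assumes `frames` is a sequence of RGB skin/face
# crops of shape (H, W, 3) sampled at a constant frame rate `fps`.
import numpy as np
from scipy.signal import butter, filtfilt

def estimate_heart_rate_bpm(frames, fps):
    # 1) Spatially average the green channel per frame -> raw 1-D trace.
    trace = np.array([frame[..., 1].mean() for frame in frames], dtype=np.float64)
    trace -= trace.mean()

    # 2) Band-pass to the plausible human pulse band, 0.7-4.0 Hz (42-240 bpm).
    low, high = 0.7, 4.0
    b, a = butter(3, [low / (fps / 2.0), high / (fps / 2.0)], btype="band")
    pulse = filtfilt(b, a, trace)

    # 3) Heart rate = dominant spectral peak inside the pulse band, in bpm.
    freqs = np.fft.rfftfreq(len(pulse), d=1.0 / fps)
    power = np.abs(np.fft.rfft(pulse)) ** 2
    in_band = (freqs >= low) & (freqs <= high)
    return 60.0 * freqs[in_band][np.argmax(power[in_band])]
```

Note that at 30 fps a 300-frame (10 s) window gives a spectral resolution of 0.1 Hz, about 6 bpm, so practical systems interpolate around the peak or track the estimate over time.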
Abstract (English): Remote photoplethysmography (rPPG) is a non-invasive method for estimating biomedical signals from optically obtained videos of human skin. With the rapid progress of deep learning and convolutional neural networks (CNNs), CNN-based learning frameworks have proven effective for estimating such signals. Because accurate rPPG estimation requires a large amount of training data, we believe data augmentation has great potential for improving performance. In this thesis, we propose two serial, jointly trained frameworks, an rPPG estimation network and an rPPG synthesizing network, to estimate rPPG signals from face videos and to generate synthetic data for augmentation, respectively. Additionally, we design novel attention modules within the rPPG estimation network to boost the training process and alleviate artifacts in the synthesized videos. Experimental results on benchmark datasets show that our method generates realistic videos that improve the proposed model and achieves state-of-the-art performance.
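The record does not spell out the proposed attention design, so the sketch below is only a generic spatial-attention block in PyTorch illustrating the mechanism the title names: predict a per-pixel mask from a feature map and re-weight spatial locations, letting an rPPG estimator emphasize pulse-bearing skin regions. The class name, the 1x1-convolution scoring, and the mean normalization are assumptions, not the thesis's module.

```python
# A generic spatial-attention block sketch in PyTorch; it shows the
# mechanism only and does not reproduce the thesis's actual module design.
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convolution collapses the features to one attention logit per pixel.
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W) feature map, e.g. from an rPPG estimation backbone.
        mask = torch.sigmoid(self.score(x))  # (N, 1, H, W), values in [0, 1]
        # Normalize so the average weight is 1, keeping feature magnitudes
        # stable while still redistributing emphasis across locations.
        mask = mask / (mask.mean(dim=(2, 3), keepdim=True) + 1e-6)
        return x * mask
```

For example, `SpatialAttention(64)(torch.randn(2, 64, 36, 36))` returns a tensor of the same shape with spatially re-weighted activations; in an rPPG setting such a mask tends to concentrate on skin regions with strong pulsatile color change.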
Table of Contents

Acknowledgements
Chinese Abstract ... i
Abstract ... ii
1 Introduction ... 1
2 Related Work ... 5
2.1 Remote Photoplethysmography Estimation ... 5
2.2 Data Augmentation ... 6
2.3 Attention Mechanism ... 6
3 Proposed Method ... 8
3.1 Overview ... 8
3.2 rPPG Estimation Network ... 9
3.3 rPPG Attention Module ... 10
3.4 rPPG Synthesizing Network ... 12
3.5 Overall Framework ... 16
4 Experiment ... 17
4.1 Overview ... 17
4.2 Datasets ... 17
4.3 Evaluation Metrics ... 18
4.4 Implementation Settings ... 19
4.5 Network Architecture ... 19
4.6 Ablation Study ... 20
4.7 Result and Comparison ... 24
4.8 Visualization ... 29
5 Conclusion ... 33
References ... 34