
Detailed Record

Author (Chinese): 劉聖鴻
Author (English): Liu, Sheng-Hung
Title (Chinese): 自監督式特徵分離學習與擴增於單領域泛化的遠程心率估計
Title (English): Self-Supervised Disentangled Feature Learning and Augmentation for Single Domain Generalization in rPPG Estimation
Advisor (Chinese): 許秋婷
Advisor (English): Hsu, Chiou-Ting
Committee members (Chinese): 黃敬群、邵皓強
Committee members (English): Huang, Ching-Chun; Shao, Hao-Chiang
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 110062635
Year of publication (ROC calendar): 112 (2023)
Graduation academic year: 112
Language: English
Number of pages: 37
Keywords (Chinese): 遠距光體積變化描記圖、單領域泛化、自監督特徵分離、資料擴增
Keywords (English): Remote photoplethysmography; Single-domain generalization; Self-supervised feature disentanglement; Data augmentation
Remote photoplethysmography (rPPG) offers a non-contact method for monitoring physiological signals from facial videos. However, this technique may struggle to generalize effectively to unseen test domains due to the subtlety of physiological signals. To improve generalization, prior approaches have framed rPPG estimation as a domain generalization problem. However, these methods often require training with data from multiple domains. In this thesis, we introduce a novel Self-Supervised Disentangled feature Augmentation rPPG network (SSDA-rPPGNet) for single-domain generalization, which is more practical in real-world scenarios. Our goal is to disentangle features without relying on domain annotations and to further enhance feature diversity to enable more effective generalization. SSDA-rPPGNet utilizes self-supervised disentangled feature learning to extract both rPPG and non-rPPG features. We then propose two augmentation techniques for the disentangled features, Learnable Feature Transformation (LFT) and Dynamic AdaIN (DAdaIN), to enhance the model's generalization ability. Experimental results on multiple benchmark datasets show that the proposed method significantly outperforms previous approaches.
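The DAdaIN augmentation named in the abstract builds on Adaptive Instance Normalization, which re-styles one feature map with the channel-wise statistics of another. The record does not spell out DAdaIN's dynamic variant, so the following is a minimal sketch of standard AdaIN only, applied to hypothetical disentangled features; the array shapes and variable names are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Standard AdaIN: normalize `content` per channel, then re-scale and
    re-shift it with the channel-wise std and mean of `style`.
    Both inputs have shape (C, N): C channels, N spatial/temporal positions."""
    c_mu = content.mean(axis=1, keepdims=True)
    c_sigma = content.std(axis=1, keepdims=True) + eps   # eps avoids division by zero
    s_mu = style.mean(axis=1, keepdims=True)
    s_sigma = style.std(axis=1, keepdims=True) + eps
    return s_sigma * (content - c_mu) / c_sigma + s_mu

# Hypothetical disentangled features (8 channels, 128 positions).
rng = np.random.default_rng(0)
rppg_feat = rng.normal(0.0, 1.0, size=(8, 128))       # rPPG-related feature
non_rppg_feat = rng.normal(3.0, 2.0, size=(8, 128))   # non-rPPG (appearance/noise) feature

# Augmentation: the rPPG feature now carries the non-rPPG statistics,
# simulating a style/domain shift while keeping its normalized structure.
augmented = adain(rppg_feat, non_rppg_feat)
```

Because AdaIN only swaps first- and second-order channel statistics, the augmented feature keeps the content's normalized temporal structure while taking on the "style" of the other feature, which is why it is a natural tool for simulating unseen domains from a single training domain.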
Abstract i
Acknowledgments
1 Introduction 1
2 Related Work 5
2.1 Remote Photoplethysmography Estimation 5
2.2 Single-Domain Generalization 6
2.3 rPPG Data Augmentation 6
3 Method 8
3.1 Disentangled Feature Learning 9
3.2 Augmentation of rPPG Feature 13
3.3 Augmentation of Non-rPPG Feature 15
3.4 Model Training of SSDA-rPPGNet 18
3.5 Inference Stage 19
4 Experiment 21
4.1 Overview 21
4.2 Datasets 21
4.3 Evaluation Metrics 23
4.4 Implementation Details 24
4.5 Ablation Study 25
4.5.1 t-SNE Visualization 26
4.6 Results and Comparison 29
4.6.1 Cross-Dataset Testing 29
4.6.2 Intra-Dataset Testing 31
5 Conclusion 33
References 34