
Detailed Record

Author (Chinese): 呂政祺
Author (English): Lu, Zheng-Chi
Title (Chinese): 基於生物力學的三維手勢資料增強
Title (English): Biomechanical Augmentation for 3D Hand Pose Estimation
Advisor (Chinese): 陳煥宗
Advisor (English): Chen, Hwann-Tzong
Committee Members (Chinese): 王聖智
李濬屹
Committee Members (English): Wang, Sheng-Jyh
Lee, Chun-Yi
Degree: Master's
University: National Tsing Hua University
Department: Department of Computer Science
Student ID: 109062569
Publication Year (ROC calendar): 112
Academic Year of Graduation: 111
Language: English
Number of Pages: 34
Keywords (Chinese): 手部姿勢估計、資料增強、生物力學
Keywords (English): Hand Pose Estimation, Data Augmentation, Biomechanics
Empowered by deep neural networks, learning-based models have achieved remarkable progress in 3D hand pose estimation. However, these models tend to capture the pose bias of the training dataset, resulting in poor cross-dataset performance or even noticeable degradation on the test split of the training dataset itself. To alleviate the pose bias issue, we propose a novel biomechanical augmentation framework that synthesizes new 2D-3D pose pairs: the synthesized hand samples lie in the biomechanically valid subspace while providing sufficient pose variety. To this end, we first compute biomechanical statistics, namely two types of finger bone rotation angles, from the original training set. Next, we sample from these statistics to generate multiple augmentation parameter candidates and select the final candidate according to our bone-angle-based hand likelihood. Finally, we apply finger bone rotation augmentation to the original hand samples in the training set, with rotation angles specified by the selected candidate, to produce the augmented samples. Extensive experiments on FreiHAND and cross-dataset evaluations on HO-3D and DexYCB demonstrate the effectiveness of our biomechanical augmentation framework: we reduce the pose estimation error of the baseline model by nearly $10\%$ on these datasets. In contrast, the competing method matches our improvement on FreiHAND but yields smaller gains on HO-3D and DexYCB in the cross-dataset setting.
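The sampling-and-selection step described in the abstract can be sketched as follows. This is a minimal illustration with hypothetical function names, using a per-joint Gaussian as a stand-in for the empirical bone-angle statistics and likelihood; the thesis's actual angle parameterization and likelihood may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_angle_stats(train_angles):
    """Per-joint mean and std of bone rotation angles over the training set.

    train_angles: array of shape (N, J), one row of J joint angles per sample.
    """
    return train_angles.mean(axis=0), train_angles.std(axis=0) + 1e-6

def bone_angle_log_likelihood(angles, mu, sigma):
    """Gaussian stand-in for the bone-angle-based hand likelihood."""
    return float(-0.5 * np.sum(((angles - mu) / sigma) ** 2))

def sample_augmentation_params(mu, sigma, lo, hi, n_candidates=8):
    """Draw candidate angle vectors from the empirical statistics, clip them
    to the biomechanically valid range [lo, hi], and return the candidate
    scoring highest under the bone-angle likelihood."""
    cands = np.clip(rng.normal(mu, sigma, size=(n_candidates, mu.size)), lo, hi)
    scores = [bone_angle_log_likelihood(c, mu, sigma) for c in cands]
    return cands[int(np.argmax(scores))]
```

The selected angle vector would then drive the finger bone rotations applied to an original 2D-3D pose pair; that kinematic step depends on the hand model and is omitted here.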
List of Tables 3
List of Figures 4

摘 要 6
Abstract 7

1 Introduction 8

2 Related Work 10
2.1 3D Hand Pose Estimation 10
2.2 Data Augmentation for 3D Human and Hand Pose 11

3 Approach 12
3.1 Problem Formulation 12
3.2 Human Hand Representation 14
3.3 Defining Biomechanically-Valid Range 15
3.4 Determining Augmentation Parameter 17
3.5 Biomechanical Augmentation 20
3.6 Algorithm 20

4 Experiments 22
4.1 Implementation Details 22
4.2 Datasets and Protocols 23
4.3 Evaluation Metric 24
4.4 Results on Benchmark Datasets 25
4.5 Ablation Study 25

5 Conclusion 30

Bibliography 32

 
 
 
 