Simplified Regression for Human Pose Estimation (National Tsing Hua University Electronic Theses and Dissertations System)

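Author (Chinese):	陳昱甫
Author (English):	Chen, Yu Fu
Thesis title (Chinese):	用於人體姿勢估測之簡化回歸方法
Thesis title (English):	Simplified Regression for Human Pose Estimation
Advisor (Chinese):	陳煥宗
Advisor (English):	Chen, Hwann Tzong
Committee members (Chinese):	賴尚宏、劉庭錄
Committee members (English):	Lai, Shang Hong; Liu, Tyng Luh
Degree:	Master's
Institution:	National Tsing Hua University
Department:	Department of Computer Science
Student ID:	103062610
Year of publication (ROC calendar):	105 (2016)
Academic year of graduation:	104
Language:	English, Chinese
Number of pages:	27
Keywords (Chinese):	捲積類神經網路、人體姿勢估測
Keywords (English):	Convolutional Neural Network, Human Pose Estimation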

We present a two-stage deep convolutional neural network for human pose estimation. The first stage extracts features directly from the input image and combines them into a compact yet effective prediction of the keypoint locations, rather than producing one heatmap per keypoint. The second stage takes the input image together with synthetic heatmaps derived from the first-stage output and produces a refined pose estimate. We evaluate our method on two datasets, FLIC and LSP. Our method achieves state-of-the-art performance on the FLIC dataset.
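The abstract does not say how the synthetic heatmaps are built or how they are combined with the image. As a minimal sketch only, assuming one Gaussian bump per predicted keypoint and simple channel stacking, the second-stage input might be assembled as follows. The function names, the Gaussian form, and the `sigma` value are all hypothetical and are not taken from the thesis.

```python
import math

def synthetic_heatmap(x, y, width, height, sigma=1.5):
    """Render one keypoint as a 2-D Gaussian bump (assumed form, not from the thesis)."""
    return [
        [math.exp(-((col - x) ** 2 + (row - y) ** 2) / (2.0 * sigma ** 2))
         for col in range(width)]
        for row in range(height)
    ]

def second_stage_input(image_channels, keypoints, sigma=1.5):
    """Stack the image channels with one synthetic heatmap per keypoint (assumed layout)."""
    height = len(image_channels[0])
    width = len(image_channels[0][0])
    heatmaps = [synthetic_heatmap(x, y, width, height, sigma) for x, y in keypoints]
    return image_channels + heatmaps

# Toy example: a 3-channel 8x8 "image" and two first-stage keypoint predictions (x, y).
image = [[[0.0] * 8 for _ in range(8)] for _ in range(3)]
stacked = second_stage_input(image, [(2.0, 3.0), (5.5, 1.0)])
print(len(stacked), len(stacked[0]), len(stacked[0][0]))  # channels, height, width
```

In this toy setup the stacked input has five channels: three from the image and one per keypoint. A real pipeline would use array tensors and the network's actual input resolution.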

1 Introduction
2 Related Work
2.1 Human Pose Estimation
2.2 YOLO
3 Simplified Regression for Human Pose Estimation
3.1 The First Stage
3.1.1 Network
3.1.2 Training
3.1.3 Inference
3.2 The Second Stage
3.2.1 Network
3.2.2 Training
3.2.3 Inference
4 Experiments
4.1 Dataset
4.2 Evaluation Metrics
4.3 Results
4.3.1 FLIC
4.3.2 LSP
5 Conclusion

