
Detailed Record

Author (Chinese): 孫昕霈
Author (English): Sun, Hsin-Pei
Title (Chinese): 在不同頭部姿勢下的視線估測
Title (English): Gaze Estimation under Head Pose Variations
Advisor (Chinese): 賴尚宏
Advisor (English): Lai, Shang-Hong
Committee (Chinese): 王聖智, 許秋婷, 陳煥宗
Committee (English): Wang, Sheng-Jyh; Hsu, Chiou-Ting; Chen, Hwann-Tzong
Degree: Master's
University: National Tsing Hua University (國立清華大學)
Department: Department of Computer Science (資訊工程學系)
Student ID: 103062562
Publication Year (ROC): 105 (2016)
Graduation Academic Year: 105
Language: English
Number of Pages: 35
Keywords (Chinese): 視線估測, 深度學習, 人機互動, 電腦視覺
Keywords (English): Gaze Estimation, Deep Learning, Human-Computer Interaction, Computer Vision
Statistics:
  • Recommendations: 0
  • Views: 830
  • Rating: *****
  • Downloads: 18
  • Bookmarks: 0
Chinese Abstract (translated): This thesis proposes a gaze estimation system that, given images of the eyes, estimates where a person is currently looking, across different users and different head poses. This research supports the development of human-computer interaction modes beyond touch or motion-sensing control.
In our system, to make it usable by different subjects, we train on the UT Multiview dataset, which covers many subjects and a wide range of head poses, so that the system learns a variety of head movements. In addition, we build a 3D face model to estimate the head pose and obtain the 3D rotation information. This lets the entire appearance-based gaze estimation pipeline run with a single camera, broadening the range and generality of its applications.
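In practice, the single-camera head-pose step first fits the generic 3D face model to detected 2D landmarks (e.g., with a PnP solver) to obtain a rotation matrix, and then extracts pose angles from that matrix. The angle-extraction step can be sketched as below; the axis ordering and function name are illustrative assumptions, not the thesis' exact formulation.

```python
import math

def rotation_to_yaw_pitch_roll(R):
    """Recover (yaw, pitch, roll) in radians from a 3x3 rotation
    matrix R given as nested lists, assuming the common composition
    R = Rz(roll) @ Ry(yaw) @ Rx(pitch). A different axis convention
    would change the formulas below.
    """
    yaw = math.asin(-R[2][0])              # rotation about the y-axis
    pitch = math.atan2(R[2][1], R[2][2])   # rotation about the x-axis
    roll = math.atan2(R[1][0], R[0][0])    # rotation about the z-axis
    return yaw, pitch, roll
```

The decomposition assumes the yaw stays away from ±90°, which holds for the frontal-to-moderate head poses a gaze tracker typically handles.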
We cast gaze estimation as a regression problem and adopt the deep learning architectures that have become popular in recent years. However, most gaze estimation algorithms determine where a person is looking from the pupil position under a fixed head pose. Such methods do not fit everyday scenarios such as watching television, where the gaze follows head rotation as objects move or as the person changes position. Therefore, to handle the changes in eye appearance caused by head movement, we train separate deep networks for local head-pose regions to estimate the gaze direction.
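Training one network per local head-pose region implies a routing step at test time: the estimated head pose selects which regional CNN performs the gaze regression. A minimal sketch of that routing follows; the bin edges are made up for illustration and are not the thesis' actual partition of the pose space.

```python
import bisect

# Hypothetical grid of local head-pose regions (degrees). Each cell of
# the grid would own its own trained CNN gaze regressor.
YAW_EDGES = [-30.0, -10.0, 10.0, 30.0]
PITCH_EDGES = [-20.0, 0.0, 20.0]

def pose_region(yaw_deg, pitch_deg):
    """Map an estimated head pose to the (yaw, pitch) grid index of
    the region-specific gaze network that should handle it. Poses
    outside the grid are clamped to the nearest edge region."""
    yi = min(max(bisect.bisect_right(YAW_EDGES, yaw_deg), 1),
             len(YAW_EDGES) - 1) - 1
    pi = min(max(bisect.bisect_right(PITCH_EDGES, pitch_deg), 1),
             len(PITCH_EDGES) - 1) - 1
    return yi, pi
```

At estimation time the selected index would look up the corresponding trained network, e.g. `networks[pose_region(yaw, pitch)]`, where `networks` is a hypothetical dict of regressors.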
Our experiments show that this approach effectively solves gaze estimation under varying head poses, with improvements in both training time and estimation performance.
English Abstract: In this thesis, we propose a new gaze estimation algorithm that estimates where a user looks from eye images. The proposed algorithm uses multiple convolutional neural networks (CNNs) as regression networks that estimate gaze angles from eye images. Because it explicitly uses head pose information in the estimation framework, it provides accurate gaze estimation for users with different head poses. To achieve a person-independent system, we train the deep CNN regression networks on the UT Multiview dataset, which contains a large number of subjects with large head pose variations. In addition, we estimate the head pose from the 2D face image and a generic 3D face model, which allows the proposed algorithm to be widely applied to appearance-based gaze estimation in practice. Our experimental results show that the proposed system improves the accuracy of appearance-based gaze estimation under head pose variations compared to previous methods.
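The gaze angles regressed by the CNNs are typically converted back into a 3D line of sight for evaluation or for intersecting with a screen. The following is a sketch of the angle-to-vector conversion common in appearance-based gaze work such as the UT Multiview tooling; the axis and sign conventions here are assumptions and may differ from the thesis' definitions.

```python
import math

def gaze_angles_to_vector(pitch, yaw):
    """Convert gaze (pitch, yaw) in radians into a unit 3D direction
    in camera coordinates. With this convention, (0, 0) looks straight
    at the camera along -z."""
    return (
        -math.cos(pitch) * math.sin(yaw),  # x: horizontal component
        -math.sin(pitch),                  # y: vertical component
        -math.cos(pitch) * math.cos(yaw),  # z: toward the camera
    )
```

Because the two angles parameterize a point on the unit sphere, the returned vector always has unit length, so angular estimation error can be measured directly as the angle between predicted and ground-truth vectors.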
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Problem Description 1
1.3 Main Contribution 2
1.4 Thesis Organization 3
Chapter 2 Literature Review 5
2.1 Model-based Gaze Estimation 5
2.2 Appearance-based Gaze Estimation Method 6
2.3 Convolutional Neural Networks 7
Chapter 3 Proposed System 9
3.1 System Overview 9
3.2 Image Patch Normalization 10
3.3 Head Pose Estimation 11
3.4 Our CNN Architecture 14
3.5 Training Phase 16
3.6 Estimation Phase 18
Chapter 4 Experiment 20
4.1 Camera Calibration and Coordinate System Transformation 20
4.2 Gaze Angle Procedure 22
4.3 Our Data Collection 23
4.4 Different Structures of Our Proposed Method 24
4.5 Comparison with Baseline Method 28
4.6 Cross-subject Experiment in UT Multiview Dataset 30
Chapter 5 Conclusion 31
References 32

 
 
 
 