Detailed Record

Author (Chinese): 張劭平
Author (English): Chang, Shao-Pin
Title (Chinese): 基於行車紀錄器提取駕駛行為-行車全球座標定位
Title (English): Extracting Driving Behavior: Global Metric Localization from Dashcam Videos in the Wild
Advisor (Chinese): 孫民
Advisor (English): Sun, Min
Committee Members (Chinese): 賴尚宏、陳煥宗
Committee Members (English): Lai, Shang-Hong; Chen, Hwann-Tzong
Degree: Master's
University: National Tsing Hua University
Department: Department of Electrical Engineering
Student ID: 104061554
Publication Year (ROC calendar): 106 (2017)
Graduation Academic Year: 106
Language: English
Number of Pages: 34
Keywords (Chinese): 駕駛行為、全球定位、行車紀錄器
Keywords (English): driving behavior, global localization, dashcam
Abstract (Chinese): Given the advance of portable cameras, most vehicles today are equipped with dashboard video recorders (dashcams). Our goal is to use dashcam videos downloaded from the Internet to extract driving behavior, namely the global metric localization of the 3D vehicle trajectory. We propose an effective approach to (1) extract a relative 3D trajectory from a dashcam video, (2) build a global metric 3D map from Google StreetView panorama images, and (3) align the relative 3D trajectory to the global metric 3D map. We collected 50 videos from 11 cities, each captured at a different time and under different road conditions. For each video, we manually and uniformly select at least 15 frames per road segment as "control points", which serve as reference annotations of the vehicle trajectory in 3D coordinates. The 3D trajectory extracted from each video is compared against its corresponding control points to compute the error in meters. Our method achieves a median error of only $2.05$ meters, with $85.5\%$ of the control points within 5 meters. It significantly outperforms other vision-based baseline methods and is more accurate than the most widely used consumer-level GPS. As a future application, our method could support training self-driving vehicles in simulated environments: vehicle trajectories are randomly initialized, and reinforcement learning is used to keep the trajectories from crossing one another, thereby simulating safe driving.
Abstract (English): Given the advance of portable cameras, many vehicles are equipped with always-on cameras on their dashboards (referred to as dashcams). We aim to utilize these dashcam videos harvested in the wild to extract driving behavior, namely the global metric localization of the 3D vehicle trajectory. We propose a robust approach to (1) extract a relative vehicle 3D trajectory from a dashcam video, (2) create a global metric 3D map using geo-localized Google StreetView RGBD panorama images, and (3) align the relative vehicle 3D trajectory to the 3D map to achieve global metric localization. We conduct an experiment on 50 dashcam videos captured in 11 cities at various times and under various road conditions. For each video, we uniformly sample at least 15 control points per road segment and manually annotate the ground-truth 3D coordinates of the vehicle. Each extracted 3D trajectory is compared with these manually labeled ground-truth 3D control points to calculate the error in meters.

Our proposed method achieves a median error of $2.05$ meters, and $85.5\%$ of the control points have an error smaller than 5 meters. Our method significantly outperforms other vision-based baseline methods and is more accurate than the most widely used consumer-level Global Positioning System (GPS).
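The thesis body is not part of this record, but the evaluation protocol the abstract describes, aligning an up-to-scale SfM trajectory to metric ground truth and reporting per-point errors in meters, can be sketched in a few lines. The snippet below is a minimal illustration under stated assumptions, not the thesis's implementation: it assumes trajectory points and control points are already in one-to-one correspondence, converts annotated latitude/longitude to a local metric frame with an equirectangular approximation, and uses the closed-form Umeyama similarity transform as a stand-in for the alignment step (the thesis aligns to a StreetView-derived 3D map); all function names here are hypothetical.

```python
import numpy as np

EARTH_RADIUS_M = 6378137.0  # WGS-84 equatorial radius

def latlng_to_local_xy(latlng_deg, origin_deg):
    """Project (lat, lng) pairs in degrees to metric x/y offsets from an
    origin, via a local equirectangular approximation (adequate over the
    few-kilometre extent of a single road segment)."""
    lat, lng = np.radians(np.asarray(latlng_deg)).T
    lat0, lng0 = np.radians(origin_deg)
    x = (lng - lng0) * EARTH_RADIUS_M * np.cos(lat0)
    y = (lat - lat0) * EARTH_RADIUS_M
    return np.stack([x, y], axis=1)

def umeyama_alignment(src, dst):
    """Closed-form similarity transform (scale s, rotation R, translation t)
    minimizing ||dst - (s * R @ src + t)||^2 over corresponding (N, 3)
    point sets (Umeyama, 1991)."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(src.shape[1])
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[-1, -1] = -1.0  # reflection fix: keep R a proper rotation
    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_d - s * R @ mu_s
    return s, R, t

def localization_errors(trajectory, control_points):
    """Align an up-to-scale 3D trajectory to metric control points and
    return the per-point error in meters."""
    s, R, t = umeyama_alignment(trajectory, control_points)
    aligned = s * trajectory @ R.T + t
    return np.linalg.norm(aligned - control_points, axis=1)
```

Under these assumptions, the two numbers reported above would correspond to `np.median(err)` and `(err < 5.0).mean()` computed over the control points of all 50 videos.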
Table of Contents:
Declaration
誌謝 (Acknowledgements)
摘要 (Abstract in Chinese)
Abstract
1 Introduction
  1.1 Motivation and Problem Description
  1.2 Main Contribution
  1.3 Related Work
  1.4 Thesis Structure
2 Preliminaries
  2.1 Handcrafted Features
    2.1.1 Scale-Invariant Feature Transform (SIFT)
    2.1.2 Histogram of Oriented Gradients (HOG)
  2.2 Deep Convolutional Matching (DeepMatching)
  2.3 RANdom SAmple Consensus (RANSAC)
  2.4 Structure from Motion (SfM)
3 Our Method
  3.1 Extracting 3D Vehicle Trajectory
  3.2 Creating Global Metric 3D Map
  3.3 Aligning 3D Trajectory to 3D Map
4 Experiments
  4.1 Introduction
  4.2 Implementation Details
    4.2.1 Baseline Methods
  4.3 Experimental Results
5 Conclusion
References