Author: Chang, Shao-Pin
Thesis title: Extracting Driving Behavior: Global Metric Localization from Dashcam Videos in the Wild
Advisor: Sun, Min
Oral defense committee: Lai, Shang-Hong
Chen, Hwann-Tzong
Keywords: driving behavior; global localization; dashcam
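With the advance of portable cameras, most vehicles are now equipped with dashcams. Our goal is to use dashcam videos downloaded from the internet to extract driving behavior: the global metric localization of 3D vehicle trajectories. We propose an effective method to (1) extract the relative 3D trajectory from a dashcam video, (2) build a global metric 3D map from Google Street View panorama images, and (3) calibrate and align the relative 3D trajectory to the global metric 3D map. We collected 50 videos from 11 cities, each captured at a different time and under different road conditions. In each video, we manually and uniformly select, on average, at least 15 images per road as "control points", which serve as reference annotations of the vehicle trajectory in 3D coordinates. The 3D trajectory extracted from each video is compared against the corresponding control points to compute the error in meters. Our method achieves a median error of only $2.05$ meters, with $85.5\%$ of the control points within 5 meters. It significantly outperforms other vision-based baseline methods and is more accurate than the most widely used consumer-grade Global Positioning System. As a future application, our method could be used to train self-driving cars in simulated environments: the trajectories of various vehicles are first randomly initialized, and reinforcement learning is then used to keep the trajectories from crossing one another, thereby achieving safe simulated driving.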
Given the advance of portable cameras, many vehicles are equipped with always-on cameras on their dashboards (referred to as dashcams). We aim to use these dashcam videos, harvested in the wild, to extract driving behavior: the global metric localization of the 3D vehicle trajectory. We propose a robust approach to (1) extract a relative 3D vehicle trajectory from a dashcam video, (2) create a global metric 3D map using geo-localized Google StreetView RGBD panorama images, and (3) align the relative 3D vehicle trajectory to the 3D map to achieve global metric localization. We conduct an experiment on 50 dashcam videos captured in 11 cities at various times and under various road conditions.
For each video, we uniformly sample at least 15 control points per road segment and manually annotate the ground-truth 3D coordinate of the vehicle at each one. Each extracted 3D trajectory is compared with these manually labeled ground-truth 3D control points to calculate the error in meters.

Our proposed method achieves a median error of $2.05$ meters, and $85.5\%$ of the control points have an error smaller than 5 meters. Our method significantly outperforms other vision-based baseline methods and is more accurate than the most widely used consumer-level Global Positioning System (GPS).
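
The two reported statistics, the median error and the share of control points within 5 meters, can be illustrated with a minimal sketch. This is not the evaluation code used in the thesis: the function names and the toy coordinates are hypothetical, and the sketch assumes that the estimated positions and the control points are already paired one-to-one and expressed in the same metric 3D frame.

```python
import math
from statistics import median


def control_point_errors(estimated, ground_truth):
    """Euclidean distance (meters) between each estimate and its control point."""
    if len(estimated) != len(ground_truth):
        raise ValueError("estimates and control points must be paired one-to-one")
    return [math.dist(e, g) for e, g in zip(estimated, ground_truth)]


def summarize(errors, threshold_m=5.0):
    """Return (median error, fraction of errors strictly below threshold_m)."""
    within = sum(1 for e in errors if e < threshold_m)
    return median(errors), within / len(errors)


# Toy data (hypothetical): four control points along a straight road, in meters.
ground_truth = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
estimated = [(1.0, 0.0, 0.0), (10.0, 2.0, 0.0), (20.0, 0.0, 3.0), (30.0, 6.0, 0.0)]

errors = control_point_errors(estimated, ground_truth)
median_error, fraction_within = summarize(errors)
print(f"median error: {median_error:.2f} m, within 5 m: {fraction_within:.1%}")
```

The median is less sensitive than the mean to a few badly localized control points, which is one plausible reason to report it alongside a fixed-threshold percentage.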
Declaration
Acknowledgements
Abstract (in Chinese)
Abstract
1 Introduction
1.1 Motivation and Problem Description
1.2 Main Contribution
1.3 Related Work
1.4 Thesis Structure
2 Preliminaries
2.1 Handcrafted Feature
2.1.1 Scale-Invariant Feature Transform (SIFT)
2.1.2 Histogram of Oriented Gradients (HOG)
2.2 Deep Convolutional Matching (DeepMatching)
2.3 RANdom SAmple Consensus (RANSAC)
2.4 Structure from Motion (SfM)
3 Our Method
3.1 Extracting 3D Vehicle Trajectory
3.2 Creating Global Metric 3D Map
3.3 Aligning 3D Trajectory to 3D Map
4 Experiments
4.1 Introduction
4.2 Implementation Details
4.2.1 Baseline Methods
4.3 Experimental Results
5 Conclusion
References
