Author: Ku, Chun-Chieh
Thesis title: Boosting Unsupervised Domain Adaptation for 3D Object Detection in Point Clouds via 2D Image Semantic Information
Advisor: Lai, Shang-Hong
Committee members: Hsu, Chui-Ting; Lin, Huei-Yung; Chiang, Chen-Kuo
Keywords: Deep learning; 3D object detection; Unsupervised domain adaptation
Both 3D and RGB-D data can be used for 3D object detection, yet there is a significant geometric bias between the two data representations owing to their different reconstruction procedures.
This geometric bias degrades performance in cross-domain testing; hence we propose an unsupervised domain adaptation (UDA) framework that leverages annotated data in different data formats for indoor 3D object detection.
Our method inverse-projects the pixel-wise semantic labels predicted from 2D images onto point clouds for object detection and UDA in both directions.
For the more challenging UDA direction, from 3D to RGB-D data, we additionally align the features extracted from the two domains through adversarial training to reduce the domain gap.
Our method reduces the domain gap between the two types of data and leverages the semantic label information predicted from 2D RGB images to boost the accuracy of the 3D object detection model.
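The inverse-projection step mentioned above can be illustrated with a minimal sketch. It assumes a pinhole camera and a depth map that is pixel-aligned with the predicted label map; the function name `inverse_project`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the toy values are hypothetical and are not taken from the thesis, whose actual procedure is given in Section 3.1.

```python
from typing import List, Tuple

# One labeled 3D point in camera coordinates: (x, y, z, semantic_label).
Point = Tuple[float, float, float, int]


def inverse_project(
    depth: List[List[float]],
    labels: List[List[int]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Lift each pixel with valid depth into 3D and attach its 2D label.

    Assumes a pinhole camera model (an illustrative assumption, not the
    thesis's stated setup): x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    """
    points: List[Point] = []
    for v, (depth_row, label_row) in enumerate(zip(depth, labels)):
        for u, (z, label) in enumerate(zip(depth_row, label_row)):
            if z <= 0.0:  # non-positive depth is treated as a missing measurement
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z, label))
    return points


# Toy 2x2 depth map (metres) and per-pixel semantic labels.
depth = [[2.0, 0.0], [2.0, 4.0]]
labels = [[1, 1], [3, 5]]
cloud = inverse_project(depth, labels, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

Each resulting point carries the semantic label of the pixel it came from, so a point-cloud detector could consume the label as an extra per-point feature.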
To our knowledge, there are no prior works or common benchmarks on unsupervised domain adaptation for indoor 3D object detection. Thus, we validate our approach on ScanNet and SUN RGB-D, two datasets widely used for indoor 3D object detection, using them as the source and target datasets in both directions of domain adaptation. Compared with a baseline that applies no domain adaptation, the proposed method improves mAP@0.25 by 6.4% and 10.3% for the two directions of cross-dataset testing.
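For readers unfamiliar with the metric, mAP@0.25 counts a predicted box as correct when its 3D intersection-over-union (IoU) with a ground-truth box of the same class is at least 0.25. The sketch below shows only that matching criterion, under the simplifying assumption of axis-aligned boxes; the names `iou_3d` and `is_true_positive` are hypothetical, and oriented boxes would need a rotated-overlap computation instead.

```python
from typing import Tuple

# Axis-aligned box: (xmin, ymin, zmin, xmax, ymax, zmax). Simplifying assumption.
Box = Tuple[float, float, float, float, float, float]


def volume(box: Box) -> float:
    """Volume of an axis-aligned box."""
    return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])


def iou_3d(a: Box, b: Box) -> float:
    """3D intersection-over-union of two axis-aligned boxes."""
    inter = 1.0
    for axis in range(3):
        overlap = min(a[axis + 3], b[axis + 3]) - max(a[axis], b[axis])
        if overlap <= 0.0:  # boxes do not overlap along this axis
            return 0.0
        inter *= overlap
    return inter / (volume(a) + volume(b) - inter)


def is_true_positive(pred: Box, gt: Box, threshold: float = 0.25) -> bool:
    """A prediction matches a ground-truth box when IoU >= threshold."""
    return iou_3d(pred, gt) >= threshold


gt_box: Box = (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)
near_box: Box = (1.0, 0.0, 0.0, 3.0, 2.0, 2.0)  # IoU = 4 / 12
far_box: Box = (1.5, 0.0, 0.0, 3.5, 2.0, 2.0)   # IoU = 2 / 14
```

Average precision is then computed per class from the ranked true and false positives and averaged over classes; that part is omitted here.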
1 Introduction
1.1 Motivation
1.2 Problem Statement
1.3 Contributions
1.4 Thesis Organization
2 Related Work
2.1 Point-based 3D Object Detection
2.2 3D Object Detection and Semantic Segmentation
2.3 Unsupervised Domain Adaptation
3 Proposed Method
3.1 Inverse-Projection of 2D Semantic Labels
3.2 3D Object Detection Branch
3.3 Adversarial Training
3.4 Loss Functions
4 Experiments
4.1 Dataset
4.2 Evaluation Metrics
4.3 Implementation Details
4.4 Experimental Results
4.4.1 Unsupervised Domain Adaptation Results on 3D Object Detection
4.4.2 Within-domain Results on 3D Object Detection
4.5 Ablations
4.5.1 Discussion of Frame Selection
4.5.2 Contribution of Adversarial Training
4.5.3 Discussion of Input Features of Domain Classifier
4.6 Analysis
4.6.1 Visualization of Unsupervised Domain Adaptation Detection Results
4.6.2 Visualization of Within-domain Detection Results
4.6.3 Failure Cases Study
5 Conclusions
References
