Author: Chang, Wen-Yen
Thesis title: Enhance data selection efficiency with variational auto-encoder for object detection’s active learning
Advisor: Sun, Min
Oral defense committee: Wang, Yu-Chiang; Chen, Hwann-Tzong
Keywords: Object Detection; Active Learning; Computer Vision; Variational Auto-encoder
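Compared with diversity-based and uncertainty-based selection, our hybrid selection strategy performs stably across environments. We rely on the way the uncertainty selection strategy scores images, but we dynamically adjust the score weights to avoid the unnecessary cost of selecting similar data. First, we apply k-means clustering to the image distribution described by the variational auto-encoder to obtain pseudo-labels for similar images. Then, using the pseudo-labels and the images already selected, we lower the selection weight of images that share a class with already-selected data. We repeat these steps until a fixed number of images has been selected for annotation, and then add them to the training set to train our object detection model. We run experiments in four different environments to verify that our hybrid strategy is effective and robust, and we compare the suitability of each method for each environment and give usage recommendations. Our method can speed up the construction of object detection systems and data collection for surveillance-camera applications. In most environments, a model trained on only 30% of the data reaches 90% of the performance of a model trained on the full dataset.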
We apply pool-based active learning to object detection on surveillance video. In the pool-based setting, each selection iteration picks one batch of images under a budget limit. Our method uses a variational auto-encoder (VAE) to enhance the diversity of the selection strategy. Compared with uncertainty-only and diversity-only selection, our hybrid strategy performs robustly across different environments. It relies on an uncertainty selection strategy to score images, identifying those that are more valuable to label. In addition, we dynamically re-weight each image's uncertainty score to avoid selecting similar data, which would give the object detection model redundant information. First, we cluster the VAE latent space with k-means to assign pseudo-labels to similar images. Second, we re-weight the uncertainty scores of similar images according to the number of already-selected images with the same pseudo-label. Third, we select the most informative image, the one with the highest re-weighted uncertainty score, for the annotator to label. We repeat these steps until the batch reaches its budget limit, and then add the batch to the object detector's training data. We run four experiments to validate that data selection with our method is more efficient and robust. We also summarize recommendations on which method to use in which environment. Finally, our method can shorten the build-up time and the data collection of a surveillance system. In most environments, training on only 30% of the data yields a model that reaches 90% of the performance obtained with the entire dataset.
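The selection loop described in the abstract can be sketched as follows. This is a minimal illustration, not the thesis implementation: it assumes the per-image uncertainty scores and the k-means pseudo-labels have already been computed, and the function name `select_batch` and the re-weighting rule `score / (1 + n)` are hypothetical choices, since the abstract does not state the exact formula.

```python
from collections import Counter


def select_batch(uncertainty, pseudo_labels, budget):
    """Greedy batch selection with per-cluster uncertainty re-weighting.

    uncertainty   -- per-image uncertainty scores (higher = more uncertain)
    pseudo_labels -- per-image cluster ids, e.g. from k-means on VAE latent codes
    budget        -- number of images to select for this batch
    """
    if len(uncertainty) != len(pseudo_labels):
        raise ValueError("uncertainty and pseudo_labels must have equal length")

    picked_per_cluster = Counter()          # images already selected per cluster
    remaining = set(range(len(uncertainty)))
    selected = []

    while remaining and len(selected) < budget:
        # ASSUMPTION: down-weight a score by 1 / (1 + number of images already
        # selected from the same cluster). The source only says the weight
        # depends on that count; the exact rule here is illustrative.
        def reweighted(i):
            return uncertainty[i] / (1 + picked_per_cluster[pseudo_labels[i]])

        # Take the top-1 re-weighted score; ties go to the lower index.
        best = max(sorted(remaining), key=reweighted)
        selected.append(best)
        remaining.remove(best)
        picked_per_cluster[pseudo_labels[best]] += 1

    return selected
```

With uncertainty scores `[0.9, 0.8, 0.7, 0.3]`, pseudo-labels `[0, 0, 1, 1]`, and a budget of 2, this sketch returns indices `[0, 2]`, whereas plain top-2 uncertainty selection would return `[0, 1]`, both from cluster 0.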
1 Introduction
2 Related Work
2.1 Object Detection
2.1.1 Anchor-design: Anchor-free vs. Anchor-based
2.1.2 Detector Architecture Selection: One/Two-stage Detector
2.2 Image Uncertainty Estimation
2.2.1 Valuable Objects Estimation
2.2.2 Valuable Image Estimation
2.3 Image Diversity Estimation
3 Method
3.1 Overview
3.2 Uncertainty Re-weighting
3.2.1 Overview
3.2.2 Image Uncertainty Estimation
3.2.3 Similar Image Clustering
3.2.4 Uncertainty Re-weighting
4 Experiments
4.1 Experiments Setting
4.1.1 Dataset
4.1.2 Classification Setting
4.1.3 Object Detection Setting
4.2 Proposed Methods Comparison
4.3 Ablation Study
4.3.1 Uncertainty Ablation Study
4.3.2 Diversity Ablation Study
4.4 Quantitative Result
4.4.1 Quantitative comparison with AUC of the mAP-labeled
4.4.2 Quantitative comparison with curve of mAP-labeled
4.4.3 Discussion
4.5 Qualitative Result
4.5.1 VAE Clustering Property
4.5.2 Object Detection Visualize
5 Conclusion
6 Future Work
A Method
A.1 Heapified Policy
A.1.1 Overview
A.1.2 Observation Feature Designing
A.1.3 Heapified Policy
B Dataset Statistic
C Failure Case Analysis
C.1 Uncertainty selection failure case
C.2 The effect of slight-ego-motion for motion selection
References
