Author (English): Wang, Hsiu-Ping
Thesis Title (English): DECUDA: Discrepancy Estimation for Compressing UDA Models
Advisor (English): Lee, Che-Rung
Oral Defense Committee (English): Wang, Sheng-Jyh
Chen, Hwann-Tzong
Lee, Che-Rung
Keywords (English): Deep Learning; Computer Vision; Model Compression; Domain Adaptation
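Unsupervised domain adaptation (UDA) is a branch of transfer learning. When only some domains have labeled data, UDA applies the information learned from such a domain to another domain that has no labeled data. This technique is very useful in deep learning because it reduces the cost of manual labeling. However, models trained with UDA methods usually have a large number of parameters, and because the target-domain data in the UDA setting are unlabeled, general model compression methods are either infeasible or perform poorly. In this thesis, we propose a method for compressing UDA models based on domain discrepancy estimation. We use a special sampling method and compute the cosine similarity of the model's raw outputs between the sampled instances to estimate the discrepancy between the source and target domains. This estimate is used in the objective function for optimizing the compressed model, in order to reduce the discrepancy between the source and target domains. We apply this procedure in an iterative model pruning method. On the ImageCLEF-DA and Office-31 datasets, our method achieves higher average accuracy than existing methods, and it does not need access to the original model, relying only on the current model.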
Unsupervised domain adaptation (UDA), which transfers knowledge learned from a source domain with well-labeled data to a target domain with unlabeled data, has received wide attention owing to the expense of data labeling. However, many UDA models are large, and the lack of target-domain labels makes most pruning methods inapplicable. In this paper, we propose a UDA compression algorithm called DECUDA (Discrepancy Estimation for Compressing UDA models), which is based on a new domain discrepancy estimation method. DECUDA uses the cosine similarity between the logits of instances from the source domain and the target domain, selected by a sampling technique, as an estimate of the domain discrepancy. The estimated discrepancy is used in the loss function during the fine-tuning step of an iterative pruning algorithm. Compared with methods that fetch extra information from the full-size model, such as knowledge-distillation-based methods, DECUDA relies only on the current pruned model. Our method achieves higher average accuracy than other works on the Office-31 and ImageCLEF-DA datasets.
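The discrepancy estimate described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not specify the sampling technique or the exact form of the loss, so uniform random pairing stands in for the sampling step, the discrepancy is assumed to be one minus the mean cosine similarity, and the weighted sum in `fine_tune_loss` is likewise an assumption. The function names and the `weight` parameter are hypothetical.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length logit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch only
    return dot / (norm_u * norm_v)


def estimate_discrepancy(source_logits, target_logits, num_pairs, rng):
    """Assumed estimator: 1 - mean cosine similarity over sampled pairs.

    Uniform random pairing is a placeholder for the sampling technique
    mentioned in the abstract, whose details are not given there.
    """
    total = 0.0
    for _ in range(num_pairs):
        s = rng.choice(source_logits)  # logits of one source instance
        t = rng.choice(target_logits)  # logits of one target instance
        total += cosine_similarity(s, t)
    return 1.0 - total / num_pairs


def fine_tune_loss(source_cls_loss, discrepancy, weight=1.0):
    """Assumed objective: source classification loss plus weighted discrepancy."""
    return source_cls_loss + weight * discrepancy


if __name__ == "__main__":
    rng = random.Random(0)
    source = [[2.0, 0.1, -1.0], [1.5, 0.0, -0.5]]
    target = [[-1.0, 0.1, 2.0], [-0.5, 0.0, 1.5]]
    d = estimate_discrepancy(source, target, num_pairs=8, rng=rng)
    print(fine_tune_loss(source_cls_loss=0.3, discrepancy=d, weight=0.5))
```

Under these assumptions, logits that point in similar directions across the two domains give an estimate near 0, while opposing logits push it toward 2. Only the current model's outputs are needed, which matches the abstract's claim that no full-size model is consulted.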
Abstract (Chinese)
Abstract
List of Figures
List of Tables
1 Introduction
2 Related Works
2.1 Unsupervised Domain Adaptation
2.2 Deep Network Compression Methods
2.3 Compression for UDA Models
3 DECUDA Algorithm
3.1 Overview of DECUDA Algorithm
3.1.1 Basic Architecture
3.1.2 Iterative Pruning Algorithm
3.1.3 Fine-Tuning
3.2 Domain Discrepancy in UDA Models
3.2.1 Target Classification Loss and Domain Discrepancy
3.2.2 Domain Discrepancy Estimation
3.2.3 Probabilistic Instance Sampling (PIS)
3.2.4 Extra Fine-Tuning with Discrepancy Estimation
4 Experiments
4.1 Experimental Settings
4.2 Implementation Details
4.3 Comparison with Other Methods
4.4 Validation of Estimated Domain Discrepancy
4.4.1 Ablation Study
5 Conclusion and Future Work
References
