Author: Tsai, Yun-Lin
Thesis Title: Dynamic Pruning and Expansion for Federated Multitask Learning
Advisor: Hong, Yao-Win Peter
Committee Members: Chen, Chu-Song; Wang, I-Hsiang
Keywords: Federated Multitask Learning; Neural Network Pruning; Neural Network Expansion
This thesis proposes a dynamic pruning and expansion (DyPE) technique for the federated learning of multiple diverse local tasks. The technique enables local devices to tailor their models to their own tasks while still benefiting from knowledge transfer through shared model parameters. This differs from most existing work on federated learning, which assumes that all local devices use a common model. In DyPE, local pruning eliminates parameters that are less relevant to the local task, reducing interference from other tasks, whereas local splitting-based expansion generates sub-models that capture task-specific knowledge. The proposed method is also communication-efficient, since only the shared model parameters need to be exchanged between the central server and the local devices in each training round. The effectiveness of DyPE is demonstrated through experiments on real-world datasets, in comparison with current state-of-the-art methods for federated multitask learning. The results show that the proposed method handles tasks with non-IID data distributions and adapts well to varying degrees of compatibility among tasks.
Abstract
Contents
1 Introduction
2 Related Works
2.1 Federated Learning
2.2 Sequential Learning
3 Problem Description
4 Dynamic Pruning and Expansion for Federated Multitask Learning
4.1 Local Pruning of Parameters
4.2 Splitting of Nodes at Local Devices
5 Experiment
5.1 Dataset Description
5.2 Baseline Methods
5.3 Experiment I: Local Tasks from Different Datasets
5.4 Experiment II: Local Tasks formed by Disjoint Subsets of a Common Dataset
5.5 Experiment III: Local Tasks formed by Overlapped Classes of a Common Dataset
6 Conclusion

