Author: Wu, Chung-Yen
Thesis Title (Chinese): NWQMIX - A Competitive Extension of QMIX with Negative Weights
Thesis Title (English): NWQMIX-an Extension of QMIX with Negative Weights for Competitive Play
Advisor: Lee, Duan-Shin
Oral Defense Committee: Chang, Cheng-Shang
Yi, Chih-Wei
Keywords: multi-agent reinforcement learning, centralized training, decentralized execution, QMIX, competition
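In this thesis, we investigate and mitigate the challenges posed by large state and action spaces in multi-agent reinforcement learning (MARL) in cooperative-competitive environments. Our focus is on the iterative improvement of agents as they learn and adapt to their opponents' strategies. We propose an algorithm based on techniques such as centralized training and decentralized execution. Building on the QMIX framework, we consider not only teammates' information but also enemies' information, and we apply negative-weight mixing in the mixing network, which improves learning efficiency and strategic depth in environments characterized by adversarial interactions. By advancing these algorithmic techniques, our method not only accelerates learning but also promotes robust decision-making under competition. Experiments show that our algorithm clearly outperforms existing MARL methods in the Predator-Prey cooperative-competitive environment.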
In this paper, we study and mitigate the challenges posed by large state and action spaces in Multi-Agent Reinforcement Learning (MARL) within cooperative-competitive scenarios. Our focus is on the iterative improvement of agents as they learn from and adapt to their opponents' strategies. We propose an algorithm based on techniques such as centralized training with decentralized execution. Building on the QMIX framework, our approach incorporates opponent information and uses negative-weight mixing in the mixing network, which enhances learning efficiency and strategic depth in environments characterized by adversarial interactions. By advancing these algorithmic techniques, our approach not only accelerates the learning process but also fosters robust decision-making under competition. Through experiments, we demonstrate that our algorithm significantly outperforms existing MARL methods in Predator-Prey cooperative-competitive settings.
Abstract (Chinese)
Abstract
Acknowledgements (Chinese)
Contents
List of Figures
List of Tables
1 Introduction
2 Background
2.1 VDN, QMIX, and QTRAN
2.2 Deep Q-Learning
2.3 Independent Q-Learning
3 Methodology
4 Experimental Setup
4.1 Decentralized Predator-Prey
4.2 Ablations
5 Results
5.1 Main Results
5.2 Ablation Results
6 Conclusion
Bibliography
