Author: Liao, Yu-Chen
Thesis title (Chinese): 具分時複用 I/O 的多重2.5D FPGA 系統腳位定位最佳化
Thesis title (English): Pin Assignment Optimization for Multi-2.5D FPGA-based Systems with Time-Multiplexed I/Os
Advisor: Mak, Wai-Kei
Committee members: Wang, Ting-Chi; Ho, Tsung-Yi
Keywords: 2.5D field-programmable gate arrays (FPGAs); pin assignment; network flow algorithm
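Compared with conventional FPGAs, advanced 2.5D FPGAs offer larger logic capacity and higher pin counts. Some multi-FPGA systems already use 2.5D FPGAs. A commercial 2.5D FPGA contains several dies that are connected to one another through an interposer. The interposer provides very limited resources for inter-die connections, far fewer than the interconnect resources available within a die, and connections through the interposer incur additional signal delay. Recent research has shown that, when a circuit is placed onto a 2.5D FPGA, effectively reducing the number of signals crossing between dies improves both routability and timing. In a multi-2.5D FPGA system, the 2.5D FPGAs are connected by many hardwired links, a large number of signals may be incident on each FPGA, and their pin assignment strongly affects the number of inter-die signal crossings. In this thesis, we formulate the pin assignment problem with the objective of minimizing the number of inter-die signal crossings within each 2.5D FPGA in the system. We propose an effective and highly efficient algorithm based on clustering optimization and minimum cost flow optimization. Experimental results show that, compared with the previous method [1], we reduce runtime by 99.8% while increasing the number of inter-die signal crossings by only 1%. Although [1] has a quality advantage of up to 1%, our method is orders of magnitude faster and easily handles very large instances.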
Advanced 2.5D FPGAs with larger logic capacity and higher pin counts than conventional FPGAs are commercially available, and some multi-FPGA systems already use them. A commercial 2.5D FPGA consists of multiple dies connected through an interposer. Compared with the interconnect within an individual die, the interposer provides only a fraction of the interconnect resources and adds delay. A recent study has shown that reducing signal crossings between dies improves routability and timing when a circuit is mapped to a 2.5D FPGA. In a multi-2.5D FPGA system with multiplexed hardwired inter-FPGA connections, tens of thousands of inter-FPGA signals can be incident on each FPGA, and their pin assignment can greatly affect the number of inter-die signal crossings within each FPGA. In this thesis, we formulate the pin assignment problem for such systems with the objective of minimizing signal crossings between dies within the individual FPGAs. Taking the multi-die structure of 2.5D FPGAs into consideration, we propose an effective and efficient iterative improvement algorithm for the problem, based on clustering optimization and minimum cost flow optimization. Experimental results show that our proposed algorithm reduces runtime by 99.8% on average while incurring less than 1% more signal crossings than the ILP-based algorithm of [1]. While the ILP-based algorithm has up to a 1% quality advantage on moderate-sized instances, our algorithm is orders of magnitude faster and can comfortably handle very large instances.
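To make the minimum cost flow idea concrete, the following is a minimal sketch for a single FPGA, not the thesis's algorithm. It assumes that dies are stacked in a line and indexed 0, 1, 2, ..., that the cost of routing a signal to a pin is the number of die boundaries between them, and that each physical pin can carry up to `mux_ratio` time-multiplexed signals. The names `min_cost_flow`, `assign_pins`, `signal_die`, `pin_die`, and `mux_ratio` are hypothetical and chosen only for illustration.

```python
from collections import deque


def min_cost_flow(n, edges, s, t):
    """Successive shortest paths on n nodes; edges are (u, v, capacity, cost)."""
    # Each adjacency entry is [to, residual capacity, cost, index of reverse edge].
    graph = [[] for _ in range(n)]
    for u, v, cap, cost in edges:
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])
    flow = total = 0
    while True:
        # Queue-based Bellman-Ford, since reverse edges have negative cost.
        dist = [float("inf")] * n
        prev = [None] * n
        in_queue = [False] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            in_queue[u] = False
            for i, (v, cap, cost, _) in enumerate(graph[u]):
                if cap > 0 and dist[u] + cost < dist[v]:
                    dist[v] = dist[u] + cost
                    prev[v] = (u, i)
                    if not in_queue[v]:
                        queue.append(v)
                        in_queue[v] = True
        if prev[t] is None:
            break  # no augmenting path is left
        # Find the bottleneck capacity along the path, then push flow.
        push, v = float("inf"), t
        while v != s:
            u, i = prev[v]
            push = min(push, graph[u][i][1])
            v = u
        v = t
        while v != s:
            u, i = prev[v]
            graph[u][i][1] -= push
            graph[v][graph[u][i][3]][1] += push
            v = u
        flow += push
        total += push * dist[t]
    return flow, total, graph


def assign_pins(signal_die, pin_die, mux_ratio):
    """Assign each signal to a pin, minimizing total die-boundary crossings."""
    S, P = len(signal_die), len(pin_die)
    src, sink = S + P, S + P + 1
    edges = [(src, i, 1, 0) for i in range(S)]
    for i, sd in enumerate(signal_die):
        for j, pd in enumerate(pin_die):
            # Assumed cost model: one crossing per die boundary in between.
            edges.append((i, S + j, 1, abs(sd - pd)))
    # Assumed capacity model: a pin carries at most mux_ratio signals.
    edges += [(S + j, sink, mux_ratio, 0) for j in range(P)]
    flow, crossings, graph = min_cost_flow(S + P + 2, edges, src, sink)
    if flow < S:
        raise ValueError("not enough multiplexed pin slots for all signals")
    assignment = {}
    for i in range(S):
        for v, cap, _cost, _rev in graph[i]:
            if S <= v < S + P and cap == 0:  # saturated signal-to-pin edge
                assignment[i] = v - S
    return assignment, crossings


# Six signals on three dies, one pin per die, two signals per pin.
assignment, crossings = assign_pins([0, 0, 0, 1, 2, 2], [0, 1, 2], mux_ratio=2)
print(assignment, crossings)
```

In this toy instance three signals originate on die 0 but the pin on die 0 has only two slots, so at least one signal must cross a die boundary. The sketch ignores the coupling between the two FPGAs at either end of a hardwired connection, which the thesis addresses through its clustering and pin mapping steps.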
Acknowledgements (Chinese)
Acknowledgements
Abstract (Chinese)
Abstract
1 Introduction
1.1 Multi-FPGA System
1.2 SLR crossings in 2.5D FPGA
1.3 Major contribution
2 Preliminaries
2.1 Targeted Architecture and Compilation Flow
2.2 Related Works
2.3 SLR-Aware Pin Assignment
3 Algorithm
3.1 Overview
3.2 Initial Feasible Pin Assignment Generation
3.3 Pin Assignment Refinement Based on Clustering and Minimum Cost Flow
3.3.1 Tentative Signal Grouping by Clustering Subnets between Two Adjacent FPGAs
3.3.2 Pin Mapping between Two Adjacent FPGAs
3.3.3 Physically-Aware Signal Re-grouping between Two Adjacent FPGAs
3.3.4 Remedying Conservative Start
4 Experimental Result
5 Conclusion
Reference
