Author (English name): Yang, Meng-Hsun
Thesis title (English): Code Layout Optimization Applying Community Detection Algorithm
Advisor (English name): Lee, Che-Rung
Oral defense committee (English names): Hsu, Wei-Chung; Chung, Yeh-Ching
Keywords (English): Community Detection; Code Layout; Instruction Locality; Louvain method; Pettis-Hansen
Because the gap between processor speed and memory speed keeps widening, the instruction cache has become a popular optimization target. Code layout optimization is a technique that improves instruction locality and reduces instruction cache misses. Several previous studies have proposed function-level code layout optimization, and most of them extend Pettis-Hansen's method. To the best of our knowledge, however, none has applied community detection to code layout optimization.
In this thesis, we introduce the concept of community structure into code layout optimization to improve locality. In our experiments, our approach improves performance over the baseline by 10.69% on average and by up to 12.71% in the best case; the average improvement is 3.19% higher than that of Pettis-Hansen's method applied alone. Furthermore, when generating an optimized code layout from LLVM's weighted call graph, our approach is up to about 33 times faster than Pettis-Hansen's method, which makes code layout optimization more practical for modern large-scale system software.
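As a rough illustration of the idea summarized in the abstract, the sketch below scores a partition of a weighted call graph by modularity (the quantity the Louvain method tries to maximize) and then emits a function order in which each community is contiguous. It is a minimal sketch under stated assumptions, not the thesis's implementation: the function names `modularity` and `contiguous_layout`, the toy call graph, and the in-community ordering by weighted degree are all hypothetical placeholders, and the communities are given by hand rather than detected.

```python
from collections import defaultdict

# Hypothetical weighted call graph: undirected edges with call counts as weights.
edges = {
    ("a", "b"): 10, ("b", "c"): 10, ("a", "c"): 10,  # tightly coupled group
    ("d", "e"): 10, ("e", "f"): 10, ("d", "f"): 10,  # second tightly coupled group
    ("c", "d"): 1,                                   # rare call between groups
}

# Hypothetical partition (in practice this would come from a community
# detection algorithm such as the Louvain method).
community_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}


def weighted_degrees(edges):
    """Sum of incident edge weights for every node."""
    degree = defaultdict(float)
    for (u, v), w in edges.items():
        degree[u] += w
        degree[v] += w
    return degree


def modularity(edges, community_of):
    """Modularity Q = sum over communities of L_c/m - (d_c/(2m))^2,
    where m is the total edge weight, L_c the weight of edges inside
    community c, and d_c the summed weighted degree of its nodes."""
    total = sum(edges.values())
    internal = defaultdict(float)
    for (u, v), w in edges.items():
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += w
    comm_degree = defaultdict(float)
    for node, k in weighted_degrees(edges).items():
        comm_degree[community_of[node]] += k
    return sum(
        internal[c] / total - (comm_degree[c] / (2 * total)) ** 2
        for c in comm_degree
    )


def contiguous_layout(edges, community_of):
    """Order functions so that each community is contiguous. Inside a
    community, functions are sorted by descending weighted degree; this is
    a placeholder assumption, not the thesis's in-community reordering."""
    degree = weighted_degrees(edges)
    return sorted(community_of, key=lambda f: (community_of[f], -degree[f], f))


if __name__ == "__main__":
    print("modularity:", modularity(edges, community_of))
    print("layout:", contiguous_layout(edges, community_of))
```

Placing every function in a single community gives a modularity of zero under this definition, so a positive score indicates that the partition keeps more call weight inside communities than a random assignment with the same degrees would.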
Chapter 1 Introduction
Chapter 2 Background And Related Work
2.1 Code Layout Optimization
2.2 Community Detection
Chapter 3 Proposed Approach
3.1 Community Detecting
3.2 In-Community Procedure Reordering
Chapter 4 Evaluation
4.1 Experimental Environment
4.2 Evaluation on Layout Generation Time
4.2.1 Evaluation Datasets
4.2.2 Results
4.3 Evaluation on Performance Improvement
4.3.1 Evaluation Datasets
4.3.2 Results
Chapter 5 Conclusions and Future Work

