
Detailed Record

Author (Chinese): 張書桓
Author (English): Chang, Shu-Huan
Title (Chinese): 應用強化學習法於邏輯合成的面積最佳化
Title (English): A Reinforcement Learning Based Logic Synthesis Framework for Further Area Optimization
Advisor (Chinese): 張世杰
Advisor (English): Chang, Shih-Chieh
Committee members (Chinese): 王俊堯、吳凱強
Committee members (English): Wang, Chun-Yao; Wu, Kai-Chiang
Degree: Master's
University: National Tsing Hua University
Department: Department of Computer Science
Student ID: 104062701
Publication year (ROC calendar): 107 (2018)
Graduating academic year: 106
Language: English
Number of pages: 31
Keywords (Chinese): 機器學習、強化學習、邏輯合成
Keywords (English): Machine Learning, Reinforcement Learning, Logic Synthesis
In the industrial logic synthesis stage, the results of electronic design automation tools can be further optimized by adjusting the gate-level cell library. With better cell selections, Synopsys Design Compiler can achieve greater area reduction while maintaining the given design constraints. Because the library needed may differ from design to design, this fine-tuning procedure is usually performed manually and empirically for each design, which can be very time-consuming and inefficient. In this thesis, we propose a reinforcement learning based logic synthesis framework that further reduces area while maintaining the given design constraints. We use a policy-based machine learning method to explore better cell selections during logic synthesis. First, we use Synopsys Design Compiler to synthesize the given design with a target cell library. The timing and area information of the synthesized netlist is sent to our machine learning model; the model modifies the target library based on this information, and logic synthesis is run again with the modified library. We execute this process iteratively, and after convergence the model finds the best library for the given design. Experimental results show that, on the ITC'99 register-transfer-level designs with a TSMC 40 nm gate-level cell library, we achieve an area reduction of up to 33.07% after logic synthesis.
It is well known in the industry that the outcomes of Electronic Design Automation (EDA) tools can be further fine-tuned by carefully trimming the gate-level cell library at the logic synthesis stage. With better selections of library cells, Synopsys Design Compiler can achieve better area reduction while maintaining the given design constraints. Conventionally, this fine-tuning procedure is performed manually and empirically for each design, since the library needed for each design can differ; the process can be time-consuming and inefficient. In this work, we propose a reinforcement learning based logic synthesis framework to achieve further area reduction while maintaining the given design constraints. In our framework, we use a policy-based technique to explore better library selections. First, the given design is synthesized with a target library by Synopsys Design Compiler. The timing and area information obtained from the synthesized netlist is then sent to the machine learning model, which modifies the target library based on that information. The framework is executed iteratively, and after convergence the machine learning model has learned the best library for the given design. Experimental results show that we can obtain up to 33.07% area reduction on ITC'99 RTL circuits with a TSMC 40 nm gate-level cell library.
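To make the iterative loop described above concrete, the following is a minimal sketch of a policy-gradient (REINFORCE-style) update over per-cell keep/drop decisions. It is an illustration under stated assumptions, not the thesis implementation: the function names, the reward shaping, and the hyperparameters are hypothetical, and the actual Design Compiler invocation and report parsing are left to a user-supplied callback.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_library_policy(num_cells, baseline_area, synthesize_fn,
                         iterations=100, lr=0.1, seed=0):
    """REINFORCE-style loop over per-cell keep/drop decisions.

    synthesize_fn(kept_cell_indices) is a hypothetical callback that runs
    logic synthesis with the trimmed library and returns (area, worst_slack)
    parsed from the tool's reports.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros(num_cells)              # one logit per candidate cell
    for _ in range(iterations):
        p = sigmoid(theta)                   # keep-probability of each cell
        mask = rng.random(num_cells) < p     # sample a library selection
        area, slack = synthesize_fn(np.flatnonzero(mask))
        # Reward: relative area reduction, zeroed when timing is violated.
        reward = max(baseline_area - area, 0.0) / baseline_area if slack >= 0 else 0.0
        # Policy-gradient update: grad of log Bernoulli(mask; p) w.r.t. theta is (mask - p).
        theta += lr * reward * (mask.astype(float) - p)
    return sigmoid(theta)                    # final keep-probabilities
```

In a real flow, the callback would presumably write out the trimmed target library, launch the synthesis tool on the given design, and return the area and worst slack extracted from its reports before the next policy update.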
1 Introduction 1
2 Preliminaries 5
2.1 Reinforcement Learning 5
2.2 Policy Gradient 7
3 Motivation and Problem Formulation 10
3.1 Motivation 10
3.2 Problem Formulation 12
4 Reinforcement Learning Based Logic Synthesis Framework 14
4.1 Framework Overview 14
4.2 Implementation Details 17
5 Experimental Results 20
5.1 Experimental Setup 20
5.2 Result Summary 22
5.3 Result Discussion 24
6 Conclusions 26
References 27