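Author (Chinese): 林俊伸
Author (English): Lin, Jun-Shen
Thesis title (Chinese): 應用於多核心處理器之交易層級內嵌式追蹤器設計
Thesis title (English): Design of Transaction-Level Embedded Tracer for Many-Core Processors
Advisor (Chinese): 黃稚存
Advisor (English): Huang, Chih-Tsun
Committee members (Chinese): 金仲達, 劉靖家, 黃稚存
Degree: Master's
University: National Tsing Hua University
Department: Department of Computer Science
Student ID: 101062571
Year of publication (ROC calendar): 103
Academic year of graduation: 103
Language: English
Number of pages: 122
Keywords (Chinese): 多核心, 追蹤器
Keywords (English): many-core, tracer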
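Owing to power efficiency, thermal limits, and scalability, the trend in SoC (system-on-chip) development is to adopt many-core architectures, rather than a single core with a higher frequency, to achieve better performance.
However, as the number of cores on a chip keeps increasing, the complexity of inter-core communication grows sharply and design verification becomes harder. To get the most out of a many-core architecture, running parallel applications on the platform is also important. Design verification and tracing that involve both the application and the hardware are therefore a challenging problem for today's many-core systems.

In this thesis, we propose an embedded tracer architecture for many-core platforms (the Communication Tracer) and an application-level tracing methodology that focuses on inter-PE (processing element) interaction. Tracing at the application level gives users a more efficient way to locate problems because the trace data can be minimized. We extend the original API Event Generator so that it can trace accesses to external memory. In addition, NoC events are generated more efficiently: no trace data is produced when a transaction is normal. The Communication Tracer consists of the Communication Event Generator, the NoC (network-on-chip) Event Generator, and the Tracer Engine. The Communication Event Generator monitors the transfer components of a PE, while the NoC Event Generator monitors the data packets passing through the routers. The Tracer Engine collects trace events from the Communication Event Generator or the NoC Event Generator, adds a timestamp, and then compresses them into trace data.

In addition, we implement an off-line analysis program to help users analyze applications running on the many-core platform. Users can use this program to analyze trace data more efficiently. For several applications presented in the thesis, the compression ratio of the communication trace data reaches 96.01% to 99.97%.
The Communication Event Generator and the Tracer Engine account for about 1.5% of the area of the whole platform, and the NoC Event Generator accounts for 0.57%.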
Owing to power efficiency, thermal thresholds, and scalability, the trend in SoC (System-on-Chip) development is to use a many-core architecture, rather than a single core running at a higher frequency, to achieve better performance.
However, as the number of cores in a system increases, the complexity of inter-core communication grows rapidly and design verification becomes more difficult. To get the most benefit from a many-core architecture, the parallel application executed on the platform is equally important. Thus, verifying the design and debugging the application together with the hardware is a challenging issue for today's many-core systems.

In this thesis, we present an embedded debug and trace platform (the Communication Tracer) and an application-level methodology that focuses on inter-PE (Processing Element) interaction on a many-core platform. Debugging and tracing at the application level offers users an efficient way to find problems because the trace data can be minimized. We extend the original API Event Generator so that it can trace external memory accesses. Moreover, the NoC Event Generator is more efficient: if a transaction is normal, no trace data is generated. The Communication Tracer consists of the Communication Event Generator, the NoC Event Generator, and the Tracer Engine. The Communication Event Generator monitors the communication of a PE, and the NoC Event Generator monitors the packets that pass through the switch node. The Tracer Engine gathers trace events from the Communication Event Generator or the NoC Event Generator, appends a timestamp, and then compresses the trace data.
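
As a rough software illustration of the data flow described above, the following sketch models a tracer engine that filters incoming events, orders them by time, and stores each timestamp as a delta from the previous one. Every name here (`TraceEvent`, `TracerEngine`, the event kinds) is hypothetical, and delta encoding is only a stand-in for compression; the thesis's actual hardware design and compression scheme are not given in this record and may differ.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class TraceEvent:
    """One event reported by an event generator (hypothetical layout)."""
    source: str  # "comm" or "noc": which generator produced the event
    pe_id: int   # processing element (or switch) the event relates to
    kind: str    # e.g. "send", "recv", "congested"
    cycle: int   # cycle count at which the event was observed


class TracerEngine:
    """Filters events, then emits packets with delta-encoded timestamps."""

    def __init__(self, enabled_kinds: Iterable[str]) -> None:
        self.enabled_kinds: Set[str] = set(enabled_kinds)
        self.last_cycle = 0  # timestamp of the last emitted packet

    def process(self, events: Iterable[TraceEvent]) -> List[Tuple[str, int, str, int]]:
        packets = []
        # Merge events from both generators into one time-ordered stream.
        for ev in sorted(events, key=lambda e: e.cycle):
            # Event filter: drop kinds the user did not ask to trace.
            if ev.kind not in self.enabled_kinds:
                continue
            # Store the cycle delta instead of the absolute timestamp;
            # small deltas need fewer bits than full cycle counts.
            delta = ev.cycle - self.last_cycle
            self.last_cycle = ev.cycle
            packets.append((ev.source, ev.pe_id, ev.kind, delta))
        return packets


events = [
    TraceEvent("comm", 0, "send", 10),
    TraceEvent("noc", 1, "congested", 25),
    TraceEvent("comm", 1, "recv", 18),
    TraceEvent("comm", 0, "idle", 30),
]
engine = TracerEngine({"send", "recv", "congested"})
packets = engine.process(events)
```

In this sketch the "idle" event is dropped by the filter, and the remaining three packets carry deltas of 10, 8, and 7 cycles rather than absolute cycle counts.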

Moreover, we implement an off-line analyzer to help users profile applications executed on the many-core platform. The off-line analyzer extracts and analyzes the trace data log from the proposed Communication Tracer, so users can use it to improve the efficiency of their software. We present several applications as case studies in this thesis. The compression ratio of the trace data in the proposed Communication Tracer ranges from 96.01% to 99.97%, depending on the case.
The Communication Event Generator and the Tracer Engine account for about 1.5% of the area of the whole platform, and the NoC Event Generator accounts for 0.57%.
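
The abstract does not say how the compression ratio is defined. Assuming the common "space saving" convention, that is, one minus the ratio of compressed size to raw size, the figure could be computed as in the sketch below. The function name and the sample sizes are illustrative assumptions, not measurements from the thesis.

```python
def compression_ratio(raw_bits: int, compressed_bits: int) -> float:
    """Return the space saving in percent (assumed definition).

    ratio = (1 - compressed_bits / raw_bits) * 100
    """
    if raw_bits <= 0:
        raise ValueError("raw_bits must be positive")
    if compressed_bits < 0:
        raise ValueError("compressed_bits must be non-negative")
    return 100.0 * (1.0 - compressed_bits / raw_bits)


# Illustrative numbers only: 10,000 raw bits reduced to 399 bits.
example = compression_ratio(10_000, 399)
```

Under this assumed definition, 399 compressed bits out of 10,000 raw bits corresponds to 96.01%, and 3 bits out of 10,000 corresponds to 99.97%.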
1 Introduction
1.1 Introduction of SoC Tracing and Debugging
1.2 The Challenge of Tracing and Debugging on Many-Core SoC
1.3 Motivation and Contribution
1.4 Organization of the Thesis
2 Previous Work
2.1 Background
2.1.1 SystemC
2.1.2 OSCI TLM-2.0
2.1.3 OpenRISC
2.1.4 Bus Protocol
2.1.4.1 Wishbone
2.1.4.2 Open Core Protocol
2.2 Overview of Existent Many-Core Platform
2.2.1 Processing Element
2.2.2 Communication Unit
2.2.3 Network on Chip
2.2.3.1 Arteris NoC (RTL)
2.2.3.2 Transaction Generator 2 (ESL)
2.3 Software Communication Library
2.3.1 On-chip Communication Library
2.3.2 iLib Library
2.4 Overview of Previous Many-Core Tracer
2.4.1 Application-Level Transaction-Based Trace Infrastructure for NoC-Based Many-Core Platform
2.4.2 Transaction-Level Embedded Tracer on Many-Core Platform
2.4.2.1 API Event Generator
2.4.2.2 NoC Transaction Generator
2.5 Existent Method for Tracing and Debugging
3 Component of Proposed Communication Tracer and Implementation
3.1 Overview of Communication Tracer
3.2 Communication Event Generator
3.2.1 FSM of Communication Event Generator
3.2.2 Communication Event List
3.3 NoC Event Generator
3.3.1 The Top View of NoC Event Generator
3.3.2 Packet Congested Event Generator
3.3.2.1 Size of Trace FIFO and Dedicated Counter
3.3.2.2 Push Mechanism of Trace FIFO
3.3.2.3 Pop Mechanism of Trace FIFO
3.3.2.4 Packet Congested Event Generator On ESL platform
3.3.3 Hop Count Event Generator
3.3.3.1 Definition of Hop Count
3.3.3.2 Architecture of Hop Count Event Generator
3.4 Tracer Engine
3.4.1 Event Filter
3.4.2 Time Event Generator
3.4.3 Trace Packet Generator
3.4.4 Tracer Control Unit
3.5 Tracer Timer
3.6 Non-Blocking Communication Engine Issue
4 Proposed Tracing and Debugging Methodology and Demonstrations
4.1 The Debugging and Tracing Flow on Many-Core Platform
4.2 Off-line Analyzer
4.2.1 Definition of Communication Time
4.2.2 NAck Related Analysis
4.2.3 Packets Congested on NoC Related
4.2.4 Auto-Pairing of Send and Receive
4.3 Simulation Model for Application-Level Debugging and Tracing
4.4 Case Study
4.4.1 Odd-Even Sort
4.4.1.1 Introduction of Odd-Even Sort
4.4.1.2 Cases of Odd-Even Sort
4.4.1.3 Result of Analysis
4.4.2 3 phases of Send and Receive
4.4.2.1 Introduction of 3 phases of Send and Receive
4.4.2.2 Introduction of Barrier Function
4.4.2.3 Cases of 3 phases of Send and Receive
4.4.2.4 Result of Analysis
4.4.3 3D Parallel Graphic Pipeline Program
4.4.3.1 Introduction of 3D Parallel Graphic Pipeline Program
4.4.3.2 The Method of Load Balance
4.4.3.3 Cases of 3D Parallel Graphic Pipeline Program
4.4.3.4 Result of Analysis
4.4.4 Router Decision Bug
5 Experiment Result and Analysis
5.1 The Compression Rate of Communication Trace Data
5.2 The Amount of NoC Trace Data
5.3 The Rate of Generating Trace Data
5.3.1 Analyze the Profiling of Switches' Utilization Ratio
5.4 Synthesis Result
6 Conclusion and Future Work
6.1 Conclusion
6.2 Future Work
Appendices
Appendix A ESL Global Addressing Space on Many-Core Platform
Appendix B Communication Unit Addressing Space Used in a Processing Element
(The full text has not been authorized for public access.)