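Author (Chinese): 陳怡樺
Author (English): Chen, I-Hua
Thesis title (Chinese): 基於QEMU建構嵌入式異質多核心全系統模擬平台
Thesis title (English): Full System Emulation of Embedded Heterogeneous Multicores Based on QEMU
Advisor (Chinese): 金仲達
Advisor (English): King, Chung-Ta
Committee members (Chinese): 劉靖家
陳耀華
Committee members (English): Liou, Jing-Jia
Chen, Yao-Hua
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 105062561
Year of publication (ROC calendar): 107 (2018)
Graduation academic year: 106
Language: English
Number of pages: 37
Keywords (Chinese): 異質 (heterogeneous), 多核心 (multicore), 模擬 (emulation), 嵌入式 (embedded)
Keywords (English): emulation, embedded, heterogeneous, multicore, qemu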
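Edge computing, which has emerged in recent years, aims to move computation and artificial-intelligence applications from central servers to end devices close to the data sources, achieving fast response times and reducing network traffic between end devices and central servers. In edge computing, end devices provide a wide variety of applications and services, such as preprocessing the data they collect, making decisions with artificial intelligence, and multimedia human-machine interfaces. Many of these applications are well suited to acceleration by special-purpose hardware accelerators. As the number of accelerators on end devices grows, an asymmetric heterogeneous multicore architecture composed of a main processor and one or more microcontrollers is a promising hardware architecture with great potential. In this architecture, the microcontroller performs accelerator scheduling and interrupt handling on behalf of the main processor, as exemplified by the NVIDIA Deep Learning Accelerator (NVDLA). To develop edge computing systems with an asymmetric heterogeneous multicore architecture, virtual platforms such as QEMU are often used during development for early performance evaluation or functional verification. Unfortunately, QEMU only emulates symmetric homogeneous multicore systems and does not emulate asymmetric heterogeneous ones. In this thesis, we address the problem of implementing asymmetric heterogeneous multicore systems on QEMU through two possible implementation strategies: one-process and multi-process. Finally, the two approaches are compared and discussed based on measured data.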
Emerging edge computing is poised to move computing and intelligence to the network's edge, close to the data sources, for fast responses and reduced network traffic. In edge computing, edge devices need to support a wide variety of applications and services, from data preprocessing and intelligent inference to multimedia human interfaces. Many such applications are well suited to special-purpose hardware accelerators. With the increasing number of accelerators on edge devices, a promising architecture is an asymmetric heterogeneous multicore that incorporates one or more microcontrollers to offload accelerator scheduling and interrupt handling from the main CPU, as exemplified by the NVIDIA Deep Learning Accelerator (NVDLA). To develop such computing systems, virtual platforms such as QEMU are often used. Unfortunately, QEMU only supports symmetric homogeneous multicore systems. In this thesis, we tackle the challenging problem of supporting asymmetric heterogeneous multicore systems on QEMU by considering two possible implementation strategies: one-process and multi-process. The two strategies are implemented and compared qualitatively and quantitatively.
Abstract
Contents
List of Figures
List of Tables
1 Introduction
2 Background
2.1 QEMU
2.1.1 Overall architecture
2.1.2 Dynamic binary translation
2.1.3 Full system emulation
2.1.4 User emulation
2.2 Library
3 System Design
3.1 System architecture
3.2 System flow
4 Implementation
4.1 One-process methodology
4.2 Multi-process methodology
5 Experiments
5.1 Validation
5.2 Time distribution
5.3 Comparison of the two proposed methodologies
6 Conclusions
[1] NVIDIA Corporation. Nvdla primer. http://nvdla.org/primer.html.
[2] J. Power, J. Hestness, M. S. Orr, M. D. Hill, and D. A. Wood. gem5-gpu: A heterogeneous cpu-gpu simulator. IEEE Computer Architecture Letters, 14(1):34-36, Jan 2015.
[3] S. T. Shen, S. Y. Lee, and C. H. Chen. Full system simulation with qemu: An approach to multi-view 3d gpu design. In Proceedings of 2010 IEEE International Symposium on Circuits and Systems, pages 3877-3880, May 2010.
[4] C. S. Peng, L. C. Chang, C. H. Kuo, and B. D. Liu. Dual-core virtual platform with qemu and systemc. In 2010 International Symposium on Next Generation Electronics, pages 69-72, Nov 2010.
[5] Fabrice Bellard. Qemu, a fast and portable dynamic translator. In USENIX Annual Technical Conference, FREENIX Track, volume 41, page 46, 2005.
[6] M. Monton, A. Portero, M. Moreno, B. Martinez, and J. Carrabina. Mixed sw/systemc soc emulation framework. In 2007 IEEE International Symposium on Industrial Electronics, pages 2338-2341, June 2007.
[7] R. Ubal, B. Jang, P. Mistry, D. Schaa, and D. Kaeli. The multi2sim simulation framework. http://www.multi2sim.org/downloads/m2s-guide-4.2.pdf, 2012.
[8] T. V. Dung, I. Taniguchi, and H. Tomiyama. Cache simulation for instruction set simulator qemu. In 2014 IEEE 12th International Conference on Dependable, Autonomic and Secure Computing, pages 441-446, Aug 2014.
[9] D. M. Beazley, B. D. Ward, and I. R. Cooke. The inside story on shared libraries and dynamic loading. Computing in Science & Engineering, 3(5):90-97, Sep 2001.
[10] Y. Lee, A. Waterman, R. Avizienis, H. Cook, C. Sun, V. Stojanović, and K. Asanović. A 45nm 1.3ghz 16.7 double-precision gops/w risc-v processor with vector accelerators. In ESSCIRC 2014 - 40th European Solid State Circuits Conference (ESSCIRC), pages 199-202, Sept 2014.
[11] Riscv-qemu. https://github.com/riscv/riscv-qemu/wiki.
[12] M. R. Guthaus, J. S. Ringenberg, D. Ernst, T. M. Austin, T. Mudge, and R. B. Brown. Mibench: A free, commercially representative embedded benchmark suite. In Proceedings of the Fourth Annual IEEE International Workshop on Workload Characterization. WWC-4 (Cat. No.01EX538), pages 3-14, Dec 2001.
[13] Edge detection in images. https://people.sc.fsu.edu/~jburkardt/c_src/image_edge/.
[14] Remove noise from an image. https://people.sc.fsu.edu/~jburkardt/c_src/image_denoise/.
[15] Mahmoud Hatem. Reverse engineering: What we need to know as a dba. https://mahmoudhatem.wordpress.com/2016/10/10/reverse-engineering-what-we-need-to-know-as-a-dba/.
[16] M. C. Chiang, T. C. Yeh, and G. F. Tseng. A qemu and systemc-based cycle-accurate iss for performance estimation on soc development. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 30(4):593-606, April 2011.
[17] D. Bortolotti, C. Pinto, A. Marongiu, M. Ruggiero, and L. Benini. Virtual-soc: A full-system simulation environment for massively parallel heterogeneous system-on-chip. In 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum, pages 2182-2187, May 2013.
 
 
 
 