[1] “Qualcomm,” https://www.qualcomm.com/.
[2] “Kalray,” https://www.kalrayinc.com/.
[3] N. M. Qui, C. H. Lin, and P. Chen, “Design and implementation of a 256-bit risc-v-based dynamically scheduled very long instruction word on fpga,” IEEE Access, vol. 8, pp. 172 996–173 007, 2020.
[4] Z. Shen, H. He, X. Yang, D. Jia, and Y. Sun, “Architecture design of a variable length instruction set vliw dsp,” Tsinghua Science and Technology, vol. 14, no. 5, pp. 561–569, 2009.
[5] D. C.-W. Chang, T.-J. Lin, C.-J. Wu, J.-K. Lee, Y.-H. Chu, and A.-Y. Wu, “Parallel architecture core (pac)—the first multicore application processor soc in taiwan part i: Hardware architecture & software development tools,” Journal of Signal Processing Systems, vol. 62, pp. 373–382, 2011.
[6] Y.-C. Lin, Y.-P. You, and J.-K. Lee, “Palf: compiler supports for irregular register files in clustered vliw dsp processors,” Concurrency and Computation: Practice and Experience, vol. 19, no. 18, pp. 2391–2406, 2007.
[7] C.-B. Kuan and J. K. Lee, “Compiler supports for vliw dsp processors with simd intrinsics,” Concurrency and Computation: Practice and Experience, vol. 24, no. 5, pp. 517–532, 2012.
[8] C.-K. Chen, L.-H. Tseng, S.-C. Chen, Y.-J. Lin, Y.-P. You, C.-H. Lu, and J.-K. Lee, “Enabling compiler flow for embedded vliw dsp processors with distributed register files,” in Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems, 2007, pp. 146–148.
[9] C. Wu, K.-Y. Hsieh, Y.-C. Lin, C.-J. Wu, W.-L. Shih, S.-C. Chen, C.-K. Chen, C.-C. Huang, Y.-P. You, and J. K. Lee, “Integrating compiler and system toolkit flow for embedded vliw dsp processors,” in 12th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA’06). IEEE, 2006, pp. 215–222.
[10] M.-S. Shih, H.-M. Lai, C.-L. Lee, C.-K. Chen, and J.-K. Lee, “Register-pressure aware predicator for length multiplier of rvv,” in Workshop Proceedings of the 51st International Conference on Parallel Processing, 2022, pp. 1–9.
[11] T. M. Lattner, “An Implementation of Swing Modulo Scheduling with Extensions for Superblocks,” Master’s thesis, Computer Science Dept., University of Illinois at Urbana-Champaign, Urbana, IL, June 2005, See http://llvm.cs.uiuc.edu.
[12] J. Llosa, A. Gonzalez, E. Ayguade, and M. Valero, “Swing modulo scheduling: a lifetime-sensitive approach,” in Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques, 1996, pp. 80–86.
[13] C.-J. Wu, C.-H. Lu, and J. K. Lee, “Register spilling via transformed interference equations for pac dsp architecture,” Concurrency and Computation: Practice and Experience, vol. 26, no. 3, pp. 779–799, 2014.
[14] H. Li, N. Mentens, and S. Picek, “A scalable simd risc-v based processor with customized vector extensions for crystals-kyber,” in Proceedings of the 59th ACM/IEEE Design Automation Conference, ser. DAC ’22. New York, NY, USA: Association for Computing Machinery, 2022, pp. 733–738. [Online]. Available: https://doi.org/10.1145/3489517.3530552
[15] V. Zivojnovic, J. Martinez, C. Schläger, and H. Meyr, “Dspstone: A dsp-oriented benchmarking methodology,” in Proc. of ICSPAT’94 - Dallas, Oct. 1994.