|
[1] A. R. Alameldeen and D. A. Wood, “Adaptive cache compression for high-performance processors,” Proc. the 31st Annual International Symposium on Computer Architecture (ISCA), pp. 212–223, 2004.
[2] S. Sardashti and D. A. Wood, “Decoupled compressed cache: Exploiting spatial locality for energy-optimized compressed caching,” Proc. the 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 62–73, 2013.
[3] A. R. Alameldeen and D. A. Wood, “Frequent pattern compression: A significance-based compression scheme for L2 caches,” Technical Report 1500, Computer Sciences Department, University of Wisconsin-Madison, 2004.
[4] E. Ahn, S.-M. Yoo, and S.-M. S. Kang, “Effective algorithms for cache-level compression,” Proc. the 11th Great Lakes Symposium on VLSI (GLSVLSI), pp. 89–92, 2001.
[5] F. Douglis, “The compression cache: Using on-line compression to extend physical memory,” Proc. 1993 Winter USENIX Conference, pp. 519–529, 1993.
[6] M. J. Freedman, “The compression cache: Virtual memory compression for handheld computers,” Parallel and Distributed Operating Systems Group, MIT Lab for Computer Science, Cambridge, Tech. Rep., 2000.
[7] X. Chen, L. Yang, R. Dick, L. Shang, and H. Lekatsas, “C-Pack: A high-performance microprocessor cache compression algorithm,” IEEE Trans. Very Large Scale Integration (VLSI) Systems, vol. 18, no. 8, pp. 1196–1208, 2010.
[8] L. Villa, M. Zhang, and K. Asanovic, “Dynamic zero compression for cache energy reduction,” Proc. the 33rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 214–220, 2000.
[9] J. Yang, Y. Zhang, and R. Gupta, “Frequent value compression in data caches,” Proc. the 33rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 258–265, 2000.
[10] G. Pekhimenko, V. Seshadri, O. Mutlu, M. A. Kozuch, P. B. Gibbons, and T. C. Mowry, “Base-delta-immediate compression: Practical data compression for on-chip caches,” Proc. the 21st International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 377–388, 2012.
[11] J. Dusser, T. Piquet, and A. Seznec, “Zero-content augmented caches,” Proc. the 23rd International Conference on Supercomputing (ICS), pp. 46–55, 2009.
[12] E. Hallnor and S. Reinhardt, “A compressed memory hierarchy using an indirect index cache,” Proc. the 3rd Workshop on Memory Performance Issues: in conjunction with the 31st International Symposium on Computer Architecture (WMPI), pp. 9–15, 2004.
[13] ——, “A unified compressed memory hierarchy,” Proc. High-Performance Computer Architecture (HPCA), pp. 201–212, 2005.
[14] S. Kim, J. Lee, J. Kim, and S. Hong, “Residue cache: A low-energy low-area L2 cache architecture via compression and partial hits,” Proc. the 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 420–429, 2011.
[15] Y. Xie and G. Loh, “Thread-aware dynamic shared cache compression in multi-core processors,” Proc. Computer Design (ICCD), pp. 135–141, 2011.
[16] S. Baek, H. G. Lee, C. Nicopoulos, J. Lee, and J. Kim, “ECM: Effective capacity maximizer for high-performance compressed caching,” Proc. High-Performance Computer Architecture (HPCA), pp. 131–142, 2013.
[17] J.-S. Lee, W.-K. Hong, and S.-D. Kim, “An on-chip cache compression technique to reduce decompression overhead and design complexity,” Journal of Systems Architecture (JSA), vol. 46, no. 15, pp. 1365–1382, 2000.
[18] D. Chen, E. Peserico, and L. Rudolph, “A dynamically partitionable compressed cache,” Proc. the Singapore-MIT Alliance Symposium, 2003.
[19] L. Benini, D. Bruni, B. Ricco, A. Macii, and E. Macii, “An adaptive data compression scheme for memory traffic minimization in processor-based systems,” Proc. IEEE International Symposium on Circuits and Systems (ISCAS), pp. 866–869, 2002.
[20] L. Benini, D. Bruni, A. Macii, and E. Macii, “Hardware-assisted data compression for energy minimization in systems with embedded processors,” Proc. the Design, Automation and Test in Europe Conference and Exhibition (DATE), pp. 449–453, 2002.
[21] J.-S. Lee, W.-K. Hong, and S.-D. Kim, “Design and evaluation of a selective compressed memory system,” Proc. International Conference on Computer Design (ICCD), pp. 184–191, 1999.
[22] A.-R. Adl-Tabatabai, A. M. Ghuloum, and S. O. Kanaujia, “Compression in cache design,” Proc. the 21st Annual International Conference on Supercomputing (ICS), pp. 190–201, 2007.
[23] A. Seznec, “Decoupled sectored caches: conciliating low tag implementation cost,” Proc. the 21st Annual International Symposium on Computer Architecture (ISCA), pp. 384–393, 1994.
[24] “SPEC2006 benchmarks.” [Online]. Available: http://www.specbench.org/osg/cpu2006/
[25] E. Rotenberg, S. Bennett, and J. E. Smith, “Trace cache: A low latency approach to high bandwidth instruction fetching,” Proc. the 29th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 24–34, 1996.
[26] T. M. Conte, K. N. Menezes, P. M. Mills, and B. A. Patel, “Optimization of instruction fetch mechanisms for high issue rates,” Proc. the 22nd Annual International Symposium on Computer Architecture (ISCA), pp. 333–344, 1995.
[27] P. Magnusson, M. Christensson, J. Eskilson, D. Forsgren, G. Hallberg, J. Hogberg, F. Larsson, A. Moestedt, and B. Werner, “Simics: A full system simulation platform,” IEEE Computer, pp. 50–58, 2002.
[28] M. Martin, D. Sorin, B. Beckmann, M. Marty, M. Xu, A. Alameldeen, K. Moore, M. Hill, and D. Wood, “Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset,” Computer Architecture News, pp. 92–99, 2005.
[29] T. K. Prakash and L. Peng, “Performance characterization of SPEC CPU2006 benchmarks on Intel Core 2 Duo processor,” ISAST Transactions on Computers and Software Engineering, pp. 36–41, 2008.
[30] C. Zhang, F. Vahid, and W. Najjar, “A highly configurable cache for low energy embedded systems,” ACM Transactions on Embedded Computing Systems (TECS), pp. 363–387, 2005.
[31] HP Laboratories Palo Alto, “CACTI 6.5.” [Online]. Available: http://www.hpl.hp.com/
|