|
[1] A. R. Biswas and R. Giaffreda, “IoT and Cloud Convergence: Opportunities and Challenges,” in Proc. IEEE World Forum on Internet of Things, pp. 375–376, Mar. 2014.
[2] A. N. Udipi, N. Muralimanohar, N. Chatterjee, R. Balasubramonian, A. Davis, and N. P. Jouppi, “Rethinking DRAM Design and Organization for Energy-Constrained Multi-Cores,” in Proc. ACM International Symposium on Computer Architecture, pp. 175–186, June 2010.
[3] W. A. Wulf and S. A. McKee, “Hitting the Memory Wall: Implications of the Obvious,” ACM SIGARCH Computer Architecture News, vol. 23, pp. 20–24, Mar. 1995.
[4] M. V. Wilkes, “The Memory Gap and the Future of High Performance Memories,” ACM SIGARCH Computer Architecture News, vol. 29, pp. 2–7, Mar. 2001.
[5] S. Rixner, W. J. Dally, U. J. Kapasi, P. Mattson, and J. D. Owens, “Memory Access Scheduling,” in Proc. ACM International Symposium on Computer Architecture, pp. 128–138, May 2000.
[6] T. Vogelsang, “Understanding the Energy Consumption of Dynamic Random Access Memories,” in Proc. Microarchitecture, pp. 363–374, Dec. 2010.
[7] T. Kimura, K. Takeda, Y. Aimoto, N. Nakamura, T. Iwasaki, Y. Nakazawa, H. Toyoshima, M. Hamada, M. Togo, H. Nobusawa, and T. Tanigawa, “64Mb 6.8ns Random Row Access DRAM Macro for ASICs,” in Proc. International Solid-State Circuits Conference, pp. 416–417, Feb. 1999.
[8] Micron, RLDRAM 2 and 3 Specifications. http://www.micron.com/products/dram.
[9] Y. Sato, T. Suzuki, T. Aikawa, S. Fujioka, W. Fujieda, H. Kobayashi, H. Ikeda, T. Nagasawa, A. Funyu, Y. Fuji, K. Kawasaki, M. Yamazaki, and M. Taguchi, “Fast Cycle RAM (FCRAM); a 20-ns Random Row Access, Pipe-Lined Operating DRAM,” in Proc. VLSI Circuits, pp. 22–25, June 1998.
[10] C. Toal, D. Burns, K. McLaughlin, S. Sezer, and S. O'Kane, “An RLDRAM II Implementation of a 10Gbps Shared Packet Buffer for Network Processing,” in Proc. Adaptive Hardware and Systems, pp. 613–618, Aug. 2007.
[11] ITRS, International Technology Roadmap for Semiconductors: Process Integration, Devices, and Structures. http://www.itrs.net/Links/2007ITRS/Home2007.htm, 2007.
[12] M. Jun, M.-J. Kim, and E.-Y. Chung, “Asymmetric DRAM Synthesis for Heterogeneous Chip Multiprocessors in 3D-Stacked Architecture,” in Proc. IEEE/ACM International Conference on Computer-Aided Design, pp. 73–80, Nov. 2012.
[13] D. Lee, Y. Kim, V. Seshadri, J. Liu, L. Subramanian, and O. Mutlu, “Tiered-Latency DRAM: A Low Latency and Low Cost DRAM Architecture,” in Proc. High Performance Computer Architecture, pp. 615–626, Feb. 2013.
[14] Y. Kim, V. Seshadri, D. Lee, J. Liu, and O. Mutlu, “A Case for Exploiting Subarray-Level Parallelism (SALP) in DRAM,” in Proc. ACM International Symposium on Computer Architecture, pp. 368–379, June 2012.
[15] S. M. Sharroush, Y. S. Abdalla, A. A. Dessouki, and E.-S. A. El-Badawy, “Dynamic Random-Access Memories Without Sense Amplifiers,” Elektrotechnik & Informationstechnik, pp. 88–101, Nov. 2012.
[16] Rambus, DRAM Power Model. http://www.rambus.com/energy, 2010.
[17] JEDEC Solid State Technology Association [Online]. Available: http://www.jedec.org/.
[18] S. Rixner, “Memory Controller Optimizations for Web Servers,” in Proc. Microarchitecture, pp. 355–366, Dec. 2004.
[19] E. Ipek, O. Mutlu, J. F. Martinez, and R. Caruana, “Self-Optimizing Memory Controllers: A Reinforcement Learning Approach,” in Proc. International Symposium on Computer Architecture, pp. 39–50, June 2008.
[20] O. Mutlu and T. Moscibroda, “Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems,” in Proc. International Symposium on Computer Architecture, pp. 63–74, June 2008.
[21] H.-J. Lee and E.-Y. Chung, “Scalable QoS-Aware Memory Controller for High-Bandwidth Packet Memory,” IEEE Transactions on Very Large Scale Integration Systems, vol. 16, no. 3, pp. 289–301, Mar. 2008.
[22] X. Dong, Y. Xie, N. Muralimanohar, and N. P. Jouppi, “Simple but Effective Heterogeneous Main Memory with On-Chip Memory Controller Support,” in Proc. High Performance Computing, Networking, Storage and Analysis, pp. 1–11, Nov. 2010.
[23] Y. Kim, D. Han, O. Mutlu, and M. Harchol-Balter, “ATLAS: A Scalable and High-Performance Scheduling Algorithm for Multiple Memory Controllers,” in Proc. High Performance Computer Architecture, pp. 1–12, Jan. 2010.
[24] C. J. Lee, O. Mutlu, V. Narasiman, and Y. N. Patt, “Prefetch-Aware Memory Controllers,” IEEE Transactions on Computers, vol. 60, no. 10, pp. 1406–1430, Oct. 2011.
[25] K. Li, Q. Guang, L. Lei, Y.-J. Peng, and J.-Y. Shi, “A High-Performance DRAM Controller Based on Multi-Core System Through Instruction Prefetching,” in Proc. International Conference on Electronics, Communications and Control, pp. 1220–1223, Sep. 2011.
[26] Y.-J. Liu, C.-C. Yang, S.-L. Chen, C.-C. Chiu, C.-C. Chu, C.-M. Wu, and C.-M. Huang, “An Efficient Memory Controller for 3D Heterogeneous Integration Platform,” in Proc. VLSI Design, Automation, and Test, pp. 1–4, Apr. 2012.
[27] M. N. Bojnordi and E. Ipek, “PARDIS: A Programmable Memory Controller for the DDRx Interfacing Standards,” in Proc. International Symposium on Computer Architecture, pp. 13–24, June 2012.
[28] M. D. Gomony, B. Akesson, and K. Goossens, “Architecture and Optimal Configuration of a Real-Time Multi-Channel Memory Controller,” in Proc. Design, Automation & Test in Europe Conference & Exhibition, pp. 1307–1312, Mar. 2013.
[29] J. Reineke, I. Liu, H. D. Patel, S. Kim, and E. A. Lee, “PRET DRAM Controller: Bank Privatization for Predictability and Temporal Isolation,” in Proc. Hardware/Software Codesign and System Synthesis, pp. 99–108, Oct. 2011.
[30] W. Shin, J. Yang, J. Choi, and L.-S. Kim, “NUAT: A Non-Uniform Access Time Memory Controller,” in Proc. High Performance Computer Architecture, pp. 464–475, Feb. 2014.
[31] Y. Wang, A. Ferraiuolo, and G. E. Suh, “Timing Channel Protection for a Shared Memory Controller,” in Proc. High Performance Computer Architecture, pp. 225–236, Feb. 2014.
[32] A. Hansson, N. Agarwal, A. Kolli, T. Wenisch, and A. N. Udipi, “Simulating DRAM Controllers for Future System Architecture Exploration,” in Proc. Performance Analysis of Systems and Software, pp. 201–210, Mar. 2014.
[33] S. Khan, A. R. Alameldeen, C. Wilkerson, O. Mutlu, and D. A. Jimenez, “Improving Cache Performance Using Read-Write Partitioning,” in Proc. High Performance Computer Architecture, pp. 452–463, Feb. 2014.
[34] S. M. Khan, Z. Wang, and D. A. Jimenez, “Decoupled Dynamic Cache Segmentation,” in Proc. High Performance Computer Architecture, pp. 1–12, Feb. 2012.
[35] S. Seo, J. Lee, and Z. Sura, “Design and Implementation of Software-Managed Caches for Multicores with Local Memory,” in Proc. High Performance Computer Architecture, pp. 55–66, Feb. 2009.
[36] M. Chaudhuri, “PageNUCA: Selected Policies for Page-Grain Locality Management in Large Shared Chip-Multiprocessor Caches,” in Proc. High Performance Computer Architecture, pp. 227–238, Feb. 2009.
[37] J. D. Collins and D. M. Tullsen, “Hardware Identification of Cache Conflict Misses,” in Proc. Microarchitecture, pp. 126–135, Nov. 1999.
[38] A. Jaleel, K. B. Theobald, S. C. Steely, Jr., and J. Emer, “High Performance Cache Replacement Using Re-reference Interval Prediction,” in Proc. International Symposium on Computer Architecture, pp. 60–71, June 2010.
[39] T. L. Johnson, D. A. Connors, M. C. Merten, and W.-M. W. Hwu, “Run-Time Cache Bypassing,” IEEE Transactions on Computers, vol. 48, no. 12, pp. 1338–1354, Dec. 1999.
[40] T. Piquet, O. Rochecouste, and A. Seznec, “Exploiting Single-Usage for Effective Memory Management,” in Proc. Asia-Pacific Conference on Advances in Computer Systems Architecture, pp. 90–101, Dec. 2007.
[41] M. K. Qureshi and Y. N. Patt, “Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches,” in Proc. Microarchitecture, pp. 423–432, Dec. 2006.
[42] V. Seshadri, O. Mutlu, M. A. Kozuch, and T. C. Mowry, “The Evicted-Address Filter: A Unified Mechanism to Address Both Cache Pollution and Thrashing,” in Proc. International Conference on Parallel Architectures and Compilation Techniques, pp. 355–366, Sep. 2012.
[43] H.-C. Shih, P.-W. Luo, J.-C. Yeh, S.-Y. Lin, D.-M. Kwai, S.-L. Lu, S., A., and C.-W. Wu, “DArT: A Component-Based DRAM Area, Power, and Timing Modeling Tool,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 33, no. 9, pp. 1356–1369, Sep. 2014.
[44] DRAMPower: Open Source DRAM Power & Energy Estimation Tool. Available: http://www.es.ele.tue.nl/drampower/
[45] S. Burkhart, R. Chase, J. Arada, and K. Morris, “PMTA Specification,” 2010.
[46] P. Rosenfeld, E. Cooper-Balis, and B. Jacob, “DRAMSim2: A Cycle Accurate Memory System Simulator,” Computer Architecture Letters, vol. 10, no. 1, pp. 16–19, Jan.–June 2011.
|