[1] M.-W. Benabderrahmane, L.-N. Pouchet, A. Cohen, and C. Bastoul, “The polyhedral model is more widely applicable than you think,” in Compiler Construction. Springer, 2010, pp. 283–303.
[2] C. Bastoul, “Improving data locality in static control programs,” Ph.D. dissertation, University Paris 6, Pierre et Marie Curie, France, Dec. 2004.
[3] C. Chen, “Polyhedra scanning revisited,” in Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation. ACM, 2012, pp. 499–508.
[4] C. Bastoul, “Efficient code generation for automatic parallelization and optimization,” in ISPDC2 IEEE International Symposium on Parallel and Distributed Computing, 2003, pp. 23–30.
[5] S. Grauer-Gray, L. Xu, R. Searles, S. Ayalasomayajula, and J. Cavazos, “Auto-tuning a high-level language targeted to GPU codes,” in Innovative Parallel Computing (InPar), 2012. IEEE, 2012, pp. 1–10.
[6] C. A. Lattner, “LLVM: An infrastructure for multi-stage optimization,” Ph.D. dissertation, University of Illinois, 2002.
[7] T. Grosser, H. Zheng, R. Aloor, A. Simbürger, A. Größlinger, and L.-N. Pouchet, “Polly: Polyhedral optimization in LLVM,” in Proceedings of the First International Workshop on Polyhedral Compilation Techniques (IMPACT), vol. 2011, 2011.
[8] C. Lattner, “LLVM and Clang: Next generation compiler technology,” in The BSD Conference, 2008, pp. 1–2.
[9] D. Khaldi, C. Ancourt, and F. Irigoin, “Towards automatic C programs optimization and parallelization using the PIPS-PoCC integration,” PDF from http://www.rocq.inria.fr/~pouchet/software/pocc/doc/htmldoc/htmldoc/index.html, 2011.
[10] G. Rudy, “CUDA-CHiLL: A programming language interface for GPGPU optimizations and code generation,” Ph.D. dissertation, The University of Utah, 2010.
[11] M. M. Baskaran, J. Ramanujam, and P. Sadayappan, “Automatic C-to-CUDA code generation for affine programs,” in Compiler Construction. Springer, 2010, pp. 244–263.
[12] O. Kayiran, A. Jog, M. T. Kandemir, and C. R. Das, “Neither more nor less: Optimizing thread-level parallelism for GPGPUs,” CSE Penn State Tech Report, TR-CSE-2012-006, 2012.
[13] J. Lee, J. Kim, S. Seo, S. Kim, J. Park, H. Kim, T. T. Dao, Y. Cho, S. J. Seo, S. H. Lee et al., “An OpenCL framework for heterogeneous multicores with local memory,” in Proceedings of the 19th international conference on Parallel architectures and compilation techniques. ACM, 2010, pp. 193–204.