|
[1] Nikolai Sakharnykh, “Efficient Tridiagonal Solvers for ADI methods and Fluid Simulation,” San Jose Convention Center, San Jose, CA, September 21, 2010 [2] R. W. Hockney, “A fast direct solution of Poisson’s equation using Fourier analysis,” J. ACM, vol. 12, pp. 95–113, January 1965. [3] Y. Zhang, J. Cohen, and J. D. Owens, “Fast tridiagonal solvers on the GPU,” in Proceed-ings of the 15th ACM SIGPLAN symposium on Principles and practice of parallel programming, PPoPP ’10, (New York, NY, USA), pp. 127–136, ACM, 2010. [4] E. Polizzi and A. H. Sameh, “A parallel hybrid banded system solver: The SPIKE algo-rithm,” Parallel Computing, vol. 32, no. 2, pp. 177–194, 2006. [5] Walter Gander and Gene H. Golub, “Cyclic Reduction – History and Applications,” Pro-ceedings of the Workshop on Scientific Computin, Hong Kong, 10-12 March, 1997. [6] Harold S. Stone, “An efficient parallel algorithm for the solution of a tridiagonal linear system of equations,” Journal of ACM,Vol. 20, No. 1, pp. 27-38, January 1973. [7] J. B. Erway, R. F. Marcia, and J. Tyson, “Generalized diagonal pivoting methods for tridi-agonal systems without interchanges,” IAENG International Journal of Applied Mathematics, vol. 4, no. 40, pp. 269–275, 2010. [8] Ali Cevahir, Akira Nukada, Satoshi Matsuoka, “ High performance conjugate gradient solver on multi-GPU clusters using hypergraph partitioning,” Computer Science - Research and Development May 2010, Volume 25, Issue 1-2, pp 83-91 [9] I. S. Duff, R. Guivarch, D. Ruiz, and M. Zenadi, “The Augmented Block Cimmino Dis-tributed Method,” Technical Report TR/PA/13/11, CERFACS, Toulouse, France, 2013. [10] Chang Li-Wen, John A. Stratton, Hee-Seok Kim, and Wen-Mei W. Hwu, “A scalable, nu-merically stable, high performance tridiagonal solver using GPUs.” In: Proceedings of the Interna-tional Conference on High Performance Computing, Networking, Storage and Analysis, SC ’12, pp. 27:1–27:11 [11] CUDA C Programming Guide, http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html [12] cuSPARSE, CUDA toolkit documentation http://docs.nvidia.com/cuda/cusparse/#abstract [13] CUDA Batching Kernels http://docs.nvidia.com/cuda/cublas/#batching-kernels
|