
Detailed Record

Author (Chinese): 謝 坤
Author (English): XIE, KUN
Title (Chinese): 運用深度學習優化 Spark 上的稀疏矩陣相乘
Title (English): The Optimization of SpMV by Deep Learning on Spark
Advisor (Chinese): 李哲榮
Advisor (English): LEE, CHE-RUNG
Committee Members (Chinese): 周志遠, 王偉仲
Committee Members (English): CHOU, JERRY; WANG, WEI-CHUNG
Degree: Master
University: National Tsing Hua University
Department: Department of Computer Science
Student ID: 104062467
Publication Year (ROC): 107 (2018)
Graduation Academic Year: 106
Language: English
Number of Pages: 52
Keywords (Chinese): sparse matrix multiplication, distributed, deep learning, convolutional neural network
Keywords (English): SpMV, Spark, Deep Learning, CNN
Abstract:
Sparse matrix-vector multiplication (SpMV) is an important computational operation for solving large-scale numerical problems. However, its performance depends not only on the running platform but also on the structure of the matrix. In this thesis, we focus on optimizing the performance of SpMV on Spark, which uses in-memory caching to improve the performance of iterative algorithms in distributed environments. We designed a deep learning model to classify matrices and SpMV operations, so that the best data format can be decided based on the matrix structure and the operational configuration, including the number of iterations and the memory size of the Spark nodes. The matrix structure is encoded as a small image, in which each pixel represents the density of nonzeros in a block submatrix, and is recognized by a convolutional neural network (CNN). The operational configuration is added to the model as extra features to enhance the accuracy of the prediction. We used more than one thousand matrices from the University of Florida Sparse Matrix Collection for training and testing, and fine-tuned the model to achieve the best accuracy. Experiments show that our model reaches 82% accuracy for choosing the best data format (top-1) and 94% for the top-2 choices. Compared with using a single sparse data format, our method improves the performance of SpMV by nearly 2.5 times.
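
To make the description above concrete, the following sketch illustrates the two ideas in the abstract: a function that encodes a sparse matrix as a fixed-size density image whose pixels hold the fraction of nonzeros in the corresponding block submatrices, and a small Keras model that concatenates the CNN features with the operational configuration before classifying the best format. This is a minimal hypothetical illustration, not the thesis's actual implementation; the image size, layer widths, and the two configuration features (iteration count and node memory) are assumptions chosen for clarity.

    # A hypothetical sketch, not the thesis code: the image size, layer
    # widths, and configuration features are assumptions for illustration.
    import numpy as np
    import scipy.sparse as sp
    import tensorflow as tf

    IMG = 64                                  # side length of the density image (assumed)
    FORMATS = ["COO", "CSR", "CSC", "BCOO+"]  # candidate formats considered in the thesis

    def density_image(A, size=IMG):
        """Map an m-by-n sparse matrix onto a size-by-size image whose pixel
        (i, j) is the nonzero density of the corresponding block submatrix."""
        A = sp.coo_matrix(A)
        # Scale each nonzero's (row, col) into image coordinates.
        rows = (A.row.astype(np.int64) * size) // A.shape[0]
        cols = (A.col.astype(np.int64) * size) // A.shape[1]
        img = np.zeros((size, size), dtype=np.float32)
        np.add.at(img, (rows, cols), 1.0)     # count nonzeros per pixel
        block_area = (A.shape[0] / size) * (A.shape[1] / size)
        return img / block_area               # normalize counts to densities

    def build_model(num_configs=2):
        """CNN over the density image; the operational configuration
        (e.g., iteration count, node memory) is concatenated with the
        convolutional features before the softmax format classifier."""
        image = tf.keras.Input(shape=(IMG, IMG, 1), name="matrix_image")
        config = tf.keras.Input(shape=(num_configs,), name="op_config")
        x = tf.keras.layers.Conv2D(32, 3, activation="relu")(image)
        x = tf.keras.layers.MaxPooling2D()(x)
        x = tf.keras.layers.Conv2D(64, 3, activation="relu")(x)
        x = tf.keras.layers.MaxPooling2D()(x)
        x = tf.keras.layers.Flatten()(x)
        x = tf.keras.layers.Concatenate()([x, config])  # augment the configuration
        x = tf.keras.layers.Dense(128, activation="relu")(x)
        out = tf.keras.layers.Dense(len(FORMATS), activation="softmax")(x)
        model = tf.keras.Model(inputs=[image, config], outputs=out)
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        return model

In this sketch, a prediction for a matrix A under a given configuration would be model.predict([density_image(A)[None, :, :, None], np.array([[num_iters, mem_gb]])]), and the argmax over FORMATS is taken as the recommended format.
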
1 Introduction 1
2 Background 4
2.1 Sparse Matrix-Vector Multiplication (SpMV) . . . . . . . . . . . . . 4
2.2 Apache Spark . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2.1 Spark Core . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2.2 MLlib . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.3 Data Formats of Sparse Matrices . . . . . . . . . . . . . . . . . . . 6
2.3.1 Coordinate Format (COO) . . . . . . . . . . . . . . . . . . . 6
2.3.2 Compressed Sparse Row (CSR) . . . . . . . . . . . . . . . . 6
2.3.3 Compressed Sparse Column (CSC) . . . . . . . . . . . . . . 7
2.3.4 Block Compressed Sparse Row (BCSR) . . . . . . . . . . . . 8
2.4 Data Formats of Sparse Matrices in Spark MLlib . . . . . . . . . . . 8
2.4.1 Local Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.4.2 Distributed Matrix . . . . . . . . . . . . . . . . . . . . . . . 10
2.5 Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.5.1 Convolutional Neural Networks (CNN) . . . . . . . . . . . . 12
2.5.2 TensorFlow . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.5.3 Keras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.6 The University of Florida Sparse Matrix Collection . . . . . . . . . 14
3 Motivation 15
3.1 The Limitation of MLlib Implementations . . . . . . . . . . . . . . 15
3.2 The Limitation of SpMV on Spark . . . . . . . . . . . . . . . . . . 15
3.3 The Structure of the Matrix . . . . . . . . . . . . . . . . . . . . . . 16
3.4 The Limitation of Adaptive SpMV . . . . . . . . . . . . . . . . . . 17
4 Algorithms and Implementations 19
4.1 Training Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.1.1 The Image of Matrix . . . . . . . . . . . . . . . . . . . . . . 20
4.1.2 The Configurations of the Dataset . . . . . . . . . . . . . . . 21
4.1.3 The Label of Training Set . . . . . . . . . . . . . . . . . . . 22
4.1.4 Pre-processing . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.2 Deep Learning Model Design . . . . . . . . . . . . . . . . . . . . . . 25
4.2.1 Baseline Model . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.2.2 Fine-Tuned Model . . . . . . . . . . . . . . . . . . . . . . . 27
4.2.3 Additional Feature . . . . . . . . . . . . . . . . . . . . . . . 30
4.3 Training Dataset Preparation . . . . . . . . . . . . . . . . . . . . . 32
4.3.1 Block Coordinate Format (BCOO+) . . . . . . . . . . . . . 32
4.3.2 Coordinate Format (COO) . . . . . . . . . . . . . . . . . . . 35
4.3.3 Compressed Sparse Row (CSR) . . . . . . . . . . . . . . . . 35
4.3.4 Compressed Sparse Column (CSC) . . . . . . . . . . . . . . 37
5 Experiments and Results 39
5.1 The Experiments of Learning . . . . . . . . . . . . . . . . . . . . . 39
5.1.1 Experiments Setting . . . . . . . . . . . . . . . . . . . . . . 39
5.1.2 The Structure of the Baseline Model . . . . . . . . . . . . . 39
5.1.3 The Tuning of the Activation Function . . . . . . . . . . . . 40
5.1.4 The Tuning of the Batch Normalization . . . . . . . . . . . . 41
5.1.5 The Image of Matrix . . . . . . . . . . . . . . . . . . . . . . 41
5.1.6 The Overall Performance of Adaptive Format . . . . . . . . 42
5.1.7 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.2 The Experiments of SpMV on Spark . . . . . . . . . . . . . . . . . 45
5.2.1 Experiments Setting . . . . . . . . . . . . . . . . . . . . . . 45
5.2.2 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.2.3 Benchmark Experiment of COO, CSC and CSR . . . . . . . 47
5.2.4 Experiments of BCOO+ . . . . . . . . . . . . . . . . . . . . 48
6 Conclusion and Future Work 50
References 51