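Author (Chinese): 楊宗翰
Author (English): Yang, Tsung-Han
Thesis title (Chinese): 結合多層次模塊化分群與多模式演算法以增進軟體模組化品質
Thesis title (English): Incorporating Multi-Level Modularity Clustering into Multi-Pattern Algorithms for Improving the Quality of Software Modularization
Advisor (Chinese): 黃慶育
Advisor (English): Huang, Chin-Yu
Committee members (Chinese): 林振緯; 蘇銓清; 林其誼
Committee members (English): Lin, Jenn-Wei; Sue, Chuan-Ching; Lin, Chi-Yi
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 108062613
Year of publication (ROC calendar): 110 (2021)
Graduation academic year: 109
Language: Chinese
Number of pages: 69
Keywords (Chinese): 軟體架構恢復; 軟體模組化; 軟體叢集; 軟體分群; 模塊化分群
Keywords (English): Software architecture recovery; Software modularization; Software clustering; Modular clustering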
Software has become an indispensable part of daily life. As a system evolves over successive development cycles, maintenance work often causes its architecture to drift away from the original design. The project then becomes hard to maintain, and the architecture eventually has to be systematically re-analyzed and re-modularized, which is a considerable burden for maintainers. Clustering is an intuitive way to implement modularization, but traditional clustering algorithms have shortcomings: the resulting modules may be very large, they may be hard to understand, and the number of target modules cannot be determined automatically. Researchers in this field have therefore tried different methods, looked for similar features, and applied more advanced algorithms to address these problems.
In this study, we propose a multi-pattern modularity clustering algorithm (MPMC) to improve the quality of software modularization. The algorithm combines two components. The first is a multi-pattern strategy that analyzes the degree of dependency between source files and, through steps such as file labeling and the collection of chain dependencies, preprocesses and merges files before clustering. The second is a multi-level modularity clustering algorithm, a graph clustering method that partitions the graph through coarsening and multi-level refinement. In the experiments, we used five open-source programs and one closed-source program of different sizes and functions as datasets, and compared MPMC with traditional software clustering algorithms and existing software clustering methods. The evaluation results show that, compared with the other methods, MPMC improves the Weighted-MQ (weighted modularization quality) metric by a factor of approximately 2.71 and the MoJoFM metric by a factor of 1.35. These results indicate that MPMC is an effective software clustering tool for developers and that it produces modules of higher quality than the other methods.
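The modularity clustering component mentioned in the abstract can be illustrated with a small sketch. It is not the thesis's implementation: it assumes the standard Newman modularity score on an undirected, weighted file-dependency graph without self-loops, and it shows only a naive greedy merging (coarsening) phase, with the multi-level refinement phase omitted. The names `modularity`, `greedy_coarsen`, and `EXAMPLE_EDGES` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy graph: two triangles of files joined by a single dependency.
EXAMPLE_EDGES = {
    ("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0,
    ("d", "e"): 1.0, ("e", "f"): 1.0, ("d", "f"): 1.0,
    ("c", "d"): 1.0,
}


def modularity(edges, membership):
    """Newman modularity Q for an undirected weighted graph.

    edges: dict mapping (u, v) to a positive weight, each edge listed once,
           with no self-loops (an assumption of this sketch).
    membership: dict mapping each node to its cluster id.
    """
    total = sum(edges.values())      # m: total edge weight
    internal = defaultdict(float)    # edge weight inside each cluster
    degree = defaultdict(float)      # summed node degrees per cluster
    for (u, v), weight in edges.items():
        degree[membership[u]] += weight
        degree[membership[v]] += weight
        if membership[u] == membership[v]:
            internal[membership[u]] += weight
    return sum(internal[c] / total - (degree[c] / (2 * total)) ** 2
               for c in degree)


def greedy_coarsen(edges):
    """Merge the pair of clusters that raises Q the most until no merge helps."""
    nodes = {node for pair in edges for node in pair}
    membership = {node: node for node in nodes}  # start with singleton clusters
    best = modularity(edges, membership)
    while True:
        candidate = None
        for a, b in combinations(sorted(set(membership.values())), 2):
            # Tentatively relabel cluster b as cluster a and rescore.
            trial = {n: (a if c == b else c) for n, c in membership.items()}
            score = modularity(edges, trial)
            if score > best + 1e-12:
                best, candidate = score, trial
        if candidate is None:
            return membership, best
        membership = candidate
```

Calling `greedy_coarsen(EXAMPLE_EDGES)` returns a node-to-cluster mapping and its modularity score. Rescoring every candidate merge from scratch is quadratic in the number of clusters per round, so this form is only suitable for small illustrations.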
Abstract
Abstract in Chinese
Acknowledgement
Contents
List of Tables
List of Figures
List of Symbols
Chapter 1 Introduction
Chapter 2 Background and Literature Review
Chapter 3 Multi-Pattern Modularity Clustering
3.1 Step 1: Preprocessing
3.1.1 Step 1-1: Dependency Matrix Construction
3.1.2 Step 1-2: Non-Dependency Files Removal
3.1.3 Step 1-3: Self-Dependency Removal
3.2 Step 2: File Labeling
3.2.1 Removal of Omnipresent Objects: Utility Files and Driver Files
3.2.2 Step 2-2: Leaf Collection: Parent and Children
3.2.3 Step 2-3: Label Uniqueness
3.3 Step 3: Collection of Chain Dependency
3.4 Step 4: Multilevel Modularity Clustering
Chapter 4 Experimental Results and Analysis
4.1 Selected Subject Systems
4.2 Ground Truth
4.3 Evaluation Criteria
4.3.1 Weighted Modularization Quality
4.3.2 Metrics: Accuracy, Precision, Recall and F1-Measure
4.3.3 MoJoFM
4.4 Experimental Result
Chapter 5 Research Question and Threat to Validity
5.1 Research Question
5.2 Threat to Validity
Chapter 6 Conclusions and Future Work
Reference
Appendix
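The preprocessing sub-steps named in Section 3.1 of the outline (dependency matrix construction, non-dependency file removal, self-dependency removal) might look roughly like the sketch below. The record does not describe how the thesis implements them, so everything here is an assumption: dependencies are given as hypothetical (source file, target file) pairs, a "non-dependency" file is taken to mean one with no incoming or outgoing dependencies, and the steps are applied in the order the outline lists them.

```python
def build_dependency_matrix(files, dependencies):
    """Step 1-1 (assumed form): entry [i][j] counts references from files[i] to files[j]."""
    index = {name: i for i, name in enumerate(files)}
    matrix = [[0] * len(files) for _ in files]
    for source, target in dependencies:
        matrix[index[source]][index[target]] += 1
    return matrix


def remove_isolated_files(files, matrix):
    """Step 1-2 (assumed form): drop files with no incoming or outgoing dependencies."""
    keep = [i for i in range(len(files))
            if any(matrix[i]) or any(row[i] for row in matrix)]
    kept_files = [files[i] for i in keep]
    kept_matrix = [[matrix[i][j] for j in keep] for i in keep]
    return kept_files, kept_matrix


def remove_self_dependencies(matrix):
    """Step 1-3 (assumed form): zero the diagonal so a file never depends on itself."""
    return [[0 if i == j else value for j, value in enumerate(row)]
            for i, row in enumerate(matrix)]


def preprocess(files, dependencies):
    """Run the three sub-steps in the order the outline lists them."""
    matrix = build_dependency_matrix(files, dependencies)
    files, matrix = remove_isolated_files(files, matrix)
    return files, remove_self_dependencies(matrix)
```

With this ordering, a file whose only dependency is on itself survives Step 1-2 and becomes isolated only after Step 1-3. Whether the thesis treats such files differently is not stated in this record.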
(The full text will be open to external access after 2026-08-03.)