Author: Chen, Ting Wei
Thesis title: On the Construction of the Burrows-Wheeler Transform and the Maximal Repeating Group Finding
Advisor: Lu, Chin Lung
Committee members: Lee, Chia Tung; Tang, Chuan Yi
Keywords: BWT; Maximal Repeating Groups; Exact String Matching
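Abstract (translated from the Chinese): In this thesis, we are interested in using the Burrows-Wheeler Transform (BWT) to solve the string matching problem. The difficulty with the BWT is that generating it for a given string is very time-consuming. Our method modifies the KSS method to generate the BWT. It is simpler to understand and implement than KSS, and our experimental results show that it generates the BWT quite efficiently. We are also interested in the maximal repeated substring problem, and we apply a slightly modified version of our BWT construction method to it. For example, in our experiments on a DNA sequence of 155606181 characters, finding the repeated substrings longer than 2000 took only 226 seconds, and we found 55 pairs of maximal repeated substrings.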
In this thesis, we study the Burrows-Wheeler Transform (BWT) for exact string matching. The main drawback of the BWT is that constructing it is very time-consuming. We have developed a construction method based on the KSS method. Our method is easy to understand and implement, and experimental results show that it is efficient. We also study the maximal repeating group problem and have developed an efficient method for finding maximal repeating groups. For example, for a DNA sequence of length 155606181, it took only 226 seconds to find 55 maximal repeating groups with lengths greater than 2000.
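To make the two ideas in the abstract concrete, the following is a minimal Python sketch of what a BWT is and how it supports exact string matching. It is not the thesis's method and not the KSS method: it builds the transform by naively sorting all suffixes, which is exactly the expensive step that faster construction methods aim to avoid. The function names `bwt` and `count_occurrences` and the `$` sentinel are assumptions made for illustration only.

```python
from collections import Counter


def bwt(text, sentinel="$"):
    """Naive BWT: sort all suffixes of text + sentinel, then take, for each
    suffix in sorted order, the character that precedes it.
    Illustrative only; sorting suffixes directly is slow on long inputs."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in text")
    s = text + sentinel
    # Suffix array obtained by direct comparison sorting of suffixes.
    sa = sorted(range(len(s)), key=lambda i: s[i:])
    # s[i - 1] with i == 0 wraps around to the sentinel, as required.
    return "".join(s[i - 1] for i in sa), sa


def count_occurrences(bwt_str, pattern):
    """Count exact occurrences of pattern via backward search on the BWT."""
    counts = Counter(bwt_str)
    # first[c] = number of characters in bwt_str strictly smaller than c.
    first, total = {}, 0
    for c in sorted(counts):
        first[c] = total
        total += counts[c]
    lo, hi = 0, len(bwt_str)  # current half-open range of sorted suffixes
    for c in reversed(pattern):
        if c not in first:
            return 0
        # str.count(c, 0, k) is a naive rank query: occurrences of c in bwt_str[:k].
        lo = first[c] + bwt_str.count(c, 0, lo)
        hi = first[c] + bwt_str.count(c, 0, hi)
        if lo >= hi:
            return 0
    return hi - lo


transformed, suffix_array = bwt("banana")
print(transformed)                            # annb$aa
print(count_occurrences(transformed, "ana"))  # 2
```

The search processes the pattern from its last character to its first and narrows a range of sorted suffixes at each step, so the original text is never scanned. A practical implementation would replace the naive rank queries with precomputed occurrence tables.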
Chapter 1 Introduction
1.1 Motivations
1.2 Background
1.3 Thesis Organization
Chapter 2 The Suffix Tree Approach
2.1 Prefix and Suffix
2.2 The Introduction of Suffix Tree
2.3 The Searching of Suffix Tree
Chapter 3 The Suffix Array Approach
3.1 An Introduction of the Suffix Array Approach
3.2 The Searching of the Suffix Array Approach
Chapter 4 The Burrows Wheeler Transform
4.1 An Introduction of the Burrows Wheeler Transform
4.2 The BWT Search
4.3 Correctness
4.4 The Implementation of the BWT Method
Chapter 5 Our Method to Obtain the BWT Efficiently
5.1 The Introduction of Our Method
5.2 The Experiment of Our Method
Chapter 6 Our Method to Solve Repeating Group Finding Problem
6.1 Solving the Repeating Group Finding Problem with Dynamic Programming
6.2 Our Method to Find the Repeating Groups
6.3 Experiment Results
References
Appendix A
Appendix B
