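Author (Chinese): 陳盈妤
Author (English): Chen, Ying-Yu
Thesis title (Chinese): 在子演化樹上做字串搜尋的索引框架
Thesis title (English): Index Framework for the Subphylogeny Pattern Searching Problem
Advisor (Chinese): 韓永楷
Advisor (English): Hon, Wing-Kai
Committee members (Chinese): 盧錦隆
李哲榮
Committee members (English): Lu, Chin-Lung
Lee, Che-Rung
Degree: Master's
University: National Tsing Hua University
Department: Department of Computer Science
Student ID: 102062596
Year of publication (ROC calendar): 106
Academic year of graduation: 105
Language: English
Number of pages: 23
Keywords (Chinese): 子演化樹; 字串搜尋; DNA相似度; 索引結構
Keywords (English): Subphylogeny; Pattern Matching; DNA Similarity; Indexing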
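Abstract (Chinese, translated): This thesis studies how to search for strings efficiently on an arbitrary subphylogeny. We exploit the fact that species close to each other in the phylogeny have similar DNA strings, which reduces the space needed to store identical strings. Combining this with three data structures (suffix tree, wavelet tree, and heavy path decomposition), we design an index structure whose space is proportional to the minimum amount of data, and use it to achieve efficient string searching.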
Abstract (English): This thesis studies how to answer subphylogeny pattern matching queries on a phylogeny efficiently. Species that are close in the phylogeny share many similarities in their DNA sequences, which lets us reduce the space needed to store these sequences. By exploiting this fact and applying existing data structures such as the suffix array, the wavelet tree, and heavy path decomposition, we design an indexing structure that takes asymptotically minimal space while supporting subphylogeny pattern matching queries efficiently.
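As an illustration of one building block named in the abstract, the following is a minimal sketch of pattern matching with a suffix array over a single DNA string. It is not the index designed in the thesis: the function names `build_suffix_array` and `find_occurrences` are hypothetical, the construction is the naive sort-based one, and the subphylogeny aspect (wavelet tree and heavy path decomposition) is left out entirely.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text in lexicographic order.

    Naive construction by sorting the suffixes directly; fine for short
    examples, far too slow for genome-scale input.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return all start positions of pattern in text, in increasing order.

    Suffixes that begin with pattern form one contiguous block of the
    suffix array, so two binary searches locate the block's boundaries.
    """
    m = len(pattern)

    # First index whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # First index whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


if __name__ == "__main__":
    dna = "ACGTACGA"
    sa = build_suffix_array(dna)
    print(find_occurrences(dna, sa, "ACG"))  # [0, 4]
```

Each query costs O(m log n) character comparisons for a pattern of length m in a text of length n, which is the basic trade-off a suffix array offers over scanning the text.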
1. Introduction
2. Preliminaries
2.1 Suffix Array
2.2 Heavy Path Decomposition
2.3 Wavelet Tree
2.3.1 Restricted 2D Orthogonal Range Query
3. Our Framework
3.1 Index Structure
3.2 Query Algorithm
4. Conclusion
[1] R. J. Britten. Divergence between Samples of Chimpanzee and
Human DNA Sequences is 5%, Counting Indels. Proc. National
Academy of Science, 99(21):13633-13635, 2002.
[2] R. Cole, L.-A. Gottlieb, and M. Lewenstein. Dictionary Matching
and Indexing with Errors and Don't Cares. In Proc. of Symposium
on Theory of Computing (STOC), pages 91-100, 2004.
[3] P. Ferragina and G. Manzini. Indexing Compressed Text. Journal
of the ACM (JACM), 52(4):552-581, 2005.
[4] R. Grossi, A. Gupta, and J. S. Vitter. High-order Entropy-compressed
Text Indexes. In Proc. of ACM-SIAM Symposium
on Discrete Algorithms (SODA), pages 841-850, 2003.
[5] M.-C. King and A. C. Wilson. Evolution at Two Levels in Humans
and Chimpanzees. Science, 188(4184):107-116, 1975.
[6] U. Manber and G. Myers. Suffix Arrays: A New Method for On-line
String Searches. SIAM Journal on Computing, 22(5):935-948, 1993.
[7] E. M. McCreight. A Space-Economical Suffix Tree Construction
Algorithm. Journal of the ACM (JACM), 23(2):262-272, 1976.
[8] G. Navarro. Wavelet Trees for All. Journal of Discrete Algorithms (JDA), 25:2-20, 2014.
[9] R. Raman, V. Raman, and S. R. Satti. Succinct Indexable Dictionaries with Applications to Encoding k-ary Trees, Prefix Sums
and Multisets. ACM Transactions on Algorithms (TALG), 3(4),
article 43, 2007.
[10] P. Weiner. Linear Pattern Matching Algorithms. In Proc. of
Symposium on Switching and Automata Theory, pages 1-11,
1973.
[11] Wikipedia entry for Metagenomics. Link at:
https://en.wikipedia.org/wiki/Metagenomics
 
 
 
 