Author: Li, Chih-Yu
Thesis title: A Study on Consensus Methods for Phylogenies
Advisor: Wang, Biing-Feng
Oral defense committee: Wang, Jia-Shung; Huang, Yao-Ting
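Phylogenetic trees are important data structures for analyzing the evolutionary relationships among organisms. However, phylogenetic trees built from different information or by different methods may have different structures. Several methods exist for resolving such conflicts, and one of the most commonly used is the consensus tree, which is constructed from several phylogenetic trees to summarize the information they share. Adams proposed the first consensus tree method, known as the Adams consensus tree, in 1972. Since then, many kinds of consensus trees have been proposed and studied in depth, and some have been implemented in widely used bioinformatics software.
This thesis first analyzes the various consensus trees, introducing their definitions and summarizing the running times of their known algorithms. In addition, it presents an algorithm with time complexity O(k^2n) for the frequency difference consensus tree problem, where n is the number of species and k is the number of input phylogenetic trees. The best previous algorithm for this problem, due to Jansson et al., runs in min{O(kn^2), O(kn(k + log^2 n))} time. Jansson et al. also suggested improving their O(kn(k + log^2 n))-time algorithm to O(k^2n) time; the algorithm presented in this thesis achieves this goal.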
Phylogenies are important structures for analyzing the evolutionary relationships among species. However, phylogenies obtained from different data sets or methods may have different structures. Several methods have been proposed to resolve such conflicts. One of the most popular approaches is the consensus tree method, which constructs a consensus tree to summarize the information common to a collection of different phylogenies on the same set of species. Adams introduced the first such method, the Adams consensus tree, in 1972. Since then, numerous consensus tree methods have been proposed and extensively studied. Some of them are also implemented in popular computational phylogenetics software packages.
This thesis first presents a comprehensive review of consensus tree methods. For each method, its definition is introduced and its known complexity results are summarized. The thesis also provides an efficient O(k^2n)-time algorithm for the frequency difference consensus tree problem, which is one of the more recent consensus tree methods. The best previous upper bound for this problem, due to Jansson et al., is min{O(kn^2), O(kn(k + log^2 n))}, where n is the number of species and k is the number of input trees. Jansson et al. posed improving their O(kn(k + log^2 n)) upper bound to O(k^2n) as an open problem. The algorithm presented in this thesis gives a positive answer to their question.
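To make the frequency difference consensus tree problem concrete, the following is a minimal, naive sketch. It is not the O(k^2n)-time algorithm of this thesis. It assumes the usual definition from the literature: a cluster is kept if it occurs in more input trees than every cluster incompatible with it. The nested-tuple tree encoding and the function names are assumptions made only for this illustration.

```python
from collections import Counter

def clusters(tree):
    """Return the clusters (leaf sets below each node) of a nested-tuple tree.

    Assumed encoding: a leaf is a string, an internal node is a tuple of children.
    """
    found = set()

    def leaves_below(node):
        if isinstance(node, tuple):
            leaves = frozenset().union(*(leaves_below(child) for child in node))
        else:
            leaves = frozenset([node])
        found.add(leaves)
        return leaves

    leaves_below(tree)
    return found

def compatible(a, b):
    """Two clusters are compatible if one contains the other or they are disjoint."""
    return a <= b or b <= a or a.isdisjoint(b)

def frequency_difference_clusters(trees):
    """Naive filter: compares every pair of distinct clusters, so it is
    quadratic in the number of distinct clusters."""
    freq = Counter(c for t in trees for c in clusters(t))
    return {
        c for c in freq
        if all(freq[c] > freq[d] for d in freq if not compatible(c, d))
    }

# Three hypothetical input trees on the species set {a, b, c, d}.
t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "b"), "d"), "c")
t3 = ((("a", "c"), "b"), "d")
kept = frequency_difference_clusters([t1, t2, t3])
print(sorted(sorted(c) for c in kept if 1 < len(c) < 4))
# prints [['a', 'b'], ['a', 'b', 'c']]
```

Here {a, b} and {a, b, c} each occur in two trees and beat their only incompatible clusters, {a, c} and {a, b, d}, which occur once each. The trivial clusters (single species and the full set) are always kept.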
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 A Review on Consensus Tree Methods
2.1 Notation and Definition
2.1.1 Phylogenetic trees and clusters
2.1.2 Unrooted phylogenies and splits
2.1.3 Compatibility
2.2 Cluster-based Consensus Trees
2.2.1 Strict Consensus Tree
2.2.2 Majority Rule Consensus Tree
2.2.3 M_l Consensus Tree
2.2.4 Majority rule (+) Consensus Tree
2.2.5 Frequency Difference Consensus Tree
2.2.6 Greedy Consensus Tree
2.2.7 Loose consensus tree
2.2.8 Nelson-Page consensus tree
2.2.9 Summary of complexity results 2.2
2.3 Cluster-intersection-based Consensus Tree
2.3.1 Adams consensus tree
2.3.2 Neumann consensus tree
2.3.3 Durchschnitt consensus tree
2.3.4 Cardinality rule consensus tree
2.3.5 s-consensus tree
2.3.6 Summary of complexity results 2.3
2.4 Subtree-based Consensus Tree
2.4.1 Local consensus tree
2.4.2 Prune and regraft tree
2.4.3 Q* and R* consensus tree
2.4.4 Summary of complexity results 2.4
2.5 Recoding-based Consensus Tree
2.5.1 Matrix representation with parsimony
2.5.2 Average consensus tree
2.5.3 Buneman consensus tree
2.5.4 Summary of complexity results 2.5
2.6 Implementation of consensus trees
Chapter 3 Day’s Algorithm
3.1 Relabeling
3.2 Cluster table
3.3 Procedure COMCLUSTER
3.4 Time complexity
Chapter 4 Common Subroutines for Clusters
4.1 Delete Cluster
4.2 Delete Cluster Group
4.3 One-way Compatible
4.4 Merge Tree
Chapter 5 An O(k^3n)-time Algorithm
5.1 Preprocessing
5.2 Cluster removal
5.2.1 k-value decomposition
5.2.2 Candidate cluster trees
5.2.3 Filter trees
5.3 Tree combine
5.4 Time Complexity
Chapter 6 An O(k^2n)-time Algorithm
6.1 Procedure Filter Clusters
6.2 Jansson et al.’s Algorithm
6.3 Procedure Modified Filter Clusters
6.4 Time Complexity
Chapter 7 Conclusion and Future Work
References

