Maximum Agreement Subtree in a Set of Evolutionary Trees: Metrics and Efficient Algorithms

The maximum agreement subtree approach is one method of reconciling different evolutionary trees for the same set of species. An agreement subtree is obtained by choosing a subset of the species for which the restricted subtree is equivalent (under a suitable definition) in all of the given evolutionary trees.
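As an illustration of the definition only, the following is a minimal brute-force sketch, not the algorithm of this paper. It assumes rooted trees written as nested tuples with string leaf labels (a hypothetical representation chosen for the example) and takes "equivalent" to mean that the restrictions coincide once internal nodes left with a single child are suppressed, which is the homeomorphic notion. The function names `restrict`, `leaves`, and `brute_force_mast` are introduced here for illustration. The search tries every subset of species, so its running time is exponential in the number of species.

```python
from itertools import combinations

def restrict(tree, keep):
    """Restrict a rooted tree to the leaves in `keep`.

    Internal nodes left with one child are suppressed, and the result is
    returned in a canonical, order-independent form (nested frozensets).
    Returns None if no leaf of `tree` is in `keep`.
    """
    if isinstance(tree, str):              # a leaf, labeled by its species
        return tree if tree in keep else None
    kids = [r for r in (restrict(c, keep) for c in tree) if r is not None]
    if not kids:
        return None
    if len(kids) == 1:                     # suppress the unary node
        return kids[0]
    return frozenset(kids)

def leaves(tree):
    """Return the set of leaf labels of a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(c) for c in tree))

def brute_force_mast(trees):
    """Return a largest species set on which all trees agree (exhaustive search)."""
    species = sorted(set.intersection(*(leaves(t) for t in trees)))
    for size in range(len(species), 0, -1):        # largest subsets first
        for subset in combinations(species, size):
            keep = set(subset)
            # All trees agree iff their canonical restrictions are identical.
            if len({restrict(t, keep) for t in trees}) == 1:
                return keep
    return set()

# Three small trees on the species a, b, c, d (assumed example data).
t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "c"), "b"), "d")
t3 = (("a", "b"), ("c", "d"))
result = brute_force_mast([t1, t2, t3])
```

In this example the three trees disagree on the full species set, but they agree once one species is removed, so the sketch returns a set of three species.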

Recently, dynamic programming ideas were used to provide polynomial time algorithms for finding a maximum homeomorphic agreement subtree of two trees. Generalizing these methods to sets of more than two trees yields algorithms that are exponential in the number of trees. Unfortunately, in practice one is usually presented with more than two trees, sometimes as many as thousands of trees.

In this paper we prove that the maximum homeomorphic agreement subtree problem is $\mathcal{NP}$-complete for three trees with unbounded degrees. We then show an approximation algorithm, running in time $O(kn^5)$, for choosing the species that are not in a maximum agreement subtree of a set of $k$ trees. Our approximation is guaranteed to provide a set whose size is no more than 4 times that of an optimum solution.

While the set of evolutionary trees may be large in practice, the trees usually have very small degrees, typically no larger than three. We develop a new method for finding a maximum agreement subtree of $k$ trees, of which one has degree bounded by $d$. This new method enables us to find a maximum agreement subtree in time $O(kn^{d+1} + n^{2d})$.
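As a worked instance of this bound, taking the typical degree mentioned above, $d = 3$, and substituting into the stated running time gives

$$O\left(kn^{d+1} + n^{2d}\right)\Big|_{d=3} = O\left(kn^{4} + n^{6}\right),$$

which is linear in the number of trees $k$ and polynomial in the number of species $n$.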
