On the Number of Non-equivalent Ancestral Configurations for Matching Gene Trees and Species Trees.

Affiliation

Department of Biology, Stanford University, Stanford, CA, USA. [Email]

Abstract

An ancestral configuration is one of the combinatorially distinct sets of gene lineages that, for a given gene tree, can reach a given node of a specified species tree. Ancestral configurations have appeared in recursive algebraic computations of the conditional probability that a gene tree topology is produced under the multispecies coalescent model for a given species tree. For matching gene trees and species trees, we study the number of ancestral configurations, considered up to an equivalence relation introduced by Wu (Evolution 66:763-775, 2012) to reduce the complexity of the recursive probability computation. We examine the largest number of non-equivalent ancestral configurations possible for a given tree size n. Whereas the smallest number of non-equivalent ancestral configurations increases polynomially with n, we show that the largest number increases with [Formula: see text], where k is a constant that satisfies [Formula: see text]. Under a uniform distribution on the set of binary labeled trees with a given size n, the mean number of non-equivalent ancestral configurations grows exponentially with n. The results refine an earlier analysis of the number of ancestral configurations considered without applying the equivalence relation, showing that use of the equivalence relation does not alter the exponential nature of the increase with tree size.

Keywords

Ancestral configurations,Combinatorics,Gene trees and species trees,Phylogenetics,