A central limit theorem for the parsimony length of trees
Published online by Cambridge University Press: 01 July 2016
Abstract
In phylogenetic analysis it is useful to study the distribution of the parsimony length of a tree under the null model, under which the leaves are independently assigned letters according to prescribed probabilities. Except in one special case, this distribution is difficult to describe exactly. Here we analyze this distribution by providing a recursive and readily computable description, and by establishing large deviation bounds for the parsimony length of a fixed tree on a single site and for the minimum length (maximum parsimony) tree over several sites. We also show that, under very general conditions, the former distribution is asymptotically normal, thereby settling a recent conjecture. Furthermore, we show how the mean and variance of this distribution can be efficiently calculated. The proof of normality requires a number of new and recent results, because the parsimony length is not directly expressible as a sum of independent random variables, and so normality does not follow immediately from a standard central limit theorem.
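To make the object of study concrete, the following is a minimal sketch, not the recursive description or the mean and variance calculation from the paper. It assumes a rooted binary tree encoded as nested pairs with integer leaf indices, computes the parsimony length of one leaf assignment with the standard Fitch algorithm, and draws leaf letters independently to sample the null distribution by Monte Carlo. The function names, the tree encoding, and the letter probabilities are illustrative assumptions.

```python
import random
from statistics import mean, pvariance


def fitch(tree, letters):
    """Return (state set, parsimony length) for a rooted binary tree.

    `tree` is either an integer leaf index or a pair (left, right);
    `letters[i]` is the letter assigned to leaf i. This encoding is an
    assumption made for the sketch.
    """
    if not isinstance(tree, tuple):
        return frozenset([letters[tree]]), 0
    left_set, left_len = fitch(tree[0], letters)
    right_set, right_len = fitch(tree[1], letters)
    common = left_set & right_set
    if common:
        # The children can agree on a letter, so no extra change is needed.
        return common, left_len + right_len
    # Disjoint child sets force one additional change at this node.
    return left_set | right_set, left_len + right_len + 1


def parsimony_length(tree, letters):
    """Minimum number of letter changes needed on `tree` for `letters`."""
    return fitch(tree, letters)[1]


def simulate_lengths(tree, n_leaves, probs, n_samples, seed=0):
    """Sample parsimony lengths under the null model (independent leaves)."""
    rng = random.Random(seed)
    alphabet = list(probs)
    weights = [probs[a] for a in alphabet]
    return [
        parsimony_length(tree, rng.choices(alphabet, weights=weights, k=n_leaves))
        for _ in range(n_samples)
    ]


if __name__ == "__main__":
    tree = (((0, 1), (2, 3)), ((4, 5), (6, 7)))  # hypothetical 8-leaf tree
    probs = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}  # illustrative values
    lengths = simulate_lengths(tree, 8, probs, 10_000)
    print("estimated mean:", mean(lengths))
    print("estimated variance:", pvariance(lengths))
```

The simulation only estimates the mean and variance. The abstract states that these quantities can be calculated efficiently, so a sampler like this is best treated as a sanity check on small trees rather than a substitute for that calculation.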
Type: General Applied Probability
Copyright © Applied Probability Trust 1996
Footnotes
The research of Larry Goldstein and Michael S. Waterman was supported in part by NSF grant DMS-9005833, and that of Larry Goldstein in part by NSF grant DMS-9505075. The research of Michael S. Waterman was also supported in part by NIH grant GM 36230.