Identifiability of a Coalescent-Based Population Tree Model

Published online by Cambridge University Press:  30 January 2018

Arindam RoyChoudhury*
Affiliation:
Columbia University
* Postal address: Department of Biostatistics, Columbia University, 6th Floor, 722 W. 168th Street, New York, NY 10032, USA. Email address: [email protected]

Abstract

Identifiability of evolutionary tree models has been a recent topic of discussion, and some models have been shown to be nonidentifiable. A coalescent-based rooted population tree model, originally proposed by Nielsen et al. (1998), has been used by many authors in the last few years and is a simple tool to accurately model the changes in allele frequencies in the tree. However, the identifiability of this model has never been proven. Here we prove this model to be identifiable by showing that the model parameters can be expressed as functions of the probability distributions of subsamples, assuming that there are at least two (haploid) individuals sampled from each population. This is a step toward proving the consistency of the maximum likelihood estimator of the population tree based on this model.

Type
Research Article
Copyright
© Applied Probability Trust 

References

Allman, E. S., Ané, C. and Rhodes, J. A. (2008). Identifiability of a Markovian model of molecular evolution with gamma-distributed rates. Adv. Appl. Prob. 40, 228–249.
Bryant, D. et al. (2012). Inferring species trees directly from biallelic genetic markers: bypassing gene trees in a full coalescent analysis. Mol. Biol. Evol. 29, 1917–1917.
Chai, J. and Housworth, E. A. (2011). On Rogers' proof of identifiability for the GTR + Γ + I model. Syst. Biol. 60, 713–718.
Felsenstein, J. (1981). Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17, 368–376.
Kingman, J. F. C. (1982). The coalescent. Stoch. Process. Appl. 13, 235–248.
Liu, L., Yu, L., Pearl, D. K. and Edwards, S. V. (2009). Estimating species phylogenies using coalescence times among sequences. Syst. Biol. 58, 468–477.
Matsen, F. A. and Steel, M. (2007). Phylogenetic mixtures on a single tree can mimic a tree of another topology. Syst. Biol. 56, 767–775.
Nielsen, R. and Slatkin, M. (2000). Likelihood analysis of ongoing gene flow and historical association. Evolution 54, 44–50.
Nielsen, R., Mountain, J. L., Huelsenbeck, J. P. and Slatkin, M. (1998). Maximum-likelihood estimation of population divergence times and population phylogeny in models without mutation. Evolution 52, 669–677.
RoyChoudhury, A. (2011). Composite likelihood-based inferences on genetic data from dependent loci. J. Math. Biol. 62, 65–80.
RoyChoudhury, A. and Thompson, E. A. (2012). Ascertainment correction for a population tree via a pruning algorithm for likelihood computation. Theoret. Pop. Biol. 82, 59–65.
RoyChoudhury, A., Felsenstein, J. and Thompson, E. A. (2008). A two-stage pruning algorithm for likelihood computation for a population tree. Genetics 180, 1095–1105.
Steel, M. A., Székely, L. and Hendy, M. D. (1994). Reconstructing trees when sequence sites evolve at variable rates. J. Comp. Biol. 1, 153–163.
Takahata, N. and Nei, M. (1985). Gene genealogy and variance of interpopulational nucleotide differences. Genetics 110, 325–344.