Mathematical and Computational Challenges in Reconstructing Evolution
Presenter
Tandy Warnow - University of Illinois at Urbana-Champaign, Computer Sciences
November 8, 2018
Abstract
The estimation of species trees from multi-locus datasets is a basic step in many biological research projects. However, heterogeneity between loci, resulting from processes such as incomplete lineage sorting and horizontal gene transfer, makes standard approaches (such as concatenation using maximum likelihood) statistically inconsistent. In this talk, I will present state-of-the-art methods for species tree estimation from multi-locus datasets when gene trees can differ from the species tree due to incomplete lineage sorting. I will also discuss the current understanding of statistical consistency in two settings: when sequence lengths and the number of genes both go to infinity (essentially assuming perfect gene trees), and when the sequence length per gene is bounded but the number of genes goes to infinity. Finally, I will cover state-of-the-art methods for large-scale species tree estimation and present new techniques for improving their scalability to large datasets. Much of this talk covers unpublished research, joint with Erin Molloy (Illinois), Mike Nute (Illinois), and Sebastien Roch (Wisconsin).