Parameter estimation on growing phylogenetic trees
Presenter
October 21, 2024
Abstract
In recent years, there have been many efforts to understand the statistical properties of evolutionary processes on growing trees (i.e., when the number of taxa increases while the number of traits or the sequence length stays fixed). For most such processes, correlations among the sampled data do not decay, which invalidates the standard i.i.d. assumptions of statistical analysis. This moves the statistical learning problem into a regime where classical tools such as the law of large numbers and the central limit theorem no longer hold, and where many well-known estimators become inconsistent.
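As a small worked illustration of non-decaying correlations, consider a symmetric two-state model with states coded as $\pm 1$, flip rate $q$, and a uniformly distributed root state. The symmetry, the coding, the symbols $q$, $t$, $d_{ij}$, $T$, and the bounded-height condition are all assumptions made here for illustration; they are not taken from the talk itself.

```latex
% Probability that the state changes along an edge (u, v) of length t:
\[
  \Pr\bigl(X_v \neq X_u\bigr) = \tfrac{1}{2}\bigl(1 - e^{-2 q t}\bigr).
\]
% For two leaves i, j joined by a path of total length d_{ij}:
\[
  \mathbb{E}[X_i] = 0, \qquad
  \operatorname{Cov}(X_i, X_j) = \mathbb{E}[X_i X_j] = e^{-2 q\, d_{ij}}.
\]
% If every leaf lies within distance T of the root, then d_{ij} \le 2T, so
\[
  \operatorname{Cov}(X_i, X_j) \;\ge\; e^{-4 q T} \;>\; 0
  \quad \text{for all pairs } i \neq j,
\]
% no matter how many taxa are added.
```

Under these assumptions the pairwise covariance is bounded away from zero however large the tree grows, so the variance of the sample mean does not shrink to zero as the number of taxa increases.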
In this talk, I will outline the main challenges and distinctive characteristics of statistical learning from evolutionarily related, correlated random variables, using the problem of estimating the transition rate of a two-state model on phylogenetic trees as the guiding example. The aim is to illustrate that in this new regime, the problems of parameter estimation and ancestral state reconstruction are inherently intertwined and may need to be solved together, with the help of a new class of concentration inequalities and identifiability conditions.
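The coupling between the rate and the ancestral state can be seen in a toy simulation. The sketch below is not the method of the talk: it assumes a symmetric two-state model with states coded as -1 and +1, and a star tree in which every leaf sits at the same distance from the root. The function name `simulate_star_tree_mean` and the parameters `rate` and `height` are hypothetical names chosen for this example.

```python
import math
import random


def simulate_star_tree_mean(n_leaves, rate, height, rng):
    """Simulate leaf states on a star tree and return (root_state, leaf_mean).

    Assumptions (illustrative only): symmetric two-state model with states
    -1/+1, flip rate `rate`, uniform root state, all edges of length `height`.
    """
    root = rng.choice((-1, 1))
    # Probability that a leaf differs from the root after time `height`.
    p_flip = 0.5 * (1.0 - math.exp(-2.0 * rate * height))
    total = 0
    for _ in range(n_leaves):
        leaf = -root if rng.random() < p_flip else root
        total += leaf
    return root, total / n_leaves


if __name__ == "__main__":
    rng = random.Random(0)
    rate, height = 0.5, 1.0
    # Conditional on the root, each leaf has mean root * exp(-2 * rate * height).
    magnitude = math.exp(-2.0 * rate * height)
    for n in (100, 10_000):
        runs = [simulate_star_tree_mean(n, rate, height, rng) for _ in range(5)]
        summary = [(root, round(mean, 3)) for root, mean in runs]
        print(n, summary, "conditional mean magnitude:", round(magnitude, 3))
```

In this toy setting the sample mean does not settle at the marginal mean of zero as the number of leaves grows. It settles near the root state multiplied by exp(-2 * rate * height), so what the leaves reveal about the rate is tied to the unobserved ancestral state.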