Accommodating rate heterogeneity to substitution models
Presenter
October 23, 2024
Abstract
In this talk we discuss single-site substitution models used in Markov processes on phylogenetic trees. Most of the widely used models are continuous-time and implicitly assume that substitution rates are time-homogeneous along each branch of the tree. Moreover, most of these models also assume global homogeneity by imposing a single rate matrix throughout the tree, and they are commonly stationary and time-reversible.
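As a minimal illustration of what time-homogeneity along a branch means, the sketch below uses the simplest continuous-time model, in which all substitutions occur at the same rate (the Jukes-Cantor model when there are 4 states). This model is chosen only for illustration and is not one of the models presented in the talk; the function names and numerical values are hypothetical. With a constant rate matrix Q, the transition matrix on a branch of length t is P(t) = exp(tQ), so splitting the branch into two pieces of lengths s and t gives P(s)P(t) = P(s + t).

```python
import math


def equal_rates_transition_matrix(rate, t, k=4):
    """Closed form of P(t) = exp(tQ) for the k-state equal-rates model.

    Q has every off-diagonal entry equal to `rate` and every diagonal
    entry equal to -(k - 1) * rate (illustrative model, not from the talk).
    """
    decay = math.exp(-k * rate * t)
    same = 1.0 / k + (k - 1.0) / k * decay  # probability of no net change
    diff = (1.0 - decay) / k                # probability of each other state
    return [[same if i == j else diff for j in range(k)] for i in range(k)]


def matmul(a, b):
    """Plain-Python product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def max_abs_diff(a, b):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical rate and branch lengths, chosen only for the demonstration.
rate, s, t = 0.1, 0.3, 0.5
p_s = equal_rates_transition_matrix(rate, s)
p_t = equal_rates_transition_matrix(rate, t)
p_st = equal_rates_transition_matrix(rate, s + t)

# Time-homogeneity: composing the two sub-branches reproduces the full branch.
gap = max_abs_diff(matmul(p_s, p_t), p_st)
print("max |P(s)P(t) - P(s+t)| =", gap)
```

A transition matrix that does not arise as exp(tQ) for a single rate matrix Q need not factor in this way, which is one sense in which continuous-time homogeneous models are more restrictive than the general Markov model.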
At the other end of the spectrum there is the general Markov model, which assumes neither homogeneity of rates nor stationarity nor time-reversibility, and is the most general site substitution model one could consider on a phylogenetic tree. As it involves many parameters, it is impractical for a maximum likelihood approach and, although reconstruction methods based on phylogenetic invariants have been proposed for this model, they are not yet widely used.
In the case of amino acid substitution, the models commonly used in phylogenetic software rely on empirical rates and consider only a single parameter per branch. This contrasts with the general Markov model, which in this case involves 380 parameters per edge (a 20 × 20 transition matrix whose rows each sum to 1). In between, there are the algebraic time-reversible models that we present in this talk: stationary and time-reversible models that do not assume time-homogeneity of rates and are amenable to an algebraic approach. We will discuss how phylogenetic invariants can be adapted to these models and how they can mimic tools like the Hadamard transform. We will also discuss how restrictive continuous-time models are with respect to the general Markov model.
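The parameter count quoted above can be checked with a short sketch. The function name is hypothetical; the count simply uses the fact that a k × k transition matrix has k rows, each a probability vector with k - 1 free entries.

```python
def general_markov_params_per_edge(k):
    """Free parameters of a k x k transition matrix (each row sums to 1)."""
    return k * (k - 1)


# Nucleotides (k = 4) and amino acids (k = 20).
print(general_markov_params_per_edge(4))   # 12
print(general_markov_params_per_edge(20))  # 380
```

By comparison, an empirical amino acid model with fixed rates leaves only one free parameter per branch, so the two extremes differ by a factor of 380 in per-edge parameters.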
This is joint work with R. Homs-Pons and A. Torres.