Benefits and Challenges of Data-Motivated Phylogenetic Inference
Presenter
November 20, 2024
Abstract
Many evolutionary scenarios require complex mathematical models to capture the underlying processes. Evaluating these models is often computationally hard, which motivates the use of heuristics to estimate optimal phylogenies. Instead of addressing the general case, this talk highlights the benefits and challenges of focusing on individual scenarios. In some scenarios, simpler models capture the relationships efficiently and yield rich insights, often improving the analysis and visualization of the results in both accuracy and running time. These simpler models hold promise for data sets with a high consistency index, as well as for inferring the mutation history of tumors. One challenge of data-motivated inference arises when data sets increase in taxonomic scope: some characters may describe traits that are not found across all taxa. It is common practice to treat such inapplicable characters as missing data. But data that are missing, whether because of poor sequencing or a broken fossil, are very different from a character that does not apply to some taxa, such as wing color for taxa with no wings. We explore how the treatment of missing and inapplicable characters can greatly affect which phylogenies are inferred.
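
As a rough illustration of the last point, the sketch below counts parsimony steps for a single character with Fitch's algorithm under two treatments of the inapplicable token. The four taxa, the two candidate trees, the "?" and "-" tokens, and the function names are all hypothetical choices made for this example. Coding "inapplicable" as an extra state is only one commonly discussed alternative to treating it as missing, and is not necessarily the approach taken in the talk.

```python
def fitch_steps(tree, state_sets):
    """Fitch's algorithm on a rooted binary tree given as nested 2-tuples.

    Returns (candidate state set at this node, number of steps below it).
    """
    if isinstance(tree, str):  # a leaf is just a taxon name
        return state_sets[tree], 0
    left, right = tree
    left_set, left_steps = fitch_steps(left, state_sets)
    right_set, right_steps = fitch_steps(right, state_sets)
    shared = left_set & right_set
    if shared:  # children agree on at least one state: no change needed
        return shared, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1


def parsimony_steps(tree, coding, inapplicable_as_missing):
    """Steps for one character; '?' is missing, '-' is inapplicable (assumed tokens)."""
    missing = {"?"} | ({"-"} if inapplicable_as_missing else set())
    observed = frozenset(s for s in coding.values() if s not in missing)
    if not observed:  # nothing scored at all: the character costs no steps
        return 0
    # A missing entry is a wildcard that may take any observed state.
    state_sets = {
        taxon: observed if s in missing else frozenset({s})
        for taxon, s in coding.items()
    }
    return fitch_steps(tree, state_sets)[1]


# Hypothetical character "wing color": taxa C and D have no wings.
coding = {"A": "0", "B": "0", "C": "-", "D": "-"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

for as_missing in (True, False):
    scores = [parsimony_steps(t, coding, as_missing) for t in (tree_1, tree_2)]
    print("inapplicable as missing:", as_missing, "steps:", scores)
```

In this toy case, treating "-" as missing leaves the character uninformative, with both trees costing 0 steps. Treating "-" as its own state makes the first tree cheaper (1 step against 2), so the choice of treatment alone changes which tree the character supports.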