MathInstitutes.org

Challenges posed by dependency structures in non-normal multivariate data from microbiota

Presenter

Susan Holmes

January 23, 2020

Abstract

Susan Holmes, Stanford University, Statistics

Data from sequencing bacterial communities are formalized as contingency tables whose columns correspond to different biological sample specimens. The row features are a random collection of Amplicon Sequence Variants (ASVs, in the case of 16S rRNA amplicon sequencing) or gene fragments (in the case of metagenomics). In both cases, these entities are defined after the data are collected, which imposes a nonparametric framework. There are usually more feature rows than columns, which makes regularization necessary, here through the use of Bayesian priors. However, the classical Dirichlet-multinomial models are insufficient to account for the strong associations (or exclusions) between certain bacteria. Recent hierarchical models such as latent Dirichlet topic models therefore provide a more flexible framework that allows mixed membership models, which are more appropriate for these non-Gaussian data. We will show how these hierarchical topic models can enhance our understanding of both longitudinal dependencies between samples and biological dependencies between taxa, regardless of differences in sampling depth and sources of variability. This talk contains joint work with Kris Sankaran, Pratheepa Jeganathan and David Relman's group at Stanford.
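As a rough illustration of the mixed-membership idea described in the abstract, the sketch below simulates a taxa-by-sample count table from a latent Dirichlet topic model. It is not the speakers' code: the function name `simulate_topic_counts`, the Dirichlet concentration values (0.5), and the default sizes and sampling depths are all assumptions chosen for illustration. Each "topic" is a probability distribution over taxa, each sample mixes several topics, and each sample's counts are drawn with its own sampling depth.

```python
import numpy as np


def simulate_topic_counts(n_samples=6, n_taxa=20, n_topics=3, depths=None, seed=0):
    """Simulate a taxa x samples count table from a topic (mixed-membership) model.

    All parameter values here are illustrative assumptions, not values from the talk.
    """
    rng = np.random.default_rng(seed)
    if depths is None:
        # Hypothetical, unequal sampling depths (total reads per sample).
        depths = rng.integers(500, 5000, size=n_samples)

    # Each topic is a distribution over taxa: shape (n_topics, n_taxa).
    beta = rng.dirichlet(np.full(n_taxa, 0.5), size=n_topics)

    # Each sample has mixed membership across topics: shape (n_samples, n_topics).
    theta = rng.dirichlet(np.full(n_topics, 0.5), size=n_samples)

    # Per-sample taxa probabilities are mixtures of the topic distributions.
    probs = theta @ beta
    probs = probs / probs.sum(axis=1, keepdims=True)  # guard against rounding drift

    # One multinomial draw per sample; columns are samples, rows are taxa.
    counts = np.stack(
        [rng.multinomial(int(d), p) for d, p in zip(depths, probs)], axis=1
    )
    return counts, theta, beta
```

Because the depth enters only through the multinomial draw, the membership weights `theta` describe sample composition independently of how deeply each sample was sequenced, which is the property the abstract highlights. Taxa that load on the same topic rise and fall together across samples, a dependence that a single Dirichlet-multinomial does not capture.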

Supplementary Materials

Latent variables explain dependencies in bacterial communities