The cgDNA sequence-dependent coarse-grain model of dsDNA: Bridging the scales from Molecular Dynamics to Bioinformatics
Presenter
John Maddocks - École Polytechnique Fédérale de Lausanne (EPFL)
October 30, 2019
Abstract
The cgDNA+ coarse-grain model of DNA (lcvmwww.epfl.ch/research/cgDNA/) can now accurately predict the sequence-dependent statistical mechanics properties of double-stranded DNA fragments of arbitrary sequence, for example shape and stiffness (or, equivalently, the first and second moments of the equilibrium distributions). At scales of tens of base pairs, these predictions can be compared with Molecular Dynamics simulations, and the two agree very well. However, the efficiency of the cgDNA+ model also allows genome length scales to be scanned in order to identify mechanically exceptional sequence fragments, including in an epigenetically modified sequence alphabet. Large data sets and aspects of machine learning arise both in a) fitting the model parameters (20K+) to molecular dynamics training data (some terabytes of time series data), and in b) identifying common patterns in the billions of Gaussian PDFs that are generated when a genome is scanned with a sliding sequence window.
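
The sliding-window scan mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the cgDNA+ code and does not use its API: the function names (`sliding_windows`, `scan`, `most_exceptional`) are hypothetical, and the score `gc_fraction` is a toy stand-in for whatever mechanical quantity a real model would derive from the Gaussian predicted for each fragment. The sketch only shows the pattern of visiting every fixed-width fragment and flagging the ones whose score lies furthest from the average.

```python
from typing import Callable, Iterator, List, Tuple

Result = Tuple[int, str, float]  # (start index, fragment, score)


def sliding_windows(sequence: str, width: int) -> Iterator[Tuple[int, str]]:
    """Yield (start_index, fragment) for every window of `width` bases."""
    if width <= 0:
        raise ValueError("width must be positive")
    # A sequence of length n contains n - width + 1 windows (none if n < width).
    for start in range(len(sequence) - width + 1):
        yield start, sequence[start:start + width]


def scan(sequence: str, width: int, score: Callable[[str], float]) -> List[Result]:
    """Apply `score` to every window of the sequence."""
    return [(start, frag, score(frag)) for start, frag in sliding_windows(sequence, width)]


def gc_fraction(fragment: str) -> float:
    """Toy stand-in score (fraction of G/C bases); NOT a cgDNA+ quantity."""
    return sum(base in "GC" for base in fragment) / len(fragment)


def most_exceptional(results: List[Result], count: int = 1) -> List[Result]:
    """Return the `count` windows whose score is furthest from the mean score."""
    mean = sum(r[2] for r in results) / len(results)
    return sorted(results, key=lambda r: abs(r[2] - mean), reverse=True)[:count]


if __name__ == "__main__":
    results = scan("AAAAAGC", 4, gc_fraction)
    for start, frag, value in results:
        print(start, frag, value)
    print("most exceptional:", most_exceptional(results))
```

In a real scan the score would be replaced by a quantity computed from the model's predicted shape and stiffness for each fragment, and at genome scale the results would be streamed rather than held in a list.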