Genomic Identification of Structural RNAs using phylo-SCFGs

Presenter
October 30, 2007
Abstract
RNA structures often evolve with characteristic substitution patterns that preserve base-pairs in spite of changes in primary sequence. With the advent of closely related full-length genomes, it has become possible to exploit this comparative signal for genomic identification of structural RNAs (1). Phylo-SCFGs (2) are attractive models for this problem since they can describe both RNA structure, using stochastic context-free grammars (SCFGs), and sequence evolution, using phylogenetic models. Using variations of classical algorithms, multiple alignments with any number of sequences can be handled efficiently. EvoFold implements this approach and has been used to screen multiple-sequence genomic alignments of both vertebrates and Drosophilids for structural RNAs (1,3). This has resulted in hundreds of high-confidence novel candidates of both ncRNAs and cis-regulatory structures.

1) Identification and Classification of Conserved RNA Secondary Structures in the Human Genome. Pedersen JS, Bejerano G, Siepel A, Rosenbloom K, Lindblad-Toh K, Lander ES, Kent J, Miller W, and Haussler D. PLoS Comput Biol. 2006 Apr;2(4):e33.
2) Using stochastic context free grammars and molecular evolution to predict RNA secondary structure. Knudsen B and Hein JJ. Bioinformatics. 1999;15(6):446-454.
3) Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures. Stark A, Lin MF, Kheradpour P, Pedersen JS, et al. 2007 (in press).
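
As a rough illustration of the comparative signal the abstract describes, the sketch below checks whether two alignment columns show compensatory substitutions, meaning the bases change between species but can still pair. This is a toy example only: it is not EvoFold and not a phylo-SCFG (it uses no grammar, no phylogenetic tree, and no substitution model). The function names and the four-sequence alignment are hypothetical and invented for illustration.

```python
# Toy illustration only: not EvoFold and not a phylo-SCFG.
# Canonical Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pair_support(alignment, i, j):
    """Summarise how columns i and j behave as a candidate base pair.

    Returns (n_pairing, n_distinct): the number of sequences whose residues
    at i and j can pair, and how many different pair types occur among them.
    """
    observed = [(seq[i], seq[j]) for seq in alignment]
    pairing = [p for p in observed if p in PAIRS]
    return len(pairing), len(set(pairing))


def is_compensatory(alignment, i, j):
    """True if every sequence pairs at (i, j) and at least two pair types occur."""
    n_pairing, n_distinct = pair_support(alignment, i, j)
    return n_pairing == len(alignment) and n_distinct >= 2


# Hypothetical alignment of a short hairpin (made-up sequences).
# Assumed stem pairs: (0, 10), (1, 9), (2, 8); columns 3-7 form the loop.
alignment = [
    "GGCAUUCGGCC",
    "GACAUUCGGUC",  # G-C replaced by A-U at columns (1, 9)
    "GGGAUUCGCCC",  # C-G flipped to G-C at columns (2, 8)
    "GGCAUUCGGCC",
]

for i, j in [(0, 10), (1, 9), (2, 8), (3, 7)]:
    print((i, j), pair_support(alignment, i, j), is_compensatory(alignment, i, j))
```

Columns (1, 9) and (2, 8) change sequence while staying paired, which is the kind of evidence a comparative method can exploit. Columns (0, 10) pair in every sequence but never vary, so they carry no covariation signal in this simple count.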