Combinatorial Pooling Enables Selective Sequencing of the Barley Gene Space

Presenter
February 14, 2012
Keywords:
  • Sequences and sets
Abstract
We propose a new sequencing protocol that combines recent advances in combinatorial pooling design and second-generation sequencing technology to approach de novo selective genome sequencing efficiently. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when dealing with hundreds or thousands of DNA samples, such as genome-tiling gene-rich BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones, so that the assembly can be carried out clone by clone. Experimental results on simulated data for the rice genome show that the deconvolution is extremely accurate (99.57% of the deconvoluted reads are assigned to the correct BAC) and that the resulting BAC assemblies have very high quality (on average, contigs cover about 77% of each BAC's length). Experimental results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate (almost 70% of left/right pairs in paired-end reads are assigned to the same BAC, despite being processed independently) and that the BAC assemblies have good quality (the average sum of all assembled contigs is about 88% of the estimated BAC length).

Joint work with D. Duma (UCR), M. Alpert (UCR), F. Cordero (U of Torino), M. Beccuti (U of Torino), P. R. Bhat (UCR and Monsanto), Y. Wu (UCR and Google), G. Ciardo (UCR), B. Alsaihati (UCR), Y. Ma (UCR), S. Wanamaker (UCR), J. Resnik (UCR), and T. J. Close (UCR).

Preprint available at http://arxiv.org/abs/1112.4438
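
The idea of assigning reads to BACs through pooling can be illustrated with a toy sketch. It assumes each BAC is placed in a distinct subset of pools (its "signature") and that a read is assigned to every BAC whose signature is contained in the set of pools where the read was observed. The function names `assign_signatures` and `deconvolute`, the use of plain pool combinations, and the containment rule are illustrative assumptions; the abstract does not specify the pooling design or the decoding procedure actually used.

```python
from itertools import combinations


def assign_signatures(bac_ids, num_pools, pools_per_bac):
    """Give each BAC a distinct set of pool indices (hypothetical toy design)."""
    signatures = {}
    combos = combinations(range(num_pools), pools_per_bac)
    # zip stops at the shorter input, so a shortfall is detected below.
    for bac, combo in zip(bac_ids, combos):
        signatures[bac] = frozenset(combo)
    if len(signatures) < len(bac_ids):
        raise ValueError("not enough pool combinations for all BACs")
    return signatures


def deconvolute(observed_pools, signatures):
    """Return the BACs whose signature lies entirely within the observed pools."""
    observed = frozenset(observed_pools)
    return [bac for bac, sig in signatures.items() if sig <= observed]
```

With four BACs, four pools, and two pools per BAC, a read seen only in pools 0 and 2 decodes to a single BAC, while a read seen in pools 0, 1, and 2 matches three signatures and stays ambiguous. This naive design gives no guarantee against such ambiguity, which is one reason a carefully constructed pooling design matters when neighboring clones overlap.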