Identification of rare alleles and their carriers using compressed se(que)nsing
Presenter
February 13, 2012
Keywords:
- DNA sequences
MSC:
- 92D20
Abstract
Identification of rare variants by resequencing is important both for detecting novel variations and for screening individuals for known disease alleles. New technologies enable low-cost resequencing of target regions, although it is still prohibitive to test more than a few individuals. We propose a novel pooling design that enables the recovery of novel or known rare alleles and their carriers in groups of individuals. The method is based on combining next-generation sequencing technology with a Compressed Sensing (CS) approach. The approach is general, simple and efficient, allowing for simultaneous identification of multiple variants and their carriers.
It reduces experimental costs, i.e., both sample preparation related costs and direct sequencing costs, by up to 70 fold, and thus allowing to scan much larger cohorts.
We demonstrate the performance of our approach over several publicly available data sets, including the 1000 Genomes Pilot 3 study.
We believe our approach may significantly improve cost effectiveness of future association studies, and in screening large DNA cohorts for specific risk alleles.
We will present initial results of two projects that were initiated following publication. The first project concerns identification of de novo SNPs in genetic disorders common among Ashkenazi Jews, based on sequencing 3000 DNA samples. The second project in plant genetics involves identifying SNPs related to water and silica homeostasis in Sorghum bicolor, based on sequencing 3000 DNA samples using 1-2 Illumina lanes.
Joint work with Amnon Amir from the Weizmann Institute of Science, and Or Zuk from the Broad Institute of MIT and Harvard