P0009 Repeat reduction in Illumina libraries prior to sequencing using duplex-specific nuclease

Alexander Kozik, UC Davis Genome Center, Davis, CA
Lutz Froenicke, UC Davis Genome Center, Davis, CA
Marta Matvienko, CLC bio
Dean Lavelle, UC Davis Genome Center, Davis, CA
Belinda Martineau, UC Davis Genome Center, Davis, CA
Richard Michelmore, UC Davis Genome Center, Davis, CA
Current DNA sequencing technologies can generate massive amounts of sequence data. However, analyses of large plant and animal genomes are complicated by repetitive sequences of varying complexity and sequence divergence. Several uses of sequence data, such as gene and SNP discovery and genotyping, would benefit from libraries with a reduced abundance of repeated sequences. We refined a method for reducing the high-copy components of libraries prior to sequencing on Illumina Genome Analyzer or HiSeq systems. DNA libraries are denatured to single strands and then allowed to partially reanneal. Treatment with a thermostable duplex-specific nuclease (DSN) after an appropriate reannealing period selectively destroys the more rapidly reannealing high-copy sequences, leaving the low-copy component to be amplified and sequenced. As part of the Compositae Genome Project (http://compgenomics.ucdavis.edu/), the lettuce transcriptome and gene space have been sequenced using this repeat reduction approach, then assembled and analyzed using CLC Genomics Workbench. Experiments were designed to investigate the consequences of variables in the DSN protocol. These demonstrate that 2- to 3-fold enrichment of gene space can be achieved for large plant genomes such as lettuce (2.7 Gb) that are composed of more than 70% repeated sequences.
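
The selectivity of the DSN step can be illustrated with a minimal sketch based on ideal second-order reassociation kinetics, in which the fraction of a sequence class still single-stranded after time t is 1 / (1 + k * c * t), with c proportional to copy number. This is a textbook simplification and not the protocol or analysis used in this work. The rate constant, reannealing time, copy numbers, and the two-class split of the library (70% repeats, 30% low-copy) are hypothetical values chosen only for illustration, and the sketch assumes DSN removes all duplex DNA and leaves single strands intact.

```python
def fraction_single_stranded(copy_number, rate_constant, time_s):
    """Ideal second-order reassociation: f = 1 / (1 + k * c * t).

    copy_number stands in for the relative concentration c of a sequence class.
    """
    return 1.0 / (1.0 + rate_constant * copy_number * time_s)


def surviving_composition(classes, rate_constant, time_s):
    """Composition of the library that survives DSN treatment.

    classes maps a class name to (initial_fraction_of_library, copy_number).
    Assumes DSN digests every reannealed duplex and no single strands.
    """
    surviving = {
        name: fraction * fraction_single_stranded(copies, rate_constant, time_s)
        for name, (fraction, copies) in classes.items()
    }
    total = sum(surviving.values())
    return {name: amount / total for name, amount in surviving.items()}


# Hypothetical two-class library: values are illustrative, not measured.
library = {"repeats": (0.70, 1000), "low_copy": (0.30, 1)}

# Hypothetical rate constant and reannealing time (arbitrary units).
after = surviving_composition(library, rate_constant=1e-3, time_s=5.0)

# Fold enrichment of the low-copy class relative to its starting fraction.
enrichment = after["low_copy"] / library["low_copy"][0]
print(f"low-copy fraction after DSN: {after['low_copy']:.2f}")
print(f"fold enrichment: {enrichment:.2f}")
```

Under these assumptions, a library that starts with 30% low-copy sequence cannot be enriched more than 1 / 0.30, or about 3.3-fold, even if every repeat is removed. Longer reannealing times move the result toward that ceiling at the cost of losing more low-copy material.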