W288 Exploring the megagenome of Pine by targeted resequencing

Date: Sunday, January 15, 2012
Time: 10:40 AM
Room: Sunrise
Leandro Gomide Neves, University of Florida, Gainesville, FL
Loblolly pine is an ecologically and economically important conifer species, but the size of its megagenome (~21.7 Gbp) is an obstacle to its complete sequencing. To begin characterizing the genetic position of all loblolly pine genes, we captured genic regions of 72 individuals of a segregating family using Agilent’s SureSelect, followed by sequencing on the Illumina GAIIx and HiSeq platforms. Custom probes were designed to capture 6.6 Mbp (~0.03% of the genome) spanning 14,729 genes. Sequence capture enriched the target region ~1,000-fold and was highly reproducible among the 72 individuals. Following sequencing, reads from each individual were aligned to the reference, and 4,563 segregating SNPs were detected at high stringency. Genes represented by these SNPs were genetically mapped, generating a valuable resource for comparative mapping and genome assembly. For a portion of the genes targeted by the probes, capture of paralogs and pseudogenes seems to occur, complicating read alignment and polymorphism detection. To address this limitation, we have developed a bioinformatics pipeline that separates targeted sequences from paralogs and pseudogenes, increasing the number of polymorphisms that can be detected. Finally, we are now developing methods to identify gene copy number variation from targeted resequencing data.
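
As a rough check on the figures quoted above, the short Python sketch below recomputes the share of the genome covered by the probes and the on-target read fraction that a ~1,000-fold enrichment would imply. It assumes that fold enrichment means the observed on-target read fraction divided by the target's share of the genome; the abstract does not state which definition was used, and the function and constant names are illustrative only.

```python
# Figures quoted in the abstract.
GENOME_SIZE_BP = 21.7e9   # ~21.7 Gbp loblolly pine genome
TARGET_SIZE_BP = 6.6e6    # 6.6 Mbp covered by the custom probes
FOLD_ENRICHMENT = 1000    # ~1,000-fold enrichment of the target region


def target_fraction(target_bp, genome_bp):
    """Share of the genome covered by the capture target."""
    return target_bp / genome_bp


def implied_on_target_fraction(target_bp, genome_bp, fold):
    """On-target read fraction implied by a given fold enrichment.

    Assumption (not from the abstract): fold enrichment is the observed
    on-target read fraction divided by the target's share of the genome.
    The result is capped at 1.0 because a fraction cannot exceed 100%.
    """
    return min(1.0, fold * target_fraction(target_bp, genome_bp))


frac = target_fraction(TARGET_SIZE_BP, GENOME_SIZE_BP)
on_target = implied_on_target_fraction(
    TARGET_SIZE_BP, GENOME_SIZE_BP, FOLD_ENRICHMENT
)

print(f"Target share of genome: {frac * 100:.3f}%")
print(f"Implied on-target read fraction: {on_target * 100:.1f}%")
```

Under that assumption, the target is about 0.03% of the genome, in agreement with the abstract, and roughly 30% of reads would be expected to fall on target.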