P0395 High-throughput SNP genotyping in Brassica napus L.: SNP detection in genomic areas associated with traits of agronomic and nutritional importance

Wayne E. Clarke, Agriculture and Agri-Food Canada, Saskatoon, SK, Canada
Cristell Navarro, Centro de Genómica Nutricional Agroacuícola (CGNA), Temuco, Chile
Daniel J. Gerhardt, Roche NimbleGen, Madison, WI
Humberto Gajardo, Centro de Genómica Nutricional Agroacuícola (CGNA), Temuco, Chile
Andrew Sharpe, National Research Council, Saskatoon, SK, Canada
Isobel Parkin, Agriculture and Agri-Food Canada, Saskatoon, SK, Canada
Maria Laura Federico, Centro de Genómica Nutricional Agroacuícola (CGNA), Temuco, Chile
Federico L. Iniguez-Luy, Centro de Genómica Nutricional Agroacuícola (CGNA), Temuco, Chile
Targeted enrichment of specific genomic regions allows large-scale resequencing in species with large and complex genomes. In this study, we combined Roche NimbleGen sequence capture microarray technology with Roche 454 Life Sciences next-generation sequencing (NGS) chemistry (454FLX-T) to discover single nucleotide polymorphisms (SNPs) in five allopolyploid Brassica napus genotypes. We targeted 50 genomic areas previously associated with yield, yield component traits, seedling vigor, seed quality and a disease resistance trait. Target sequence information was compiled into 890 FASTA files, annotated from scaffold bins and raw genomic data, totaling approximately 51 Mb. A 2.1-million-feature sequence capture array representing 93.4-98.3% target coverage was hybridized with DNA from the five B. napus genotypes. Sequencing the captured DNA with 454FLX chemistry yielded an average of 917,702 reads and 345,199,848 bases, with an average read length of 370 bp. On average, 80% of the NGS reads mapped back to the reference genome, providing extensive coverage of the targeted genomic regions. SNP markers were detected using CLC bio's Genomics Workbench software together with a set of in-house Perl scripts. A total of 60,000 putative haploSNP markers were identified based on their unique flanking sequences and on polymorphism frequencies within genotypes and relative to the reference genome. Putative haploSNPs were classified by genomic context (distribution across the interrogated regions, coding vs. non-coding, etc.). This classification was used to select a subset of putative SNP markers for validation in segregating populations.
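
The haploSNP selection described above rests on two criteria: a unique flanking sequence and the polymorphism frequency within a genotype. The sketch below illustrates one way such a filter could work; it is not the authors' Perl pipeline. The function names, the record layout (contig, pos, alt_reads, total_reads), the flank width and the min_alt_freq threshold are all assumptions introduced here for illustration.

```python
from collections import Counter


def flank(seq, pos, width):
    """Return the (left, right) sequences flanking the 0-based position pos."""
    left = seq[max(0, pos - width):pos]
    right = seq[pos + 1:pos + 1 + width]
    return left, right


def filter_snps(reference, candidates, width=5, min_alt_freq=0.9):
    """Keep candidates with a unique flanking sequence and a high alternate-allele frequency.

    reference:  dict mapping contig name to sequence (hypothetical layout)
    candidates: list of dicts with keys contig, pos, alt_reads, total_reads
    The default thresholds are illustrative assumptions, not values from the study.
    """
    flanks = [flank(reference[c["contig"]], c["pos"], width) for c in candidates]
    counts = Counter(flanks)  # how many candidates share each flank pair
    kept = []
    for cand, fl in zip(candidates, flanks):
        total = cand["total_reads"]
        freq = cand["alt_reads"] / total if total else 0.0
        # A shared flank suggests a duplicated or homoeologous locus;
        # an intermediate frequency suggests a mixture of loci within one genotype.
        if counts[fl] == 1 and freq >= min_alt_freq:
            kept.append(cand)
    return kept
```

In this sketch, candidates whose flanks match another candidate are discarded as possibly duplicated loci, and candidates with intermediate allele frequencies are discarded as possible mixtures of homoeologous copies. A real pipeline would also need to check flank uniqueness against the whole reference rather than only among the candidates.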