P0189 SNP Mining in the Genome of Barley

Burkhard Steuernagel, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
Thomas Schmutzer, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
Stefan Taudien, Leibniz Institute for Age Research, Fritz Lipmann Institute, Jena, Germany
Marius Felder, Leibniz Institute for Age Research, Fritz Lipmann Institute, Jena, Germany
Ruvini T. Ariyadasa, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
Naser Poursarebani, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
Ruonan Zhou, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
Daniela Schulte, KWS SAAT AG, Einbeck, Germany
Thomas Nussbaumer, MIPS/IBIS, Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Germany
Heidrun Gundlach, MIPS/IBIS, Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Germany
Klaus Mayer, MIPS/IBIS, Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Germany
Matthias Platzer, Leibniz Institute for Age Research, Fritz Lipmann Institute, Jena, Germany
Uwe Scholz, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
Nils Stein, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
The genome of barley (Hordeum vulgare) is more than 5 gigabases in size and is composed predominantly of repetitive DNA. To date no reference sequence is available, but we have deep-sequenced several barley cultivars. A typical whole genome shotgun (WGS) approach using the Illumina platform assembles to more than two million contigs with a combined length of less than two gigabases, indicating that much of the repetitive content collapses during sequence assembly. Nevertheless, comparisons to full-length cDNAs show that the gene space is represented to a large extent. Thus the data can be used efficiently for SNP mining between sequenced (mainly homozygous) genotypes. Our pipeline first identifies unambiguous regions in the assembly of one genotype, which then serves as a reference for mapping reads of a second genotype. By considering only positions with sufficiently deep coverage and a low minor-allele frequency at polymorphic sites, a large number of SNPs can be called from the data with high confidence. Comparison to a set of known and experimentally verified SNPs in barley showed that our pipeline has a sensitivity of 82 percent.
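
The filtering step described above can be illustrated with a minimal Python sketch. It is not the pipeline itself: the function names `call_snp` and `sensitivity`, the input format (a reference base plus the read bases piled up at one position), and both thresholds are assumptions made for illustration, since the abstract states no concrete cutoffs.

```python
from collections import Counter

# Hypothetical thresholds; the abstract gives no concrete values.
MIN_DEPTH = 10
MAX_MINOR_FRACTION = 0.1


def call_snp(ref_base, read_bases,
             min_depth=MIN_DEPTH, max_minor_fraction=MAX_MINOR_FRACTION):
    """Return the alternative allele at one position, or None if no confident SNP.

    ref_base:   base in the reference genotype's assembly
    read_bases: string of bases from the second genotype's reads at this position
    """
    counts = Counter(b for b in read_bases.upper() if b in "ACGT")
    depth = sum(counts.values())
    if depth < min_depth:
        return None  # coverage too shallow to trust
    major_allele, major_count = counts.most_common(1)[0]
    minor_fraction = (depth - major_count) / depth
    if minor_fraction > max_minor_fraction:
        return None  # mixed signal, unexpected for a mainly homozygous line
    if major_allele == ref_base.upper():
        return None  # reads agree with the reference
    return major_allele


def sensitivity(called_sites, known_sites):
    """Fraction of known (verified) SNP sites that were also called."""
    known = set(known_sites)
    if not known:
        raise ValueError("known_sites must not be empty")
    return len(known & set(called_sites)) / len(known)
```

For example, `call_snp("A", "G" * 12)` reports `"G"`, whereas a position covered by only five reads, or one where a third of the reads disagree with the majority, is skipped. The `sensitivity` helper mirrors the validation against known SNPs: it takes two collections of site identifiers, such as (contig, position) pairs, and returns the fraction of known sites recovered.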