W692 Sequencing Sugarcane

Date: Sunday, January 15, 2012
Time: 2:50 PM
Room: Royal Palm Salon 1-2
Alan Durham, University of São Paulo, São Paulo, Brazil
André Y. Kashiwabara, Universidade Tecnologica Federal do Paraná, Brazil
Abdalla Almeida, Universidade de Sao Paulo
Carlos Takeshi Hotta, Institute of Chemistry - University of São Paulo, São Paulo, Brazil
Marie-Anne Van Sluys, Universidade de Sao Paulo, Sao Paulo, Brazil
Glaucia Souza, Institute of Chemistry - University of São Paulo
We are interested in understanding the complexity of the sugarcane genome, the contribution of different alleles to traits of interest, and the definition of gene networks. We are sequencing the sugarcane genome using several platforms. In this work we will describe data for 109 Bacterial Artificial Chromosomes (BACs) from the library of sugarcane hybrid R570. On these BACs we predicted genes and performed an initial categorization of the corresponding proteins. We have also used a whole-genome shotgun (WGS) approach to sequence the commercial variety SP80-3280. The BACs were assembled using PHRAP followed by manual curation. An initial assembly of the WGS data using Newbler produced 1.1 million contigs (totaling 830 Mbp assembled). We are using sorghum genes as an initial quality indicator of the assemblies. In parallel, we are developing an experimental pipeline to automatically obtain high-quality assemblies of the promoter regions. For gene prediction we are using Augustus, a widely used gene predictor; MYOP, a locally developed gene prediction platform; and PASA, an EST-to-genome mapping tool that we use to validate gene predictions with sugarcane EST data (SUCEST) and to estimate the accuracy of the ab initio gene predictors. The two gene predictors produced significantly different predictions. Because our validation showed that MYOP had a higher success rate than Augustus where their predictions conflicted, Augustus predictions were used only in regions without any MYOP prediction. This resulted in 1881 candidate genes. We have annotated the gene set with BLAST against Swiss-Prot/UniProt and also mapped known promoters to 56 sites in putative promoter regions.
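
The rule for combining the two predictors (keep every MYOP prediction, and add an Augustus prediction only where no MYOP prediction exists) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Prediction` record, the `merge_predictions` function, the BAC identifiers and the coordinates are all hypothetical, and "region without any MYOP prediction" is assumed here to mean "no coordinate overlap with a MYOP gene on the same BAC", ignoring strand.

```python
from typing import List, NamedTuple


class Prediction(NamedTuple):
    """A predicted gene locus (hypothetical record, not a real file format)."""
    bac: str      # identifier of the BAC the gene lies on
    start: int    # 1-based inclusive start coordinate
    end: int      # 1-based inclusive end coordinate
    source: str   # "MYOP" or "Augustus"


def overlaps(a: Prediction, b: Prediction) -> bool:
    """True if both loci are on the same BAC and share at least one base."""
    return a.bac == b.bac and a.start <= b.end and b.start <= a.end


def merge_predictions(myop: List[Prediction],
                      augustus: List[Prediction]) -> List[Prediction]:
    """Keep all MYOP loci; add Augustus loci only where no MYOP locus overlaps.

    Assumption: overlap is judged on coordinates only (strand is ignored).
    """
    merged = list(myop)
    for aug in augustus:
        if not any(overlaps(aug, m) for m in myop):
            merged.append(aug)
    return sorted(merged, key=lambda p: (p.bac, p.start, p.end))


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    myop = [Prediction("BAC1", 100, 500, "MYOP")]
    augustus = [
        Prediction("BAC1", 400, 900, "Augustus"),    # conflicts with MYOP, dropped
        Prediction("BAC1", 1000, 1500, "Augustus"),  # no MYOP gene here, kept
        Prediction("BAC2", 100, 500, "Augustus"),    # different BAC, kept
    ]
    for p in merge_predictions(myop, augustus):
        print(p.bac, p.start, p.end, p.source)
```

A stricter variant could require overlap on the same strand, or compare exon structure rather than whole-locus coordinates; the abstract does not say which criterion was used.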