W720 Genome Sequence for the Apicomplexan Sarcocystis neurona

Date: Saturday, January 14, 2012
Time: 9:00 AM
Room: Towne
Jessica Kissinger, University of Georgia, Institute of Bioinformatics, Center for Tropical and Emerging Global Diseases & Department of Genetics, Athens, GA
Daniel K. Howe, University of Kentucky-Dept. of Vet. Science, Lexington, KY
Christopher Schardl, University of Kentucky, Lexington, KY
The current Sarcocystis neurona strain SN3.E1 genome assembly (May 2011) comprises 3193 contigs assembled into 172 scaffolds and suggests an approximate genome size of 124 Mb. Transcriptome data have been generated using the 454/Roche pyrosequencing (673,331 reads from merozoite and schizont stages) and Illumina (paired-end, 480 million reads from merozoite-stage parasites) platforms. All available Sanger ESTs and the transcripts assembled from the next-generation sequencing data have been mapped to the genome, and preliminary analyses suggest ~8400 transcripts plus ~2000 alternatively spliced transcripts, with an average length of ~7500 bp. Compared with the closest species with a genome sequence (T. gondii), S. neurona genes contain a similar number of introns (average 4/gene), but the average intron size is nearly twice as large at ~1400 bp. Mapped ESTs have been used to generate the training data sets required by the Augustus, Twinscan, GlimmerHMM, and SNAP gene finders. A preliminary BLAST-searchable database and sequence viewer have been established, and an Apollo instance has been created to display the data needed for annotation. A cursory search of the S. neurona genome with the orthologs retained in all other sequenced apicomplexan genomes and 2 ciliate genomes (1,088 genes) revealed that 95% were detectable in S. neurona. Inspection of the introns and the culled repeat sequences does not, as yet, provide any insight into the larger genome size of S. neurona.
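
As a rough illustration only, the approximate figures quoted above can be turned into a few derived numbers. The Python sketch below is not part of the authors' analysis: the constant and function names are hypothetical, every input is a rounded value taken from the abstract, and the results should be read as back-of-envelope arithmetic rather than reported findings.

```python
# Back-of-envelope arithmetic on the rounded figures quoted in the abstract.
# All names here are hypothetical; the inputs are approximate by nature.

N_CONTIGS = 3193                 # contigs in the May 2011 assembly
N_SCAFFOLDS = 172                # scaffolds built from those contigs
N_TRANSCRIPTS = 8400             # ~8400 transcripts (preliminary)
N_ALT_SPLICED = 2000             # ~2000 alternatively spliced transcripts
N_CONSERVED_ORTHOLOGS = 1088     # orthologs used for the cursory search
FRACTION_DETECTED = 0.95         # share reported as detectable


def derived_figures():
    """Return simple quantities implied by the quoted numbers."""
    return {
        # Mean number of contigs joined into each scaffold.
        "contigs_per_scaffold": N_CONTIGS / N_SCAFFOLDS,
        # Total transcript models if both categories are simply summed.
        "total_transcript_models": N_TRANSCRIPTS + N_ALT_SPLICED,
        # Approximate count of conserved orthologs detected (95% of 1,088).
        "orthologs_detected": round(N_CONSERVED_ORTHOLOGS * FRACTION_DETECTED),
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:,.1f}" if isinstance(value, float) else f"{name}: {value:,}")
```

Because the inputs are rounded, the ortholog count in particular is only indicative: 95% of 1,088 is about 1,034 genes, leaving roughly 54 not detected in this cursory search.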