P0537 Improving the Honey Bee Consensus Gene Set

Anna Bennett, Biology Department, Georgetown University, Washington, DC
Christine Elsik, Biology Department, Georgetown University, Washington, DC
We produced an improved official gene set (OGSv3.0) for the honey bee (Apis mellifera) to facilitate comparative analyses and manual annotation. Challenges encountered in generating the first honey bee official gene set (OGSv1.0), published in 2006, included the AT richness of the genome, highly heterogeneous GC composition, limited EST/cDNA data, and the large evolutionary distance between the honey bee and other sequenced genomes. To improve the gene set and detect genes believed to be missing from OGSv1.0, the Baylor College of Medicine Human Genome Sequencing Center sequenced the genomes of two closely related species, the dwarf honey bee (A. florea) and a bumble bee (Bombus terrestris), and deep-sequenced transcriptomes from multiple A. mellifera tissues. OGSv3.0 represents a significant improvement because it includes several thousand genes that are not present in OGSv1.0. Understanding why these genes were not predicted in OGSv1.0 will lead to more effective gene prediction strategies for new genome projects. To determine whether the genes missing from OGSv1.0 share characteristics that make them more challenging to predict, we compared them to the OGSv1.0 genes. We evaluated features such as tissue expression specificity, coding feature length, GC content, and the existence of arthropod homologs. Our analysis of genes that were difficult to predict will prove valuable to those working to improve gene prediction algorithms.
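
As an illustration of the kind of feature comparison described above, the following is a minimal sketch, not the authors' pipeline. It computes mean coding length and mean GC content for two sets of coding sequences. The function names (gc_content, summarize), the gene identifiers, and the sequences are all hypothetical and chosen only for demonstration.

```python
from statistics import mean


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous A/C/G/T bases of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Sequences with no unambiguous bases (e.g. all N) are reported as 0.0.
    return gc / total if total else 0.0


def summarize(cds_by_gene: dict[str, str]) -> dict[str, float]:
    """Mean coding length (bp) and mean GC fraction for a set of coding sequences."""
    seqs = list(cds_by_gene.values())
    return {
        "mean_length": mean(len(s) for s in seqs),
        "mean_gc": mean(gc_content(s) for s in seqs),
    }


# Toy, made-up sequences for illustration only; these are not honey bee data.
previously_predicted = {"geneA": "ATGGCCGCGTAA", "geneB": "ATGGGCCCGTGA"}
newly_predicted = {"geneC": "ATGAAATTTTAA", "geneD": "ATGATATAA"}

if __name__ == "__main__":
    print("previously predicted:", summarize(previously_predicted))
    print("newly predicted:", summarize(newly_predicted))
```

In a real analysis the two dictionaries would be filled from the coding sequences of each gene set, and the per-gene values would be compared with an appropriate statistical test rather than by means alone.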