P0941 Genome Annotation in Ensembl

Susan Fairley, Wellcome Trust Sanger Institute, Cambridge, United Kingdom
Bronwen Aken, Wellcome Trust Sanger Institute, Cambridge, United Kingdom
Daniel Barrell, Wellcome Trust Sanger Institute, Cambridge, United Kingdom
Carlos Garcia-Giron, Wellcome Trust Sanger Institute, Cambridge, United Kingdom
Thibaut Hourlier, Wellcome Trust Sanger Institute, Cambridge, United Kingdom
Magali Ruffier, Wellcome Trust Sanger Institute, Cambridge, United Kingdom
Simon White, Wellcome Trust Sanger Institute, Cambridge, United Kingdom
Amonida Zadissa, Wellcome Trust Sanger Institute, Cambridge, United Kingdom
Steve Searle, Wellcome Trust Sanger Institute, Cambridge, United Kingdom
Ensembl (www.ensembl.org) currently provides gene annotation for over 50 vertebrate species and model organisms, with resources for other species provided by Ensembl Genomes (www.ensemblgenomes.org). The species annotated by Ensembl include zebrafish, cow, pig, chicken and dog. The gene annotation in Ensembl is typically updated when a new genome assembly becomes available. The cow UMD3.1 assembly has recently been annotated, work on the pig Sscrofa10.2 assembly is in progress, and we plan to annotate the new assembly of the chicken genome. Ensembl continues to develop its gene annotation methods, with ongoing work to use RNASeq data. RNASeq-based annotation has been produced for zebrafish, gorilla and Tasmanian devil, and zebrafish RNASeq data has been used to generate tissue-specific sets of gene models. We anticipate that RNASeq data will be used in the annotation of further species. In addition, users can view uploaded BAM files via the website, and this functionality is being extended to the multi-species view. In Ensembl, gene annotation is integrated across species via comparative genomic resources, which include genomic alignments and gene trees. Additional data, provided by Ensembl's variation and regulatory resources, can be viewed alongside the gene annotation in the browser, which also gives access to tools such as the Variant Effect Predictor (VEP). The data produced by Ensembl is accessible via the web browser, public copies of the MySQL databases, a Perl API and BioMart. All of Ensembl's code and data are freely available.