Sequencing of equine mRNA (RNA-seq) identified 428 putative transcripts that do not co-localize with any previously annotated or predicted horse genes (1). Most of these represent equine homologs of known protein-coding genes described in other species, but some may represent novel and perhaps equine-specific gene structures. A set of 36 transcripts was prioritized for further study by filtering on expression level (depth of RNA-seq tag coverage), distance from annotated features in the equine genome, number of putative exons, and patterns of gene expression between tissues. From these, four were selected for further investigation based on predicted open reading frames of at least 50 amino acids and a lack of detectable homology to known genes in other species. Sanger sequencing of RT-PCR amplicons from additional equine samples confirmed the expression and structural annotation of each transcript as derived from the RNA-seq results. Functional predictions were made by conserved domain searches. A single transcript, expressed specifically in the cerebellum, contains a putative Krüppel-associated box (KRAB) domain, suggesting a potential function associated with zinc finger proteins and transcriptional regulation. Overall levels of synteny and sequence conservation across the 1 Mb region surrounding each transcript were approximately 73% relative to the human, canine, and bovine genomes. However, the four loci display some areas of low conservation and sequence inversion in the regions that immediately flank the unannotated equine transcripts. Whether these loci are expressed in any non-equine species remains an open question.
1) Coleman et al. 2010. Anim Genet. 41 Suppl 2:121.
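
The open reading frame criterion described above (a predicted ORF of at least 50 amino acids) can be illustrated with a minimal sketch. This is not the method used in the study, which does not name its ORF prediction tool. The function names, the restriction to the three forward reading frames, and the requirement of an ATG start codon are all assumptions made for illustration.

```python
# Illustrative sketch only: the helper names and the forward-strand,
# ATG-initiated ORF definition are assumptions, not the authors' pipeline.

STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_ORF_AA = 50  # threshold stated in the abstract


def longest_orf_aa(seq: str) -> int:
    """Length in amino acids of the longest ATG-to-stop ORF
    found in the three forward reading frames of `seq`."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    best = 0
    for frame in range(3):
        start = None  # index of the current open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i
            elif codon in STOP_CODONS:
                # Count codons from ATG up to, but not including, the stop.
                best = max(best, (i - start) // 3)
                start = None
    return best


def passes_orf_filter(seq: str, min_aa: int = MIN_ORF_AA) -> bool:
    """True if the transcript carries an ORF of at least `min_aa` residues."""
    return longest_orf_aa(seq) >= min_aa
```

A transcript sequence would be passed to `passes_orf_filter` as a plain string. A fuller analysis would also scan the reverse complement and consider ORFs that lack a stop codon within a partial transcript.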