EpiDB: An Omics Data Resource for Cattle

James E. Koltes, Department of Animal Science, University of Arkansas, Fayetteville, AR
Eric Fritz-Waters, Department of Animal Science, Iowa State University, Ames, IA
James M. Reecy, Iowa State University, Ames, IA
Livestock genomics researchers are generating mountains of next-generation sequencing data that must be deposited in a public repository for publication. The NCBI Sequence Read Archive (SRA) is a repository for many types of sequence data, including RNA-seq and assays of various regulatory marks that may mediate epigenetic regulation. The livestock Epigenetics database (EpiDB) is a resource that filters public RNA-seq, small RNA-seq, ChIP-seq and methyl-seq data by species, tissue and sequencer type. Only Illumina data that pass quality control (FastQC) and are annotated for tissue type are retained for analysis. All metadata are captured and stored in a MySQL database that is linked to a web portal where data can be queried by species, data type, and tissue. Users can download metadata and access all sequence data through links to NCBI. RNA-seq data are processed to allele-specific expression values that can be used to identify differential splicing or other gene regulatory effects. In addition, standardized expression panels were calculated to identify tissue-specific transcripts and relative expression levels. As a proof of principle, we analyzed publicly available bovine functional genomics datasets to develop reference expression profiles. These data will allow tissue-specific transcripts and expression levels to be identified, enabling comparison of gene expression across species as well as other downstream analyses.
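
The retention rule and metadata query described above can be illustrated with a minimal sketch. This is not the EpiDB implementation: the table name `run_metadata`, the field names, the run identifiers, and the example records are all hypothetical, and an in-memory SQLite database stands in for the MySQL database so the example is self-contained.

```python
import sqlite3

# Hypothetical metadata records; identifiers, fields, and values are illustrative only.
RECORDS = [
    {"run": "RUN001", "species": "Bos taurus", "assay": "RNA-seq",
     "tissue": "liver", "platform": "ILLUMINA", "qc_pass": True},
    {"run": "RUN002", "species": "Bos taurus", "assay": "RNA-seq",
     "tissue": "liver", "platform": "LS454", "qc_pass": True},
    {"run": "RUN003", "species": "Bos taurus", "assay": "ChIP-seq",
     "tissue": "", "platform": "ILLUMINA", "qc_pass": True},
    {"run": "RUN004", "species": "Bos taurus", "assay": "methyl-seq",
     "tissue": "muscle", "platform": "ILLUMINA", "qc_pass": False},
    {"run": "RUN005", "species": "Sus scrofa", "assay": "small RNA-seq",
     "tissue": "muscle", "platform": "ILLUMINA", "qc_pass": True},
]


def retain(record):
    """Keep only Illumina runs that passed QC and carry a tissue annotation."""
    return (
        record["platform"].upper() == "ILLUMINA"
        and record["qc_pass"]
        and bool(record["tissue"].strip())
    )


def build_db(records):
    """Load retained records into an in-memory SQLite table (stand-in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE run_metadata ("
        "run TEXT PRIMARY KEY, species TEXT, assay TEXT, tissue TEXT)"
    )
    conn.executemany(
        "INSERT INTO run_metadata VALUES (:run, :species, :assay, :tissue)",
        [r for r in records if retain(r)],
    )
    conn.commit()
    return conn


def query_runs(conn, species, assay, tissue):
    """Return run identifiers matching a species, data type, and tissue."""
    rows = conn.execute(
        "SELECT run FROM run_metadata "
        "WHERE species = ? AND assay = ? AND tissue = ? ORDER BY run",
        (species, assay, tissue),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_db(RECORDS)
    print(query_runs(conn, "Bos taurus", "RNA-seq", "liver"))
```

In this sketch the filtering happens before records are stored, so every row in the table already satisfies the retention criteria and queries only need to match on species, data type, and tissue.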