New in-depth rainbow trout transcriptome reference and digital atlas of gene expression

Salem, Mohamed

Sequencing of the rainbow trout genome is underway, and a transcriptome reference sequence is required to aid genome assembly and gene discovery. Previously, we reported a transcriptome reference sequence built from 19X coverage of 454-pyrosequencing data. Although that work added a wealth of annotated EST sequences, the transcriptome is still far from complete. In addition, tissue-specific gene expression was not included in the previous study.

The high-throughput Illumina HiSeq platform was used to sequence non-normalized cDNA libraries from 15 vital tissues (in separate lanes). To improve assembly and annotation of a single transcriptome sequence, libraries were constructed from a single double-haploid individual that was also used in the previous transcriptome study and in the reference genome assembly.

A total of ~1.3 billion 100-bp paired-end reads were assembled de novo using the ABySS assembler (k-mer values k29-k95), which yielded 2,719,815 contigs (>500 bp). To improve the assembly, 139,390 454-pyrosequencing isotigs from the previous assembly were added, and the contigs were clustered into 830,969 groups of contig sequences. A total of 508,729 (61%) of the contigs were assigned to 42,492 proteins (26,501 from zebrafish, 3,826 from Tetraodon and 12,165 from NR protein databases), leaving 322,240 contigs without matches in the databases. Tissue-specific digital gene-expression patterns were profiled by mapping the original reads from each tissue to the new reference assembly.
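As an illustration of how per-tissue read mapping can be turned into a digital expression profile, the following is a minimal sketch only. The abstract does not state which normalization or specificity criterion was used, so the reads-per-million scaling, the 0.9 fraction threshold, the function names and the toy counts are all assumptions made for the example, not the study's method or data.

```python
from collections import defaultdict


def reads_per_million(counts, library_sizes):
    """Scale raw mapped-read counts to reads per million (RPM) per tissue.

    counts: {tissue: {contig: mapped_reads}}
    library_sizes: {tissue: total_mapped_reads}
    Returns {contig: {tissue: rpm}}.
    """
    rpm = defaultdict(dict)
    for tissue, per_contig in counts.items():
        scale = 1_000_000 / library_sizes[tissue]
        for contig, n in per_contig.items():
            rpm[contig][tissue] = n * scale
    return dict(rpm)


def tissue_specific(rpm, tissues, min_fraction=0.9):
    """Flag contigs whose top tissue holds >= min_fraction of summed RPM.

    The threshold is a hypothetical choice for illustration.
    Returns {contig: tissue}.
    """
    specific = {}
    for contig, by_tissue in rpm.items():
        total = sum(by_tissue.get(t, 0.0) for t in tissues)
        if total == 0:
            continue  # contig not expressed in any profiled tissue
        top = max(tissues, key=lambda t: by_tissue.get(t, 0.0))
        if by_tissue.get(top, 0.0) / total >= min_fraction:
            specific[contig] = top
    return specific


# Toy input (invented numbers): three tissues, two contigs.
toy_counts = {
    "liver": {"contig_1": 900, "contig_2": 50},
    "muscle": {"contig_1": 10, "contig_2": 60},
    "brain": {"contig_2": 40},
}
toy_library_sizes = {"liver": 1_000_000, "muscle": 1_000_000, "brain": 1_000_000}
toy_tissues = ["liver", "muscle", "brain"]

toy_rpm = reads_per_million(toy_counts, toy_library_sizes)
toy_specific = tissue_specific(toy_rpm, toy_tissues)
print(toy_specific)
```

In this toy input, contig_1 draws almost all of its normalized reads from liver and is flagged, while contig_2 is spread across the three tissues and is not. A length-aware measure (such as reads per kilobase per million) would be a natural refinement when contigs differ widely in length.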

The study provides the most comprehensive assembled and annotated transcriptome resource that is available for functional genome research in rainbow trout. Additionally, it provides a digital atlas of tissue gene expression and tissue-specific genes.

P0663 New in-depth rainbow trout transcriptome reference and digital atlas of gene expression