Non-normalized cDNA libraries from 15 vital tissues were sequenced on the high-throughput Illumina HiSeq platform, each in a separate lane. To improve assembly and annotation of a single transcriptome sequence, all libraries were constructed from a single doubled-haploid individual that had been used in the previous transcriptome study and in the reference genome assembly.
A total of ~1.3 billion paired-end reads (100 bp) were assembled de novo using the ABySS assembler (k-mer values from k = 29 to k = 95), which yielded 2,719,815 contigs (>500 bp). To improve the assembly, 139,390 454-pyrosequencing isotigs from the previous assembly were added, and the contigs were clustered into 830,969 groups of contig sequences. A total of 508,729 contigs (61%) were assigned to 42,492 proteins (26,501 from zebrafish, 3,826 from Tetraodon, and 12,165 from the NR protein databases), leaving 322,240 contigs without database matches. Tissue-specific digital gene-expression patterns were profiled by mapping the original reads from each tissue to the new reference assembly.
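The annotation figures above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the study's pipeline; the variable names are illustrative, and the percentage is assumed to be taken relative to the 830,969 clustered contig groups.

```python
# Counts quoted in the text; all variable names are illustrative.
clustered_contigs = 830_969   # groups of contig sequences after clustering
annotated_contigs = 508_729   # contigs assigned to a protein
protein_hits = {
    "zebrafish": 26_501,
    "Tetraodon": 3_826,
    "NR": 12_165,
}

# Contigs left without a database match.
unannotated_contigs = clustered_contigs - annotated_contigs

# Fraction annotated, assuming the clustered set is the denominator.
annotated_fraction = annotated_contigs / clustered_contigs

# Total distinct proteins across the three sources.
total_proteins = sum(protein_hits.values())

print(f"unannotated contigs: {unannotated_contigs:,}")
print(f"annotated fraction:  {annotated_fraction:.1%}")
print(f"total proteins:      {total_proteins:,}")
```

Under that assumption the three derived values agree with the reported 322,240 unmatched contigs, 61% annotation rate, and 42,492 proteins.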
This study provides the most comprehensive assembled and annotated transcriptome resource currently available for functional genome research in rainbow trout. It also provides a digital atlas of tissue gene expression and tissue-specific genes.