P1006 The AUGUSTUS eukaryotic RNA-Seq-based gene prediction pipeline

Mario Stanke, University of Greifswald, Greifswald, Germany
We incorporate evidence from RNA-Seq into the gene finder AUGUSTUS. The approach uses individual RNA-Seq-to-genome alignments in an iterative fashion rather than a de novo assembly of RNA-Seq reads. In parallel, the training algorithm of AUGUSTUS has been substantially improved with a discriminative Conditional Random Field training procedure. The resulting pipeline is particularly accurate at predicting the coding sequences of genes. Another strength, owing to the full ab initio model, is the prediction of lowly expressed transcripts. The pipeline accounts for alternative splicing and allows the simultaneous integration of other evidence, such as protein homology or peptides from tandem mass spectrometry.
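
As an illustration of how individual read-to-genome alignments can serve as evidence without assembling the reads first, the following minimal sketch collects intron candidates from spliced alignments and counts how many reads support each one. It is not code from the AUGUSTUS pipeline: the function names, the `(chrom, start, cigar)` input tuples, and the `min_support` threshold are assumptions made for this example, and alignments are assumed to carry a 1-based start position and a SAM-style CIGAR string in which `N` marks a skipped reference region.

```python
import re
from collections import Counter

# SAM-style CIGAR operations: a length followed by a one-letter operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def introns_from_alignment(start, cigar):
    """Return (intron_start, intron_end) pairs, 1-based inclusive.

    `start` is the 1-based leftmost reference position of the alignment.
    Each N operation is interpreted as one intron candidate.
    """
    introns = []
    pos = start
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            introns.append((pos, pos + length - 1))
        if op in "MDN=X":  # operations that consume reference bases
            pos += length
    return introns


def intron_support(alignments, min_support=1):
    """Count supporting reads per intron candidate (hypothetical helper).

    `alignments` is an iterable of (chrom, start, cigar) tuples.
    Candidates with fewer than `min_support` reads are dropped.
    """
    counts = Counter()
    for chrom, start, cigar in alignments:
        for intron_start, intron_end in introns_from_alignment(start, cigar):
            counts[(chrom, intron_start, intron_end)] += 1
    return {key: n for key, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    reads = [
        ("chr1", 100, "50M200N50M"),  # spliced read
        ("chr1", 120, "30M200N70M"),  # second read across the same intron
        ("chr1", 500, "100M"),        # unspliced read, contributes no intron
    ]
    for (chrom, s, e), n in sorted(intron_support(reads).items()):
        print(chrom, s, e, n)
```

In this toy input both spliced reads support the same candidate intron, so its count is 2. Support counts of this kind are one way that evidence from many individual reads could be weighted when passed to a gene finder, and they remain available even for weakly expressed loci where only a few reads align.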