P0835 Using RNASeq to Profile Soybean Seed Development from Fertilization to Maturity

Sarah I. Jones, University of Illinois, Urbana, IL
Lila Vodkin, University of Illinois, Urbana, IL
To understand the gene expression networks underlying the functional properties and compositional traits of the soybean seed, we examined seed development in detail, from a few days post-fertilization to the mature seed, using Illumina high-throughput transcriptome sequencing (RNASeq). RNA was sequenced from seven stages of seed development, yielding between 12 million and 76 million sequenced reads. The reads were aligned to the 79,000 gene models predicted from the soybean genome recently sequenced by the Department of Energy Joint Genome Institute. Expression is reported in RPKM (reads per kilobase of gene model per million mapped reads). Because RPKM accounts for both the total number of mapped reads per sample and the length of the gene model, it allows comparison across gene models and across samples. Genes associated with some storage proteins had their highest expression at the stage of largest fresh weight, confirming previous knowledge. Over one hundred gene models showed high expression exclusively in young seed stages; these were annotated as related to basic cellular components and processes such as histones and ubiquitination. Other gene models, including those annotated as transcription factors and seed proteins, showed high expression in dry, mature seeds, perhaps indicating the preparation of pathways needed later, during the early stages of imbibition. Many gene models with unknown annotations showed high expression at both very young and dry, mature stages, suggesting intriguing areas for future research.
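The RPKM normalization described above can be illustrated with a minimal Python sketch. The function name `rpkm`, the read counts, and the gene length are hypothetical values chosen for illustration; they are not data from this study. The two library sizes borrow the 12 million and 76 million figures from the abstract purely as example totals of mapped reads.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene model per million mapped reads.

    RPKM = read_count / (gene_length_bp / 1e3) / (total_mapped_reads / 1e6)
         = read_count * 1e9 / (gene_length_bp * total_mapped_reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical example: the same raw count (500 reads on a 2,000 bp gene
# model) in a small library and in a large library.
small_library = rpkm(500, 2000, 12_000_000)
large_library = rpkm(500, 2000, 76_000_000)

print(f"small library: {small_library:.2f} RPKM")
print(f"large library: {large_library:.2f} RPKM")
```

The same raw count gives a lower RPKM in the larger library, which is why raw counts alone cannot be compared across stages sequenced to different depths.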