P0543 High throughput SNP discovery in the bovine lactome by RNA sequencing

Saumya Wickramasinghe, Department of Animal Science, University of California, Davis, Davis, CA
Gonzalo Rincon, Department of Animal Science, University of California, Davis, Davis, CA
Alma Islas-Trejo, Department of Animal Science, University of California, Davis, Davis, CA
Juan F. Medrano, Department of Animal Science, University of California, Davis, Davis, CA
High-throughput sequencing of RNA (RNA-Seq) is an efficient, low-cost method to identify SNP in expressed regions of the genome. SNP discovery was performed in 24 RNA-Seq libraries of milk and mammary tissues of Holstein, Jersey, Brown Swiss and Shorthorn cows. A total of 472 million 40-bp reads were generated on an Illumina GAII sequencer and analyzed with CLC Genomics Workbench 3.7. The analysis revealed 160,873 SNP in Holsteins, 62,355 in Jerseys, 38,929 in Brown Swiss and 23,237 in Shorthorns. In Holstein, the larger number of samples (n=14) increased the detection of polymorphic SNP within the breed (67%). The other three breeds, which had smaller sample sizes, had lower percentages of polymorphic SNP (37-39%). The relationship among the number of SNP per gene, exon length and gene expression level was analyzed in Holstein samples. Exon length was not an important factor in determining the number of SNP detected per gene. However, genes with higher expression and a low level of conservation showed a higher number of SNP. SNP detected in Holstein samples were validated by comparing their reference positions to the bovine dbSNP database, which confirmed that 63% of the RNA-Seq detected SNP corresponded to dbSNP entries. The high correspondence between RNA-Seq SNP and dbSNP entries provides strong evidence that RNA-Seq is an efficient and cost-effective method for SNP discovery.
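
The dbSNP validation step can be illustrated with a minimal Python sketch. The abstract does not say how the position comparison was implemented, so this is an assumption: detected SNP and dbSNP entries are each represented as (chromosome, position) pairs and matched by exact coordinate. The function name `dbsnp_concordance` and all coordinates are hypothetical and are not data from the study.

```python
from typing import Iterable, Set, Tuple

# A SNP is identified here only by its reference position (assumption):
# (chromosome name, 1-based coordinate).
Position = Tuple[str, int]


def dbsnp_concordance(detected: Iterable[Position], dbsnp: Set[Position]) -> float:
    """Return the fraction of distinct detected positions present in dbsnp."""
    detected_set = set(detected)
    if not detected_set:
        return 0.0  # nothing detected, so nothing to validate
    matched = detected_set & dbsnp  # exact chromosome and position match
    return len(matched) / len(detected_set)


# Made-up toy coordinates, for illustration only.
detected_snps = [("chr6", 1001), ("chr6", 2050), ("chr14", 777), ("chr20", 4242)]
known_snps = {("chr6", 1001), ("chr14", 777), ("chr1", 5)}

concordance = dbsnp_concordance(detected_snps, known_snps)
print(f"{concordance:.0%} of detected SNP match a dbSNP entry")  # prints "50% ..."
```

Matching on position alone is the simplest reading of "comparing their reference positions". A stricter comparison could also require the alleles to agree, but the abstract does not state whether that was done.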