High-Throughput Targeted Genotyping of Coffea arabica and Coffea canephora Using Next Generation Sequencing

Date: Sunday, January 10, 2016
Time: 4:40 PM
Room: Pacific Salon 3
Marcio Resende, RAPiD Genomics LLC, Gainesville, FL
Eveline Caixeta, EMBRAPA, Viçosa, Brazil
Emilly Ruas Alkimin, Federal University of Viçosa, Viçosa, Brazil
Tiago Vieira Sousa, Federal University of Viçosa, Viçosa, Brazil
Marcos D.V. Resende, EMBRAPA - sucursal, Viçosa, Brazil
Srikar Chamala, RAPiD Genomics, Gainesville, FL
Leandro G Neves, RAPiD Genomics LLC, Gainesville, FL
Coffee is an important tropical crop worldwide. Among Coffea species, C. canephora and C. arabica are the most widely planted. One of the challenges for the breeding and genomic characterization of coffee, especially C. arabica, is the low genetic diversity and complex polyploid nature of its genome. Here, we present the development of a multi-species, genome-wide, high-throughput genotyping platform for coffee. The strategy is based on targeted capture of 40,000 regions of the coffee genome followed by next-generation sequencing. These regions were identified bioinformatically to avoid repetitive elements and to target a large number of annotated genes. To capture these regions, we designed probes using a combination of genomic resources, including the C. canephora reference genome and assembled unigenes specific to each of the two species. We evaluated the method on 72 samples of C. canephora and 72 samples of C. arabica. In this population, we discovered 162,026 SNPs in 27,651 polymorphic probes, with a median of 5 SNPs per probe. Of this total, 33,239 SNPs were specific to C. arabica and 87,271 SNPs were specific to C. canephora. The assay yielded a median of 3% missing data. Among the SNPs with missing data, 967 were missing in all individuals of C. arabica and 40 were missing in all individuals of C. canephora, indicating the discovery of inter-specific presence/absence variants (PAVs). This assay represents a new tool for the coffee community that can help future genome assemblies, accelerate breeding, unravel the genetic basis of traits of interest and manage genetic diversity in the species.
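As an illustration of the presence/absence reasoning above, the following minimal Python sketch flags SNPs whose genotype calls are missing in every individual of one species but present in at least one individual of the other. It is not the authors' pipeline: the data layout, the function names (missing_rate, pav_candidates) and the toy genotypes are all assumptions made for this example.

```python
from typing import Dict, List, Optional

# Genotype calls for one SNP within one species; None marks a missing call.
# This layout is an assumption for illustration, not the study's data format.
Calls = List[Optional[int]]


def missing_rate(calls: Calls) -> float:
    """Return the fraction of samples with no genotype call."""
    return sum(c is None for c in calls) / len(calls)


def pav_candidates(genotypes: Dict[str, Dict[str, Calls]]) -> Dict[str, str]:
    """Map each candidate SNP to the species in which it is entirely missing.

    A SNP is flagged only if it is missing in all samples of exactly one
    species and called in at least one sample of another species.
    """
    flagged: Dict[str, str] = {}
    for snp, by_species in genotypes.items():
        absent = [sp for sp, calls in by_species.items()
                  if missing_rate(calls) == 1.0]
        present = [sp for sp, calls in by_species.items()
                   if missing_rate(calls) < 1.0]
        if len(absent) == 1 and present:
            flagged[snp] = absent[0]
    return flagged


# Toy genotypes, invented for illustration (not data from the study).
toy = {
    "snp_1": {"arabica": [None, None, None], "canephora": [0, 1, 2]},
    "snp_2": {"arabica": [0, 0, None], "canephora": [1, None, 2]},
    "snp_3": {"arabica": [0, 1, 1], "canephora": [None, None, None]},
}

result = pav_candidates(toy)
print(result)
```

In this toy input, snp_2 has scattered missing calls in both species and is therefore not flagged. Only SNPs that fail systematically in a single species are treated as candidates, which mirrors the distinction the abstract draws between general missing data and species-wide absence.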