W180
Dihaploid Coffea arabica Genome Sequencing and Assembly

Date: Sunday, January 11, 2015
Time: 5:20 PM
Room: Esquire - Meeting House
Alexandre de Kochko, IRD UMR DIADE, Montpellier, France
Dominique Crouzillat, Centre R&D Nestlé Tours, Tours, France
Michel Rigoreau, Centre R&D Nestlé Tours, Tours, France
Maud Lepelley, Centre R&D Nestlé Tours, Tours, France
Laurence Bellanger, Centre R&D Nestlé Tours, Tours, France
Virginie Merot l'Anthoene, Centre R&D Nestlé Tours, Tours, France
Celine Vandecasteele, Centre R&D Nestlé Tours, Tours, France
Romain Guyot, IRD UMR DIADE, Montpellier cedex 5, France
Valerie Poncet, IRD UMR DIADE, Montpellier cedex 5, France
Christine Tranchant-Dubreuil, IRD UMR DIADE, Montpellier, France
Perla Hamon, IRD UMR DIADE, Montpellier cedex 5, France
Serge Hamon, IRD UMR DIADE, Montpellier cedex 5, France
Emmanuel Couturon, IRD UMR DIADE, Montpellier cedex 5, France
Patrick Descombes, NIHS, Lausanne, Switzerland
Deborah Moine, NIHS, Lausanne, Switzerland
Lukas Mueller, Boyce Thompson Institute for Plant Research, Ithaca, NY
Susan R. Strickler, Boyce Thompson Institute for Plant Research, Ithaca, NY
Alan Andrade, Embrapa, Brasilia, Brazil
Luiz-Filipe Protasio Pereira, Embrapa Café, Londrina, Brazil
Pierre Marraccini, CIRAD, Brasilia, Brazil
Giovanni Giuliano, ENEA - Italian Agency for New Technologies, Rome, Italy
Alessia Fiore, ENEA - Italian Agency for New Technologies, Rome, Italy
Marco Pietrella, ENEA - Italian Agency for New Technologies, Rome, Italy
Giuseppe Aprea, ENEA - Italian Agency for New Technologies, Rome, Italy
Ray Ming, University of Illinois at Urbana-Champaign, Urbana, IL
Jennifer Wai, University of Illinois at Urbana-Champaign, Urbana, IL
Douglas S. Domingues, Instituto Agronômico do Paraná, Londrina, Brazil
Alexandre Paschoal, Univ. of Londrina, Londrina, Brazil
Gerrit Kuhn, Pacific Biosciences, Menlo Park, CA
Jonas Korlach, Pacific Biosciences, Menlo Park, CA
Jason Chin, Pacific Biosciences, Menlo Park, CA
David Sankoff, University of Ottawa, Ottawa, ON, Canada
Chunfang Zheng, University of Ottawa, Ottawa, ON, Canada
Victor A. Albert, University at Buffalo, Buffalo, NY
Coffea arabica, which accounts for 70% of world coffee production, is an allotetraploid with a genome size of approximately 1.3 Gb, derived from the hybridization of C. canephora (710 Mb) and C. eugenioides (670 Mb). To elucidate the evolutionary history of C. arabica and generate critical information for breeding programs, a sequencing project is underway to finalize a reference genome using a dihaploid line and a set of 30 C. arabica accessions. For the reference genome, we have generated two assemblies, one from Illumina data (>150x coverage) and a second from PacBio sequences (>50x coverage). The present assemblies cover 1,031 and 1,042 Mb, respectively. After further refinement using Illumina mate pairs and optical mapping, the genome assemblies will be annotated using RNA-Seq. Resequencing of C. eugenioides and C. canephora has been completed, and the data are being used to better assess homeologs within the sub-genomes. Furthermore, 30 C. arabica accessions, representing wild and cultivated genotypes, are being resequenced (20x coverage) using Illumina. A C. arabica genetic map, which currently includes over 600 SSR markers that differentiate between the two sub-genomes, is being used to anchor the assemblies. Newly identified SNP markers are being added to the map.

The final goals of the project are to produce a high-quality reference genome, assess any neo-diversification that may have occurred in the cultivated varieties, better understand the formation and evolution of the species, and develop tools that will make the finished genome accessible and useful to breeders and researchers.