News Release

Consortium including Brazilians sequences the reference genome of Arabica coffee

The work makes it possible to tell the story of the fusion of genomes that gave rise to the world’s most consumed species, as well as identifying genes responsible for resistance to rust and other diseases

Peer-Reviewed Publication

Fundação de Amparo à Pesquisa do Estado de São Paulo

Study may guide the development of varieties better adapted to climate change 

Credit: Gian Barros

Coffee is one of the world’s most traded commodities, and Coffea arabica is the most widely consumed of the 130 or so species that exist. It is the result of the fusion of two other species: Coffea canephora (known in Brazil as Conilon or Robusta coffee) and Coffea eugenioides. In the last decade, almost every major commodity in the world has had a reference genome sequenced, but coffee has only recently joined the list.

The reference genome is essential for developing varieties that are better adapted to climate change and resistant to disease. By sequencing the reference genome of Arabica coffee in an unprecedented endeavor, a consortium of scientists was able to identify candidate genes that may be responsible for coffee’s resistance to rust and other diseases. At the same time, they observed the expression of some genes related to the aroma of Arabica.

“With the knowledge of the genome, it is possible to obtain information that allows us to go in two directions: the development of varieties by directing crossbreeding, in other words, as a reference to guide us in future crossbreeding that produces new varieties; and more direct interventions, such as modifying a gene specifically,” summarizes Douglas Domingues, currently a researcher at the Plant Genomics and Transcriptomics Group of the Luiz de Queiroz School of Agriculture at the University of São Paulo (ESALQ-USP), in Brazil, and one of the authors of the paper (developed when he was still working at the Rio Claro campus of the São Paulo State University).

According to him, there was a bit of a race to sequence the genome. “The price of sequencing has come down a lot, and coffee was one of the few commodities that hadn’t had its reference genome sequenced. There were other groups trying, and there was a paper published just before ours. But most of them used the standard strategy: choosing an interesting plant for cultivation and sequencing its genome,” he reports.

The group to which Domingues belongs sequenced a plant that is of little agronomic interest but has a lot to offer genetically. “The advantage of our reference genome is that it’s derived from a ‘dihaploid’ individual. This results in a homogeneous reference genome that will be a superior standard for future research,” explains Patrick Descombes, coordinator of the work and senior expert in genomics at the Nestlé Institute of Food Safety & Analytical Sciences. He explains that Arabica coffee is a tetraploid: it has two genomes in one because it is the fusion of two other species.

Sequencing a dihaploid derived from Arabica coffee, rather than a common tetraploid variety, gives scientists a clearer and simpler view of the genome. This makes it possible to identify variations between similar genes with greater precision, which facilitates the use of molecular information in breeding studies.

In this study, the group was able to determine more precisely when this fusion took place: no more than 600,000 years ago, C. canephora and C. eugenioides fused to form this tetraploid hybrid, which continued its evolutionary path. “We came to this conclusion using DNA information from Arabica, Robusta and Eugenioides: we were able to make a more accurate inference because previously this interval was dated at between 50,000 and 1 million years. We reduced that window to 350,000 to 600,000 years,” reports Domingues.

The article, published in Nature Genetics on April 15, was the result of a consortium of scientists from more than ten countries, including Brazil, which participated with more than one institution. In Domingues’ case, his participation was partially funded by FAPESP through a Young Researcher project and a postdoctoral fellowship awarded to Suzana Tiemi Ivamoto-Suzuki, also an author of the article.

“We used the reference sequence to understand the diversity that exists in wild Arabica coffees, from the African region of origin, and compare this with the Arabica coffees that are cultivated today,” says the ESALQ-USP scientist, explaining that the group resequenced Arabica coffee varieties planted in different parts of the world, as well as wild specimens collected in the forests of Ethiopia, and managed to understand the difference between the wild and cultivated ones.

To gain a genomic perspective on the evolutionary history of Arabica, the consortium sequenced 46 accessions, including three Robusta, two Eugenioides and 41 Arabica. The latter included an 18th-century type specimen (the physical specimen designated by the author of the taxon at the time of the description as the material on which it was based); 12 cultivars with different breeding histories; the Timor hybrid (a spontaneous cross of Arabica with the pest-resistant C. canephora Robusta variety) and five of its backcrosses with Arabica; and 17 wild accessions plus three wild/cultivated ones collected from the east and west sides of the Great Rift Valley in Ethiopia.

“We used the latest genomic technologies, i.e. long reads from the high-fidelity PacBio system [for gene sequencing] and proximity ligation with short reads from Illumina [an integrated system for analyzing genetic variation and biological function], to generate the chromosome assembly. This combination resulted in a chromosome-level assembly of the highest quality and integrity,” says Descombes.


According to the ESALQ-USP professor, among the cultivated species, something very important for breeding was the introduction of genes for resistance to coffee leaf rust. “In the 1930s, Brazil played an important role in this regard. And the IAC [Agronomic Institute of Campinas, also in the state of São Paulo] is a pioneering center for studies and breeding. IAC researchers provided us with plants that predate the institution’s breeding program, which dates back to the 1930s. Disease-oriented breeding emerged between the 1960s and 1970s, and the main work was to cross a rust-resistant Arabica plant, the so-called Timor hybrid, with plants grown in various countries so that the new varieties would be resistant. But it wasn’t known which genes were responsible for the resistance.”

Discovered in the fields of Timor Island in the 1920s, the Timor hybrid is naturally resistant to rust and other diseases. “In addition to rust, coffee berry disease, coffee berry borer and coffee stem borer are three other major pests affecting production in many regions of the world. Climate change is also a key concern in the control of pests and diseases, as it allows them to spread to new regions. The trade of green coffee beans between different regions is another factor that can facilitate the spread of certain pests and diseases to new areas,” says Maud Lepelley, manager of the Plant Genetics and Chemistry group at the Nestlé Institute of Agricultural Sciences.

In the paper now published, the group has managed to find sets of genes already linked in the literature to disease resistance that are only present in post-improvement varieties. “Somehow the Timor hybrid managed to get these resistance genes, and now we know which ones. There are dozens of them, but we’ve narrowed the search. Arabica coffee has 69,000 genes; we’ve narrowed it down to just under 30 genes. Being able to identify these candidate resistance genes, which were previously unknown, is an unprecedented achievement in our research,” points out Domingues.

But the work is far from over, as these genes have yet to be tested. “More research will be needed to identify and create varieties that are resistant to these and other coffee pests and diseases,” says Lepelley.

Using molecular genetics, the consortium was also able to make a triple separation, showing that the genetic diversity of Ethiopia’s wild plants differs from that of the coffee grown today, probably due to a bottleneck effect and domestication, as few plants were selected for this process. “We’ve shown here that the genetic diversity was already very low among wild specimens due to multiple pre-domestication bottlenecks, and that the genotypes selected for cultivation by man, both the ancient local Ethiopian varieties and the more recent ones, were already somewhat mixed between divergent lineages,” the scientists state.


At the same time, Domingues’ group was able to observe some events related to the expression of genes linked to coffee quality, especially aroma. They studied the terpene synthase enzymes, which in plants are related to defense against insects, as well as a gene related to lipid compounds in coffee, which codes for fatty acid desaturase 2.

“We observed in an Asian Arabica variety that the genes associated with aroma and flavor are expressed more in the fruit by the C. eugenioides subgenome than by the other parent. In other words, one of the genomes contributes more to the sensory characteristics of the drink than the other. What we’re wondering now is: does this apply to all the varieties we’ve sequenced, both pre- and post-improvement?” says Domingues.

“This study sheds light on how interactions between C. canephora and C. eugenioides genes are associated with Arabica coffee traits such as aroma. Elucidating the interactions between genes helps to improve our knowledge of the genetic mechanisms underlying important characteristics of Arabica, a fundamental prerequisite for developing new varieties that will guarantee the production of coffee beans for future coffee products,” says Lepelley.

A spin-off of the work is already underway, according to Domingues. “I’ve just started another project, which is an offshoot of this first effort, in partnership with the French researchers who were part of this consortium. We’re now going to analyze non-cultivated coffee species. We want to get to know the genome of non-coffee species that contain characteristics that are relevant in a climate change scenario. We’re focusing on sequencing species that are more climate resilient. We want to know what genes they have that Arabica coffee doesn’t have and that make them climate-resistant. Eventually, we could introduce or modify them through gene editing to make cultivated species more resistant.”

About São Paulo Research Foundation (FAPESP)

The São Paulo Research Foundation (FAPESP) is a public institution with the mission of supporting scientific research in all fields of knowledge by awarding scholarships, fellowships and grants to investigators linked with higher education and research institutions in the State of São Paulo, Brazil. FAPESP is aware that the very best research can only be done by working with the best researchers internationally. Therefore, it has established partnerships with funding agencies, higher education institutions, private companies, and research organizations in other countries known for the quality of their research and has been encouraging scientists funded by its grants to further develop their international collaboration. The FAPESP news agency reports on the latest scientific breakthroughs FAPESP helps achieve through its many programs, awards and research centers.
