A cup of joe with a side of genetic history, please!

Written by:

A recent study published in Nature Genetics sequenced the genomes of different coffee plant varieties and uncovered new insights about the history of Coffea arabica as well as the species’ potential disease resistance genes.

Photo by mali maeder from Pexels: https://www.pexels.com/photo/black-coffee-fruit-picked-during-daytime-92354/

You know how you have two copies of each chromosome? Well, that’s not the case for all organisms: some have even more! Many species of plants exhibit polyploidy, meaning that they carry more than two complete sets of chromosomes. These extra copies can drive evolutionary change as the plants either undergo large genomic changes or adapt to their newfound chromosomal abundance. Potential genomic changes include loss of genes, rearrangements of genetic material on chromosomes, and changes in gene expression.

One of the most common coffee plants, Coffea arabica, has four copies of each chromosome (a state known as tetraploidy). It became tetraploid when two ancestral coffee species interbred and the resulting hybrid kept the chromosome sets of both parents. Since the coffee trade makes up the largest segment of the tropical beverage trade, and coffee production can create jobs and income, C. arabica is an important crop. Efforts to protect its cultivation in the face of climate and disease challenges can help preserve its role in the global economy.

To better understand the history of C. arabica’s cultivation, a team of researchers sequenced the DNA of not only C. arabica but also two other coffee species, Robusta (Coffea canephora) and Eugenioides (Coffea eugenioides), each of which descended from one of the ancestral parents of C. arabica. The sequencing data suggested that after C. arabica became polyploid, its two parental genomes were largely able to coexist.

But how do scientists actually use a sequence of letters representing the genome to understand how a species has evolved? Many computational genomics tools compare a newly sequenced genome to a known, or reference, sequence. Based on sequence similarity to the reference, functional elements, including specific genes, can be predicted. There are also tools that use mathematical models to help researchers predict the history of populations over time, including evolutionary splits and migrations (in this study, a tool called FastSimcoal2 was used to model C. arabica’s population history).
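
To make "sequence similarity to the reference" a little more concrete, here is a minimal Python sketch that scores how closely a sample matches a reference. It assumes the two sequences have already been lined up to the same length, and the function name and the short DNA strings are made up for illustration; they are not from the study, and real tools do far more (alignment, scoring, annotation).

```python
def percent_identity(query: str, reference: str) -> float:
    """Percentage of aligned positions where both sequences carry the same letter.

    Assumes the sequences are already aligned (same length, '-' marks a gap).
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be pre-aligned to the same length")
    compared = 0
    matches = 0
    for q, r in zip(query.upper(), reference.upper()):
        if q == "-" or r == "-":
            continue  # gap positions are not counted either way
        compared += 1
        if q == r:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical 21-letter stretch of a reference gene and a sample that
# differs from it at a single position.
reference_gene = "ATGGCCATTGTAATGGGCCGC"
sample_region = "ATGGCCATTGTTATGGGCCGC"

identity = percent_identity(sample_region, reference_gene)
print(f"{identity:.1f}% identical to the reference")
```

A region that scores very close to a known gene in the reference is a reasonable candidate for being a version of that gene, which is roughly the intuition behind similarity-based gene prediction.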

Gaining a more detailed understanding of C. arabica’s evolutionary past required the researchers to sequence many more strains. In total, they sequenced over three dozen C. arabica genomes and some additional strains of Robusta and Eugenioides. They detected some specific chromosomal regions in C. arabica where exchanges between the ancestral parent chromosomes had led to either Robusta or Eugenioides DNA sequence being more common. In other words, the Robusta-derived and Eugenioides-derived versions of a gene are not equally represented everywhere in the genome. The researchers also traced the geographic origins of some widely consumed crop lineages of C. arabica to Yemen, where human cultivation of the species began.

Further analysis of the C. arabica genomes suggested that there were two long-term population bottlenecks (instances where the population is greatly reduced, which can lead to lower genetic diversity) in wild plants; the first ran from ~350,000 years ago to ~15,000 years ago, and the second started ~5,000 years ago and is still ongoing. C. arabica was predicted to have become polyploid sometime between 610,000 years ago and 350,000 years ago (the start of the first bottleneck), which is much earlier than previous research had estimated. 
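
Why does a bottleneck lower genetic diversity? A textbook idealization (not the model used in the study) says that in a randomly mating population of effective size N, expected heterozygosity shrinks by a factor of (1 - 1/(2N)) each generation. The short Python sketch below applies that formula; the population sizes and generation count are invented purely for illustration.

```python
def expected_heterozygosity(h0: float, effective_size: int, generations: int) -> float:
    """Expected heterozygosity after drift in an idealized population.

    Uses the standard approximation H_t = H_0 * (1 - 1/(2N))^t.
    """
    if effective_size < 1:
        raise ValueError("effective_size must be at least 1")
    return h0 * (1.0 - 1.0 / (2.0 * effective_size)) ** generations


# Hypothetical numbers: same starting diversity, same elapsed time,
# but one population has gone through a severe bottleneck.
starting_diversity = 0.30
large_population = expected_heterozygosity(starting_diversity, 10_000, 500)
bottlenecked = expected_heterozygosity(starting_diversity, 100, 500)

print(f"large population:        {large_population:.3f}")
print(f"bottlenecked population: {bottlenecked:.3f}")
```

Under these assumptions the large population keeps nearly all of its starting diversity, while the bottlenecked one loses most of it, and the longer a bottleneck lasts, the more diversity drains away.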

Notably, both wild and cultivated strains have low genetic diversity. This could make them susceptible to environmental and pathogenic threats, since there may not be enough variety for a subset of the species to survive a challenge. On the other hand, it can be helpful if genes that enhance resilience are widespread in the population. Some of the coffee strains sequenced in this study were descendants of a hybrid coffee plant from Timor that is resistant to the fungus that causes the disease coffee leaf rust. The sequenced descendants share a genomic region containing resistance-related genes, but it’s unknown whether a single gene or multiple genes confer the resistance. Nonetheless, the researchers were able to come up with a list of candidate genes based on their sequencing data and data from a previously published study that examined gene expression in resistant and non-resistant coffee plants infected with the fungus.
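
One simple way to picture combining the two kinds of evidence is as an overlap: keep the genes that sit inside the shared genomic region and that also change expression during infection. The Python sketch below illustrates that idea only; it is not the researchers' actual pipeline, and every gene ID, coordinate, and name in it is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    gene_id: str
    chromosome: str
    start: int
    end: int


def genes_in_region(genes, chromosome, region_start, region_end):
    """Genes on the given chromosome that overlap the interval at all."""
    return [
        g for g in genes
        if g.chromosome == chromosome and g.start <= region_end and g.end >= region_start
    ]


def candidate_genes(genes, region, responsive_ids):
    """IDs of genes inside the region that also respond to infection."""
    chromosome, region_start, region_end = region
    in_region = genes_in_region(genes, chromosome, region_start, region_end)
    return sorted(g.gene_id for g in in_region if g.gene_id in responsive_ids)


# Hypothetical annotation, shared region, and expression result.
annotation = [
    Gene("geneA", "chr4", 1_000, 2_500),
    Gene("geneB", "chr4", 3_000, 4_200),
    Gene("geneC", "chr4", 9_000, 9_800),
    Gene("geneD", "chr7", 1_500, 2_000),
]
shared_region = ("chr4", 900, 5_000)
responds_to_fungus = {"geneB", "geneC", "geneD"}

print(candidate_genes(annotation, shared_region, responds_to_fungus))
```

Here only geneB passes both filters: geneA is in the region but does not respond to the fungus, while geneC and geneD respond but lie outside the region.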

Narrowing down candidate immunity genes for plants can allow us to understand how crops respond to pathogens and to predict which varieties will be able to withstand particular challenges. Genomics-informed breeding strategies offer major benefits for cultivating crops that have significant roles in the economy, including coffee.

Edited by: Zach Patterson, JP Flores

