Languages that involve "clicks" are relatively rare worldwide but are spoken by several groups in Africa. The Khoisan language family includes a handful of these click languages, spoken by hunter-gatherer groups in southern and eastern Africa. But the grouping of these populations into a single language family has been controversial, with some linguists convinced that a few of the languages are too different to be classified together.
A genomic study of 50 African populations led by researchers at the University of Pennsylvania adds some clarity to the relationships between these click-speaking groups and many others. The results point to a relatively recent shared ancestry for a few of the click-speaking hunter-gatherer populations, indicating they are more closely related to one another than to their neighbors that practice other subsistence lifestyles, such as farming or animal herding.
The analysis, one of the most extensive to date of ethnically diverse populations in Africa, also demonstrates the importance of infectious disease, immunity, and diet in shaping the diversity of populations across Africa. The work is published in the journal Proceedings of the National Academy of Sciences.
"It's very rare to have a study of this many groups that are genetically different in terms of ancestry, in their subsistence patterns, and are geographically dispersed as well," says Sarah Tishkoff, a geneticist and Penn Integrates Knowledge Professor who was the senior author on the paper. "This allows us to characterize population structure and demographic history as well as to look at signatures of natural selection acting on these populations."
The analysis builds upon decades of work by the Tishkoff lab and African collaborators to explore African genetic diversity. The research, says Tishkoff, facilitates genomics research overall by examining populations that have been otherwise understudied, and it can play a role in identifying genetic variants that influence health and disease in Africa and around the world.
This study probes deeply into the genomic landscape of 840 Africans, examining 621,000 separate nucleotide positions in the DNA of each participant.
The 50 groups surveyed are spread across sub-Saharan Africa and include almost all groups that practice a hunting-and-gathering lifestyle, or have until recently.
Tishkoff, Scheinfeldt, and colleagues were particularly interested in what these study participants' genomes would reveal about ancient relationships among hunter-gatherer populations, particularly those speaking languages that had been classified as Khoisan. East Africa's Hadza and Sandawe hunter-gatherers had been labeled Khoisan by some linguistic analyses, grouped with southern Africa's San hunter-gatherers.
"Some linguists say it's not correct to place all of these into the Khoisan family, arguing that the Hadza and Sandawe languages are so different from each other and from the San that they really should be in separate language classifications," says Tishkoff.
The researchers also included study participants from the Dahalo of Kenya, who had never before been studied genetically but speak a language with remnant clicks. "It's an ongoing question in linguistics and genetics," Tishkoff says, "and we wanted to ask the question, 'Do these groups with click phonemes have a common genetic ancestry?'"
They were also curious to know whether a shared subsistence lifestyle practice--that of hunting and gathering--indicated a shared ancestry. Among the 16 hunter-gatherer populations they studied was a group called the Sabue who live in southwestern Ethiopia, surrounded by pastoralist groups. The Sabue had never before participated in genomic research and speak a language that is thus far unclassified.
Using the genetic information they obtained to map out the populations' likely relationships to one another, the researchers unexpectedly found that four hunter-gatherer populations--the Hadza, Sandawe, Dahalo, and Sabue, each of which dwells in a distinct area of eastern Africa--clustered together.
"Typically what we see is that populations cluster by geography, but here we're seeing an exception to that," Tishkoff says. "Here you have three groups that either speak a click language, have remnant clicks, or have an unclassified language, and they're showing a common ancestry even though they're spread across different countries."
Although the researchers could not identify a uniquely shared ancestry between these four groups of eastern African Khoisan hunter-gatherers and the southern African San people, who also speak a language with clicks, they did observe shared ancestry between the San and rainforest hunter-gatherers from Central Africa, even though those two groups live geographically far apart.
In contrast, other hunter-gatherer groups, such as the Wata, El Molo, and Yaaku, appeared more genetically similar to neighboring agriculturalist and pastoralist groups.
The common ancestry for the four East African hunter-gatherer groups dates back more than 20,000 years, according to the team's analysis, to around the beginning of the last glacial maximum, when ice covered extensive portions of Earth and the climate was much different from what it is today.
"The idea is that this may have changed environmental conditions and introduced a barrier between populations," says Laura Scheinfeldt, the lead author, who was a research associate in Tishkoff's lab and is now with the Coriell Institute for Medical Research.
The researchers' techniques also allowed for a better understanding of the forces that have acted to differentiate the groups they studied.
"What we found was the strongest signatures of adaptation tended to be population-specific," says Scheinfeldt. In other words, targets of natural selection were different in the different groups and may well have contributed to the uniqueness of each.
Despite these individual differences, the categories of the genes that were selected were shared among populations, the researchers discovered.
"Genes involved in immune responses, diet, and metabolism were the broad categories that we saw coming up over and over again," Scheinfeldt notes. "We know infectious disease in general is a very strong pressure, and, when you look solely at how prevalent malaria is, that also explains some of the patterns we see in adaptive signatures. Just that one disease is a very strong selective pressure."
In future studies, Tishkoff and colleagues will be zooming in to see how particular genetic variants may affect physical traits in the people who possess them, studies that could shed light on genetic causes of disease susceptibility. They'll also be using powerful whole-genome sequencing techniques to further illuminate the relationships among Africa's diverse populations.
Tishkoff and Scheinfeldt's coauthors on the study were Penn's Sameer Soi, Charla Lambert, Wen-Ya Ko, Aoua Coulibaly, Alessia Ranciaro, Simon Thompson, Jibril Hirbo, William Beggs, and Junhyong Kim; the University of Khartoum's Muntaser Ibrahim; St. Joseph University College of Health Sciences' Thomas Nyambo; the Kenya Medical Research Institute's Sabah Omar; Addis Ababa University's Dawit Woldemeskel and Gurja Belay; and the Musée de l'Homme's Alain Froment.
The research was supported by the Lewis and Clark Fund, University of Pennsylvania, Leakey Foundation, National Institutes of Health (grants AI007532, ES022577, DK104339, and ES019851), and National Science Foundation (Grant 1540432).
Sarah Tishkoff is the David and Lyn Silfen University Professor and a Penn Integrates Knowledge Professor at the University of Pennsylvania, with appointments in the Department of Genetics in the Perelman School of Medicine and in the Department of Biology in the School of Arts and Sciences.
Laura Scheinfeldt is the principal investigator of the National Institute of Neurological Disorders and Stroke's Human Genetic Resource Center at the Coriell Institute for Medical Research. She was formerly a research associate in the Tishkoff laboratory at the University of Pennsylvania.