Vietnam features extensive ethnolinguistic diversity and occupies a key position in Mainland Southeast Asia (MSEA). Bordering China, Laos and Cambodia, the country is geographically diverse, and its Red River and Mekong deltas and long coastline have offered ample routes for human migration.
The early settlement of anatomically modern humans in MSEA dates back to at least 65 thousand years ago (kya) and is associated with the formation of a hunter-gatherer tradition called Hoabinhian. Since the Neolithic period, which dates to about 4,000-5,000 years ago, cultural transitions and diversification have occurred multiple times, eventually leading to the extraordinary cultural diversity of present-day MSEA.
According to the General Statistics Office of Vietnam, Vietnam has a population of more than 96 million people comprising 54 official ethnic groups, and 110 languages are spoken in the country. Across MSEA, there are hundreds of ethnolinguistic groups, speaking languages that belong to five major language families: Austro-Asiatic (AA), Austronesian (AN), Hmong-Mien (HM), Tai-Kadai (TK), and Sino-Tibetan (ST).
Yet, the genetic diversity of Vietnam has remained relatively unexplored, especially with genome-wide data, because previous studies have focused mainly on the majority Kinh group.
Now, in a new paper published in the advance online edition of Molecular Biology and Evolution, Dang Liu, Mark Stoneking and colleagues have analyzed newly generated genome-wide SNP data for the Kinh and 21 additional ethnic groups in Vietnam, encompassing all five major language families in MSEA, along with previously published data from nearby populations and ancient samples.
"We find that the Vietnamese ethnolinguistic groups harbor multiple sources of genetic diversity that are associated with heterogeneous ancestry sharing profiles in each language family," said corresponding author Nong Van Hai. First author Dang Liu added, "However, the linguistic diversity does not completely match genetic diversity; there have been extensive interactions between the Hmong-Mien and Tai-Kadai groups, and a likely case of cultural diffusion in which some Austro-Asiatic groups shifted to speaking Austronesian languages."
On a global scale, the strongest signal separates most Indian groups from the East Asian groups. The researchers also found evidence that the majority Kinh group, which has been the focus of previous studies, may not reflect the total Vietnamese diversity. Within modern Vietnamese groups, individuals from the same language family are mostly placed together. Among the language families, the ST, HM, and TK groups are mostly separated from the AA and AN groups. Vietnamese ethnolinguistic groups overall tend to show the closest relationships with Taiwanese and southern Chinese groups.
"Overall, our results highlight the importance of genome-wide data from dense sampling of ethnolinguistic groups in providing new insights into the genetic diversity and history of an ethnolinguistically diverse region, such as Vietnam," said corresponding author Mark Stoneking. "In contrast to previous studies suggesting a largely indigenous origin of the Vietnamese, we find evidence for extensive contact, over different time periods, between Vietnamese and other groups."
The study is the most wide-ranging analysis to date, applying the most up-to-date and informative approaches available to modern genomic data in order to better understand the rich genetic diversity of Vietnam's populations.