The work was unveiled today at the Annual Meeting of the American Association for the Advancement of Science (AAAS), which publishes Science.
The mapping effort describes 1.58 million single-letter DNA variations across 71 individuals of European American, African American and Han Chinese American ancestry. Although the human genome contains millions more single-letter variations, or single-nucleotide polymorphisms (SNPs), they seem to occur within patterns that have been preserved for thousands of years -- despite the DNA reshuffling that happens from generation to generation. The new mapping effort therefore appears to capture most common human genetic variation, researchers said.
The work is believed to have significant implications for the study of cardiovascular disease, mental illness, and many other conditions thought to result from a complex interplay of multiple genetic and environmental factors.
Researchers made use of the fact that two genes located closer together are far less likely to be reshuffled over generations by the biological process known as recombination. As a result, certain patterns of variation, or "haplotypes," have been preserved across human history. The presence of these patterns, known as "linkage disequilibrium," allowed the Science authors to create a first picture of the structure of human genetic variation based on short- and long-range clustering of single-letter variations (SNPs).
Most common DNA variations are found across all populations and likely date back to the exodus of modern humans out of Africa. But, other genetic differences may be specific to certain populations, explained David R. Cox of Perlegen Sciences, Inc., in Mountain View, California. The research thus should "provide a tool for exploring many questions remaining regarding the causal role of common human DNA variation in complex human traits and for investigating the nature of genetic variation within and between human populations," the Science paper concludes.
But the findings, based on publicly accessible data, won't resolve the ongoing debate over whether distinct races of people exist, explained Cox, the study's corresponding author, and lead author David A. Hinds, also of Perlegen Sciences. "Our study was really designed to help us understand patterns of variation that are common and cut across populations," said Hinds.
"People cannot use our data to say, 'See, I told you there are races,' or, 'See, I told you there aren't races,'" said Cox. "Nor can they say, 'See, the differences are more important than the similarities.' But, these data will be useful for starting to address such questions as which kinds of medical treatments should be used, based on physiological differences caused by genetic variations."
In a related Science Policy Forum article, Troy Duster of New York University described the new research as "well-intentioned, well-crafted, and designed to help better understand the molecular basis of disease." But he also urged caution, noting that biomedical researchers must "climb back on the tightrope to address racial disparities in health" to avoid "reification" or reinforcement of outdated concepts of race.
David Altshuler, author of a related Science Perspectives essay, said that the new data "represent a major step forward." Moreover, said Altshuler, of the Broad Institute at Harvard and Massachusetts Institute of Technology and the Massachusetts General Hospital, Boston: "It is exciting to note that Perlegen and the public HapMap Project are now working together to generate an even denser map."
The more detailed description of genetic variations is expected later this year from the international HapMap Project, directed by government agencies from Japan, China and Canada, as well as The Wellcome Trust of London and the U.S. National Institutes of Health. That mapping effort will describe variation across individuals of Japanese, Chinese, Nigerian and European ancestry, said Cox, who also works with the public group.
Navigating the Genome
Around the world, life's genetic blueprint is 99.99 percent similar from person to person, with the greatest degree of variation found inside Africa, where DNA has been evolving for the longest period of time. But single-letter variations, called "snips," for single-nucleotide polymorphisms (SNPs), may determine each person's vulnerability to disease and response to different medications, as well as a broad range of other traits, from eye color and height to hair color and body type. The relationship between such traits and single DNA-letter variations is poorly understood, though, and scientists also know little about the importance of common versus rare SNPs. An estimated 7 million common SNPs occur in at least 5 percent of the entire human population, the Science paper notes. Another 4 million SNPs are far less common, turning up in only 1 percent to 5 percent of the world's entire population. Still rarer DNA variations may be found only in a single individual.
To investigate genetic variation, the Science authors took advantage of patterns among interconnected SNPs. A small, known section of genetic code containing a SNP can be used to predict larger, related chunks of sequence, and thus, provides a shortcut for mapping whole-genome patterns of genetic variations shared by people of European American, African American and Han Chinese American ancestry.
Key to the research, explained Barbara Jasny, senior supervisory editor for Science, was a phenomenon of genetic inheritance called linkage disequilibrium, which is related to the distance between two variants or alleles of one gene in a single person. The greater the distance between two gene variants, Jasny explained, the greater the odds that one variant may be recombined or shuffled out of the sequence block as DNA moves from parent to child. Conversely, if the variations are closer together, they are more likely to remain together following DNA recombination over generations. At the same time, she said, haplotype blocks of genetic sequences containing genetic variations seem to have been preserved for thousands of years.
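For readers who want to see how linkage disequilibrium is usually put into numbers, the short Python sketch below computes two standard summary statistics, D and r-squared, for a pair of two-allele sites. It is an illustration only and is not taken from the Science paper; the function name `ld_stats` and the toy haplotypes are assumptions made for this example, and the input is assumed to be phased, with each allele coded 0 or 1.

```python
def ld_stats(haplotypes):
    """Return (D, r_squared) for two biallelic sites.

    haplotypes: list of (a, b) tuples, one per chromosome, where a and b
    are the alleles (coded 0 or 1) carried at site A and site B.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at site A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at site B
    # frequency of chromosomes carrying allele 1 at both sites
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # departure from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must show both alleles")
    return d, d * d / denom


# Two sites that always travel together versus two that vary independently.
linked = [(0, 0), (0, 0), (1, 1), (1, 1)]
unlinked = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(ld_stats(linked))
print(ld_stats(unlinked))
```

In the first toy sample, knowing the allele at one site fully predicts the other, so r-squared is 1; in the second, the sites carry no information about each other and r-squared is 0.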
Genetic patterns may help point the way toward improved disease treatments. "Imagine that friends are giving you directions," Jasny suggested. "They might tell you that their house is part of a development that is three street lights south of the movie theater. Finding the street lights and the movie theater will be important for reaching your destination. Similarly, because genetic markers are not randomly associated, learning about a subset of them can help researchers find associations with disease-related genes."
"Bins" of Genetic Variation
During World War II, cryptographers could take seemingly random jumbles of numbers and convert them into crucial messages. Similarly, mapping strategies described in the Science paper seem to bring aspects of the human genomic sequence into sharper focus. Hinds, Cox and colleagues used a special mathematical algorithm, based on the principle of linkage disequilibrium, to characterize the structure of genetic diversity by distributing different SNPs within interconnected bins of genetic code spanning the whole genome. Genomic data also were assigned to haplotype maps, which were connected with bin patterns.
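The release does not spell out the algorithm, so the following Python sketch should be read as one plausible way to sort SNPs into bins, not as the authors' method. It uses a simple greedy rule: repeatedly pick the SNP that is strongly correlated with the most still-unassigned SNPs, and group those together. The function names, the 0.8 r-squared threshold, the alphabetical tie-break, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def r_squared(x, y):
    """Squared correlation between two SNPs coded 0/1 on the same chromosomes."""
    n = len(x)
    p_x = sum(x) / n
    p_y = sum(y) / n
    p_xy = sum(1 for a, b in zip(x, y) if a == 1 and b == 1) / n
    denom = p_x * (1 - p_x) * p_y * (1 - p_y)
    if denom == 0:
        return 0.0  # a site with only one allele carries no LD information
    d = p_xy - p_x * p_y
    return d * d / denom


def greedy_bins(snps, threshold=0.8):
    """Group SNPs into bins; returns a list of (tag_snp, sorted_members).

    snps: dict mapping a SNP name to its list of 0/1 alleles.
    threshold: assumed r-squared cutoff for calling two SNPs linked.
    """
    names = sorted(snps)
    linked = {name: {name} for name in names}
    for a, b in combinations(names, 2):
        if r_squared(snps[a], snps[b]) >= threshold:
            linked[a].add(b)
            linked[b].add(a)
    unbinned = set(names)
    bins = []
    while unbinned:
        # The tag is the SNP linked to the most unbinned SNPs;
        # max() over a sorted list breaks ties alphabetically.
        tag = max(sorted(unbinned), key=lambda s: len(linked[s] & unbinned))
        members = linked[tag] & unbinned
        bins.append((tag, sorted(members)))
        unbinned -= members
    return bins


toy = {
    "rs1": [0, 0, 1, 1, 0, 1],
    "rs2": [0, 0, 1, 1, 0, 1],  # identical pattern to rs1
    "rs3": [1, 1, 0, 0, 1, 0],  # mirror image of rs1
    "rs4": [0, 1, 0, 1, 0, 1],  # only weakly related to the others
}
print(greedy_bins(toy))
```

In this toy data, rs1, rs2 and rs3 fall into one bin and rs4 sits alone, so typing just two SNPs would recover the pattern of all four. That is the sense in which a subset of markers can stand in for the rest.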
Before they could identify bin or haplotype patterns, though, researchers first had to investigate 2,384,494 SNPs believed to be common across diverse populations. Of these genetic differences, some 69 percent were identified by sequencing DNA samples from 24 people, then filling in holes in the resulting information using SNPs from public databases.
Next, the researchers analyzed the genetic blueprint of 71 people who were not related to each other or to the 24 people studied in the initial analysis of SNPs. The group of 71 people included 24 European Americans, 23 African Americans and 24 Han Chinese American individuals whose DNA had been archived within the Coriell Cell Repositories' Human Variation Collection. Of the 2.4 million SNPs originally described by the researchers, some 1,586,383 were found in both variant forms, or alleles, among the 71 DNA samples.
Because 157,000 of the SNPs and nine individuals from their study had previously been analyzed by the HapMap Project (www.hapmap.org), which assessed 1 million SNPs in 270 people, the Science authors were able to compare the two sets of data. In more than 99 percent of all assessments, results were identical in both studies, suggesting that "both data sets are of exceptionally high accuracy," Altshuler noted. Comparisons with data from a Seattle-based study by Debbie Nickerson and colleagues offered further confirmation of accuracy and indicated a subset of SNPs where further analyses are necessary.
Most of the resulting 1.58 million SNPs were common to all three human populations, researchers reported. Some 94 percent of the SNPs, for example, were found in both variant forms among African Americans. Among European Americans, 81 percent of all the SNPs showed up in both forms, and 74 percent of SNPs were found in both forms in DNA from individuals of Han Chinese American ancestry.
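The per-population percentages above are shares of SNPs at which both variant forms turn up in a group's samples. The Python sketch below shows how such a share could be computed; it is a simplified illustration under assumed inputs, and the function name `fraction_segregating` and the toy alleles are hypothetical rather than drawn from the study's data.

```python
def fraction_segregating(observed):
    """Share of SNPs at which more than one allele appears in a group.

    observed: dict mapping a SNP name to the list of alleles seen in
    that group's samples.
    """
    if not observed:
        raise ValueError("no SNPs supplied")
    segregating = sum(1 for alleles in observed.values() if len(set(alleles)) > 1)
    return segregating / len(observed)


# Four hypothetical SNPs in one group: two show both forms, two show one.
group = {
    "rs1": ["A", "G", "A"],
    "rs2": ["C", "C", "C"],
    "rs3": ["T", "T", "G"],
    "rs4": ["A", "A", "A"],
}
print(fraction_segregating(group))
```

For this toy group the share is one half, since only rs1 and rs3 show both forms.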
How useful is the new mapping effort? In particular, if some diseases are caused by very rare genetic variations, will a map of common variations prove truly practical? "The biological insights should be of tremendous value," Altshuler said, while also acknowledging the possibility of exceedingly rare disease-related variations. "In addition to the potential utility for disease research, such data are a spectacular resource for population and evolutionary geneticists," he concluded.
The SNPs analyzed here represent only a fraction of the more than 10 million common SNPs expected to exist in the human genome. But, Hinds and colleagues were able to demonstrate that with just this relatively small number of "street lights," they should be able to find their way to most of the common variants in the human genome, Jasny said. "As few as 300,000 or 500,000 SNPs could give us most of the information," Hinds added.
Although this research probably won't allow doctors to identify individual risk factors, scientists reported, "knowledge of a large fraction of all the major genetic risk factors contributing to a treatment response or common disease could have immediate utility, allowing existing treatment options to be matched to individual patients without requiring additional knowledge of the mechanisms by which the genetic differences lead to different outcomes."
The Science paper, "Whole Genome Patterns of Common DNA Variation in Three Diverse Human Populations," was authored by David A. Hinds, Laura L. Stuve, Geoffrey B. Nilsen, Dennis G. Ballinger, Kelly A. Frazer, and David R. Cox of Perlegen Sciences, Inc., Mountain View, California; with Eran Halperin of the International Computer Science Institute of Berkeley, California; and Eleazar Eskin of the University of California at San Diego. The research makes use of previously reported data from the International HapMap Project.
The American Association for the Advancement of Science (AAAS) is the world's largest general scientific society, and publisher of the journal Science (www.sciencemag.org). AAAS was founded in 1848, and serves some 262 affiliated societies and academies of science, reaching 10 million individuals. Science has the largest paid circulation of any peer-reviewed general science journal in the world, with an estimated total readership of one million. The non-profit AAAS (www.aaas.org) is open to all and fulfills its mission to "advance science and serve society" through initiatives in science policy; international programs; science education; and more. For the latest research news, log on to EurekAlert!, www.eurekalert.org, the premier science-news Web site, a service of AAAS.