News Release

Haplotype map offers new insights into human disease, evolution

Peer-Reviewed Publication

Broad Institute of MIT and Harvard

Cambridge, MA, Wed., Oct. 26, 2005 – In several papers published this week in Nature, Nature Genetics, PLoS Biology and Genome Research, Broad researchers and an international set of collaborators announce substantial advances in relating human genetic variation to disease and understanding human evolutionary history.

This flurry of high-profile studies is grounded in data described in a significant paper published in the Oct. 27 issue of the journal Nature by an international consortium of more than 200 researchers from Canada, China, Japan, Nigeria, the United Kingdom and the United States. In this paper, the authors describe the common patterns of genetic variation in human DNA samples collected from four sites around the world. The Consortium's findings provide overwhelming evidence for previous scientific work suggesting that genetic variants located physically close to each other are collectively inherited as groups, called haplotypes. The comprehensive catalog of all of these blocks, known as the "HapMap," which is now publicly available to the biomedical research community, has already accelerated the search for gene variants involved in common diseases and brought new insights into the genes involved in human evolution.

"Built upon the foundation laid by the human genome sequence, the HapMap is a powerful new tool for exploring the root causes of common diseases. Such understanding is required for researchers to develop new and much-needed approaches to understand the still-elusive root causes of common diseases such as diabetes, bipolar disorder, cancer and many others," said David Altshuler, M.D., Ph.D., director of the program in Medical and Population Genetics at the Broad Institute of Harvard and MIT and associate professor of genetics and of medicine at Massachusetts General Hospital and Harvard Medical School. Altshuler and Peter Donnelly, Ph.D., of the University of Oxford in England, are the corresponding authors of the Nature paper.

It has long been known that diseases run in families, with perhaps half of the risk of any given common disease explained by genetic differences inherited from one's parents. Inheritance can also play a role in different responses to a drug or to an environmental factor. Because the underlying causes of these common diseases and therapeutic responses remain largely unknown -- and because knowing this information is necessary for successful development of new approaches to prevention, diagnosis and treatment -- identifying the genetic contributors to human health is a fundamental goal for biomedicine.

A new genomics-based approach to human genetics was proposed nearly a decade ago to comprehensively catalog common human DNA sequence variations, and to test them systematically for their association to disease in human populations. Although it is theoretically possible to capture all of this information by sequencing every individual human genome, this is neither technically nor financially feasible at present. "The data from the HapMap project allows scientists to select the particular DNA variants that provide the greatest information in the most efficient manner, lowering the costs and increasing the power of genetic research to identify the origin of disease," said Mark Daly, assistant professor in the Center for Human Genetic Research at Massachusetts General Hospital, and an associate member of the Broad Institute of Harvard and MIT. Daly led the Boston team's statistical and analytical work, and was a member of the writing group for the Nature paper.

Moreover, the HapMap project helped spur a remarkable advance in the technology for testing genetic variations in DNA, making it possible to undertake comprehensive studies in large patient samples. Stacey Gabriel, director of the Broad Institute's Genetic Analysis platform and an author on the Nature paper, noted that "when we started doing this work a number of years ago, determining the genotype of a SNP in a patient cost nearly a dollar, and we could do hundreds a day. Today the prices have dropped in many cases to a fraction of a penny per genotype, and we can do millions a day. This is the difference between not being able to do the studies, and getting them done rapidly and well."

In a related paper published in the November issue of Nature Genetics, Paul de Bakker, Roman Yelensky and their colleagues demonstrate that the HapMap provides excellent power to capture most human variation and link it to disease or other traits. They did this by developing and evaluating methods to select "tag SNPs" that capture the genetic variation in each neighborhood with a minimum amount of work. Using these tags, scientists can then compare the SNP patterns of people affected by a disease with those of unaffected people far more efficiently than has previously been possible. "Compared to directly genotyping all common SNPs in the genome in all individuals of a disease study, we observe that selected tag SNPs based on HapMap can save genotyping costs by almost an order of magnitude without losing much power to detect a true association," says de Bakker, a postdoctoral fellow in Altshuler and Daly's group at Massachusetts General Hospital and the Broad Institute. The widely used tool for tag SNP selection was developed by de Bakker and colleagues and is available at
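To make the idea of tag SNPs concrete, here is a minimal sketch of one common way to choose them: greedy pairwise tagging. It is an illustration only, not the method evaluated in the Nature Genetics paper or the implementation in the de Bakker tool. The function names, the toy haplotypes and the r-squared threshold of 0.8 are all assumptions made for this example.

```python
from itertools import combinations


def r_squared(a, b):
    """Squared correlation between two SNPs, each a list of 0/1 alleles
    observed on the same set of haplotypes."""
    n = len(a)
    pa = sum(a) / n
    pb = sum(b) / n
    pab = sum(x & y for x, y in zip(a, b)) / n
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        return 0.0  # a SNP with no variation cannot be correlated with anything
    d = pab - pa * pb
    return d * d / denom


def greedy_tag_snps(haplotypes, threshold=0.8):
    """Pick tag SNPs so that every SNP is either a tag or correlated with
    a tag at r^2 >= threshold (threshold value is an assumption).

    haplotypes: dict mapping SNP name -> list of 0/1 alleles.
    """
    names = list(haplotypes)
    # For each SNP, the set of SNPs it would capture if chosen as a tag.
    captures = {s: {s} for s in names}
    for s, t in combinations(names, 2):
        if r_squared(haplotypes[s], haplotypes[t]) >= threshold:
            captures[s].add(t)
            captures[t].add(s)

    untagged = set(names)
    tags = []
    while untagged:
        # Choose the SNP that captures the most SNPs still untagged.
        best = max(names, key=lambda s: len(captures[s] & untagged))
        tags.append(best)
        untagged -= captures[best]
    return tags


# Hypothetical toy data: five SNPs typed on eight haplotypes.
example = {
    "snp1": [0, 0, 1, 1, 0, 1, 0, 1],
    "snp2": [0, 0, 1, 1, 0, 1, 0, 1],  # identical to snp1
    "snp3": [1, 1, 0, 0, 1, 0, 1, 0],  # mirror image of snp1
    "snp4": [0, 1, 0, 1, 0, 1, 0, 1],
    "snp5": [0, 1, 0, 1, 0, 1, 0, 1],  # identical to snp4
}
print(greedy_tag_snps(example))
```

In this toy data set, two tags stand in for all five SNPs, which is the kind of saving in genotyping effort the paragraph above describes. Real studies work with far larger data and may use more elaborate selection criteria.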

Another important observation revealed by the availability of the HapMap data is that previous computer models of human genetics are too simplistic and can lead to false conclusions about the role of genes or genetic loci in different diseases. In a paper published in the November issue of Genome Research, Stephen Schaffner, Altshuler and their colleagues at the Broad Institute demonstrate the limitations of these prior models. They also provide updated models for the use of the entire scientific community that more closely approximate reality, based on the empirical data generated by the HapMap Consortium. "Better computer models can be valuable tools in understanding the nature of human DNA variation, past changes in human population size, and evolutionary selection," said Schaffner, a computational biologist in Broad's program in Medical and Population Genetics.

The public availability of HapMap's genome-wide variation data set also makes it possible for scientists to systematically examine potential sites of natural selection in the human genome as well as to re-evaluate previous claims for such selection. Pardis Sabeti, Eric Lander and their colleagues at the Broad Institute together with Stephen O'Brien and his colleagues at the National Cancer Institute used the HapMap data to examine a prominent reported case of natural selection related to HIV infection. As they report in the November issue of PLoS Biology, CCR5-Δ32, a genetic variation in a T-cell receptor that confers strong resistance to infection by HIV and that has been implicated in resistance to the bubonic plague, did not arise recently in the human population, as was previously thought to be true based on the more limited data available at that time. "With the benefit of greater genotyping and empirical comparisons from the HapMap, we were able to show that the pattern of genetic variation seen at CCR5-Δ32 does not stand out as exceptional relative to other loci across the genome and is consistent with neutral evolution," says Sabeti, a student at Harvard Medical School and a postdoctoral fellow at the Broad Institute. "In fact, the CCR5-Δ32 allele is likely to have arisen more than 5000 years ago, rather than during the last 1000 years as was previously thought." In addition to its power in re-examining previous claims of selection, the HapMap data also give scientists new ability to identify novel candidates for natural selection.

The successful completion of the HapMap has its roots not only in the completion of the human genome sequence in 2001, but also in the massive effort to characterize and catalog the millions of SNPs across the genome. Based on these initial data, the haplotype structure of the human genome was recognized as early as 2001, leading directly to the formation of the International HapMap Consortium. Finally, methods for identifying the influence of natural selection on the human genome were described in 2003. Altshuler, Lander, Gabriel, Daly and many other Broad Institute scientists led or contributed significantly to all of these efforts, in addition to their role in the completion of the HapMap and demonstrations of its utility, as outlined above.

In October 2002, the International HapMap Consortium set the ambitious goal of creating the HapMap within three years. The Nature paper marks the attainment of that goal with its detailed description of the Phase I HapMap, consisting of more than 1 million SNPs. The consortium is also nearing completion of the Phase II HapMap that will contain nearly three times more SNPs than the initial version and will enable researchers to focus their gene searches even more precisely on specific regions of the genome.

In line with the Broad Institute's commitment to building critical resources for the scientific community, HapMap data are freely available in several public databases, including the HapMap Data Coordination Center, the NIH-funded National Center for Biotechnology Information's dbSNP and the JSNP Database in Japan. Further information can also be found at the Broad Institute's web site.

The U.S. component of the $138 million International HapMap Consortium is led by the National Human Genome Research Institute (NHGRI) on behalf of the 20 institutes, centers and offices of the National Institutes of Health (NIH) that contributed funding. For more details on the project's scientific design and rationale, see For a complete list of participating research organizations and funders, see

Sabeti PC, Walsh E, Schaffner S, Varilly P, Fry B, et al. The case for selection at CCR5-Δ32.
PLoS Biology. 2005;3(11): e378.

De Bakker PIW, Yelensky R, Pe'er I, Gabriel SB, Daly MJ, Altshuler D. Efficiency and power in genetic association studies.
Nature Genetics. Advance online publication 23 Oct 2005 (doi:10.1038/ng1669).

Schaffner SF, Foo C, Gabriel S, Reich D, Daly MJ, Altshuler D. Calibrating a coalescent simulation of human genome sequence variation.
Genome Research. 2005;15(11): 1576-1583.

The International HapMap Consortium. A haplotype map of the human genome.
Nature. 2005;437(7063): 1299-1320.


About the Broad Institute of MIT and Harvard

The Broad Institute of MIT and Harvard was founded in 2003 to bring the power of genomics to biomedicine. It pursues this mission by empowering creative young scientists to construct new and robust tools for genomic medicine, to make them accessible to the global scientific community, and to apply them to the understanding and treatment of disease.

The Institute is a research collaboration involving faculty, professional staff and students from throughout the MIT and Harvard academic and medical communities. The two universities govern it jointly.

Organized around Scientific Programs and Scientific Platforms, the unique structure of the Broad Institute enables scientists to collaborate on transformative projects across many scientific and medical disciplines.

For further information about the Broad Institute, go to

For more information on this news release, contact:
Fintan Steele or Michelle Nhuch
Broad Institute of MIT and Harvard
617-324-1698 or 617-252-1064 (office)
617-281-3568 (cell)
