
Researchers develop new framework for analyzing genetic variants

A study from the 1000 Genomes Project yields data for analyzing structural variants in DNA

Brigham and Women's Hospital

Boston, MA - Advances in DNA sequencing technology have revolutionized biomedical research and taken us another step toward personalized medicine. Now, scientists led by Brigham and Women's Hospital (BWH), Harvard Medical School (HMS), the Broad Institute, the Wellcome Trust Sanger Institute (WTSI), the University of Washington, and the European Molecular Biology Laboratory have developed a new framework for analyzing key genetic variations that were previously overlooked. The research will be published in the February 3 issue of the prestigious journal Nature.

Efforts to identify genetic differences between individuals previously concentrated on single-nucleotide polymorphisms (SNPs), single-letter differences in a person's DNA that can be informative about a person's disease or predisposition to a disease. More recently, however, it has been appreciated that each person's genome also carries an enormous amount of structural variation: deletions, duplications, insertions, and inversions in the genetic sequence.

"There are many structural variants in everyone's genomes and they are increasingly being associated with various aspects of human health," said Charles Lee, PhD, a clinical cytogeneticist at BWH and associate professor at HMS, and co-chair of this project. "It is important to be able to identify and comprehensively characterize these genetic variants using state-of-the-art DNA sequencing technologies."

The genetic sequences of 185 individuals were generated by the 1000 Genomes Project and comprehensively analyzed for structural variants (SVs) by 57 scientists from 26 institutions. The scientists quickly realized that conventional methods for detecting SNPs could not be applied to identifying SVs, so 19 new computer programs and strategies had to be developed and tested to identify SVs more accurately. "The study found that no one program could comprehensively identify SVs and that each program had advantages and disadvantages that in some cases complemented other analytical programs," said Matthew Hurles, DPhil, of the Wellcome Trust Sanger Institute and co-chair of the project.

The study found a total of 22,025 deletions and 6,000 other structural variants. "We have been given our first glimpses of the complete spectrum of human genetic variation - from 1 bp indels to larger copy number changes," said Evan Eichler, PhD, a Howard Hughes Investigator at the University of Washington and co-chair of the project.

The study also provided important insights into how SVs are formed in the genome, thus linking SVs to mutational processes acting in the germline. "We found 51 hotspots where SVs, such as large deletions, appear to occur particularly often," said Jan Korbel, PhD, a senior author of this study from the European Molecular Biology Laboratory in Heidelberg, Germany. "Six of those hotspots are in regions known to be related to genetic conditions, such as Miller-Dieker syndrome, a congenital brain disease that may lead to infant death."

Data from this project are being made publicly available to the scientific community through the 1000 Genomes Project, which aims to sequence the genomes of 2,500 people by December 2012. The resource will be the largest collection of whole-genome DNA sequences freely available to researchers. The data may be accessed from the 1000 Genomes Project Data Coordination Center, a collaboration between the NIH National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI).

"Identifying SVs from DNA sequencing datasets is very challenging and it is gratifying to see the incredible progress that the SV group has made over the past 2 years," said Richard Durbin, PhD, of the Wellcome Trust Sanger Institute and co-chair of the 1000 Genomes Project. David Altshuler, MD, PhD, of the Broad Institute, also a co-chair of the 1000 Genomes Project, added, "I am confident that this map will serve as an important resource for future sequencing-based disease association studies."


Organizations that have committed major support for the project include Illumina; Life Technologies; the Wellcome Trust Sanger Institute; and the NHGRI, which supports the work being done at Baylor College of Medicine; Brigham and Women's Hospital; Boston College; Broad Institute; Cold Spring Harbor Laboratory; Washington University in St. Louis; University of California San Diego; University of Washington; and Yale University. Other institutions involved in this research include BGI-Shenzhen; Howard Hughes Medical Institute; Leiden University Medical Center; Louisiana State University; Max Planck Institute for Molecular Genetics; Mount Sinai School of Medicine; Roche; Simon Fraser University; Stanford University; University of Oxford; and University of Copenhagen.

Brigham and Women's Hospital (BWH) is a 793-bed nonprofit teaching affiliate of Harvard Medical School and a founding member of Partners HealthCare, an integrated health care delivery network. BWH is the home of the Carl J. and Ruth Shapiro Cardiovascular Center, the most advanced center of its kind. BWH is committed to excellence in patient care with expertise in virtually every specialty of medicine and surgery. BWH's medical preeminence dates back to 1832, and today that rich history in clinical care is coupled with its national leadership in quality improvement and patient safety initiatives and its dedication to educating and training the next generation of health care professionals. Through investigation and discovery conducted at its Biomedical Research Institute (BRI), BWH is an international leader in basic, clinical and translational research on human diseases, involving more than 900 physician-investigators and renowned biomedical scientists and faculty supported by more than $537 million in funding. BWH is also home to major landmark epidemiologic population studies, including the Nurses' and Physicians' Health Studies and the Women's Health Initiative.

The Eli and Edythe L. Broad Institute of MIT and Harvard was founded in 2003 to empower this generation of creative scientists to transform medicine with new genome-based knowledge. The Broad Institute seeks to describe all the molecular components of life and their connections; discover the molecular basis of major human diseases; develop effective new approaches to diagnostics and therapeutics; and disseminate discoveries, tools, methods and data openly to the entire scientific community. Founded by MIT, Harvard and its affiliated hospitals, and the visionary Los Angeles philanthropists Eli and Edythe L. Broad, the Broad Institute includes faculty, professional staff and students from throughout the MIT and Harvard biomedical research communities and beyond, with collaborations spanning over a hundred private and public institutions in more than 40 countries worldwide.

The Wellcome Trust Sanger Institute, which receives the majority of its funding from the Wellcome Trust, was founded in 1992 as the focus for UK sequencing efforts. The Institute is responsible for the completion of the sequence of approximately one-third of the human genome as well as genomes of model organisms such as mouse and zebrafish, and more than 90 pathogen genomes. In October 2005, new funding was awarded by the Wellcome Trust to enable the Institute to build on its world-class scientific achievements and exploit the wealth of genome data now available to answer important questions about health and disease. These programmes are built around a Faculty of more than 30 senior researchers. The Wellcome Trust Sanger Institute is based in Hinxton, Cambridge, UK.

The European Bioinformatics Institute (EBI) is part of the European Molecular Biology Laboratory (EMBL) and is located on the Wellcome Trust Genome Campus in Hinxton near Cambridge (UK).

The European Molecular Biology Laboratory is a basic research institute funded by public research monies from 20 member states (Austria, Belgium, Croatia, Denmark, Finland, France, Germany, Greece, Iceland, Ireland, Israel, Italy, Luxembourg, the Netherlands, Norway, Portugal, Spain, Sweden, Switzerland and the United Kingdom) and associate member state Australia. Research at EMBL is conducted by approximately 85 independent groups covering the spectrum of molecular biology. The Laboratory has five units: the main Laboratory in Heidelberg, and Outstations in Hinxton (the European Bioinformatics Institute), Grenoble, Hamburg, and Monterotondo near Rome. The cornerstones of EMBL's mission are: to perform basic research in molecular biology; to train scientists, students and visitors at all levels; to offer vital services to scientists in the member states; to develop new instruments and methods in the life sciences and to actively engage in technology transfer activities. Around 190 students are enrolled in EMBL's International PhD programme. Additionally, the Laboratory offers a platform for dialogue with the general public through various science communication activities such as lecture series, visitor programmes and the dissemination of scientific achievements.

NHGRI is one of 27 institutes and centers at the NIH, an agency of the Department of Health and Human Services. The NHGRI Division of Extramural Research supports grants for research and for training and career development at sites nationwide. Additional information about NHGRI can be found at its Web site.

The National Institutes of Health - "The Nation's Medical Research Agency" - includes 27 institutes and centers, and is a component of the U.S. Department of Health and Human Services. It is the primary U.S. federal agency for conducting and supporting basic, clinical and translational medical research, and it investigates the causes, treatments and cures for both common and rare diseases.

The National Center for Biotechnology Information (NCBI) creates public databases in molecular biology, conducts research in computational biology, develops software tools for analyzing molecular and genomic data, and disseminates biomedical information, all for the better understanding of processes affecting human health and disease. NCBI is a division of the National Library of Medicine, the world's largest library of the health sciences.
