Large areas of medically important genes fall within troublesome regions of the human genome, where it is currently difficult to obtain accurate sequence information, according to research published in the open access journal Genome Medicine. On average, one fifth of each of these medically important genes is challenging for today's gene sequencing methods to decipher, and the information in these gene regions may be key to a patient's diagnosis or treatment plan.
To optimize medical care, an accurate account of each patient's genetic code is needed to predict risk for disease and to select appropriate medication. The study by researchers from Stanford University highlights the medical consequences of sequencing errors.
Such errors include false positives (identifying genetic mutations that aren't really there) as well as false negatives (failing to detect legitimate disease-causing mutations). Both can have profound consequences for patient care. For example, a false positive mutation in BRCA2, a well-known gene associated with hereditary breast and ovarian cancer, could lead a patient to undergo risk-reducing surgeries, such as double mastectomy and oophorectomy, that are both radical and unnecessary.
The Stanford team used a gold-standard genome sequence, provided by the US National Institute of Standards and Technology (NIST). This genome, belonging to a female of European ancestry, had been previously sequenced with five different sequencing technologies. The NIST team combined the results from all five technologies to develop a reliable consensus sequence in regions of the genome where the technologies agreed. A reliable consensus was achieved for just 77% of this donor's genome.
Looking at how these "high confidence" areas of the donor's genome overlap with 3,300 genes known to cause human disease, the researchers found that for 593 of these genes, less than half of the crucial protein-coding regions are in areas that can reliably be sequenced.
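The overlap comparison described above can be illustrated with a minimal sketch. It assumes each gene's protein-coding regions and the high-confidence regions are given as half-open (start, end) intervals on the same chromosome. The function names and the toy coordinates are hypothetical, chosen only for illustration, and this is not the pipeline the Stanford team used.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single chromosome


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def covered_fraction(coding: List[Interval], confident: List[Interval]) -> float:
    """Fraction of coding bases that fall inside high-confidence regions."""
    coding = merge_intervals(coding)
    confident = merge_intervals(confident)
    total = sum(end - start for start, end in coding)
    if total == 0:
        return 0.0
    covered = 0
    for c_start, c_end in coding:
        for h_start, h_end in confident:
            overlap = min(c_end, h_end) - max(c_start, h_start)
            if overlap > 0:
                covered += overlap
    return covered / total


# Toy data (made up): two exons of 100 bases each, one confident region.
exons = [(100, 200), (300, 400)]
high_confidence = [(150, 350)]
fraction = covered_fraction(exons, high_confidence)
print(f"{fraction:.0%} of coding bases are in high-confidence regions")
```

In this toy case half of the coding bases are covered. A gene whose fraction falls below 0.5 would be counted among those for which less than half of the protein-coding sequence can be reliably sequenced.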
There is a group of 56 disease genes regarded as most medically "actionable" by the American College of Medical Genetics and Genomics (ACMG), including BRCA2. ACMG guidelines now require clinical genetic testing labs to screen all patients undergoing exome or genome sequencing for disease-causing mutations in these 56 genes, which are involved in treatable conditions ranging from hereditary cancer to life-threatening cardiac arrhythmias. A patient might initially undergo sequencing to identify the cause of their autism, for example, yet would also be informed of an incidental finding in BRCA2, with the goal of predicting or even preventing disease.
Yet for these medically important genes, the Stanford researchers found that only 80% of each gene's protein-coding regions, on average, can be sequenced with confidence.
This study also shows that the majority of disease-causing mutations identified to date fall within easy-to-sequence areas. Specifically, among disease-causing mutations catalogued in the database ClinVar, more than 80% fall within high-confidence regions of the NIST genome. Furthermore, the overwhelming majority of these ClinVar mutations (greater than 98%) are in stretches of unique DNA sequence, long known to be easier to sequence.
These findings highlight the need for sequencing methods that better penetrate hard-to-sequence regions of the genome, accurately revealing disease-causing mutations there that may currently be obscured.
Lead author Rachel Goldfeder, from Stanford University, says, "As this technology moves from the research lab to the clinic, we need to be able to accurately and reliably sequence entire genomes, because incorrect sequence information can lead to inappropriate medical care. The good news is that, in this case, 77% of the donor's genome was reliably sequenced using current methods. The challenge now is to focus our efforts on the other 23%--namely, on regions of the genome that remain elusive. Only then can we realize the full potential of precision medicine."
T: +44 (0)20 3192 2054
Notes to editor:
1. Medical implications of technical accuracy in genome sequencing
Rachel L. Goldfeder, James R. Priest, Justin M. Zook, Megan Grove, Daryl Waggott, Matthew Wheeler, Marc Salit, Euan Ashley
Genome Medicine 2016
During the embargo period, please contact Alanna Orpen for a copy of the article.
After the embargo lifts, the article will be available at the journal website here:
Please name the journal in any story you write. If you are writing for the web, please link to the article. All articles are available free of charge, according to BioMed Central's open access policy.
2. An open access journal at the cutting edge of genomic and high-throughput technologies, medical discovery, and clinical application, Genome Medicine publishes high quality peer-reviewed articles of broad interest. Current areas of focus include precision medicine, novel methods and software, disease genomics and epigenomics, immunogenomics, infectious disease, microbiome, and systems medicine.
3. BioMed Central is an STM (Science, Technology and Medicine) publisher which has pioneered the open access publishing model. All peer-reviewed research articles published by BioMed Central are made immediately and freely accessible online, and are licensed to allow redistribution and reuse. BioMed Central is part of Springer Nature, a major new force in scientific, scholarly, professional and educational publishing, created in May 2015 through the combination of Nature Publishing Group, Palgrave Macmillan, Macmillan Education and Springer Science+Business Media. http://www.