Scientists from the University of Chicago and Johns Hopkins University have developed a new technique that promises to significantly enhance the rate of novel-gene discovery, a process that becomes increasingly difficult as the Human Genome Project moves closer to completion.
"We have identified and corrected a fundamental flaw in the current genomewide gene studies," note the authors in the April 11 issue of the Proceedings of the National Academy of Sciences. The technique they developed "provides a much higher degree of novel gene identification than the current approaches," adds lead author San Ming Wang, Ph.D., assistant professor of medicine at the University of Chicago.
Although about two thirds (more than 90,000) of the estimated 140,000 human genes have already been identified, finding the final third could be far more challenging. These elusive, as yet undiscovered genes tend to be expressed at low levels, only in certain cell types, or only during specific developmental stages or under specific growth conditions. Yet these genes may play important roles in normal processes or in the development or prevention of various diseases.
To speed the search for these unknown genes, scientists use a "subtraction" method that removes the bulk of ubiquitous, already-identified genes from a sample pool of genetic material. What remains contains a higher percentage of the undiscovered genes, those expressed less frequently or at low levels, which increases the probability of discovering them.
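The enrichment that subtraction aims for can be illustrated with simple arithmetic. The sketch below is only a toy calculation: the copy numbers and the 95 percent removal efficiency are assumed values chosen for illustration, not figures from the study, and the function names are hypothetical.

```python
def rare_fraction(common_copies, rare_copies):
    """Share of the pool made up of rare, not-yet-identified transcripts."""
    return rare_copies / (common_copies + rare_copies)


def subtract(common_copies, rare_copies, removal_efficiency):
    """Idealized subtraction: remove a fraction of the common, known
    transcripts and leave every rare transcript untouched."""
    remaining_common = common_copies * (1 - removal_efficiency)
    return remaining_common, rare_copies


# Assumed toy pool: 9,900 copies of known genes, 100 copies of unknown genes.
before = rare_fraction(9900, 100)

# Assumed removal efficiency of 95 percent for the known genes.
common_after, rare_after = subtract(9900, 100, 0.95)
after = rare_fraction(common_after, rare_after)

print(f"rare share before subtraction: {before:.1%}")
print(f"rare share after subtraction:  {after:.1%}")
```

In this idealized picture the rare transcripts go from about 1 percent of the pool to roughly 17 percent, so a randomly picked molecule is far more likely to represent an undiscovered gene. The calculation assumes that subtraction leaves the rare transcripts untouched.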
However, this subtraction method often backfires: many of these unknown genes disappear after the treatment. The authors have worked out how and why this occurs.
Genes are located in DNA. They are expressed as messenger RNA from which protein is made. This messenger RNA is the target for gene discovery.
In order to identify genes, the messenger RNA needs to be converted back into DNA molecules called complementary DNA. The subtraction method is then applied to remove the known, highly expressed genes in order to identify the unknown genes.
Almost every messenger RNA contains a natural long "tail" -- a strand of hundreds of adenosine bases located on one end. The conventional method, in use for decades, converts each messenger RNA into a DNA molecule that also includes a long "tail" copied from the messenger RNA.
During the subtraction reaction, tangled hybrids form randomly between these long tails of unrelated DNA molecules. Because all the hybrids are removed after subtraction, many unknown genes in the sample pool become victims: they are inadvertently removed before they have a chance to be discovered. Genes expressed at low levels are particularly affected by this phenomenon.
The authors of the PNAS paper devised a method to truncate these troublesome tails, reducing them from hundreds of adenosine bases to tens. They showed that, after subtraction, truncating the tails led to the retention of 1.4 to 7.8 times as many copies for four of five colon-specific genes expressed at low levels.
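Why shorter tails help can be sketched with a toy expected-value model. This is an illustration only, not the authors' analysis: it assumes each copy of an unrelated gene has some chance of being caught in a tail-to-tail hybrid and removed, and that this chance is lower when the tail is short. Both loss probabilities and the copy number below are hypothetical values.

```python
def surviving_copies(copies, tail_loss_probability):
    """Expected copies of an unrelated gene left after subtraction, if each
    copy is independently lost to a tail hybrid with the given probability."""
    return copies * (1 - tail_loss_probability)


# Hypothetical loss probabilities, chosen only for illustration.
LONG_TAIL_LOSS = 0.80   # tails of hundreds of bases (assumed value)
SHORT_TAIL_LOSS = 0.20  # tails truncated to tens of bases (assumed value)

rare_gene_copies = 10   # a gene expressed at a low level (assumed value)

long_tail = surviving_copies(rare_gene_copies, LONG_TAIL_LOSS)
short_tail = surviving_copies(rare_gene_copies, SHORT_TAIL_LOSS)
fold = short_tail / long_tail

print(f"expected copies retained with long tails:  {long_tail:.1f}")
print(f"expected copies retained with short tails: {short_tail:.1f}")
print(f"fold improvement from truncation:          {fold:.1f}")
```

With only a handful of copies to start with, a gene expressed at a low level can fall to one or two copies, or to none, when the loss probability is high. That is consistent with the observation that such genes suffer most from the tail problem.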
When they applied their method to fish out genes from a previously well-characterized sample of colon cells, they found that many unknown genes were still hidden within that sample and that these genes could be identified with the new method.
"This could make a real difference in the increasingly difficult process of cataloguing the human genes," said Janet Rowley, M.D., Blume-Reise Distinguished Service Professor in the departments of medicine, molecular genetics & cell biology, and human genetics at the University of Chicago and director of the study. "It could also simplify the postgenome process of learning what these less common genes do and when they are expressed."
Additional authors of the study include Scott Fears and Jian-Jun Chen of the University of Chicago and Lin Zhang of Johns Hopkins.
This work was supported by the National Cancer Institute, the American Cancer Society and the G. Harold and Lelia Y. Mathers Foundation.