Sequencing the first tree genome
The poplar was the first tree to have its genome sequenced.
In 2004 researchers from around the world finished sequencing the complete genome of Populus, the first tree and the third plant to have its molecular "parts list" revealed. Jerry Tuskan of ORNL's Environmental Sciences Division, who led a group that played important roles in the international effort, says the sequenced genome will bolster researchers' chances of answering several important questions. For example, "What makes a tree a tree?"
Researchers have sequenced the complete genomes of two other plants, which are neither trees nor perennial species. One plant is rice and the other, Arabidopsis, is an herbaceous weed. Comparison of the genomes of the three plants is expected to provide some answers.
Studying the genome of Populus, the genus that includes hybrid poplars, cottonwoods, and aspens, could enable scientists to address some questions of interest to the Department of Energy's Office of Biological and Environmental Research. The office funded ORNL's research effort in support of the International Populus Genome Consortium (IPGC).
These questions might be: How do individual genes influence the growth of trees, their adaptation to the natural environment, the functioning of the forest ecosystem, and its response to climate change? Can poplar trees be designed to promote storage of carbon in the soil for longer times by fixing it into a chemical form that resists microbial degradation, thus enhancing carbon sequestration and slowing the buildup of atmospheric carbon dioxide? Can poplar trees be designed to grow faster and produce higher-quality wood for building products, as well as more biomass that can be converted to liquid biofuels with higher energy content?
"Populus was selected as the first tree genome to sequence for several reasons," Tuskan says. "The genome is small, it is easy to clone, a lot of genetic information is available on this species, and a lot of scientists have studied it. The genome is a model perennial woody plant, is fast growing, and has several uses of interest to DOE and the forest industry."
A group of researchers in ORNL's Environmental Sciences Division worked on the Populus genome for almost two years. Tuskan served as the point of contact for the IPGC and the three groups annotating the sequence, including the group led by Frank Larimer in ORNL's Life Sciences Division.
"We developed a genetic map of the Populus genome and identified 1300 simple sequence repeats, which are important DNA markers, in the map," Tuskan says. "Of 365 million DNA bases in the genome, we linked 265 megabases to the genetic map. Our second contribution was to help IPGC computational biologists 'train' gene-calling algorithms so they can identify genes in the Populus genome. We sequenced about 500 full-length cDNAs--expressed DNA sequences--and sent them to the three annotation labs."
These labs--Larimer's group in Oak Ridge, DOE's Joint Genome Institute in California, and the University of Ghent in Belgium--are developing gene-calling algorithms and training them on poplar-specific genomic characteristics. About 95 to 98% of the expressed part of the genome--the part that contains genes--has been sequenced.
"If all three models from these labs predict that a particular sequence is a gene, we will have pretty high confidence that it is, in fact, a gene," Tuskan says. "If only one group predicts that a sequence has function, we may be more skeptical about whether it's a gene.
"We think the number of genes in the Populus genome will probably range from 30,000 to 35,000 genes. The process will take a decade or more of research to understand what each gene does."
What makes a tree a tree? Tuskan says researchers are already finding hints as they compare plant genomes. One answer may lie in the regulatory elements that control the expression of the structural genes that code for certain enzymes.
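The agreement rule Tuskan describes for the three annotation labs (high confidence when all three models call a sequence a gene, skepticism when only one does) can be illustrated with a small sketch. This is not the consortium's actual pipeline: the function name, the lab labels, the sequence IDs, and the intermediate "medium" tier for two-lab agreement are assumptions made here for illustration.

```python
from collections import Counter


def consensus_confidence(predictions):
    """Label each candidate sequence by how many labs called it a gene.

    predictions: dict mapping a lab label to the set of sequence IDs
    that lab's model predicts to be genes (hypothetical input format).
    """
    votes = Counter()
    for called in predictions.values():
        votes.update(called)

    n_labs = len(predictions)
    labels = {}
    for seq_id, n in votes.items():
        if n == 1:
            labels[seq_id] = "low"      # only one group predicts a gene
        elif n == n_labs:
            labels[seq_id] = "high"     # every model agrees
        else:
            labels[seq_id] = "medium"   # assumed tier; not stated in the article
    return labels


# Hypothetical calls from three labs
calls = {
    "lab_a": {"seq1", "seq2", "seq3"},
    "lab_b": {"seq1", "seq2"},
    "lab_c": {"seq1"},
}
labels = consensus_confidence(calls)
```

Here `seq1` is called by all three labs, `seq2` by two, and `seq3` by one, so they fall into the high, medium, and low tiers respectively.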