Named Phytome, the unique library is a compilation of voluminous genetic data on 39 plant species. The list includes almost all the world's most valuable crops, among them rice, wheat, corn and potatoes.
"But it's also much more than just a repository of genetic information," said Dr. Todd J. Vision, assistant professor of biology in UNC's College of Arts and Sciences.
"It allows plant researchers to ask complex questions that involve comparisons across different genes and species, such as 'what genes with known function are related to this gene with unknown function?' or 'what biochemical functions have been gained or lost in different species,'" Vision said.
Answers to those kinds of questions will lead to plants that yield more food and better resist damage from diseases and insects, he said. They can also lead to better and cheaper medicines and plant products such as cotton, paper and rubber.
"A persistent challenge of genomics is how to capture, analyze and distribute the massive amounts of data being churned out at record levels from ongoing genome projects," he said. "Creating useful and accessible tools like this is critical to the field. Even if researchers have access to all the genomic data and all the methods necessary to analyze it, it takes a lot of human and computer effort to put those two things together in one user-friendly package like Phytome."
Phytome went online at www.phytome.org in the fall after two years of work and is already being used by basic and applied scientists worldwide. The first version of the database contains information on more than 730,000 unique protein sequences in more than 25,000 protein families.
Analyzing that data required 460 days of computer processing time, but parallel computing condensed the work into a few weeks, Vision said.
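To give a rough sense of the arithmetic behind that speedup, the sketch below estimates how many processors it would take to fit 460 days of processing into a fixed number of calendar days. Only the 460-day figure comes from the project; the three-week window, the efficiency value and the function name `workers_needed` are illustrative assumptions, not details of how the Phytome analysis was actually run.

```python
import math


def workers_needed(cpu_days: float, wall_days: float, efficiency: float = 1.0) -> int:
    """Smallest number of equally fast processors that completes `cpu_days`
    of serial work within `wall_days`, at a parallel efficiency in (0, 1]."""
    if cpu_days <= 0 or wall_days <= 0:
        raise ValueError("cpu_days and wall_days must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    # Each processor contributes wall_days * efficiency days of useful work.
    return math.ceil(cpu_days / (wall_days * efficiency))


CPU_DAYS = 460          # processing time reported for the analysis
ASSUMED_WALL_DAYS = 21  # hypothetical: "a few weeks" taken as three weeks

# Ideal case: the work divides perfectly across processors.
ideal = workers_needed(CPU_DAYS, ASSUMED_WALL_DAYS)

# Assumed 80% efficiency to account for scheduling and communication overhead.
lossy = workers_needed(CPU_DAYS, ASSUMED_WALL_DAYS, efficiency=0.8)

print(f"Processors needed at perfect efficiency: {ideal}")
print(f"Processors needed at 80% efficiency: {lossy}")
```

Under these assumptions, a cluster of a few dozen processors would be enough; a shorter window or lower efficiency raises the count proportionally.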
Currently, scientists know the complete genetic makeup of only a handful of plants, although humans are economically dependent on dozens of different ones, he said. Characterizing the genomes of all of them experimentally, however, would be far too expensive and time-consuming.
"Many plants are impractical to work with experimentally because they have genomes full of material that serves no known function," Vision said. "For example, the lily genome is roughly 40 times the size of the human genome."
Nevertheless, the gene content of lilies can be predicted reasonably well by looking at a related plant with a small genome, like rice, he said. To breed better crops, it's important to be able to leverage the information from the tractable model systems that scientists already know so much about.
"We will continue to add new features to Phytome over the next few years," Vision said. "Perhaps the most important will be the ability to compare the genetic maps of multiple species simultaneously and predict the gene content in regions of plant genomes that have not yet been deciphered."
Besides Vision, others involved in the development of Phytome are Dr. Stefanie Hartmann, a postdoctoral researcher in biology; Dihui Lu, a graduate student in information and library science; and computer programmer Jason Phillips.
The National Science Foundation is supporting the project with a five-year, $1 million grant it awarded to UNC in 2002.
"This is a unique resource for scientists trying to understand the genes contributing to variation in traits of economic importance in crops," Vision said.