New York University and the American Museum of Natural History have received a $1.6 million grant from the National Science Foundation to explore plant evolution and to create a public database that provides information about the structure and inferred function of proteins found in two plant genomes. The three-year grant will allow the researchers at both institutions to investigate ground-breaking methods for exploring the evolution, structure, and function of proteomes--the entire array of proteins expressed by a genome.
Richard Bonneau and Michael Purugganan, biologists from NYU's Center for Genomics and Systems Biology, along with Rod DeSalle, curator from the Sackler Institute for Comparative Genomics at AMNH, will focus on the model species Arabidopsis thaliana (the most widely studied plant model system, and the first sequenced plant) and the rice plant Oryza sativa. For these species, the researchers will combine information about the structure of proteins with information about the evolution of those proteins. By mapping information about the evolutionary importance of parts of genes onto their predictions about how the proteins encoded by those genes fold into 3D shapes, they hope to glean critical insights into what the many thousands of proteins in these genomes are actually doing. The work will result in a novel resource for other scientists working on several types of species.
The project will rely on bioinformatics--the use of algorithms as well as computational and statistical techniques to conduct research in biology--in achieving these aims. It will be carried out on the World Community Grid (wcgrid.org), an IBM-backed computing platform composed of more than 400,000 volunteers from all over the world (anyone can volunteer by going to wcgrid.org and downloading the grid agent, which will then fold proteins when you are not using your computer). The grid employs an infrastructure developed by IBM and the contributions of its volunteers to embark on genome-wide structure predictions in a cost-effective way. NYU biologists, headed by Bonneau, who holds an appointment in NYU's Courant Institute of Mathematical Sciences, have already obtained structure predictions for more than 150 genomes using the grid.
Proteins are synthesized in cells as long polymers that fold to form three-dimensional shapes critical for their function; knowledge of the three-dimensional structure of proteins can be crucial for inferring their specific function. The researchers will use multiple state-of-the-art methods for predicting protein structure from these protein sequences. These methods include "Rosetta," a computer program NYU researchers have previously used in predicting protein structure.
These methods will be especially useful for annotating the large fraction of proteins in plant genomes whose functions are currently unknown, the majority of which do not have any annotation of 3D folded structure--that is, no detectable similarity to another protein with a known structure. In this case, annotating the proteins effectively means translating the gene sequence into predictions about the gene's function in the cell.
The project will be linked with a continuing education program for high school teachers at NYU's Steinhardt School of Culture, Education, and Human Development to train teachers to incorporate bioinformatics into high school science curricula.