The database, developed by researchers at the University of Pennsylvania, is described in a paper in the Oct. 3 issue of Nature, which is devoted to the sequencing of the Plasmodium genome. More than a million people die of malaria each year, and widespread access to information on the parasite's genetic composition should speed the search for new drugs and vaccines to combat the disease.
"The sequencing of Plasmodium falciparum has generated huge amounts of data," said David S. Roos, director of Penn's Genomics Institute, who has spearheaded the Plasmodium database project. "It is important to provide researchers with access to this data as soon as possible and to equip them with tools to transform this data into a useful form."
The Web-based database builds upon sequencing efforts conducted by researchers at the Institute for Genomic Research in Rockville, Md., Stanford University and Britain's Sanger Institute.
"Malaria biologists are a more diverse and dispersed community than those who study fruit fly or yeast genomes," Roos and his co-authors write this week in Nature. "They encompass field scientists in Cameroon, epidemiologists in Papua New Guinea, pharmaceutical developers in India [and] molecular geneticists in Brazil. ... Having the data literally 'in hand' provides scientists everywhere with a sense of ownership and involvement in the Plasmodium genome project, expediting the pace of research and discovery."
Scientists can use PlasmoDB to examine chromosome organization, to scan the genome for genes, to predict the structure of these genes and the function of the proteins they encode, to look for patterns of nucleotides or amino acids, and to search for gene functions analogous to those found in other organisms. Such searches can point to promising targets for drug and vaccine design.
An early version of the PlasmoDB web site -- http://PlasmoDB.
"The purpose of PlasmoDB is not to provide scientists with a single 'right' answer," said Roos, a professor of biology at Penn. "Instead, the database should help researchers filter the overwhelming number of sequences in the genome down to a few genes suitable for experimental analysis -- in short, to let computers do what computers do well, and to let people do what people do well."
Vast as the scale of the malaria parasite genome project may be -- nearly 30 million letters of DNA code -- this information pales in comparison with the data produced by other projects made possible now that the genome sequence is available. Proteomics approaches permit analysis of all proteins made by the parasite as it follows its complicated path from mosquito to human and from the liver to the blood. Transcription profiling allows scientists to examine how and when all the genes in the genome are turned on, such as during exposure to drugs used for chemotherapy.
While Plasmodium falciparum causes the most devastating form of human malaria, several other species also cause the disease, and it is now possible to compare the genomes of many such organisms. Comparison with the human genome sequence permits the identification of parasite-specific differences that may provide an Achilles' heel for drug design.
The Plasmodium database project takes advantage of pioneering work in genome database design conducted in the Computational Biology and Informatics Laboratory at Penn, under the direction of Christian J. Stoeckert and the late G. Christian Overton in the Department of Genetics at Penn's School of Medicine. Other key members of this team include Jonathan Crabtree, Jonathan Schug and Brian Brunk. New data-mining tools specific to the Plasmodium project were developed by Jessica Kissinger, Martin Fraunholz and Bindu Gajria, and the stand-alone CD-ROM version is the brainchild of Jules Milgram, all in Penn's Department of Biology.
Financial support for the Plasmodium database comes from the Burroughs Wellcome Fund and the World Health Organization.