News Release 13-May-2014

Protein Data Bank: 100,000 structures

Business Announcement

Rutgers University

Image: Number of structures available in the PDB per year through May 14, 2014, with selected examples. Early structures included myoglobin (1; PDB ID 1mbn), the first structure solved by X-ray crystallography, and small enzymes (2; top: 4pti, bottom right: 2cha, bottom left: 3cpa). As technologies developed, the archive grew to host examples of tRNA (3; 6tna), viruses (4; 4rhv), antibodies (5; 1igt), protein-DNA complexes (6; top to bottom, 1j59, 1tro, 2bop, 1aoi), ribosomes (7; 1fjg, 1fka, 1ffk), and chaperones (8; 1aon).
Figure References: Myoglobin (PDB ID 1mbn): J. C. Kendrew, G. Bodo, H. M. Dintzis, R. G. Parrish, H. Wyckoff, D. C. Phillips. (1958) A three-dimensional model of the myoglobin molecule obtained by x-ray analysis. Nature 181: 662-666; J. C. Kendrew, R. E. Dickerson, B. E. Strandberg, R. G. Hart, D. R. Davies, D. C. Phillips, V. C. Shore. (1960) Structure of myoglobin: A three-dimensional Fourier synthesis at 2 A. resolution. Nature 185: 422-427; Enzymes (top: 4pti): M. Marquart, J. Walter, J. Deisenhofer, W. Bode, R. Huber. (1983) The geometry of the reactive site and of the peptide groups in trypsin, trypsinogen and its complexes with inhibitors. Acta Crystallogr B39: 480; (bottom left: 3cpa): D. W. Christianson, W. N. Lipscomb. (1986) X-ray crystallographic investigation of substrate binding to carboxypeptidase A at subzero temperature. Proc Natl Acad Sci U S A 83: 7568-7572; (bottom right: 2cha): J. J. Birktoft, D. M. Blow. (1972) Structure of crystalline alpha-chymotrypsin. V. The atomic structure of tosyl-alpha-chymotrypsin at 2 Å resolution. J Mol Biol 68: 187-240; tRNA (6tna): J. L. Sussman, S. R. Holbrook, R. W. Warrant, G. M. Church, S.-H. Kim. (1978) Crystal structure of yeast phenylalanine transfer RNA. I. Crystallographic refinement. J Mol Biol 123: 607-630; Virus (4rhv): E. Arnold, M. G. Rossmann. (1988) The use of molecular-replacement phases for the refinement of the human rhinovirus 14 structure. Acta Crystallogr A 44 (Pt 3): 270-282; Antibody (1igt): L. J. Harris, S. B. Larson, K. W. Hasel, A. McPherson. (1997) Refined structure of an intact IgG2a monoclonal antibody. Biochemistry 36: 1581-1597; Protein-DNA

Credit: wwPDB

With data centers in the US, UK, and Japan, the Worldwide Protein Data Bank (wwPDB) organization announces that the Protein Data Bank archive now contains more than 100,000 entries.

Established in 1971, this central, public archive of experimentally determined protein and nucleic acid structures has reached a critical milestone thanks to the efforts of structural biologists throughout the world.

Four wwPDB data centers support online access to three-dimensional structures of biological macromolecules that help researchers understand many facets of biomedicine, agriculture, and ecology, from protein synthesis to health and disease to biological energy.

Function follows form

In the 1950s, scientists had their first direct look at the structures of proteins and DNA at the atomic level. Determination of these early three-dimensional structures by X-ray crystallography ushered in a new era in biology, one driven by the intimate link between form and biological function. As the value of archiving and sharing these data was quickly recognized by the scientific community, the Protein Data Bank (PDB) was established in 1971 by an international collaboration as the first open-access digital resource in all of biology, with data centers located in the US and the UK.

Among the first structures deposited in the PDB were those of myoglobin and hemoglobin, two oxygen-binding molecules whose structures were elucidated by Chemistry Nobel Laureates John Kendrew and Max Perutz. With this week's regular update, the PDB welcomes 219 new structures into the archive. These structures join others vital to drug discovery, bioinformatics, and education, for a total of 100,147 entries.

The PDB is growing rapidly: it has doubled in size since 2008 and releases around 200 new structures to the scientific community every week. The resource is accessed hundreds of millions of times annually by researchers, students, and educators intent on exploring how different proteins are related to one another, clarifying fundamental biological mechanisms, and discovering new medicines.

"The PDB is a critical resource for the international community of working scientists, which includes everyone from geneticists to pharmaceutical companies interested in drug targets," said Nobel Laureate Venki Ramakrishnan of the MRC Laboratory of Molecular Biology in Cambridge, UK.

A growing community

Since its inception, the PDB has been a community-driven enterprise, evolving into a mission-critical international resource for biological research. Since 2003, the Worldwide PDB (wwPDB) organization, a collaboration involving four PDB data centers in the US, UK, and Japan, has ensured that these valuable data are securely stored, expertly managed, and made freely available for the benefit of scientists and educators around the globe. wwPDB data centers work closely with community experts to define deposition and annotation policies, resolve data representation issues, and implement community validation standards. In addition, the wwPDB works to raise the profile of structural biology with increasingly broad audiences.

Each structure submitted to the archive is carefully curated by wwPDB staff before release. New depositions are checked and enhanced with value-added annotations and linked with other important biological data to ensure that PDB structures are discoverable and interpretable by users with a wide range of backgrounds and interests.

Future challenges

The scientific community eagerly awaits the next 100,000 structures and the invaluable knowledge these new data will bring. However, two developments pose major challenges for the management and representation of structural data: the increasing number, size, and complexity of biological data being deposited in the PDB, and the emergence of hybrid structure determination methods, which use a variety of biophysical, biochemical, and modelling techniques to determine the shapes of biologically relevant molecules. The wwPDB will continue to work with the community to meet these challenges and ensure that the archive maintains the highest possible standards of quality, integrity, and consistency.

###

About the wwPDB

The wwPDB is the international partnership of four data centers that manage the PDB archive. Its mission is to maintain a single archive of macromolecular structural data that is freely and publicly available to the global community. In the USA, it consists of the Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB; http://rcsb.org), based at Rutgers, The State University of New Jersey, and at the San Diego Supercomputer Center (SDSC) and Skaggs School of Pharmacy and Pharmaceutical Sciences at the University of California San Diego, together with BioMagResBank (BMRB; http://bmrb.wisc.edu) at the University of Wisconsin. The other two members are the Protein Data Bank in Europe (PDBe; http://pdbe.org) at the EMBL European Bioinformatics Institute and the Protein Data Bank Japan (PDBj; http://pdbj.org) at Osaka University.

The RCSB PDB receives funds from the NSF, NIH, and DOE. BioMagResBank is funded by the NLM.

The PDBe receives funding from EMBL, the Wellcome Trust, NIH, EU, BBSRC, and MRC. PDBj is funded by the Japan Science and Technology Agency.
