image: (L) Senior co-corresponding author M. Madan Babu, PhD, FRS, St. Jude Senior Vice President of Data Science, Chief Data Scientist, Center of Excellence for Data-Driven Discovery director and Department of Structural Biology member and (R) first and co-corresponding author Benjamin Lang, PhD, formerly of the St. Jude Department of Structural Biology.
Credit: Courtesy of St. Jude Children's Research Hospital
(MEMPHIS, Tenn. – November 11, 2025) The shape, or structure, of proteins determines how they work, making it essential for researchers to have access to high-quality structural models. St. Jude Children’s Research Hospital investigators today announce AlphaSync, a free database that improves upon existing protein structure prediction resources through continuous updating. In the fast-moving fields of biomedical research, structural biology and protein science, new protein sequence information is constantly generated, allowing better prediction of protein structures. AlphaSync maintains a database of 2.6 million predicted protein structures across hundreds of species, updating as soon as new or modified sequences are available. The database’s capabilities, including additional generated data, were published in Nature Structural & Molecular Biology.
It takes considerable effort to determine a protein’s structure, and prediction tools have historically been limited in their accuracy. However, in 2021, an approach called AlphaFold2 applied machine learning to enable high-accuracy structure predictions of proteins based on the sequence of their building blocks, amino acids. This resource super-charged structural biology, giving new insights into how proteins function and how mutations contribute to disease. In 2022, the AlphaFold Protein Structure Database was launched, providing predictions for nearly all catalogued protein sequences known to science at the time. However, because it does not automatically update when new protein sequences are discovered, nor when an existing sequence is corrected based on new data, the quality of the predicted models can decrease over time, leading to out-of-date structures and potentially cascading errors.
“In a rapidly evolving scientific landscape, having access to the most current and detailed information on protein structural models is essential for breakthroughs in medicine and biology,” said senior co-corresponding author M. Madan Babu, PhD, FRS, St. Jude Senior Vice President of Data Science, Chief Data Scientist, Center of Excellence for Data-Driven Discovery director and Department of Structural Biology member. “With AlphaSync, we ensure predicted protein structures stay continuously updated and enriched with key information such as amino acid interaction networks, surface accessibility and disorder status so that researchers can move from sequence to insight faster than ever before.”
“AlphaSync performs an important job in keeping all of these predicted structures updated,” said first and co-corresponding author Benjamin Lang, PhD, formerly of the St. Jude Department of Structural Biology. “The AlphaSync database ensures that the structure you are looking at matches the sequence of the protein you are working with.”
AlphaSync empowers researchers with up-to-date protein predictions and additional data
AlphaSync updates its information using the latest data from UniProt, the largest database of protein sequences. It checks UniProt for new or modified sequences, then runs structure predictions for the proteins whose sequence information is new or has changed. When the researchers first performed this task, they found a backlog of 60,000 structures that were outdated, including 3% of human proteins.
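The check described above can be illustrated with a minimal Python sketch. It is not AlphaSync's actual pipeline: the function names, the use of a SHA-256 fingerprint and the toy accessions are all assumptions made for illustration. The idea is simply to compare each current sequence against a fingerprint of the sequence that was used for the stored prediction and to flag anything new or changed.

```python
import hashlib


def sequence_digest(sequence: str) -> str:
    """Return a stable fingerprint for an amino acid sequence (hypothetical helper)."""
    normalized = sequence.strip().upper()
    return hashlib.sha256(normalized.encode("ascii")).hexdigest()


def find_stale_entries(current_sequences: dict[str, str],
                       predicted_digests: dict[str, str]) -> list[str]:
    """List accessions that need a new structure prediction.

    current_sequences: accession -> latest sequence from the sequence database.
    predicted_digests: accession -> digest of the sequence used for the stored model.
    An accession is stale if it has no stored model or its sequence has changed.
    """
    stale = []
    for accession, sequence in current_sequences.items():
        if predicted_digests.get(accession) != sequence_digest(sequence):
            stale.append(accession)
    return sorted(stale)


# Toy data with made-up accessions: P2 was corrected, P3 is newly added.
current = {"P1": "MKT", "P2": "MAAA", "P3": "MGG"}
predicted = {"P1": sequence_digest("MKT"), "P2": sequence_digest("MAAV")}
print(find_stale_entries(current, predicted))
```

Only the flagged accessions would then be sent for structure prediction, which keeps the expensive step limited to proteins that actually changed.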
“To establish AlphaSync, we ran a massive set of structure predictions that required enormous computational power,” Lang said. “Now, all of the data that we’ve collected in the database, and our ongoing efforts, enable scientists to look at important sites within proteins from over 200 species and be confident they reflect the latest experimental evidence and sequence information.”
In addition to updating structures, the database provides pre-computed data and other ease-of-use features. This pre-computed data includes residue interaction networks (i.e., which amino acids contact each other), surface accessibility (i.e., whether an amino acid is accessible or not) and conformational state (i.e., whether the amino acid is in a structured or unstructured region). The scientists also changed the data’s format to make it easier for researchers to use.
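To give a sense of what a residue interaction network is, the following Python sketch derives contacts from 3D coordinates. It is a simplification under stated assumptions, not the method used by AlphaSync: it treats two residues as in contact when their alpha-carbon atoms lie within 8 Å of each other, a common rule of thumb, and the function name and toy coordinates are hypothetical.

```python
import math
from itertools import combinations


def residue_contacts(ca_coords, cutoff=8.0):
    """Return pairs of 1-based residue positions whose C-alpha atoms are within `cutoff` Å.

    ca_coords: list of (x, y, z) tuples, one per residue, in sequence order.
    The 8 Å default is an assumed threshold chosen for illustration.
    """
    contacts = []
    for (i, a), (j, b) in combinations(enumerate(ca_coords, start=1), 2):
        if math.dist(a, b) <= cutoff:
            contacts.append((i, j))
    return contacts


# Toy chain: three residues spaced 3.8 Å apart and one far away.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
print(residue_contacts(coords))  # [(1, 2), (1, 3), (2, 3)]
```

Each returned pair is one edge of the network; the distant fourth residue has no edges.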
“3D structural information is quite a complex format, so we broke it down further into a simpler 2D tabular format, which we hope will enable more insight into individual proteins,” Lang said. “In addition, this tabular format is easier for downstream machine learning applications, which will help future biomedical research projects find and understand disease mechanisms.”
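A tabular layout of this kind might look like one row per amino acid. The Python sketch below is an illustration only: the column names, the accession "P00000" and the annotation values are invented and do not reflect the actual AlphaSync schema.

```python
import csv
import io


def residues_to_table(accession, sequence, accessible, disordered):
    """Build one row per residue from per-position annotations (hypothetical columns)."""
    rows = []
    for position, (aa, acc, dis) in enumerate(zip(sequence, accessible, disordered), start=1):
        rows.append({
            "accession": accession,
            "position": position,
            "residue": aa,
            "accessible": acc,   # True if the residue is on the surface
            "disordered": dis,   # True if the residue is in an unstructured region
        })
    return rows


def table_to_csv(rows):
    """Serialize the rows as CSV text, using the first row's keys as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up three-residue example.
rows = residues_to_table("P00000", "MKT", [True, False, True], [True, False, False])
print(table_to_csv(rows))
```

Because every row has the same columns, a table like this can be loaded directly into data analysis or machine learning tools without parsing 3D structure files.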
“AlphaSync provides high-quality predicted protein structures along with detailed, amino acid–level information in a user-friendly format, making it easy for researchers to explore and analyze,” Babu said. “We hope it not only minimizes structural and sequence inaccuracies from propagating but also enhances our understanding of proteins relevant to human disease, ultimately accelerating the development of better treatments and cures.”
Access AlphaSync at: https://alphasync.stjude.org/.
Authors and funding
The study’s other authors are Bálint Mészáros, Besian Sejdiu and Jaimin Patel, all of St. Jude.
The study was supported by grants from the American Lebanese Syrian Associated Charities (ALSAC), the fundraising and awareness organization of St. Jude.
St. Jude Media Relations Contact
Michael Sheffield
Desk: (901) 595-0221
Cell: (901) 379-6072
michael.sheffield@stjude.org
media@stjude.org
St. Jude Children's Research Hospital
St. Jude Children’s Research Hospital is leading the way the world understands, treats, and cures childhood catastrophic diseases. From cancer to life-threatening blood disorders, neurological conditions, and infectious diseases, St. Jude is dedicated to advancing cures and means of prevention through groundbreaking research and compassionate care. Through global collaborations and innovative science, St. Jude is working to ensure that every child, everywhere, has the best chance at a healthy future. To learn more, visit stjude.org, read St. Jude Progress, a digital magazine, and follow St. Jude on social media at @stjuderesearch.
Journal: Nature Structural & Molecular Biology
Method of Research: Experimental study
Subject of Research: Not applicable
Article Title: AlphaSync is an enhanced AlphaFold structure database synchronized with UniProt
Article Publication Date: 11-Nov-2025