News Release

From brain scans to alloys: Teaching AI to make sense of complex research data

Peer-Reviewed Publication

Penn State

UNIVERSITY PARK, Pa. — Artificial intelligence (AI) is increasingly used to analyze medical images, materials data and scientific measurements, but many systems struggle when real-world data do not match ideal conditions. Measurements collected from different instruments, experiments or simulations often vary widely in resolution, noise and reliability. Traditional machine-learning models typically assume those differences are negligible — an assumption that can limit accuracy and trustworthiness.

To address this issue, Penn State researchers have developed a new artificial intelligence framework with potential implications for fields ranging from Alzheimer's disease research to advanced materials design. The approach, called ZENN and detailed in a study showcased in the Proceedings of the National Academy of Sciences, teaches AI models to recognize and adapt to hidden differences in data quality rather than ignoring them.

ZENN, short for Zentropy-Embedded Neural Networks, was developed by Shun Wang, postdoctoral scholar of mathematics; Wenrui Hao, professor of mathematics and director of the Center for Mathematical Biology in the Huck Institutes of the Life Sciences; Zi-Kui Liu, professor of materials science and engineering; and Shunli Shang, research professor of materials science and engineering.

Zentropy is Liu's advanced theory of entropy, which posits that systems tend to move toward disorder in the absence of energy to maintain order. The theory integrates quantum mechanics, thermodynamics and statistical mechanics into a cohesive predictive model. The researchers built their framework on this foundation, embedding principles from thermodynamics directly into neural networks, a type of AI loosely modeled on how the human brain processes information. That embedding allows the models to distinguish meaningful signals from uncertainty and noise.

“Most machine-learning methods assume that all data is homogeneous,” Hao said. “But real-world data is heterogeneous by nature. If we want AI to be useful for scientific discovery, it must account for that.”

Conventional neural networks are often trained using a mathematical technique called cross-entropy loss to measure how far a model’s predictions are from the correct answers. This approach works well when the training data are clean, reliable and consistent, Hao said.
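The cross-entropy loss mentioned above can be sketched in a few lines of Python. This is a generic textbook illustration of the technique, not code from the ZENN study:

```python
import numpy as np

def cross_entropy_loss(predicted_probs, true_labels):
    """Average cross-entropy between predicted class probabilities
    and one-hot true labels; lower means predictions are closer."""
    eps = 1e-12  # clip to avoid log(0)
    predicted_probs = np.clip(predicted_probs, eps, 1.0)
    return float(-np.mean(np.sum(true_labels * np.log(predicted_probs), axis=1)))

# A confident, correct prediction yields a small loss...
good = cross_entropy_loss(np.array([[0.9, 0.1]]), np.array([[1.0, 0.0]]))
# ...while a confident, wrong prediction yields a large one.
bad = cross_entropy_loss(np.array([[0.1, 0.9]]), np.array([[1.0, 0.0]]))
```

Because the loss treats every training example identically, it has no way to down-weight an example just because it came from a noisier source, which is the gap ZENN targets.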

Problems arise when models are asked to integrate heterogeneous data, such as combining precise computer simulations with noisier experimental or sensor measurements. ZENN takes a different approach, inspired by thermodynamics, by breaking the properties of data down into two parts. One part, called "energy," captures the meaningful patterns or signals in the data. The other, called "intrinsic entropy," captures the noise, uncertainty or disorder in the measurements. The model also uses a tunable "temperature" parameter that helps it recognize hidden differences between datasets, such as whether the data come from precise simulations or noisier experiments. This allows ZENN to focus on the true signal while accounting for varying data quality.

Wang compared the idea to reading imperfect documents.

“If you are reading a handwritten note with smudges and stains, you know which marks are meaningful and which are just noise,” Wang said. “Traditional AI often treats everything the same. ZENN is designed to tell the difference.”

In tests, the researchers found that ZENN matched the performance of larger, more complex neural networks while remaining more robust when data quality varied. Just as importantly, the team said, the framework provides insight into why a system behaves in a certain way, not just what outcome to expect.

They tested the framework in a materials science case study involving an iron-rich iron-platinum alloy, which has the rare property of contracting when heated. Using ZENN, the team reconstructed the material's free-energy landscape, revealing the thermodynamic mechanisms behind its unusual negative thermal expansion.

“Many AI models act like black boxes,” Liu said. “They can make predictions, but they do not explain the physics behind them. ZENN helps reveal the mechanisms driving the behavior.”

The researchers say the framework could be especially valuable in biomedical research. Diseases such as Alzheimer’s disease involve complex, heterogeneous data, including brain imaging, genetic information, molecular markers and other clinical records. ZENN could help integrate those datasets to identify disease subtypes, track progression and potentially pinpoint key transition points in processes, they said.

Similar advantages, the team reported, could apply to cryo-electron microscopy studies of amyloids, analysis of fossil pollen grains used in climate research, and advanced imaging systems that combine geographic information system data with measurements such as PM2.5 air-quality indices, housing prices and mental health indicators. A broad range of collaborations is being established across multiple disciplines at Penn State.

In materials science and engineering, ZENN could help bridge the gap between idealized computer simulations and real-world experiments, according to Liu. By learning from both, the framework could guide the design of materials that are not only theoretically promising but also manufacturable, with potential applications ranging from medical implants for bone repair to advanced data platforms such as ULTERA, a system that manages and analyzes large, complex datasets. He noted that the approach may also prove useful in emerging areas such as quantum computing, where uncertainty is a fundamental feature rather than a flaw. Embedding Zentropy-aware reasoning into AI models could offer new tools for interpreting and managing quantum information.

While challenges remain, particularly in scaling the method to extremely large or complex systems, Liu said the work reflects a broader shift in how artificial intelligence can support science.

“Instead of using AI only to find patterns, we want it to help us understand mechanisms,” Liu said. “That is what allows scientific knowledge to move forward.”

The U.S. National Institute of General Medical Sciences and the U.S. Department of Energy funded this work, along with the Endowed Dorothy Pate Enright Professorship.
