News Release

Scientists design molecules “backward” to speed up discovery

Ground-breaking method—“PropMolFlow”—offers potential for faster creation of pharmaceuticals, materials, and new technologies

Peer-Reviewed Publication

New York University

Molecular Target Search

Image: A new AI model designs molecules with specified properties ten times faster than previous methods, potentially speeding up the process for the creation of pharmaceuticals and materials. The figure illustrates how the system transforms random noise into complete molecular structures guided by target properties.


Credit: Image courtesy of the University of Florida and New York University.

Every medication in your cabinet, every material in your phone’s battery, and virtually every compound that makes modern life work started as a molecular guess, with scientists hypothesizing that a particular arrangement of atoms might do something useful—kill a bacterial infection, store electrical charge, or absorb sunlight efficiently. 

But, given the billions and billions of potential small molecules, finding the right combination to remedy an affliction or advance technology is challenging—a needle-in-a-haystack search that can take decades.

In recent years, AI tools have helped shorten this process. Generative AI models can propose molecular structures guided by target properties, compressing what once took years of trial-and-error into hours of computation. 

A team of researchers has now developed a new method that advances this capability even further. The method, PropMolFlow (Property-guided Molecular Flow), can generate molecular candidates roughly 10 times faster than existing methods—and without compromising the accuracy or chemical validity of the results. 

Led by scientists at the University of Florida and New York University, the breakthrough is described in the journal Nature Computational Science.

“For most of scientific history, material discovery often preceded understanding—useful compounds were found by accident, then scientists figured out why they worked,” says Stefano Martiniani, an assistant professor of physics, chemistry, mathematics, and neural science at NYU and an author of the paper. “Generative AI offers the possibility of inverting this: specify the properties, then find the structures. PropMolFlow represents another step toward making that vision practical.”

“For a field where computational speed directly translates to discovery speed, this represents a meaningful advance,” adds Mingjie Liu, an assistant professor in the University of Florida’s Department of Chemistry and one of the paper’s authors. “The work doesn’t replace what came before it, but, rather, demonstrates that the next generation of molecular generators can be substantially faster while maintaining the accuracy that makes these tools useful.”

Designing Molecules Backward

The paper’s authors, who also included researchers from the University of Minnesota and Brigham Young University, note that designing molecules is fundamentally an “inverse problem.” 

“Chemists don’t usually want ‘a molecule,’” explains Martiniani. “Instead, they want a molecule that does something specific—to interact strongly with light for optical applications or to possess a particular electronic structure that determines how it absorbs energy or conducts electricity.”

Advances in AI have made this kind of targeted design possible. Traditional drug and materials discovery typically starts from what’s already known—tweaking existing compounds or searching through catalogs of molecules that have already been synthesized. Generative AI can instead invent entirely new structures from scratch, exploring chemical possibilities no one has considered before.

This capability has developed rapidly since 2022, when researchers first showed that the same type of AI powering image generators like DALL-E could be adapted to create three-dimensional molecular structures. Each successive method has improved the accuracy of property targeting, the chemical validity of generated structures, or the speed of generation. 

PropMolFlow advances all three simultaneously, using an innovative algorithm that finds more direct paths from random noise to valid molecular structures. The result: roughly 100 computational steps where previous methods needed 1,000.
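Why a more direct path needs fewer steps can be seen in a toy, one-dimensional sketch. It is not the PropMolFlow algorithm: the names `NOISE`, `TARGET`, `straight_velocity`, and `curved_velocity` are hypothetical stand-ins, and the "molecule" is a single number. The sketch only shows that a fixed-step solver lands on its target with very few steps when the path is straight, while a curved path needs many more steps to get equally close.

```python
# Toy illustration only: a 1-D stand-in for "noise -> molecule" generation.
# All names and values here are assumptions, not taken from the paper.

NOISE, TARGET = 0.0, 1.0  # hypothetical start (noise) and end (structure)


def euler_integrate(velocity, x0, n_steps):
    """Integrate dx/dt = velocity(t) from t=0 to t=1 with fixed-step Euler."""
    x = x0
    dt = 1.0 / n_steps
    for k in range(n_steps):
        x += dt * velocity(k * dt)
    return x


def straight_velocity(t):
    """Straight path x(t) = NOISE + t * (TARGET - NOISE): constant velocity."""
    return TARGET - NOISE


def curved_velocity(t):
    """Curved path x(t) = NOISE + t**3 * (TARGET - NOISE): velocity varies."""
    return 3.0 * t**2 * (TARGET - NOISE)


def endpoint_error(velocity, n_steps):
    """Distance between where the solver ends up and the intended target."""
    return abs(euler_integrate(velocity, NOISE, n_steps) - TARGET)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, endpoint_error(straight_velocity, n),
              endpoint_error(curved_velocity, n))
```

In this sketch the straight path reaches the target at any step count, up to floating-point rounding, while the error on the curved path shrinks only as steps are added. Real molecular generators work in far higher dimensions with learned velocity fields, so the numbers here say nothing about PropMolFlow's actual performance.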

Accuracy Without Shortcuts

The researchers recognized that speed is useless if the generated molecules are chemically nonsensical or miss their target properties—the desired characteristics to meet a specified need—so they tested PropMolFlow’s accuracy by comparing it with other models.

Here they found that the method consistently outperformed baseline models on structural validity: it generated molecules that had correct bonding patterns and appropriate geometries more than 90 percent of the time. 

“This matters because many earlier approaches produced structures that looked superficially plausible but violated basic chemical rules,” says Martiniani.

Similarly, PropMolFlow generated molecules with the properties the scientists sought, matching or exceeding the accuracy of the best existing approaches across multiple molecular properties while also computing much more quickly. 

Checking the AI's Homework

One foundational concern with AI-based molecular design is evaluation. 

“If a neural network generates a molecule and another neural network predicts its properties, both systems may share similar blind spots because they are drawing from the same reservoir of information—AI is then grading its own homework,” observes Martiniani.

The PropMolFlow team addressed this concern by validating generated molecules using density functional theory, a physics-based quantum chemistry method that computes molecular properties from first principles—independent of any machine-learning model.

For most properties, the neural network predictions closely tracked the physics-based calculations, confirming that the faster AI-based evaluation is statistically reliable for those properties. 

“This kind of validation provides the credibility needed for generated molecules to be taken seriously for real applications,” says Liu.
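The comparison described above, between fast neural network predictions and slower physics-based reference values, can be summarized with standard agreement statistics. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the release does not say which statistics the authors used, and the numbers in the usage example are made-up placeholders, not results from the study.

```python
from math import sqrt


def mean_absolute_error(predicted, reference):
    """Average absolute gap between predictions and reference values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    gaps = (abs(p - r) for p, r in zip(predicted, reference))
    return sum(gaps) / len(predicted)


def pearson_correlation(predicted, reference):
    """Linear correlation between predictions and reference values."""
    if len(predicted) != len(reference) or len(predicted) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    cov = sum((p - mean_p) * (r - mean_r)
              for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_p == 0 or var_r == 0:
        raise ValueError("correlation is undefined for constant inputs")
    return cov / sqrt(var_p * var_r)


# Placeholder values for illustration only (NOT data from the paper).
surrogate_values = [1.0, 2.0, 3.0, 4.0]   # hypothetical neural network output
reference_values = [1.1, 1.9, 3.2, 3.8]   # hypothetical physics-based output

mae = mean_absolute_error(surrogate_values, reference_values)
corr = pearson_correlation(surrogate_values, reference_values)
```

A small error together with a correlation near 1 would suggest the fast predictor can stand in for the physics-based calculation on that property. A large gap would flag a property where the slower reference method is still needed.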

What This Enables

The combination of speed and accuracy that PropMolFlow demonstrates has practical implications for molecular discovery, the authors conclude. 

“With the ability to generate thousands of chemically valid, property-targeted candidates in minutes rather than hours, researchers can iterate faster: generate candidates, filter computationally, validate the best ones with physics or experiments, and feed results back to improve the next round,” Martiniani explains.

“Real drugs and advanced materials are typically larger and more complex than the molecules we explored—and extending these approaches to bigger systems remains an active challenge,” acknowledges Liu. “But the principles translate, and the careful treatment of property embedding and physics-based validation provides a template for more ambitious applications.”

The research was supported by the National Science Foundation (OAC-2311632), the Simons Center for Computational Physical Chemistry at NYU, and a University of Florida AI and Complex Computational Research Award.

# # #

 

