News Release

Scientists pack centuries of organic chemistry into neat 2D visualization

Peer-Reviewed Publication

Skolkovo Institute of Science and Technology (Skoltech)

Figure 1

image: Representations of chemical reactions projected on a plane as points for easy intuitive grouping.

Credit: Mikhail Andronov et al./ACS Omega

Researchers from Skoltech, Lomonosov Moscow State University, and Sirius University of Science and Technology have proposed a new method for visualizing chemical reactions to help scientists understand the global chemical reaction space and come up with ways of synthesizing organic compounds used in industry. Reported in ACS Omega, the neural network-based method projects chemical reactions onto a 2D plane as dots, grouping similar chemical reactions together.

Chemists are constantly on the lookout for new ways of synthesizing useful organic compounds. These range from the active ingredients in drugs and pesticides to fuel additives and other industrially significant substances, such as organic LEDs, dyes, and pigments. Since there are many ways to synthesize an organic compound, medicinal chemists have to dig into large reaction databases. Even for a simple compound, one can find hundreds of known synthetic routes. Analyzing this much data by human judgment alone is challenging.

“Analyzing a typical database search output, a chemist can group reactions of a similar kind together to get an idea of the compound’s synthetic landscape, but this requires a well-established chemical intuition, and it may be subjective, too,” says the study’s principal investigator, Sergey Sosnin of Skoltech.

To simplify this process and make it more consistent, the researchers devised a way to capture the “essence” of chemical reactions and plot them on a graph for easy analysis. “It is more convenient to look at a picture rather than a long list of reactions. We visualize reactions based on what the reactants and the products are,” Sosnin adds.

The proposed method converts each molecule into a numerical representation (a bit vector). The algorithm then extracts the essence of the reaction by subtracting the vectors of the reactants from those of the products. “In a way, the resulting vector stands for whatever’s changed in the reaction, regardless of which specific compounds were involved,” Sosnin explains. “That’s what makes it such a powerful and pure representation of a reaction.”
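The subtraction step can be illustrated with a minimal sketch. It is not the authors' code: the hashed "fingerprint" below is a toy stand-in for a real structural fingerprint, and the fragment labels, function names, and example reaction are hypothetical. Only the vector length of 1,024 and the "products minus reactants" idea come from the description above.

```python
import hashlib

N_BITS = 1024  # vector length mentioned in the release


def toy_fingerprint(fragments, n_bits=N_BITS):
    """Hash fragment labels into a fixed-length bit vector.

    Toy stand-in for a real structural fingerprint; the labels are
    hypothetical and carry no real chemistry.
    """
    bits = [0] * n_bits
    for frag in fragments:
        digest = hashlib.sha256(frag.encode("utf-8")).digest()
        bits[int.from_bytes(digest[:4], "big") % n_bits] = 1
    return bits


def reaction_vector(reactants, products, n_bits=N_BITS):
    """Sum of product fingerprints minus sum of reactant fingerprints."""
    diff = [0] * n_bits
    for mol in products:
        for i, bit in enumerate(toy_fingerprint(mol, n_bits)):
            diff[i] += bit
    for mol in reactants:
        for i, bit in enumerate(toy_fingerprint(mol, n_bits)):
            diff[i] -= bit
    return diff


# Hypothetical esterification-like example: acid + alcohol -> ester + water.
acid = ["methyl", "carboxylic_acid"]
alcohol = ["ethyl", "hydroxyl"]
ester = ["methyl", "ethyl", "ester"]
water = ["water"]

vec = reaction_vector([acid, alcohol], [ester, water])
changed = [i for i, v in enumerate(vec) if v != 0]
print(len(vec), "dimensions,", len(changed), "nonzero positions")
```

Fragments present on both sides (here "methyl" and "ethyl") cancel out, so, barring hash collisions, only positions tied to what the reaction changed remain nonzero.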

The problem with reaction vectors is they are in themselves unintelligible — unless you’re good at thinking in 1,024 dimensions.

“We visualize these vectors, which are inaccessible to direct human comprehension, using an approach known as (here comes a mouthful) parametric t-distributed stochastic neighbor embedding,” the researcher comments. “A neural network projects each multidimensional vector to the coordinates of a point on a plane.”
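To make the idea concrete, here is a heavily simplified sketch of a parametric t-SNE-style projection. It rests on several assumptions not taken from the study: a single linear layer stands in for the neural network, one fixed Gaussian bandwidth replaces the usual per-point perplexity search, and the six four-dimensional points are invented toy data. All function names are hypothetical.

```python
import math
import random


def pairwise_sq_dists(points):
    n = len(points)
    return [[sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
             for j in range(n)] for i in range(n)]


def high_dim_affinities(X, sigma=1.0):
    """Symmetric Gaussian affinities P (fixed sigma: a simplification)."""
    n = len(X)
    d = pairwise_sq_dists(X)
    cond = []
    for i in range(n):
        row = [math.exp(-d[i][j] / (2 * sigma ** 2)) if j != i else 0.0
               for j in range(n)]
        total = sum(row)
        cond.append([v / total for v in row])
    return [[(cond[i][j] + cond[j][i]) / (2 * n) for j in range(n)]
            for i in range(n)]


def project(W, x):
    """The 'network': one linear layer mapping a vector to 2D coordinates."""
    return [sum(w * xi for w, xi in zip(W[k], x)) for k in range(2)]


def kl_loss_and_grad(W, X, P):
    """KL(P || Q) with a Student-t kernel in 2D, and its gradient w.r.t. W."""
    n = len(X)
    Y = [project(W, x) for x in X]
    d = pairwise_sq_dists(Y)
    kern = [[1.0 / (1.0 + d[i][j]) if i != j else 0.0 for j in range(n)]
            for i in range(n)]
    Z = sum(sum(row) for row in kern)
    loss = 0.0
    grad = [[0.0] * len(X[0]) for _ in range(2)]
    for i in range(n):
        gy = [0.0, 0.0]  # gradient with respect to the 2D point Y[i]
        for j in range(n):
            if i == j:
                continue
            q = kern[i][j] / Z
            if P[i][j] > 0:
                loss += P[i][j] * math.log(P[i][j] / q)
            coeff = 4.0 * (P[i][j] - q) * kern[i][j]
            gy[0] += coeff * (Y[i][0] - Y[j][0])
            gy[1] += coeff * (Y[i][1] - Y[j][1])
        for k in range(2):  # chain rule through Y[i] = W x_i
            for m, xm in enumerate(X[i]):
                grad[k][m] += gy[k] * xm
    return loss, grad


def train(X, steps=200, lr=0.1, seed=0):
    rng = random.Random(seed)
    W = [[rng.gauss(0, 0.01) for _ in X[0]] for _ in range(2)]
    P = high_dim_affinities(X)
    losses = []
    for _ in range(steps):
        loss, grad = kl_loss_and_grad(W, X, P)
        losses.append(loss)
        W = [[w - lr * g for w, g in zip(W[k], grad[k])] for k in range(2)]
    return W, losses


# Invented toy data: two tight groups of points in four dimensions.
X = [[0.0, 0.0, 0.0, 0.0], [0.2, 0.0, 0.0, 0.0], [0.0, 0.2, 0.0, 0.0],
     [2.0, 2.0, 0.0, 0.0], [2.2, 2.0, 0.0, 0.0], [2.0, 2.2, 0.0, 0.0]]
W, losses = train(X)
print("loss: first", losses[0], "last", losses[-1])
print("unseen point maps to", project(W, [2.1, 2.1, 0.0, 0.0]))
```

Because the mapping is a trained function rather than a fixed set of coordinates, a vector that was not in the training data can still be placed on the same plane, which is the practical appeal of the parametric variant.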

Given this chart, a chemist can recognize typical reaction types, such as the clusters marked by diamonds numbered one through three in Figure 1. Suppose someone is interested in ways of synthesizing the anti-HIV/AIDS drug darunavir (purple circles) or the asthma medication montelukast (gray circles). The visualization shows which reaction types are used most often for the purpose and which appear underused, or perhaps not used at all, contrary to what the researcher might have assumed.

The team stresses the objective nature of the visualization. This is a bit like classifying animals based on DNA only, without ever having taken a single look at them. You may find, for example, that falcons, surprisingly, are more closely related to parrots than to other birds of prey. With chemical reactions, faulty intuitions can play similar tricks on us.


Skoltech is a private international university located in Russia. Established in 2011 in collaboration with the Massachusetts Institute of Technology (MIT), Skoltech is cultivating a new generation of leaders in the fields of science, technology, and business, conducting research in breakthrough fields, and promoting technological innovation with the goal of solving critical problems that face Russia and the world. Skoltech is focusing on six priority areas: artificial intelligence and communications, life sciences and health, cutting-edge engineering and advanced materials, energy efficiency and ESG, photonics and quantum technologies, and advanced studies. Website:
