Article Highlight | 6-Jan-2026

AI is rewriting the rules of enzyme engineering—from faster prediction to smarter design

Nanjing Agricultural University The Academy of Science

Enzymes are biological catalysts that underpin vital cellular reactions and numerous industrial processes, from food production to pharmaceuticals. Their efficiency depends on how amino acid residues form active sites, recognize substrates, and stabilize transition states, often through induced-fit conformational changes. Traditional enzyme engineering relies on directed evolution, rational or semi-rational design, and de novo design. While effective, these approaches are constrained by slow and costly screening, dependence on prior structural knowledge, or limited reliability under real conditions. Across all strategies, the vastness of protein sequence space remains a fundamental bottleneck, as even small mutational changes generate enormous numbers of variants that cannot be exhaustively tested.
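
To give a sense of scale for that bottleneck, the short Python sketch below counts how many distinct variants arise when exactly k positions of a parent sequence are substituted. It assumes only the 20 canonical amino acids and point substitutions (no insertions or deletions). The 300-residue length and the function names are illustrative choices, not values taken from the review.

```python
from math import comb

AMINO_ACIDS = 20  # canonical amino acids only (assumption)


def variants_with_k_substitutions(length: int, k: int) -> int:
    """Count sequences that differ from a parent at exactly k positions.

    Choose which k positions change, then pick one of the 19 other
    amino acids at each chosen position.
    """
    return comb(length, k) * (AMINO_ACIDS - 1) ** k


def total_sequence_space(length: int) -> int:
    """Count all possible sequences of the given length."""
    return AMINO_ACIDS ** length


if __name__ == "__main__":
    length = 300  # hypothetical enzyme length, for illustration only
    for k in range(1, 5):
        print(f"{k} substitution(s): {variants_with_k_substitutions(length, k):,} variants")
    # The full space is far too large to enumerate, so report its digit count.
    print(f"full space: a {len(str(total_sequence_space(length)))}-digit number")
```

Under these assumptions the count grows by several orders of magnitude with each additional substitution, which is why exhaustive screening stops being practical after only a few simultaneous mutations.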

A study (DOI: 10.1016/j.bidere.2025.100044) published in BioDesign Research on 29 August 2025 by the team of Dr. Shuaiqi Meng and Dr. Haiyang Cui at Nanjing Normal University argues that the field is entering a new phase in which foundation models and multimodal systems can unify sequence, structure, chemistry, and experimental context, enabling enzyme design that is not only faster but also increasingly generalizable and mechanistically interpretable.

This review presents a unifying, AI-first framework that expands enzyme engineering from single-enzyme modeling to multi-enzyme pathway design, integrating sequence, structure, reaction environment, and systems-level constraints into a continuous “model–design–validate” logic. At the single-enzyme scale, the modeling strategy explicitly encodes key contextual variables (pH, temperature, solvent composition, substrate and product identities, and cofactor availability) so that predictions reflect realistic biochemical settings rather than idealized conditions.

Within this scope, the authors organize AI applications into three core task families: function modeling (enzyme/non-enzyme discrimination, EC-number prediction, GO annotation, ligand-binding site identification, and estimation of kinetic parameters such as kcat, KM, and kcat/KM); structure modeling (near-atomic 3D prediction of enzymes and complexes, which reveals catalytic pockets and substrate-binding features and supports inverse folding); and property modeling (thermostability, pH tolerance, selectivity, binding affinity, robustness, and resistance-related traits that govern real-world usability). Methodologically, the framework combines condition-aware kinetic prediction, substrate- and product-centered outcome modeling (including by-product control and feedback inhibition assessment), cofactor-specific prediction and regeneration/recycling optimization, thermodynamics-informed inverse folding, and AI-driven complex structure prediction for protein–ligand, protein–protein, and protein–nucleic acid assemblies. These capabilities then scale upward to pathway-level modeling, where coordination among enzymes is optimized via expression balancing, flux tuning, pathway design and retrosynthesis, and metabolic network analysis that evaluates thermodynamic and kinetic feasibility and even spatial constraints.

The review also traces a four-stage evolution of AI integration (classical machine learning, deep neural networks, protein language models, and emerging multimodal plus agent-style workflows) and stresses that progress depends as much on data infrastructure as on algorithms, in particular on continuously updated databases and curated machine-learning datasets spanning sequences, structures, kinetics, and interactions. Importantly, the integrated approach has already demonstrated measurable gains. Condition-aware tools such as UniKP helped mine high-activity tyrosine ammonia lyase candidates and guide directed evolution to achieve up to 3.5-fold improvements in catalytic efficiency. Systems-level modeling, in turn, improved the design of coordinated multi-enzyme processes (e.g., polysaccharide breakdown, lignin valorization, terpene biosynthesis) and increased pathway prediction reliability when enzyme organization is treated explicitly. Collectively, these results signal a shift from isolated enzyme optimization to holistic control of biocatalytic systems.
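
For readers less familiar with the kinetic parameters named above, the following Python sketch shows how kcat and KM relate under the standard Michaelis-Menten rate law, and how a fold improvement in catalytic efficiency (kcat/KM) is calculated. It is a textbook illustration only, not the method used by UniKP or by the review's authors. All numerical values and function names are hypothetical.

```python
def michaelis_menten_rate(kcat: float, km: float,
                          enzyme_conc: float, substrate_conc: float) -> float:
    """Initial reaction rate v = kcat * [E] * [S] / (KM + [S])."""
    return kcat * enzyme_conc * substrate_conc / (km + substrate_conc)


def catalytic_efficiency(kcat: float, km: float) -> float:
    """Catalytic efficiency kcat/KM, the usual figure of merit at low [S]."""
    return kcat / km


def fold_improvement(variant_efficiency: float, parent_efficiency: float) -> float:
    """Ratio of variant to parent catalytic efficiency."""
    return variant_efficiency / parent_efficiency


if __name__ == "__main__":
    # Hypothetical parameters (kcat in 1/s, KM in mM), chosen for illustration only.
    parent = {"kcat": 2.0, "km": 0.5}
    variant = {"kcat": 3.0, "km": 0.25}

    parent_eff = catalytic_efficiency(**parent)
    variant_eff = catalytic_efficiency(**variant)
    print("parent kcat/KM:", parent_eff)
    print("variant kcat/KM:", variant_eff)
    print("fold improvement:", fold_improvement(variant_eff, parent_eff))

    # When [S] equals KM, the rate is half of the maximum kcat * [E].
    print("rate at [S] = KM:", michaelis_menten_rate(2.0, 0.5, 1.0, 0.5))
```

A variant can raise catalytic efficiency by increasing kcat, lowering KM, or both, which is why models that predict the two parameters jointly are useful for ranking candidates before synthesis.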

By linking “mining–design–validation” into a data-driven loop, AI can reduce the number of experiments needed to reach a high-performance enzyme and expand what is feasible to engineer. Near-term impacts include faster discovery of candidate enzymes from uncharacterized sequence space, improved prediction of catalytic and stability outcomes before synthesis, and more reliable prioritization in directed evolution campaigns.

###

References

DOI

10.1016/j.bidere.2025.100044

Original Source URL

https://doi.org/10.1016/j.bidere.2025.100044

Funding information

This work was supported by the National Natural Science Foundation of China (Grant No. 22008166), the Innovative Research Group Project of the National Natural Science Foundation of China (Grant No. 22478199), and the Jiangsu Basic Research Center for Synthetic Biology (Grant No. BK20233003). It was also supported by the Jiangsu Province “Entrepreneurship and Innovation Plan” Education Fund (No. 164080H00250), the Research Start-up Fund of Nanjing Normal University (No. 184080H201B94), and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD, No. 164320H1865).

About BioDesign Research

BioDesign Research is dedicated to information exchange in the interdisciplinary field of biosystems design. Its unique mission is to pave the way towards the predictable de novo design and assessment of engineered or reengineered living organisms using rational or automated methods to address global challenges in health, agriculture, and the environment.
