image: An interpretable machine learning framework to predict drug toxicity based on Genotype-Phenotype Difference (GPD) between preclinical models and humans
Credit: POSTECH
In the UK, TGN1412, an immunotherapy under development, triggered a cytokine storm within hours of administration to humans, leading to multiple organ failure. Similarly, Aptiganel, a stroke drug candidate, was highly effective in animals but was discontinued in human trials because of side effects such as hallucinations and sedation. As these cases show, drugs considered safe in preclinical tests can still be dangerous, even fatal, in human clinical trials. A machine-learning-based technology has now been developed to learn these differences and identify potentially dangerous drugs before they reach clinical trials.
A research team led by Professor Sanguk Kim of the Department of Life Sciences and the Graduate School of Artificial Intelligence at POSTECH, along with Dr. Minhyuk Park and Mr. Woomin Song of the Department of Life Sciences, and Mr. Hyunsoo Ahn of the Graduate School of Artificial Intelligence, has developed a technology that uses machine learning to predict drug side effects in humans. This study was recently published online in the international medical journal eBioMedicine.
During the development of new drugs, candidates that pass preclinical trials often show unexpected toxicity in humans. This issue arises from differences in biological responses between humans and animals. For example, chocolate is generally safe for humans but toxic to dogs. Similarly, a drug that is safe in mice is not necessarily safe for humans. To date, this "cross-species difference" has been a major reason for failures in new drug development.
The research team focused on the "Genotype-Phenotype Difference (GPD)," the biological differences between cells, mice, and humans. They analyzed how genes targeted by drugs function differently in humans and preclinical models, focusing on three key factors: first, the gene's perturbation impact on survival (essentiality); second, the pattern of gene expression in different tissues; and third, the connectivity of genes within biological networks.
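As a rough illustration of the three factors described above, the sketch below turns them into numeric differences for a single drug-target gene. It is not the authors' definition of GPD: the field names (`essentiality`, `expression`, `degree`), the use of one minus the Pearson correlation as a measure of tissue-expression dissimilarity, and all of the numbers are assumptions made for this example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def gpd_features(human, model):
    """Hypothetical human-vs-preclinical-model differences for one target gene."""
    return {
        # 1) difference in the gene's impact on survival (essentiality)
        "essentiality_diff": human["essentiality"] - model["essentiality"],
        # 2) dissimilarity of expression patterns across the same tissues
        "expression_dissimilarity": 1.0 - pearson(human["expression"], model["expression"]),
        # 3) difference in connectivity within a biological network
        "connectivity_diff": human["degree"] - model["degree"],
    }

# Invented values for one gene; tissues are listed in the same order in both profiles.
human_gene = {"essentiality": 0.9, "expression": [5, 1, 0, 2], "degree": 12}
mouse_gene = {"essentiality": 0.1, "expression": [0, 1, 5, 2], "degree": 5}

features = gpd_features(human_gene, mouse_gene)
print(features)
```

Features of this kind could then be passed to a classifier alongside chemical descriptors; how the published model actually encodes and combines them is described in the paper, not here.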
Validation using data from 434 hazardous drugs and 790 approved drugs revealed that GPD characteristics were significantly associated with drug failure due to toxicity in humans. Predictive power improved significantly over relying on drug chemical data alone, with the area under the precision-recall curve (AUPRC) increasing from 0.35 to 0.63 and the area under the receiver operating characteristic curve (AUROC) increasing from 0.50 to 0.75. The developed AI model demonstrated superior predictive performance compared to existing state-of-the-art models.
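For readers unfamiliar with the two metrics, the following minimal sketch computes them on a small invented ranking. The labels and scores are made up, and AUPRC is approximated here by average precision, one common estimator; the study may have used a different one. The functions assume no tied scores among the examples being ranked.

```python
def auroc(labels, scores):
    """Chance that a random toxic drug is scored above a random safe one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean of the precision values at the rank of each toxic drug."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos

# Invented example: 1 = toxic, 0 = safe; a higher score means higher predicted risk.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

roc = auroc(labels, scores)
ap = average_precision(labels, scores)
prevalence = sum(labels) / len(labels)  # expected AUPRC of a random ranking
print(f"AUROC={roc:.3f}  AUPRC={ap:.3f}  random-ranking AUPRC~{prevalence:.3f}")
```

A random ranking scores an AUROC of about 0.50 and an AUPRC close to the fraction of positives. With 434 hazardous drugs among 1,224 in total, that fraction is about 0.35, so if the evaluation set had the same class balance (an assumption here), the reported chemistry-only figures would sit near chance level.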
Furthermore, the model demonstrated its practicality in "chronological validation," which tests whether it can flag in advance drugs that later face market withdrawal due to toxicity. Trained only on drug data up to 1991, the model correctly predicted drugs that would be withdrawn from the market after 1991, achieving 95% accuracy.
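The idea behind such a time-based split can be sketched as follows. Everything in this example is hypothetical: the `Drug` fields, the single `gpd_score` feature, the threshold "model," and the eight records are invented to show the mechanics, not to reproduce the study's data or classifier.

```python
from dataclasses import dataclass

@dataclass
class Drug:
    name: str
    year: int         # hypothetical: year associated with the drug record
    gpd_score: float  # hypothetical single risk feature
    toxic: bool       # True if the drug was later withdrawn for toxicity

def chronological_split(drugs, cutoff_year):
    """Train on records up to the cutoff year, test on everything after it."""
    train = [d for d in drugs if d.year <= cutoff_year]
    test = [d for d in drugs if d.year > cutoff_year]
    return train, test

def accuracy(drugs, threshold):
    """Share of drugs whose predicted label (score >= threshold) matches the truth."""
    return sum((d.gpd_score >= threshold) == d.toxic for d in drugs) / len(drugs)

def fit_threshold(train):
    """Stand-in model: choose the score cutoff with the best training accuracy."""
    candidates = sorted({d.gpd_score for d in train})
    return max(candidates, key=lambda t: accuracy(train, t))

drugs = [
    Drug("A", 1980, 0.2, False),
    Drug("B", 1985, 0.8, True),
    Drug("C", 1988, 0.3, False),
    Drug("D", 1990, 0.7, True),
    Drug("E", 1995, 0.9, True),
    Drug("F", 1999, 0.1, False),
    Drug("G", 2004, 0.6, True),
    Drug("H", 2010, 0.4, False),
]

train, test = chronological_split(drugs, cutoff_year=1991)
threshold = fit_threshold(train)           # uses pre-cutoff data only
test_accuracy = accuracy(test, threshold)  # evaluated on post-cutoff drugs
print(threshold, test_accuracy)
```

The point of splitting by time rather than at random is that nothing learned from later drugs can leak into training, which mimics how the model would be used prospectively.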
The significance of this study is that it bridges the "translation gap" between preclinical and clinical trials by quantifying biological differences in cells, preclinical animal models, and humans. Pharmaceutical companies can reduce development costs and time by screening out high-risk candidates before clinical trials, while also improving patient safety. The model's effectiveness is expected to increase as more relevant data and annotations accumulate.
Professor Sanguk Kim stated, "This is the first attempt to incorporate differences in genotype-phenotype relationships for drug toxicity prediction. Our framework enables early identification of high-risk drugs in clinical development." He added, "This approach holds promise for reducing development costs, improving patient safety, and increasing the success rate of therapeutic approvals." Co-first authors Dr. Minhyuk Park and Mr. Woomin Song stated, "The human-centered toxicity prediction model will be a very practical tool in new drug development. We anticipate that pharmaceutical companies will be able to screen out high-risk drugs in advance at the preclinical stage, thereby improving development efficiency."
This research was supported by the National Research Foundation (NRF), funded by the Korean government (MSIT), Medical Device Innovation Center, and the Synthetic Biology Human Resources Development Program.
Journal
EBioMedicine
Article Title
Drug toxicity prediction based on genotype-phenotype differences between preclinical models and humans
Article Publication Date
28-Oct-2025