image: Members of the Health Informatics Research Group of the University of Tartu: Hendrik Šuvalov, Maria Malk and Raivo Kolde
Credit: University of Tartu, Marianne Liiv
Language models read doctors’ notes to reveal why patients discontinue medication
Researchers at the University of Tartu showed that large language models can identify, with high accuracy, why patients stop using antidiabetic medications or statins, based on doctors’ electronic clinical notes. The study opens new possibilities for using clinical information that has so far been difficult to analyse in health research.
The study, carried out by the Health Informatics Research Group at the Institute of Computer Science of the University of Tartu, combined clinical notes with prescription data from a representative 10% sample of the Estonian population, covering 2012 to 2019.
“Prescription data show us that a medicine was no longer purchased, but the reason is often written in the doctor’s notes instead. Until now, this information could only be used to a very limited extent, because manually reviewing medical records is extremely time-consuming,” said Hendrik Šuvalov, Junior Research Fellow in Health Informatics at the University of Tartu.
First, the researchers identified patients who had not purchased a medication for at least one year. They then used large language models to identify references to treatment discontinuation in doctors’ notes and, where possible, determine whether the decision had been initiated by the patient or the physician.
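The first step, flagging patients with no purchase for at least a year, can be illustrated with a minimal Python sketch. It is not the study's code: the function name `discontinuation_candidates`, the `(patient_id, drug, purchase_date)` record layout, the 365-day threshold and the toy purchases are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def discontinuation_candidates(purchases, observation_end, min_gap_days=365):
    """Return {(patient_id, drug): last purchase date before the first long gap}.

    Hypothetical helper: a gap counts if at least `min_gap_days` pass between
    two consecutive purchases, or between the last purchase and the end of
    the observation period.
    """
    dates_by_key = defaultdict(list)
    for patient_id, drug, purchased_on in purchases:
        dates_by_key[(patient_id, drug)].append(purchased_on)

    candidates = {}
    for key, dates in dates_by_key.items():
        dates.sort()
        # Pair each purchase with the next one; the final purchase is
        # paired with the end of the observation period.
        for previous, following in zip(dates, dates[1:] + [observation_end]):
            if (following - previous).days >= min_gap_days:
                candidates[key] = previous
                break
    return candidates


# Toy data, invented for this example.
purchases = [
    ("p1", "statin", date(2015, 1, 10)),
    ("p1", "statin", date(2015, 4, 12)),
    ("p2", "statin", date(2018, 9, 1)),
    ("p2", "statin", date(2019, 6, 1)),
]
result = discontinuation_candidates(purchases, date(2019, 12, 31))
print(result)
```

In this toy data only patient p1 is flagged, with 12 April 2015 as the last purchase before the gap. The notes written around that date would then be the natural place to look for a documented reason.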
Most of the analysis was carried out using the Llama 3.1-70B language model in a secure local environment. For comparison, GPT-4o was used only on manually reviewed text that had been stripped of sensitive information. Medical experts confirmed that the language model running on the university server was highly accurate in identifying phrases and reasons related to medication discontinuation: accuracy was 93–98% for extracting discontinuation phrases and 95–96% for extracting reasons.
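As a rough illustration of how accuracy figures like these can be computed, the sketch below compares model output with expert judgement item by item. The function name `extraction_accuracy`, the exact-match criterion and the example labels are assumptions made for illustration; the study's actual evaluation protocol may differ.

```python
def extraction_accuracy(model_outputs, expert_labels):
    """Share of items where the model output matches the expert label."""
    if len(model_outputs) != len(expert_labels):
        raise ValueError("model outputs and expert labels must align one-to-one")
    if not expert_labels:
        raise ValueError("at least one reviewed item is required")
    correct = sum(m == e for m, e in zip(model_outputs, expert_labels))
    return correct / len(expert_labels)


# Toy labels, invented for this example (not study data).
model_reasons = ["adverse reaction", "insufficient effect",
                 "adverse reaction", "contraindication"]
expert_reasons = ["adverse reaction", "insufficient effect",
                  "contraindication", "contraindication"]

acc = extraction_accuracy(model_reasons, expert_reasons)
print(f"accuracy: {acc:.0%}")
```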
According to Hendrik Šuvalov, the study shows that health data contains far more information than what is reflected in diagnoses, laboratory results, or prescription records. “Health data also provides insight into how treatment actually unfolds in everyday clinical care. Patients’ experiences, side effects, and reasons for changing treatment often make their way into doctors’ notes, but remain unused in conventional data analyses,” Šuvalov said.
The study found that treatment was most often discontinued because of adverse reactions. Among statins, adverse reactions accounted for approximately 70% of documented reasons for discontinuation, while among antidiabetics they accounted for nearly 45% of cases. Compared with statins, discontinuation of antidiabetic medications was also more often linked to insufficient treatment effect and contraindications.
The greatest value of the study, however, lies in the methodology: using language models to transform clinical free text, which has previously been difficult to use, into analysable data. This can help researchers better understand treatment pathways and support evidence-based decision-making in health policy.
Journal
Journal of Medical Internet Research
DOI
Method of Research
Data/statistical analysis
Subject of Research
Not applicable
Article Title
Extracting and Classifying Drug Discontinuations From Estonian Electronic Health Records: Development and Validation Study
Article Publication Date
17-Jun-2026