News Release 14-Oct-2020

Deep neural networks show promise for predicting future self-harm based on clinical notes

Researchers at the Medical University of South Carolina use deep learning models to identify patients at risk of intentional self-harm based on unstructured patient clinical notes alone

Peer-Reviewed Publication

Medical University of South Carolina

Image (Machine Learning and Artificial Intelligence): Artificial neural network with a chip.

Credit: Mikemacmarketing. Licensed under CC BY 2.0

According to the American Foundation for Suicide Prevention, suicide is the 10th leading cause of death in the U.S., with over 1.4 million suicide attempts recorded in 2018. Although effective treatments are available for those at risk, clinicians do not have a reliable way of predicting which patients are likely to make a suicide attempt.

Researchers at the Medical University of South Carolina and University of South Florida report in JMIR Medical Informatics that they have taken important steps toward addressing the problem by creating an artificial intelligence algorithm that can automatically identify patients at high risk of intentional self-harm, based on the information in the clinical notes in the electronic health record.

The study was led by Jihad Obeid, M.D., co-director of the MUSC Biomedical Informatics Center, and Brian Bunnell, Ph.D., formerly at MUSC and currently an assistant professor in the Department of Psychiatry and Behavioral Neurosciences at the University of South Florida.

The team used complex artificial neural networks, a form of artificial intelligence also known as deep learning, to analyze unstructured, textual data in the electronic health record. Deep learning methods use successive layers of artificial neural networks to progressively extract higher-level information from raw input data. The team showed that these models, once trained, could identify patients at risk of intentional self-harm.

"This kind of work is important because it leverages the latest technologies to address an important problem like suicide and identifies patients at risk so that they can be referred to appropriate management," said Obeid.

Thus far, researchers have primarily relied on structured data in the electronic health record for the identification and prediction of patients at risk. Structured data refers to tabulated information that has been entered into designated fields in the electronic health record as part of clinical care. For example, when physicians diagnose patients and assign International Classification of Diseases (ICD) codes, they are creating structured data. This sort of tabulated, structured data is easy for computer programs to analyze.

However, 80% to 90% of the pertinent information in the electronic health record is trapped in text format. In other words, the clinical notes, progress reports, plan-of-care notes and other narrative texts in the electronic health record represent an enormous untapped resource for research. Obeid's study is unique because it uses deep neural networks to "read" clinical notes in the electronic health record and identify and predict patients at risk for self-harm.

After regulatory ethics review and approval of the proposed research by the Institutional Review Board at MUSC, Obeid began by identifying patient records associated with ICD codes indicative of intentional self-harm in MUSC's research data warehouse. That warehouse, which was created with support from the South Carolina Clinical & Translational Research Institute, provides MUSC researchers access to patient electronic health record data, provided they have obtained the necessary permissions.

To simulate a real-world scenario, Obeid and his team divided the clinical records into two time periods: records from 2012 to 2017, which were used for training the models, and records from 2018 to 2019, which were used for testing the trained models. First, they looked at the clinical notes taken during the hospital visit in which the ICD code was assigned. Using that as the training data set, the models "learned" which patterns of language in the clinical notes of the patients' electronic medical records were associated with the assignment of an ICD code of intentional self-harm. Once the models were trained, they could identify those patients based solely on their analysis of text in the clinical notes, with an accuracy of 98.5%. Experts manually reviewed a subset of records to confirm the model's accuracy.
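The time-based division described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's code: the record layout, the function name `temporal_split` and the sample notes are all hypothetical. Only the year ranges come from the release.

```python
from datetime import date

# Hypothetical records: (visit_date, note_text, label), where label 1 means
# an ICD code for intentional self-harm was assigned at that visit.
records = [
    (date(2013, 5, 2), "note a", 1),
    (date(2016, 11, 20), "note b", 0),
    (date(2018, 3, 14), "note c", 1),
    (date(2019, 8, 30), "note d", 0),
]

def temporal_split(records, train_years=(2012, 2017), test_years=(2018, 2019)):
    """Split records by visit year so the test set is strictly later in time."""
    train = [r for r in records if train_years[0] <= r[0].year <= train_years[1]]
    test = [r for r in records if test_years[0] <= r[0].year <= test_years[1]]
    return train, test

train, test = temporal_split(records)
print(len(train), "training records,", len(test), "test records")
```

Splitting by time rather than at random means the models are always evaluated on records that were written after everything they were trained on, which is closer to how such a tool would be used in practice.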

Next, the team tested whether the most accurate of the models could use clinical notes in the electronic health record to predict future self-harm. To this end, Obeid's team identified the records of patients who had presented with intentional self-harm and trained the model using their clinical notes from six months to one month prior to the intentional self-harm hospital visit. They then tested whether the trained models could correctly predict if these patients would later present with intentional self-harm.
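As a rough illustration of the look-back window described above, the sketch below selects a patient's notes dated between about six months and one month before an index visit. The function name `notes_in_window`, the note layout and the use of 180 and 30 days as stand-ins for "six months" and "one month" are assumptions made for this example. The release does not describe how the study defined the window boundaries.

```python
from datetime import date, timedelta

def notes_in_window(notes, index_date, start_days=180, end_days=30):
    """Return notes dated from start_days to end_days before index_date.

    notes is a list of (note_date, note_text) pairs (hypothetical layout).
    """
    window_start = index_date - timedelta(days=start_days)
    window_end = index_date - timedelta(days=end_days)
    return [(d, t) for d, t in notes if window_start <= d <= window_end]

# Made-up example: an index visit on 1 July 2018.
index_visit = date(2018, 7, 1)
notes = [
    (date(2017, 12, 1), "too early"),
    (date(2018, 3, 10), "inside window"),
    (date(2018, 5, 15), "inside window"),
    (date(2018, 6, 20), "too close to the visit"),
]
selected = notes_in_window(notes, index_visit)
```

Excluding the final month before the visit keeps the model from simply reading notes written as the event unfolds, so what it learns has to come from earlier history.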

Predicting future self-harm based solely on clinical notes proved to be more challenging than identifying current at-risk patients due to the extra "noise" that is introduced when vast amounts of patient history are included in the model. Historical clinical notes tend to be varied and not always relevant. For example, if a patient was seen for depression or other mental health issues six months prior to his or her hospital visit for intentional self-harm, then the clinical notes were likely to include relevant information. However, if the patient came in for a condition unrelated to mental health, then the notes were less likely to include relevant information.

While the inclusion of irrelevant information introduces a lot of noise into the data analysis, all of this information must be included across patients in the models to predict outcomes. As a result, the model was less accurate at predicting which patients would later present for intentional self-harm than simply classifying current patients for suicide risk. Nonetheless, the predictive accuracy of this model was very competitive with that previously reported for models that relied on structured data, reaching an accuracy of almost 80% with relatively high sensitivity and precision.
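For readers unfamiliar with the terms, accuracy, sensitivity and precision can all be computed from the same four counts of correct and incorrect predictions. The sketch below uses made-up labels purely to show the arithmetic. The numbers are not results from the study, and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity and precision for binary labels (1 = at risk)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    total = tp + tn + fp + fn
    return {
        # Share of all predictions that were correct.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of truly at-risk patients that the model flagged.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of flagged patients who were truly at risk.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }

# Toy data only: eight patients, four of whom later presented with self-harm.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

In this toy case, all three measures come out to 0.75. In real data they often diverge, which is why the study reports sensitivity and precision alongside accuracy.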

Obeid's team has shown the feasibility of using deep learning models to identify patients at risk of intentional self-harm based on clinical notes alone. The study also showed that models can be used to predict, with fairly good fidelity, which patients will present in the future for intentional self-harm based on clinical notes in their electronic health record.

These early results are promising and could have large impacts at the clinical level. If deep learning models can be used to predict which patients are at high risk for suicide based on clinical notes, then clinicians can refer high-risk patients early for appropriate treatment. Using these models to classify patients as at risk for self-harm could also facilitate enrollment into clinical studies and trials of potential new treatments relevant to suicide.

In future studies, Obeid aims to evaluate changes in the predictive time window for his models, for example by looking at records one year before a patient's presentation for intentional self-harm instead of six months. The team also intends to examine other outcomes such as suicide or suicidal ideation. And while the models work well at MUSC, Obeid must now show that they can be generalized to other institutions.

"Can the models be trained in one location and transferred to another location and still work?" asked Obeid. "If the answer is yes, then this saves critical resources because other institutions will not have to perform expensive and time-consuming manual chart reviews to confirm that the models are getting it right during the training periods."

###

About MUSC

Founded in 1824 in Charleston, the Medical University of South Carolina (MUSC) is the oldest medical school in the South as well as the state's only integrated, academic health sciences center with a unique charge to serve the state through education, research and patient care. Each year, MUSC educates and trains more than 3,000 students and nearly 800 residents in six colleges: Dental Medicine, Graduate Studies, Health Professions, Medicine, Nursing and Pharmacy. The state's leader in obtaining biomedical research funds, in fiscal year 2019, MUSC set a new high, bringing in more than $284 million. For information on academic programs, visit musc.edu.

As the clinical health system of the Medical University of South Carolina, MUSC Health is dedicated to delivering the highest quality patient care available while training generations of competent, compassionate health care providers to serve the people of South Carolina and beyond. Comprising some 1,600 beds, more than 100 outreach sites, the MUSC College of Medicine, the physicians' practice plan and nearly 275 telehealth locations, MUSC Health owns and operates eight hospitals situated in Charleston, Chester, Florence, Lancaster and Marion counties. In 2020, for the sixth consecutive year, U.S. News & World Report named MUSC Health the No. 1 hospital in South Carolina. To learn more about clinical patient services, visit muschealth.org.

MUSC and its affiliates have collective annual budgets of $3.2 billion. The more than 17,000 MUSC team members include world-class faculty, physicians, specialty providers and scientists who deliver groundbreaking education, research, technology and patient care.

About the SCTR Institute

The South Carolina Clinical & Translational Research (SCTR) Institute is the catalyst for changing the culture of biomedical research, facilitating the sharing of resources and expertise and streamlining research-related processes to bring about large-scale change in the clinical and translational research efforts in South Carolina. Our vision is to improve health outcomes and quality of life for the population through discoveries translated into evidence-based practice. To learn more, visit https://research.musc.edu/resources/sctr

Journal

JMIR Medical Informatics

DOI

10.2196/17784
