Analyzing medical records from thousands of patients, statisticians have devised a statistical model for predicting what other medical problems a patient might encounter.
Much as Netflix recommends movies and TV shows or Amazon.com suggests products to buy, the algorithm makes predictions based on what a patient has already experienced as well as the experiences of other patients with a similar medical history.
"This provides physicians with insights on what might be coming next for a patient, based on experiences of other patients. It also gives a prediction that is interpretable by patients," said Tyler McCormick, an assistant professor of statistics and sociology at the University of Washington.
The algorithm will be published in an upcoming issue of the journal Annals of Applied Statistics. McCormick's co-authors are Cynthia Rudin, Massachusetts Institute of Technology, and David Madigan, Columbia University.
McCormick said that this is one of the first times that this type of predictive algorithm has been used in a medical setting. What differentiates his model from others, he said, is that it shares information across patients who have similar health problems. This allows for better predictions when details of a patient's medical history are sparse.
For example, new patients might lack a lengthy file listing ailments and drug prescriptions compiled from previous doctor visits. The algorithm can compare the patient's current health complaints with those of other patients whose more extensive medical records include similar symptoms and the timing of when they arose. Then the algorithm can point to what medical conditions might come next for the new patient.
"We're looking at each sequence of symptoms to try to predict the rest of the sequence for a different patient," McCormick said. If a patient has already had dyspepsia and epigastric pain, for instance, heartburn might be next.
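The idea of borrowing from other patients' sequences can be illustrated with a minimal Python sketch. It is not the authors' model: it simply counts which condition came next in other patients' records once they had reported everything the new patient has reported so far. The function name `predict_next` and the toy records are hypothetical and are used only for illustration.

```python
from collections import Counter

def predict_next(history, other_histories, top_n=3):
    """Rank conditions that followed the same complaints in other patients' records."""
    seen = set(history)
    if not seen:
        return []
    counts = Counter()
    for record in other_histories:
        remaining = set(seen)
        for i, condition in enumerate(record):
            remaining.discard(condition)
            if not remaining:
                # All of the new patient's conditions have appeared in this record;
                # take the first later condition the new patient has not yet had.
                nxt = next((c for c in record[i + 1:] if c not in seen), None)
                if nxt is not None:
                    counts[nxt] += 1
                break
    total = sum(counts.values())
    return [(c, n / total) for c, n in counts.most_common(top_n)]

# Toy records, invented for illustration only.
others = [
    ["dyspepsia", "epigastric pain", "heartburn"],
    ["epigastric pain", "dyspepsia", "heartburn", "nausea"],
    ["dyspepsia", "epigastric pain", "nausea"],
    ["headache", "insomnia"],
]
ranking = predict_next(["dyspepsia", "epigastric pain"], others)
print(ranking)
```

With these toy records, heartburn ranks first because it followed the same two complaints in two of the three matching records. A raw count like this breaks down for rare conditions, which is the problem the statisticians' model is designed to address.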
The algorithm can also accommodate situations where it's statistically difficult to predict a less common condition. For instance, most patients do not experience strokes, so most models could not predict one, because they factor in only the individual patient's own history of stroke. But McCormick's model mines the medical histories of patients who went on to have a stroke and uses that analysis to make a stroke prediction.
The statisticians used medical records obtained from a multiyear clinical drug trial involving tens of thousands of patients aged 40 and older. The records included other demographic details, such as gender and ethnicity, as well as patients' histories of medical complaints and prescription medications.
They found that of the 1,800 medical conditions in the dataset, most of them – 1,400 – occurred fewer than 10 times. McCormick and his co-authors had to find a statistical way to avoid overlooking those 1,400 conditions while still alerting patients who might actually experience them.
They came up with a statistical modeling technique that is grounded in Bayesian methods, the backbone of many predictive algorithms. McCormick and his co-authors call their approach the Hierarchical Association Rule Model and are working toward making it available to patients and doctors.
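To give a feel for how a Bayesian approach can share information across patients, here is a minimal Python sketch of a standard Beta-Binomial estimate. It is a generic illustration, not the Hierarchical Association Rule Model itself, whose details are not given here. The function name `shrunk_rate`, the `prior_strength` parameter, and all numbers are assumptions made for the example.

```python
def shrunk_rate(patient_hits, patient_trials, pop_hits, pop_trials, prior_strength=10.0):
    """Posterior mean of a Beta-Binomial model whose prior is centred on the population rate.

    patient_hits / patient_trials: how often the condition followed in this patient's own record.
    pop_hits / pop_trials: how often it followed across all other patients.
    prior_strength: assumed number of pseudo-observations contributed by the population.
    """
    pop_rate = pop_hits / pop_trials
    alpha = prior_strength * pop_rate          # prior pseudo-count: condition followed
    beta = prior_strength * (1.0 - pop_rate)   # prior pseudo-count: condition did not follow
    return (patient_hits + alpha) / (patient_trials + alpha + beta)

# A new patient with no history falls back on the population rate.
new_patient = shrunk_rate(0, 0, pop_hits=30, pop_trials=1000)

# A patient with a longer record is pulled toward their own experience.
established_patient = shrunk_rate(5, 20, pop_hits=30, pop_trials=1000)

print(new_patient, established_patient)
```

When a patient's record is sparse, the estimate leans on what happened to other patients. As the record grows, the patient's own history carries more weight.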
"We hope that this model will provide a more patient-centered approach to medical care and improve patient experiences," McCormick said.
The work was funded by a Google Ph.D. fellowship awarded to McCormick and by the National Science Foundation.