News Release

New filtering approach may improve online health information experience

Peer-Reviewed Publication

Penn State

UNIVERSITY PARK, Pa. — Patients and their caregivers are increasingly turning to online communities, such as social media, for health information about disease and treatment. But doing so may not provide relevant or useful results, particularly for patients who are not familiar with health care language. A Penn State research team has proposed a new information-filtering approach for predicting future health information needs of online community participants as they move through different stages in their illness.

“The goal of our research is to take a patient’s social media posts and online health profile — typically a paragraph written by the patient or caregiver — and use that information to look for articles from trustworthy sources that can provide health information that is helpful to the patient,” said Sharon Huang, associate dean for undergraduate studies in the Penn State College of Information Science and Technology (IST), who led the study.

Leveraging the similarities of disease progression timelines among patients with the same diagnosis, the new approach incorporates user profiles, past posts and replies to predict topic tags — labels that help categorize online content and guide users. The researchers found that adjusting the decision-making mechanism underlying topic tag predictions may provide more personalized health care information and resources for online users. They published their approach in the IEEE Journal of Biomedical and Health Informatics.

“Patients expect health care providers to supply detailed information about their disease, prognosis and treatment,” Huang said, elaborating further on the motivation for the research study. “Unfortunately, there is often a disconnect between provider and patient in terms of the language used and the actual information that is shared.”

As a result, Huang said, patients may turn to online communities for information from peers — on topics such as treatment options and side effects of medications — in lay terms that are easier to understand. The goal is to connect with people who are or have been on a similar health care journey.

This can work well for savvy users who know what specific keywords or phrases to use in their online search, according to Huang. But studies show that only 12% of U.S. adults are proficient in health literacy and able to comprehend the information provided by their health care professionals, which suggests that most patients may not have the knowledge to conduct an efficient search to find the information or community peers they seek.

“It can be frustrating for these users to have to sift through a large volume of content that doesn’t directly address their needs,” Huang said. “Further, there is a risk of misinformation or inaccurate recommendations that, when acted upon, could cause serious harm to the patient.”

The researchers hypothesized that patients with similar histories of disease progression or with similar courses of treatment would have related information needs at comparable stages. With a cancer patient, for example, this might be the stage of the cancer, the treatment or the response to treatment, such as side effects or remission. To test the idea, the researchers proposed a new approach to topic tag prediction that relies on recent advancements in the field of natural language processing, which focuses on how computers process and respond to human prompts, to better understand context from text data.

Existing tag prediction algorithms work by breaking down a prompt into individual words and selecting tags based on how frequently key words appear in the prompt, Huang said. In the current study, the researchers replaced the traditional tag prediction approach with an auxiliary sentence-generation task model — a form of artificial intelligence that uses longer phrases or full sentences to predict likely related content — focused on medical terminology.
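
To illustrate the traditional approach described above, the following is a minimal sketch of frequency-based tag prediction. It is not the researchers' code: the tag vocabulary `TAG_KEYWORDS`, the function name `predict_tags_by_frequency` and the scoring rule are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical tag vocabulary (an assumption for this sketch):
# each tag maps to keywords that signal it.
TAG_KEYWORDS = {
    "chemo side effects": {"chemo", "chemotherapy", "nausea"},
    "pain management": {"pain", "ache", "opioid"},
    "remission": {"remission", "scan"},
}


def predict_tags_by_frequency(text, top_k=1):
    """Rank tags by how often their keywords appear in the text."""
    # Break the text into individual lowercase words.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    # Score each tag by the total count of its keywords.
    scores = {
        tag: sum(counts[word] for word in keywords)
        for tag, keywords in TAG_KEYWORDS.items()
    }
    # Keep only tags with at least one keyword hit; break ties alphabetically.
    matched = [tag for tag, score in scores.items() if score > 0]
    ranked = sorted(matched, key=lambda tag: (-scores[tag], tag))
    return ranked[:top_k]
```

Because a method like this looks only at isolated words, a post that describes a treatment without using one of the listed keywords would receive no tag, which is the kind of limitation a sentence-level model is meant to address.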

To fine-tune their model, they used anonymized data from a social network designed to offer health-related information and emotional support for patients and their caregivers.

“By comparing the internet profiles, past posts and replies of a health information seeker with the profiles and past online interactions of other users with similar health experiences, we can develop predictions of topic tags that describe the future information needs of the information seeker,” Huang said. “The result is an information-filtering or recommendation system that is tailored to the needs of users of online health communities.”

The model combines the patient’s posted information, keywords corresponding to those posts, and tags from users with similar illness progressions into a sentence-like string of text, Huang explained. The end of the sentence prompts the model to predict future topic tags.
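
As a rough illustration of that sentence-like input, the sketch below assembles a profile, posts, keywords and similar users' tags into a single string that ends with a slot for the model to fill. The release does not describe the actual template, so the field labels, the closing phrase, the `[MASK]` placeholder and the function name `build_auxiliary_sentence` are assumptions.

```python
def build_auxiliary_sentence(profile, posts, keywords, similar_user_tags,
                             mask_token="[MASK]"):
    """Combine patient information into one sentence-like string.

    The layout here is a hypothetical template, not the one used in the study.
    """
    parts = [
        "Profile: " + profile.strip(),
        "Posts: " + " ".join(post.strip() for post in posts),
        "Keywords: " + ", ".join(keywords),
        "Tags from similar users: " + ", ".join(similar_user_tags),
        # The final clause leaves a slot for the model to predict a tag.
        "The next topic tag is " + mask_token + ".",
    ]
    return " ".join(parts)


example = build_auxiliary_sentence(
    profile="Diagnosed with stage II breast cancer.",
    posts=["My oncologist mentioned starting chemotherapy next month."],
    keywords=["chemotherapy", "oncologist"],
    similar_user_tags=["chemo side effects", "treatment options"],
)
```

A language model fine-tuned on strings of this kind could then be asked to fill the final slot, so that the prediction draws on the whole context rather than on individual word counts.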

For example, a cancer patient’s posts can contain detailed information about disease status and progression and mention the possibility of getting chemotherapy. After analyzing the patient’s bio and posts, along with the bios and posts of patients with similar disease progressions, the tag prediction algorithm would predict “chemo side effects” as a tag that the website could use to populate the user’s page with relevant medical articles from trusted sources, Huang said.

When tested on anonymized data from a Facebook group for chronic pain patients, the approach predicted topic tags that led to more accurate search results for the user.

“Our topic recommendation system, using disease timelines of similar users for topic tag predictions, ultimately improves the accuracy, timeliness and personalization of health information searches online,” Huang said.

As part of their future work, the researchers said they intend to improve their approach so that it dynamically updates retrieved information according to the user’s health condition, with the goal of developing a personal health library accessible through a mobile app.

Huang’s collaborators from the Penn State College of IST included Amogh Adishesha, a recent graduate of the doctoral program; Fariha Azhar, former graduate student; Lily Jakielaszek, former undergraduate student; Vasant Honavar, Huck Chair in Biomedical Data Sciences and Artificial Intelligence; Fenglong Ma, assistant professor; Prasenjit Mitra, professor; and Xinning Gui, assistant professor. Also collaborating were Chandra Belani, Penn State Cancer Institute; and Peixuan Zhang, Penn State College of Engineering.

This work was supported by the National Science Foundation Center for Health Transformation and the Penn State Institute of Computational and Data Sciences. provided the data.
