University of Texas at Arlington computer scientist Jacob Luber has earned a five-year, $2 million grant from the Cancer Prevention and Research Institute of Texas (CPRIT) to create a database that contains every publicly available cancer dataset from the National Cancer Institute (NCI).
The database will allow researchers and physicians to identify traits that cancer patients share and to improve and expand treatments based on that data. Such datasets are currently too large for physicians to access easily.
An assistant professor in the Computer Science and Engineering Department and a member of UTA’s Multi-Interprofessional Center for Health Informatics (MICHI), Luber is developing an algorithm that compresses bioinformatic data related to cancer. Using deep-learning techniques, he is exploring how to reduce the size of extremely large datasets from up to three petabytes to a more manageable 12 terabytes. A petabyte is the equivalent of 1,000 terabytes of data.
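To put those figures in perspective, the short Python sketch below works out the implied reduction. It assumes the decimal convention stated above (1 petabyte = 1,000 terabytes); the function name `compression_ratio` is purely illustrative and is not part of Luber's project.

```python
# Illustrative arithmetic only; names here are hypothetical.
TB_PER_PB = 1_000  # decimal convention: 1 petabyte = 1,000 terabytes


def compression_ratio(original_pb: float, compressed_tb: float) -> float:
    """Return how many times smaller the compressed dataset is."""
    original_tb = original_pb * TB_PER_PB
    return original_tb / compressed_tb


ratio = compression_ratio(original_pb=3, compressed_tb=12)
print(f"Reduction of roughly {ratio:.0f}:1")  # 3,000 TB / 12 TB = 250
```

Under that assumption, the compressed data would be about 250 times smaller than the original three petabytes.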
“With deep learning, we can take a high-dimensional image with proteomic data overlaid on top and compress it, then use it to find similarities between patients,” said Luber, who also is a CPRIT Scholar in Cancer Research. “Once those similarities are identified, physicians could mine electronic health records to see what treatments were most effective and adjust treatment regimens accordingly.”
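As a rough illustration of the idea Luber describes, the sketch below compares patients by the similarity of short numeric vectors standing in for compressed data. It is a simplified assumption, not the project's actual method: the vectors, patient labels, and the choice of cosine similarity are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, patients):
    """Return the ID of the stored patient whose vector is closest to the query."""
    return max(patients, key=lambda pid: cosine_similarity(query, patients[pid]))


# Hypothetical compressed representations, invented for this example.
patients = {
    "patient_A": [1.0, 0.0, 0.2],
    "patient_B": [0.0, 1.0, 0.8],
}
new_patient = [0.9, 0.1, 0.3]
print(most_similar(new_patient, patients))
```

In a setting like the one described, the closest matches could then point physicians toward records of patients with similar traits.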
Marion Ball, MICHI executive director, said Luber’s work is a valuable contribution to the health informatics efforts of the center.
“MICHI conducts research in two focus areas central to the long-term strategic goals of UTA: data-driven discovery and health and the human condition,” said Ball, Presidential Distinguished Professor and the Raj and Indra Nooyi Endowed Distinguished Chair in Bioengineering. “Dr. Luber’s research program perfectly bridges these two areas by utilizing massive-scale supercomputing to understand health-oriented data and then drive novel research programs addressing disease both through algorithmic approaches in the clinic and machine learning-guided drug discovery.”
Ultimately, the research would allow doctors to input patient data and access the larger index through a computer at the patient’s bedside or in an exam room. The imaging index will also be used to further drug discovery efforts by helping researchers better understand how chemicals derived from the microbiome affect how the body’s own immune system fights cancer.
“Dr. Luber’s postdoctoral experience at NCI gives him a unique perspective into how to approach the problem of making extremely large datasets accessible to physicians and others who do not have access to supercomputers,” said Hong Jiang, chair of the Computer Science and Engineering Department. “His research is an excellent application of deep learning that could make a big difference in the effectiveness of cancer treatments.”
Luber, who has secondary appointments in the Bioengineering Department at UTA and at the NCI’s Cancer Data Science Lab, used UT System Science and Technology Acquisition and Retention funds to purchase a supercomputer for this project. The CPRIT grant will allow him to hire several doctoral students and postdoctoral researchers to assist him in creating the index.
- Written by Jeremy Agor, College of Engineering