Researchers at the University of California San Diego have developed an approach that uses machine learning to identify and predict which genes make infectious bacteria resistant to antibiotics. The approach was tested on strains of Mycobacterium tuberculosis--the bacteria that cause tuberculosis (TB) in humans. It identified 33 known and 24 new antibiotic resistance genes in these bacteria.
The researchers say the approach can be used on other infection-causing pathogens, including staph and bacteria that cause urinary tract infections, pneumonia and meningitis. The work was recently published in Nature Communications.
"Knowing which genes are conferring antibiotic resistance could change the way infectious diseases are treated in the future," said co-senior author Jonathan Monk, research scientist in the Department of Bioengineering at UC San Diego. "For example, if there's a persistent infection of TB in the clinic, physicians can sequence that strain, look at its genes and figure out which antibiotics it's resistant to and which ones it's susceptible to, then prescribe the right antibiotic for that strain."
"This could open up opportunities for personalized treatment for your pathogen. Every strain is different and should potentially be treated differently," said co-senior author Bernhard Palsson, Galletti Professor of Bioengineering at the UC San Diego Jacobs School of Engineering. "Through this machine learning analysis of the pan-genome--the complete set of all the genes in all the strains of a bacterial species--we can better understand the properties that make these strains different."
The team trained a machine learning algorithm using the genome sequences and phenotypes--the physical traits or characteristics that can be observed, such as antibiotic resistance--of more than 1,500 strains of M. tuberculosis. From these inputs, the algorithm predicted a set of genes and variant forms of these genes, called alleles, that cause antibiotic resistance. Of the predicted genes, 33 matched known antibiotic resistance genes; the remaining 24 were new predictions that have not yet been experimentally tested.
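To make the idea of linking alleles to a resistance phenotype concrete, here is a minimal sketch. It is not the team's algorithm: it swaps in a much simpler technique, scoring each allele by the phi coefficient (the correlation between two binary variables) between its presence across strains and the resistance label. The strain data, the allele names and the function names are all hypothetical and chosen only for illustration.

```python
from math import sqrt

def phi_coefficient(presence, resistant):
    """Phi correlation between two equal-length lists of 0/1 values."""
    pairs = list(zip(presence, resistant))
    n11 = sum(1 for p, r in pairs if p and r)          # allele present, resistant
    n10 = sum(1 for p, r in pairs if p and not r)      # allele present, susceptible
    n01 = sum(1 for p, r in pairs if not p and r)      # allele absent, resistant
    n00 = sum(1 for p, r in pairs if not p and not r)  # allele absent, susceptible
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return 0.0 if denom == 0 else (n11 * n00 - n10 * n01) / denom

def rank_alleles(allele_matrix, resistant):
    """Rank alleles from most to least positively associated with resistance.

    allele_matrix maps an allele name to a 0/1 list with one entry per strain.
    """
    scores = {name: phi_coefficient(col, resistant)
              for name, col in allele_matrix.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data for six hypothetical strains (1 = resistant, 0 = susceptible).
resistant = [1, 1, 1, 0, 0, 0]
allele_matrix = {
    "geneA_allele2": [1, 1, 1, 0, 0, 0],  # present only in resistant strains
    "geneB_allele1": [1, 0, 1, 0, 1, 0],  # weakly associated
    "geneC_allele3": [0, 0, 0, 1, 1, 1],  # present only in susceptible strains
}

ranking = rank_alleles(allele_matrix, resistant)
for name, score in ranking:
    print(f"{name}: {score:+.2f}")
```

A trained classifier can weigh many alleles jointly, which a per-allele score like this cannot do. The sketch only illustrates the input shape (strains by alleles, plus a phenotype label) and the kind of ranked output such an analysis produces.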
The researchers further analyzed the algorithm's predictions and identified combinations of alleles that could be interacting to make a strain antibiotic resistant. They also mapped these alleles onto crystal structures of M. tuberculosis proteins (published in the Protein Data Bank). They found that some of these alleles appeared in certain structural regions of the proteins.
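One simple way to look for allele combinations of this kind is to ask whether strains carrying two alleles together are resistant more often than strains carrying either allele alone. The sketch below does that on toy data. It is an illustrative assumption, not the interaction analysis used in the study, and every name and value in it is hypothetical.

```python
from itertools import combinations

def resistance_rate(mask, resistant):
    """Fraction of resistant strains among the strains where mask is true."""
    selected = [r for m, r in zip(mask, resistant) if m]
    return sum(selected) / len(selected) if selected else 0.0

def pair_scores(allele_matrix, resistant):
    """Score each allele pair by how much the joint resistance rate
    exceeds the better of the two single-allele resistance rates."""
    results = []
    for a, b in combinations(sorted(allele_matrix), 2):
        both = [x and y for x, y in zip(allele_matrix[a], allele_matrix[b])]
        joint = resistance_rate(both, resistant)
        best_single = max(resistance_rate(allele_matrix[a], resistant),
                          resistance_rate(allele_matrix[b], resistant))
        results.append((a, b, joint - best_single))
    return sorted(results, key=lambda t: t[2], reverse=True)

# Toy data for six hypothetical strains (1 = resistant, 0 = susceptible).
resistant = [1, 1, 0, 0, 0, 0]
allele_matrix = {
    "geneX_allele1": [1, 1, 1, 1, 0, 0],
    "geneY_allele2": [1, 1, 0, 0, 1, 1],
    "geneZ_allele1": [0, 1, 1, 0, 1, 0],
}

top_pairs = pair_scores(allele_matrix, resistant)
for a, b, gain in top_pairs:
    print(f"{a} + {b}: gain {gain:+.2f}")
```

In this toy set, each of the first two alleles alone appears in resistant strains only half the time, but every strain carrying both is resistant, so that pair ranks first. With real data, a gain like this would only be a hypothesis to test, since co-occurring alleles can reflect shared ancestry rather than a functional interaction.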
"We did interactional and structural analyses to dig deeper and develop more intricate hypotheses for how these genes could be contributing to antibiotic resistance phenotypes," said first author Erol Kavvas, a bioengineering Ph.D. student in Palsson's research group. "These findings could aid future experimental investigations on whether structural grouping of these alleles plays a role in their conferral of antibiotic resistance."
The results of this study are all computational, so the team is looking to work with experimental researchers to test whether the 24 new genes predicted by the algorithm indeed confer antibiotic resistance in M. tuberculosis.
Future studies will involve applying the team's machine learning approach to the leading infectious bacteria, known as the ESKAPE pathogens: Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa and Enterobacter species. As a next step, the team is integrating genome-scale models of metabolic networks with their machine learning approach to understand mechanisms underlying the evolution of antibiotic resistance.
Full paper: "Machine learning and structural analysis of Mycobacterium tuberculosis pan-genome identifies genetic signatures of antibiotic resistance." Co-authors include Edward Catoiu, Nathan Mih, James T. Yurkovich, Yara Seif, Nicholas Dillon, David Heckmann, Amitesh Anand, Laurence Yang and Victor Nizet, all at UC San Diego.
This research was supported by the NIH's National Institute of Allergy and Infectious Diseases (NIAID, grant 1-U01-AI124316-01) and the NIH National Institute of General Medical Sciences (award U01GM102098).