Rice computer scientists race ahead in pathogen-detection DARPA challenge
Rice University
image: Ryan Doughty, Felix Quintana, Michael Nute, Joshua Stallings and Todd Treangen
Credit: Rice University
HOUSTON – (June 9, 2026) – A pinch of soil harbors hundreds of millions of microorganisms, most of them harmless. However, a lone pathogen in that multitude can be enough to harm crops, livestock or human health.
Until now, the tools to screen environmental DNA samples rapidly and with a high enough level of precision to find even a trace amount of a potentially harmful biological agent have been lacking. A team of Rice University computer scientists is building a computational tool that can get the job done.
Their work, which builds on years of prior research, has enabled them to advance to the next stage of the Defense Advanced Research Projects Agency’s Bio-attribution Challenge. Rice computer scientist and computational biologist Todd Treangen and his team spent a “fun yet hectic” April working on the first round of the challenge, which involved scanning 250 terabytes of data in eight hours, an ask that pushed the limits of what is possible with current state-of-the-art computational tools.
“That is an alarming amount of data to survey in a relatively short amount of time,” said Michael Nute, a research scientist in the Treangen group who helped coordinate the effort.
The goal was to be able to identify even trace amounts of genetic material belonging to known pathogens in a large dataset designed to mimic the content of complex environmental DNA samples, such as ones derived from soil. Such samples contain genetic material from all organisms present in a given environment, mixed together indiscriminately.
“It is a legitimate ‘needle-in-a-haystack’ problem,” said Nute, who was the lead architect of a broad pathogen database assembled in about a week in order to serve as a reference for the search code. “We had to be able to find even a single viral DNA sequence among 10 to 100 million short read DNA sequences.”
While the numbers involved are staggering, Treangen offered a comparison that illustrates the scale of the challenge in more familiar terms.
“To put this in perspective, it would be the equivalent of taking 10,000 puzzle boxes, each containing 10,000 pieces, mixing all the pieces together and then trying to identify which box a single piece belongs to, out of 100 million pieces in total,” Treangen said.
Working alongside Treangen and Nute were doctoral students Ryan Doughty and Felix Quintana, and Joshua Stallings, an undergraduate student who graduated in May. Their success attests to the Treangen group’s deep expertise in this field.
Treangen, an associate professor of computer science at Rice, has been working on computational tools to identify, characterize and assess the threat level of pathogens in metagenomic data for years. In 2020, he was a recipient of an Intelligence Advanced Research Projects Activity award as part of the Functional Genomic and Computational Assessment of Threats program.
In 2021, he joined colleagues at Rice and the Houston Health Department to monitor the city’s wastewater for a real-time understanding of community infection dynamics during the COVID-19 pandemic. Their efforts coalesced a year later into Houston Wastewater Epidemiology, a Centers for Disease Control and Prevention national center of excellence whose wastewater-based epidemiology surveillance system now monitors for several other potential pathogens, including influenza, mpox and measles.
In 2022, Treangen and colleagues developed SeqScreen, a first-of-its-kind tool designed to track and characterize DNA fragments of concern using functional labels rather than by direct matching against a set list of pathogens. Because SeqScreen can be used to screen for unknown risks, it is especially relevant at a time when technological advances have made it easier to design and synthesize novel DNA constructs.
The development of SeqScreen helped set the stage for Treangen’s receipt of a National Science Foundation CAREER Award in 2023. In his proposal abstract for the award, Treangen wrote that the goal of the project is to develop computational tools capable of tracking “yet-unseen pathogens and [prevent] intentional or unintentional misuse of synthetic DNA.”
His team’s expertise was likewise well matched to the challenge.
“This Bio-attribution challenge is on theme for the computational biology and computational microbial forensics focus of my research group,” said Treangen, who leads the AI and Computational Biology for Health (AI2Health) research cluster at Rice’s Ken Kennedy Institute. The cluster expands on his computational data analysis efforts, primarily by building practical, biologically inspired AI tools designed to make complex data easier to interpret and act on.
Felix Quintana, a computer science doctoral candidate in his group (who is co-advised by Lydia Kavraki, University Professor, the Kenneth and Audrey Kennedy Professor of Computing and director of the Ken Kennedy Institute), recently won the best paper award at the first International Workshop on Cyberbiosecurity for his paper titled “PlasmidScreen: Towards Enhanced Detection of Genetically Engineered Plasmids.”
Doughty, Quintana and Nute were all involved with the development of SeqScreen. Doughty recently led the development of Bronko, a computational tool that performs variant calling, the detection of fine-grained mutation differences in viral genomes, at unprecedented scale. He is presenting the work at this year’s Research in Computational Molecular Biology conference.
“By leveraging the experience of my team, we were able to make several intentional design choices to meet the intense requirements,” Treangen said. “There was very little sleep in the month of April, but we were all highly motivated and wanted to submit the strongest possible solution in spite of the tight timeframe.”
The team has now launched back into another sprint for the second part of the challenge.
While the first part of the competition involved thoroughly identifying known pathogens in a complex data sample under unique time and technical constraints, the next stage challenges participants to identify anomalies and biothreat events and to trace their origin efficiently from subtle genomic features and metadata. The good news is that the task falls squarely within an area the Treangen group has focused on all along.
“The specific problems raised by both Round 1 and Round 2 of the DARPA Bio-attribution challenge are exactly the things we think about every day in my research group,” Treangen said. “We’re looking forward to participating in Round 2 and embracing the enormity of the challenge.”
-30-
This news release can be found online at news.rice.edu.
Follow Rice News and Media Relations via Twitter @RiceUNews.
About Rice:
Located on a 300-acre forested campus in Houston, Texas, Rice University is consistently ranked among the nation’s top 20 universities by U.S. News & World Report. Rice has highly respected schools of architecture, business, continuing studies, engineering and computing, humanities, music, natural sciences and social sciences and is home to the Baker Institute for Public Policy. Internationally, the university maintains the Rice Global Paris Center, a hub for innovative collaboration, research and inspired teaching located in the heart of Paris. With 4,776 undergraduates and 4,104 graduate students, Rice’s undergraduate student-to-faculty ratio is just under 6-to-1. Its residential college system builds close-knit communities and lifelong friendships, just one reason why Rice is ranked No. 1 for lots of race/class interaction and No. 7 for best-run colleges by the Princeton Review. Rice is also rated as a best value among private universities by the Wall Street Journal and is included on Forbes’ exclusive list of “New Ivies.”