Cloud-computing revolution applies to evolution

NSF grant will help Rice University scientists simplify tools to trace genes across species

Rice University

A $1.1 million National Science Foundation grant to two Rice University computer science groups will allow them to build cloud-computing tools to help analyze evolutionary patterns.

With the three-year grant, Christopher Jermaine and Luay Nakhleh, both associate professors of computer science, will develop parallel-processing tools that track the evolution of genes and genomes across species.

The Rice team expects its new open-source algorithms will bring sophisticated computing techniques to researchers who have limited access to supercomputing resources but can easily rent "cloud-computing" time from the likes of Amazon or Microsoft.

Even those who have access to mainframes may find it easier to go to the cloud. The programs will be able to run parallel analyses on thousands of computers, which may not only deliver results faster but also make it possible to trace genes at scales that were not practical before.

"We're doing basic analysis of evolutionary questions," Nakhleh said. "Evolutionary biologists sample taxa from across the tree of life. They want to know, for example, how a big group of plants may have evolved."

The NSF-funded project will expand upon Bayesian inference techniques, statistical methods that combine prior knowledge with observed data to estimate how probable a hypothesis is. "They allow biologists to incorporate any prior knowledge they might have into the analysis itself," Nakhleh said.
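As a rough illustration of the idea, and not the Rice team's software, the following Python sketch applies Bayes' rule to three candidate evolutionary trees. The tree names, prior weights and likelihood values are all hypothetical, chosen only to show how prior knowledge and data combine into a posterior probability.

```python
def posterior(prior, likelihood):
    """Apply Bayes' rule over a finite set of hypotheses.

    prior:      dict mapping hypothesis -> prior probability
    likelihood: dict mapping hypothesis -> probability of the data
                given that hypothesis
    """
    # Posterior is proportional to prior times likelihood.
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    # Dividing by the total makes the posterior sum to 1.
    evidence = sum(unnormalized.values())
    return {h: w / evidence for h, w in unnormalized.items()}


# Hypothetical numbers: what a biologist believed beforehand (prior)
# and how well each tree explains the observed sequences (likelihood).
prior = {"tree_A": 0.5, "tree_B": 0.3, "tree_C": 0.2}
likelihood = {"tree_A": 0.02, "tree_B": 0.10, "tree_C": 0.01}

post = posterior(prior, likelihood)
for tree, p in sorted(post.items(), key=lambda kv: -kv[1]):
    print(f"{tree}: {p:.3f}")
```

In this toy case the data shift belief toward tree_B even though tree_A started with the higher prior. Real phylogenetic analyses face vastly more candidate trees than can be enumerated this way, which is one reason the computations become so expensive.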

"Analyzing data sets with 10 or 20 gene sequences can easily take hundreds of hours," he said. "But the tree of life has millions of sequences and is built from millions of species. There's no way traditional Bayesian techniques are even going to get close to handling that."

"A problem involving, say, 50 organisms would require tens of thousands of hours of compute time, which is doable," Jermaine said. "But if you want to move into thousands of organisms, you have to multiply that by 100. Suddenly it's not so doable."
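The arithmetic behind that comparison can be sketched in a few lines of Python. The 20,000-hour baseline is a hypothetical stand-in for "tens of thousands of hours," and the sketch assumes the work divides perfectly across machines, which real distributed computations rarely achieve.

```python
def wall_clock_hours(total_compute_hours, machines):
    """Elapsed time if the work splits evenly across machines.

    Assumes ideal linear speedup with no coordination overhead,
    which is an optimistic simplification.
    """
    return total_compute_hours / machines


# Hypothetical baseline standing in for "tens of thousands of hours".
baseline_hours = 20_000
# The hundredfold increase Jermaine describes for thousands of organisms.
scaled_hours = baseline_hours * 100

for machines in (1, 100, 1_000):
    hours = wall_clock_hours(scaled_hours, machines)
    print(f"{machines:>5} machines: {hours:,.0f} hours (~{hours / 24:,.0f} days)")
```

Under these assumptions, a job that would occupy a single machine for more than two centuries would take roughly 83 days on 1,000 machines.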

Jermaine feels computer farms that allow thousands of machines to cooperatively work on a problem hold great promise for bioinformatics in general. He recently received another NSF grant to develop tools for more machine learning in the cloud and sees phylogenetics -- the study of evolutionary relationships -- as a prime candidate for parallelization.

"We're not talking about taking a one-day calculation and taking it down to minutes," he said. "We're talking about potentially taking a years- or decadeslong computation and making it feasible by changing the underlying algorithm and making it amenable to distributed computing."

The researchers plan a turnkey approach to their software that they hope will appeal to biologists. "My impression is they want a very low bar to entry," Jermaine said. "If they have to write a lot of code or have to figure out how to use all these servers, they're just not going to do it. Hopefully our solution will be as easy for biologists as pressing a return key."

###

Follow Rice News and Media Relations via Twitter @RiceUNews

Related materials:

CS Bioinformatics Group (Nakhleh): http://bioinfo.cs.rice.edu/node?destination=node

Jermaine Research Group: http://www.cs.rice.edu/~cmj4/

Rice Department of Computer Science: http://compsci.rice.edu

Image for download:

http://news.rice.edu/wp-content/uploads/2014/09/0908_LUAH-1-web.jpg

Rice University computer scientists Luay Nakhleh, left, and Christopher Jermaine have won a National Science Foundation grant to build cloud-computing tools to help analyze evolutionary patterns. (Credit: Jeff Fitlow/Rice University)

Located on a 300-acre forested campus in Houston, Rice University is consistently ranked among the nation's top 20 universities by U.S. News & World Report. Rice has highly respected schools of Architecture, Business, Continuing Studies, Engineering, Humanities, Music, Natural Sciences and Social Sciences and is home to the Baker Institute for Public Policy. With 3,920 undergraduates and 2,567 graduate students, Rice's undergraduate student-to-faculty ratio is just over 6-to-1. Its residential college system builds close-knit communities and lifelong friendships, just one reason why Rice is highly ranked for best quality of life by the Princeton Review and for best value among private universities by Kiplinger's Personal Finance.
