A new computational technique developed at The University of Texas at Austin has enabled an international consortium to produce an avian tree of life that points to the origins of various bird species. A graduate student at the university is a lead author on papers in the journal Science that describe the new technique and report the consortium's findings about bird evolution.
The results of the four-year effort -- which relied in part on supercomputers at the university's Texas Advanced Computing Center (TACC) -- shed light on the timing of a "big bang" in bird evolution, rearrange evolutionary relationships between some bird species and provide new insights on the origins of song pattern recognition in birds, as well as a host of other avian traits.
To build the new bird tree of life, researchers first sequenced the complete genomes of 48 living bird species. With about 14,000 genomic regions per species, the size of the data sets and the complexity of analyzing them required a new computing method. Its development was led by computer scientists Tandy Warnow, an adjunct professor at The University of Texas at Austin and professor at the University of Illinois at Urbana-Champaign, and Siavash Mirarab, a graduate student at The University of Texas at Austin.
Previous bird evolutionary trees were based on analyses of a few dozen genes, whereas this latest study analyzed entire bird genomes. Those earlier studies covered more bird species (about 200 compared with 48), but the latest study uses hundreds of times more genetic data per species. As a result, the new bird family tree draws on far more data and yields some surprising findings, such as that flamingos are more closely related to pigeons than to pelicans and other water birds.
"In the computer science community, we often focus on how to make faster tools to analyze big data sets," said Mirarab, co-lead author on one of Science's major papers about the project. "This project is exciting because it shows that it's not just about being bigger and faster. Simply having more data doesn't make you more accurate. You have to come up with more intelligent ways to analyze your data."
By testing the new technique, called statistical binning, on simulated data sets, the team demonstrated that their approach is more accurate than previous techniques.
The entire effort to construct an avian evolutionary tree took 400 years of CPU time and required the use of parallel processing supercomputers at TACC, the Munich Supercomputing Center and the San Diego Supercomputing Center. For the statistical binning portion alone, developing and testing the method took over 100 years of CPU time, divided between TACC and the Condor Cluster in the university's Department of Computer Science.
"TACC was essential," said Mirarab. "It's where most of the work on the statistical binning paper was done. We couldn't have done it without these supercomputers."
Mirarab and Warnow are part of the Avian Phylogenomics Consortium, which has so far involved more than 200 scientists from 80 institutions in 20 countries.
The consortium is led by Guojie Zhang of the National Genebank at BGI in China and the University of Copenhagen, Erich D. Jarvis of Duke University and the Howard Hughes Medical Institute, and M. Thomas P. Gilbert of the Natural History Museum of Denmark.
The group's first findings are being reported nearly simultaneously in 23 papers -- eight in a Dec. 12 special issue of Science and 15 more in Genome Biology, GigaScience and other journals.
Mirarab was also co-lead author on a paper in the Proceedings of the National Academy of Sciences in October that used a different computational technique to reveal important details about key transitions in the evolution of plant life on our planet.
The National Science Foundation and the Howard Hughes Medical Institute funded Warnow and Mirarab.