The genomes of modern birds tell a story of how they emerged and evolved after the mass extinction that wiped out dinosaurs and almost everything else 66 million years ago. That story is now coming to light, thanks to an ambitious international collaboration that has been underway for four years.
The first findings of the Avian Phylogenomics Consortium are being reported nearly simultaneously in 29 papers -- eight papers in a Dec. 12 special issue of Science and 21 more in Genome Biology, GigaScience and other journals. The full set of papers in Science and other journals can be accessed at http://www.
Scientists already knew that the birds that survived the mass extinction experienced a rapid burst of evolution. But the family tree of modern birds has confused biologists for centuries, and the molecular details of how birds arrived at the spectacular biodiversity of more than 10,000 species are barely known.
To resolve these fundamental questions, a consortium led by Guojie Zhang of the National Genebank at BGI in China and the University of Copenhagen, Erich D. Jarvis of Duke University and the Howard Hughes Medical Institute and M. Thomas P. Gilbert of the Natural History Museum of Denmark, has sequenced, assembled and compared full genomes of 48 bird species. The species include the crow, duck, falcon, parakeet, crane, ibis, woodpecker, eagle and others, representing all major branches of modern birds.
"BGI's strong support and four years of hard work by the entire community have enabled us to answer numerous fundamental questions to an unprecedented scale," said Guojie Zhang. "This is the largest whole genomic study across a single vertebrate class to date. The success of this project can only be achieved with the excellent collaboration of all the consortium members."
"Although an increasing number of vertebrate genomes are being released, to date no single study has deliberately targeted the full diversity of any major vertebrate group," added Tom Gilbert. "This is precisely what our consortium set out to do. Only with this scale of sampling can scientists truly begin to fully explore the genomic diversity within a full vertebrate class."
"This is an exciting moment," said neuroscientist Erich Jarvis. "Lots of fundamental questions now can be resolved with more genomic data from a broader sampling. I got into this project because of my interest in birds as a model for vocal learning and speech production in humans, and it has opened up some amazing new vistas on brain evolution."
This first round of analyses suggests some remarkable new ideas about bird evolution. The first flagship paper published in Science presents a well-resolved new family tree for birds, based on whole-genome data. The second flagship paper describes the big picture of genome evolution in birds. Six other papers in the special issue of Science describe how vocal learning may have independently evolved in a few bird groups and in the human brain's speech regions; how the sex chromosomes of birds came to be; how birds lost their teeth; how crocodile genomes evolved; ways in which singing behavior regulates genes in the brain; and a new method for phylogenetic analysis with large-scale genomic data.
The Avian Phylogenomics Consortium has so far involved more than 200 scientists from 80 institutions in 20 countries, including the BGI in China, the University of Copenhagen, Duke University, the University of Texas at Austin, the Smithsonian Institution, the Chinese Academy of Sciences, Louisiana State University and many others.
A Clearer Picture of the Bird Family Tree
Previous attempts to reconstruct the avian family tree using partial DNA sequencing or anatomical and behavioral traits have met with contradiction and confusion. Because modern birds split into species early and in such quick succession, they did not evolve enough distinct genetic differences at the genomic level to clearly determine their early branching order, the researchers said. To resolve the timing and relationships of modern birds, the consortium authors used whole-genome DNA sequences to infer the bird species tree.
"In the past, people have been using 10 to 20 genes to try to infer the species relationships," Jarvis said. "What we've learned from doing this whole-genome approach is that we can infer a somewhat different phylogeny [family tree] than what has been proposed in the past. We've figured out that protein-coding genes tell the wrong story for inferring the species tree. You need non-coding sequences, including the intergenic regions. The protein coding sequences, however, tell an interesting story of proteome-wide convergence among species with similar life histories."
This new tree resolves the early branches of Neoaves (new birds) and supports conclusions about some relationships that have been long-debated. For example, the findings support three independent origins of waterbirds. They also indicate that the common ancestor of core landbirds, which include songbirds, parrots, woodpeckers, owls, eagles and falcons, was an apex predator, which also gave rise to the giant terror birds that once roamed the Americas.
The whole-genome analysis dates the evolutionary expansion of Neoaves to the time of the mass extinction event 66 million years ago that killed off all dinosaurs except some birds. This contradicts the idea that Neoaves blossomed 10 to 80 million years earlier, as some recent studies suggested.
According to these new genomic data, only a few bird lineages survived the mass extinction. They gave rise to the more than 10,000 Neoaves species that make up 95 percent of all bird species living with us today. The ecological niches freed up by the extinction event likely allowed a rapid radiation of bird species in less than 15 million years, which explains much of modern bird biodiversity.
Increasingly sophisticated and more affordable genomic sequencing technologies and the advent of computational tools for reconstructing and comparing whole genomes have allowed the consortium to resolve these controversies with better clarity than ever before, the researchers say.
With about 14,000 genes per species, the size of the datasets and the complexity of analyzing them required several new approaches to computing evolutionary family trees. These were developed by computer scientists Tandy Warnow at the University of Illinois at Urbana-Champaign; Siavash Mirarab, a student at the University of Texas at Austin; and Alexandros Stamatakis at the Heidelberg Institute for Theoretical Studies. Their algorithms required the use of parallel processing supercomputers at the Munich Supercomputing Center (LRZ), the Texas Advanced Computing Center (TACC) and the San Diego Supercomputer Center (SDSC).
"The computational challenges in estimating the avian species tree used around 300 years of CPU time, and some analyses required supercomputers with a terabyte of memory," Warnow said.
The bird project also had support from the Genome 10K Consortium of Scientists (G10K), an international science community working toward rapidly assessing genome sequences for 10,000 vertebrate species.
"The Avian Genomics Consortium has accomplished the most ambitious and successful project that the G10K Project has joined or endorsed," said G10K co-leader Stephen O'Brien, who co-authored a commentary on the bird sequencing project appearing in GigaScience.
A Genomic Perspective of Avian Evolution and Biodiversity
For all their biological intricacies, birds are surprisingly light on DNA. A study led by Zhang, Cai Li and the consortium authors found that compared to other reptile genomes, avian genomes contain fewer of the repeating sequences of DNA and lost hundreds of genes in their early evolution after birds split from other reptiles.
"Many of these genes have essential functions in humans, such as in reproduction, skeleton formation and lung systems," Zhang said. "The loss of these key genes may have a significant effect on the evolution of many distinct phenotypes of birds. This is an exciting finding, because it is quite different from what people normally think, which is that innovation is normally created by new genetic material, not the loss of it. Sometimes, less is more."
From the whole chromosome level to the order of genes, this group found that the genomic structure of birds has stayed remarkably the same among species for more than 100 million years. The rate of gene evolution across all bird species is also slower compared to mammals.
Yet some genomic regions display relatively faster evolution in species with similar lifestyles or phenotypes, such as those involving vocal learning. This pattern, known as convergent evolution, may be the underlying mechanism that explains how distant bird species evolved similar phenotypes independently. Zhang said these analyses of particular gene families begin to explain how birds evolved a lighter skeleton, a distinct lung system, dietary specialties, color vision, as well as colorful feathers and other sex-related traits.
The new studies have shed light on several other questions about birds, including:
How did vocal learning evolve? Eight studies in the package examined the subject of vocal learning. According to new evidence in the two flagship papers, vocal learning evolved independently at least twice, and was associated with convergent evolution in many proteins. A Science study led by Andreas Pfenning, Alexander Hartemink, Jarvis and others at Duke, in collaboration with researchers at the Allen Institute for Brain Science in Seattle and the RIKEN Institute in Japan, found that the specialized song-learning brain circuitry of vocal learning birds (songbirds, parrots and hummingbirds) and human brain speech regions have convergent changes in the activity of more than 50 genes. Most of these genes are involved in forming neural connections. Osceola Whitney, Pfenning and Anne West, also of Duke, found in another Science study that singing is associated with the activation of 10 percent of the expressed genome, with diverse activation patterns in different song-learning regions of the brain, controlled by epigenetic regulation of the genome. Duke's Mukta Chakraborty and others found in a PLoS ONE study that parrots have a song system within a song system, with the surrounding song system unique to them. This might explain their greater ability to imitate human speech. In a BMC Genomics study, Morgan Wirthlin, Peter Lovell and Claudio Mello from Oregon Health & Science University found unique genes in the song-control brain regions of songbirds.
The XYZW of sex chromosomes. Just as the sex of humans is determined by the X and Y chromosomes, the sex of birds is controlled by the Z and W chromosomes. The W makes birds female, just as the Y makes humans male. Most mammals share a similar evolutionary history of the Y chromosome, which now contains many degenerated genes that no longer function and only a few active genes related to "maleness." A Science study led by Qi Zhou and Doris Bachtrog from the University of California, Berkeley, and Zhang found that half of bird species still contain substantial numbers of active genes in their W chromosomes. This challenges the classic view that the W chromosome is a "graveyard of genes" like the human Y.
This group also found that bird species are at drastically different states of sex chromosome evolution. For example, the ostrich and emu, which belong to one of the older branches of the bird family tree, have sex chromosomes resembling their ancestors. Yet some modern birds such as the chicken and zebra finch have sex chromosomes that contain few active genes. This opens a new set of questions on how the diversity of sex chromosomes may drive the diversity of sex differences in the outward appearance of various bird species. Peacocks and peahens are dramatically different; male and female crows are indistinguishable.
How did birds lose their teeth? In a Science study led by Robert Meredith from Montclair State University and Mark Springer from the University of California, Riverside, a comparison between the genomes of living bird species and those of vertebrate species that have teeth identified key mutations in the parts of the genome that code for enamel and dentin, the building blocks of teeth. The evidence suggests that five tooth-related genes were disabled within a short time period in the common ancestor of modern birds more than 100 million years ago.
What's the connection between birds and dinosaurs? Unlike mammals, birds (along with reptiles, fish and amphibians) have a large number of tiny microchromosomes. These smaller packages of gene-rich material are thought to have been present in their dinosaur ancestors. A study of genome karyotype structure in BMC Genomics analyzed whole genomes of the chicken, turkey, Peking duck, zebra finch and budgerigar. It found the chicken has the most similar overall chromosome pattern to an avian ancestor, which was thought to be a feathered dinosaur. This work was led by Darren Griffin and Michael Romanov from the University of Kent, and by Dennis Larkin and Marta Farré from the Royal Veterinary College, University of London.
Another study in Science examined birds' closest living relatives, the crocodiles. This team, led by Ed Green and Benedict Paten from the University of California, Santa Cruz, David Ray from Texas Tech University and Ed Braun from the University of Florida, found that crocodiles have one of the slowest-evolving genomes. The researchers were able to infer the genome sequence of the common ancestor of birds and crocodilians (archosaurs) and therefore all dinosaurs, including those that went extinct 66 million years ago.
Do differences in gene trees versus species trees matter? In the phylogenomics flagship study by Jarvis and others, the consortium found that no gene tree has a history exactly the same as the species tree, partly due to a process called incomplete lineage sorting. Another Science study, led by Tandy Warnow at the University of Texas and the University of Illinois, and her student Siavash Mirarab, developed a new computational approach called "statistical binning." They used this approach to show it does not matter much that the gene trees differ from the species tree because they were able to infer the first coalescent-based, genome-scale species tree, combining gene trees with similar histories to accurately infer a species tree.
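To make the gene tree versus species tree distinction concrete, here is a minimal Python sketch. It is not the consortium's statistical binning or coalescent-based method. It is a toy illustration that assumes four hypothetical taxa (A, B, C, D), five invented gene trees, and a simple majority-rule summary as a stand-in. It counts how many groupings differ between each gene tree and the species tree, then shows that groupings shared by most gene trees can still recover the species tree even when individual gene trees disagree with it.

```python
from collections import Counter


def clades(tree):
    """Return every clade in a rooted tree as a frozenset of leaf names.

    A tree is either a leaf name (str) or a tuple of subtrees.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def discordance(gene_tree, species_tree):
    """Count clades present in one tree but not the other."""
    return len(clades(gene_tree) ^ clades(species_tree))


def majority_clades(gene_trees):
    """Return clades that appear in more than half of the gene trees."""
    counts = Counter(c for t in gene_trees for c in clades(t))
    return {c for c, n in counts.items() if n > len(gene_trees) / 2}


# Hypothetical species tree and gene trees (invented for illustration only).
species_tree = ((("A", "B"), "C"), "D")
gene_trees = [
    ((("A", "B"), "C"), "D"),  # agrees with the species tree
    ((("A", "C"), "B"), "D"),  # discordant: groups A with C
    ((("A", "B"), "C"), "D"),
    ((("B", "C"), "A"), "D"),  # discordant: groups B with C
    ((("A", "B"), "C"), "D"),
]

for i, gene_tree in enumerate(gene_trees, start=1):
    print(f"gene tree {i}: discordance = {discordance(gene_tree, species_tree)}")

print("majority clades match species tree:",
      majority_clades(gene_trees) == clades(species_tree))
```

In real data the signal is far noisier than in this toy case, which is why the consortium needed purpose-built methods rather than a simple majority vote.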
Do bird genomes carry fewer virus sequences than other species? Mammalian genomes harbor a diverse set of genomic "fossils" of past viral infections called "endogenous viral elements" (EVEs). A study published in Genome Biology led by Jie Cui of Duke-NUS Graduate Medical School in Singapore, Edward Holmes of the University of Sydney and Zhang, found that bird species had 6-13 times fewer EVE infections in their past than mammals. This finding is consistent with the fact that birds have smaller genomes than mammals. It also suggests birds may either be less susceptible to viral invasions or better able to purge viral genes.
When did colorful feathers evolve? Elaborate, colorful feathers are thought to be evolutionarily advantageous, giving a male bird in a given species an edge over his competitors when it comes to mating. Zhang's flagship paper in Science, which is further analyzed by Matthew Greenwold and Roger Sawyer from the University of South Carolina in a companion study in BMC Evolutionary Biology, found that genes involved in feather coloration evolved more quickly than other genes in eight of 46 bird lineages. Waterbirds have the lowest number of beta keratin feather genes, landbirds have more than twice as many, and in domesticated pet and agricultural bird species, there are eight times more of these genes.
What happens to species facing extinction or recovering from near-extinction? Birds are like the proverbial canaries in the coal mine because of their sensitivity to environmental changes that cause extinction. In a Genome Biology study led by Shengbin Li, Cheng Cheng and Jun Yu from Xi'an Jiaotong University and Jarvis, researchers analyzed the genomes of species that have recently gone nearly extinct, including the crested ibis in Asia and the bald eagle in the Americas. They found genes that break down environmental toxins have a higher rate of mutations in these species and there is lower diversity of immune system genes in endangered species. In a recovering crested ibis population, genes involved in brain function and metabolism are evolving more rapidly. The researchers found more genomic diversity in the recovering population than was expected, giving greater hope for species conservation.
The Start of Something Bigger
This sweeping genome-level comparison of an entire class of life is being powered by frozen bird tissue samples collected over the past 30 years by museums and other institutions around the world. Samples are sent as fingernail-sized chunks of frozen flesh, mostly to Duke University and the University of Copenhagen, for DNA separation. Most of the genome sequencing and critical initial analyses of the genomes have then been conducted by the BGI in China.
The avian genome consortium is now creating a database that will be made publicly available in the future for scientists to study the genetic basis of complex avian traits.
Setting up the pipeline for the large-scale study of whole genomes -- collecting and organizing tissue samples, extracting the DNA, analyzing its quality, sequencing and managing torrents of new data -- has been a massive undertaking. But the scientists say their work should help inform other major efforts for the comprehensive sequencing of vertebrate classes. To encourage other researchers to dig through this 'big data' and discover new patterns that were not seen in small-scale data before, the avian genome consortium has released the full dataset to the public in GigaScience, and in NCBI, ENSEMBL and CoGe databases.
Under the leadership of Dave Burt, the National Avian Research Facility at the Roslin Institute and Edinburgh University, UK, has created genome browser databases based on the ENSEMBL model for 48 species.
This project received its main financial support from BGI and the China National GeneBank, as well as from the U.S. National Institutes of Health, the U.S. National Science Foundation, the Howard Hughes Medical Institute, the Lundbeck Foundation and the Danish National Research Foundation, and from the many other sources that fund the consortium's individual scientists.
Other leaders of the Avian Phylogenomics Project include, but are not limited to, Tandy Warnow of the University of Illinois; Stephen O'Brien, David Haussler and Oliver Ryder of the Genome 10K consortium; Peter Houde of New Mexico State University; Edward Braun of the University of Florida; Joel Cracraft of the American Museum of Natural History; David Mindell of the University of California, San Francisco; Alexandros Stamatakis of the Heidelberg Institute for Theoretical Studies and Karlsruhe Institute of Technology; Jon Fjeldsa and Carsten Rahbek of the University of Copenhagen; Scott Edwards of Harvard University; David Burt of the Roslin Institute of Edinburgh University; Gary Graves of the Smithsonian Institution; Robb Brumfield of Louisiana State University; Agostinho Antunes of the Universidade do Porto in Portugal; Darren Griffin of the University of Kent; Dennis Larkin from the Royal Veterinary College, University of London; Qi Zhou of the University of California, Berkeley; and Wang Jun of BGI.