Scientists from North America, Europe and China have published a paper in the Proceedings of the National Academy of Sciences that reveals important details about key transitions in the evolution of plant life on our planet.
From strange and exotic algae, mosses, ferns, trees and flowers growing deep in steamy rainforests to the grains and vegetables we eat and the ornamental plants adorning our homes, all plant life on Earth shares over a billion years of history.
"Our study generated DNA sequences from a vast number of distantly related plants, and we developed new analysis tools to understand their relationships and the timing of key innovations in plant evolution," said Jim Leebens-Mack, associate professor of plant biology at The University of Georgia and coordinating author of the paper.
As part of the One Thousand Plants (1KP) initiative, the research team is generating millions of gene sequences from plant species sampled from across the green tree of life. By resolving these relationships, the international research team is illuminating the complex processes that allowed ancient water-faring algae to evolve into land plants with adaptations to competition for light, water and soil nutrients.
Lead author Norm Wickett of the Chicago Botanic Garden described the study as "like taking a time machine back to get a glimpse of how ancient algae transitioned into the diverse array of plants we depend on for our food, building materials and critical ecological services."
"When plants colonized the land 450 million years ago, it changed the world forever," said Simon Malcomber, program director in the National Science Foundation's Division of Environmental Biology, which funded the research. "The results of this study offer new insights into the relationships among living plants."
As plants grew and thrived across the plains, valleys and mountains of Earth's landscape, rapid changes in their structures gave rise to myriad new species. The group's data also help scientists better understand the ancestry of the most common plant lineages, including flowering plants and non-flowering cone-bearing plants such as pine trees.
The investigation has also revealed a number of previously unknown molecular characteristics of some plant species that may have applications in medicine and industry.
"We are using this diverse set of sequences to make many exciting discoveries with implications across the life sciences," said Gane Ka-Shu Wong, principal investigator for 1KP, professor at the University of Alberta and associate director of BGI-Shenzhen. "For example, new algal proteins identified in our sequence data are being used to investigate how the mammalian brain works."
"Seeing the impact that 1KP has had inspired us to launch a series of 1000-species projects for organisms like insects, birds and fish," said Yong Zhang, project director at BGI.
Taming big data
The project required an extraordinary level of computing power to store and analyze the massive libraries of genetic data. That computing power was provided by the iPlant Collaborative at the University of Arizona, the Texas Advanced Computing Center (TACC), Compute-Calcul Canada and CNGB.
"This study demonstrates how life scientists are using high performance computing resources to analyze astronomically large datasets to answer fundamental questions that were previously thought to be intractable," said iPlant's Naim Matasci.
Computer scientist Tandy Warnow from the University of Illinois Urbana-Champaign and her student Siavash Mirarab developed new methods for analyzing the massive datasets used in the project. "The datasets we were analyzing in this study were too big and too challenging for existing statistical methods to handle, so we developed approaches with better accuracy," Warnow said.
Many organizations, including iPlant, CNGB and the Computational Analysis of Novel Drug Opportunities (CANDO) group at SUNY Buffalo, have joined forces to provide web-based open access to these results. In addition to sequencing all of the data and helping develop new tools to analyze these enormous datasets, BGI-Shenzhen and CNGB provided computational resources and gave the research community early and open access to the data. These resources and the repositories for the sequence data used in the study are described in a companion paper published in GigaScience.
Ultimately, the researchers hope that their project will not only help us understand the origins and development of plant life but also give scientists a new framework for the study of evolution.
"We hope that this study will help settle some longstanding scientific debates concerning plant relationships, and others will use our data to further elucidate the molecular evolution of plant genes and genomes," Leebens-Mack said.
For a full version of the papers and to see all institutions involved in the research, see http://www.