Viruses are the largest known group of biological agents. Now an international team of scientists, with the participation of the Institute for Plant Molecular and Cellular Biology (IBMCP), a joint centre of the Universitat Politècnica de València (UPV) and the Spanish National Research Council (CSIC), has taken an important step towards understanding their diversity. Using a new computational tool that analysed 5.7 million biological samples collected around the world over the last 15 years, the team has discovered more than 130,000 new RNA viruses, the class of viruses that includes SARS-CoV-2, the coronavirus currently causing the COVID-19 pandemic. The finding, published in the journal Nature, represents a tenfold increase in the number of RNA virus species described to date.
To carry out this analysis, the multidisciplinary team developed Serratus, a cloud computing infrastructure built on Amazon Web Services (AWS). Using a cluster of 22,500 computer processors (CPUs), it enabled massive searches for viral sequences across the millions of gigabytes (petabytes) of sequencing data available in public databases.
Detailed analysis of certain viral families led to the discovery of more than 30 new coronavirus species. These include interesting examples in aquatic vertebrates such as fish and amphibians, whose coronaviruses have a genome segmented into two fragments. This feature has been described in other virus families but had not previously been detected in any coronavirus.
At the Institute for Plant Molecular and Cellular Biology, located in the Polytechnic City of Innovation, UPV scientists used Serratus to analyse the virus that causes human hepatitis D, a viral agent called Delta, of minimal genomic size and unknown origin. This allowed Marcos de la Peña Rivero, a CSIC researcher at the IBMCP, to detect similar viruses in a multitude of other animals, including not only mammals and other vertebrates but also invertebrates. "Surprisingly, these viruses were also found in environmental samples collected from lakes and soils all over the world, and their hosts are unknown for the time being," reveals De la Peña.
Evolutionary connection between human and plant viruses in the environment
Moreover, environmental samples containing hepatitis D-like viruses revealed novel viral forms with ultra-compact genomes (only 300 bases, the chemical units that make up the genetic material). "This discovery allows us to advance a close evolutionary connection between viruses as distant as human hepatitis D and plant subviral agents called viroids," says the CSIC researcher.
Both the database of all the viruses obtained in the course of this study and the set of tools developed are freely and openly available (http://www.serratus.io). These tools can be of great use in characterising the diversity of all viruses on our planet and in preparing the world for possible new pandemics, whose devastating consequences we are now experiencing with emerging viral diseases such as COVID-19, caused by the SARS-CoV-2 coronavirus.
The IBMCP is the only Spanish scientific institution participating in this research. Other participants include the Heidelberg Institute for Theoretical Studies and the Max Planck Institute for Biology (Germany), the Pasteur Institute (France), the University of St. Petersburg (Russia), the University of California, Berkeley (USA) and the University of British Columbia (Canada), among others.
Petabase-scale sequence alignment catalyses viral discovery