An international team of scientists from ITMO University and George Washington University (USA) created an algorithm for studying the evolutionary history of species with whole-genome duplications, chiefly yeast and plants.
The program can be used to analyze genetic information about these species and draw conclusions about how whole-genome duplication took place and why it gained a foothold in the course of evolution. The article was published in Oxford Bioinformatics, one of the leading journals in the field of Computer Science.
According to research by geneticists, some plants and even mammals have whole-genome duplications, i.e. some of their genes exist in several copies that are more or less similar to each other. The ancestral genome didn't have such duplicates, but the duplication happened at some point in its evolutionary history and gained a foothold in the population.
In order to understand the process of genome duplication, researchers have to reconstruct the evolutionary history of a species that underwent this event. This history makes it possible to track what happened to the population in the past and to identify when exactly the duplication happened and how it gained a foothold.
When attempting to reconstruct an evolutionary history with whole-genome duplications, a scientist faces a series of problems that are similar in their goals but have completely different mathematical structures. Solving them efficiently requires optimization. For this purpose, a team that included specialists from ITMO University and George Washington University (USA) proposed an approach based on integer linear programming, which builds on the linear programming methods first proposed by Leonid Kantorovich, a Soviet mathematician, economist and winner of the Nobel Memorial Prize in Economic Sciences.
"There's a class of tasks that are essentially similar but different from the standpoint of mathematics," explains Nikita Alekseev, co-author of the research from ITMO University. "So we've developed a common approach that comes down to integer linear programming. This is an optimization method that reduces a complex problem to a set of linear constraints for which there exists a selection of effective solvers."
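To give a feel for what "a set of linear constraints" over integer variables means, here is a minimal toy sketch in Python. It is not the authors' formulation and has nothing genome-specific in it: the function name `solve_small_ilp`, the objective and the constraints are all made up for illustration, and the exhaustive search stands in for the far more efficient techniques that real integer linear programming solvers use.

```python
from itertools import product

def solve_small_ilp(c, A, b, bounds):
    """Maximize c.x subject to A.x <= b with integer x, by exhaustive search.

    Hypothetical helper for illustration only; it is practical just for
    tiny problems, since it tries every integer point within the bounds.
    """
    best_x, best_val = None, None
    for x in product(*(range(lo, hi + 1) for lo, hi in bounds)):
        # A candidate is feasible if it satisfies every linear constraint.
        feasible = all(
            sum(a * xi for a, xi in zip(row, x)) <= bi
            for row, bi in zip(A, b)
        )
        if not feasible:
            continue
        val = sum(ci * xi for ci, xi in zip(c, x))
        if best_val is None or val > best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Toy problem (made up): maximize 2x + 3y
# subject to x + y <= 4 and x + 3y <= 6, with integers 0 <= x, y <= 4.
best_x, best_val = solve_small_ilp(
    c=[2, 3],
    A=[[1, 1], [1, 3]],
    b=[4, 6],
    bounds=[(0, 4), (0, 4)],
)
print(best_x, best_val)
```

The point of the sketch is the shape of the problem rather than the search: once a task is written as a linear objective with linear constraints over integer variables, it can be handed to an existing solver instead of requiring a custom algorithm for each variant.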
As a result, the scientists developed a program that analyzes duplicated genomes and makes inferences about a species' evolutionary path, the number of genome duplications that took place along it, and how the copies of genes that emerged as a result of duplication changed. Sometimes mutations occur in these copies, altering specific regions, so they are no longer identical.
This approach can also be applied to the study of duplicated genome regions in animals.
"Genome duplications are present in many species and can affect not just the genome as a whole but also its fragments, and our tool can be adapted for solving such problems, too," concludes Nikita Alekseev.