Bioinformatics researchers at ITMO University have developed a software tool that quickly and effectively analyzes genome data and uses it to build the most probable models of the demographic history of populations of plants, animals and people. Relying on complex computational schemes, the software can infer, with a very high degree of likelihood, what a particular group of living organisms has gone through over the past thousands of years: what periods of mass extinction or rapid growth the population has experienced, and how long it has been in contact with other populations of the same species. The researchers' article describing this methodology has been published in GigaScience.
How can we find out exactly when the first ancestors of modern tigers appeared on Earth? When did the two elephant populations split? Is there a difference between the Dama and the Moroccan gazelle? When did African and Eurasian Homo sapiens diverge? The answers to all these questions can be found in a population's demographic history, that is, the scenario describing what stages the population went through over time and whether it underwent any mass extinctions, migrations, or sharp spikes in its numbers.
Apart from answering fundamental questions, these data can support applied research in ecology and environmental protection. For instance, if only some 800 walruses remain in a region, scientists need to understand whether this represents a critical decline or a natural population size that has remained constant for several thousand years, and therefore whether valuable resources should be spent on protecting the species from extinction.
Reconstructing a population's demographic history from genetic information is a complicated task that requires population geneticists to have not only knowledge of biology but also programming skills. These scientists have to gather data and write code that computes the possible models of a population's evolution which could have led to the wide variety of genetic information observed in the population's representatives today. Until recently, this was a long process whose end result relied very heavily on the researcher's initial hypothesis. If the hypothesis had any flaws, or the researcher failed to take some aspect into consideration, the software could not correct this initial error and calculated the probability of particular demographic events only within the boundaries predefined by the researcher.
The software, developed by a group of ITMO University scientists under the Project 5-100 grant program and with support from JetBrains Research, aims to solve this problem. The researchers have proposed a software product that independently and automatically infers the most probable model of a population's demographic history. It is also significantly less dependent on the initial research hypothesis, does not require advanced programming skills, and produces more accurate results. What is more, the software is flexible: if the obtained result diverges from archaeological or historical data, additional constraints can easily be introduced into the underlying algorithm to update its hypothesis.
"Using genetic data, our software automatically computes the model it considers optimal," shares Vladimir Ulyantsev. "It looks at the entire volume of the scenarios available. As a scientist, I'll consider the scenarios I deem the most likely, there can be three, five, maybe ten of those. The software, on the other hand, will test all of the models it estimates as probable, this is a much bigger amount. That's why the solutions it comes up with are better than those proposed by people working on the basis of the initial methods. The most beautiful thing here is the method - a genetic algorithm inspired by how evolution happens: species multiply, mutate, with those with the least ability to adapt dying out. In the place of the species we have demographic models and their parameters, and their adaptability is measured on the basis of their similarity with the studied data."
Once the results are obtained, scientists can plot them on a map and compare indications that a population migrated during a particular period with archaeological findings and other evidence. The algorithms were used to re-examine a large number of hypotheses and studies by evolutionary geneticists. In many cases, the result was considerably more accurate than that of the original work.