The antiSMASH tool can help researchers find bacterial genes responsible for the biosynthesis of interesting metabolites such as new antibiotics, pesticides, and anti-cancer drugs. A new database that collects results from the antiSMASH tool makes it easier to compare the metabolite-producing potential of thousands of bacterial genomes.
"Many scientists search for bacterial metabolites because they look for new antibiotics. In our group, we look for new antibiotics against multi-drug resistant bacteria like Klebsiella pneumoniae or Acinetobacter baumannii. Society desperately needs new antibiotics against these bacteria, which cause severe septicemia, urinary tract infections and pneumonia," says Kai Blin, Researcher and Scientific Software Engineer at The Novo Nordisk Foundation Center for Biosustainability at Technical University of Denmark (DTU).
The database is not only useful for discovering new antibiotics. The food and pharmaceutical industries also use the tool to ensure that bacteria used as, for instance, probiotics do not produce toxic compounds.
The newest improvements to the database have now been published in Nucleic Acids Research.
The metabolites of interest, known as secondary metabolites, are not directly involved in the normal growth, development, or reproduction of the microorganism. They do, however, often play an important role in the organism's defense systems against predators.
In industry, these microbial metabolites can be used as medicines, flavourings, and pigments. Often, microorganisms do not produce these metabolites under laboratory conditions, which makes the valuable compounds "invisible" to scientists unless they look in the DNA.
"For instance, if you know that a certain bacterium produces a metabolite, but at the same time the organism cannot be cultured and, hence, studied further, you can look for the same hidden metabolite in other bacteria via the database," Kai Blin says.
The scientists have built and improved the antiSMASH online tool over the last seven years, and it now runs more than 100,000 tasks per year and has over 2,500 citations.
But the scientists discovered that many antiSMASH users ran the same genomes, looking for the same results. This was very time-consuming for the users, since each run can take several hours. Therefore, the researchers and software engineers decided to build the antiSMASH database, which collects the precomputed results.
"This means that the user can have the results right away and doesn't have to wait for hours or do this job manually, which would take days or weeks," says Kai Blin.
The interesting metabolites are encoded by so-called biosynthetic gene clusters (BGCs), which antiSMASH is designed to identify. antiSMASH uses a rule-based cluster detection approach to identify gene clusters for 45 different types of interesting metabolites.
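To give a rough feel for what "rule-based" means here, the following is a toy sketch only, not antiSMASH's actual rules or code. It assumes each gene has already been annotated with a set of protein-domain names, and that a rule is simply a set of domains that must all occur within a small window of neighbouring genes. The rule names, domain names, and window size are all hypothetical.

```python
# Toy illustration of rule-based cluster detection.
# All rule names, domain names and the window size are hypothetical
# placeholders, not real antiSMASH rules.
HYPOTHETICAL_RULES = {
    "type_a": {"domain_1", "domain_2"},  # needs both domains nearby
    "type_b": {"domain_3"},              # needs a single domain
}


def detect_cluster_types(gene_domains, rules=HYPOTHETICAL_RULES, window=3):
    """Return the sorted names of all rules satisfied somewhere in the genome.

    gene_domains: list of sets of domain names, one set per gene,
                  in the order the genes appear on the genome.
    window:       number of neighbouring genes examined together.
    """
    found = set()
    for start in range(len(gene_domains)):
        # Collect every domain seen in this window of neighbouring genes.
        seen = set().union(*gene_domains[start:start + window])
        for name, required in rules.items():
            # A rule fires when all of its required domains are present.
            if required <= seen:
                found.add(name)
    return sorted(found)


genes = [{"domain_1"}, set(), {"domain_2"}, set(), set()]
print(detect_cluster_types(genes))
```

In this example the two domains required by the hypothetical "type_a" rule sit within three genes of each other, so that rule fires, while "type_b" does not because its domain never appears.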
The newest version of the antiSMASH database contains 6,200 full bacterial genomes, a 58% increase compared to version 1. In addition, 18,576 so-called draft genomes have been added.
Importantly, the update also contains various query options for accessing the BGCs. The software engineers have also added a redundancy filter to version 2, which means that instead of giving results for hundreds of strains with almost identical sequences, the database only shows results from the highest-quality genome. The database is open source, free and easy to use, even for non-programmers.
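The idea behind such a redundancy filter can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the database's actual implementation: it assumes the genomes have already been grouped into sets of near-identical strains and that each genome carries a single numeric quality score. The field names `accession`, `group` and `quality` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Genome:
    accession: str  # hypothetical identifier for the genome
    group: str      # hypothetical label shared by near-identical strains
    quality: float  # hypothetical quality score; higher means better


def filter_redundant(genomes):
    """Keep only the highest-quality genome from each group."""
    best = {}
    for genome in genomes:
        current = best.get(genome.group)
        # Replace the stored genome only if this one scores strictly higher.
        if current is None or genome.quality > current.quality:
            best[genome.group] = genome
    return list(best.values())


genomes = [
    Genome("strain_1", "group_x", 0.80),
    Genome("strain_2", "group_x", 0.95),
    Genome("strain_3", "group_y", 0.70),
]
for genome in filter_redundant(genomes):
    print(genome.accession)
```

Here the two near-identical strains in "group_x" collapse to the better-scoring "strain_2", while "strain_3" is kept because it is the only member of its group.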