Scientists from ITMO University, the Federal Research and Clinical Centre of Physical-Chemical Medicine, and MIPT have developed a software program enabling them to quickly compare sets of DNA of microorganisms living in different environments.
The researchers have already suggested exactly how the new program could be applied in practice. Using the algorithm to compare the microflora of a healthy person with the microflora of a patient, specialists would be able to detect previously unknown pathogens and their strains, which can aid the development of personalized medicine. The results of the study have been published in Bioinformatics.
Every person has a genome, a specific sequence of genes according to which an individual develops. However, any living organism also carries another set of genetic material, called the metagenome: the total DNA content of the many different microorganisms (bacteria, fungi, and viruses) that inhabit the same environment. The metagenome is often indicative of various diseases or predispositions to them. Studying the microbiota, i.e. the full range of microorganisms inhabiting different parts of the human body, therefore plays a critical role in metagenomic research.
The software tool developed by the scientists, called MetaFast, can rapidly compare large numbers of metagenomes. “In studying the intestinal microflora of patients, we may be able to detect microorganisms associated with a particular disease, such as diabetes, or a predisposition to the disease. This already forms a basis for applying personalized medicine techniques and developing new drugs. Using the results obtained with the software, biologists will be able to draw conclusions on how to further develop their research, because the algorithm enables them to study environments that we currently know nothing about,” says Vladimir Ulyantsev, lead developer of the algorithm and researcher at the Computer Technologies Laboratory at ITMO University.
One of the key benefits of the program is that it is able to work successfully with environments in which the genetic contents have not yet been studied. “The newly developed approach allows us to do two things – find all the possible gene sequences, even if they were previously unknown (the program collects them from fragments of genomic reads), and at the same time identify metagenomic patterns that distinguish one patient from another, e.g. people with and without a disease,” says Dmitry Alexeev, the leader of the project and head of MIPT’s Laboratory of Complex Biological Systems.
This means that the program can be used for rapid, untargeted screening of markers that indicate certain diseases. The results can then be verified and refined using targeted methods such as PCR (a technique for making multiple copies of a DNA fragment). According to the researchers, the program could greatly reduce the time needed to develop new drugs.
Microorganisms that do not reproduce in vitro, such as viruses, give inconclusive results in tests, and it is not possible to collect their DNA. However, the new program is able to detect even these microorganisms. “In the microbiota of the skin alone, 90% of the organisms are unknown,” continues Dmitry Alexeev. “Our approach enables us to work with completely unknown material and still obtain results. The program has been tested in a wide variety of environments, including those with a high number of viruses. The program is even able to locate and collect single DNA strands.”
MetaFast is not limited to detecting pathogens. For example, the program can also be used to compare people from closed populations with people living in cities, helping to identify bacterial strains that are highly beneficial to humans but have been lost in the process of urbanization. Antibiotics, preservatives, colorants and supermarket food have pushed many useful bacteria out of our microflora, but these bacteria could still be present in closed populations, such as American Indians or people in Russian villages.
MetaFast has proven to be highly effective in studying rare and previously unexplored metagenomes. As part of the study, the scientists analyzed the metagenomes of several of the world’s largest lakes. Without any prior information about the microbiota samples from the lakes, the program found genetic similarities between samples that were close in terms of their chemical composition.
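To make the idea of comparing samples without any reference genomes more concrete, here is a minimal sketch in Python. It is not MetaFast's actual implementation: it assumes, purely for illustration, that each sample is summarized by counts of short DNA substrings (k-mers) and that two samples are compared using the Bray-Curtis dissimilarity, a common measure in ecology. The function names `kmer_profile` and `bray_curtis` and the choice of `k` are hypothetical.

```python
from collections import Counter


def kmer_profile(reads, k=4):
    """Count every substring of length k across all reads of one sample."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two k-mer count profiles.

    Returns 0.0 for identical profiles and 1.0 for profiles that
    share no k-mers at all.
    """
    shared = sum(min(a[kmer], b[kmer]) for kmer in a.keys() & b.keys())
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0  # two empty samples are treated as identical
    return 1.0 - 2.0 * shared / total


# Toy "samples": each is just a list of short reads (illustrative data only).
lake_a = ["ACGTACGT", "CGTACGTA"]
lake_b = ["ACGTACGT", "TTTTGGGG"]
distance = bray_curtis(kmer_profile(lake_a), kmer_profile(lake_b))
```

Samples with a small pairwise distance can then be grouped together, and no step requires knowing which organisms the reads came from.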
The researchers also used the new algorithm to study the microbial inhabitants of the New York subway, demonstrating the effectiveness of the algorithm when analyzing such complex systems. Most of the DNA collected using MetaFast belonged to already known bacteria. This confirms previous theories stating that the subway is safe for humans, and the microbes that live there suppress any flora that could be dangerous to people.
A vast amount of experimental data on various metagenomes has already been gathered worldwide. As the cost of extracting DNA decreases and the sensitivity of equipment increases, the volume of data continues to grow exponentially. Despite this, most of the studies have not been fully completed. The reason lies in the limitations of the current technology. On the one hand, scientists can partially assemble a metagenome, but piecing together the “puzzle” takes an enormous amount of time. On the other hand, they can compare individual fragments of the genome with existing reference DNA sequences, but references exist for only a limited number of bacteria and for virtually no viruses.
The new algorithm not only combines the advantages of both approaches, but also processes data at high speed. The program saves RAM because it partially assembles and partially compares genomes, without carrying out a full, in-depth assembly.
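The notion of "partial assembly" can be illustrated with a short Python sketch. The article does not describe how MetaFast performs this step, so the code below is only an assumption-laden illustration of one standard technique: overlapping reads are linked through a de Bruijn graph of (k-1)-mers, and only the unambiguous, non-branching stretches are merged into longer sequences. The names `build_graph` and `simple_paths` are hypothetical, and reverse-complement strands and sequencing errors are ignored for brevity.

```python
from collections import defaultdict


def build_graph(reads, k):
    """Link each (k-1)-mer to the (k-1)-mers that follow it in some read."""
    successors = defaultdict(set)
    predecessors = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            left, right = read[i:i + k - 1], read[i + 1:i + k]
            successors[left].add(right)
            predecessors[right].add(left)
    return successors, predecessors


def simple_paths(reads, k=4):
    """Merge reads along non-branching paths of the graph.

    Stops at every branch point instead of trying to resolve it, which is
    what makes the assembly "partial". Isolated cycles are skipped.
    """
    succ, pred = build_graph(reads, k)
    nodes = set(succ) | set(pred)

    def is_simple(node):
        # Exactly one way in and one way out: safe to extend through.
        return len(pred.get(node, ())) == 1 and len(succ.get(node, ())) == 1

    paths = []
    for node in sorted(nodes):
        if is_simple(node):
            continue  # paths start only at sources or branch points
        for nxt in sorted(succ.get(node, ())):
            path = node + nxt[-1]
            while is_simple(nxt):
                nxt = next(iter(succ[nxt]))
                path += nxt[-1]
            paths.append(path)
    return paths


# Two overlapping toy reads are merged into one longer sequence.
contigs = simple_paths(["ACGTTG", "GTTGCA"], k=4)
```

Because branch points are left unresolved, the sketch never attempts the expensive full reconstruction of each genome; the resulting fragments could then serve as features for comparing samples.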