Scientists have learned more about the evolutionary relationships between species by comparing their genomes. By aligning sequences from different genomes, investigators can find similarities and differences between organisms that have no obvious relationship and determine when certain genes changed, which can reveal how those changes affected a gene's function. This method is called multiple sequence alignment (MSA).
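To make the idea of alignment concrete, here is a minimal sketch that lines up just two short sequences using the classic Needleman-Wunsch dynamic-programming method. It is only an illustration of what "aligning" means, not the algorithm behind the tool described in this article. The function name `align_pair`, the toy sequences, and the scoring values (+1 for a match, -1 for a mismatch or gap) are assumptions chosen for the example.

```python
def align_pair(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return (aligned_a, aligned_b, score)."""
    rows, cols = len(a) + 1, len(b) + 1

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # pair the two letters
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Walk back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[-1][-1]


if __name__ == "__main__":
    top, bottom, total = align_pair("GATTACA", "GATACA")
    print(top)
    print(bottom)
    print("score:", total)
```

In the printed result, a dash marks a position where one sequence has a letter that the other lacks. A multiple sequence alignment extends the same idea from two sequences to many, and the cost of doing so grows quickly with the number of sequences.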
"We currently use multiple sequence alignments to understand the family tree of species evolution," said Cédric Notredame, a researcher at the Centre for Genomic Regulation in Barcelona. "The bigger your MSA, the bigger the tree and the deeper we dig into the past and find how species appeared and separated from each other."
Advances in gene sequencing technologies have created a wealth of data about many organisms, and scientists can now do more with all that information. Where earlier methods compared a handful of sequences, a new tool can compare 1.4 million different genetic sequences at the same time. Developed at the Centre for Genomic Regulation in Barcelona, it should help researchers understand evolution and how the genetic codes of all types of life on the planet are related. The work has been reported in Nature Biotechnology.
"What we've made lets us dig ten times deeper than what we've been able to do before, helping us to see hundreds of millions of years into the past," said Notredame, who was the lead author of the report. "Our technology is essentially a time machine that tells us how ancient constraints influenced genes in a way that resulted in life as we know today, much like how the Hubble Space Telescope observes things that happened millions of years ago to help us understand the Universe we live in today."
The work may also offer practical insights, such as how plants can adapt to climate change or how species that are at risk might be saved. Biodiversity is critical to the health of ecosystems, and the tool can help us understand it.
The scientists applied cloud-computing software to create this technology. "We spent hundreds of thousands of hours of computation to test our algorithm's effectiveness," noted Evan Floden, a researcher at the CRG who led the creation of the tool. "My hope is that in combining high-throughput instrumentation readouts with high-throughput computation, science will usher in an era of vastly improved biological understanding, ultimately leading to better outcomes for consumers, patients and our planet as a whole."
"There is a vast amount of 'dark matter' in biology, code we have yet to identify in the unexplored parts of the genome that is untapped potential for new medicines and other benefits we can't fathom," added Notredame. "Even seemingly inconsequential organisms may play a pivotal role in furthering human health and that of our planet, such as the discovery of CRISPR in archaea. What we have built is a new way of finding the needles in the haystack of life's genomes."
Sources: Phys.org via Centre for Genomic Regulation, Nature Biotechnology