APR 03, 2022 8:08 AM PDT

The Human Genome Sequence is Finally, Truly Complete

WRITTEN BY: Carmen Leitch

The Human Genome Project was declared complete in 2003. But it wasn't exactly finished. Most of the sequence, about 92 percent, had been totally deciphered, particularly the sections that contain protein-coding genes. But the genome also holds long stretches of repetitive sequences that can be very difficult to unravel using traditional or advanced DNA sequencing techniques. Now the gaps in the sequence have finally been filled in. The work has been reported in Science.

Those repetitive sequences were once dismissed as "junk DNA," but researchers have been finding more and more sections of that junk that have important biological functions. Since they do not code for protein, studying them can be extremely challenging. But they are thought to be connected to some diseases and may be essential to certain biological processes, making them important to understand.

The effort to map the elusive portions of the genome was carried out by the Telomere-to-Telomere (T2T) Consortium, named for the telomeres, the caps that sit on the ends of chromosomes and protect them. Like the dense middles of chromosomes, called centromeres, telomeres are full of repetitive sequences that are hard to sequence. Centromeres also play a critical part in DNA replication and cell division.

In the early days of sequencing, specific sections of the genome could be amplified. A selected sequence was targeted with short DNA fragments called primers, which match short sections on the ends of that sequence. Once the sequence was amplified into many copies by an enzyme, each base could be tagged with a fluorescent molecule, and a machine then read the sequence of fluorescent colors as bases of DNA.

More advanced sequencing methods took a different approach. In next-gen sequencing, portions of the genome are chopped into tiny parts that are then sequenced and finally assembled together like puzzle pieces to create a long sequence. Repetition in the genome is difficult for both methods to deal with, so third-generation sequencing techniques were engineered. In third-generation approaches such as nanopore sequencing, much longer reads are possible. A single molecule of DNA is passed through a nanopore, and every base is read electronically.
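
To see why repeats are so troublesome for short reads, consider the toy Python sketch below. It is only an illustration, not part of the T2T work: the made-up sequence, the read lengths, and the function name `ambiguous_reads` are all assumptions. It counts how many read-sized windows of a sequence also occur somewhere else in that sequence, which means they cannot be placed uniquely during assembly.

```python
from collections import Counter


def ambiguous_reads(sequence: str, read_length: int) -> int:
    """Count window positions whose read also occurs elsewhere in the sequence."""
    windows = [
        sequence[i:i + read_length]
        for i in range(len(sequence) - read_length + 1)
    ]
    counts = Counter(windows)
    # A read seen more than once has no unique place in the puzzle.
    return sum(1 for window in windows if counts[window] > 1)


# Hypothetical toy genome: unique flanks around a tandem repeat of "CAT".
toy_genome = "GATTACA" + "CAT" * 10 + "TGGCCAA"

short_count = ambiguous_reads(toy_genome, read_length=6)
long_count = ambiguous_reads(toy_genome, read_length=35)
print(f"ambiguous 6-base reads:  {short_count}")
print(f"ambiguous 35-base reads: {long_count}")
```

In this toy case, reads shorter than the repeat fall entirely inside it and look identical to their neighbors, while reads long enough to span the repeat always include some unique flanking sequence. That is the intuition behind using much longer reads for repetitive regions.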

Merfin is a tool that researchers created for this work. It automatically detects and corrects mistakes made in the sequencing process.

Image credit: Modified from Pixabay

"Stretches of identical base pairs, such as AAA," can be difficult for current technologies to read, explained postdoctoral researcher Giulio Formenti, PhD, who developed Merfin. "There are often errors in those sequences, even now. Merfin corrects them."
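
As a rough illustration of the kind of sequence Formenti describes, the Python sketch below locates runs of identical bases (homopolymers) in a read. It is not Merfin's algorithm, and it only finds such runs rather than correcting them; the function name `homopolymer_runs`, the `min_length` threshold, and the example read are assumptions made for illustration.

```python
from itertools import groupby


def homopolymer_runs(read: str, min_length: int = 3) -> list[tuple[int, str, int]]:
    """Return (start, base, length) for each run of identical bases
    that is at least min_length long."""
    runs = []
    position = 0
    for base, group in groupby(read):
        length = len(list(group))
        if length >= min_length:
            runs.append((position, base, length))
        position += length
    return runs


# Hypothetical read containing two homopolymer stretches.
example_read = "ACGTAAAACCGTTTTTG"
for start, base, length in homopolymer_runs(example_read):
    print(f"{base} x{length} at position {start}")
```

Flagging these stretches is the easy part; deciding whether a run of four bases was really five is where sequencing errors creep in.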

The researchers are hoping that the techniques used to finish the human genome sequence, which were presented in a Nature Methods paper, will help scientists understand diseases that are associated with structural repeats in the centromere. "We are finally digging into what we once called junk DNA, because we could not understand it or look at it accurately," Formenti said. "Now that these sequences are no longer missing from the human reference genome, we can begin to map the origins of these diseases."

Cancer has been linked to centromere defects, for example. When some heterochromatic centromere genes are overactive, cancer cells divide wildly. Now that we have the sequence of the complete human genome, scientists can learn more about these mysterious regions.

Sources: Rockefeller University, Nature Methods, Science

About the Author
BS
Experienced research scientist and technical expert with authorships on over 30 peer-reviewed publications, traveler to over 70 countries, published photographer and internationally-exhibited painter, volunteer trained in disaster-response, CPR and DV counseling.