JUL 13, 2022 6:33 PM PDT

Over 7,200 Segments in the Human Genome May Code for Novel Proteins

WRITTEN BY: Carmen Leitch

There are billions of nucleotides in the human genome, and researchers once thought that it could contain as many as 100,000 protein-coding genes. One of the main goals of the Human Genome Project was to identify protein-coding genes in the genomic sequence. When the vast majority of the sequence was completed around 2003, however, there seemed to be only about 20,000 protein-coding genes, occupying only about two percent of the human genome. Since then, we've learned that other sequences have important functions even though they do not code for protein, such as those that produce regulatory RNAs. Now researchers have suggested that there are more than 7,200 gene segments in the human genome that may be used to generate new proteins. This work has been reported in Nature Biotechnology.


Many short sequences of DNA called open reading frames (ORFs) have been found in the genome. There is evidence that some of these ORFs are transcribed, and many have biological functions, but few are included in reference databases, so they have remained relatively obscure.
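For readers unfamiliar with the term, an ORF is commonly described as a stretch of sequence that begins with a start codon and runs, three nucleotides at a time, to a stop codon. The short Python sketch below illustrates only that textbook definition. It is not the method used in the study, and the function name `find_orfs`, the `min_codons` parameter, and the choice to scan only the forward strand with `ATG` as the sole start codon are assumptions made for illustration.

```python
# Illustrative sketch only: a textbook-style ORF scan, not the study's method.
# Assumptions: ATG is the only start codon, the three standard stop codons
# are used, and only the forward strand is scanned.
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return sorted (start, end, sequence) tuples for ATG...stop stretches.

    All three forward reading frames are scanned. Positions are 0-based
    and end-exclusive, and the stop codon is included in the sequence.
    min_codons counts codons from the start codon up to, but not
    including, the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the currently open start codon, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None
    return sorted(orfs)


if __name__ == "__main__":
    # A toy sequence containing a single short ORF in the third frame.
    print(find_orfs("CCATGAAATAGGG"))
```

On the toy sequence, the scan reports one ORF spanning positions 2 to 11 (`ATGAAATAG`). Real annotation efforts rely on experimental evidence and far more careful criteria than this simple pattern match.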

Researchers are seeking to put these ORFs into those reference materials so that more researchers can find them if the sequences are relevant to their work. Scientists often compare sequences in their research to reference databases to learn more about those sequences or genes, such as whether they appear in other species, or whether they carry mutations.

ORFs that interact with parts of the ribosome, the cellular machinery that generates proteins from mRNA, were first assembled into a standardized catalog, even though much of the data had been obtained by different labs using various methods.

The study authors also wanted to answer some fundamental questions, such as exactly what constitutes a gene or protein, and whether ribosomes generate only proteins or can also make other types of molecules. They have now suggested that reference databases for the human genome should be revised. Ensembl-GENCODE is integrating the new ORF catalog, and others such as UniProt and HGNC are supporting the effort.

"It's tremendously exciting to enable the research community with our new catalog," said Dr. Sebastiaan van Heesch, a group leader at the Princess Máxima Center for pediatric oncology. "It's too soon to say whether all of the unexplored sections of DNA truly represent proteins, but we can clearly see that something unexplored is happening across the human genome and that the world should be paying attention."

"For too long, the scientific community has been mostly left in the dark about these ORFs," said Jonathan Mudge of the European Bioinformatics Institute (EMBL-EBI). "We're very proud that our work will be able to let researchers across the world start to study them. This is the point at which they enter the mainstream of genomic and medical science, an effort which we expect to have wide-ranging ripple effects."

Sources: Max Delbrück Center for Molecular Medicine, Nature Biotechnology

About the Author
BS
Experienced research scientist and technical expert with authorships on over 30 peer-reviewed publications, traveler to over 70 countries, published photographer and internationally-exhibited painter, volunteer trained in disaster-response, CPR and DV counseling.