OCT 01, 2021 12:00 AM PDT

Making the Most of Your NGS Data: Understanding Metrics for Target-enriched NGS

SPONSORED BY: Roche Sequencing

Introduction

Targeted next-generation sequencing (NGS) is often performed using hybridization-based target enrichment, which deploys oligonucleotide probes to capture regions of interest for downstream sequencing. Although targeting reduces sequencing expense, each run is still time-consuming and costly, so understanding key sequencing metrics can help you maximize the value of each run.

Beyond common metrics (e.g., base quality, cluster density, number of reads passing filter), several additional metrics provide deeper insight into the success of a sequencing run. Depth of coverage (the number of times that a particular base within the target region is represented in the sequence data) and on-target rate (the percentage of sequenced bases that map to the target region) are fairly intuitive concepts. Also intuitive is the duplication rate for a sequencing run, which reflects the percentage of duplicate reads (reads that map to the exact same location, including the coordinates of both the 5’ and 3’ ends) out of the total mapped reads. This article focuses on two less intuitive metrics, GC bias and Fold-80 base penalty, and offers some tips on how to improve them.
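As a rough illustration of the three intuitive metrics, the sketch below computes them from a toy list of mapped reads. The read representation (chromosome, start, end, strand, with half-open coordinates) and all function names are assumptions made for this example, and on-target rate is expressed here as a percentage of mapped bases. Real pipelines derive these values from alignment files with dedicated tools.

```python
from collections import Counter

def depth_at(reads, chrom, position):
    """Depth of coverage: number of reads overlapping one base.

    Assumes each read is (chrom, start, end, strand) with a half-open
    interval [start, end); this layout is hypothetical.
    """
    return sum(1 for c, start, end, _ in reads
               if c == chrom and start <= position < end)

def on_target_rate(on_target_bases, total_mapped_bases):
    """Percentage of mapped bases that fall inside the target region."""
    return 100.0 * on_target_bases / total_mapped_bases

def duplication_rate(reads):
    """Percentage of mapped reads that duplicate an earlier read.

    Reads count as duplicates when chromosome, both end coordinates,
    and strand are identical; the first copy is not counted.
    """
    if not reads:
        return 0.0
    counts = Counter(reads)
    duplicates = sum(n - 1 for n in counts.values())
    return 100.0 * duplicates / len(reads)

# Toy data: the first two reads share identical coordinates.
reads = [
    ("chr1", 100, 250, "+"),
    ("chr1", 100, 250, "+"),
    ("chr1", 100, 260, "+"),
    ("chr2", 5, 155, "-"),
]
print(depth_at(reads, "chr1", 120))   # 3
print(duplication_rate(reads))        # 25.0
print(on_target_rate(750, 1000))      # 75.0
```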

GC bias

The distribution of AT-rich and GC-rich regions—often referred to as GC content—is uneven across genomes. During sequencing, regions of high or low GC content are often unevenly sequenced, causing disproportionate coverage of these regions; this is known as GC bias. GC bias in sequencing data across regions of variable GC content can be visualized in GC-bias distribution plots (Figure 1).
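The quantity shown in a GC-bias plot can be sketched as normalized coverage per GC bin: the mean coverage of windows in a given GC range divided by the mean coverage of all windows. The code below is a minimal sketch under that assumption; the window representation, the 10% bin width, and the function names are illustrative choices, not the method of any particular tool. A value near 1 in every bin suggests little GC bias, while values well below 1 at the extremes suggest under-coverage of AT-rich or GC-rich regions.

```python
from collections import defaultdict

def gc_percent(sequence):
    """GC content of a sequence as a whole-number percentage (ignores N)."""
    seq = sequence.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0
    return round(100 * (seq.count("G") + seq.count("C")) / called)

def normalized_coverage_by_gc(windows, bin_width=10):
    """Map each GC bin (lower edge, in percent) to its normalized coverage.

    windows: list of (sequence, mean_coverage) pairs, a hypothetical
    stand-in for fixed-size genomic windows and their observed depth.
    """
    overall_mean = sum(cov for _, cov in windows) / len(windows)
    bins = defaultdict(list)
    for sequence, cov in windows:
        lower_edge = (gc_percent(sequence) // bin_width) * bin_width
        bins[lower_edge].append(cov)
    return {edge: (sum(covs) / len(covs)) / overall_mean
            for edge, covs in sorted(bins.items())}

# Toy windows: AT-rich, balanced, and GC-rich.
windows = [
    ("ATATATATAT", 10),
    ("ACGTACGTAC", 30),
    ("GCGCGCGCGC", 20),
]
profile = normalized_coverage_by_gc(windows)
print(profile)
```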

High levels of GC bias can be introduced during library preparation (especially in workflows dependent on PCR), during hybrid capture, or during the sequencing run itself.  This bias increases the amount of sequencing that must be performed, driving up expense; thus, it is important to choose a library preparation kit that minimizes GC bias.

Fold-80 Base Penalty

Analysis of sequencing data typically reveals that some target regions have achieved higher coverage than others. The Fold-80 base penalty metric is one way to assess coverage uniformity. Once the mean target coverage is determined for an experiment, the Fold-80 base penalty describes how much more sequencing is required to bring 80% of the target bases to that mean coverage. Thus, a run with perfectly uniform coverage would have a Fold-80 base penalty score of 1 (see Figure 2). Values > 1 reflect less uniform coverage. For instance, a Fold-80 value of 2 means that twice as much (2-fold) sequencing is required for 80% of the target bases to reach the mean coverage.
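One common way to compute this metric, assumed here for illustration, is to divide the mean target coverage by the coverage level that 80% of target bases meet or exceed (the 20th percentile of per-base depth). The sketch below follows that assumption; the function name and input format are hypothetical, and published tools may handle zero-coverage targets differently.

```python
def fold_80_base_penalty(per_base_depth):
    """Mean coverage divided by the depth that 80% of bases reach.

    per_base_depth: list of coverage values, one per target base.
    Returns float('inf') if that 20th-percentile depth is zero.
    """
    depths = sorted(per_base_depth)
    n = len(depths)
    if n == 0:
        raise ValueError("per_base_depth must not be empty")
    mean_depth = sum(depths) / n
    # With ascending order, at least 80% of bases have depth >= depths[n // 5].
    depth_reached_by_80_percent = depths[n // 5]
    if depth_reached_by_80_percent == 0:
        return float("inf")
    return mean_depth / depth_reached_by_80_percent

# Perfectly uniform coverage gives a penalty of 1.
uniform = [100] * 10
print(fold_80_base_penalty(uniform))   # 1.0

# Half the bases at 50x and half at 150x: the mean is 100x, but 80% of
# bases only reach 50x, so twice as much sequencing would be needed.
uneven = [50] * 5 + [150] * 5
print(fold_80_base_penalty(uneven))    # 2.0
```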

The Fold-80 base penalty provides information about the capture efficiency of the probes in the panel, which is impacted by both probe design and probe quality. To decrease the Fold-80 base penalty and reduce the need for additional, costly sequencing runs, use high-quality, well-designed probes.

Understanding sequencing metrics can help you to get the most out of valuable sequencing resources, including time, money, and precious samples. To watch short videos about the five metrics mentioned here, and to learn about other aspects of NGS, visit: https://go.roche.com/Targeted-NGS-Metrics

About the Sponsor
At Roche Sequencing, we are building on Roche's legacy of innovation to transform NGS and its application. By simplifying workflows & expanding assay menus, we are broadening access to genomic data & lowering barriers to routine use. Our growing suite of products spans the genomics workflow, from sample acquisition & preparation through data analysis and final result, helping you answer important questions in genetics, cancer & beyond.