Watch Out for a Second SNP: Focus on Multi-Nucleotide Variants in Coding Regions and Rescued Stop-Gained

Most single-nucleotide polymorphisms (SNPs) are located in non-coding regions, but the fraction usually studied is harbored in protein-coding regions because potential impacts on proteins are relatively easy to predict by popular tools such as the Variant Effect Predictor. These tools annotate varia...

Bibliographic Details
Main Authors: Fabien Degalez, Frédéric Jehl, Kévin Muret, Maria Bernard, Frédéric Lecerf, Laetitia Lagoutte, Colette Désert, Frédérique Pitel, Christophe Klopp, Sandrine Lagarrigue
Format: Article
Language: English
Published: Frontiers Media S.A. 2021-07-01
Series: Frontiers in Genetics
Subjects: MNV, SNP
Online Access: https://www.frontiersin.org/articles/10.3389/fgene.2021.659287/full
id doaj-8ec8210e2dd347a5af0491a8072b8ddd
record_format Article
spelling doaj-8ec8210e2dd347a5af0491a8072b8ddd
2021-07-07T07:20:41Z
eng
Frontiers Media S.A.
Frontiers in Genetics
1664-8021
2021-07-01
12
10.3389/fgene.2021.659287
659287
Watch Out for a Second SNP: Focus on Multi-Nucleotide Variants in Coding Regions and Rescued Stop-Gained
Fabien Degalez (0): INRAE, INSTITUT AGRO, PEGASE UMR 1348, Saint-Gilles, France
Frédéric Jehl (1): INRAE, INSTITUT AGRO, PEGASE UMR 1348, Saint-Gilles, France
Kévin Muret (2): INRAE, INSTITUT AGRO, PEGASE UMR 1348, Saint-Gilles, France
Maria Bernard (3): INRAE, SIGENAE, Genotoul Bioinfo MIAT, Castanet-Tolosan, France
Maria Bernard (4): INRAE, AgroParisTech, Université Paris-Saclay, GABI UMR 1313, Jouy-en-Josas, France
Frédéric Lecerf (5): INRAE, INSTITUT AGRO, PEGASE UMR 1348, Saint-Gilles, France
Laetitia Lagoutte (6): INRAE, INSTITUT AGRO, PEGASE UMR 1348, Saint-Gilles, France
Colette Désert (7): INRAE, INSTITUT AGRO, PEGASE UMR 1348, Saint-Gilles, France
Frédérique Pitel (8): INRAE, INPT, ENVT, Université de Toulouse, GenPhySE UMR 1388, Castanet-Tolosan, France
Christophe Klopp (9): INRAE, SIGENAE, Genotoul Bioinfo MIAT, Castanet-Tolosan, France
Sandrine Lagarrigue (10): INRAE, INSTITUT AGRO, PEGASE UMR 1348, Saint-Gilles, France
collection DOAJ
issn 1664-8021
description Most single-nucleotide polymorphisms (SNPs) are located in non-coding regions, but the fraction usually studied lies in protein-coding regions, because potential impacts on proteins are relatively easy to predict with popular tools such as the Variant Effect Predictor. These tools annotate variants independently, without considering the potential effect of grouped or haplotypic variations, often called “multi-nucleotide variants” (MNVs). Here, we used a large RNA-seq dataset to survey MNVs; it comprises 382 chicken samples originating from 11 populations, analyzed in the companion paper, in which 9.5M SNPs (including 3.3M SNPs with reliable genotypes) were detected. We focused our study on in-codon MNVs and evaluated their potential mis-annotation. Using GATK HaplotypeCaller read-based phasing results, we identified 2,965 MNVs, each observed in at least five individuals, located in 1,792 genes. We found that 41.1% of them showed a novel impact when compared with the effects of their constituent SNPs analyzed separately. The largest change in annotation concerned consequences originally annotated as stop-gained, of which around 95% were rescued, followed by missense consequences, of which 37% were reannotated with a different amino acid. We then present the rescued stop-gained MNVs in more depth and give an illustration in the SLC27A4 gene. As previously shown in human datasets, our results in chicken demonstrate the value of haplotype-aware variant annotation and of considering MNVs in coding regions, particularly when searching for severe functional consequences such as stop-gained variants.
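The difference between per-SNP and haplotype-aware annotation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes a forward-strand codon, the standard genetic code, and that both alternate alleles are already known to sit on the same haplotype (the phasing step is not modeled). The function names `apply_snps`, `consequence`, and `annotate_in_codon_mnv`, and the example codon, are hypothetical and chosen only for illustration.

```python
# Standard genetic code, enumerated in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def apply_snps(codon, snps):
    """Return the codon after applying (position, alt_base) substitutions.

    Positions are 0-based within the codon (hypothetical convention).
    """
    bases = list(codon)
    for pos, alt in snps:
        bases[pos] = alt
    return "".join(bases)


def consequence(ref_codon, alt_codon):
    """Classify a codon change with simplified consequence labels."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop_gained"
    if ref_aa == "*":
        return "stop_lost"
    return "missense"


def annotate_in_codon_mnv(ref_codon, snps):
    """Compare independent per-SNP annotation with the combined MNV."""
    # Each SNP annotated alone, as a per-variant annotator would do.
    per_snp = [consequence(ref_codon, apply_snps(ref_codon, [s])) for s in snps]
    # All SNPs applied together, assuming they share one haplotype.
    mnv_codon = apply_snps(ref_codon, snps)
    mnv = consequence(ref_codon, mnv_codon)
    return {
        "per_snp": per_snp,
        "mnv_codon": mnv_codon,
        "mnv": mnv,
        "rescued_stop_gained": "stop_gained" in per_snp and mnv != "stop_gained",
    }


# Illustrative codon only: CAG (Gln) with C>T at position 0 and A>G at position 1.
result = annotate_in_codon_mnv("CAG", [(0, "T"), (1, "G")])
print(result)
```

In this made-up example the first SNP alone turns CAG into the stop codon TAG, while the two SNPs together give TGG (Trp), so the combined variant is a missense change and the stop-gained call is "rescued".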
topic MNV
SNP
variation
rescued stop-gained
SLC27A4
FATP4