Intronic CNVs and gene expression variation in human populations.

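Introns can be extraordinarily large, and they account for the majority of the DNA sequence in human genes. However, little is known about their population patterns of structural variation and their functional implication. By combining the most extensive maps of CNVs in human populations, we have found that intronic losses are the most frequent copy number variants (CNVs) in protein-coding genes in humans, with 12,986 intronic deletions affecting 4,147 genes (including 1,154 essential genes and 1,638 disease-related genes).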

Bibliographic Details
Main Authors: Maria Rigau, David Juan, Alfonso Valencia, Daniel Rico
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2019-01-01
Series: PLoS Genetics
Online Access: https://doi.org/10.1371/journal.pgen.1007902
id doaj-0c12178ebb15478f934fac0775916326
record_format Article
spelling doaj-0c12178ebb15478f934fac0775916326; 2021-04-21T13:49:13Z; eng; Public Library of Science (PLoS); PLoS Genetics; 1553-7390; 1553-7404; 2019-01-01; 15; 1; e1007902; 10.1371/journal.pgen.1007902; Intronic CNVs and gene expression variation in human populations.; Maria Rigau; David Juan; Alfonso Valencia; Daniel Rico; https://doi.org/10.1371/journal.pgen.1007902
collection DOAJ
language English
format Article
sources DOAJ
author Maria Rigau
David Juan
Alfonso Valencia
Daniel Rico
title Intronic CNVs and gene expression variation in human populations.
publisher Public Library of Science (PLoS)
series PLoS Genetics
issn 1553-7390
1553-7404
publishDate 2019-01-01
description Introns can be extraordinarily large, and they account for the majority of the DNA sequence in human genes. However, little is known about their population patterns of structural variation and their functional implication. By combining the most extensive maps of CNVs in human populations, we have found that intronic losses are the most frequent copy number variants (CNVs) in protein-coding genes in humans, with 12,986 intronic deletions affecting 4,147 genes (including 1,154 essential genes and 1,638 disease-related genes). This intronic length variation results in dozens of genes showing extreme population variability in size, with 40 genes having 10 or more different sizes and up to 150 allelic sizes. Intronic losses are frequent in evolutionarily ancient genes that are highly conserved at the protein sequence level. This result contrasts with losses overlapping exons, which are observed less often than expected by chance and almost exclusively affect primate-specific genes. An integrated analysis of CNVs and RNA-seq data showed that intronic loss can be associated with significant differences in gene expression levels in the population (CNV-eQTLs). These intronic CNV-eQTL regions are enriched for intronic enhancers and can be associated with expression differences of other genes showing long-distance intron-promoter 3D interactions. Our data suggest that intronic structural variation of protein-coding genes makes an important contribution to the variability of gene expression and splicing in human populations.
url https://doi.org/10.1371/journal.pgen.1007902