A subset of conserved mammalian long non-coding RNAs are fossils of ancestral protein-coding genes

Abstract. Background: Only a small portion of human long non-coding RNAs (lncRNAs) appear to be conserved outside of mammals, but the events underlying the birth of new lncRNAs in mammals remain largely unknown. One potential source is remnants of protein-coding genes that transitioned into lncRNAs. R...

Bibliographic Details
Main Authors: Hadas Hezroni, Rotem Ben-Tov Perry, Zohar Meir, Gali Housman, Yoav Lubelsky, Igor Ulitsky
Format: Article
Language: English
Published: BMC 2017-08-01
Series: Genome Biology
Subjects: Long noncoding RNAs; Evolution; Pseudogenes; Translational regulation; uORFs; X inactivation
Online Access: http://link.springer.com/article/10.1186/s13059-017-1293-0
id doaj-c338ea8b9a234874a4221ba4638d39ab
record_format Article
spelling doaj-c338ea8b9a234874a4221ba4638d39ab | 2020-11-24T21:44:34Z | eng | BMC | Genome Biology | 1474-760X | 2017-08-01 | 18 | 1 | 1 | 15 | 10.1186/s13059-017-1293-0
A subset of conserved mammalian long non-coding RNAs are fossils of ancestral protein-coding genes
Hadas Hezroni, Rotem Ben-Tov Perry, Zohar Meir, Gali Housman, Yoav Lubelsky, Igor Ulitsky (all: Department of Biological Regulation, Weizmann Institute of Science)
http://link.springer.com/article/10.1186/s13059-017-1293-0
Long noncoding RNAs; Evolution; Pseudogenes; Translational regulation; uORFs; X inactivation
collection DOAJ
language English
format Article
sources DOAJ
author Hadas Hezroni
Rotem Ben-Tov Perry
Zohar Meir
Gali Housman
Yoav Lubelsky
Igor Ulitsky
title A subset of conserved mammalian long non-coding RNAs are fossils of ancestral protein-coding genes
publisher BMC
series Genome Biology
issn 1474-760X
publishDate 2017-08-01
description Abstract
Background: Only a small portion of human long non-coding RNAs (lncRNAs) appear to be conserved outside of mammals, but the events underlying the birth of new lncRNAs in mammals remain largely unknown. One potential source is remnants of protein-coding genes that transitioned into lncRNAs.
Results: We systematically compare lncRNA and protein-coding loci across vertebrates, and estimate that up to 5% of conserved mammalian lncRNAs are derived from lost protein-coding genes. These lncRNAs have specific characteristics, such as broader expression domains, that set them apart from other lncRNAs. Fourteen lncRNAs have sequence similarity with the loci of the contemporary homologs of the lost protein-coding genes. We propose that selection acting on enhancer sequences is mostly responsible for retention of these regions. As an example of an RNA element from a protein-coding ancestor that was retained in the lncRNA, we describe in detail a short translated ORF in the JPX lncRNA that was derived from an upstream ORF in a protein-coding gene and retains some of its functionality.
Conclusions: We estimate that ~55 annotated conserved human lncRNAs are derived from parts of ancestral protein-coding genes, and loss of coding potential is thus a non-negligible source of new lncRNAs. Some lncRNAs inherited regulatory elements influencing transcription and translation from their protein-coding ancestors, and those elements can influence the expression breadth and functionality of these lncRNAs.
topic Long noncoding RNAs
Evolution
Pseudogenes
Translational regulation
uORFs
X inactivation
url http://link.springer.com/article/10.1186/s13059-017-1293-0