Eukaryote Genes Are More Likely than Prokaryote Genes to Be Composites

The formation of new genes by combining parts of existing genes is an important evolutionary process. Remodelled genes, which we call composites, have been investigated in many species; however, their distribution across all of life is still unknown. We set out to examine the extent to which genomes...


Bibliographic Details
Main Authors: Yaqing Ou, James O. McInerney
Format: Article
Language: English
Published: MDPI AG 2019-08-01
Series: Genes
Subjects: composite genes; sequence similarity networks; odds ratio test
Online Access: https://www.mdpi.com/2073-4425/10/9/648
id doaj-c37f28c8672c47dbb4ffa766bbbd2ec3
record_format Article
spelling doaj-c37f28c8672c47dbb4ffa766bbbd2ec3; 2020-11-25T02:01:02Z; eng; MDPI AG; Genes; 2073-4425; 2019-08-01; volume 10, issue 9, article 648; 10.3390/genes10090648; genes10090648; Eukaryote Genes Are More Likely than Prokaryote Genes to Be Composites; Yaqing Ou (Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester M13 9PL, UK); James O. McInerney (Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester M13 9PL, UK); https://www.mdpi.com/2073-4425/10/9/648; composite genes; sequence similarity networks; odds ratio test
collection DOAJ
language English
format Article
sources DOAJ
author Yaqing Ou
James O. McInerney
title Eukaryote Genes Are More Likely than Prokaryote Genes to Be Composites
publisher MDPI AG
series Genes
issn 2073-4425
publishDate 2019-08-01
description The formation of new genes by combining parts of existing genes is an important evolutionary process. Remodelled genes, which we call composites, have been investigated in many species; however, their distribution across all of life is still unknown. We set out to examine the extent to which genomes from cells and mobile genetic elements contain composite genes. We identify composite genes as those that show partial homology to at least two unrelated component genes. To identify composite and component genes, we constructed sequence similarity networks (SSNs) of more than one million genes from all three domains of life, as well as viruses and plasmids. We identified non-transitive triplets of nodes in this network and explored the homology relationships in these triplets to see if the middle nodes were indeed composite genes. In total, we identified 221,043 (18.57%) composite genes, which were distributed across all genomic and functional categories. In particular, the presence of composite genes is statistically more likely in eukaryotes than in prokaryotes.
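The triplet search mentioned in the description can be illustrated with a minimal sketch. It is not the authors' pipeline: the edge-list input format, the function name `candidate_composites`, and the toy gene labels are all assumptions made for illustration. It covers only the topological step (finding a middle node whose two neighbours are not themselves connected) and omits the follow-up check of homology relationships that the description says was applied to each triplet.

```python
from itertools import combinations


def candidate_composites(edges):
    """Map each middle node to the non-transitive triplets it sits in.

    `edges` is an iterable of (gene_u, gene_v) pairs, each meaning the two
    genes show significant sequence similarity (hypothetical input format).
    """
    neighbours = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-hits
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    result = {}
    for middle, nbrs in neighbours.items():
        for a, c in combinations(sorted(nbrs), 2):
            # Non-transitive: a and c both match the middle gene,
            # but a and c do not match each other.
            if c not in neighbours[a]:
                result.setdefault(middle, []).append((a, c))
    return result


# Toy network: A-B-C is non-transitive; C, D and E form a closed triangle.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("C", "E")]
triplets = candidate_composites(edges)
print(triplets)
```

In this toy network, B and C are reported as candidate middle nodes, while D and E are not, because each pair of their neighbours is directly connected. A real analysis would still need to confirm that the two neighbours align to different regions of the candidate gene.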
topic composite genes
sequence similarity networks
odds ratio test
url https://www.mdpi.com/2073-4425/10/9/648