rKOMICS: an R package for processing mitochondrial minicircle assemblies in population-scale genome projects

Background: The advent of population-scale genome projects has revolutionized our biological understanding of parasitic protozoa. However, while hundreds to thousands of nuclear genomes of parasitic protozoa have been generated and analyzed, information about the diversity, structure and evolution o...


Bibliographic Details
Main Authors: Geerts, M. (Author), Schnaufer, A. (Author), Van den Broeck, F. (Author)
Format: Article
Language: English
Published: BioMed Central Ltd 2021
Online Access: View Fulltext in Publisher (https://doi.org/10.1186/s12859-021-04384-1)
LEADER 04028nam a2200565Ia 4500
001 10.1186-s12859-021-04384-1
008 220427s2021 CNT 000 0 und d
020 |a 14712105 (ISSN) 
245 1 0 |a rKOMICS: an R package for processing mitochondrial minicircle assemblies in population-scale genome projects 
260 0 |b BioMed Central Ltd  |c 2021 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1186/s12859-021-04384-1 
520 3 |a Background: The advent of population-scale genome projects has revolutionized our biological understanding of parasitic protozoa. However, while hundreds to thousands of nuclear genomes of parasitic protozoa have been generated and analyzed, information about the diversity, structure and evolution of their mitochondrial genomes remains fragmentary, mainly because of their extraordinary complexity. Indeed, unicellular flagellates of the order Kinetoplastida contain the most structurally complex mitochondrial genome of all eukaryotes, organized as a giant network of homogeneous maxicircles and heterogeneous minicircles. We recently developed KOMICS, an analysis toolkit that automates the assembly and circularization of the mitochondrial genomes of Kinetoplastid parasites. While this tool overcomes the limitation of extracting mitochondrial assemblies from Next-Generation Sequencing datasets, interpreting and visualizing the genetic (dis)similarity within and between samples remains a time-consuming process. Results: Here, we present a new analysis toolkit, rKOMICS, to streamline the analyses of minicircle sequence diversity in population-scale genome projects. rKOMICS is a user-friendly R package that has simple installation requirements and that is applicable to all 27 trypanosomatid genera. Once minicircle sequence alignments are generated, rKOMICS allows users to examine, summarize and visualize minicircle sequence diversity within and between samples through the analyses of minicircle sequence clusters. We showcase the functionalities of the (r)KOMICS tool suite using a whole-genome sequencing dataset from a recently published study on the history of diversification of the Leishmania braziliensis species complex in Peru. Analyses of population diversity and structure highlighted differences in minicircle sequence richness and composition between Leishmania subspecies, and between subpopulations within subspecies. 
Conclusion: The rKOMICS package establishes a critical framework to manipulate, explore and extract biologically relevant information from mitochondrial minicircle assemblies in tens to hundreds of samples simultaneously and efficiently. This should facilitate research that aims to develop new molecular markers for identifying species-specific minicircles, or to study the ancestry of parasites for complementary insights into their evolutionary history. © 2021, The Author(s). 
650 0 4 |a Analysis toolkits 
650 0 4 |a Assembly 
650 0 4 |a Clustering 
650 0 4 |a Clusterings 
650 0 4 |a Complex networks 
650 0 4 |a DNA, Kinetoplast 
650 0 4 |a Extraction 
650 0 4 |a Genes 
650 0 4 |a genetics 
650 0 4 |a Genome projects 
650 0 4 |a Genome, Mitochondrial 
650 0 4 |a high throughput sequencing 
650 0 4 |a High-Throughput Nucleotide Sequencing 
650 0 4 |a Kinetoplast 
650 0 4 |a kinetoplast DNA 
650 0 4 |a Leishmania 
650 0 4 |a Minicircle 
650 0 4 |a Minicircles 
650 0 4 |a Mitochondria 
650 0 4 |a mitochondrial genome 
650 0 4 |a Mitochondrial genomes 
650 0 4 |a Parasite- 
650 0 4 |a Parasites 
650 0 4 |a Parasitics 
650 0 4 |a Protozoa 
650 0 4 |a sequence alignment 
650 0 4 |a Sequence Alignment 
650 0 4 |a Sequencing 
650 0 4 |a Trypanosoma 
700 1 |a Geerts, M.  |e author 
700 1 |a Schnaufer, A.  |e author 
700 1 |a Van den Broeck, F.  |e author 
773 |t BMC Bioinformatics