SeMPI 2.0—A Web Server for PKS and NRPS Predictions Combined with Metabolite Screening in Natural Product Databases
Microorganisms produce secondary metabolites with a remarkable range of bioactive properties. The constantly increasing amount of published genomic data provides the opportunity for efficient identification of biosynthetic gene clusters by genome mining. On the other hand, for many natural products...
Main Authors: | Paul F. Zierep, Adriana T. Ceci, Ilia Dobrusin, Sinclair C. Rockwell-Kollmann, Stefan Günther |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2021-12-01 |
Series: | Metabolites |
Subjects: | secondary metabolites, natural compounds, machine learning, nonribosomal peptides, polyketides |
Online Access: | https://www.mdpi.com/2218-1989/11/1/13 |
id |
doaj-39b960dc09c1497ab8da5242a3275b66 |
---|---|
record_format |
Article |
spelling |
doaj-39b960dc09c1497ab8da5242a3275b66; 2020-12-30T00:01:29Z; eng; MDPI AG; Metabolites; ISSN 2218-1989; 2021-12-01; volume 11, issue 1, article 13; DOI 10.3390/metabo11010013
Paul F. Zierep (Institute of Pharmaceutical Sciences, Albert-Ludwigs-Universität Freiburg, Hermann-Herder-Straße 9, 79104 Freiburg, Germany)
Adriana T. Ceci (Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, Via Sommarive 9, Povo, 38123 Trento, Italy)
Ilia Dobrusin (Institute of Pharmaceutical Sciences, Albert-Ludwigs-Universität Freiburg, Hermann-Herder-Straße 9, 79104 Freiburg, Germany)
Sinclair C. Rockwell-Kollmann (Institute of Pharmaceutical Sciences, Albert-Ludwigs-Universität Freiburg, Hermann-Herder-Straße 9, 79104 Freiburg, Germany)
Stefan Günther (Institute of Pharmaceutical Sciences, Albert-Ludwigs-Universität Freiburg, Hermann-Herder-Straße 9, 79104 Freiburg, Germany)
https://www.mdpi.com/2218-1989/11/1/13 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Paul F. Zierep Adriana T. Ceci Ilia Dobrusin Sinclair C. Rockwell-Kollmann Stefan Günther |
author_sort |
Paul F. Zierep |
title |
SeMPI 2.0—A Web Server for PKS and NRPS Predictions Combined with Metabolite Screening in Natural Product Databases |
publisher |
MDPI AG |
series |
Metabolites |
issn |
2218-1989 |
publishDate |
2021-12-01 |
description |
Microorganisms produce secondary metabolites with a remarkable range of bioactive properties. The constantly increasing amount of published genomic data provides the opportunity for efficient identification of biosynthetic gene clusters by genome mining. On the other hand, for many natural products with resolved structures, the encoding biosynthetic gene clusters have not been identified yet. Of those secondary metabolites, the scaffolds of nonribosomal peptides and polyketides (type I modular) can be predicted due to their building block-like assembly. SeMPI v2 provides a comprehensive prediction pipeline, which includes the screening of the scaffold in publicly available natural compound databases. The screening algorithm was designed to detect homologous structures even for partial, incomplete clusters. The pipeline allows linking of gene clusters to known natural products and therefore also provides a metric to estimate the novelty of the cluster if a matching scaffold cannot be found. Whereas currently available tools attempt to provide comprehensive information about a wide range of gene clusters, SeMPI v2 aims to focus on precise predictions. Therefore, the cluster detection algorithm, including building block generation and domain substrate prediction, was thoroughly refined and benchmarked, to provide high-quality scaffold predictions. In a benchmark based on 559 gene clusters, SeMPI v2 achieved comparable or better results than antiSMASH v5. Additionally, the SeMPI v2 web server provides features that can help to further investigate a submitted gene cluster, such as the incorporation of a genome browser, and the possibility to modify a predicted scaffold in a workbench before the database screening. |
topic |
secondary metabolites, natural compounds, machine learning, nonribosomal peptides, polyketides |
url |
https://www.mdpi.com/2218-1989/11/1/13 |