SeMPI 2.0—A Web Server for PKS and NRPS Predictions Combined with Metabolite Screening in Natural Product Databases

Microorganisms produce secondary metabolites with a remarkable range of bioactive properties. The constantly increasing amount of published genomic data provides the opportunity for efficient identification of biosynthetic gene clusters by genome mining. On the other hand, for many natural products...

Bibliographic Details
Main Authors:	Paul F. Zierep, Adriana T. Ceci, Ilia Dobrusin, Sinclair C. Rockwell-Kollmann, Stefan Günther
Format:	Article
Language:	English
Published:	MDPI AG 2021-12-01
Series:	Metabolites
Subjects:	secondary metabolites; natural compounds; machine learning; nonribosomal peptides; polyketides
Online Access:	https://www.mdpi.com/2218-1989/11/1/13

id	doaj-39b960dc09c1497ab8da5242a3275b66
record_format	Article
spelling	Metabolites (MDPI AG), ISSN 2218-1989, vol. 11, no. 1, art. 13, 2021-12-01; record timestamp 2020-12-30T00:01:29Z; language: eng
doi	10.3390/metabo11010013
author_affiliation	Paul F. Zierep: Institute of Pharmaceutical Sciences, Albert-Ludwigs-Universität Freiburg, Hermann-Herder-Straße 9, 79104 Freiburg, Germany
author_affiliation	Adriana T. Ceci: Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, Via Sommarive 9, Povo, 38123 Trento, Italy
author_affiliation	Ilia Dobrusin: Institute of Pharmaceutical Sciences, Albert-Ludwigs-Universität Freiburg, Hermann-Herder-Straße 9, 79104 Freiburg, Germany
author_affiliation	Sinclair C. Rockwell-Kollmann: Institute of Pharmaceutical Sciences, Albert-Ludwigs-Universität Freiburg, Hermann-Herder-Straße 9, 79104 Freiburg, Germany
author_affiliation	Stefan Günther: Institute of Pharmaceutical Sciences, Albert-Ludwigs-Universität Freiburg, Hermann-Herder-Straße 9, 79104 Freiburg, Germany
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Paul F. Zierep, Adriana T. Ceci, Ilia Dobrusin, Sinclair C. Rockwell-Kollmann, Stefan Günther
title	SeMPI 2.0—A Web Server for PKS and NRPS Predictions Combined with Metabolite Screening in Natural Product Databases
publisher	MDPI AG
series	Metabolites
issn	2218-1989
publishDate	2021-12-01
description	Microorganisms produce secondary metabolites with a remarkable range of bioactive properties. The constantly increasing amount of published genomic data provides the opportunity for efficient identification of biosynthetic gene clusters by genome mining. On the other hand, for many natural products with resolved structures, the encoding biosynthetic gene clusters have not been identified yet. Of those secondary metabolites, the scaffolds of nonribosomal peptides and polyketides (type I modular) can be predicted due to their building block-like assembly. SeMPI v2 provides a comprehensive prediction pipeline, which includes the screening of the scaffold in publicly available natural compound databases. The screening algorithm was designed to detect homologous structures even for partial, incomplete clusters. The pipeline allows linking of gene clusters to known natural products and therefore also provides a metric to estimate the novelty of the cluster if a matching scaffold cannot be found. Whereas currently available tools attempt to provide comprehensive information about a wide range of gene clusters, SeMPI v2 aims to focus on precise predictions. Therefore, the cluster detection algorithm, including building block generation and domain substrate prediction, was thoroughly refined and benchmarked, to provide high-quality scaffold predictions. In a benchmark based on 559 gene clusters, SeMPI v2 achieved comparable or better results than antiSMASH v5. Additionally, the SeMPI v2 web server provides features that can help to further investigate a submitted gene cluster, such as the incorporation of a genome browser, and the possibility to modify a predicted scaffold in a workbench before the database screening.
topic	secondary metabolites; natural compounds; machine learning; nonribosomal peptides; polyketides
url	https://www.mdpi.com/2218-1989/11/1/13

Similar Items