Automatic reconstruction of metabolic pathways from identified biosynthetic gene clusters

Background: A wide range of bioactive compounds is produced by enzymes and enzymatic complexes encoded in biosynthetic gene clusters (BGCs). These BGCs can be identified and functionally annotated based on their DNA sequence. Candidates for further research and development may be prioritized based o...

Bibliographic Details
Main Authors: Almaas, E. (Author), Fossheim, F.A. (Author), Sulheim, S. (Author), Wentzel, A. (Author)
Format: Article
Language: English
Published: BioMed Central Ltd 2021
Online Access: View Fulltext in Publisher (https://doi.org/10.1186/s12859-021-03985-0)
LEADER 03937nam a2200493Ia 4500
001 10.1186-s12859-021-03985-0
008 220427s2021 CNT 000 0 und d
020 |a 14712105 (ISSN) 
245 1 0 |a Automatic reconstruction of metabolic pathways from identified biosynthetic gene clusters 
260 0 |b BioMed Central Ltd  |c 2021 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1186/s12859-021-03985-0 
520 3 |a Background: A wide range of bioactive compounds is produced by enzymes and enzymatic complexes encoded in biosynthetic gene clusters (BGCs). These BGCs can be identified and functionally annotated based on their DNA sequence. Candidates for further research and development may be prioritized based on properties such as their functional annotation, (dis)similarity to known BGCs, and bioactivity assays. Production of the target compound in the native strain is often not achievable, making heterologous expression in an optimized host strain a promising alternative. Genome-scale metabolic models are frequently used to guide strain development, but large-scale incorporation and testing of heterologous production of complex natural products in this framework is hampered by the amount of manual work required to translate annotated BGCs into metabolic pathways. To this end, we have developed a pipeline for the automated reconstruction of BGC-associated metabolic pathways responsible for the synthesis of non-ribosomal peptides and polyketides, two of the dominant classes of bioactive compounds. Results: The developed pipeline correctly predicts 72.8% of the metabolic reactions in a detailed evaluation of 8 different BGCs comprising 228 functional domains. By introducing the reconstructed pathways into a genome-scale metabolic model, we demonstrate that this level of accuracy is sufficient to make reliable in silico predictions with respect to production rate and gene knockout targets. Furthermore, we apply the pipeline to a large BGC database and reconstruct 943 metabolic pathways. Using a high-throughput assessment of potential knockout targets, we identify 17 enzymatic reactions as targets for increasing the production of any of the associated compounds. However, the targets only provide a relative increase of up to 6% compared to wild-type production rates. 
Conclusion: With this pipeline, we pave the way for an extended use of genome-scale metabolic models in strain design of heterologous expression hosts. In this context, we identified generic knockout targets for the increased production of heterologous compounds. However, as the predicted increase is minor for any of the single-reaction knockout targets, these results indicate that more sophisticated strain-engineering strategies are necessary for the development of efficient BGC expression hosts. © 2021, The Author(s). 
650 0 4 |a AntiSMASH 
650 0 4 |a Automatic reconstruction 
650 0 4 |a biological product 
650 0 4 |a Biological Products 
650 0 4 |a biosynthesis 
650 0 4 |a Biosynthesis 
650 0 4 |a Biosynthetic gene cluster 
650 0 4 |a Biosynthetic gene clusters 
650 0 4 |a Biosynthetic Pathways 
650 0 4 |a Functional annotation 
650 0 4 |a Gene knockout target 
650 0 4 |a Genes 
650 0 4 |a genetics 
650 0 4 |a Genome scale metabolic model 
650 0 4 |a Genome-scale metabolic model 
650 0 4 |a Heterologous expression 
650 0 4 |a Heterologous production 
650 0 4 |a Metabolism 
650 0 4 |a multigene family 
650 0 4 |a Multigene Family 
650 0 4 |a Natural products 
650 0 4 |a Non-ribosomal peptide synthetases 
650 0 4 |a Pipelines 
650 0 4 |a Polyketide synthases 
650 0 4 |a Research and development 
650 0 4 |a Throughput 
700 1 |a Almaas, E.  |e author 
700 1 |a Fossheim, F.A.  |e author 
700 1 |a Sulheim, S.  |e author 
700 1 |a Wentzel, A.  |e author 
773 |t BMC Bioinformatics