An automated pipeline integrating AlphaFold 2 and MODELLER for protein structure prediction

The ability to predict a protein's three-dimensional conformation represents a crucial starting point for investigating evolutionary connections with other members of the corresponding protein family, examining interactions with other proteins, and potentially utilizing this knowledge for the p...

Bibliographic Details
Published in: Computational and Structural Biotechnology Journal
Main Authors: Fabio Hernan Gil Zuluaga, Nancy D’Arminio, Francesco Bardozzo, Roberto Tagliaferri, Anna Marabotti
Format: Article
Language: English
Published: Elsevier 2023-01-01
Subjects: Protein structure prediction; Deep learning; MODELLER; AlphaFold 2
Online Access: http://www.sciencedirect.com/science/article/pii/S2001037023004129
collection DOAJ
container_title Computational and Structural Biotechnology Journal
description The ability to predict a protein's three-dimensional conformation is a crucial starting point for investigating evolutionary relationships with other members of the corresponding protein family, examining interactions with other proteins, and potentially using this knowledge for rational drug design. In this work, we evaluated the feasibility of improving AlphaFold2's three-dimensional protein predictions by developing a novel pipeline (AlphaMod) that combines AlphaFold2 with MODELLER, a template-based modeling program. Additionally, our tool can perform a comprehensive quality assessment of the tertiary protein structure by incorporating and comparing a set of different quality assessment tools. The outcomes of selected tools are combined into a composite score (BORDASCORE) that exhibits a meaningful correlation with GDT_TS and facilitates the selection of optimal models in the absence of a reference structure. To validate AlphaMod's results, we conducted evaluations using two distinct datasets totaling 72 targets, previously used to independently assess AlphaFold2's performance. The generated models were evaluated in two ways: i) by averaging the GDT_TS scores across all structures produced for a single target sequence, and ii) by a pairwise comparison of the best structures generated by AlphaFold2 and AlphaMod. The latter, within the unsupervised setups, shows an increase in accuracy of approximately 34% over AlphaFold2, while in the supervised setup AlphaMod surpasses AlphaFold2 in 18% of the instances. Finally, there is an 11% correspondence in outcomes between the different methodologies. Consequently, in several cases AlphaMod's best-predicted tertiary structures showed a significant improvement in accuracy over the best models obtained by AlphaFold2. This pipeline paves the way for the integration of additional data and AI-based algorithms to further improve the reliability of the predictions.
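The description says that the outcomes of several quality assessment tools are combined into a composite score (BORDASCORE) but does not give its definition. As a rough illustration only, the sketch below shows a generic Borda count over candidate models: each tool ranks the models, a model earns more points the higher it is ranked, and the points are summed across tools. The tool names, model names, score values, and the `borda_scores` function are all hypothetical and are not taken from AlphaMod.

```python
from typing import Dict


def borda_scores(tool_scores: Dict[str, Dict[str, float]],
                 higher_is_better: Dict[str, bool]) -> Dict[str, float]:
    """Sum Borda points per model over all tools (generic Borda count).

    Assumption: every tool scores the same set of models. Ties are broken
    by model name order, a simplification chosen for this sketch.
    """
    models = sorted(next(iter(tool_scores.values())))
    n = len(models)
    totals = {model: 0.0 for model in models}
    for tool, scores in tool_scores.items():
        # Order the models from best to worst according to this tool.
        ranked = sorted(models, key=lambda m: scores[m],
                        reverse=higher_is_better[tool])
        for position, model in enumerate(ranked):
            # Best model gets n - 1 points, worst gets 0.
            totals[model] += n - 1 - position
    return totals


# Hypothetical scores from three made-up assessment tools.
example_scores = {
    "tool_a": {"model_1": 0.80, "model_2": 0.75, "model_3": 0.90},
    "tool_b": {"model_1": -1.2, "model_2": -0.5, "model_3": -1.0},
    "tool_c": {"model_1": 0.60, "model_2": 0.70, "model_3": 0.65},
}
# tool_b is treated as an energy-like score where lower is better.
example_direction = {"tool_a": True, "tool_b": False, "tool_c": True}

example_totals = borda_scores(example_scores, example_direction)
best_model = max(example_totals, key=example_totals.get)
print(example_totals, best_model)
```

A rank-based aggregate of this kind needs no reference structure, which matches the stated purpose of selecting models when no experimental structure is available. How the published score weights or normalizes the individual tools is not stated in this record.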
format Article
id doaj-art-e25dc130d0e04c2f8e5d02f2135c34e1
institution Directory of Open Access Journals
issn 2001-0370
language English
publishDate 2023-01-01
publisher Elsevier
record_format Article
spelling doaj-art-e25dc130d0e04c2f8e5d02f2135c34e1 (record timestamp 2025-08-19T21:43:56Z)
Language: eng
Publisher: Elsevier
Journal: Computational and Structural Biotechnology Journal (ISSN 2001-0370)
Published: 2023-01-01, volume 21, pages 5620-5629
DOI: 10.1016/j.csbj.2023.10.056
Title: An automated pipeline integrating AlphaFold 2 and MODELLER for protein structure prediction
Fabio Hernan Gil Zuluaga: Department of Management & Innovation Systems, University of Salerno, Via Giovanni Paolo II, 132, 84084 Fisciano, SA, Italy
Nancy D’Arminio: Department of Chemistry and Biology “A. Zambelli”, University of Salerno, Via Giovanni Paolo II, 132, 84084 Fisciano, SA, Italy
Francesco Bardozzo: Department of Management & Innovation Systems, University of Salerno, Via Giovanni Paolo II, 132, 84084 Fisciano, SA, Italy
Roberto Tagliaferri (corresponding author): Department of Management & Innovation Systems, University of Salerno, Via Giovanni Paolo II, 132, 84084 Fisciano, SA, Italy
Anna Marabotti (corresponding author): Department of Chemistry and Biology “A. Zambelli”, University of Salerno, Via Giovanni Paolo II, 132, 84084 Fisciano, SA, Italy
title An automated pipeline integrating AlphaFold 2 and MODELLER for protein structure prediction
topic Protein structure prediction
Deep learning
MODELLER
AlphaFold 2
url http://www.sciencedirect.com/science/article/pii/S2001037023004129