Bind: large-scale biological interaction network discovery through knowledge graph-driven machine learning

Bibliographic Details
Published in: Journal of Translational Medicine
Main Authors: Naafey Aamer, Muhammad Nabeel Asim, Aamer Iqbal Bhatti, Andreas Dengel
Format: Article
Language: English
Published: BMC 2025-07-01
Online Access: https://doi.org/10.1186/s12967-025-06789-5
Summary:
Background: Biological systems derive from complex interactions between entities ranging from biomolecules to macroscopic structures, forming intricate networks essential for understanding disease mechanisms and developing therapeutic interventions. Current AI-driven interaction predictors typically operate in isolation, focusing on single tasks and missing the broader picture of how different biological interactions influence each other. Traditional wet-lab approaches for identifying these interactions are expensive, time-consuming, and error-prone. No unified platform currently exists where biologists can predict and analyze multiple types of biological relationships comprehensively, limiting our ability to discover new therapeutic applications and fully understand interconnected biological mechanisms.
Methods: We developed BIND (Biological Interaction Network Discovery), a comprehensive framework utilizing 11 knowledge graph embedding methods evaluated on 8 million interactions across 30 biological relationships and 129,000 nodes. We implemented a two-stage training strategy to mitigate class imbalance and heterogeneity: initial training on all 30 interaction types to capture inter-relationships, followed by relation-specific fine-tuning. Entity embeddings for each relation from top-performing models (ranked by mean reciprocal rank, MRR) were input into 7 machine learning classifiers separately, creating 1,050 predictive pipelines evaluated through extensive experimentation and hyperparameter optimization. Performance was assessed using F1-scores across all interaction types.
Results: Architecturally simpler embedding models captured biological interaction patterns, often outperforming complex approaches. The two-stage training strategy achieved improvements of up to 26.9% for protein-protein interactions. Optimal embedding-classifier combinations achieved F1-scores ranging from 0.85 to 0.99 across different biological domains. In a drug-phenotype interaction case study, BIND generated 1,355 high-confidence predictions, with novel interactions successfully validated through existing literature evidence.
Conclusion: BIND provides a unified web application enabling prediction and analysis of multiple biological interaction types simultaneously, offering superior performance compared to isolated approaches. The platform serves as a valuable tool for biologists to identify unknown interactions for experimental validation, potentially accelerating biomarker discovery and therapeutic development through comprehensive biological interaction network analysis.
ISSN:1479-5876
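
Note: The summary names two evaluation metrics, MRR for ranking embedding models and F1-score for the downstream classifiers, and reports 1,050 predictive pipelines. The following is a minimal Python sketch of how such quantities are conventionally computed; it is not the authors' code. The function names are hypothetical, and the figure of 5 top-performing embedding models per relation is an assumption chosen only because 30 × 5 × 7 = 1,050; the summary does not state that number.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank, where each rank is the 1-based position
    at which the true entity appears in a model's ranked candidates."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, from confusion counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def pipeline_count(relations: int, models_per_relation: int, classifiers: int) -> int:
    """Number of embedding-classifier pipelines when every selected
    embedding model is paired with every classifier for every relation."""
    return relations * models_per_relation * classifiers


# Toy inputs, not data from the paper.
mrr = mean_reciprocal_rank([1, 2, 4])   # (1 + 1/2 + 1/4) / 3
f1 = f1_score(tp=8, fp=2, fn=2)         # precision = recall = 0.8

# ASSUMPTION: 5 top models per relation (not stated in the summary).
total = pipeline_count(relations=30, models_per_relation=5, classifiers=7)
```

Under that assumed selection size, the product matches the 1,050 pipelines reported in the summary; a different selection scheme could yield the same total.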