Mining microbe–disease interactions from literature via a transfer learning model
Abstract Background: Interactions between microbes and diseases are of great importance for biomedical research. However, a large number of microbe–disease interactions are hidden in the biomedical literature, and structured databases for microbe–disease interactions are limited. In this paper, we...
Main Authors: | Chengkun Wu, Xinyi Xiao, Canqun Yang, JinXiang Chen, Jiacai Yi, Yanlong Qiu |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2021-09-01 |
Series: | BMC Bioinformatics |
Subjects: | Microbe–disease interactions, Named-entity recognition, Relation extraction, Transfer learning |
Online Access: | https://doi.org/10.1186/s12859-021-04346-7 |
id |
doaj-5126382b95f14defa421c0e00f08b7f0 |
---|---|
record_format |
Article |
spelling |
doaj-5126382b95f14defa421c0e00f08b7f0 2021-09-12T11:13:25Z eng BMC BMC Bioinformatics 1471-2105 2021-09-01 22 1 1 15 10.1186/s12859-021-04346-7 Mining microbe–disease interactions from literature via a transfer learning model. Chengkun Wu (0), Xinyi Xiao (1), Canqun Yang (2), JinXiang Chen (3), Jiacai Yi (4), Yanlong Qiu (5). (0) State Key Laboratory of High-Performance Computing, National University of Defense Technology; (1) College of Computer, National University of Defense Technology; (2) College of Computer, National University of Defense Technology; (3) Department of General Surgery, Xiangya Hospital, Central South University; (4) College of Computer, National University of Defense Technology; (5) College of Computer, National University of Defense Technology. Abstract Background: Interactions between microbes and diseases are of great importance for biomedical research. However, a large number of microbe–disease interactions are hidden in the biomedical literature, and structured databases for microbe–disease interactions are limited. In this paper, we aim to construct a large-scale database of microbe–disease interactions automatically. We achieved this goal by applying text mining methods based on a deep learning model, at a moderate curation cost. We also built a user-friendly web interface that allows researchers to navigate and query the required information. Results: First, we manually constructed a gold-standard corpus and a silver-standard corpus (SSC) of microbe–disease interactions for curation. We then proposed a text mining framework for microbe–disease interaction extraction based on the pretrained model BERE. We applied named entity recognition tools to detect microbe and disease mentions in free biomedical text. After that, we fine-tuned BERE, which was originally built for drug–target and drug–drug interactions, to recognize relations between the targeted entities. Introducing the SSC for model fine-tuning greatly improved detection performance for microbe–disease interactions, with an average error reduction of approximately 10%. The MDIDB website offers data browsing, custom searching for specific diseases or microbes, and batch downloading. Conclusions: Evaluation results demonstrate that our method outperforms the baseline model (rule-based PKDE4J), with an average $F_1$-score of 73.81%. For further validation, we randomly sampled nearly 1000 interactions predicted by our model and manually checked the correctness of each one, which gave an accuracy of 73%. The MDIDB website is freely available at http://dbmdi.com/index/ https://doi.org/10.1186/s12859-021-04346-7 Microbe–disease interactions; Named-entity recognition; Relation extraction; Transfer learning |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Chengkun Wu Xinyi Xiao Canqun Yang JinXiang Chen Jiacai Yi Yanlong Qiu |
author_sort |
Chengkun Wu |
title |
Mining microbe–disease interactions from literature via a transfer learning model |
publisher |
BMC |
series |
BMC Bioinformatics |
issn |
1471-2105 |
publishDate |
2021-09-01 |
description |
Abstract Background: Interactions between microbes and diseases are of great importance for biomedical research. However, a large number of microbe–disease interactions are hidden in the biomedical literature, and structured databases for microbe–disease interactions are limited. In this paper, we aim to construct a large-scale database of microbe–disease interactions automatically. We achieved this goal by applying text mining methods based on a deep learning model, at a moderate curation cost. We also built a user-friendly web interface that allows researchers to navigate and query the required information. Results: First, we manually constructed a gold-standard corpus and a silver-standard corpus (SSC) of microbe–disease interactions for curation. We then proposed a text mining framework for microbe–disease interaction extraction based on the pretrained model BERE. We applied named entity recognition tools to detect microbe and disease mentions in free biomedical text. After that, we fine-tuned BERE, which was originally built for drug–target and drug–drug interactions, to recognize relations between the targeted entities. Introducing the SSC for model fine-tuning greatly improved detection performance for microbe–disease interactions, with an average error reduction of approximately 10%. The MDIDB website offers data browsing, custom searching for specific diseases or microbes, and batch downloading. Conclusions: Evaluation results demonstrate that our method outperforms the baseline model (rule-based PKDE4J), with an average $F_1$-score of 73.81%. For further validation, we randomly sampled nearly 1000 interactions predicted by our model and manually checked the correctness of each one, which gave an accuracy of 73%. The MDIDB website is freely available at http://dbmdi.com/index/ |
topic |
Microbe–disease interactions Named-entity recognition Relation extraction Transfer learning |
url |
https://doi.org/10.1186/s12859-021-04346-7 |