Machine learning application for development of a data-driven predictive model able to investigate quality of life scores in a rare disease

Abstract Background Alkaptonuria (AKU) is an ultra-rare autosomal recessive disease caused by a mutation in the homogentisate 1,2-dioxygenase (HGD) gene. One of the main obstacles in studying AKU, and other ultra-rare diseases, is the lack of a standardized methodology to assess disease severity or...

Bibliographic Details
Main Authors: Ottavia Spiga, Vittoria Cicaloni, Cosimo Fiorini, Alfonso Trezza, Anna Visibelli, Lia Millucci, Giulia Bernardini, Andrea Bernini, Barbara Marzocchi, Daniela Braconi, Filippo Prischi, Annalisa Santucci
Format: Article
Language: English
Published: BMC 2020-02-01
Series: Orphanet Journal of Rare Diseases
Subjects: Rare disease, Alkaptonuria, Machine learning, QoL scores, Precision medicine
Online Access: https://doi.org/10.1186/s13023-020-1305-0
id doaj-7e00aa955b77490f827561462633086f
record_format Article
spelling doaj-7e00aa955b77490f827561462633086f 2021-02-14T12:09:44Z eng BMC Orphanet Journal of Rare Diseases 1750-1172 2020-02-01 151110 10.1186/s13023-020-1305-0
Ottavia Spiga (Department of Biotechnology, Chemistry and Pharmacy, University of Siena)
Vittoria Cicaloni (Department of Biotechnology, Chemistry and Pharmacy, University of Siena)
Cosimo Fiorini (Energy way)
Alfonso Trezza (Department of Biotechnology, Chemistry and Pharmacy, University of Siena)
Anna Visibelli (Department of Biotechnology, Chemistry and Pharmacy, University of Siena)
Lia Millucci (Department of Biotechnology, Chemistry and Pharmacy, University of Siena)
Giulia Bernardini (Department of Biotechnology, Chemistry and Pharmacy, University of Siena)
Andrea Bernini (Department of Biotechnology, Chemistry and Pharmacy, University of Siena)
Barbara Marzocchi (Department of Biotechnology, Chemistry and Pharmacy, University of Siena)
Daniela Braconi (Department of Biotechnology, Chemistry and Pharmacy, University of Siena)
Filippo Prischi (School of Life Sciences, University of Essex)
Annalisa Santucci (Department of Biotechnology, Chemistry and Pharmacy, University of Siena)
collection DOAJ
language English
format Article
sources DOAJ
author Ottavia Spiga
Vittoria Cicaloni
Cosimo Fiorini
Alfonso Trezza
Anna Visibelli
Lia Millucci
Giulia Bernardini
Andrea Bernini
Barbara Marzocchi
Daniela Braconi
Filippo Prischi
Annalisa Santucci
publisher BMC
series Orphanet Journal of Rare Diseases
issn 1750-1172
publishDate 2020-02-01
description Abstract. Background: Alkaptonuria (AKU) is an ultra-rare autosomal recessive disease caused by a mutation in the homogentisate 1,2-dioxygenase (HGD) gene. One of the main obstacles in studying AKU, and other ultra-rare diseases, is the lack of a standardized methodology to assess disease severity or response to treatment. Quality of Life (QoL) scores are a reliable way to monitor patients’ clinical condition and health status. They make it possible to follow the evolution of a disease and to assess the suitability of treatments by taking into account patients’ symptoms, general health status and care satisfaction. However, more comprehensive tools are needed to study a complex, multi-systemic disease like AKU. In this study, a Machine Learning (ML) approach was implemented to predict QoL scores from clinical data deposited in ApreciseKUre, an AKU-dedicated database.
Method: Data from 129 AKU patients were first examined through a preliminary statistical analysis (Pearson correlation coefficient) to measure the linear correlation between 11 QoL scores. The importance of 110 ApreciseKUre biomarkers for predicting QoL scores was then calculated using XGBoost, together with a k-nearest neighbours (k-NN) approach. Because of the limited amount of data available, the model was validated using surrogate data analysis.
Results: We identified a direct correlation of 6 of the 110 biomarkers (age, Serum Amyloid A, Chitotriosidase, Advanced Oxidation Protein Products, S-thiolated proteins and Body Mass Index) with the QoL health status, in particular with the KOOS (Knee injury and Osteoarthritis Outcome Score) symptoms score (Relative Absolute Error (RAE) 0.25). The error distribution of the surrogate model (RAE 0.38) was unequivocally higher than that of the true model (RAE 0.25), confirming the consistency of our dataset. Our data showed that inflammation, oxidative stress, amyloidosis and patients’ lifestyle correlate with the QoL scores for physical status, while no correlation between the biomarkers and patients’ mental health was present (RAE 1.1).
Conclusions: This proof-of-principle study for rare diseases confirms the importance of databases that support data management and analysis, which can be used to predict more effective treatments.
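The abstract reports Relative Absolute Error (RAE) values and a surrogate data check without giving formulas or code. The sketch below is only an illustration under stated assumptions: it uses the common definition of RAE (sum of absolute errors divided by the sum of absolute deviations from the mean, so a value near or above 1 means the model does no better than predicting the mean), a toy one-feature k-NN regressor scored by leave-one-out, and synthetic data. It is not the authors' XGBoost pipeline, and every function name and number in it is hypothetical.

```python
import random
import statistics


def relative_absolute_error(y_true, y_pred):
    """RAE = sum|y - y_hat| / sum|y - mean(y)| (common definition, assumed here)."""
    mean_y = statistics.fmean(y_true)
    baseline = sum(abs(y - mean_y) for y in y_true)
    if baseline == 0:
        raise ValueError("RAE is undefined when all targets are identical")
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / baseline


def knn_predict(train_x, train_y, x, k=3):
    """Toy one-feature k-NN regressor: mean target of the k closest points."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return statistics.fmean([train_y[i] for i in order[:k]])


def leave_one_out_rae(xs, ys, k=3):
    """Predict each sample from all the others, then score with RAE."""
    preds = []
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        preds.append(knn_predict(rest_x, rest_y, xs[i], k))
    return relative_absolute_error(ys, preds)


def surrogate_raes(xs, ys, n_surrogates=50, k=3, seed=0):
    """Shuffle the targets to break any feature-target link, then re-score."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_surrogates):
        shuffled = list(ys)
        rng.shuffle(shuffled)
        scores.append(leave_one_out_rae(xs, shuffled, k))
    return scores


# Synthetic data (hypothetical): one biomarker that drives the score linearly.
demo_x = [float(i) for i in range(30)]
demo_y = [2.0 * x for x in demo_x]

true_rae = leave_one_out_rae(demo_x, demo_y)
surrogates = surrogate_raes(demo_x, demo_y)
print("true-model RAE:", round(true_rae, 3))
print("lowest surrogate RAE:", round(min(surrogates), 3))
```

If the true-model RAE sits well below the surrogate scores, the model is picking up real structure rather than noise, which is the kind of comparison the abstract describes (0.25 against 0.38).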
topic Rare disease
Alkaptonuria
Machine learning
QoL scores
Precision medicine
url https://doi.org/10.1186/s13023-020-1305-0