Classifying Familial Hypercholesterolaemia: A Tree-based Machine Learning Approach

Bibliographic Details
Main Authors: Chua, Y.-A (Author), Edward, J. (Author), Kasim, N.A.M (Author), Nawawi, H. (Author), Onn, M. (Author), Rosli, M.M (Author)
Format: Article
Language: English
Published: Science and Information Organization 2021
Series: International Journal of Advanced Computer Science and Applications
Description
Summary: Familial hypercholesterolaemia is the most common and serious form of inherited hyperlipidaemia. It has an autosomal dominant mode of inheritance and is characterised by severely elevated low-density lipoprotein cholesterol levels. Familial hypercholesterolaemia is an important cause of premature coronary heart disease, but is potentially treatable. However, the majority of individuals with familial hypercholesterolaemia are under-diagnosed and under-treated, resulting in lost opportunities for premature coronary heart disease prevention. This study aims to assess the performance of machine learning algorithms for enhancing familial hypercholesterolaemia detection within the Malaysian population. We applied three machine learning algorithms (random forest, gradient boosting and decision tree) to classify familial hypercholesterolaemia among Malaysian patients and to identify relevant features from four well-known diagnostic instruments: Simon Broome, Dutch Lipid Clinic Criteria, US Make Early Diagnosis to Prevent Early Deaths and Japanese FH Management Criteria. The performance of these classifiers was compared using accuracy, precision, sensitivity and specificity. Our results indicated that the decision tree classifier had the best performance, with an accuracy of 99.72%, followed by the gradient boosting and random forest classifiers, with accuracies of 99.54% and 99.52%, respectively. The three classifiers, combined with the Recursive Feature Elimination method, selected six common features of familial hypercholesterolaemia diagnostic criteria (family history of coronary heart disease, low-density lipoprotein cholesterol levels, presence of tendon xanthomata and/or corneal arcus, family hypercholesterolaemia, and family history of familial hypercholesterolaemia) that generated the highest accuracy in predicting familial hypercholesterolaemia. We anticipate that machine learning algorithms will enhance rapid diagnosis of familial hypercholesterolaemia by providing the tools to develop a virtual screening test for familial hypercholesterolaemia. © 2021. All Rights Reserved.
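The summary compares classifiers on accuracy, precision, sensitivity and specificity. As a minimal sketch of how these four measures relate to a binary confusion matrix, the following Python example computes them from paired labels. It is not the authors' code: the function name `binary_metrics`, the `BinaryMetrics` container, the 1 = FH / 0 = non-FH encoding and the toy labels are all assumptions made for illustration, and the numbers it produces have no connection to the study's results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    """Container for the four measures named in the summary (hypothetical helper)."""
    accuracy: float
    precision: float
    sensitivity: float
    specificity: float


def binary_metrics(y_true, y_pred):
    """Compute metrics from paired 0/1 labels (assumed encoding: 1 = FH, 0 = non-FH)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(numerator, denominator):
        # Return NaN instead of raising when a class is absent.
        return numerator / denominator if denominator else float("nan")

    return BinaryMetrics(
        accuracy=ratio(tp + tn, tp + tn + fp + fn),
        precision=ratio(tp, tp + fp),
        sensitivity=ratio(tp, tp + fn),   # also called recall
        specificity=ratio(tn, tn + fp),
    )


if __name__ == "__main__":
    # Toy labels only; not data from the study.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    print(binary_metrics(y_true, y_pred))
```

Reporting sensitivity and specificity alongside accuracy matters for screening tasks, because accuracy alone can look high when one class dominates the sample.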
ISSN: 2158-107X
DOI: 10.14569/IJACSA.2021.0120908