Artificial intelligence for interpretation of segments of whole body MRI in CNO: pilot study comparing radiologists versus machine learning algorithm

Bibliographic Details
Main Authors: Chandrika S. Bhat, Mark Chopra, Savvas Andronikou, Suvadip Paul, Zach Wener-Fligner, Anna Merkoulovitch, Izidora Holjar-Erlic, Flavia Menegotto, Ewan Simpson, David Grier, Athimalaipet V. Ramanan
Format: Article
Language: English
Published: BMC 2020-06-01
Series: Pediatric Rheumatology Online Journal
Online Access: http://link.springer.com/article/10.1186/s12969-020-00442-9
Description
Summary: Abstract. Background: To initiate the development of a machine learning algorithm capable of comparing segments of pre- and post-pamidronate whole body MRI scans to assess treatment response, and to compare the results of this algorithm with the analysis of a panel of paediatric radiologists. Methods: Whole body MRI scans of patients under the age of 16 diagnosed with CNO (chronic nonbacterial osteomyelitis) and treated with pamidronate at a tertiary referral paediatric hospital in the United Kingdom between 2005 and 2017 were reviewed. Pre- and post-pamidronate images of the commonest sites of involvement (distal femur and proximal tibia) were manually selected (n = 45). A machine learning algorithm was developed and tested to assess treatment effectiveness by comparing pre- and post-pamidronate scans. The results of this algorithm were compared with the results of a panel of radiologists (ground truth). Results: When tested initially, the machine learning algorithm predicted 4/7 (57.1%) examples correctly in the multi-class model and 5/7 (71.4%) correctly in the binary group. However, when compared to the ground truth, the machine model classified only 33.3% of the samples correctly but had a sensitivity of 100% in detecting improvement or worsening of disease. Conclusion: The machine learning algorithm could detect new lesions or resolution of a lesion with good sensitivity but failed to classify stable disease accurately. Further validation on larger datasets is required to improve the specificity and accuracy of the machine model.
ISSN: 1546-0096
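
Note on the reported metrics: the summary quotes a low overall accuracy (33.3%) alongside a sensitivity of 100% for detecting change. The sketch below is one way to see how both figures can hold at once. It is illustrative only: the record does not give the study's confusion matrix, so the label lists are made up, and the function names and the definition of sensitivity used here (the share of truly changed cases that the model flags as changed in either direction) are assumptions, not the authors' method. Only the 4/7 and 5/7 fractions come from the summary.

```python
def accuracy(y_true, y_pred):
    """Fraction of cases where the predicted label equals the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def change_sensitivity(y_true, y_pred, stable="stable"):
    """Assumed definition: of the cases that truly changed (improved or
    worse), the fraction that the model also flags as changed."""
    changed = [(t, p) for t, p in zip(y_true, y_pred) if t != stable]
    detected = sum(p != stable for _, p in changed)
    return detected / len(changed)


# Fractions quoted in the summary for the initial test set.
multi_class_pct = round(4 / 7 * 100, 1)  # 57.1
binary_pct = round(5 / 7 * 100, 1)       # 71.4

# HYPOTHETICAL labels, not data from the study: a model that never
# predicts "stable" catches every real change but misses all stable cases.
truth = ["improved", "worse", "stable", "stable", "stable", "stable"]
pred = ["improved", "worse", "improved", "worse", "improved", "worse"]

acc_pct = round(accuracy(truth, pred) * 100, 1)
sens_pct = round(change_sensitivity(truth, pred) * 100, 1)
print(multi_class_pct, binary_pct, acc_pct, sens_pct)
```

In this made-up case every real change is detected, yet every stable case is misclassified, which is the pattern the conclusion describes: good sensitivity for new or resolved lesions, poor classification of stable disease.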