Automating classification of osteoarthritis according to Kellgren-Lawrence in the knee using deep learning in an unfiltered adult population

Bibliographic Details
Main Authors: Simon Olsson, Ehsan Akbarian, Anna Lind, Ali Sharif Razavian, Max Gordon
Format: Article
Language: English
Published: BMC 2021-10-01
Series: BMC Musculoskeletal Disorders
Subjects:
Online Access: https://doi.org/10.1186/s12891-021-04722-7
Description
Summary: Background: The prevalence of knee osteoarthritis is rising both in Sweden and globally due to increasing age and obesity in the population. This has led to a growing demand for knee arthroplasties. Correct diagnosis and classification of knee osteoarthritis (OA) are therefore of great interest for follow-up and for planning either conservative or operative management. Most orthopedic surgeons rely on standard weight-bearing radiographs of the knee. Improving the reliability and reproducibility of these interpretations could thus be highly beneficial. Recently, deep learning, a form of artificial intelligence (AI), has shown promising results in interpreting radiographic images. In this study, we aim to evaluate how well an AI can classify the severity of knee OA using entire image series, without excluding common visual disturbances such as implants, casts and non-degenerative pathologies. Methods: We selected 6103 radiographic exams of the knee taken at Danderyd University Hospital between 2002 and 2016 and manually categorized them according to the Kellgren & Lawrence (KL) grading scale. We then trained a convolutional neural network (CNN) of ResNet architecture using PyTorch. We evaluated the results against a test set of 300 exams that had been reviewed independently by two senior orthopedic surgeons, who settled any interobserver disagreements through consensus sessions. Results: The CNN yielded an AUC of more than 0.87 for all KL grades except KL grade 2, which yielded an AUC of 0.8; the mean AUC was 0.92. When adjacent KL grades were merged, all but one group showed near-perfect results, with AUC > 0.95 indicating excellent performance. Conclusion: We found that a CNN could be taught to correctly diagnose and classify the severity of knee OA using the KL grading system without first removing major visual disturbances, such as implants and other pathologies, from the input data.
ISSN: 1471-2474
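
The summary reports an AUC per KL grade and for merged adjacent grades, but does not say how those values were computed. The following is a minimal sketch of one common approach, assuming a one-vs-rest AUC in which the score for a merged group is the sum of the predicted probabilities of its grades. The function names (`binary_auc`, `group_auc`) and the toy probabilities are hypothetical and are not taken from the article.

```python
from typing import List, Sequence, Set


def binary_auc(scores: Sequence[float], is_positive: Sequence[bool]) -> float:
    """AUC as the probability that a random positive exam outranks a
    random negative exam; ties count as half (Mann-Whitney formulation)."""
    positives = [s for s, p in zip(scores, is_positive) if p]
    negatives = [s for s, p in zip(scores, is_positive) if not p]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative exam")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def group_auc(probs: Sequence[Sequence[float]],
              labels: Sequence[int],
              group: Set[int]) -> float:
    """One-vs-rest AUC for a set of KL grades (one grade, or merged
    adjacent grades). Assumption: the group score is the summed
    probability of the grades in the group."""
    scores = [sum(row[g] for g in group) for row in probs]
    is_positive = [label in group for label in labels]
    return binary_auc(scores, is_positive)


# Hypothetical model output for six exams: one probability per KL grade 0-4.
probs: List[List[float]] = [
    [0.70, 0.20, 0.05, 0.03, 0.02],
    [0.30, 0.40, 0.20, 0.05, 0.05],
    [0.10, 0.35, 0.30, 0.20, 0.05],
    [0.02, 0.08, 0.35, 0.40, 0.15],
    [0.01, 0.02, 0.07, 0.20, 0.70],
    [0.05, 0.15, 0.45, 0.30, 0.05],
]
labels: List[int] = [0, 1, 2, 3, 4, 2]  # hypothetical consensus KL grades

per_grade = {g: group_auc(probs, labels, {g}) for g in range(5)}
mean_auc = sum(per_grade.values()) / len(per_grade)
merged_2_3 = group_auc(probs, labels, {2, 3})

print(per_grade)
print(mean_auc, merged_2_3)
```

In this toy data, KL grade 2 is the only grade that is sometimes outranked by a neighbouring grade, so it gets the lowest per-grade AUC, while merging it with grade 3 removes that confusion. This only illustrates why merged groups can score higher than single grades; it says nothing about the article's actual data.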