How accurate are digital symptom assessment apps for suggesting conditions and urgency advice? A clinical vignettes comparison to GPs
Objectives: To compare breadth of condition coverage, accuracy of suggested conditions and appropriateness of urgency advice of eight popular symptom assessment apps. Design: Vignettes study. Setting: 200 primary care vignettes. Intervention/comparator: For eight apps and seven general practitioners (GPs):...
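Main Authors: | Stephen Gilbert, Alicia Mehl, Adel Baluch, Caoimhe Cawley, Jean Challiner, Hamish Fraser, Elizabeth Millen, Maryam Montazeri, Jan Multmeier, Fiona Pick, Claudia Richter, Ewelina Türk, Shubhanan Upadhyay, Vishaal Virani, Nicola Vona, Paul Wicks, Claire Novorol |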
---|---|
Format: | Article |
Language: | English |
Published: | BMJ Publishing Group, 2020-12-01 |
Series: | BMJ Open |
Online Access: | https://bmjopen.bmj.com/content/10/12/e040269.full |
id |
doaj-49a4cb0b2c41481dbdb2c7d37a25a6c4 |
---|---|
record_format |
Article |
spelling |
doaj-49a4cb0b2c41481dbdb2c7d37a25a6c4; 2021-09-06T11:00:04Z; eng; BMJ Publishing Group; BMJ Open; ISSN 2044-6055; 2020-12-01; vol. 10, no. 12; doi:10.1136/bmjopen-2020-040269. How accurate are digital symptom assessment apps for suggesting conditions and urgency advice? A clinical vignettes comparison to GPs. Authors: Stephen Gilbert, Alicia Mehl, Adel Baluch, Caoimhe Cawley, Jean Challiner, Hamish Fraser, Elizabeth Millen, Maryam Montazeri, Jan Multmeier, Fiona Pick, Claudia Richter, Ewelina Türk, Shubhanan Upadhyay, Vishaal Virani, Nicola Vona, Paul Wicks, Claire Novorol. Affiliations: Hamish Fraser: Brown Center for Biomedical Informatics, Brown University, Rhode Island, USA; all other authors: Ada Health GmbH, Berlin, Germany. Objectives: To compare breadth of condition coverage, accuracy of suggested conditions and appropriateness of urgency advice of eight popular symptom assessment apps. Design: Vignettes study. Setting: 200 primary care vignettes. Intervention/comparator: For eight apps and seven general practitioners (GPs): breadth of coverage and condition-suggestion and urgency advice accuracy measured against the vignettes’ gold-standard. Primary outcome measures: (1) Proportion of conditions ‘covered’ by an app, that is, not excluded because the user was too young/old or pregnant, or not modelled; (2) proportion of vignettes with the correct primary diagnosis among the top 3 conditions suggested; (3) proportion of ‘safe’ urgency advice (ie, at gold standard level, more conservative, or no more than one level less conservative). Results: Condition-suggestion coverage was highly variable, with some apps not offering a suggestion for many users: in alphabetical order, Ada: 99.0%; Babylon: 51.5%; Buoy: 88.5%; K Health: 74.5%; Mediktor: 80.5%; Symptomate: 61.5%; Your.MD: 64.5%; WebMD: 93.0%. Top-3 suggestion accuracy was GPs (average): 82.1%±5.2%; Ada: 70.5%; Babylon: 32.0%; Buoy: 43.0%; K Health: 36.0%; Mediktor: 36.0%; Symptomate: 27.5%; WebMD: 35.5%; Your.MD: 23.5%. Some apps excluded certain user demographics or conditions and their performance was generally greater with the exclusion of corresponding vignettes. For safe urgency advice, tested GPs had an average of 97.0%±2.5%. For the vignettes with advice provided, only three apps had safety performance within 1 SD of the GPs (Ada: 97.0%; Babylon: 95.1%; Symptomate: 97.8%). One app had a safety performance within 2 SDs of GPs (Your.MD: 92.6%). Three apps had a safety performance outside 2 SDs of GPs (Buoy: 80.0% (p<0.001); K Health: 81.3% (p<0.001); Mediktor: 87.3% (p=1.3×10⁻³)). Conclusions: The utility of digital symptom assessment apps relies on coverage, accuracy and safety. While no digital tool outperformed GPs, some came close, and the nature of iterative improvements to software offers scalable improvements to care. https://bmjopen.bmj.com/content/10/12/e040269.full |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Stephen Gilbert Alicia Mehl Adel Baluch Caoimhe Cawley Jean Challiner Hamish Fraser Elizabeth Millen Maryam Montazeri Jan Multmeier Fiona Pick Claudia Richter Ewelina Türk Shubhanan Upadhyay Vishaal Virani Nicola Vona Paul Wicks Claire Novorol |
author_facet |
Stephen Gilbert Alicia Mehl Adel Baluch Caoimhe Cawley Jean Challiner Hamish Fraser Elizabeth Millen Maryam Montazeri Jan Multmeier Fiona Pick Claudia Richter Ewelina Türk Shubhanan Upadhyay Vishaal Virani Nicola Vona Paul Wicks Claire Novorol |
author_sort |
Stephen Gilbert |
title |
How accurate are digital symptom assessment apps for suggesting conditions and urgency advice? A clinical vignettes comparison to GPs |
publisher |
BMJ Publishing Group |
series |
BMJ Open |
issn |
2044-6055 |
publishDate |
2020-12-01 |
description |
Objectives: To compare breadth of condition coverage, accuracy of suggested conditions and appropriateness of urgency advice of eight popular symptom assessment apps. Design: Vignettes study. Setting: 200 primary care vignettes. Intervention/comparator: For eight apps and seven general practitioners (GPs): breadth of coverage and condition-suggestion and urgency advice accuracy measured against the vignettes’ gold-standard. Primary outcome measures: (1) Proportion of conditions ‘covered’ by an app, that is, not excluded because the user was too young/old or pregnant, or not modelled; (2) proportion of vignettes with the correct primary diagnosis among the top 3 conditions suggested; (3) proportion of ‘safe’ urgency advice (ie, at gold standard level, more conservative, or no more than one level less conservative). Results: Condition-suggestion coverage was highly variable, with some apps not offering a suggestion for many users: in alphabetical order, Ada: 99.0%; Babylon: 51.5%; Buoy: 88.5%; K Health: 74.5%; Mediktor: 80.5%; Symptomate: 61.5%; Your.MD: 64.5%; WebMD: 93.0%. Top-3 suggestion accuracy was GPs (average): 82.1%±5.2%; Ada: 70.5%; Babylon: 32.0%; Buoy: 43.0%; K Health: 36.0%; Mediktor: 36.0%; Symptomate: 27.5%; WebMD: 35.5%; Your.MD: 23.5%. Some apps excluded certain user demographics or conditions and their performance was generally greater with the exclusion of corresponding vignettes. For safe urgency advice, tested GPs had an average of 97.0%±2.5%. For the vignettes with advice provided, only three apps had safety performance within 1 SD of the GPs (Ada: 97.0%; Babylon: 95.1%; Symptomate: 97.8%). One app had a safety performance within 2 SDs of GPs (Your.MD: 92.6%). Three apps had a safety performance outside 2 SDs of GPs (Buoy: 80.0% (p<0.001); K Health: 81.3% (p<0.001); Mediktor: 87.3% (p=1.3×10⁻³)). Conclusions: The utility of digital symptom assessment apps relies on coverage, accuracy and safety. While no digital tool outperformed GPs, some came close, and the nature of iterative improvements to software offers scalable improvements to care. |
url |
https://bmjopen.bmj.com/content/10/12/e040269.full |
_version_ |
1717779593417457664 |
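
The abstract's "safe" urgency criterion and its 1 SD / 2 SD grouping of apps can be illustrated with a minimal Python sketch. The percentages are the ones reported in the abstract; the integer encoding of urgency levels (larger meaning more conservative) and the function names `is_safe` and `sd_band` are assumptions made for illustration, not the authors' code or scale.

```python
# Safe-urgency percentages as reported in the abstract (vignettes with advice provided).
GP_MEAN = 97.0  # average across the tested GPs
GP_SD = 2.5     # standard deviation across the tested GPs

APP_SAFETY = {
    "Ada": 97.0,
    "Babylon": 95.1,
    "Buoy": 80.0,
    "K Health": 81.3,
    "Mediktor": 87.3,
    "Symptomate": 97.8,
    "Your.MD": 92.6,
}


def is_safe(advised_level: int, gold_level: int) -> bool:
    """Return True if advice is 'safe' in the abstract's sense.

    Assumption: urgency levels are integers where a larger value is more
    conservative. Advice is safe when it is at the gold-standard level,
    more conservative, or at most one level less conservative.
    """
    return advised_level >= gold_level - 1


def sd_band(value: float, mean: float = GP_MEAN, sd: float = GP_SD) -> str:
    """Classify a percentage by its distance from the GP mean in SD units."""
    distance = abs(value - mean) / sd
    if distance <= 1:
        return "within 1 SD"
    if distance <= 2:
        return "within 2 SDs"
    return "outside 2 SDs"


if __name__ == "__main__":
    for app, pct in sorted(APP_SAFETY.items()):
        print(f"{app}: {pct:.1f}% ({sd_band(pct)})")
```

Under these assumptions the bands reproduce the grouping stated in the abstract: Ada, Babylon and Symptomate fall within 1 SD of the GP average, Your.MD within 2 SDs, and Buoy, K Health and Mediktor outside 2 SDs. The sketch does not reproduce the reported p values, which depend on a statistical test the abstract does not specify.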