Lymelight: forecasting Lyme disease risk using web search data

Abstract: Lyme disease is the most common tick-borne disease in the Northern Hemisphere. Existing estimates of Lyme disease spread are delayed by a year or more. We introduce Lymelight, a new method for monitoring the incidence of Lyme disease in real time. We use a machine-learned classifier of web sear...


Bibliographic Details
Main Authors: Adam Sadilek, Yulin Hswen, Shailesh Bavadekar, Tomer Shekel, John S. Brownstein, Evgeniy Gabrilovich
Format: Article
Language: English
Published: Nature Publishing Group 2020-02-01
Series: npj Digital Medicine
Online Access: https://doi.org/10.1038/s41746-020-0222-x
id doaj-7607b3dae73f4c24804ed0fc3026e323
record_format Article
spelling doaj-7607b3dae73f4c24804ed0fc3026e323; 2021-02-23T09:42:33Z; eng; Nature Publishing Group; npj Digital Medicine; 2398-6352; 2020-02-01; 31112; 10.1038/s41746-020-0222-x; Lymelight: forecasting Lyme disease risk using web search data; Adam Sadilek 0; Yulin Hswen 1; Shailesh Bavadekar 2; Tomer Shekel 3; John S. Brownstein 4; Evgeniy Gabrilovich 5; Google; Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health; Google; Google; Computational Epidemiology Lab, Boston Children’s Hospital; Google; https://doi.org/10.1038/s41746-020-0222-x
collection DOAJ
language English
format Article
sources DOAJ
author Adam Sadilek
Yulin Hswen
Shailesh Bavadekar
Tomer Shekel
John S. Brownstein
Evgeniy Gabrilovich
author_sort Adam Sadilek
title Lymelight: forecasting Lyme disease risk using web search data
publisher Nature Publishing Group
series npj Digital Medicine
issn 2398-6352
publishDate 2020-02-01
description Abstract: Lyme disease is the most common tick-borne disease in the Northern Hemisphere. Existing estimates of Lyme disease spread are delayed by a year or more. We introduce Lymelight, a new method for monitoring the incidence of Lyme disease in real time. We use a machine-learned classifier of web search sessions to estimate the number of individuals who search for possible Lyme disease symptoms in a given geographical area for two years, 2014 and 2015. We evaluate Lymelight using the official case count data from the CDC and find a 92% correlation (p < 0.001) at the county level. Importantly, using web search data allows us not only to assess the incidence of the disease, but also to examine the appropriateness of treatments subsequently searched for by the users. Public health implications of our work include monitoring the spread of vector-borne diseases in a timely and scalable manner, complementing existing approaches through real-time detection, which can enable more timely interventions. Our analysis of treatment searches may also help reduce misdiagnosis of the disease.
url https://doi.org/10.1038/s41746-020-0222-x
_version_ 1714850553426608128