Automated vocabulary discovery for geo-parsing online epidemic intelligence
Abstract. Background: Automated surveillance of the Internet provides a timely and sensitive method for alerting on global emerging infectious disease threats. HealthMap is part of a new generation of online systems designed to monitor and visualize, on a...
Main Authors: | Freifeld Clark C, Keller Mikaela, Brownstein John S |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2009-11-01 |
Series: | BMC Bioinformatics |
Online Access: | http://www.biomedcentral.com/1471-2105/10/385 |
id |
doaj-5161bd328718441ebdfa5a5962acc4e2 |
---|---|
record_format |
Article |
spelling |
doaj-5161bd328718441ebdfa5a5962acc4e2 2020-11-24T21:55:35Z eng BMC BMC Bioinformatics 1471-2105 2009-11-01 10 1 385 10.1186/1471-2105-10-385 Automated vocabulary discovery for geo-parsing online epidemic intelligence Freifeld Clark C; Keller Mikaela; Brownstein John S http://www.biomedcentral.com/1471-2105/10/385 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Freifeld Clark C; Keller Mikaela; Brownstein John S |
spellingShingle |
Freifeld Clark C Keller Mikaela Brownstein John S Automated vocabulary discovery for geo-parsing online epidemic intelligence BMC Bioinformatics |
author_facet |
Freifeld Clark C; Keller Mikaela; Brownstein John S |
author_sort |
Freifeld Clark C |
title |
Automated vocabulary discovery for geo-parsing online epidemic intelligence |
title_short |
Automated vocabulary discovery for geo-parsing online epidemic intelligence |
title_full |
Automated vocabulary discovery for geo-parsing online epidemic intelligence |
title_fullStr |
Automated vocabulary discovery for geo-parsing online epidemic intelligence |
title_full_unstemmed |
Automated vocabulary discovery for geo-parsing online epidemic intelligence |
title_sort |
automated vocabulary discovery for geo-parsing online epidemic intelligence |
publisher |
BMC |
series |
BMC Bioinformatics |
issn |
1471-2105 |
publishDate |
2009-11-01 |
description |
Abstract. Background: Automated surveillance of the Internet provides a timely and sensitive method for alerting on global emerging infectious disease threats. HealthMap is part of a new generation of online systems designed to monitor and visualize, on a real-time basis, disease outbreak alerts as reported by online news media and public health sources. HealthMap is of specific interest to national and international public health organizations and international travelers. A particular task that makes such surveillance useful is the automated discovery of the geographic references contained in the retrieved outbreak alerts. This task is sometimes referred to as "geo-parsing". A typical approach to geo-parsing would demand an expensive training corpus of alerts manually tagged by a human. Results: Given that human readers perform this kind of task by using both their lexical and contextual knowledge, we developed an approach which relies on a relatively small expert-built gazetteer, thus limiting the need for human input, but focuses on learning the context in which geographic references appear. We show in a set of experiments that this approach exhibits a substantial capacity to discover geographic locations outside of its initial lexicon. Conclusion: The results of this analysis provide a framework for future automated global surveillance efforts that reduce manual input and improve timeliness of reporting. |
url |
http://www.biomedcentral.com/1471-2105/10/385 |
work_keys_str_mv |
AT freifeldclarkc automatedvocabularydiscoveryforgeoparsingonlineepidemicintelligence AT kellermikaela automatedvocabularydiscoveryforgeoparsingonlineepidemicintelligence AT brownsteinjohns automatedvocabularydiscoveryforgeoparsingonlineepidemicintelligence |
_version_ |
1725861652858929152 |