FORENSIC: an Online Platform for Fecal Source Identification

FORENSIC is an online platform to identify sources of fecal pollution without the need to create reference libraries. FORENSIC is based on the ability of random forest classification to extract cohesive source microbial signatures to create classifiers despite individual variability and to detect th...

Bibliographic Details
Main Authors: Adélaïde Roguet, Özcan C. Esen, A. Murat Eren, Ryan J. Newton, Sandra L. McLellan
Format: Article
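Language: English
Published: American Society for Microbiology 2020-03-01
Series: mSystems
Subjects: microbial source tracking; 16S rRNA gene; high-throughput sequencing; Bacteroidales; Clostridiales; random forest classification; toolkit
Online Access: https://doi.org/10.1128/mSystems.00869-19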
id doaj-dc7012daa7bb4118b89ea35881ecad1b
record_format Article
spelling doaj-dc7012daa7bb4118b89ea35881ecad1b; 2020-11-25T02:43:32Z; eng; American Society for Microbiology; mSystems; 2379-5077; 2020-03-01; 5; 2; e00869-19; 10.1128/mSystems.00869-19; FORENSIC: an Online Platform for Fecal Source Identification; Adélaïde Roguet; Özcan C. Esen; A. Murat Eren; Ryan J. Newton; Sandra L. McLellan; https://doi.org/10.1128/mSystems.00869-19; microbial source tracking; 16s rrna gene; high-throughput sequencing; bacteroidales; clostridiales; random forest classification; toolkit
collection DOAJ
language English
format Article
sources DOAJ
author Adélaïde Roguet
Özcan C. Esen
A. Murat Eren
Ryan J. Newton
Sandra L. McLellan
author_sort Adélaïde Roguet
title FORENSIC: an Online Platform for Fecal Source Identification
publisher American Society for Microbiology
series mSystems
issn 2379-5077
publishDate 2020-03-01
description FORENSIC is an online platform to identify sources of fecal pollution without the need to create reference libraries. FORENSIC is based on the ability of random forest classification to extract cohesive source microbial signatures to create classifiers despite individual variability and to detect the signatures in environmental samples. We primarily focused on defining sewage signals, which are associated with a high human health risk in polluted waters. To test for fecal contamination sources, the platform only requires paired-end reads targeting the V4 or V6 regions of the 16S rRNA gene. We demonstrated that we could use V4V5 reads trimmed to the V4 positions to generate the reference signature. The systematic workflow we describe to create and validate the signatures could be applied to many disciplines. With the increasing gap between advancing technology and practical applications, this platform makes sequence-based water quality assessments accessible to the public health and water resource communities. Sewage overflows, agricultural runoff, and stormwater discharges introduce fecal pollution into surface waters. Distinguishing these sources is critical for evaluating water quality and formulating remediation strategies. With the falling costs of sequencing, microbial community-based water quality assessment tools are under development. However, their application is limited by the need to build reference libraries, which requires extensive sampling of sources and bioinformatic expertise. Here, we introduce FORest Enteric Source IdentifiCation (FORENSIC; https://forensic.sfs.uwm.edu/), an online, library-independent source tracking platform based on random forest classification and 16S rRNA gene amplicon sequences to identify in environmental samples common fecal contamination sources, including humans, domestic pets, and agricultural animals. FORENSIC relies on a broad reference signature database of Bacteroidales and Clostridiales, two predominant bacterial groups that have coevolved with their hosts. As a result, these groups demonstrate cohesive and reliable assemblage patterns within mammalian species or among species sharing the same diet/physiology. We created a scalable and extensible platform that we tested for global applicability using samples collected in distant geographic locations. This Web application offers a fast and intuitive approach for fecal source identification, particularly in sewage-contaminated waters.
topic microbial source tracking
16S rRNA gene
high-throughput sequencing
Bacteroidales
Clostridiales
random forest classification
toolkit
url https://doi.org/10.1128/mSystems.00869-19
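
The description above says the platform uses random forest classification to learn source signatures and then detect them in environmental samples. The sketch below illustrates that general idea only; it is not the FORENSIC implementation. It is a deliberately small, standard-library random forest (bootstrap resampling, a random feature subset at each node, Gini-based splits, majority voting). All names (train_forest, predict_proportions, ROWS, LABELS), the four hypothetical marker taxa, and every abundance value are made up for illustration, and the vote fractions it reports are not calibrated probabilities.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def best_split(rows, labels, features):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        values = sorted({row[f] for row in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, rng, n_features, depth=0, max_depth=4):
    """Grow one tree; every node considers a fresh random subset of features."""
    if depth == max_depth or len(set(labels)) == 1:
        return majority(labels)
    features = rng.sample(range(len(rows[0])), n_features)
    split = best_split(rows, labels, features)
    if split is None:
        return majority(labels)
    f, t = split
    left = [i for i, row in enumerate(rows) if row[f] <= t]
    right = [i for i, row in enumerate(rows) if row[f] > t]
    return (
        f,
        t,
        build_tree([rows[i] for i in left], [labels[i] for i in left],
                   rng, n_features, depth + 1, max_depth),
        build_tree([rows[i] for i in right], [labels[i] for i in right],
                   rng, n_features, depth + 1, max_depth),
    )


def predict_tree(node, row):
    """Walk from the root to a leaf; leaves are label strings, nodes are tuples."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap resample of the reference set."""
    rng = random.Random(seed)
    n_features = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], rng, n_features))
    return forest


def predict_proportions(forest, row):
    """Fraction of trees voting for each source label."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return {label: count / len(forest) for label, count in votes.items()}


# Made-up relative abundances of four hypothetical marker taxa per sample.
ROWS = [
    [0.60, 0.00, 0.00, 0.30], [0.55, 0.00, 0.00, 0.30], [0.65, 0.00, 0.00, 0.30],
    [0.00, 0.70, 0.00, 0.00], [0.00, 0.60, 0.00, 0.00], [0.00, 0.65, 0.00, 0.00],
    [0.00, 0.00, 0.60, 0.30], [0.00, 0.00, 0.50, 0.30], [0.00, 0.00, 0.55, 0.30],
]
LABELS = ["sewage"] * 3 + ["cow"] * 3 + ["dog"] * 3

if __name__ == "__main__":
    forest = train_forest(ROWS, LABELS)
    print(predict_proportions(forest, [0.60, 0.00, 0.00, 0.30]))
```

Reporting the share of tree votes per source, rather than a single label, mirrors the idea of looking for how strongly a source signature shows up in a sample. A real analysis would work from sequence-derived abundance tables and an established random forest library rather than this toy version.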