MacSyFinder: a program to mine genomes for molecular systems with an application to CRISPR-Cas systems.

Biologists often wish to use their knowledge of a few experimental models of a given molecular system to identify homologs in genomic data. We developed a generic tool for this purpose. Macromolecular System Finder (MacSyFinder) provides a flexible framework to model the properties of molecular syste...

Bibliographic Details
Main Authors: Sophie S Abby, Bertrand Néron, Hervé Ménager, Marie Touchon, Eduardo P C Rocha
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2014-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC4201578?pdf=render
id doaj-341dd8e3a2484b6d9b52d410e218afb7
record_format Article
spelling doaj-341dd8e3a2484b6d9b52d410e218afb7 | 2020-11-24T21:50:34Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2014-01-01 | 9 | 10 | e110726 | 10.1371/journal.pone.0110726 | MacSyFinder: a program to mine genomes for molecular systems with an application to CRISPR-Cas systems. | Sophie S Abby | Bertrand Néron | Hervé Ménager | Marie Touchon | Eduardo P C Rocha | Biologists often wish to use their knowledge of a few experimental models of a given molecular system to identify homologs in genomic data. We developed a generic tool for this purpose. Macromolecular System Finder (MacSyFinder) provides a flexible framework to model the properties of molecular systems (cellular machinery or pathway), including their components, evolutionary associations with other systems, and genetic architecture. Modelled features also include functional analogs and the use of the same component by multiple systems. Models are used to search for molecular systems in complete genomes or in unstructured data such as metagenomes. The components of the systems are searched for by sequence similarity using hidden Markov model (HMM) protein profiles. Hits are assigned to a given system based on compliance with the content and organization of the system model. A graphical interface, MacSyView, facilitates the analysis of the results by showing overviews of component content and genomic context. To exemplify the use of MacSyFinder, we built models to detect and classify CRISPR-Cas systems following a previously established classification. We show that MacSyFinder makes it easy to define an accurate "Cas-finder" using publicly available protein profiles. MacSyFinder is a standalone application implemented in Python. It requires Python 2.7, Hmmer and makeblastdb (version 2.2.28 or higher). It is freely available with its source code under a GPLv3 license at https://github.com/gem-pasteur/macsyfinder. It is compatible with all platforms supporting Python and Hmmer/makeblastdb. The "Cas-finder" (models and HMM profiles) is distributed as a compressed tarball archive as Supporting Information. | http://europepmc.org/articles/PMC4201578?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Sophie S Abby
Bertrand Néron
Hervé Ménager
Marie Touchon
Eduardo P C Rocha
author_sort Sophie S Abby
title MacSyFinder: a program to mine genomes for molecular systems with an application to CRISPR-Cas systems.
title_sort macsyfinder: a program to mine genomes for molecular systems with an application to crispr-cas systems.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2014-01-01
description Biologists often wish to use their knowledge of a few experimental models of a given molecular system to identify homologs in genomic data. We developed a generic tool for this purpose. Macromolecular System Finder (MacSyFinder) provides a flexible framework to model the properties of molecular systems (cellular machinery or pathway), including their components, evolutionary associations with other systems, and genetic architecture. Modelled features also include functional analogs and the use of the same component by multiple systems. Models are used to search for molecular systems in complete genomes or in unstructured data such as metagenomes. The components of the systems are searched for by sequence similarity using hidden Markov model (HMM) protein profiles. Hits are assigned to a given system based on compliance with the content and organization of the system model. A graphical interface, MacSyView, facilitates the analysis of the results by showing overviews of component content and genomic context. To exemplify the use of MacSyFinder, we built models to detect and classify CRISPR-Cas systems following a previously established classification. We show that MacSyFinder makes it easy to define an accurate "Cas-finder" using publicly available protein profiles. MacSyFinder is a standalone application implemented in Python. It requires Python 2.7, Hmmer and makeblastdb (version 2.2.28 or higher). It is freely available with its source code under a GPLv3 license at https://github.com/gem-pasteur/macsyfinder. It is compatible with all platforms supporting Python and Hmmer/makeblastdb. The "Cas-finder" (models and HMM profiles) is distributed as a compressed tarball archive as Supporting Information.
url http://europepmc.org/articles/PMC4201578?pdf=render
work_keys_str_mv AT sophiesabby macsyfinderaprogramtominegenomesformolecularsystemswithanapplicationtocrisprcassystems
AT bertrandneron macsyfinderaprogramtominegenomesformolecularsystemswithanapplicationtocrisprcassystems
AT hervemenager macsyfinderaprogramtominegenomesformolecularsystemswithanapplicationtocrisprcassystems
AT marietouchon macsyfinderaprogramtominegenomesformolecularsystemswithanapplicationtocrisprcassystems
AT eduardopcrocha macsyfinderaprogramtominegenomesformolecularsystemswithanapplicationtocrisprcassystems
_version_ 1725883135580700672