Embo: a Python package for empirical data analysis using the Information Bottleneck

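We present 'embo', a Python package to analyze empirical data using the Information Bottleneck (IB) method and its variants, such as the Deterministic Information Bottleneck (DIB). Given two random variables 'X' and 'Y', the IB finds the stochastic mapping 'M' of 'X' that encodes the most information about 'Y', subject to a constraint on the information that 'M' is allowed to retain about 'X'. Despite the popularity of the IB, an accessible implementation of the reference algorithm oriented towards ease of use on empirical data was missing. Embo is optimized for the common case of discrete, low-dimensional data. Embo is fast, provides a standard data-processing pipeline, offers a parallel implementation of key computational steps, and includes reasonable defaults for the method parameters. Embo is broadly applicable to different problem domains, as it can be employed with any dataset consisting of joint observations of two discrete variables. It is available from the Python Package Index (PyPI), Zenodo and GitLab.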
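
To make the IB trade-off described above concrete, the following is a minimal, standard-library-only sketch. It is not embo's implementation or API: the function names (`joint_pmf`, `mutual_information`, `ib_fixed_beta`), the toy data, and the choice of `beta` are assumptions made for illustration. It estimates the joint distribution of two discrete variables from paired observations, then runs the textbook self-consistent IB updates for a single trade-off parameter `beta` and reports how much information the compressed variable 'M' keeps about 'X' and about 'Y'.

```python
import math
from collections import Counter

def joint_pmf(xs, ys):
    """Empirical joint distribution p(x, y) from paired discrete observations."""
    counts = Counter(zip(xs, ys))
    n = len(xs)
    x_vals, y_vals = sorted(set(xs)), sorted(set(ys))
    return [[counts[(x, y)] / n for y in y_vals] for x in x_vals]

def mutual_information(pab):
    """Mutual information I(A;B) in bits for a joint pmf given as pab[a][b]."""
    pa = [sum(row) for row in pab]
    pb = [sum(col) for col in zip(*pab)]
    return sum(p * math.log2(p / (pa[i] * pb[j]))
               for i, row in enumerate(pab)
               for j, p in enumerate(row) if p > 0)

def ib_fixed_beta(pxy, n_m, beta, n_iter=200, eps=1e-12):
    """Iterate the IB self-consistent equations for one beta.

    Returns (I(M;X), I(M;Y)) in bits for the encoder q(m|x) reached
    after n_iter updates. Illustrative only, with a fixed initial encoder.
    """
    nx, ny = len(pxy), len(pxy[0])
    px = [sum(row) for row in pxy]
    py_x = [[p / px[i] for p in pxy[i]] for i in range(nx)]
    # Deterministic, slightly asymmetric initial encoder q(m|x).
    q = [[2.0 if i % n_m == m else 1.0 for m in range(n_m)] for i in range(nx)]
    q = [[v / sum(row) for v in row] for row in q]
    for _ in range(n_iter):
        # Marginal q(m) and decoder q(y|m) implied by the current encoder.
        qm = [sum(px[i] * q[i][m] for i in range(nx)) for m in range(n_m)]
        qy_m = [[sum(pxy[i][j] * q[i][m] for i in range(nx)) / max(qm[m], eps)
                 for j in range(ny)] for m in range(n_m)]
        new_q = []
        for i in range(nx):
            # KL divergence (in nats) between p(y|x) and each q(y|m).
            kl = [sum(p * math.log(p / max(qy_m[m][j], eps))
                      for j, p in enumerate(py_x[i]) if p > 0)
                  for m in range(n_m)]
            kmin = min(kl)  # subtracted for numerical stability
            w = [qm[m] * math.exp(-beta * (kl[m] - kmin)) for m in range(n_m)]
            z = sum(w)
            new_q.append([v / z for v in w] if z > 0 else q[i])
        q = new_q
    # Joint distributions of (M, X) and (M, Y) under the Markov chain M - X - Y.
    pmx = [[q[i][m] * px[i] for i in range(nx)] for m in range(n_m)]
    pmy = [[sum(q[i][m] * pxy[i][j] for i in range(nx)) for j in range(ny)]
           for m in range(n_m)]
    return mutual_information(pmx), mutual_information(pmy)

# Toy paired observations of two discrete variables (made-up data).
xs = [0, 0, 1, 1, 2, 2, 3, 3]
ys = [0, 0, 0, 1, 1, 0, 1, 1]
pxy = joint_pmf(xs, ys)
i_xy = mutual_information(pxy)
i_mx, i_my = ib_fixed_beta(pxy, n_m=2, beta=5.0)
print(f"I(X;Y)={i_xy:.3f}  I(M;X)={i_mx:.3f}  I(M;Y)={i_my:.3f}")
```

Sweeping `beta` over a range of values and plotting I(M;Y) against I(M;X) would trace an approximation of the IB curve. Because 'M' depends on 'Y' only through 'X', I(M;Y) can never exceed I(X;Y), whatever encoder the iteration reaches.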

Bibliographic Details
Main Authors: Eugenio Piasini, Alexandre L. S. Filipowicz, Jonathan Levine, Joshua I. Gold
Format: Article
Language: English
Published: Ubiquity Press 2021-05-01
Series: Journal of Open Research Software
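Subjects: information theory, python, information bottleneck, deterministic information bottleneck, data analysis, statistics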
Online Access: https://openresearchsoftware.metajnl.com/articles/322
id doaj-463a8bb9b2eb4f90a3f04b5fed9cf467
record_format Article
spelling doaj-463a8bb9b2eb4f90a3f04b5fed9cf467
2021-06-10T08:06:23Z
eng
Ubiquity Press
Journal of Open Research Software
2049-9647
2021-05-01
9
1
10.5334/jors.322
240
Eugenio Piasini (Computational Neuroscience Initiative and Department of Physics and Astronomy, University of Pennsylvania)
Alexandre L. S. Filipowicz (Toyota Research Institute)
Jonathan Levine (Department of Neuroscience, University of Pennsylvania)
Joshua I. Gold (Department of Neuroscience, University of Pennsylvania)
collection DOAJ
language English
format Article
sources DOAJ
author Eugenio Piasini
Alexandre L. S. Filipowicz
Jonathan Levine
Joshua I. Gold
author_sort Eugenio Piasini
title Embo: a Python package for empirical data analysis using the Information Bottleneck
publisher Ubiquity Press
series Journal of Open Research Software
issn 2049-9647
publishDate 2021-05-01
description We present 'embo', a Python package to analyze empirical data using the Information Bottleneck (IB) method and its variants, such as the Deterministic Information Bottleneck (DIB). Given two random variables 'X' and 'Y', the IB finds the stochastic mapping 'M' of 'X' that encodes the most information about 'Y', subject to a constraint on the information that 'M' is allowed to retain about 'X'. Despite the popularity of the IB, an accessible implementation of the reference algorithm oriented towards ease of use on empirical data was missing. Embo is optimized for the common case of discrete, low-dimensional data. Embo is fast, provides a standard data-processing pipeline, offers a parallel implementation of key computational steps, and includes reasonable defaults for the method parameters. Embo is broadly applicable to different problem domains, as it can be employed with any dataset consisting of joint observations of two discrete variables. It is available from the Python Package Index (PyPI), Zenodo and GitLab.
topic information theory
python
information bottleneck
deterministic information bottleneck
data analysis
statistics
url https://openresearchsoftware.metajnl.com/articles/322