Embo: a Python package for empirical data analysis using the Information Bottleneck
We present 'embo', a Python package to analyze empirical data using the Information Bottleneck (IB) method and its variants, such as the Deterministic Information Bottleneck (DIB). Given two random variables 'X' and 'Y', the IB finds the stochastic mapping 'M'...
Main Authors: | Eugenio Piasini, Alexandre L. S. Filipowicz, Jonathan Levine, Joshua I. Gold |
---|---|
Format: | Article |
Language: | English |
Published: | Ubiquity Press, 2021-05-01 |
Series: | Journal of Open Research Software |
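Subjects: | information theory, python, information bottleneck, deterministic information bottleneck, data analysis, statistics |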
Online Access: | https://openresearchsoftware.metajnl.com/articles/322 |
id |
doaj-463a8bb9b2eb4f90a3f04b5fed9cf467 |
---|---|
record_format |
Article |
spelling |
doaj-463a8bb9b2eb4f90a3f04b5fed9cf467; 2021-06-10T08:06:23Z; eng; Ubiquity Press; Journal of Open Research Software; 2049-9647; 2021-05-01; 9; 1; 10.5334/jors.322; 240; Embo: a Python package for empirical data analysis using the Information Bottleneck; Eugenio Piasini (Computational Neuroscience Initiative and Department of Physics and Astronomy, University of Pennsylvania); Alexandre L. S. Filipowicz (Toyota Research Institute); Jonathan Levine (Department of Neuroscience, University of Pennsylvania); Joshua I. Gold (Department of Neuroscience, University of Pennsylvania); We present 'embo', a Python package to analyze empirical data using the Information Bottleneck (IB) method and its variants, such as the Deterministic Information Bottleneck (DIB). Given two random variables 'X' and 'Y', the IB finds the stochastic mapping 'M' of 'X' that encodes the most information about 'Y', subject to a constraint on the information that 'M' is allowed to retain about 'X'. Despite the popularity of the IB, an accessible implementation of the reference algorithm oriented towards ease of use on empirical data was missing. Embo is optimized for the common case of discrete, low-dimensional data. Embo is fast, provides a standard data-processing pipeline, offers a parallel implementation of key computational steps, and includes reasonable defaults for the method parameters. Embo is broadly applicable to different problem domains, as it can be employed with any dataset consisting in joint observations of two discrete variables. It is available from the Python Package Index (PyPI), Zenodo and GitLab.; https://openresearchsoftware.metajnl.com/articles/322; information theory; python; information bottleneck; deterministic information bottleneck; data analysis; statistics |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Eugenio Piasini Alexandre L. S. Filipowicz Jonathan Levine Joshua I. Gold |
title |
Embo: a Python package for empirical data analysis using the Information Bottleneck |
publisher |
Ubiquity Press |
series |
Journal of Open Research Software |
issn |
2049-9647 |
publishDate |
2021-05-01 |
description |
We present 'embo', a Python package to analyze empirical data using the Information Bottleneck (IB) method and its variants, such as the Deterministic Information Bottleneck (DIB). Given two random variables 'X' and 'Y', the IB finds the stochastic mapping 'M' of 'X' that encodes the most information about 'Y', subject to a constraint on the information that 'M' is allowed to retain about 'X'. Despite the popularity of the IB, an accessible implementation of the reference algorithm oriented towards ease of use on empirical data was missing. Embo is optimized for the common case of discrete, low-dimensional data. Embo is fast, provides a standard data-processing pipeline, offers a parallel implementation of key computational steps, and includes reasonable defaults for the method parameters. Embo is broadly applicable to different problem domains, as it can be employed with any dataset consisting in joint observations of two discrete variables. It is available from the Python Package Index (PyPI), Zenodo and GitLab. |
topic |
information theory python information bottleneck deterministic information bottleneck data analysis statistics |
url |
https://openresearchsoftware.metajnl.com/articles/322 |
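
The description above states the IB problem only in words: find a stochastic mapping 'M' of 'X' that keeps as much information about 'Y' as possible while limiting the information retained about 'X'. The sketch below illustrates that trade-off with the classic self-consistent IB iteration on a small made-up dataset. It is not embo's code and does not use embo's API; the function names (`empirical_joint`, `mutual_information`, `kl_nats`, `information_bottleneck`), the trade-off parameter `beta`, the number of bottleneck states `n_m`, and the toy observations are all assumptions introduced here for illustration. It also assumes every value of 'X' appears at least once in the data.

```python
import math
import random
from collections import Counter


def empirical_joint(xs, ys):
    """Estimate p(x, y) as a nested list from paired discrete observations."""
    x_vals, y_vals = sorted(set(xs)), sorted(set(ys))
    counts = Counter(zip(xs, ys))
    n = len(xs)
    return [[counts[(x, y)] / n for y in y_vals] for x in x_vals]


def mutual_information(joint):
    """Mutual information, in bits, of a joint distribution (nested list)."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    return sum(p * math.log2(p / (pa[i] * pb[j]))
               for i, row in enumerate(joint)
               for j, p in enumerate(row) if p > 0)


def kl_nats(p, q):
    """KL divergence D(p || q) in nats; infinite if q misses support of p."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi <= 0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total


def information_bottleneck(pxy, beta, n_m, n_iter=200, seed=0):
    """Iterate the IB self-consistent equations for a fixed beta.

    Returns the encoder p(m|x) and the pair I(M:X), I(M:Y) in bits.
    """
    rng = random.Random(seed)
    nx, ny = len(pxy), len(pxy[0])
    px = [sum(row) for row in pxy]
    py_x = [[p / px[i] for p in pxy[i]] for i in range(nx)]
    # Random, strictly positive initial encoder p(m|x); rows sum to one.
    pm_x = []
    for _ in range(nx):
        w = [rng.random() + 1e-3 for _ in range(n_m)]
        pm_x.append([v / sum(w) for v in w])
    for _ in range(n_iter):
        # Marginal p(m) and decoder p(y|m) implied by the current encoder.
        pm = [sum(px[i] * pm_x[i][m] for i in range(nx)) for m in range(n_m)]
        py_m = [[sum(pxy[i][j] * pm_x[i][m] for i in range(nx)) / pm[m]
                 if pm[m] > 0 else 0.0 for j in range(ny)]
                for m in range(n_m)]
        # Encoder update: p(m|x) proportional to p(m) exp(-beta * KL).
        for i in range(nx):
            logw = []
            for m in range(n_m):
                d = kl_nats(py_x[i], py_m[m]) if pm[m] > 0 else math.inf
                logw.append(-math.inf if math.isinf(d)
                            else math.log(pm[m]) - beta * d)
            top = max(logw)
            w = [math.exp(v - top) for v in logw]
            pm_x[i] = [v / sum(w) for v in w]
    joint_xm = [[px[i] * pm_x[i][m] for m in range(n_m)] for i in range(nx)]
    joint_my = [[sum(pxy[i][j] * pm_x[i][m] for i in range(nx))
                 for j in range(ny)] for m in range(n_m)]
    return pm_x, mutual_information(joint_xm), mutual_information(joint_my)


# Toy paired observations (made up): x in {0, 1} mostly gives y = 0,
# x in {2, 3} mostly gives y = 1.
xs = [0] * 4 + [1] * 4 + [2] * 4 + [3] * 4
ys = [0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0]
pxy = empirical_joint(xs, ys)

for beta in (0.0, 2.0, 5.0, 20.0):
    _, i_mx, i_my = information_bottleneck(pxy, beta, n_m=2)
    print(f"beta={beta:5.1f}  I(M:X)={i_mx:.4f}  I(M:Y)={i_my:.4f}")
```

Sweeping `beta` traces out points on the trade-off between I(M:X) and I(M:Y). Whatever the value of `beta`, I(M:Y) cannot exceed either I(M:X) or I(X:Y), because 'M' is computed from 'X' alone. A single run from one random start can settle in a local optimum, so a practical tool would typically restart from several initializations.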