Estimation Bias in Maximum Entropy Models

Maximum entropy models have become popular statistical models in neuroscience and other areas in biology and can be useful tools for obtaining estimates of mutual information in biological systems. However, maximum entropy models fit to small data sets can be subject to sampling bias; i.e., the true...


Bibliographic Details
Main Authors: Jakob H. Macke, Iain Murray, Peter E. Latham
Format: Article
Language: English
Published: MDPI AG 2013-08-01
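Series: Entropy
Subjects: maximum entropy; sampling bias; asymptotic bias; model-misspecification; neurophysiology; neural population coding; Ising model; Dichotomized Gaussian
Online Access: http://www.mdpi.com/1099-4300/15/8/3109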
id doaj-6edb23c86838413d97d5db1ef1ef10db
record_format Article
spelling doaj-6edb23c86838413d97d5db1ef1ef10db
2020-11-24T22:57:31Z
eng
MDPI AG
Entropy
1099-4300
2013-08-01
15
8
3109
3129
10.3390/e15083109
Estimation Bias in Maximum Entropy Models
Jakob H. Macke
Iain Murray
Peter E. Latham
http://www.mdpi.com/1099-4300/15/8/3109
collection DOAJ
language English
format Article
sources DOAJ
author Jakob H. Macke
Iain Murray
Peter E. Latham
title Estimation Bias in Maximum Entropy Models
publisher MDPI AG
series Entropy
issn 1099-4300
publishDate 2013-08-01
description Maximum entropy models have become popular statistical models in neuroscience and other areas in biology and can be useful tools for obtaining estimates of mutual information in biological systems. However, maximum entropy models fit to small data sets can be subject to sampling bias; i.e., the true entropy of the data can be severely underestimated. Here, we study the sampling properties of estimates of the entropy obtained from maximum entropy models. We focus on pairwise binary models, which are used extensively to model neural population activity. We show that if the data is well described by a pairwise model, the bias is equal to the number of parameters divided by twice the number of observations. If, however, the higher-order correlations in the data deviate from those predicted by the model, the bias can be larger. Using a phenomenological model of neural population recordings, we find that this additional bias is highest for small firing probabilities, strong correlations and large population sizes; for the parameters we tested, it was a factor of about four higher. We derive guidelines for how long a neurophysiological experiment needs to be in order to ensure that the bias is less than a specified criterion. Finally, we show how a modified plug-in estimate of the entropy can be used for bias correction.
topic maximum entropy
sampling bias
asymptotic bias
model-misspecification
neurophysiology
neural population coding
Ising model
Dichotomized Gaussian
url http://www.mdpi.com/1099-4300/15/8/3109
_version_ 1725650507004903424
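
The description states that, when the data is well described by a pairwise model, the bias equals the number of parameters divided by twice the number of observations. A minimal Python sketch of that arithmetic follows. It is an illustration, not the authors' code. The function names are hypothetical, and two assumptions go beyond the record: the parameter count for a pairwise binary model over n units is taken as n(n+1)/2 (n single-unit terms plus n(n-1)/2 pairwise terms), and the result is read as natural-log units (nats).

```python
import math


def pairwise_param_count(n_units: int) -> int:
    """Parameter count for a pairwise binary model (assumed: n + n(n-1)/2)."""
    return n_units * (n_units + 1) // 2


def bias_magnitude(n_params: int, n_obs: int) -> float:
    """Size of the entropy bias: parameters divided by twice the observations.

    Per the description, the entropy is underestimated, so this is the
    amount by which the estimate is expected to fall short.
    """
    return n_params / (2.0 * n_obs)


def min_observations(n_params: int, max_bias: float) -> int:
    """Smallest number of observations that keeps the bias at or below max_bias.

    This only covers the well-specified case; the description notes that the
    bias can be larger when higher-order correlations deviate from the model.
    """
    return math.ceil(n_params / (2.0 * max_bias))


if __name__ == "__main__":
    m = pairwise_param_count(10)
    print("parameters:", m)
    print("bias with 1000 observations:", bias_magnitude(m, 1000))
    print("observations needed for bias <= 0.125:", min_observations(m, 0.125))
```

Inverting the same formula gives a rough lower bound on recording length for a chosen bias criterion, which is the kind of guideline the description mentions; under model misspecification the required number of observations would be higher.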