Large pseudocounts and L₂-norm penalties are necessary for the mean-field inference of Ising and Potts models

The mean-field (MF) approximation offers a simple, fast way to infer direct interactions between elements in a network of correlated variables, a common, computationally challenging problem with practical applications in fields ranging from physics and biology to the social sciences. However, MF met...

Bibliographic Details
Main Authors:	Cocco, S. (Author), De Leonardis, E. (Author), Monasson, R. (Author), Barton, John P. (Contributor)
Other Authors:	Massachusetts Institute of Technology. Department of Chemical Engineering (Contributor), Ragon Institute of MGH, MIT and Harvard (Contributor)
Format:	Article
Language:	English
Published:	American Physical Society, 2015-06-17T14:53:26Z.
Subjects:	Article
Online Access:	Get fulltext


LEADER	02370 am a22002293u 4500
001	97450
042			\|a dc
100	1	0	\|a Cocco, S. \|e author
100	1	0	\|a Massachusetts Institute of Technology. Department of Chemical Engineering \|e contributor
100	1	0	\|a Ragon Institute of MGH, MIT and Harvard \|e contributor
100	1	0	\|a Barton, John P. \|e contributor
700	1	0	\|a De Leonardis, E. \|e author
700	1	0	\|a Monasson, R. \|e author
700	1	0	\|a Barton, John P. \|e author
245	0	0	\|a Large pseudocounts and L[subscript 2]-norm penalties are necessary for the mean-field inference of Ising and Potts models
260			\|b American Physical Society, \|c 2015-06-17T14:53:26Z.
856			\|z Get fulltext \|u http://hdl.handle.net/1721.1/97450
520			\|a The mean-field (MF) approximation offers a simple, fast way to infer direct interactions between elements in a network of correlated variables, a common, computationally challenging problem with practical applications in fields ranging from physics and biology to the social sciences. However, MF methods achieve their best performance with strong regularization, well beyond Bayesian expectations, an empirical fact that is poorly understood. In this work, we study the influence of pseudocount and L[subscript 2]-norm regularization schemes on the quality of inferred Ising or Potts interaction networks from correlation data within the MF approximation. We argue, based on the analysis of small systems, that the optimal value of the regularization strength remains finite even if the sampling noise tends to zero, in order to correct for systematic biases introduced by the MF approximation. Our claim is corroborated by extensive numerical studies of diverse model systems and by the analytical study of the m-component spin model for large but finite m. Additionally, we find that pseudocount regularization is robust against sampling noise and often outperforms L[subscript 2]-norm regularization, particularly when the underlying network of interactions is strongly heterogeneous. Much better performances are generally obtained for the Ising model than for the Potts model, for which only couplings incoming onto medium-frequency symbols are reliably inferred.
520			\|a France. Agence nationale de la recherche (Coevstat Project Grant ANR-13-BS04-0012-01)
546			\|a en_US
655	7		\|a Article
773			\|t Physical Review E
