3 years ago

Statistical Criticality arises in Maximally Informative Representations.

Ryan John Cubero, Junghyo Jo, Matteo Marsili, Yasser Roudi, Juyong Song

We show that statistical criticality, i.e. the occurrence of power-law frequency distributions, arises in samples that are most informative about the underlying generative process. To reach this conclusion, we first identify the frequency with which different outcomes occur in a sample as the variable carrying useful information on the generative process. This differs from the entropy of the data, which we take as a measure of resolution. The entropy of the frequency, which we call relevance, provides an upper bound on the number of informative bits. Samples that maximise relevance at a given resolution, which we call most informative samples, exhibit statistical criticality. We show how this naturally arises from the concentration property of the Asymptotic Equipartition Property. Within a thermodynamic analogy, we find that most informative representations of high-dimensional data arise from a principle of minimal entropy, at odds with equilibrium statistical mechanics, where the entropy is maximised. This is why, contrary to statistical mechanics, statistical criticality requires no parameter fine-tuning in most informative samples. In addition, Zipf's law arises at the optimal trade-off between resolution (i.e. compression) and relevance. As a byproduct, we derive an estimate of the maximal number of parameters that can be estimated from a dataset in the absence of prior knowledge of the generative model. We finally show how our findings can be derived from an unsupervised version of the Information Bottleneck method.
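
The abstract defines resolution and relevance only in words. As a minimal sketch, assuming both are estimated from empirical frequencies, the two quantities could be computed as below. The function name `resolution_and_relevance` and the plug-in estimators are illustrative assumptions, not taken from the abstract: resolution is taken as the entropy of the observed outcomes, and relevance as the entropy of the frequency with which the outcome of a randomly drawn sample point occurs.

```python
from collections import Counter
from math import log2


def resolution_and_relevance(sample):
    """Return (resolution, relevance) in bits for a list of hashable outcomes.

    Assumed plug-in estimators (hypothetical, for illustration only):
      resolution = -sum_s (k_s / N) * log2(k_s / N)
      relevance  = -sum_k (k * m_k / N) * log2(k * m_k / N)
    where k_s is the number of times outcome s occurs, m_k is the number of
    distinct outcomes that occur exactly k times, and N is the sample size.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must not be empty")

    # k_s: how many times each outcome was observed
    counts = Counter(sample)
    # m_k: how many distinct outcomes were observed exactly k times
    multiplicity = Counter(counts.values())

    resolution = -sum((k / n) * log2(k / n) for k in counts.values())
    relevance = -sum(
        (k * m_k / n) * log2(k * m_k / n) for k, m_k in multiplicity.items()
    )
    return resolution, relevance


if __name__ == "__main__":
    h_s, h_k = resolution_and_relevance(list("aaaabbcd"))
    print(h_s, h_k)  # prints: 1.75 1.5
```

Under these assumptions, a sample in which every outcome is distinct has maximal resolution but zero relevance, since every outcome occurs with the same frequency and the frequencies therefore distinguish nothing.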

Publisher URL: http://arxiv.org/abs/1808.00249

DOI: arXiv:1808.00249v3
