Nicolas Chopin has just arxived our manuscript on inference for unnormalised statistical models. An unnormalised statistical model whose likelihood function can be written $p(y|\theta) = \frac{f(y;\theta)}{z(\theta)}$

where $f(y;\theta)$ is easy to compute but the normalisation constant $z(\theta)$ is hard. A lot of common models fall into that category, for example Ising models or restricted Boltzmann machines. Not having the normalisation constant makes inference much more difficult.

We show that there is a principled way of treating the missing normalisation constant as just another parameter: effectively, you pretend that your data came from a Poisson process. The normalisation constant becomes a parameter in an augmented likelihood function. We call this mapping the Poisson transform, because it generalises a much older idea called the Multinomial-Poisson transform.

The Poisson transform can be approximated in practice by logistic regression, and we show that this actually corresponds to Guttman & Hyvärinen’s noise-contrastive divergence. Once you have seen the connection, generalising noise-contrastive divergence to non-IID models becomes easy, and we can do inference on unnormalised Markov chains, for example.

One nice thing about the result is that you can use it to turn a relatively exotic spatial Markov chain model into just another logistic regression. See the manuscript for details. 