Pattern Recognition and Machine Learning

good approximation to the true continuous-time dynamics, it is necessary for the
leapfrog integration scale ε to be smaller than the shortest length-scale over which
the potential is varying significantly. This is governed by the smallest value of σ_i,
which we denote by σ_min. Recall that the goal of the leapfrog integration in hybrid
Monte Carlo is to move a substantial distance through phase space to a new state
that is relatively independent of the initial state and still achieve a high probability of
acceptance. In order to achieve this, the leapfrog integration must be continued for a
number of iterations of order σ_max/σ_min.
By contrast, consider the behaviour of a simple Metropolis algorithm with an
isotropic Gaussian proposal distribution of variance s^2, considered earlier. In order
to avoid high rejection rates, the value of s must be of order σ_min. The exploration of
state space then proceeds by a random walk and takes of order (σ_max/σ_min)^2 steps
to arrive at a roughly independent state.
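
As a rough illustration of this scaling argument, the following sketch (not from the book; the two-dimensional Gaussian target and the particular values of σ_min and σ_max are assumptions chosen purely for illustration) runs a random-walk Metropolis sampler with an isotropic Gaussian proposal and shows that a proposal width of order σ_min is needed to keep the acceptance rate reasonable, at the cost of random-walk exploration of the σ_max direction:

```python
import numpy as np

# Toy anisotropic Gaussian target: independent components with standard
# deviations sigma_min and sigma_max (values chosen purely for illustration).
sigma_min, sigma_max = 0.1, 1.0
sigmas = np.array([sigma_min, sigma_max])

def log_p(z):
    # Log of the unnormalized Gaussian density exp(-E(z)).
    return -0.5 * np.sum((z / sigmas) ** 2)

def metropolis(n_steps, step_size, rng):
    # Random-walk Metropolis with an isotropic Gaussian proposal of width step_size.
    z = np.zeros(2)
    accepted = 0
    for _ in range(n_steps):
        z_new = z + step_size * rng.standard_normal(2)
        if np.log(rng.random()) < log_p(z_new) - log_p(z):
            z, accepted = z_new, accepted + 1
    return accepted / n_steps

rng = np.random.default_rng(0)

# A proposal width of order sigma_min keeps the acceptance rate reasonable, but
# exploration of the sigma_max direction then proceeds as a random walk, needing
# roughly (sigma_max / sigma_min)**2 steps per independent state; a width of
# order sigma_max instead drives the acceptance rate down.
for step in [sigma_min, sigma_max]:
    acc = metropolis(20000, step, rng)
    print(f"proposal width {step:.2f}: acceptance rate {acc:.2f}")
```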

11.6 Estimating the Partition Function


As we have seen, most of the sampling algorithms considered in this chapter re-
quire only the functional form of the probability distribution up to a multiplicative
constant. Thus if we write

\[
p_E(z) = \frac{1}{Z_E} \exp(-E(z))
\tag{11.71}
\]

then the value of the normalization constant Z_E, also known as the partition func-
tion, is not needed in order to draw samples from p(z). However, knowledge of the
value of Z_E can be useful for Bayesian model comparison since it represents the
model evidence (i.e., the probability of the observed data given the model), and so
it is of interest to consider how its value might be obtained. We assume that direct
evaluation by summing, or integrating, the function exp(−E(z)) over the state space
of z is intractable.
For model comparison, it is actually the ratio of the partition functions for two
models that is required. Multiplication of this ratio by the ratio of prior probabilities
gives the ratio of posterior probabilities, which can then be used for model selection
or model averaging.
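
Written out explicitly (with models M_1, M_2 and data D introduced here purely for illustration, the two partition functions playing the role of the model evidences), one way to express the relation just described is

\[
\frac{p(\mathcal{M}_1 \mid \mathcal{D})}{p(\mathcal{M}_2 \mid \mathcal{D})}
= \frac{Z_{E_1}}{Z_{E_2}} \times \frac{p(\mathcal{M}_1)}{p(\mathcal{M}_2)}
\]

where Z_{E_1} and Z_{E_2} denote the partition functions of the two models.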
One way to estimate a ratio of partition functions is to use importance sampling
from a distribution with energy function G(z)

\[
\frac{Z_E}{Z_G}
= \frac{\sum_z \exp(-E(z))}{\sum_z \exp(-G(z))}
= \frac{\sum_z \exp(-E(z)+G(z))\,\exp(-G(z))}{\sum_z \exp(-G(z))}
= \mathbb{E}_{G(z)}\!\left[\exp(-E+G)\right]
\simeq \frac{1}{L}\sum_{l} \exp\!\left(-E(z^{(l)})+G(z^{(l)})\right)
\tag{11.72}
\]

where the z^(l) are samples drawn from the distribution defined by exp(−G(z)).
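
A minimal sketch of this estimator, assuming simple one-dimensional quadratic energy functions chosen here so that the exact ratio is known in closed form (neither the particular example nor the sample size L comes from the text), might look as follows; the samples z^(l) are drawn from the distribution defined by G(z), as prescribed above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1D example (an assumption for illustration, not from the book):
# E(z) = z**2 / 2             -> Z_E = sqrt(2*pi)        (standard Gaussian)
# G(z) = z**2 / (2 * 2.0**2)  -> Z_G = sqrt(2*pi) * 2.0  (Gaussian, std 2.0)
# so the exact ratio Z_E / Z_G equals 0.5.
def E(z):
    return 0.5 * z**2

def G(z):
    return 0.5 * (z / 2.0) ** 2

# Draw L samples from the distribution defined by exp(-G(z)).  Here this can be
# done exactly because G is Gaussian; in general these samples would come from
# one of the sampling algorithms of this chapter.
L = 100_000
z = 2.0 * rng.standard_normal(L)

# Importance-sampling estimate of Z_E / Z_G, following equation (11.72).
ratio_estimate = np.mean(np.exp(-E(z) + G(z)))
print(f"estimated Z_E/Z_G = {ratio_estimate:.4f}  (exact value 0.5)")
```

The estimate is reliable only when the distribution defined by G(z) is a close, somewhat broader match to the one defined by E(z), which is why the wider Gaussian serves as the importance distribution in this toy example.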