# Pattern Recognition and Machine Learning

##### 2. PROBABILITY DISTRIBUTIONS

Figure 2.19 The von Mises distribution plotted for two different parameter values, $m = 5, \theta_0 = \pi/4$ and $m = 1, \theta_0 = 3\pi/4$, shown as a Cartesian plot on the left and as the corresponding polar plot on the right.

where ‘const’ denotes terms independent of $\theta$, and we have made use of the following trigonometrical identities (Exercise 2.51)

$$\cos^2 A + \sin^2 A = 1 \tag{2.177}$$

$$\cos A \cos B + \sin A \sin B = \cos(A - B). \tag{2.178}$$
If we now define $m = r_0/\sigma^2$, we obtain our final expression for the distribution of $p(\theta)$ along the unit circle $r = 1$ in the form

$$p(\theta \mid \theta_0, m) = \frac{1}{2\pi I_0(m)} \exp\left\{ m \cos(\theta - \theta_0) \right\} \tag{2.179}$$

which is called the *von Mises* distribution, or the *circular normal*. Here the parameter $\theta_0$ corresponds to the mean of the distribution, while $m$, which is known as the *concentration* parameter, is analogous to the inverse variance (precision) for the Gaussian. The normalization coefficient in (2.179) is expressed in terms of $I_0(m)$, which is the zeroth-order modified Bessel function of the first kind (Abramowitz and Stegun, 1965) and is defined by

$$I_0(m) = \frac{1}{2\pi} \int_0^{2\pi} \exp\left\{ m \cos\theta \right\} \mathrm{d}\theta. \tag{2.180}$$
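As a quick numerical sketch (assuming NumPy is available; `np.i0` is NumPy's built-in zeroth-order modified Bessel function of the first kind, which matches the $I_0(m)$ here, and the function names below are illustrative), we can evaluate $I_0(m)$ directly from the integral definition (2.180) and confirm that the density (2.179) it normalizes integrates to one:

```python
import numpy as np

def i0_quadrature(m, n=4096):
    # Integral definition (2.180): I0(m) = (1/2pi) * integral of exp{m cos(theta)}
    # over [0, 2pi). The rectangle rule, i.e. a plain mean over an evenly spaced
    # grid, is extremely accurate for smooth periodic integrands.
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.mean(np.exp(m * np.cos(theta)))

def von_mises_pdf(theta, theta0, m):
    # Density (2.179), with np.i0 supplying the normalization coefficient
    return np.exp(m * np.cos(theta - theta0)) / (2.0 * np.pi * np.i0(m))

grid = np.linspace(0.0, 2.0 * np.pi, 4096, endpoint=False)
for m, theta0 in [(5.0, np.pi / 4), (1.0, 3 * np.pi / 4)]:  # values from Figure 2.19
    area = np.mean(von_mises_pdf(grid, theta0, m)) * 2.0 * np.pi
    print(f"m={m}: I0 quad={i0_quadrature(m):.6f}, np.i0={np.i0(m):.6f}, area={area:.6f}")
```

Both parameter settings should report an area of essentially exactly 1, and the quadrature value of $I_0(m)$ should agree with `np.i0` to machine precision.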

For large $m$, the distribution becomes approximately Gaussian (Exercise 2.52). The von Mises distribution is plotted in Figure 2.19, and the function $I_0(m)$ is plotted in Figure 2.20.
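The large-$m$ Gaussian limit can be checked numerically (a sketch, assuming NumPy; the window and tolerance below are illustrative choices). Since $m$ plays the role of a precision, the comparison Gaussian has mean $\theta_0$ and variance $1/m$:

```python
import numpy as np

m, theta0 = 100.0, 0.0
theta = np.linspace(-0.3, 0.3, 2001)  # a window around the mode

# von Mises density (2.179), with np.i0 supplying I0(m)
vm = np.exp(m * np.cos(theta - theta0)) / (2.0 * np.pi * np.i0(m))

# Gaussian with mean theta0 and precision m (variance 1/m)
gauss = np.sqrt(m / (2.0 * np.pi)) * np.exp(-0.5 * m * (theta - theta0) ** 2)

# The two curves agree closely relative to the peak height
rel_discrepancy = np.max(np.abs(vm - gauss)) / gauss.max()
print(rel_discrepancy)
```

For $m = 100$ the relative discrepancy is well below 1%, and it shrinks further as $m$ grows.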

Now consider the maximum likelihood estimators for the parameters $\theta_0$ and $m$ for the von Mises distribution. The log likelihood function is given by

$$\ln p(\mathcal{D} \mid \theta_0, m) = -N \ln(2\pi) - N \ln I_0(m) + m \sum_{n=1}^{N} \cos(\theta_n - \theta_0). \tag{2.181}$$
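Expression (2.181) can be verified directly (a sketch assuming NumPy; `np.i0` supplies $I_0(m)$, the generator's `vonmises` method supplies a synthetic sample, and the function name is illustrative): it should equal the sum of the individual log densities from (2.179) term by term.

```python
import numpy as np

def log_likelihood(angles, theta0, m):
    # Log likelihood (2.181) for N observed angles theta_1, ..., theta_N
    N = len(angles)
    return (-N * np.log(2.0 * np.pi) - N * np.log(np.i0(m))
            + m * np.sum(np.cos(angles - theta0)))

rng = np.random.default_rng(0)
angles = rng.vonmises(mu=np.pi / 4, kappa=5.0, size=200)  # synthetic sample

# Cross-check: (2.181) versus summing log p(theta_n | theta0, m) pointwise
per_point = np.log(np.exp(5.0 * np.cos(angles - np.pi / 4))
                   / (2.0 * np.pi * np.i0(5.0)))
print(log_likelihood(angles, np.pi / 4, 5.0), per_point.sum())
```

The two printed values agree to machine precision, confirming that (2.181) is simply the log of the product of $N$ independent von Mises factors.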