[Figure 1.30: two histograms of probability distributions over 30 bins, one narrow with entropy $H = 1.77$ and one broad with entropy $H = 3.09$ (vertical axes show the bin probabilities, from 0 to 0.5), illustrating the higher value of the entropy $H$ for the broader distribution. The largest entropy would arise from a uniform distribution, which would give $H = -\ln(1/30) = 3.40$.]
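To make this concrete, the following minimal Python sketch computes $H = -\sum_i p(x_i)\ln p(x_i)$ for a narrow and a broad distribution over 30 bins. The two Gaussian-shaped histograms are illustrative choices, not the exact distributions used in the figure.

```python
import numpy as np

def entropy(p):
    """Entropy H = -sum_i p_i ln p_i, with 0 ln 0 taken to be 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return -np.sum(p[nz] * np.log(p[nz]))

M = 30  # number of bins, as in the figure

# Two Gaussian-shaped histograms over M bins: one narrow, one broad.
# These are illustrative shapes, not the exact distributions in the figure.
x = np.arange(M)
narrow = np.exp(-0.5 * ((x - 15) / 1.5) ** 2)
broad = np.exp(-0.5 * ((x - 15) / 6.0) ** 2)
narrow /= narrow.sum()   # normalize so the probabilities sum to 1
broad /= broad.sum()

print(f"narrow  H = {entropy(narrow):.2f}")   # low entropy (peaked)
print(f"broad   H = {entropy(broad):.2f}")    # higher entropy (spread out)
print(f"uniform H = {np.log(M):.2f}")         # maximum: ln 30 = 3.40
```

The uniform distribution attains the maximum value $\ln 30 \approx 3.40$, consistent with the caption.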
from which we find that all of the $p(x_i)$ are equal and are given by $p(x_i) = 1/M$, where $M$ is the total number of states $x_i$. The corresponding value of the entropy is then $H = \ln M$. This result can also be derived from Jensen's inequality (to be discussed shortly; see Exercise 1.29). To verify that the stationary point is indeed a maximum, we can evaluate the second derivative of the entropy, which gives
\[
\frac{\partial^2 \widetilde{H}}{\partial p(x_i)\,\partial p(x_j)} = -I_{ij}\,\frac{1}{p_i} \tag{1.100}
\]
where $I_{ij}$ are the elements of the identity matrix.
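Since the matrix in (1.100) is diagonal with strictly negative entries $-1/p_i$, it is negative definite, confirming the maximum. As a quick numerical illustration (a sketch, assuming Python with NumPy), perturbing the uniform distribution by zero-sum noise, which preserves the normalization constraint, never increases the entropy above $\ln M$:

```python
import numpy as np

def entropy(p):
    return -np.sum(p * np.log(p))

M = 30
uniform = np.full(M, 1.0 / M)
rng = np.random.default_rng(0)

# Zero-sum perturbations keep sum_i p(x_i) = 1; small amplitude keeps p > 0.
for _ in range(5):
    eps = rng.normal(scale=1e-3, size=M)
    eps -= eps.mean()                    # enforce the normalization constraint
    p = uniform + eps
    print(f"H = {entropy(p):.6f}  <=  ln M = {np.log(M):.6f}")
```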
We can extend the definition of entropy to include distributions $p(x)$ over continuous variables $x$ as follows. First divide $x$ into bins of width $\Delta$. Then, assuming $p(x)$ is continuous, the mean value theorem (Weisstein, 1999) tells us that, for each such bin, there must exist a value $x_i$ such that
\[
\int_{i\Delta}^{(i+1)\Delta} p(x)\,\mathrm{d}x = p(x_i)\Delta. \tag{1.101}
\]
We can now quantize the continuous variable $x$ by assigning any value $x$ to the value $x_i$ whenever $x$ falls in the $i$th bin. The probability of observing the value $x_i$ is then $p(x_i)\Delta$. This gives a discrete distribution for which the entropy takes the form
\[
H_\Delta = -\sum_i p(x_i)\Delta \ln\bigl(p(x_i)\Delta\bigr) = -\sum_i p(x_i)\Delta \ln p(x_i) - \ln\Delta \tag{1.102}
\]
where we have used $\sum_i p(x_i)\Delta = 1$, which follows from (1.101). We now omit the second term $-\ln\Delta$ on the right-hand side of (1.102) and then consider the limit $\Delta \to 0$.
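The effect of the bin width can be seen numerically. The sketch below (assuming Python with NumPy, and taking a standard Gaussian as an arbitrary example density) evaluates (1.102) for decreasing $\Delta$: the discrete entropy $H_\Delta$ grows without bound, while the quantity $H_\Delta + \ln\Delta$ converges to $-\int p(x)\ln p(x)\,\mathrm{d}x$, which for the standard Gaussian is $\frac{1}{2}\ln(2\pi e) \approx 1.419$.

```python
import numpy as np

# Standard Gaussian density; its differential entropy is
# (1/2) ln(2 pi e), approximately 1.4189.
def p(x):
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

for delta in [1.0, 0.1, 0.01]:
    # Bin centres over a range that captures essentially all the mass;
    # p(x_i) * delta approximates the bin mass from (1.101).
    x_i = np.arange(-10.0, 10.0, delta) + delta / 2.0
    mass = p(x_i) * delta
    H_delta = -np.sum(mass * np.log(mass))       # discrete entropy (1.102)
    print(f"Delta = {delta:5.2f}:  H_Delta = {H_delta:.4f},  "
          f"H_Delta + ln Delta = {H_delta + np.log(delta):.4f}")
```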