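Making use of the constraint (2.209), the multinomial distribution in this representation then becomes
$$
\begin{aligned}
\exp\left\{\sum_{k=1}^{M} x_k \ln \mu_k\right\}
&= \exp\left\{\sum_{k=1}^{M-1} x_k \ln \mu_k + \left(1 - \sum_{k=1}^{M-1} x_k\right) \ln\left(1 - \sum_{k=1}^{M-1} \mu_k\right)\right\} \\
&= \exp\left\{\sum_{k=1}^{M-1} x_k \ln\left(\frac{\mu_k}{1 - \sum_{j=1}^{M-1} \mu_j}\right) + \ln\left(1 - \sum_{k=1}^{M-1} \mu_k\right)\right\}.
\end{aligned}
\tag{2.211}
$$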
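The equality in (2.211) can be checked numerically. The following is a minimal sketch, assuming that the $\mu_k$ sum to one and that $\mathbf{x}$ is 1-of-$M$ coded; the function names `full_form` and `reduced_form` and the values of `mu` are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical example values (not from the text): M = 4 states.
mu = np.array([0.1, 0.2, 0.3, 0.4])  # probabilities, assumed to sum to 1


def full_form(x, mu):
    """Left-hand side of (2.211): exp{ sum_{k=1}^{M} x_k ln mu_k }."""
    return np.exp(np.sum(x * np.log(mu)))


def reduced_form(x, mu):
    """Last line of (2.211), using only the first M-1 components."""
    x_r, mu_r = x[:-1], mu[:-1]
    rest = 1.0 - mu_r.sum()  # equals mu_M under the sum-to-one assumption
    return np.exp(np.sum(x_r * np.log(mu_r / rest)) + np.log(rest))


# Compare both forms for every possible 1-of-M observation.
for k in range(len(mu)):
    x = np.eye(len(mu))[k]
    print(k, full_form(x, mu), reduced_form(x, mu))
```

For each one-hot $\mathbf{x}$, both expressions should agree with the corresponding $\mu_k$ up to floating-point rounding.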
We now identify
$$
\ln\left(\frac{\mu_k}{1 - \sum_{j} \mu_j}\right) = \eta_k
\tag{2.212}
$$
which we can solve for $\mu_k$ by first summing both sides over $k$ and then rearranging and back-substituting to give
$$
\mu_k = \frac{\exp(\eta_k)}{1 + \sum_{j} \exp(\eta_j)}.
\tag{2.213}
$$
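This is called the softmax function, or the normalized exponential. In this representation, the multinomial distribution therefore takes the form
$$
p(\mathbf{x}|\boldsymbol{\eta}) = \left(1 + \sum_{k=1}^{M-1} \exp(\eta_k)\right)^{-1} \exp(\boldsymbol{\eta}^{\mathrm{T}} \mathbf{x}).
\tag{2.214}
$$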
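This is the standard form of the exponential family, with parameter vector $\boldsymbol{\eta} = (\eta_1, \ldots, \eta_{M-1})^{\mathrm{T}}$ in which
\begin{align}
\mathbf{u}(\mathbf{x}) &= \mathbf{x} \tag{2.215} \\
h(\mathbf{x}) &= 1 \tag{2.216} \\
g(\boldsymbol{\eta}) &= \left(1 + \sum_{k=1}^{M-1} \exp(\eta_k)\right)^{-1}. \tag{2.217}
\end{align}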
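The mapping between $\mu_k$ and $\eta_k$ in (2.212) and (2.213), together with the exponential family form (2.214) to (2.217), can be illustrated with a short numerical sketch. The function names and the example values of `mu` below are hypothetical, and the code assumes that the $\mu_k$ sum to one and that $\mathbf{x}$ is 1-of-$M$ coded, with only its first $M-1$ components passed to the density.

```python
import numpy as np


def eta_from_mu(mu):
    """(2.212): eta_k = ln(mu_k / (1 - sum_j mu_j)) for k = 1..M-1."""
    mu_r = np.asarray(mu, dtype=float)[:-1]
    return np.log(mu_r / (1.0 - mu_r.sum()))


def mu_from_eta(eta):
    """(2.213): mu_k = exp(eta_k) / (1 + sum_j exp(eta_j)) for k = 1..M-1."""
    e = np.exp(eta)
    return e / (1.0 + e.sum())


def g(eta):
    """(2.217): g(eta) = (1 + sum_k exp(eta_k))^{-1}."""
    return 1.0 / (1.0 + np.exp(eta).sum())


def p(x_r, eta):
    """(2.214): p(x|eta) = h(x) g(eta) exp(eta^T u(x)) with h(x) = 1, u(x) = x.

    x_r holds the first M-1 components of the 1-of-M coded vector x.
    """
    return g(eta) * np.exp(eta @ x_r)


mu = np.array([0.1, 0.2, 0.3, 0.4])  # hypothetical values, M = 4
eta = eta_from_mu(mu)
mu_back = mu_from_eta(eta)           # should recover mu_1 .. mu_{M-1}

# Evaluate the density at each of the M possible one-hot observations.
probs = [p(np.eye(len(mu))[k][:-1], eta) for k in range(len(mu))]
print(mu_back)
print(probs)
```

When $\mathbf{x}$ selects the $M$th state, its first $M-1$ components are all zero, so the density reduces to $g(\boldsymbol{\eta})$, which should equal $\mu_M$. This naive version exponentiates $\eta_k$ directly and may overflow for large values; a practical implementation would typically subtract a constant from the exponents first.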
Finally, let us consider the Gaussian distribution. For the univariate Gaussian,
we have
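\begin{align}
p(x|\mu, \sigma^2) &= \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{-\frac{1}{2\sigma^2}(x - \mu)^2\right\} \tag{2.218} \\
&= \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{-\frac{1}{2\sigma^2}x^2 + \frac{\mu}{\sigma^2}x - \frac{1}{2\sigma^2}\mu^2\right\} \tag{2.219}
\end{align}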