2.4. The Exponential Family

Making use of the constraint (2.209), the multinomial distribution in this representation then becomes
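$$
\begin{aligned}
\exp\left\{\sum_{k=1}^{M} x_k \ln \mu_k\right\}
&= \exp\left\{\sum_{k=1}^{M-1} x_k \ln \mu_k + \left(1 - \sum_{k=1}^{M-1} x_k\right) \ln\left(1 - \sum_{k=1}^{M-1} \mu_k\right)\right\} \\
&= \exp\left\{\sum_{k=1}^{M-1} x_k \ln\left(\frac{\mu_k}{1 - \sum_{j=1}^{M-1} \mu_j}\right) + \ln\left(1 - \sum_{k=1}^{M-1} \mu_k\right)\right\}.
\end{aligned}
\tag{2.211}
$$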
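As a quick sanity check, the three expressions in (2.211) can be evaluated numerically and compared. The sketch below assumes a 1-of-M coded `x` (exactly one entry equal to 1) and an illustrative `mu` with M = 3 that sums to one; both the values and the variable names are assumptions chosen for the example, not taken from the text.

```python
import math

# Illustrative values (assumptions): M = 3, mu sums to 1, x is 1-of-M coded.
mu = [0.2, 0.5, 0.3]
x = [0, 1, 0]
M = len(mu)

# Left-hand side of (2.211): sum over all M components.
lhs = math.exp(sum(x[k] * math.log(mu[k]) for k in range(M)))

# Eliminate the M-th component using the sum-to-one constraints.
rest_mu = 1.0 - sum(mu[:M - 1])   # plays the role of mu_M
rest_x = 1 - sum(x[:M - 1])       # plays the role of x_M

# Middle expression of (2.211).
middle = math.exp(
    sum(x[k] * math.log(mu[k]) for k in range(M - 1))
    + rest_x * math.log(rest_mu)
)

# Final expression of (2.211): log-ratios plus a term independent of x.
rhs = math.exp(
    sum(x[k] * math.log(mu[k] / rest_mu) for k in range(M - 1))
    + math.log(rest_mu)
)
```

All three quantities agree up to floating-point rounding, which is what the chain of equalities asserts.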
We now identify
$$
\ln\left(\frac{\mu_k}{1 - \sum_j \mu_j}\right) = \eta_k \tag{2.212}
$$
which we can solve for $\mu_k$ by first summing both sides over $k$ and then rearranging and back-substituting to give
$$
\mu_k = \frac{\exp(\eta_k)}{1 + \sum_j \exp(\eta_j)}. \tag{2.213}
$$
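In more detail, the step from (2.212) to (2.213) can be written out as follows. The shorthand $S = \sum_j \mu_j$ is introduced here only for convenience and is not part of the original notation; all sums run over $1, \ldots, M-1$. Exponentiating (2.212) and summing over $k$ gives

$$
\mu_k = (1 - S)\exp(\eta_k)
\quad\Longrightarrow\quad
S = (1 - S)\sum_k \exp(\eta_k)
\quad\Longrightarrow\quad
1 - S = \frac{1}{1 + \sum_k \exp(\eta_k)},
$$

and substituting this expression for $1 - S$ back into $\mu_k = (1 - S)\exp(\eta_k)$ recovers the result (2.213).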
This is called the softmax function, or the normalized exponential. In this representation, the multinomial distribution therefore takes the form
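$$
p(\mathbf{x} \mid \boldsymbol{\eta}) = \left(1 + \sum_{k=1}^{M-1} \exp(\eta_k)\right)^{-1} \exp(\boldsymbol{\eta}^{\mathrm{T}} \mathbf{x}). \tag{2.214}
$$
This is the standard form of the exponential family, with parameter vector $\boldsymbol{\eta} = (\eta_1, \ldots, \eta_{M-1})^{\mathrm{T}}$, in which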
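\begin{align}
\mathbf{u}(\mathbf{x}) &= \mathbf{x} \tag{2.215} \\
h(\mathbf{x}) &= 1 \tag{2.216} \\
g(\boldsymbol{\eta}) &= \left(1 + \sum_{k=1}^{M-1} \exp(\eta_k)\right)^{-1}. \tag{2.217}
\end{align}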
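The mapping between the two parameterizations and the resulting density can be illustrated with a small sketch. It assumes that `x` holds the first M - 1 entries of a 1-of-M coded vector, so the all-zeros vector stands for the M-th state; the function names and the example values are hypothetical and serve only to illustrate (2.212) to (2.217).

```python
import math

def eta_from_mu(mu):
    """Map mu_1..mu_{M-1} to natural parameters using (2.212)."""
    rest = 1.0 - sum(mu)  # equals mu_M by the sum-to-one constraint
    return [math.log(m / rest) for m in mu]

def mu_from_eta(eta):
    """Invert the mapping using the softmax form (2.213)."""
    denom = 1.0 + sum(math.exp(e) for e in eta)
    return [math.exp(e) / denom for e in eta]

def density(x, eta):
    """Evaluate (2.214) as g(eta) * exp(eta^T x), with h(x) = 1 and u(x) = x."""
    g = 1.0 / (1.0 + sum(math.exp(e) for e in eta))  # g(eta) from (2.217)
    return g * math.exp(sum(e * xk for e, xk in zip(eta, x)))

# Illustrative values (assumption): M = 3, so the implied mu_3 is 0.3.
mu = [0.2, 0.5]
eta = eta_from_mu(mu)
recovered = mu_from_eta(eta)

# The three possible states, written with their first M - 1 components.
states = [[1, 0], [0, 1], [0, 0]]
probs = [density(x, eta) for x in states]
```

Under these assumptions `recovered` matches `mu`, and `probs` reproduces the probabilities 0.2, 0.5 and 0.3 up to rounding, so the density sums to one over the M states.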
Finally, let us consider the Gaussian distribution. For the univariate Gaussian,
we have
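\begin{align}
p(x \mid \mu, \sigma^2) &= \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{-\frac{1}{2\sigma^2}(x - \mu)^2\right\} \tag{2.218} \\
&= \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{-\frac{1}{2\sigma^2}x^2 + \frac{\mu}{\sigma^2}x - \frac{1}{2\sigma^2}\mu^2\right\} \tag{2.219}
\end{align}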