Pattern Recognition and Machine Learning

(Jeff_L) #1
Exercises 61

Note that the precise relationship between thew ̃coefficients andwcoefficients need
not be made explicit. Use this result to show that the number ofindependentparam-
etersn(D, M), which appear at orderM, satisfies the following recursion relation

n(D, M)=

∑D

i=1

n(i, M−1). (1.135)

Next use proof by induction to show that the following result holds

∑D

i=1

(i+M−2)!
(i−1)! (M−1)!

=

(D+M−1)!

(D−1)!M!

(1.136)

which can be done by first proving the result forD=1and arbitraryMby making
use of the result0!=1, then assuming it is correct for dimensionDand verifying
that it is correct for dimensionD+1. Finally, use the two previous results, together
with proof by induction, to show

n(D, M)=

(D+M−1)!

(D−1)!M!

. (1.137)

To do this, first show that the result is true forM =2, and any value ofD 1 ,
by comparison with the result of Exercise 1.14. Then make use of (1.135), together
with (1.136), to show that, if the result holds at orderM− 1 , then it will also hold at
orderM

1.16 () In Exercise 1.15, we proved the result (1.135) for the number of independent
parameters in theMthorder term of aD-dimensional polynomial. We now find an
expression for the total numberN(D, M)of independent parameters in all of the
terms up to and including theM6thorder. First show thatN(D, M)satisfies


N(D, M)=

∑M

m=0

n(D, m) (1.138)

wheren(D, m)is the number of independent parameters in the term of orderm.
Now make use of the result (1.137), together with proof by induction, to show that

N(d, M)=

(D+M)!

D!M!

. (1.139)

This can be done by first proving that the result holds forM =0and arbitrary
D 1 , then assuming that it holds at orderM, and hence showing that it holds at
orderM+1. Finally, make use of Stirling’s approximation in the form
n!nne−n (1.140)
for largento show that, forD M, the quantityN(D, M)grows likeDM,
and forM Dit grows likeMD. Consider a cubic (M =3) polynomial inD
dimensions, and evaluate numerically the total number of independent parameters
for (i)D=10and (ii)D= 100, which correspond to typical small-scale and
medium-scale machine learning applications.
Free download pdf