Pattern Recognition and Machine Learning

Exercises 61

Note that the precise relationship between thew ̃coefficients andwcoefficients need not be made explicit. Use this result to show that the number ofindependentparam- etersn(D, M), which appear at orderM, satisfies the following recursion relation

n(D, M)=

∑D

i=1

n(i, M−1). (1.135)

Next use proof by induction to show that the following result holds

∑D

i=1

(i+M−2)! (i−1)! (M−1)!

=

(D+M−1)!

(D−1)!M!

(1.136)

which can be done by first proving the result forD=1and arbitraryMby making use of the result0!=1, then assuming it is correct for dimensionDand verifying that it is correct for dimensionD+1. Finally, use the two previous results, together with proof by induction, to show

n(D, M)=

(D+M−1)!

(D−1)!M!

. (1.137)

To do this, first show that the result is true forM =2, and any value ofD 1 , by comparison with the result of Exercise 1.14. Then make use of (1.135), together with (1.136), to show that, if the result holds at orderM− 1 , then it will also hold at orderM

1.16 () In Exercise 1.15, we proved the result (1.135) for the number of independent
parameters in theMthorder term of aD-dimensional polynomial. We now find an
expression for the total numberN(D, M)of independent parameters in all of the
terms up to and including theM6thorder. First show thatN(D, M)satisfies

N(D, M)=

∑M

m=0

n(D, m) (1.138)

wheren(D, m)is the number of independent parameters in the term of orderm. Now make use of the result (1.137), together with proof by induction, to show that

N(d, M)=

(D+M)!

D!M!

. (1.139)

This can be done by first proving that the result holds forM =0and arbitrary D 1 , then assuming that it holds at orderM, and hence showing that it holds at orderM+1. Finally, make use of Stirling’s approximation in the form n!nne−n (1.140) for largento show that, forD M, the quantityN(D, M)grows likeDM, and forM Dit grows likeMD. Consider a cubic (M =3) polynomial inD dimensions, and evaluate numerically the total number of independent parameters for (i)D=10and (ii)D= 100, which correspond to typical small-scale and medium-scale machine learning applications.

Pattern Recognition and Machine Learning

=

(D+M−1)!

(D−1)!M!

(1.136)

(D+M−1)!

(D−1)!M!

. (1.137)

N(D, M)=

(D+M)!

D!M!

. (1.139)

Get our desktop app

Company

Features

Documentation

Resources