Pattern Recognition and Machine Learning

1.1. Example: Polynomial Curve Fitting

Table 1.2 Table of the coefficients w for M = 9 polynomials with various values for the regularization parameter λ. Note that ln λ = −∞ corresponds to a model with no regularization, i.e., to the graph at the bottom right in Figure 1.4. We see that, as the value of λ increases, the typical magnitude of the coefficients gets smaller.

        ln λ = −∞    ln λ = −18    ln λ = 0
w0           0.35          0.35        0.13
w1         232.37          4.74       -0.05
w2       -5321.83         -0.77       -0.06
w3       48568.31        -31.97       -0.05
w4     -231639.30         -3.89       -0.03
w5      640042.26         55.28       -0.02
w6    -1061800.52         41.32       -0.01
w7     1042400.18        -45.95       -0.00
w8     -557682.99        -91.53        0.00
w9      125201.43         72.68        0.01
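The shrinking coefficients in Table 1.2 follow from the regularized least-squares solution, which minimizes the sum-of-squares error plus a λ‖w‖² penalty. The following is a minimal NumPy sketch of that idea, assuming the chapter's synthetic setup of sin(2πx) plus Gaussian noise; the random data here are illustrative, so the printed magnitudes will not match the table's values exactly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Ten noisy samples of sin(2*pi*x), in the spirit of the chapter's running example
N, M = 10, 9
x = np.linspace(0.0, 1.0, N)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=N)

Phi = np.vander(x, M + 1, increasing=True)  # design matrix: columns 1, x, ..., x^9

max_coef = {}
for lam in (0.0, np.exp(-18.0), 1.0):  # ln(lambda) = -inf, -18, 0
    # Regularized least squares, solved via a numerically stable augmented
    # least-squares system equivalent to w = (Phi^T Phi + lam*I)^{-1} Phi^T t
    A = np.vstack([Phi, np.sqrt(lam) * np.eye(M + 1)])
    b = np.concatenate([t, np.zeros(M + 1)])
    w = np.linalg.lstsq(A, b, rcond=None)[0]
    max_coef[lam] = np.abs(w).max()
    print(f"lambda = {lam:.3g}  max|w| = {max_coef[lam]:.3g}")
```

As in the table, the largest coefficient magnitude drops sharply as λ grows, while the fit at λ = 0 interpolates the noise with enormous coefficients.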

the magnitude of the coefficients.
The impact of the regularization term on the generalization error can be seen by plotting the value of the RMS error (1.3) for both training and test sets against ln λ, as shown in Figure 1.8. We see that in effect λ now controls the effective complexity of the model and hence determines the degree of over-fitting.
The issue of model complexity is an important one and will be discussed at length in Section 1.3. Here we simply note that, if we were trying to solve a practical application using this approach of minimizing an error function, we would have to find a way to determine a suitable value for the model complexity. The results above suggest a simple way of achieving this, namely by taking the available data and partitioning it into a training set, used to determine the coefficients w, and a separate validation set, also called a hold-out set, used to optimize the model complexity (either M or λ). In many cases, however, this will prove to be too wasteful of valuable training data, and we have to seek more sophisticated approaches.
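The hold-out procedure just described can be sketched in a few lines: fit the coefficients on the training partition for each candidate λ, then keep the λ with the lowest validation RMS error. This is an illustrative sketch, not the book's code; the data-generating function, split sizes, and grid of ln λ values are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def design(x, M=9):
    # Polynomial design matrix: columns 1, x, ..., x^M
    return np.vander(x, M + 1, increasing=True)

def ridge_fit(x, t, lam, M=9):
    # Regularized least squares via a numerically stable augmented system
    Phi = design(x, M)
    A = np.vstack([Phi, np.sqrt(lam) * np.eye(M + 1)])
    b = np.concatenate([t, np.zeros(M + 1)])
    return np.linalg.lstsq(A, b, rcond=None)[0]

def rms_error(x, t, w):
    # Root-mean-square error, as in (1.3)
    return np.sqrt(np.mean((design(x, len(w) - 1) @ w - t) ** 2))

# Synthetic data (assumption: sin(2*pi*x) plus Gaussian noise, as in the chapter)
x = rng.uniform(0.0, 1.0, size=40)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)

# Partition: the training set determines w, the hold-out set scores each lambda
x_tr, t_tr, x_va, t_va = x[:30], t[:30], x[30:], t[30:]

ln_lams = np.arange(-35.0, 0.0, 2.0)
val_rms = [rms_error(x_va, t_va, ridge_fit(x_tr, t_tr, np.exp(l))) for l in ln_lams]
best_ln_lam = ln_lams[int(np.argmin(val_rms))]
print("chosen ln lambda =", best_ln_lam)
```

The same loop could instead vary M with λ fixed; either way, the model complexity is chosen on data the fit has never seen.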
So far our discussion of polynomial curve fitting has appealed largely to intuition. We now seek a more principled approach to solving problems in pattern recognition by turning to a discussion of probability theory. As well as providing the foundation for nearly all of the subsequent developments in this book, it will also


Figure 1.8 Graph of the root-mean-square error (1.3) versus ln λ for the M = 9 polynomial. [Plot not reproduced: E_RMS on the vertical axis (0 to 1) against ln λ on the horizontal axis (−35 to −20), with separate Training and Test curves.]