
The regression function is the conditional expectation of the target variable conditioned on the input variable, which is given by
\[
y(\mathbf{x}) = \mathbb{E}[t \mid \mathbf{x}]
= \int_{-\infty}^{\infty} t\, p(t \mid \mathbf{x})\, \mathrm{d}t
= \frac{\int t\, p(\mathbf{x}, t)\, \mathrm{d}t}{\int p(\mathbf{x}, t)\, \mathrm{d}t}
= \frac{\sum_n \int t\, f(\mathbf{x} - \mathbf{x}_n,\, t - t_n)\, \mathrm{d}t}{\sum_m \int f(\mathbf{x} - \mathbf{x}_m,\, t - t_m)\, \mathrm{d}t}
\tag{6.43}
\]
where the last step substitutes the Parzen density estimate of the joint distribution, $p(\mathbf{x}, t) = \frac{1}{N} \sum_n f(\mathbf{x} - \mathbf{x}_n,\, t - t_n)$, with one component density $f$ centred on each data point; the factors of $1/N$ cancel between numerator and denominator.

We now assume for simplicity that the component density functions have zero mean so that
\[
\int_{-\infty}^{\infty} f(\mathbf{x}, t)\, t\, \mathrm{d}t = 0
\tag{6.44}
\]

for all values of $\mathbf{x}$. Using a simple change of variable, we then obtain
\[
y(\mathbf{x}) = \frac{\sum_n g(\mathbf{x} - \mathbf{x}_n)\, t_n}{\sum_m g(\mathbf{x} - \mathbf{x}_m)}
= \sum_n k(\mathbf{x}, \mathbf{x}_n)\, t_n
\tag{6.45}
\]

where $n, m = 1, \dots, N$ and the kernel function $k(\mathbf{x}, \mathbf{x}_n)$ is given by
\[
k(\mathbf{x}, \mathbf{x}_n) = \frac{g(\mathbf{x} - \mathbf{x}_n)}{\sum_m g(\mathbf{x} - \mathbf{x}_m)}
\tag{6.46}
\]

and we have defined
\[
g(\mathbf{x}) = \int_{-\infty}^{\infty} f(\mathbf{x}, t)\, \mathrm{d}t.
\tag{6.47}
\]
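For concreteness (this particular choice is an illustration, not fixed by the text at this point), suppose the component density is a zero-mean isotropic Gaussian, $f(\mathbf{x}, t) = \mathcal{N}(\mathbf{x} \mid \mathbf{0}, \sigma^2 \mathbf{I})\, \mathcal{N}(t \mid 0, \sigma^2)$. Then (6.44) holds automatically, the marginal (6.47) is $g(\mathbf{x}) = \mathcal{N}(\mathbf{x} \mid \mathbf{0}, \sigma^2 \mathbf{I})$, and the normalizing constants cancel in (6.46) to give
\[
k(\mathbf{x}, \mathbf{x}_n) = \frac{\exp\!\left(-\|\mathbf{x} - \mathbf{x}_n\|^2 / 2\sigma^2\right)}{\sum_m \exp\!\left(-\|\mathbf{x} - \mathbf{x}_m\|^2 / 2\sigma^2\right)}
\]
so that (6.45) becomes a smoothly weighted average of the targets $t_n$, with weights that decay with distance from $\mathbf{x}$.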

The result (6.45) is known as theNadaraya-Watsonmodel, orkernel regression
(Nadaraya, 1964; Watson, 1964). For a localized kernel function, it has the prop-
erty of giving more weight to the data pointsxnthat are close tox. Note that the
kernel (6.46) satisfies the summation constraint

∑N

n=1

k(x,xn)=1.
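The following is a minimal Python sketch of (6.45)-(6.47) for scalar inputs, assuming the Gaussian choice of $g$ described above; the function name, the bandwidth parameter `h` (playing the role of $\sigma$), and the toy sine-curve data are illustrative assumptions rather than anything specified in the text.

```python
import numpy as np

def nadaraya_watson(x_query, x_train, t_train, h=0.3):
    """Nadaraya-Watson kernel regression, eq. (6.45), with Gaussian g.

    x_query: (Q,) points at which to evaluate y(x)
    x_train: (N,) training inputs x_n
    t_train: (N,) training targets t_n
    h:       bandwidth, playing the role of sigma (assumed value)
    """
    # g(x - x_n) for every (query, training) pair; shape (Q, N).
    diff = x_query[:, None] - x_train[None, :]
    g = np.exp(-0.5 * (diff / h) ** 2)

    # k(x, x_n) = g(x - x_n) / sum_m g(x - x_m), eq. (6.46).
    k = g / g.sum(axis=1, keepdims=True)

    # Summation constraint: the weights sum to one at every query point.
    assert np.allclose(k.sum(axis=1), 1.0)

    # y(x) = sum_n k(x, x_n) t_n, eq. (6.45).
    return k @ t_train

# Toy data (illustrative): noisy samples of a sine curve.
rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, size=20)
t_train = np.sin(2.0 * np.pi * x_train) + 0.1 * rng.normal(size=20)

x_query = np.linspace(0.0, 1.0, 5)
print(nadaraya_watson(x_query, x_train, t_train))
```

In this sketch the bandwidth `h` controls the locality of the kernel: a small value makes $y(\mathbf{x})$ track the nearest targets closely, while a large value averages over many data points.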