Pattern Recognition and Machine Learning
6.3. Radial Basis Function Networks

Figure 6.2 Plot ...
the input variable, which is given by

y(x) = E[t|x] = ∫_{−∞}^{∞} t p(t|x) dt
     = ∫ t p(x,t) dt / ∫ p(x,t) dt
     = Σ_n ∫ t f(x − x_n, t − t_n) dt / Σ_m ∫ f(x − x_m, t − t_m) dt ...
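For concreteness, the resulting Nadaraya-Watson estimator can be sketched in a few lines of NumPy. This is only an illustration, assuming an isotropic Gaussian component density with bandwidth h and synthetic sinusoidal data that are not from the text:

import numpy as np

def nadaraya_watson(x_query, x_train, t_train, h=0.1):
    # Gaussian kernel weights k(x, x_n) proportional to exp(-(x - x_n)^2 / (2 h^2)),
    # normalized over the training points so they sum to one.
    sq_dist = (x_query[:, None] - x_train[None, :]) ** 2
    weights = np.exp(-0.5 * sq_dist / h ** 2)
    weights /= weights.sum(axis=1, keepdims=True)
    # The prediction is a weighted average of the training targets.
    return weights @ t_train

# Synthetic data (hypothetical): noisy samples of a sinusoid.
rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, size=20)
t_train = np.sin(2 * np.pi * x_train) + 0.1 * rng.standard_normal(20)
x_query = np.linspace(0.0, 1.0, 100)
y_pred = nadaraya_watson(x_query, x_train, t_train, h=0.05)

The bandwidth h plays the role of the scale of the component density and controls the smoothness of the resulting regression function.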
Figure 6.3 Illustration of the Nadaraya-Watson kernel regression model using isotropic Gaussian kernels ...

6.4. Gaussian Processes
... probabilistic discriminative models, leading to the framework of Gaussian processes. We shall thereby see how kernel ...
... x_1, ..., x_N. We are therefore interested in the joint distribution of the function values y(x_1), ...
Figure 6.4 Samples from Gaussian processes for a ‘Gaussian’ kernel (left) and an exponential kernel (right) ...
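Such samples can be drawn directly from the corresponding multivariate Gaussian. The sketch below is illustrative only: it assumes a squared-exponential (‘Gaussian’) kernel and an exponential kernel with arbitrary parameter values, and adds a small jitter term for numerical stability.

import numpy as np

def gaussian_kernel(x1, x2, length=0.2):
    # 'Gaussian' (squared-exponential) kernel: smooth sample functions.
    return np.exp(-0.5 * (x1[:, None] - x2[None, :]) ** 2 / length ** 2)

def exponential_kernel(x1, x2, theta=4.0):
    # Exponential kernel: rougher, Ornstein-Uhlenbeck-like sample paths.
    return np.exp(-theta * np.abs(x1[:, None] - x2[None, :]))

x = np.linspace(-1.0, 1.0, 200)
rng = np.random.default_rng(1)
for kernel in (gaussian_kernel, exponential_kernel):
    K = kernel(x, x) + 1e-6 * np.eye(len(x))   # jitter keeps K positive definite
    L = np.linalg.cholesky(K)
    # Each column of L @ z is one sample function y(x) from the GP prior.
    samples = L @ rng.standard_normal((len(x), 5))

The exponential kernel corresponds to the Ornstein-Uhlenbeck process, which is why its sample paths are continuous but not smooth.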
where the covariance matrix C has elements

C(x_n, x_m) = k(x_n, x_m) + β^{−1} δ_{nm}.        (6.62)

This result reflects the ...
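Given this covariance matrix, the predictive distribution for a new input follows from the Gaussian conditioning results derived in this section, with mean k^T C_N^{−1} t_N and variance c − k^T C_N^{−1} k, where k has elements k(x_n, x_{N+1}) and c = k(x_{N+1}, x_{N+1}) + β^{−1}. The following sketch assumes those results together with an illustrative squared-exponential kernel and synthetic data that are not from the text:

import numpy as np

def sqexp_kernel(x1, x2, length=0.3):
    return np.exp(-0.5 * (x1[:, None] - x2[None, :]) ** 2 / length ** 2)

def gp_predict(x_new, x_train, t_train, beta=25.0, length=0.3):
    # C_N has elements k(x_n, x_m) + beta^{-1} delta_{nm}, as in (6.62).
    C = sqexp_kernel(x_train, x_train, length) + np.eye(len(x_train)) / beta
    k = sqexp_kernel(x_train, x_new, length)      # k(x_n, x_new), shape (N, M)
    c = 1.0 + 1.0 / beta                          # k(x_new, x_new) + beta^{-1}
    C_inv = np.linalg.inv(C)
    mean = k.T @ C_inv @ t_train                  # predictive mean
    var = c - np.sum(k * (C_inv @ k), axis=0)     # predictive variance
    return mean, var

# Synthetic data (hypothetical): noisy samples of a sinusoid.
rng = np.random.default_rng(2)
x_train = rng.uniform(0.0, 1.0, 10)
t_train = np.sin(2 * np.pi * x_train) + 0.2 * rng.standard_normal(10)
x_new = np.linspace(0.0, 1.0, 50)
mean, var = gp_predict(x_new, x_train, t_train)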
Figure 6.5 Samples from a Gaussian process prior for different settings of the kernel parameters; the panels are labelled by the parameter values, including (1.00, 4.00, 0.00, 0.00) and (9.00, 4.00, 0.00, 0.00) ...
Figure 6.6 Illustration of the sampling of data points {t_n} from a Gaussian process. The blue curve shows ...
Figure 6.7 Illustration of the mechanism of Gaussian process regression for the case of one training point ...
... Gaussian process regression have also been considered, for purposes such as modelling the distribution of ...
Figure 6.9 Samples from the ARD prior for Gaussian processes, in which the kernel function is given by (6. ...
Figure 6.10 Illustration of automatic relevance determination in a Gaussian process for a synthetic ...
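An ARD kernel of this general kind can be written with one scale parameter η_i per input dimension, so that driving a particular η_i towards zero removes the influence of that input on the covariance. The following is a sketch of such a kernel; the parameter values and the added constant and linear terms are illustrative assumptions rather than the precise form referenced in the figure caption:

import numpy as np

def ard_kernel(X1, X2, eta, theta0=1.0, theta2=0.0, theta3=0.0):
    # Exponential-quadratic term with a separate precision eta[i] per dimension,
    # plus optional constant and linear terms.
    diff = X1[:, None, :] - X2[None, :, :]                 # shape (N1, N2, D)
    quad = np.einsum('nmd,d->nm', diff ** 2, eta)          # sum_i eta_i (x_i - x'_i)^2
    return theta0 * np.exp(-0.5 * quad) + theta2 + theta3 * (X1 @ X2.T)

# Two-dimensional inputs; eta = (1.0, 0.01) makes the second input nearly irrelevant,
# so the Gram matrix is dominated by variation in the first input.
rng = np.random.default_rng(3)
X = rng.standard_normal((5, 2))
K = ard_kernel(X, X, eta=np.array([1.0, 0.01]))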
Figure 6.11 The left plot shows a sample from ...
... predictive distribution is given by

p(t_{N+1} = 1 | t_N) = ∫ p(t_{N+1} = 1 | a_{N+1}) p(a_{N+1} | t_N) da_{N+1}        (6.76)

where p(t_{N+1} ...
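If p(a_{N+1} | t_N) is approximated by a Gaussian N(a | μ, σ²), as it will be under the Laplace approximation developed below, the convolution of a logistic sigmoid with a Gaussian can be approximated by σ(κ(σ²)μ) with κ(σ²) = (1 + πσ²/8)^{−1/2}, the probit-based approximation used earlier in the book for Bayesian logistic regression. A small numerical check of that approximation, with illustrative values only:

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def approx_predictive(mu, var):
    # sigma(kappa(var) * mu), with kappa(var) = (1 + pi * var / 8)^(-1/2),
    # approximating the integral of sigmoid(a) N(a | mu, var) da.
    kappa = 1.0 / np.sqrt(1.0 + np.pi * var / 8.0)
    return sigmoid(kappa * mu)

# Compare against brute-force quadrature for one (mu, var) pair.
mu, var = 0.8, 2.0
a = np.linspace(mu - 10.0 * np.sqrt(var), mu + 10.0 * np.sqrt(var), 20001)
gauss = np.exp(-0.5 * (a - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)
quadrature = np.sum(sigmoid(a) * gauss) * (a[1] - a[0])
approximation = approx_predictive(mu, var)   # close to the quadrature value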
... where we have used p(t_N | a_{N+1}, a_N) = p(t_N | a_N). The conditional distribution p(a_{N+1} | a_N) is obtained by invoking the ...
... maximum. The posterior distribution is not Gaussian, however, because the Hessian is a function of a_N ...
... where Ψ(a_N) = ln p(a_N | θ) + ln p(t_N | a_N). We also need to evaluate the gradient of ln p(t_N | θ) with respect to the parameter ...
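The mode of Ψ(a_N) has no closed form, but Newton-Raphson iteration leads to an update that can be written a_N ← C_N (I + W_N C_N)^{−1} (t_N − σ_N + W_N a_N), where W_N = diag(σ_n(1 − σ_n)) and σ_N is the vector of sigmoids σ(a_n). The sketch below implements an update of this form; the kernel, jitter term, and binary data are illustrative assumptions, not taken from the text:

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def laplace_mode(C, t, n_iter=20):
    # Newton-Raphson iteration for the mode of the posterior over a_N:
    #   a <- C (I + W C)^(-1) (t - sigma(a) + W a),  W = diag(sigma_n (1 - sigma_n)).
    N = len(t)
    a = np.zeros(N)
    for _ in range(n_iter):
        s = sigmoid(a)
        W = np.diag(s * (1.0 - s))
        a = C @ np.linalg.solve(np.eye(N) + W @ C, t - s + W @ a)
    return a

# Illustrative setup: squared-exponential Gram matrix with a small diagonal term,
# and binary targets t_n in {0, 1} determined by the sign of the input.
rng = np.random.default_rng(4)
x = rng.uniform(-1.0, 1.0, 15)
C = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.3 ** 2) + 1e-4 * np.eye(15)
t = (x > 0).astype(float)
a_mode = laplace_mode(C, t)

The Gaussian approximation centred at this mode, with covariance given by the inverse Hessian, is then used to evaluate the predictive integral (6.76).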
Figure 6.12 Illustration of the use of a Gaussian process for classification, showing ...
By working directly with the covariance function we have implicitly marginalized over the distribution of ...