Pattern Recognition and Machine Learning
4.2. Probabilistic Generative Models 201 the log likelihood function that depend on $\pi$ are $\sum_{n=1}^{N} \{ t_n \ln \pi + (1 - t_n) \ln(1 - \pi) \}$. (4.72) Set ...
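The sum in (4.72) can be checked numerically. The sketch below is only an illustration under stated assumptions: the label list `targets` is hypothetical, and the closed form used for `pi_ml` (the fraction of points with $t_n = 1$) is the standard result of setting the derivative with respect to $\pi$ to zero, which the truncated excerpt does not show.

```python
import math

def log_likelihood_pi(pi, targets):
    """Terms of the log likelihood that depend on pi: the sum in (4.72)."""
    return sum(t * math.log(pi) + (1 - t) * math.log(1 - pi) for t in targets)

# Hypothetical labels (an assumption): t_n = 1 for class C1, t_n = 0 for class C2.
targets = [1, 0, 1, 1, 0, 1, 0, 1]

# Stationary point of the sum: the fraction of points assigned to C1.
pi_ml = sum(targets) / len(targets)

# Brute-force check on a grid over the open interval (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
pi_grid = max(grid, key=lambda p: log_likelihood_pi(p, targets))
```

For this hypothetical data set the grid search and the closed form should agree to within the grid spacing.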
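202 4. LINEAR MODELS FOR CLASSIFICATION where we have defined $\mathbf{S} = \frac{N_1}{N}\mathbf{S}_1 + \frac{N_2}{N}\mathbf{S}_2$ (4.78) $\mathbf{S}_1 = \frac{1}{N_1} \sum_{n \in \mathcal{C}_1} (\mathbf{x}_n - \boldsymbol{\mu}_1)(\mathbf{x}_n -$ ...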
4.3. Probabilistic Discriminative Models 203 2 classes) or softmax ($K \geqslant 2$ classes) activation functions. These are particular cas ...
204 4. LINEAR MODELS FOR CLASSIFICATION [Figure 4.12 panels: left axes $x_1$, $x_2$; right axes $\phi_1$, $\phi_2$] Figure 4.12 Illustration of the role of no ...
4.3. Probabilistic Discriminative Models 205 basis functions is typically set to a constant, say $\phi_0(\mathbf{x}) = 1$, so that the correspon ...
206 4. LINEAR MODELS FOR CLASSIFICATION For a data set $\{\boldsymbol{\phi}_n, t_n\}$, where $t_n \in \{0, 1\}$ and $\boldsymbol{\phi}_n = \boldsymbol{\phi}(\mathbf{x}_n)$, with $n = 1, \ldots, N$, the likelihood ...
4.3. Probabilistic Discriminative Models 207 4.3.3 Iterative reweighted least squares In the case of the linear regression model ...
208 4. LINEAR MODELS FOR CLASSIFICATION where we have made use of (4.88). Also, we have introduced the $N \times N$ diagonal matrix $\mathbf{R}$ with el ...
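The role of the diagonal weighting matrix in iterative reweighted least squares can be sketched with a small Newton-Raphson loop. This is a minimal illustration, not the book's code: the data set `xs`, `ts` is hypothetical, the basis is assumed to be $\boldsymbol{\phi}_n = (1, x_n)$, and the weights are assumed to take the standard logistic-regression form $R_{nn} = y_n(1 - y_n)$, which the truncated excerpt does not show in full.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

# Hypothetical 1-D data with overlapping classes (an assumption); phi_n = (1, x_n).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ts = [0, 0, 1, 0, 1, 1]
phis = [(1.0, x) for x in xs]

def gradient(w):
    """Gradient of the cross-entropy error: sum_n (y_n - t_n) phi_n."""
    g = [0.0, 0.0]
    for phi, t in zip(phis, ts):
        y = sigmoid(w[0] * phi[0] + w[1] * phi[1])
        g[0] += (y - t) * phi[0]
        g[1] += (y - t) * phi[1]
    return g

def hessian(w):
    """Hessian Phi^T R Phi, with assumed weights R_nn = y_n (1 - y_n)."""
    h = [[0.0, 0.0], [0.0, 0.0]]
    for phi in phis:
        y = sigmoid(w[0] * phi[0] + w[1] * phi[1])
        r = y * (1.0 - y)
        for i in range(2):
            for j in range(2):
                h[i][j] += r * phi[i] * phi[j]
    return h

def newton_step(w):
    """One update w_new = w_old - H^{-1} grad, using an explicit 2x2 inverse."""
    g = gradient(w)
    h = hessian(w)
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    step0 = (h[1][1] * g[0] - h[0][1] * g[1]) / det
    step1 = (-h[1][0] * g[0] + h[0][0] * g[1]) / det
    return [w[0] - step0, w[1] - step1]

w = [0.0, 0.0]
for _ in range(20):
    w = newton_step(w)
```

Because the weights depend on the current $\mathbf{w}$, the Hessian is recomputed at every step, which is what makes the procedure iterative. The classes in the hypothetical data overlap on purpose: with linearly separable data the maximum likelihood solution is not finite and the loop would not settle.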
4.3. Probabilistic Discriminative Models 209 4.3.4 Multiclass logistic regression Section 4.2 In our discussion of generative mo ...
210 4. LINEAR MODELS FOR CLASSIFICATION where we have made use of $\sum_k t_{nk} = 1$. Once again, we see the same form arising for the gra ...
4.3. Probabilistic Discriminative Models 211 Figure 4.13 Schematic example of a probability density $p(\theta)$ shown by the blue curve, ...
212 4. LINEAR MODELS FOR CLASSIFICATION however, find another use for the probit model when we discuss Bayesian treatments of lo ...
4.4. The Laplace Approximation 213 Thus $y$ and $\eta$ must be related, and we denote this relation through $\eta = \psi(y)$. Following Nelder and Wedder ...
214 4. LINEAR MODELS FOR CLASSIFICATION over the parameter vector $\mathbf{w}$ since the posterior distribution is no longer Gaussian. It is ...
4.4. The Laplace Approximation 215 Figure 4.14 Illustration of t ...
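A one-dimensional Laplace approximation can be sketched in a few lines: locate the mode of the unnormalised density, then use the negative second derivative of its logarithm at the mode as the precision of a Gaussian. The density below, $f(z) = \exp(-z^2/2)\,\sigma(20z + 4)$, is an assumed example chosen for illustration, and the names `z0`, `A` and `laplace_pdf` are hypothetical.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def log_f(z):
    """Log of the assumed unnormalised density f(z) = exp(-z^2/2) * sigmoid(20z + 4)."""
    return -0.5 * z * z + math.log(sigmoid(20.0 * z + 4.0))

def d_log_f(z):
    """First derivative of log f."""
    return -z + 20.0 * (1.0 - sigmoid(20.0 * z + 4.0))

def d2_log_f(z):
    """Second derivative of log f (negative everywhere for this density)."""
    s = sigmoid(20.0 * z + 4.0)
    return -1.0 - 400.0 * s * (1.0 - s)

# Newton's method on log f to find the mode z0, where the first derivative vanishes.
z0 = 0.0
for _ in range(50):
    z0 -= d_log_f(z0) / d2_log_f(z0)

# Precision of the approximating Gaussian: negative curvature of log f at the mode.
A = -d2_log_f(z0)

def laplace_pdf(z):
    """Gaussian approximation with mean z0 and variance 1 / A."""
    return math.sqrt(A / (2.0 * math.pi)) * math.exp(-0.5 * A * (z - z0) ** 2)
```

The approximation matches the true density only locally around the mode, so it can be poor for skewed or multimodal distributions.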
216 4. LINEAR MODELS FOR CLASSIFICATION and Nabney, 2008). Many of the distributions encountered in practice will be multimoda ...
4.5. Bayesian Logistic Regression 217 where $\boldsymbol{\theta}_{\mathrm{MAP}}$ is the value of $\boldsymbol{\theta}$ at the mode of the posterior distribution, and $\mathbf{A}$ is the Hessian matr ...
218 4. LINEAR MODELS FOR CLASSIFICATION where $\mathbf{m}_0$ and $\mathbf{S}_0$ are fixed hyperparameters. The posterior distribution over $\mathbf{w}$ is given by p ...
4.5. Bayesian Logistic Regression 219 where $p(a) = \int \delta(a - \mathbf{w}^{\mathrm{T}}\boldsymbol{\phi})\, q(\mathbf{w})\, \mathrm{d}\mathbf{w}$. (4.148) We can evaluate $p(a)$ by noting that the delta function ...
220 4. LINEAR MODELS FOR CLASSIFICATION We now apply the approximation $\sigma(a) \simeq \Phi(\lambda a)$ to the probit functions appearing on both sides ...
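How close the logistic sigmoid is to a rescaled probit function can be checked directly. The sketch below assumes the scaling $\lambda^2 = \pi/8$, the value at which the two functions have the same slope at the origin; that value is not visible in the truncated excerpt, and the grid range is an arbitrary choice.

```python
import math

def sigmoid(a):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-a))

def probit(a):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

# Assumed scaling: lambda^2 = pi / 8 matches the slopes of the two curves at a = 0.
lam = math.sqrt(math.pi / 8.0)

# Largest absolute gap between the two functions on a grid over [-8, 8].
grid = [i / 100.0 for i in range(-800, 801)]
max_gap = max(abs(sigmoid(a) - probit(lam * a)) for a in grid)
```

On this grid the two curves stay within about two percentage points of each other, which is what makes the substitution useful when an integral against a Gaussian is needed.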