Understanding Machine Learning: From Theory to Algorithms
23.7 Exercises
matrix W ∈ R^{n,d}, where each element of W is independently distributed according to N(0, 1/n), we have |〈Wu, Wv〉 − 〈u ...
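The concentration claim in this exercise can be checked empirically: a random matrix W with i.i.d. N(0, 1/n) entries approximately preserves inner products. A minimal sketch; the dimensions, seed, and test vectors below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 1000, 400                     # original and projected dimensions (illustrative)
u = rng.standard_normal(d)
v = rng.standard_normal(d)
u /= np.linalg.norm(u)               # work with unit vectors
v /= np.linalg.norm(v)

# W in R^{n,d} with i.i.d. N(0, 1/n) entries, as in the exercise.
W = rng.normal(0.0, np.sqrt(1.0 / n), size=(n, d))

# |<Wu, Wv> - <u, v>| is small with high probability.
error = abs(np.dot(W @ u, W @ v) - np.dot(u, v))
print(error)
```

Rerunning with different seeds (or larger n) shows the error shrinking at roughly the 1/sqrt(n) rate the exercise quantifies.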
24 Generative Models
We started this book with a distribution-free learning framework; namely, we did not impose any assumptions o ...
24.1 Maximum Likelihood Estimator
Let us start with a simple example. A drug company devel ...
24.1.1 Maximum Likelihood Estimation for Continuous Random Variables
Let X be a continuous random variable. ...
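For a Gaussian random variable, the maximum likelihood estimates take a familiar closed form: the empirical mean and the 1/m-normalized empirical variance. A minimal sketch on synthetic data; the true parameters (mean 2.0, standard deviation 3.0) and sample size are made-up illustrations.

```python
import numpy as np

rng = np.random.default_rng(1)
sample = rng.normal(loc=2.0, scale=3.0, size=10_000)   # synthetic i.i.d. sample

# Gaussian ML estimates: the empirical mean, and the empirical variance
# normalized by 1/m (not 1/(m-1)).
mu_hat = sample.mean()
sigma2_hat = np.mean((sample - mu_hat) ** 2)
print(mu_hat, sigma2_hat)   # close to the true values 2.0 and 9.0
```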
24.1.2 Maximum Likelihood and Empirical Risk Minimization
The maximum likelihood estimator ...
To answer this question we need to define how we assess the quality of an approximated solution of the d ...
viously in the book. A simple regularization technique is outlined in Exercise 2.
24.2 Naive Bayes
The Naiv ...
vector of features x = (x_1, ..., x_d). But now the generative assumption is as follows. First, we assume that ...
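The Naive Bayes generative assumption — draw the label Y first, then each feature independently given Y — can be sketched end to end with Bernoulli features: estimate the class priors and per-feature conditional probabilities by maximum likelihood, then predict with the resulting Bayes rule. Everything below (the class-conditional probabilities, dimensions, seed) is synthetic illustration, not data from the book.

```python
import numpy as np

rng = np.random.default_rng(2)

# Generative model: Y uniform over {0, 1}; given Y, each of d Bernoulli
# features is drawn independently with a class-dependent probability.
d, m = 5, 5000
theta = np.array([[0.8, 0.2, 0.7, 0.3, 0.9],   # P[x_j = 1 | Y = 0]
                  [0.2, 0.8, 0.3, 0.7, 0.1]])  # P[x_j = 1 | Y = 1]
y = rng.integers(0, 2, size=m)
X = (rng.random((m, d)) < theta[y]).astype(int)

# ML estimates of the class priors and the conditional feature probabilities.
prior = np.array([np.mean(y == 0), np.mean(y == 1)])
theta_hat = np.array([X[y == c].mean(axis=0) for c in (0, 1)])

def predict(x):
    # Under the naive (conditional independence) assumption the posterior
    # factorizes: argmax_y  log P[Y=y] + sum_j log P[x_j | Y=y].
    log_post = np.log(prior) + (
        x * np.log(theta_hat) + (1 - x) * np.log(1 - theta_hat)
    ).sum(axis=1)
    return int(np.argmax(log_post))

accuracy = np.mean([predict(X[i]) == y[i] for i in range(m)])
print(accuracy)
```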
24.4 Latent Variables and the EM Algorithm
Therefore, the density of X can be written as:
P[X = x] = ∑_{y=1}^{k} P[Y = y] P[X = x | Y = y] = ∑ ...
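The marginal density of X above is just a weighted sum over the latent variable. A tiny sketch for a one-dimensional Gaussian mixture; the mixing weights and means are made up for illustration.

```python
import math

# Evaluate P[X=x] = sum_y P[Y=y] * P[X=x | Y=y] for a toy 1-D Gaussian
# mixture (weights and component means are illustrative, unit variances).
weights = [0.3, 0.7]
means = [-1.0, 2.0]

def gaussian_pdf(x, mu, sigma=1.0):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def mixture_density(x):
    # Marginalize the latent component index y out of the joint density.
    return sum(w * gaussian_pdf(x, mu) for w, mu in zip(weights, means))

print(mixture_density(0.0))
```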
If each row of Q defines a probability over the ith latent variable given X = x_i, then we can interpret F(Q, θ) as ...
The second term is the sum of the entropies of the rows of Q. Let
Q = { Q ∈ [0,1]^{m,k} : ∀i ...
while for Q_{i,y} = P_θ[Y = y | X = x_i] we have
G(Q, θ) = ∑_{i=1}^{m} ( ∑_{y=1}^{k} P_θ[Y = y | X = x_i] log ( P_θ[X = x_i, Y = y] / P_θ[Y = y | X = x_i] ...
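The alternation this derivation justifies — an E-step setting Q_{i,y} = P_θ[Y = y | X = x_i], then an M-step maximizing the expected complete log-likelihood — can be sketched for a one-dimensional Gaussian mixture with known unit variances. The data, initialization, and iteration count below are illustrative choices, not the book's.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic sample from a two-component 1-D Gaussian mixture, unit variances.
m, k = 2000, 2
true_mu = np.array([-2.0, 2.0])
y = rng.integers(0, k, size=m)
x = rng.normal(true_mu[y], 1.0)

# EM over the mixing weights c and the means mu.
c = np.full(k, 1.0 / k)
mu = np.array([-1.0, 1.0])            # crude initialization
for _ in range(50):
    # E-step: Q[i, y] = P_theta[Y = y | X = x_i] (softmax over components).
    log_w = np.log(c) - 0.5 * (x[:, None] - mu[None, :]) ** 2
    Q = np.exp(log_w - log_w.max(axis=1, keepdims=True))
    Q /= Q.sum(axis=1, keepdims=True)
    # M-step: maximize the expected complete log-likelihood w.r.t. c and mu.
    c = Q.mean(axis=0)
    mu = (Q * x[:, None]).sum(axis=0) / Q.sum(axis=0)

print(c, mu)   # near [0.5, 0.5] and [-2, 2]
```

With well-separated components the iterates converge quickly; with overlapping components or a poor initialization, EM may converge to a worse local maximum of the likelihood.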
24.5 Bayesian Reasoning
which in our case amounts to maximizing the following expression w.r.t. c and μ:
∑_{i=1}^{m} ∑_{y=1}^{k} P_{θ^(t)}[Y ...
As before, given a specific value of θ, it is assumed that the conditional probability, P[X = x | θ], is known ...
24.6 Summary
It is interesting to note that when P[θ] is uniform we obtain that
P[X = x | S] ∝ ∫ θ^{x + ∑_i x_i} (1 − θ)^{1 − x + ∑_i (1 − x_i)} dθ. ...
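For a Bernoulli parameter with a uniform prior, this type of Beta integral evaluates in closed form to Laplace's rule of succession, P[X = 1 | S] = (∑_i x_i + 1)/(m + 2). A quick numerical check of that identity; the sample S below is a made-up illustration.

```python
import numpy as np

# Posterior predictive for a Bernoulli parameter under a uniform prior,
# computed by numerically integrating the Beta-type integral and by its
# closed form (Laplace's rule of succession).
sample = [1, 0, 1, 1, 0, 1, 1]       # toy sample S
m, s = len(sample), sum(sample)

theta = np.linspace(0.0, 1.0, 100_001)

def trapezoid(f_vals, x_vals):
    # Simple trapezoidal rule.
    return float(np.sum((f_vals[1:] + f_vals[:-1]) * np.diff(x_vals)) / 2.0)

# P[X=1|S] = ∫ θ^(s+1) (1-θ)^(m-s) dθ  /  ∫ θ^s (1-θ)^(m-s) dθ
num = trapezoid(theta ** (s + 1) * (1 - theta) ** (m - s), theta)
den = trapezoid(theta ** s * (1 - theta) ** (m - s), theta)
p_numeric = num / den

p_closed = (s + 1) / (m + 2)          # Beta-integral closed form
print(p_numeric, p_closed)
```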
24.8 Exercises
Prove that the maximum likelihood estimator of the variance of a Gaussian variable is bias ...
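A simulation builds intuition for this exercise (it is not a proof): averaged over many samples of size m, the 1/m-normalized ML variance estimator concentrates around (m−1)/m · σ², not σ². The sample size, true variance, and trial count below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(4)

# Empirical check that the ML variance estimator (1/m normalization) is
# biased: its expectation is (m-1)/m * sigma^2.
m, sigma2, trials = 5, 4.0, 200_000
samples = rng.normal(0.0, np.sqrt(sigma2), size=(trials, m))
ml_var = np.mean((samples - samples.mean(axis=1, keepdims=True)) ** 2, axis=1)
print(ml_var.mean(), (m - 1) / m * sigma2)   # both near 3.2
```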
25 Feature Selection and Generation
In the beginning of the book, we discussed the abstract model of learning, in which the prio ...
We emphasize that while there are some common techniques for feature learning one may wan ...
25.1 Feature Selection
25.1.1 Filters
Maybe the simplest approach for feature selection is the filter method, in which we as ...
If Pearson's coefficient equals zero it means that the optimal linear function from v to y is t ...
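A filter based on Pearson's coefficient can be sketched in a few lines: score each feature by the absolute value of its empirical correlation with the labels, then keep the top-scoring features. The synthetic data below (labels driven by features 2 and 7 plus small noise) is a made-up illustration.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy data: d standard-normal features; only features 2 and 7 drive the label.
m, d = 500, 10
X = rng.standard_normal((m, d))
y = 3.0 * X[:, 2] - 2.0 * X[:, 7] + 0.1 * rng.standard_normal(m)

def pearson(v, y):
    # Empirical Pearson correlation between a feature column v and labels y.
    v, y = v - v.mean(), y - y.mean()
    return float(v @ y / (np.linalg.norm(v) * np.linalg.norm(y)))

# Filter step: rank features by |correlation| and keep the two best.
scores = np.array([abs(pearson(X[:, j], y)) for j in range(d)])
top2 = sorted(np.argsort(scores)[-2:].tolist())
print(top2)
```

Note that this filter only scores features one at a time, so it can miss features that are informative jointly but uncorrelated with the labels individually.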