Understanding Machine Learning: From Theory to Algorithms




We will focus here on the class of one-dimensional, degree-$n$ polynomial regression predictors, namely,

$$\mathcal{H}_{\text{poly}}^{n} = \{x \mapsto p(x)\},$$

where $p$ is a one-dimensional polynomial of degree $n$, parameterized by a vector of coefficients $(a_0, \ldots, a_n)$. Note that $\mathcal{X} = \mathbb{R}$, since this is a one-dimensional polynomial, and $\mathcal{Y} = \mathbb{R}$, as this is a regression problem.
One way to learn this class is by reduction to the problem of linear regression, which we have already shown how to solve. To translate a polynomial regression problem to a linear regression problem, we define the mapping $\psi : \mathbb{R} \to \mathbb{R}^{n+1}$ such that $\psi(x) = (1, x, x^2, \ldots, x^n)$. Then we have that

$$p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = \langle a, \psi(x) \rangle,$$

and we can find the optimal vector of coefficients $a$ by applying the Least Squares algorithm, as shown earlier, to the transformed examples $(\psi(x), y)$.
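
To make the reduction concrete, the following is a minimal sketch in NumPy; the helper names poly_features and fit_poly, and the sample data, are illustrative assumptions rather than anything from the text. It maps each scalar example $x$ to $\psi(x)$ and recovers the coefficient vector $a$ with ordinary least squares.

import numpy as np

def poly_features(x, n):
    # Map each scalar x_i to psi(x_i) = (1, x_i, x_i^2, ..., x_i^n).
    return np.vander(np.asarray(x, dtype=float), N=n + 1, increasing=True)

def fit_poly(x, y, n):
    # Reduce degree-n polynomial regression to linear regression:
    # solve the least-squares problem over the features psi(x).
    a, *_ = np.linalg.lstsq(poly_features(x, n), np.asarray(y, dtype=float), rcond=None)
    return a  # coefficient vector (a_0, ..., a_n)

# Usage: recover a cubic from noisy samples (assumed, illustrative data).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
y = 1.0 - 2.0 * x + 0.5 * x**3 + rng.normal(scale=0.05, size=x.shape)
print(fit_poly(x, y, n=3))  # approximately (1.0, -2.0, 0.0, 0.5)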

9.3 Logistic Regression


In logistic regression we learn a family of functions $h$ from $\mathbb{R}^d$ to the interval $[0,1]$. However, logistic regression is used for classification tasks: we can interpret $h(x)$ as the probability that the label of $x$ is 1. The hypothesis class associated with logistic regression is the composition of a sigmoid function $\phi_{\text{sig}} : \mathbb{R} \to [0,1]$ over the class of linear functions $L_d$. In particular, the sigmoid function used in logistic regression is the logistic function, defined as

$$\phi_{\text{sig}}(z) = \frac{1}{1 + \exp(-z)}. \qquad (9.9)$$

The name “sigmoid” means “S-shaped,” referring to the S-shaped plot of this function.
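
As a companion sketch, the logistic-regression hypothesis $h_w(x) = \phi_{\text{sig}}(\langle w, x \rangle)$ can be written directly in NumPy; the weight vector $w$ below is an assumed, illustrative value, since fitting $w$ has not yet been discussed.

import numpy as np

def sigmoid(z):
    # The logistic function of Equation (9.9): 1 / (1 + exp(-z)).
    return 1.0 / (1.0 + np.exp(-z))

def h(w, x):
    # Logistic-regression hypothesis: sigmoid composed with a linear function.
    # Returns a value in [0, 1], interpreted as the probability that x's label is 1.
    return sigmoid(np.dot(w, x))

# Usage with an assumed weight vector in R^3:
w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 0.0, 0.5])
print(h(w, x))  # ~0.82: the predicted probability that the label of x is 1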