126 Linear Predictors
We will focus here on the class of one-dimensional, degree-$n$ polynomial regression predictors, namely,
\[
\mathcal{H}^{\text{poly}}_n = \{x \mapsto p(x)\},
\]
where $p$ is a one-dimensional polynomial of degree $n$, parameterized by a vector of coefficients $(a_0, \ldots, a_n)$. Note that $\mathcal{X} = \mathbb{R}$, since this is a one-dimensional polynomial, and $\mathcal{Y} = \mathbb{R}$, as this is a regression problem.
One way to learn this class is by reduction to the problem of linear regression, which we have already shown how to solve. To translate a polynomial regression problem into a linear regression problem, we define the mapping $\psi : \mathbb{R} \to \mathbb{R}^{n+1}$ such that $\psi(x) = (1, x, x^2, \ldots, x^n)$. Then we have that
\[
p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = \langle \mathbf{a}, \psi(x) \rangle,
\]
and we can find the optimal vector of coefficients $\mathbf{a}$ by using the Least Squares algorithm as shown earlier.
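As a minimal sketch of this reduction (the function names and sample data below are illustrative, not from the text), we can apply $\psi$ to each sample to build a design matrix and then run least squares on the expanded features:

```python
import numpy as np

def psi(x, n):
    """Map a scalar x to the feature vector (1, x, x^2, ..., x^n)."""
    return np.array([x**k for k in range(n + 1)])

def fit_poly(xs, ys, n):
    """Fit a degree-n polynomial by the reduction to linear regression:
    stack psi(x) for each sample and solve the least-squares problem."""
    Phi = np.stack([psi(x, n) for x in xs])       # one row per sample
    a, *_ = np.linalg.lstsq(Phi, ys, rcond=None)  # coefficients (a_0, ..., a_n)
    return a

def predict(a, x):
    """Evaluate p(x) = <a, psi(x)>."""
    return psi(x, len(a) - 1) @ a
```

For instance, fitting a degree-2 polynomial to samples drawn from $y = x^2$ recovers coefficients close to $(0, 0, 1)$.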
9.3 Logistic Regression
In logistic regression we learn a family of functions $h$ from $\mathbb{R}^d$ to the interval $[0,1]$. However, logistic regression is used for classification tasks: we can interpret $h(x)$ as the \emph{probability} that the label of $x$ is $1$. The hypothesis class associated with logistic regression is the composition of a sigmoid function $\phi_{\text{sig}} : \mathbb{R} \to [0,1]$ over the class of linear functions $L_d$. In particular, the sigmoid function used in logistic regression is the \emph{logistic function}, defined as
\[
\phi_{\text{sig}}(z) = \frac{1}{1 + \exp(-z)}. \tag{9.9}
\]
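A minimal sketch of such a hypothesis (the weight vector and input below are illustrative): it composes the logistic function with the linear function $x \mapsto \langle \mathbf{w}, x \rangle$, yielding a value in $[0,1]$ that can be read as the probability that the label is $1$.

```python
import numpy as np

def phi_sig(z):
    """Logistic function: phi_sig(z) = 1 / (1 + exp(-z)), mapping R into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def h(w, x):
    """Logistic-regression hypothesis: the logistic function applied to <w, x>."""
    return phi_sig(np.dot(w, x))
```

Note that $\phi_{\text{sig}}(0) = 1/2$, so inputs on the separating hyperplane $\langle \mathbf{w}, x \rangle = 0$ get probability $1/2$, while inputs far on either side get probabilities near $1$ or $0$.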
The name “sigmoid” means “S-shaped,” referring to the plot of this function,
shown in the figure: