Understanding Machine Learning: From Theory to Algorithms


Kernel Methods


Prove that K is a valid kernel; namely, find a mapping ψ : {1, ..., N} → H, where H is some Hilbert space, such that

$$\forall x, x' \in \{1, \ldots, N\}, \qquad K(x, x') = \langle \psi(x), \psi(x') \rangle.$$
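
Because the domain is the finite set {1, ..., N}, proving that K is a kernel amounts to showing that the N × N Gram matrix G with entries G[i, j] = K(i, j) is positive semidefinite; any factorization G = VVᵀ then exhibits an explicit mapping ψ(i) = i-th row of V. The definition of K is stated earlier in the exercise and is not reproduced in this excerpt, so the sanity-check sketch below uses K(x, x') = min(x, x') purely as a hypothetical stand-in:

```python
import numpy as np

# Hypothetical stand-in for the kernel defined earlier in the exercise; it is
# used here only so that the sketch runs end to end.
def K(x, xp):
    return min(x, xp)

N = 10
# Gram matrix over the whole (finite) domain {1, ..., N}: G[i-1, j-1] = K(i, j).
G = np.array([[K(i, j) for j in range(1, N + 1)] for i in range(1, N + 1)], dtype=float)

# K is a valid kernel on {1, ..., N} iff this single Gram matrix is positive
# semidefinite, i.e. all eigenvalues are >= 0 (up to numerical tolerance).
eigvals = np.linalg.eigvalsh(G)
print("smallest eigenvalue:", eigvals.min())

# Any factorization G = V V^T gives an explicit finite-dimensional mapping
# psi(i) = i-th row of V, since then <psi(i), psi(j)> = G[i-1, j-1] = K(i, j).
V = np.linalg.cholesky(G + 1e-12 * np.eye(N))
i, j = 3, 7
print(np.dot(V[i - 1], V[j - 1]), G[i - 1, j - 1])  # the two values agree
```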


  1. A supermarket manager would like to learn which of his customers have babies
    on the basis of their shopping carts. Specifically, he sampled i.i.d. customers,
    where for customer i, we let x_i ⊂ {1, ..., d} denote the subset of items the
    customer bought, and let y_i ∈ {±1} be the label indicating whether this
    customer has a baby. As prior knowledge, the manager knows that there are
    k items such that the label is 1 if and only if the customer bought at least
    one of these k items. Of course, the identity of these k items is not
    known (otherwise, there would be nothing to learn). In addition, according to
    the store regulations, each customer can buy at most s items. Help the manager
    design a learning algorithm such that both its time complexity and its sample
    complexity are polynomial in s, k, and 1/ε. (A sketch of one possible approach
    is given after this exercise.)
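
One possible approach (a sketch, not necessarily the solution intended by the authors): encode each cart as its indicator vector over the d items. The target concept is then a halfspace realized by w* = 2·1_T (T being the unknown set of k items) with bias −1, which separates the data with margin 1, while ‖w*‖² = O(k) and every cart has squared norm at most s. Running the Perceptron in dual (kernel) form with the set-intersection kernel K(A, B) = |A ∩ B| + 1 (the +1 absorbs the bias) therefore makes O(ks) mistakes, each update costing O(s) time, and a standard online-to-batch conversion gives sample complexity polynomial in s, k, and 1/ε. The names below (kernel_perceptron, predict) are illustrative:

```python
import random

def K(A, B):
    """Set-intersection kernel with a +1 term that absorbs the bias.
    Carts have at most s items, so each evaluation costs O(s)."""
    return len(A & B) + 1

def kernel_perceptron(samples, epochs=20):
    """Kernel Perceptron over carts (frozensets of item ids).
    Returns the list of (cart, label) pairs on which updates were made,
    which defines the predictor in dual form."""
    mistakes = []
    for _ in range(epochs):
        updated = False
        for x, y in samples:
            score = sum(yi * K(xi, x) for xi, yi in mistakes)
            if y * score <= 0:           # mistake (or tie): update in dual form
                mistakes.append((x, y))
                updated = True
        if not updated:                  # consistent with the whole sample
            break
    return mistakes

def predict(mistakes, x):
    score = sum(yi * K(xi, x) for xi, yi in mistakes)
    return 1 if score > 0 else -1

# Toy illustration: the k = 2 "baby" items are {3, 17}; carts have at most s = 5 items.
random.seed(0)
baby_items = {3, 17}
carts = [frozenset(random.sample(range(100), 5)) for _ in range(200)]
train = [(c, 1 if c & baby_items else -1) for c in carts]
mistakes = kernel_perceptron(train)
train_errors = sum(predict(mistakes, c) != y for c, y in train)
print("training errors:", train_errors, "updates:", len(mistakes))
```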

  2. Let X be an instance set and let ψ be a feature mapping of X into some
    Hilbert feature space V. Let K : X × X → R be a kernel function that
    implements inner products in the feature space V.
    Consider the binary classification algorithm that predicts the label of
    an unseen instance according to the class with the closest average. Formally,
    given a training sequence S = (x_1, y_1), ..., (x_m, y_m), for every y ∈ {±1}
    we define
    $$c_y = \frac{1}{m_y} \sum_{i : y_i = y} \psi(x_i),$$

    where m_y = |{i : y_i = y}|. We assume that m_+ and m_− are nonzero. Then,
    the algorithm outputs the following decision rule:

    $$h(x) = \begin{cases} 1 & \text{if } \|\psi(x) - c_+\| \le \|\psi(x) - c_-\| \\ -1 & \text{otherwise.} \end{cases}$$


  1. Let w = c_+ − c_− and let b = ½(‖c_−‖² − ‖c_+‖²). Show that
    h(x) = sign(⟨w, ψ(x)⟩ + b).

  2. Show how to express h(x) on the basis of the kernel function, and without
    accessing individual entries of ψ(x) or w. (A sketch follows this exercise.)
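
Expanding the definitions of w and b gives

$$\langle w, \psi(x)\rangle = \frac{1}{m_+}\sum_{i : y_i = 1} K(x_i, x) - \frac{1}{m_-}\sum_{i : y_i = -1} K(x_i, x), \qquad \|c_y\|^2 = \frac{1}{m_y^2}\sum_{i, j : y_i = y_j = y} K(x_i, x_j),$$

so h(x) can be computed from kernel evaluations alone. The sketch below illustrates this; the function name two_means_predict and the Gaussian kernel used in the toy example are arbitrary choices made here, not part of the exercise:

```python
import numpy as np

def two_means_predict(K, X_train, y_train, X_test):
    """Nearest-class-mean classifier in feature space, using only kernel values.

    Implements h(x) = sign(<w, psi(x)> + b) with w = c_+ - c_- and
    b = 0.5 * (||c_-||^2 - ||c_+||^2), where c_y is the mean of psi(x_i) over the
    training points with label y; psi and w are never formed explicitly.
    """
    y = np.asarray(y_train)
    pos, neg = (y == 1), (y == -1)
    m_pos, m_neg = pos.sum(), neg.sum()

    # Kernel values between training and test points: G[i, t] = K(x_i, x_t).
    G = np.array([[K(xi, xt) for xt in X_test] for xi in X_train], dtype=float)
    # <w, psi(x_t)> = (1/m_+) sum_{y_i=1} K(x_i, x_t) - (1/m_-) sum_{y_i=-1} K(x_i, x_t)
    w_dot_psi = G[pos].sum(axis=0) / m_pos - G[neg].sum(axis=0) / m_neg

    # ||c_y||^2 = (1/m_y^2) sum_{i, j : y_i = y_j = y} K(x_i, x_j)
    G_train = np.array([[K(xi, xj) for xj in X_train] for xi in X_train], dtype=float)
    norm_pos = G_train[np.ix_(pos, pos)].sum() / m_pos**2
    norm_neg = G_train[np.ix_(neg, neg)].sum() / m_neg**2
    b = 0.5 * (norm_neg - norm_pos)

    return np.where(w_dot_psi + b >= 0, 1, -1)

# Toy usage with a Gaussian kernel (an arbitrary choice for illustration).
def gaussian_kernel(x, xp, gamma=1.0):
    x, xp = np.asarray(x, dtype=float), np.asarray(xp, dtype=float)
    return np.exp(-gamma * np.sum((x - xp) ** 2))

rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(loc=+2.0, size=(20, 2)),
                     rng.normal(loc=-2.0, size=(20, 2))])
y_train = np.array([1] * 20 + [-1] * 20)
X_test = np.array([[2.0, 2.0], [-2.0, -2.0]])
print(two_means_predict(gaussian_kernel, X_train, y_train, X_test))  # e.g. [ 1 -1]
```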
