Understanding Machine Learning: From Theory to Algorithms


Kernel Methods


Prove that K is a valid kernel; namely, find a mapping ψ : {1, ..., N} → H, where H is some Hilbert space, such that

$$\forall x, x' \in \{1, \ldots, N\}, \qquad K(x, x') = \langle \psi(x), \psi(x') \rangle.$$
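
Because the domain is the finite set {1, ..., N}, proving that K is a kernel amounts to showing that the N × N Gram matrix G with entries G[i, j] = K(i, j) is positive semidefinite; any factorization G = VVᵀ then exhibits an explicit mapping ψ(i) = i-th row of V. The definition of K is stated earlier in the exercise and is not reproduced in this excerpt, so the sanity-check sketch below uses K(x, x') = min(x, x') purely as a hypothetical stand-in:

```python
import numpy as np

# Hypothetical stand-in for the kernel defined earlier in the exercise; it is
# used here only so that the sketch runs end to end.
def K(x, xp):
    return min(x, xp)

N = 10
# Gram matrix over the whole (finite) domain {1, ..., N}: G[i-1, j-1] = K(i, j).
G = np.array([[K(i, j) for j in range(1, N + 1)] for i in range(1, N + 1)], dtype=float)

# K is a valid kernel on {1, ..., N} iff this single Gram matrix is positive
# semidefinite, i.e. all eigenvalues are >= 0 (up to numerical tolerance).
eigvals = np.linalg.eigvalsh(G)
print("smallest eigenvalue:", eigvals.min())

# Any factorization G = V V^T gives an explicit finite-dimensional mapping
# psi(i) = i-th row of V, since then <psi(i), psi(j)> = G[i-1, j-1] = K(i, j).
V = np.linalg.cholesky(G + 1e-12 * np.eye(N))
i, j = 3, 7
print(np.dot(V[i - 1], V[j - 1]), G[i - 1, j - 1])  # the two values agree
```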


  1. A supermarket manager would like to learn which of his customers have babies
    on the basis of their shopping carts. Specifically, he sampled i.i.d. customers,
    where for customer i, we let x_i ⊂ {1, ..., d} denote the subset of items the
    customer bought, and let y_i ∈ {±1} be the label indicating whether this
    customer has a baby. As prior knowledge, the manager knows that there are
    k items such that the label is 1 if and only if the customer bought at least
    one of these k items. Of course, the identity of these k items is not
    known (otherwise, there would be nothing to learn). In addition, according to
    the store regulations, each customer can buy at most s items. Help the manager
    design a learning algorithm such that both its time complexity and its sample
    complexity are polynomial in s, k, and 1/ε. (A sketch of one possible approach
    is given after this exercise.)
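
One possible approach (a sketch, not necessarily the solution intended by the authors): encode each cart as its indicator vector over the d items. The target concept is then a halfspace realized by w* = 2·1_T (T being the unknown set of k items) with bias −1, which separates the data with margin 1, while ‖w*‖² = O(k) and every cart has squared norm at most s. Running the Perceptron in dual (kernel) form with the set-intersection kernel K(A, B) = |A ∩ B| + 1 (the +1 absorbs the bias) therefore makes O(ks) mistakes, each update costing O(s) time, and a standard online-to-batch conversion gives sample complexity polynomial in s, k, and 1/ε. The names below (kernel_perceptron, predict) are illustrative:

```python
import random

def K(A, B):
    """Set-intersection kernel with a +1 term that absorbs the bias.
    Carts have at most s items, so each evaluation costs O(s)."""
    return len(A & B) + 1

def kernel_perceptron(samples, epochs=20):
    """Kernel Perceptron over carts (frozensets of item ids).
    Returns the list of (cart, label) pairs on which updates were made,
    which defines the predictor in dual form."""
    mistakes = []
    for _ in range(epochs):
        updated = False
        for x, y in samples:
            score = sum(yi * K(xi, x) for xi, yi in mistakes)
            if y * score <= 0:           # mistake (or tie): update in dual form
                mistakes.append((x, y))
                updated = True
        if not updated:                  # consistent with the whole sample
            break
    return mistakes

def predict(mistakes, x):
    score = sum(yi * K(xi, x) for xi, yi in mistakes)
    return 1 if score > 0 else -1

# Toy illustration: the k = 2 "baby" items are {3, 17}; carts have at most s = 5 items.
random.seed(0)
baby_items = {3, 17}
carts = [frozenset(random.sample(range(100), 5)) for _ in range(200)]
train = [(c, 1 if c & baby_items else -1) for c in carts]
mistakes = kernel_perceptron(train)
train_errors = sum(predict(mistakes, c) != y for c, y in train)
print("training errors:", train_errors, "updates:", len(mistakes))
```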

  2. Let X be an instance set and let ψ be a feature mapping of X into some
    Hilbert feature space V. Let K : X × X → R be a kernel function that
    implements inner products in the feature space V.
    Consider the binary classification algorithm that predicts the label of
    an unseen instance according to the class with the closest average. Formally,
    given a training sequence S = (x_1, y_1), ..., (x_m, y_m), for every y ∈ {±1}
    we define
    $$c_y = \frac{1}{m_y} \sum_{i : y_i = y} \psi(x_i),$$

    where m_y = |{i : y_i = y}|. We assume that m_+ and m_− are nonzero. Then,
    the algorithm outputs the following decision rule:

    $$h(x) = \begin{cases} 1 & \text{if } \|\psi(x) - c_+\| \le \|\psi(x) - c_-\| \\ -1 & \text{otherwise.} \end{cases}$$


  1. Let w = c_+ − c_− and let b = ½(‖c_−‖² − ‖c_+‖²). Show that
    h(x) = sign(⟨w, ψ(x)⟩ + b).

  2. Show how to express h(x) on the basis of the kernel function, and without
    accessing individual entries of ψ(x) or w. (A sketch follows this exercise.)
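
Expanding the definitions of w and b gives

$$\langle w, \psi(x)\rangle = \frac{1}{m_+}\sum_{i : y_i = 1} K(x_i, x) - \frac{1}{m_-}\sum_{i : y_i = -1} K(x_i, x), \qquad \|c_y\|^2 = \frac{1}{m_y^2}\sum_{i, j : y_i = y_j = y} K(x_i, x_j),$$

so h(x) can be computed from kernel evaluations alone. The sketch below illustrates this; the function name two_means_predict and the Gaussian kernel used in the toy example are arbitrary choices made here, not part of the exercise:

```python
import numpy as np

def two_means_predict(K, X_train, y_train, X_test):
    """Nearest-class-mean classifier in feature space, using only kernel values.

    Implements h(x) = sign(<w, psi(x)> + b) with w = c_+ - c_- and
    b = 0.5 * (||c_-||^2 - ||c_+||^2), where c_y is the mean of psi(x_i) over the
    training points with label y; psi and w are never formed explicitly.
    """
    y = np.asarray(y_train)
    pos, neg = (y == 1), (y == -1)
    m_pos, m_neg = pos.sum(), neg.sum()

    # Kernel values between training and test points: G[i, t] = K(x_i, x_t).
    G = np.array([[K(xi, xt) for xt in X_test] for xi in X_train], dtype=float)
    # <w, psi(x_t)> = (1/m_+) sum_{y_i=1} K(x_i, x_t) - (1/m_-) sum_{y_i=-1} K(x_i, x_t)
    w_dot_psi = G[pos].sum(axis=0) / m_pos - G[neg].sum(axis=0) / m_neg

    # ||c_y||^2 = (1/m_y^2) sum_{i, j : y_i = y_j = y} K(x_i, x_j)
    G_train = np.array([[K(xi, xj) for xj in X_train] for xi in X_train], dtype=float)
    norm_pos = G_train[np.ix_(pos, pos)].sum() / m_pos**2
    norm_neg = G_train[np.ix_(neg, neg)].sum() / m_neg**2
    b = 0.5 * (norm_neg - norm_pos)

    return np.where(w_dot_psi + b >= 0, 1, -1)

# Toy usage with a Gaussian kernel (an arbitrary choice for illustration).
def gaussian_kernel(x, xp, gamma=1.0):
    x, xp = np.asarray(x, dtype=float), np.asarray(xp, dtype=float)
    return np.exp(-gamma * np.sum((x - xp) ** 2))

rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(loc=+2.0, size=(20, 2)),
                     rng.normal(loc=-2.0, size=(20, 2))])
y_train = np.array([1] * 20 + [-1] * 20)
X_test = np.array([[2.0, 2.0], [-2.0, -2.0]])
print(two_means_predict(gaussian_kernel, X_train, y_train, X_test))  # e.g. [ 1 -1]
```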
