Before delving into the question of how machine learning methods operate, we
begin by looking at the different forms the input might take and, in the next
chapter, the different kinds of output that might be produced. With any software
system, understanding what the inputs and outputs are is far more important
than knowing what goes on in between, and machine learning is no
exception.
The input takes the form of concepts, instances, and attributes. We call the
thing that is to be learned a concept description. The idea of a concept, like
the very idea of learning in the first place, is hard to pin down precisely, and
we won’t spend time philosophizing about just what it is and isn’t. In a
sense, what we are trying to find—the result of the learning process—is a
description of the concept that is intelligible in that it can be understood, discussed,
and disputed, and operational in that it can be applied to actual examples.
The next section explains some distinctions among different kinds of
learning problems, distinctions that are very concrete and very important in
practical data mining.
Chapter 2
Input: Concepts, Instances, and Attributes