defined for two-dimensional regions, and thus require two-
dimensional regions as input. More speculatively, Palmer and
Rock (1994a, 1994b) also claimed that figure-ground organi-
zation must logically precede grouping and parsing. The
reason is that the latter processes, which apparently depend
on certain shape-based properties of the regions in question—
for example, concavity-convexity, similarity of orientation,
shape, size, and motion—require prior boundary assignment.
Grouping and parsing thus depend on shape properties that
are logically well-defined for regions only after boundaries
have been assigned, either to one side or perhaps initially to
both sides (Peterson & Gibson, 1991).
Parsing
Another important process involved in the organization of
perception is parsing or part segmentation: dividing a single
element into two or more parts. This is essentially the oppo-
site of grouping. Parsing is important because it determines
what subregions of a perceptual unit are perceived as belong-
ing together most coherently. To illustrate, consider the
leopard in Figure 7.1 (A). Region segmentation might well
define it as a single region based on its textural similarity
(region 4), and this conforms to our experience of it as a sin-
gle object. But we also experience it as being composed
of several clear and obvious parts: the head, body, tail, and
three visible legs, as indicated by the dashed lines in Fig-
ure 7.1 (B). The large, lower portion of the tree limb (re-
gion 9) is similarly a single UC region, but it too can be
perceived as divided (although perhaps less strongly) into the
different sections indicated by dotted lines in Figure 7.1 (B).
Palmer and Rock (1994a) argued that parsing must logi-
cally follow region segmentation because parsing presup-
poses the existence of a unitary region to be divided. Since
they proposed that region segmentation is the first step in
the process that forms such region-based elements, they nat-
urally argued that parsing must come after it. There is no
logical constraint, however, on the order in which parsing
and grouping must occur relative to each other. They could
very well happen simultaneously. This is why the flowchart
of Palmer and Rock’s theory (Figure 7.9) shows both
grouping and parsing taking place at the same time after re-
gions have been defined. According to their analysis, pars-
ing should also occur after figure-ground organization.
The reason is that parsing, like grouping, is based on prop-
erties (such as concavity-convexity) that are properly attrib-
uted to regions only after some boundary assignment has
been made. There is no point in parsing a background re-
gion at concavities along its border if that border does not
define the shape of the corresponding environmental object,
but only the shape of a neighboring object that partly
occludes it.
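The ordering argument above can be sketched as a small dependency graph. This is only an illustration of the logical constraints stated in the text, not a rendering of Figure 7.9; the process labels and the helper name processing_stages are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each process maps to the set of processes that must logically precede it.
# The labels are illustrative; the constraints are those stated in the text.
PRECEDES = {
    "region_segmentation": set(),
    "figure_ground": {"region_segmentation"},
    "grouping": {"figure_ground"},
    "parsing": {"figure_ground"},
}

def processing_stages(deps):
    """Group processes into stages; members of one stage are mutually unordered."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose prerequisites are done
        stages.append(ready)
        sorter.done(*ready)
    return stages

stages = processing_stages(PRECEDES)
for number, stage in enumerate(stages, start=1):
    print(number, stage)
```

Grouping and parsing land in the same stage because nothing in the analysis orders one before the other.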
There are at least two quite different ways to go about
dividing an object into parts: boundary rules and shape primitives.
The boundary rule approach is to define a set of general
conditions that specify where the boundaries lie between
parts. The best known theory of this type was developed by
Hoffman and Richards (1984). Their key observation was
that the two-dimensional silhouettes of multipart objects can
usually be divided at deep concavities: places where the contour
of an object’s outer border is maximally curved inward
(concave) toward the interior of the region. Formally, these
points are local negative minima of curvature.
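As a rough illustration of this rule, the following sketch estimates signed curvature along a closed silhouette and marks the vertices where it is both negative and a local minimum. It is a minimal sketch under stated assumptions, not Hoffman and Richards's own procedure: the contour is assumed to be a polygon sampled counterclockwise, curvature is approximated by the turning angle divided by the mean adjacent edge length, and the peanut-shaped test contour and all function names are hypothetical.

```python
import math

def signed_curvature(points):
    """Discrete signed curvature at each vertex of a closed polygon.

    Assumes counterclockwise traversal, so convex vertices come out
    positive and concave vertices negative.
    """
    n = len(points)
    kappa = []
    for i in range(n):
        x0, y0 = points[i - 1]
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        ax, ay = x1 - x0, y1 - y0          # incoming edge
        bx, by = x2 - x1, y2 - y1          # outgoing edge
        turn = math.atan2(ax * by - ay * bx, ax * bx + ay * by)
        mean_len = 0.5 * (math.hypot(ax, ay) + math.hypot(bx, by))
        kappa.append(turn / mean_len)
    return kappa

def negative_minima(kappa):
    """Indices where curvature is negative and strictly below both neighbors."""
    n = len(kappa)
    return [i for i in range(n)
            if kappa[i] < 0
            and kappa[i] < kappa[i - 1]
            and kappa[i] < kappa[(i + 1) % n]]

# Hypothetical test silhouette: a peanut shape, r = 1 + 0.6 cos(2 theta),
# which has a narrow "neck" at theta = pi/2 and theta = 3*pi/2.
N = 200
contour = []
for i in range(N):
    theta = 2 * math.pi * i / N
    r = 1 + 0.6 * math.cos(2 * theta)
    contour.append((r * math.cos(theta), r * math.sin(theta)))

cut_points = negative_minima(signed_curvature(contour))
print(cut_points)
```

For this contour the marked vertices fall at the two sides of the neck, which is where the silhouette would be divided into two lobes. Real image contours are noisy, so a practical version would presumably smooth the contour before estimating curvature.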
An alternative to parsing by boundary rules is the shape
primitive approach. It is based on a set of atomic, indivisible
shapes that constitute a complete listing of the most basic
parts. More complex objects are then analyzed as configura-
tions of these primitive parts. This process can be thought of
as analogous to dividing cursively written words into parts by
knowing the cursive alphabet and finding the primitive com-
ponent letters. Such a scheme for parsing works well if there
is a relatively small set of primitive components, as there is in
the case of cursive writing. It is far from obvious, however,
what the two-dimensional shape primitives might be in the
case of parsing two-dimensional projections of natural scenes.
If the shape primitive approach is going to work, it is
natural that the shape primitives appropriate for parsing the
projected images of three-dimensional objects should be
the projections of three-dimensional volumetric shape primitives.
Such an analysis has been given in Binford’s (1971)
proposal that complex three-dimensional shapes can be analyzed
into configurations of generalized cylinders: appropriately
sized and shaped volumes that are generalized from
standard cylinders in the sense that they have extra parame-
ters that enable them to describe many more shapes. The
extra parameters include ones that specify the shape of
the base (rather than always being circular), the curvature of
the axis (rather than always being straight), and so forth (see
also Biederman, 1987; Marr, 1982). The important point for
present purposes is that if one has a set of shape primitives
and some way of detecting them in two-dimensional images,
complex three-dimensional objects can be appropriately seg-
mented into primitive parts. Provided that the primitives are
sufficiently general, part segmentation will be possible, even
for novel objects.
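To make the idea of extra parameters concrete, here is a minimal sketch of a generalized cylinder as a data structure. It is an illustration only, not Binford's formulation: the class and field names are hypothetical, the size-change rule along the axis is an added assumption (the text mentions only the base shape and axis curvature explicitly), and cross-sections are kept parallel to the xy-plane rather than perpendicular to the axis, which is a simplification.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Tuple

Point3 = Tuple[float, float, float]

@dataclass
class GeneralizedCylinder:
    """Hypothetical parameterization of a generalized cylinder."""
    axis: Callable[[float], Point3]           # t in [0, 1] -> point on the axis
    cross_section: Callable[[float], float]   # angle -> radius of the base outline
    scale: Callable[[float], float] = lambda t: 1.0  # assumed size change along axis

    def surface_points(self, n_axis: int = 5, n_angle: int = 8) -> List[Point3]:
        """Sample the surface by sweeping the scaled base along the axis."""
        points = []
        for i in range(n_axis):
            t = i / (n_axis - 1)
            cx, cy, cz = self.axis(t)
            s = self.scale(t)
            for j in range(n_angle):
                phi = 2 * math.pi * j / n_angle
                r = s * self.cross_section(phi)
                # Simplification: the cross-section stays parallel to the xy-plane.
                points.append((cx + r * math.cos(phi), cy + r * math.sin(phi), cz))
        return points

def straight_axis(t: float) -> Point3:
    return (0.0, 0.0, t)

def unit_circle(phi: float) -> float:
    return 1.0

# A standard cylinder: straight axis, circular base, constant size.
cylinder = GeneralizedCylinder(axis=straight_axis, cross_section=unit_circle)

# A cone: same axis and base, but the base shrinks to a point at t = 1.
cone = GeneralizedCylinder(axis=straight_axis, cross_section=unit_circle,
                           scale=lambda t: 1.0 - t)

cylinder_points = cylinder.surface_points()
cone_points = cone.surface_points()
```

Swapping in a different cross_section or a curved axis function yields other members of the same family, which is the sense in which a small set of parameters can describe many shapes.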
Visual Interpolation
With the four basic organizational processes discussed
thus far—region segmentation, figure-ground organization,