Handbook of Psychology, Volume 4: Experimental Psychology


204 Visual Perception of Objects


[Figure 7.24 panels: Geons; Objects]

Figure 7.24 Examples of geons and their presence in objects (see text).
Source: From Biederman, 1995.


Geons


The first important assumption of RBC theory is that both
the stored representations of categories and the representation
of a currently attended object are volumetric structural
descriptions. Recognition-by-components representations are
functional hierarchies whose nodes correspond to a discrete
set of three-dimensional volumes (geons) and whose links to
other nodes correspond to relations among these geons.
Geons are generalized cylinders that have been partitioned
into discrete classes by dividing their inherently continuous
parameters (see below) into a few discrete ranges that are
easy to distinguish from most vantage points. From the
relatively small set of 108 distinct geons, a huge number of object
representations can be constructed by putting together two or
more geons, much as an enormous number of words can be
constructed by putting together a relatively small number of
letters. A few representative geons are illustrated in
Figure 7.24, along with some common objects formed by
combining several of them.
Biederman defined the set of 108 geons by making discrete
distinctions in the following variable dimensions of generalized
cylinders: cross-sectional curvature (straight vs. curved),
symmetry (asymmetrical vs. reflectional symmetry alone vs.
both reflectional and rotational symmetry), axis curvature
(straight vs. curved), cross-sectional size variation (constant
vs. expanding and contracting vs. expanding only), and aspect
ratio of the sweeping axis relative to the largest dimension of
the cross-sectional area (approximately equal vs. axis greater
vs. cross-section greater). The rationale for these particular
distinctions is that, except for aspect ratio, they are qualitative
rather than merely quantitative differences that result in
qualitatively different retinal projections. The image features that
characterize different geons are therefore relatively (but not
completely) insensitive to changes in viewpoint.
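The count of 108 follows from multiplying the number of levels on each of the five dimensions just listed (2 x 3 x 2 x 3 x 3). The short Python sketch below enumerates the combinations to make that arithmetic concrete. The dimension names and levels are taken from the text; the dictionary layout and the function name `enumerate_geons` are illustrative assumptions, not part of RBC theory.

```python
from itertools import product

# Levels for each generalized-cylinder dimension, as listed in the text.
# The identifier spellings are illustrative choices.
GEON_DIMENSIONS = {
    "cross_sectional_curvature": ("straight", "curved"),
    "symmetry": (
        "asymmetrical",
        "reflectional",
        "reflectional_and_rotational",
    ),
    "axis_curvature": ("straight", "curved"),
    "cross_sectional_size_variation": (
        "constant",
        "expanding_and_contracting",
        "expanding_only",
    ),
    "aspect_ratio": (
        "approximately_equal",
        "axis_greater",
        "cross_section_greater",
    ),
}


def enumerate_geons():
    """Return one dict per combination of dimension levels."""
    names = list(GEON_DIMENSIONS)
    return [
        dict(zip(names, levels))
        for levels in product(*GEON_DIMENSIONS.values())
    ]


geons = enumerate_geons()
print(len(geons))  # 2 * 3 * 2 * 3 * 3 = 108
```

Each entry in `geons` is one qualitative class; a continuous generalized cylinder would be assigned to a class by deciding which level it falls into on each dimension.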
Because complex objects are conceived in RBC theory as
configurations of two or more geons in particular spatial
arrangements, they are encoded as structural descriptions that
specify both geons and their spatial relations. It is therefore
possible to construct different object types by arranging the
same geons in different spatial relations, such as the cup and
pail in Figure 7.24. RBC theory uses 108 qualitatively different
geon relations. Some of them concern how geons are attached
(e.g., side-connected and top-connected), whereas
others concern their relational properties, such as relative size
(e.g., larger than and smaller than). With 108 geon relations
and 108 geons, it is logically possible to construct more than
a million different two-geon objects. Adding a third geon
pushes the number of combinations into the billions. Clearly,
geons are capable of generating a rich vocabulary of different
complex shapes. Whether it is sufficient to capture the power
and versatility of visual categorization is a question to which
this discussion returns later.
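The combinatorial claim above can be checked with a few lines of Python. The sketch treats a two-geon object as a triple (first geon, second geon, relation). The three-geon figure assumes three geons joined by two relations, which is one way to arrive at a count in the billions; the text does not spell out that counting scheme. The geon labels used for the cup and pail are hypothetical placeholders, not names from the source.

```python
N_GEONS = 108
N_RELATIONS = 108

# Two-geon objects: choose a first geon, a second geon, and one relation.
two_geon_objects = N_GEONS * N_GEONS * N_RELATIONS

# Assumption: a three-geon object has three geons and two relations
# (for example, geon 1 to geon 2 and geon 2 to geon 3).
three_geon_objects = N_GEONS**3 * N_RELATIONS**2

print(f"{two_geon_objects:,}")    # 1,259,712
print(f"{three_geon_objects:,}")  # 14,693,280,768

# Same geons, different relation: the descriptions differ, so the
# objects differ. The labels "cylinder" and "curved_handle" are
# placeholders chosen for illustration.
cup = ("cylinder", "curved_handle", "side-connected")
pail = ("cylinder", "curved_handle", "top-connected")
print(cup != pail)  # True
```

These are upper bounds on logically possible descriptions, not estimates of how many correspond to real object categories.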
After the shape of an object has been represented via its
component geons and their spatial relations, the problem of
object categorization within RBC theory reduces to the
process of matching the structural description of an incoming
object with the set of structural descriptions for known entry-
level categories. The theory proposes that this process takes
place in several stages. In the original formulation, the overall
flow of information was depicted in the flowchart of
Figure 7.25. (a) An edge extraction process initially produces a
line drawing of the edges present in the visual scene. (b) The
image-based properties needed to identify geons are extracted
from the edge information by detection of nonaccidental
properties. The crucial features are the nature of the edges
(e.g., curved versus straight), the nature of the vertices (e.g.,
Y-vertices, K-vertices, L-vertices, etc.), parallelism (parallel
vs. nonparallel), and symmetry (symmetric vs. asymmetric).
The goal of this process is to provide the feature-based
information required to identify the different kinds of geons (see
Stage d). (c) At the same time as these features are being
extracted, the system attempts to parse objects at regions of
deep concavity, as suggested by Hoffman and Richards
(1984) and discussed in the section of this chapter entitled
“Parsing.” The goal of this parsing process is to divide the
object into component geons without having to match them
explicitly on the basis of edge and vertex features. (d) The
combined results of feature detection (b) and object parsing
(c) are used to activate the appropriate geons and spatial
relations among them. (e) After the geon description of the