Handbook of Psychology, Volume 4: Experimental Psychology

Theories of Object Identification 205

input object is constructed, it automatically causes the activa-
tion of similar geon descriptions stored in memory. This
matching process is accomplished by activation spreading
through a network from geon nodes and relation nodes
present in the representation of the target object to similar
geon nodes and relation nodes in the category representa-
tions. This comparison is a fully parallel process, matching
the geon description of the input object against all category
representations at once and using all geons and relations at
once; (f) Finally, object identification occurs when the target
object is classified as an instance of the entry-level category
that is most strongly activated by the comparison process,
provided it exceeds some threshold value.
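
The final matching and thresholding step can be illustrated with a small sketch. It is a toy under stated assumptions, not Biederman's (1987) network: geon descriptions are reduced to sets of labels, activation is computed as set overlap (Jaccard similarity) rather than by spreading activation, and the category contents, the names `activation` and `identify`, and the threshold of 0.5 are all hypothetical.

```python
def activation(input_desc, category_desc):
    """Overlap between two sets of geon/relation labels (an assumed measure)."""
    if not input_desc and not category_desc:
        return 0.0
    return len(input_desc & category_desc) / len(input_desc | category_desc)


def identify(input_desc, categories, threshold=0.5):
    """Return the most strongly activated category, or None if it does not
    exceed the threshold."""
    # Conceptually parallel: every category is scored against the whole input.
    activations = {
        name: activation(input_desc, desc) for name, desc in categories.items()
    }
    best = max(activations, key=activations.get)
    return best if activations[best] > threshold else None


# Hypothetical category representations: geon labels plus relation labels.
CATEGORIES = {
    "mug": frozenset({"cylinder", "curved_handle", "handle_side_of_cylinder"}),
    "pail": frozenset({"cylinder", "curved_handle", "handle_top_of_cylinder"}),
    "lamp": frozenset({"cone", "cylinder", "cone_top_of_cylinder"}),
}

target = frozenset({"cylinder", "curved_handle", "handle_side_of_cylinder"})
degraded = frozenset({"cylinder"})  # weak input: only one geon recovered

print(identify(target, CATEGORIES))    # prints mug
print(identify(degraded, CATEGORIES))  # prints None
```

In this sketch the degraded input activates all three categories equally and weakly, so no category exceeds the threshold and no identification is made.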
Although the flow of information within RBC
theory is generally bottom-up, the theory also allows for top-down
processing. If sensory information is weak (e.g., noisy, brief,
or otherwise degraded images), top-down effects are likely to
occur. There are two points in RBC at which they are most
likely to happen: feedback from geons to geon features and
feedback from category representations to geons. Contextual
effects could also occur through feedback from prior or con-
current object identification to the nodes of related sets of
objects, although this level of processing was not actually
represented in Biederman’s (1987) model.


View-Specific Theories

In many ways, the starting point for view-specific theories of
object identification is the existence of the perspective effects
described in the section of this chapter entitled “Perspec-
tive Effects.” The fact that recognition and categorization
performance is not invariant over different views (e.g.,
Palmer et al., 1981) raises the possibility that objects might
be identified by matching two-dimensional input images di-
rectly to some kind of view-specific category representation.
It cannot be done with a single, specific view (such as one
canonical perspective) because there is simply not enough in-
formation in any single view to identify other views. A more
realistic possibility is that there might be multiple two-
dimensional representations from several different view-
points that can be employed in recognizing objects. These
multiple views are likely to be those perspectives from which
the object has been seen most often in past experience. As
mentioned in this chapter’s section entitled “Orientation Ef-
fects,” evidence supporting this possibility has come from a
series of experiments that studied the identification of two-
dimensional figures at different orientations in the frontal
plane (Tarr & Pinker, 1989) and of three-dimensional figures
at different perspectives (Bülthoff & Edelman, 1992;
Edelman & Bülthoff, 1992).
Several theories of object identification incorporate some
degree of view specificity. One is Koenderink and Van
Doorn’s (1979) aspect graph theory, which is a well-defined
elaboration of Minsky’s (1975) frame theory of object perception.
An aspect graph is a network of representations
containing all topologically distinct two-dimensional views
(or aspects) of the same object. Its major problem is that it
cannot distinguish among different objects that have the same
edge topology. For example, all tetrahedrons are equivalent within aspect
graph theory, despite large metric differences
that are easily distinguished perceptually. This means that
there is more information available to the visual system than
is captured by edge topology, a conclusion that led to later
theories in which projective geometry plays an important role
in matching input views to object representations.
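
The tetrahedron example can be made concrete with a short sketch. It is a simplification, not an aspect graph construction: it compares the edge connectivity of the three-dimensional objects themselves rather than of their two-dimensional views, and it uses sorted vertex degrees as a crude stand-in for edge topology. The shapes and the function names are hypothetical.

```python
from itertools import combinations
import math


def edge_topology(edges):
    """Sorted vertex degrees: a crude, assumed signature of edge topology."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return sorted(degree.values())


def edge_lengths(vertices, edges):
    """Sorted edge lengths: the metric information that topology ignores."""
    return sorted(round(math.dist(vertices[a], vertices[b]), 3) for a, b in edges)


# In any tetrahedron, every pair of the four vertices is joined by an edge.
TETRA_EDGES = list(combinations(range(4), 2))

squat = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
tall = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 5)]

# Both shapes share one edge list, so their topological signatures are identical.
print(edge_topology(TETRA_EDGES))
# Their metric descriptions differ, which is what a topology-only code cannot express.
print(edge_lengths(squat, TETRA_EDGES) == edge_lengths(tall, TETRA_EDGES))
```

A representation that stores only the first quantity treats the squat and the tall tetrahedron as the same object, whereas the second quantity separates them.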
One approach was to match incoming two-dimensional
images to internal three-dimensional models by an align-
ment process (e.g., Huttenlocher & Ullman, 1987; Lowe,
1985; Ullman, 1989). Another was to match incoming two-
dimensional images directly against stored two-dimensional
views, much as template theories advocate (e.g., Poggio &
Edelman, 1990; Ullman, 1996; Ullman & Basri, 1991). The
latter, exclusively two-dimensional approach has the same
problem that plagues template theories of recognition: An
indefinitely large number of views would have to be stored.
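
A minimal sketch of matching an input image directly against stored two-dimensional views follows. It is a nearest-stored-view matcher written for illustration, not the model of any of the authors cited above. It assumes orthographic projection, rotation about the vertical axis only, and known point-to-point correspondence between views; the two objects and all function names are hypothetical.

```python
import math


def project(points, angle_deg):
    """Rotate 3-D points about the vertical (y) axis, then drop depth."""
    t = math.radians(angle_deg)
    return [(x * math.cos(t) + z * math.sin(t), y) for x, y, z in points]


def view_distance(view_a, view_b):
    """Sum of squared distances between corresponding image points."""
    return sum(
        (ax - bx) ** 2 + (ay - by) ** 2
        for (ax, ay), (bx, by) in zip(view_a, view_b)
    )


def best_match(input_view, stored_views):
    """Return the (label, angle) key of the closest stored view."""
    (label, angle), _ = min(
        stored_views.items(),
        key=lambda item: view_distance(input_view, item[1]),
    )
    return label, angle


# Two hypothetical three-point objects that differ only in depth.
OBJECT_A = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
OBJECT_B = [(1, 0, 0), (0, 1, 0), (0, 0, -1)]

# Each object is stored as a small set of views rather than as a 3-D model.
STORED = {
    (label, angle): project(obj, angle)
    for label, obj in (("A", OBJECT_A), ("B", OBJECT_B))
    for angle in (30, 60)
}

novel = project(OBJECT_A, 45)  # a viewpoint that was never stored
print(best_match(novel, STORED)[0])  # prints A
```

The size of `STORED` is the number of objects times the number of sampled viewpoints, so covering more viewpoints (or more rotation axes) at finer sampling makes the store grow accordingly, which is the storage problem noted above.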

Figure 7.25 Processing stages in RBC theory, including the detection of
nonaccidental properties (see text). Source: From Biederman, 1987.
