During this time, all the connections within and among the structures are available: phonologically, one retains the
order of words; semantically, one retains the relations among the words; and one knows which sounds (e.g. /əbigstar/)
correlate with which part of meaning, in particular with the spatial structure. It will not do to say that one hears or
utters the words in sequence but that one does not retain the whole in memory.^28
In order to grasp a sentence, one need not have previously memorized it: one may spontaneously utter it when asked
to describe a visual configuration, or one may hear someone else utter it. Our ability to spontaneously produce and
perceive it is a consequence of the productive combinatoriality of language. What one must have memorized, though, is
the words the, star, big, little, beside, and a, and the clitic 's, plus the principles for putting them together.
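To make the point concrete, here is a toy sketch of my own (not from the text): a handful of memorized words plus two crude combinatorial rules suffice to generate (23) along with other sentences never stored as wholes. The rule patterns and the Python machinery are illustrative assumptions only.

```python
# Toy illustration (an assumption of this sketch, not the author's model):
# what is stored are the words and the combinatorial principles; the
# sentences themselves are generated, not memorized.
import itertools

lexicon = {
    "Det": ["the", "a"],
    "Adj": ["big", "little"],
    "N":   ["star"],
    "P":   ["beside"],
    "Cop": ["'s"],   # the clitic 's
}

def noun_phrases():
    # NP -> Det Adj N  (one of the patterns English allows)
    for det, adj, n in itertools.product(lexicon["Det"], lexicon["Adj"], lexicon["N"]):
        yield f"{det} {adj} {n}"

def sentences():
    # S -> NP + 's + P + NP  (schematically, "X's beside Y")
    for subj, obj in itertools.product(noun_phrases(), noun_phrases()):
        yield f"{subj}{lexicon['Cop'][0]} {lexicon['P'][0]} {obj}"

for s in sentences():
    print(s)
# Among the sixteen outputs is "the little star's beside a big star",
# the sentence in (23), alongside combinations never memorized as wholes.
```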
The need for combining independent bits into a single coherent percept has been recognized in the theory of vision
under the name of the binding problem (not to be confused with linguists' Binding Theory, mentioned in section 1.7). In
the discussions I have encountered, the binding problem is usually stated this way: we have found that the shape and
the color of an object are encoded in different regions of the brain, and they can be differentially impaired by brain
damage. How is it, then, that we sense a particular shape and color as attributes of the same object? The problem
becomes more pointed in a two-object situation: if the shape region detects a square and a circle, and the color region
detects red and blue, how does the brain encode that one is seeing, say, a red square and a blue circle rather than the
other way around? In fact, under time pressure, subjects can mismatch the features of multiple perceived objects
(Treisman 1988). A proposal that has gained a certain popularity (Gray et al. 1989; Crick and Koch 1990; Singer et al.
1997) is that the different representations are phase-linked: the neurons encoding “red” and “square” fire in synchrony,
and those encoding “blue” and “circle” do as well, but out of phase with the first pair.
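As a rough illustration of what phase-linking would buy, here is a toy sketch of my own devising; the fixed phase values and the grouping tolerance are assumptions, and real proposals involve oscillatory dynamics rather than static labels. The point is only that features whose units fire at (approximately) the same phase can be read off as attributes of the same object.

```python
# Toy sketch of binding by synchrony: feature units that fire in phase are
# grouped as attributes of one object. The phase values and tolerance are
# illustrative assumptions, not measurements or a model from the text.
from collections import defaultdict

unit_phases = {
    "red":    0.00,   # fires at phase ~0 radians
    "square": 0.05,   # nearly synchronous with "red"
    "blue":   3.10,   # roughly half a cycle later
    "circle": 3.15,   # nearly synchronous with "blue"
}

def bind_by_phase(units, tolerance=0.5):
    """Group features whose firing phases fall in the same coarse phase bin."""
    groups = defaultdict(list)
    for feature, phase in units.items():
        groups[round(phase / tolerance)].append(feature)
    return list(groups.values())

print(bind_by_phase(unit_phases))
# [['red', 'square'], ['blue', 'circle']]
# The phase relations, not the units themselves, carry the information that
# redness and squareness belong to the same perceived object.
```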
However, the binding problem presented by linguistic structure is far more massive than this simple case suggests. The
trivially simple sentence (23) has the four independent structures shown in Fig. 1.1. Each structure has multiple parts
that must be correlated; in addition, the four structures must be correlated with each other, as notated by the
subscripts in Fig. 1.1. Consider just the
(^28) Nor is it useful to conceive of understanding the sentence in terms of predicting what word will come next, which is what Elman's (1990) connectionist parser does. One
might well predict that what comes after little in (23) is likely to be a noun (though it might be blue or even very old), but that still leaves open some tens of thousands of
choices. The same is true for every position in the sentence. On the other hand, Pollack (1990) proposes a connectionist parser that does encode an entire hierarchical tree
structure. See n. 21 below for further comments on Elman's parser.