Handbook of Psychology, Volume 4: Experimental Psychology

Models of Knowledge Representation

Randomly chosen word pairs tend to have an average cosine
very near zero (M = .02, SD = .06), whereas a sample of 100
singular and plural word pairs (e.g., house, houses) has a
much higher, but not perfect, average cosine (M = .66,
SD = .15). What is computed here is not word overlap or
word co-occurrence, but something entirely new: a semantic
distance in a high-dimensional space that was constructed
from such data.
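A minimal sketch of how such a space might be built and queried, under simplifying assumptions: the three-document corpus is hypothetical, raw word counts are used without the weighting step that is typically applied before the decomposition, and the names build_lsa_space and cosine are illustrative rather than taken from the source. A word-by-document count matrix is reduced by singular value decomposition to k dimensions, and the cosine is then computed between the reduced word vectors.

```python
import numpy as np


def build_lsa_space(docs, k):
    """Return (word -> row index, k-dimensional word vectors) for a toy corpus."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({w for doc in tokenized for w in doc})
    index = {w: i for i, w in enumerate(vocab)}

    # Word-by-document matrix of raw counts (no weighting: a simplification).
    counts = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(tokenized):
        for w in doc:
            counts[index[w], j] += 1

    # Singular value decomposition; keep only the k largest dimensions.
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    word_vectors = u[:, :k] * s[:k]
    return index, word_vectors


def cosine(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0


# Hypothetical toy corpus; a real space is built from millions of words.
docs = [
    "the house has a garden",
    "the houses have gardens and a garden wall",
    "stock prices fell on the market",
]
index, vectors = build_lsa_space(docs, k=2)
similarity = cosine(vectors[index["house"]], vectors[index["houses"]])
```

With a corpus this small the resulting value of similarity is not meaningful; the sketch only shows the sequence of steps (count, decompose, truncate, compare).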
The distinction between measurement of word overlap
and semantic content as measured by LSA is illustrated in the
following example taken from Butcher and Kintsch (2001).
Two students learn a text containing the following statement:
“The phonological loop responds to the phonetic characteristics
of speech but does not evaluate speech for semantic content.”
In a summary, Student A writes, “The rehearsal loop that
practices speech sounds does not pick up meaning in words.
Rather, it just reacts whenever it hears something that sounds
like language.” Student B writes, “The loop that listens to
words does not understand anything about the phonetic
noises that it hears. All it does is listen for noise and then re-
sponds by practicing that noise.” As human comprehenders,
we can see that Student A has a better understanding of the
text and has constructed a more appropriate summary of that
bit of information. Using LSA to compare each student’s
summary with the learned text, we find that Student A’s text
has a cosine of .62 with the original text, whereas Student B’s
text has a cosine of only .40 with the original text. (Only the
relative values of cosines generated for equivalent types of
text can be compared; cosines for word pairs and sentence
pairs, for instance, are not comparable.) Note that this result
is not due to overlapping words in the text and summaries;
Student A repeats two words from the original sentence but
Student B repeats three words from the original sentence.
Using the relative values of the cosines, LSA tells us what we
have concluded by reading the texts: Student A’s summary is
a closer semantic match to the original text than that of Stu-
dent B. The differences between the texts are subtle but clear;
although Student B is not completely confused, his summary
does reflect a less thorough understanding of the original
content than does Student A’s summary. For more detailed
descriptions of LSA, see Landauer (1998), Landauer and
Dumais (1997), and Landauer, Foltz, and Laham (1998).
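The word-overlap side of this comparison can be checked directly. The sketch below counts the distinct content words each summary shares with the original statement. The small stop-word list and the function names are assumptions made for illustration, and no stemming is applied (so listen and listens count as different words). It does not compute the LSA cosines, which require a trained semantic space.

```python
import re

# Assumed stop-word list for illustration; not taken from the source.
STOP_WORDS = {"the", "to", "of", "but", "does", "not", "for", "that",
              "it", "is", "in", "and", "by", "all", "up"}


def content_words(text):
    """Lowercase the text and return its set of non-stop-word tokens."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOP_WORDS


def shared_content_words(original, summary):
    """Distinct content words that appear in both texts (exact match only)."""
    return content_words(original) & content_words(summary)


original = ("The phonological loop responds to the phonetic characteristics "
            "of speech but does not evaluate speech for semantic content.")
student_a = ("The rehearsal loop that practices speech sounds does not pick "
             "up meaning in words. Rather, it just reacts whenever it hears "
             "something that sounds like language.")
student_b = ("The loop that listens to words does not understand anything "
             "about the phonetic noises that it hears. All it does is listen "
             "for noise and then responds by practicing that noise.")

overlap_a = shared_content_words(original, student_a)  # {'loop', 'speech'}
overlap_b = shared_content_words(original, student_b)  # {'loop', 'phonetic', 'responds'}
```

Under these assumptions, Student A shares two content words with the original and Student B three, which matches the counts given in the text even though the cosines rank the two summaries the other way.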
Before examining the achievements of LSA, its limita-
tions must be discussed, for LSA is by no means a complete
semantic theory; rather, it provides a strong basis for building
such a theory. First, LSA disregards syntax, although syntax
obviously plays a role in determining the meaning of sentences.
Second, LSA can learn only from written text, whereas
human experience is also based on perception, action, and
emotion (the real world, not just words). Third, LSA
starts with a tabula rasa, whereas the acquisition of human
knowledge is subject to epigenetic constraints that determine
its very character. Surprisingly, none of these problems is
fatal. Much can be achieved without syntax, and it is possible
to bring syntactic information to bear within the LSA frame-
work, at least to some extent, as we discuss later in this chap-
ter. Words are not all of human knowledge, but language
has evolved to talk about all human affairs—action, percep-
tion, emotion. Thus, words mirror the nonverbal aspects of
human experience—not with complete accuracy, but enough
to make LSA useful. Finally, LSA does not learn from scratch
but from language. Thus its input already incorporates the
epigenetic rules that structure human knowledge.
LSA makes semantic judgments that are humanlike in
many ways, but it can only perform correctly when it has
been trained on an appropriate textual corpus. One of the se-
mantic spaces that has been constructed represents the
knowledge of a typical American high-school graduate: It is
based on a text of more than 11 million words, comprising
over 90,000 different words and over 36,000 documents. It is
a model of what a high-school student would know if all his
or her experience were limited to reading these texts. In one
respect this is not much, but in another it is a considerable
achievement. It will, for instance, pass the TOEFL (Test of
English as a Foreign Language): Given a rare word (like abandoned)
and several alternatives (like forsake, aberration, and
deviance), it will choose the correct one, because forsake has
a higher cosine (.20) with the target word than the other
alternatives (.09 and .09). On the other hand, it will fail an
introductory psychology multiple-choice exam, because the
high-school reading material does not contain enough psy-
chology texts. If we create a new space by teaching LSA psy-
chology with a standard introductory text, however, it will
pass the test: Asked to match attention to the alternatives
memory, selectivity, problem solving, and language, it will
correctly choose selectivity, because the cosine between
attention and selectivity is .52 and the cosines between
attention and the other alternatives are only .17, .05, and .07,
respectively.
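The decision rule in both test examples is the same: pick the alternative whose cosine with the target word is highest. A minimal sketch follows, using the cosines reported above as fixed inputs. The function name choose_alternative is illustrative, and obtaining the cosines themselves would require the trained semantic spaces, which are not reproduced here.

```python
def choose_alternative(cosines):
    """Return the alternative with the highest cosine to the target word."""
    return max(cosines, key=cosines.get)


# Cosines with the target word "abandoned", as reported in the text.
# The text gives .09 for both distractors.
toefl_item = {"forsake": 0.20, "aberration": 0.09, "deviance": 0.09}

# Cosines with the target word "attention" in the psychology space.
psychology_item = {
    "memory": 0.17,
    "selectivity": 0.52,
    "problem solving": 0.05,
    "language": 0.07,
}

toefl_answer = choose_alternative(toefl_item)
psychology_answer = choose_alternative(psychology_item)
```

Here toefl_answer is forsake and psychology_answer is selectivity, the choices described in the text.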
LSA is a powerful tool for the simulation of psycholin-
guistic phenomena. Landauer and Dumais (1997) have
discussed vocabulary acquisition as the construction of a
semantic space, modeled by LSA; Laham (2000) investi-
gated the emergence of natural categories from the LSA
space; Foltz, Kintsch, and Landauer (1998) have used LSA to
analyze textual coherence; and Butcher and Kintsch (2001)
have used LSA as an analytic tool in the study of writing.
LSA has also been used effectively in a number of applica-
tions that depend on an effective representation of verbal
meaning. To mention just some of the practical applications,