112 Yves Peirsman, Kris Heylen and Dirk Geeraerts
can be equally useful to more theoretically-oriented linguistic research.
Thanks to their fully automatic analysis of the distribution of a word, word
space models are not only able to deal with enormous quantities of data;
they also bypass the need for subjective human judgments and may bring to
light patterns that escape the human eye.
In the next section, we will argue that the time is right for word space
models to be introduced into theoretical and descriptive (socio-)linguistics.
We will present two types of models that are often used in computational
linguistics – the document-based and syntax-based approaches – and show
how they work in practice. In section 3, we will then illustrate our case with
a variational-linguistic study. In particular, we will investigate how the use
of the religion names islam ‘Islam’ and christendom ‘Christianity’ has
changed since the attacks of 11 September 2001. To this end, we study a
large Dutch corpus of about 300 million words, consisting of newspaper
articles published between 1999 and 2002. In section 4, finally, we wrap up with
conclusions and an outlook for future research.
2. Word space models of lexical semantics
2.1. Related work
Despite the fact that word space models are mainly investigated in the field
of computational linguistics, their origin lies in linguistic insights. Throughout
the history of linguistics and language philosophy, a number of researchers
have stressed the dependency, or even the identity, between the meaning of
a word and its use. This view inspired John R. Firth’s (1957) quote that
“you shall know a word by the company it keeps”, Ludwig Wittgenstein’s
“the meaning of a word is its use in the language” (1953, p. 43), and Zellig
Harris’ (1954) insight that semantically similar words are used in similar
contexts – a view which is now often referred to as the distributional
hypothesis. However, in its quest for models of word meaning, theoretical
linguistics has embraced these views much less enthusiastically than the more
applied disciplines.
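
As a minimal illustration of the distributional hypothesis, the following Python sketch counts the words that occur near each target word and compares the resulting context vectors with the cosine measure. The toy corpus, the window size of two words and the function names (context_vectors, cosine) are illustrative assumptions only; they are not the models or the corpus discussed in this chapter.

```python
from collections import Counter
import math


def context_vectors(sentences, window=2):
    """For each word, count the words found within `window` positions of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            counts = vectors.setdefault(target, Counter())
            for j in range(lo, hi):
                if j != i:  # skip the target position itself
                    counts[tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus, invented for this illustration.
toy_corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the ball",
    "the car needs fuel",
]

vectors = context_vectors(toy_corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_fuel = cosine(vectors["cat"], vectors["fuel"])
print(sim_cat_dog, sim_cat_fuel)
```

In this toy setting, cat and dog share most of their context words and therefore receive a higher similarity score than cat and fuel, which share none. Realistic word space models typically add weighting schemes and far larger corpora on top of this basic idea.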
It might be argued, however, that the time has come for such word space
models to find their way to more theoretically-oriented research. Not only
are computational linguists fast gaining insight in the semantic characteris-