Advances in Corpus-based Contrastive Linguistics - Studies in honour of Stig Johansson


182 Jarle Ebeling, Signe Oksefjell Ebeling and Hilde Hasselgård


as one word ihvertfall.^6 When processing English data, decisions must be made
regarding contracted forms, so that e.g. can’t and don’t are treated in a consistent
manner, either as one or two words. (In the present study they are treated as two.)
Similar challenges arise if the corpus texts contain both standard (going, going
to) and non-standard spellings (goin’, gonna). Finally, it is worth noting that the
search for 3-word combinations, selected relatively randomly for this exploratory
study, ignores potentially interesting 2-grams which may correspond to a frequent
3-gram in the other language. An example would be at all, which we will return to
in one of the case studies (cf. Section 5). 4-grams, on the other hand, can be
identified on the basis of 3-word combinations that are embedded in them.
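
The tokenisation decision and the 3-gram/4-gram relationship described above can be illustrated with a minimal sketch. It is not the procedure used in the study: the splitting rule (can’t as ca + n't), the lower-casing, the dropping of punctuation and the function names tokenize, ngrams and containing_4grams are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed rule: contractions are split into two tokens (can't -> ca + n't, it's -> it + 's).
# Punctuation is simply dropped, which is a simplification.
TOKEN = re.compile(r"n't|'\w+|\w+(?=n't)|\w+")

def tokenize(text):
    """Lower-case the text and split contracted forms into two tokens."""
    return TOKEN.findall(text.lower().replace("’", "'"))

def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def containing_4grams(tokens, trigram):
    """Count the 4-grams in which a given 3-gram is embedded."""
    return Counter(g for g in ngrams(tokens, 4)
                   if g[:3] == trigram or g[1:] == trigram)

tokens = tokenize("We don’t know at all")
trigram_counts = Counter(ngrams(tokens, 3))
larger_units = containing_4grams(tokens, ("do", "n't", "know"))
```

Treating contractions as one word instead would only require a different token pattern; the counting steps stay the same.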
Since ours is a study of potentially “phraseologically interesting units”, a careful
reading of the cotext in which the units occur is important to exclude contexts
in which the n-gram cannot be argued to be a phrase in the way defined here. As
noted repeatedly by other scholars, it is often the case that the n-gram selected for
further investigation is actually part of a larger unit (cf. e.g. De Cock 2004, Stubbs
& Barth 2003, Stubbs 2007).


  1. Case study no. 1 – PREP det ADJ


The point of departure for the first case study was the observation that the
Norwegian part of the corpus contained a number of word-combinations of the
form PREP det ADJ, where PREP is a single preposition, most frequently i (‘in’) or
på (‘on’) and det is regarded as a definite article.^7 The ADJ part of the combination
can sometimes consist of two adjectives coordinated with og (‘and’), as in i det vide
og det brede (‘infinitely’, lit. ‘in the wide and the broad’), or be expanded in other
ways, as in i det meste laget (‘too much’, lit. ‘in the most degree’). Quite a few of
the single adjectives making up ADJ are in the superlative, as in på det bestemteste
(‘in no uncertain terms’, lit. ‘on the certainest’). The whole combination, PREP det
ADJ, functions as an adverbial on the clause level. Other examples from the corpus
include i det lengste (‘as long as possible’), med det samme (‘right-/straightaway’)
and på det rene (‘as a matter of fact’).
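
As a rough illustration of how candidates for this pattern might be pulled from a tokenised text, the sketch below keeps every 3-gram whose first token is one of the prepositions mentioned above and whose second token is det. The preposition set is an assumed, incomplete list, the function name prep_det_candidates is hypothetical, and the third slot is left unchecked, so the output would still need manual inspection to confirm that det is an article and that the last word is an adjective.

```python
from collections import Counter

# Assumed, incomplete list: only the prepositions mentioned in the text.
PREPS = {"i", "på", "med"}

def prep_det_candidates(tokens):
    """Count 3-grams of the form <listed preposition> det <any word>."""
    grams = zip(tokens, tokens[1:], tokens[2:])
    return Counter(g for g in grams if g[0] in PREPS and g[1] == "det")

# Constructed example sentence, not taken from the corpus.
tokens = "han sa det med det samme og på det rene".split()
candidates = prep_det_candidates(tokens)
```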
Combinations with a comparable form, e.g. on the whole, did not occur with the
same frequency in the lists based on the English part of the corpus. It is
therefore interesting to see what the English correspondences of the Norwegian


  6. Since Norwegian has two written standards it is also important to include all variant
    spellings of words in both standards in the counts.

  7. “Foranstilt bestemt artikkel” (‘fronted definite article’), Bokmålsordboka.
