Advances in Corpus-based Contrastive Linguistics - Studies in honour of Stig Johansson

Jarle Ebeling, Signe Oksefjell Ebeling and Hilde Hasselgård


The investigation was performed on the texts that make up the fiction part of
the ENPC, both originals and translations in both languages.^2 Recurrent
three-word combinations with a frequency of at least 10 in each subcorpus were
produced using AntConc, resulting in frequency lists as illustrated in Figure 1.^3

1 i do n’t
2 one of the
3 i did n’t
4 it was a
5 out of the
6 there was a
7 do n’t know
8 he did n’t

1 i do n’t
2 out of the
3 one of the
4 son of thunder
5 in front of
6 i did n’t
7 it was a
8 he did n’t

1 det var en
2 at det var
3 ved siden av
4 det var ikke
5 og det var
6 det er ikke
7 på grunn av
8 men det var

1 at det var
2 det var en
3 ved siden av
4 som om han
5 det er ikke
6 det var som
7 det var ikke
8 av og til

Boxes: English original, Norwegian original, English transl., Norwegian transl.

Figure 1. The structure of the ENPC with n-gram lists^4
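
The extraction step described above (recurrent three-word combinations with a frequency of at least 10 in a subcorpus) can be approximated outside AntConc. The following Python sketch is not AntConc's implementation. It assumes that each text is already tokenized with whitespace between tokens, so that contractions appear split as in Figure 1 (e.g. do n't), and that case is ignored. The names tokenize and recurrent_ngrams are hypothetical.

```python
from collections import Counter


def tokenize(text):
    # Assumption: the text is pre-tokenized, with tokens separated by whitespace.
    # Lowercasing mirrors the lowercase entries in the frequency lists.
    return text.lower().split()


def recurrent_ngrams(texts, n=3, min_freq=10):
    """Return (n-gram, frequency) pairs occurring at least min_freq times.

    texts is an iterable of strings, one per corpus text. N-grams are
    counted within each text, so they never span a text boundary.
    """
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        counts.update(
            tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    # most_common() sorts by descending frequency.
    return [
        (" ".join(gram), freq)
        for gram, freq in counts.most_common()
        if freq >= min_freq
    ]


if __name__ == "__main__":
    # Toy subcorpus; a low threshold is used only because the sample is tiny.
    sample = ["I do n't know . I do n't care .", "It was a day"]
    for gram, freq in recurrent_ngrams(sample, n=3, min_freq=2):
        print(freq, gram)
```

Running such a function once per subcorpus would give four lists comparable in form to those in Figure 1, although the counts may differ from AntConc's depending on how the texts are tokenized.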

Examples of differences that come to light on the basis of the lists shown in
Figure 1 include the English and Norwegian word-combinations given in Table 1.


  2. The fiction part of the ENPC contains approximately 1.6 million words (i.e. approx. 400,000
    words in each part of the corpus, viz. English original, Norwegian original, English translation,
    and Norwegian translation). See further http://www.hf.uio.no/ilos/english/services/omc/enpc/

  3. AntConc is a freeware concordance program developed by Laurence Anthony; see
    http://www.antlab.sci.waseda.ac.jp/antconc_index.html

  4. The main parts of the ENPC are indicated by the four boxes, and the lines between them
    show the main types of studies which are made possible by this structure, e.g. comparing original
    and translated text, which will be our main focus. See e.g. Johansson et al. (1999/2002) and
    Johansson (2007: 11–12) for a fuller description of the bidirectional model.
