Advances in Corpus-based Contrastive Linguistics - Studies in honour of Stig Johansson


62 Rosa Rabadán and Marlén Izquierdo


a number of subcorpora, namely Books, Press (newspapers and magazines) and
Miscellanea (see Table 4).

Table 4. Contents of the P-ACTRES Parallel Corpus: number of words
P-ACTRES                             English     Spanish       Total
Books (Fiction and non-fiction)      890,820     974,132   1,864,952
Press (Newspapers and magazines)     235,106     264,191     499,297
Miscellanea                           40,178      49,026      89,204
Total                              1,166,104   1,287,349   2,453,453

The Corpus de Referencia del Español Actual (CREA) is a large monolingual
reference corpus sponsored by the Real Academia Española de la Lengua. For this
study, the corpus was restricted as follows: the chronological period was
reduced to 2000–2004; the source subcorpora were limited to Books, Newspapers,
Magazines and Miscellaneous (see Table 5); and the geographical variety was
restricted to Spain.

Table 5. Contents of the 2000–04 CREA-derived monolingual corpus: number of words (March 2011)
Registers                                        Size
Libros (Books: fiction and non-fiction)    18,500,104
Prensa (Newspapers and magazines)           8,474,325
Miscelánea (Miscellaneous)                    346,500
Total                                      27,320,929

3.2 Statistics

The quantitative and qualitative information gathered in this study is verified
statistically in order to establish the difference, if any, between the
translated and non-translated use of the items previously identified as
functional resources to convey negation. To ensure the stringency of the results
and the appropriateness of the tests, ‘hypothesis testing for independent
proportions’ is applied.^4
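
As an illustration of what a test for independent proportions involves, the following minimal sketch computes a two-sided z-test with a pooled standard error. It is not the Megastat procedure used in the study, only a common textbook formulation. The function name and the occurrence counts are hypothetical and chosen purely for illustration; only the two corpus sizes are taken from Tables 4 and 5.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two independent proportions.

    x1, x2: number of occurrences of the item in each corpus
    n1, n2: size of each corpus (here, number of words)
    Returns the z statistic and the two-sided p-value.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    # Pooled proportion under the null hypothesis that both proportions are equal
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Corpus sizes from Tables 4 and 5 (Spanish side of P-ACTRES; CREA 2000-04).
    translated_size = 1_287_349
    non_translated_size = 27_320_929
    # HYPOTHETICAL occurrence counts, not figures from the study.
    translated_hits = 1_200
    non_translated_hits = 24_000

    z, p = two_proportion_z_test(
        translated_hits, translated_size, non_translated_hits, non_translated_size
    )
    print(f"z = {z:.3f}, p = {p:.4g}")
```

A small p-value would suggest that the item is used at different rates in translated and non-translated Spanish; the significance threshold is a choice left to the analyst.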


  4. For concepts in inferential statistics, see Lowry (2008). The statistics have been calculated
    using the software Megastat®: http://blue.butler.edu/~orris/megastat/index.html (2 March
    2011).
