72 Rosa Rabadán and Marlén Izquierdo
- Control data: The CREA corpus
In selecting the querying elements for CREA in order to establish whether there
are significant differences between translated and original data, two important
considerations need to be borne in mind: (i) this study aims to analyse the
distribution of formally diverse resources (e.g. prepositional phrases, clausal
negatives), not particular lexical items; (ii) when searching for parts of items or
combinations of items, some (apparently obvious) searches are inefficient, because
the querying capabilities of CREA do not match those of P-ACTRES exactly.
Hence, depending on the nature of the input resource, one of two different
strategies was employed: (i) to search the CREA 10,000-item frequency list for
the ten most frequent occurrences of one particular resource in non-translated
Spanish,^5 e.g. affixed negative items, and use them as querying inputs; (ii) to use
P-ACTRES findings as input query in CREA. This second strategy is employed
when the first one is either not possible or simply inefficient. For example, searching
for the negative pattern sin + N in CREA is out of the question, but using the
top ten sin + N combinations yielded by P-ACTRES and running them against the
CREA frequency list results in a far more robust set of querying items. However,
for some types of search, due to the degree of lexicalization and/or grammaticalization
in Spanish, it is advisable to confine the search in CREA to the ten
most frequent P-ACTRES findings, as is the case with No + (positive) lexical item.
The frequency list strategy has been applied to affixal, lexical and clausal negation;
the top ten diagnostic findings, in one of the two variants of that strategy, have
served as querying input for the rest.
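The selection step common to both strategies can be sketched as follows. This is a minimal illustration only, not the procedure actually run by the authors: the function name `top_n_by_frequency` and the in-memory list of (form, frequency) pairs are assumptions made for the example, and the candidate forms would in practice come either from manual inspection of the CREA frequency list (strategy i) or from P-ACTRES findings (strategy ii). The absolute frequencies are those reported in Table 8.

```python
def top_n_by_frequency(freq_list, candidates, n=10):
    """Return the n most frequent candidate forms present in freq_list.

    freq_list  : iterable of (form, absolute_frequency) pairs
    candidates : forms judged to realise the resource under study
                 (hypothetical input; chosen by the analyst)
    """
    wanted = set(candidates)
    hits = [(form, freq) for form, freq in freq_list if form in wanted]
    # Highest absolute frequency first
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits[:n]


# Absolute frequencies as reported in Table 8
crea_rows = [
    ("imposible", 14178),
    ("desconocido", 4399),
    ("imprescindible", 4418),
]
candidates = ["imposible", "desconocido", "imprescindible"]

# n=2 only to show the cut-off on this three-row sample
top_two = top_n_by_frequency(crea_rows, candidates, n=2)
print(top_two)
```

With the study's settings, `n` would stay at its default of 10 and the resulting forms would serve as the CREA querying inputs.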
Affixal negation. The CREA 10,000-item frequency list was searched for the ten
most frequent affixal negative items in Spanish (see Table 8). The search yielded
a population (N) of 5,388 occurrences, which constitute the raw figures of our
control data (see Table 9).
Table 8. CREA querying items for affixal negation^6
CREA order   Item             Absolute freq.   Relative freq.
-            imposible        14,178           92.93
-            desconocido      4,399            28.83
-            imprescindible   4,418            28.95
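The relation between the two frequency columns can be checked with a short calculation. The source does not state the normalization base; the sketch below assumes that the relative frequency is a rate per million tokens, and the helper name `implied_corpus_size` is introduced here purely for illustration. Under that assumption, each row of Table 8 should imply roughly the same corpus size.

```python
# (form, absolute frequency, relative frequency) as reported in Table 8
rows = [
    ("imposible", 14178, 92.93),
    ("desconocido", 4399, 28.83),
    ("imprescindible", 4418, 28.95),
]


def implied_corpus_size(abs_freq, rel_freq, per=1_000_000):
    """Corpus size implied by a count and its rate per `per` tokens.

    Assumes rel_freq = abs_freq / corpus_size * per (an assumption,
    not something stated in the source).
    """
    return abs_freq / rel_freq * per


sizes = [implied_corpus_size(a, r) for _, a, r in rows]
for (form, _, _), size in zip(rows, sizes):
    print(f"{form}: about {size / 1e6:.1f} million tokens")
```

All three rows point to a figure of roughly 152.6 million tokens, which is consistent with a per-million normalization, although the small differences between rows reflect rounding in the published relative frequencies.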
^5 http://corpus.rae.es/frec/10000_formas.TXT
^6 The standard English equivalents, listed in the same order, are: impossible, unknown,
imperative, useless, unable, essential, illegal, incredible, unconscious, invisible.