The Evolution of Pragmatic Markers in English: Pathways of Change

Appendix: Corpora and Text Collections

Corpus or text collection                                     Dates           Word count or size (where available)

CLMET3.0       The corpus of Late Modern English texts 3.0    1710–1920       9,818,326 words
               A corpus of late Modern English prose          1861–1919       100,000 words
BNC, BYU-BNC   British national corpus                        1980s–1993      100 million words

British and American
CEN            Corpus of English novels                       1881–1922       26,227,428 words

American
UofV           The Modern English collection, University of   1500–present
               Virginia Electronic Text Center [a]
AA             Accessible archives periodicals and newspapers 18th–19th c.
ARCHER         A representative corpus of historical English  1750–1999       c. 1.3 million words
               registers 3.2 – American section
COHA           The corpus of historical American English      1810–2009       400 million words
TIME           Time magazine corpus                           1920s–2000s     100 million words
COCA           The corpus of contemporary American English    1990–2015       525 million words
SOAP           Corpus of American soap operas                 2001–2012       100 million words

a. The majority of texts are American, though earlier texts are British. Unfortunately, as of early
2015, this resource (described as a “heterogeneous collection contain[ing] fiction, non-fiction,
poetry, drama, letters, newspapers, manuscripts and illustrations from 1500 to the present”) is
no longer available. It was formerly accessible online at: http://etext.lib.virginia.edu/modeng/modeng0.browse.html.
