Appendix: Corpora and Text Collections

| | Corpus or text collection | Dates | Word count or size (where available) |
|---|---|---|---|
| CLMET3.0 | The corpus of Late Modern English texts 3.0 | 1710–1920 | 9,818,326 words |
| | A corpus of late Modern English prose | 1861–1919 | 100,000 words |
| BNC, BYU-BNC | British national corpus | 1980s–1993 | 100 million words |
| British and American | | | |
| CEN | Corpus of English novels | 1881–1922 | 26,227,428 words |
| American | | | |
| UofV | The Modern English collection, University of Virginia Electronic Text Center [a] | 1500–present | |
| AA | Accessible archives periodicals and newspapers | 18th–19th c. | |
| ARCHER | A representative corpus of historical English registers 3.2, American section | 1750–1999 | c. 1.3 million words |
| COHA | The corpus of historical American English | 1810–2009 | 400 million words |
| TIME | Time magazine corpus | 1920s–2000s | 100 million words |
| COCA | The corpus of contemporary American English | 1990–2015 | 525 million words |
| SOAP | Corpus of American soap operas | 2001–2012 | 100 million words |

a. The majority of texts are American, though earlier texts are British. Unfortunately, as of early
2015, this resource (described as a “heterogeneous collection contain[ing] fiction, non-fiction,
poetry, drama, letters, newspapers, manuscripts and illustrations from 1500 to the present”) is
no longer available. It was formerly accessible online at:
http://etext.lib.virginia.edu/modeng/modeng0.browse.html.