Paul and Pseudepigraphy (Pauline Studies, Book 8)

118 Andrew W. Pitts


addressed within the writings (e.g., abortion policy), this will increase the accuracy of authorship discrimination, but still not beyond all doubt.15 In the cases where an author’s sample includes writings on varying topics, the only way of increasing accuracy for authorship discrimination is by substantial increases in corpus sizes for a given author, far beyond what we have available for any of the authors of the New Testament.16 While a control group of possible authors’ works is readily available, a representative sample of Pauline material is difficult to compile, especially given the diversity of register and disputation of authenticity for so many of these letters.17


These observations apply directly to comparative studies of multiple works of the same author. If we do not configure register considerations in analyzing style variation within the same author, we may get false negatives due to “differences in audience or register or time.”18 Something as simple as the age of the author or time of composition for the text may have a radical impact on language variation.19 Numerous field studies


15 George Tambouratzis and Marina Vassiliou, “Employing Thematic Variables for
Enhancing Classification Accuracy within Author Discrimination Experiments,” LLC 22
(2007): 207–24; Lisa Pearl and Mark Steyvers, “Detecting Authorship Deception: A
Supervised Machine Learning Approach Using Author Writeprints,” LLC 27 (2012): 183–96.
16 David L. Hoover, “Statistical Stylistics and Authorship Attribution: An Empirical
Investigation,” LLC 16 (2001): 421–44; Tambouratzis and Vassiliou, “Employing Thematic
Variables,” 222; cf. Pearl and Steyvers, “Detecting Authorship Deception,” 184. See
especially O’Donnell’s study, which notes the size of corpus for various types of analysis,
ranging from tense form distribution (2,000) to word order (20,000) to vocabulary (20 million)
(Matthew Brook O’Donnell, “Designing and Compiling a Register-Balanced Corpus of
Hellenistic Greek for the Purpose of Linguistic Description and Investigation,” in Porter [ed.],
Diglossia, 265). While this corpus size of the seven undisputed letters is well within range
for tense form distribution, it barely constitutes the minimum to do word order
assessment and does not even begin to approach the minimum requirement for analyzing an
author’s vocabulary.
17 Grieve suggests that author discrimination studies require (1) a representative
sample of the author’s work and (2) a representative sample of possible authors’ works. For
these studies to be accurate, the representative sample of the author needs to address
the same addressees and emerge from within very similar registers at other levels as well
(Jack Grieve, “Quantitative Authorship Attribution: An Evaluation of Techniques,” LLC 22


18 Grieve, “Quantitative Authorship Attribution,” 255.
19 Richard S. Forsyth, “Stylochronometry with Substrings, or: A Poet Young and Old,”
LLC 14 (1999): 467–77; Constantina Stamou, “Stylochronometry: Stylistic Development,
Sequence of Composition, and Relative Dating,” LLC 23 (2008): 181–99, esp. 182–83. This
occurs due to the evolution of language in an individual author’s mental lexicon over time,
where the author continues to create new semantic relations, expansions, and changes in
his or her lexical stock due to varying social contexts for linguistic activity. So it becomes
exceedingly difficult to restrict an author to a particular set of vocabulary or a particular
