Logistic Regression: A Self-learning Text, Third Edition (Statistics in the Health Sciences)


Bonferroni approach:
To achieve α* ≤ α_0, set α = α_0/T


EXAMPLE
e.g., α_0 = 0.05, T = 10:
α = 0.05/10 = 0.005
⇓
α* = 1 - (1 - 0.005)^10
= 0.049 ≤ α_0 = 0.05

Problem with Bonferroni:


Over-adjusts: does not reject
enough; low power (model may
be underfitted)


Bonferroni-type alternatives
available to:


- Increase power
- Allow for nonindependent tests


Another approach:


Replaces FWER with
False Discovery Rate (FDR) = T_0/T,
where


T_0 = no. of tests incorrectly
rejected, i.e., H_0i true
T = total no. of tests


Criticisms of multiple testing:


(1) Assuming universal H_0 (all
H_0i true) is unrealistic


(2) Paying a “penalty for peeking”
reduces importance of specific
tests of interest


(3) Where do you stop correcting
for multiple-testing?


A popular (Bonferroni) approach for ensuring
that α* never exceeds a desired FWER of, say,
α_0 is to require the significance level (α) for
each test to be α_0/T. To illustrate, if α_0 = 0.05
and T = 10, then α = 0.005, and α* calculates
to 0.049, close to 0.05.
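
The arithmetic in this illustration can be checked with a short Python sketch. It assumes, as the formula α* = 1 - (1 - α)^T does, that the T tests are independent; the function names are illustrative and not taken from the text.

```python
def bonferroni_alpha(alpha_0: float, n_tests: int) -> float:
    """Per-test significance level under the Bonferroni rule: alpha_0 / T."""
    return alpha_0 / n_tests


def family_wise_error_rate(alpha: float, n_tests: int) -> float:
    """FWER for n_tests independent tests, each carried out at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


alpha_0, T = 0.05, 10

alpha = bonferroni_alpha(alpha_0, T)            # per-test level
alpha_star = family_wise_error_rate(alpha, T)   # FWER after adjustment
unadjusted = family_wise_error_rate(alpha_0, T) # FWER if every test used alpha_0

print(f"alpha = {alpha:.3f}, alpha* = {alpha_star:.3f}")  # alpha = 0.005, alpha* = 0.049
print(f"FWER without adjustment = {unadjusted:.3f}")      # 0.401
```

For comparison, the last line shows that running all 10 tests at the unadjusted 0.05 level would give a family-wise error rate of about 0.40 under the same independence assumption.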

A problem, however, with using the Bonferroni
approach is that it “over-adjusts” by making it
more difficult to reject any given H_0i; that is, its
“power” to reject null hypotheses that are
actually false is typically too low.
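
To give a rough sense of the power loss, the sketch below compares the power of a single two-sided z test at the unadjusted level 0.05 and at the Bonferroni level 0.05/10. The z test, the standardized effect of 2.5, and the function name are assumptions chosen only for illustration; they do not come from the text.

```python
from statistics import NormalDist


def z_test_power(effect_z: float, alpha: float) -> float:
    """Power of a two-sided z test whose statistic is N(effect_z, 1)
    under the alternative hypothesis."""
    std = NormalDist()
    crit = std.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    # Probability that |Z| exceeds the critical value under the alternative
    return (1.0 - std.cdf(crit - effect_z)) + std.cdf(-crit - effect_z)


# Hypothetical standardized effect; T = 10 tests as in the example above
power_unadjusted = z_test_power(effect_z=2.5, alpha=0.05)
power_bonferroni = z_test_power(effect_z=2.5, alpha=0.05 / 10)

print(f"power at alpha = 0.05:  {power_unadjusted:.2f}")
print(f"power at alpha = 0.005: {power_bonferroni:.2f}")
```

Under these assumed values, the power drops from roughly 0.7 to below 0.4 once the per-test level is divided by 10.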

Alternative formulae for adjusting for multiple
testing (e.g., Sidak, 1967; Holm, 1979; Hochberg,
1988) have been offered to provide increased
power and to allow for nonindependent
significance tests.
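
As a minimal sketch of two of these alternatives, the code below computes the Sidak per-test level and applies Holm's step-down rule, comparing the latter with plain Bonferroni. The p-values are hypothetical and the function names are illustrative; the text itself does not spell out these procedures, so the code follows their standard textbook statements.

```python
def sidak_alpha(alpha_0: float, n_tests: int) -> float:
    """Sidak per-test level: solves 1 - (1 - alpha)^T = alpha_0 for alpha
    (exact for independent tests)."""
    return 1.0 - (1.0 - alpha_0) ** (1.0 / n_tests)


def holm_reject(p_values, alpha_0=0.05):
    """Holm step-down: compare the k-th smallest p-value with
    alpha_0 / (T - k + 1) and stop at the first non-rejection."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    reject = [False] * n
    for rank, idx in enumerate(order):  # rank = 0 for the smallest p-value
        if p_values[idx] <= alpha_0 / (n - rank):
            reject[idx] = True
        else:
            break  # all larger p-values are also not rejected
    return reject


# Hypothetical p-values from T = 5 tests
p_values = [0.001, 0.008, 0.012, 0.030, 0.200]

bonferroni = [p <= 0.05 / len(p_values) for p in p_values]
holm = holm_reject(p_values, alpha_0=0.05)

print("Sidak per-test level, T = 10:", round(sidak_alpha(0.05, 10), 5))
print("Bonferroni rejections:", bonferroni)  # [True, True, False, False, False]
print("Holm rejections:      ", holm)        # [True, True, True, False, False]
```

With these p-values Holm rejects one more hypothesis than Bonferroni, which is the kind of power gain the alternatives aim for.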

Moreover, another adjustment approach
(Benjamini and Hochberg, 1995) changes the
“overall” goal of adjustment from obtaining a
desired “family-wise error rate” (FWER) to
obtaining a desired “false discovery rate”
(FDR), which is defined as the proportion of
significance tests that incorrectly reject the
null (i.e., true Type 1 errors).
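
A minimal sketch of the Benjamini and Hochberg step-up rule follows, written from its standard statement rather than from this text. In that statement the procedure controls the expected proportion of false rejections among the tests that are rejected, at a chosen level q. The p-values, the level q = 0.05, and the function name are hypothetical.

```python
def benjamini_hochberg_reject(p_values, q=0.05):
    """Benjamini-Hochberg step-up: find the largest rank k with
    p_(k) <= (k / T) * q and reject the k smallest p-values."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):  # rank is 1-based
        if p_values[idx] <= rank / n * q:
            cutoff = rank  # keep the largest rank that passes
    reject = [False] * n
    for idx in order[:cutoff]:
        reject[idx] = True
    return reject


# Hypothetical p-values from T = 5 tests
p_values = [0.001, 0.008, 0.012, 0.030, 0.200]

print(benjamini_hochberg_reject(p_values, q=0.05))
# [True, True, True, True, False]
```

For these p-values a Bonferroni cutoff of 0.05/5 = 0.01 would reject only the first two tests, so the FDR criterion is the less conservative of the two here.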

Nevertheless, there remains some controversy
in the methodologic literature (Rothman,
1990) as to whether any attempt to correct for
multiple testing is even warranted. Criticisms
of “adjustment” include: (1) the assumption of a
“universal” null hypothesis that all H_0i are
true is unrealistic; (2) paying a “penalty
for peeking” (Light and Pillemer, 1984) reduces
the importance of specific contrasts of interest;
(3) where does the need for adjustment stop
when considering all the tests that an
individual researcher performs?

Presentation: VI. Multiple Testing 281