considering possible statistical problems that
may result from the analysis.
T F 9. A model containing the variables E, A, B, C, A^2,
AB, EA, EA^2, EAB, and EC is
hierarchically well formulated.
T F 10. If the variables EA^2 and EAB are found
to be significant during interaction assessment,
then a complete list of all components of these
variables that must remain in any further
models considered consists of E, A, B, EA,
EB, and A^2.
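The hierarchy principle behind the two statements above can be checked mechanically. The following is a minimal sketch, not part of the original test: it assumes each term is written as a tuple of factor names, with a repeated name standing for a power, so ("E", "A", "A") means EA^2. The function names and the example model are hypothetical and deliberately differ from the models in Questions 9 and 10.

```python
from itertools import combinations


def lower_order_components(term):
    """Return every proper, non-empty sub-product of a term.

    A term is a tuple of factor names; a repeated name encodes a power,
    e.g. ("E", "A", "A") stands for E x A^2.
    """
    factors = tuple(sorted(term))
    components = set()
    for size in range(1, len(factors)):
        # combinations() keeps the sorted order, so results are canonical
        components.update(combinations(factors, size))
    return components


def missing_components(model):
    """Return the lower-order components that the model fails to include."""
    terms = {tuple(sorted(t)) for t in model}
    missing = set()
    for term in terms:
        missing |= lower_order_components(term) - terms
    return missing


def is_hierarchically_well_formulated(model):
    """A model is HWF when no lower-order component is missing."""
    return not missing_components(model)


# Hypothetical model (an assumption for illustration): E, A, B, EA, EAB
example = [("E",), ("A",), ("B",), ("E", "A"), ("E", "A", "B")]
print(is_hierarchically_well_formulated(example))
print(sorted(missing_components(example)))
```

For this hypothetical model the check fails because EAB is present while two of its lower-order components, AB and EB, are not. Adding those two terms makes the model hierarchically well formulated.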
The following questions consider the use of logistic
regression on data obtained from a matched case-control
study of cervical cancer in 313 women from Sydney,
Australia (Brock et al., 1988). The outcome variable is
cervical cancer status (1 = present, 0 = absent). The
matching variables are age and socioeconomic status.
Additional independent variables not matched on are
smoking status, number of lifetime sexual partners, and
age at first sexual intercourse. The independent variables
are listed below together with their computer abbreviation
and coding scheme.
Variable                   Abbreviation  Coding
Smoking status             SMK           1 = ever, 0 = never
Number of sexual partners  NS            1 = 4+, 0 = 0–3
Age at first intercourse   AS            1 = 20+, 0 = ≤19
Age of subject             AGE           Category matched
Socioeconomic status       SES           Category matched
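11. Consider the following E, V, W model that considers
the effect of smoking, as the exposure variable, on
cervical cancer status, controlling for the effects of
the other four independent variables listed:
\[
\mathrm{logit}\, P(\mathbf{X}) = \alpha + \beta\,\mathrm{SMK} + \sum_i \gamma_i^{*} V_i^{*} + \gamma_1\,\mathrm{NS} + \gamma_2\,\mathrm{AS} + \gamma_3\,\mathrm{NS} \times \mathrm{AS} + \delta_1\,\mathrm{SMK} \times \mathrm{NS} + \delta_2\,\mathrm{SMK} \times \mathrm{AS} + \delta_3\,\mathrm{SMK} \times \mathrm{NS} \times \mathrm{AS},
\]
where the $V_i^{*}$ are dummy variables indicating
matching strata and the $\gamma_i^{*}$ are the coefficients of
the $V_i^{*}$ variables. Is this model hierarchically well
formulated? If so, explain why; if not, explain why not.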
12. For the model in Question 11, is a test for the
significance of the three-factor product term SMK ×
NS × AS dependent on the coding of SMK? If so,
explain why; if not, explain why not.
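As a concrete aid for working with the Question 11 model, the sketch below shows one way the (0, 1) coding scheme and the product terms could be turned into predictor columns for a single subject. It is an illustration under stated assumptions, not the study's actual data processing: the function names and the raw inputs (a smoking flag, a partner count, an age in years) are hypothetical, and the matching-stratum dummy variables are omitted for brevity.

```python
def code_ns(partners):
    """NS: 1 = 4 or more lifetime sexual partners, 0 = 0-3."""
    return 1 if partners >= 4 else 0


def code_as(age_first):
    """AS: 1 = age 20 or older at first intercourse, 0 = 19 or younger."""
    return 1 if age_first >= 20 else 0


def design_row(ever_smoked, partners, age_first):
    """Build the main-effect and product-term predictors for one subject.

    Stratum dummies (the V* variables) are left out of this sketch.
    """
    smk = 1 if ever_smoked else 0
    ns = code_ns(partners)
    as_ = code_as(age_first)
    return {
        "SMK": smk,
        "NS": ns,
        "AS": as_,
        "NS_AS": ns * as_,             # two-factor product of the covariates
        "SMK_NS": smk * ns,            # exposure-by-covariate products
        "SMK_AS": smk * as_,
        "SMK_NS_AS": smk * ns * as_,   # three-factor product term
    }


# Hypothetical subject: ever smoked, 5 partners, first intercourse at age 21
row = design_row(True, 5, 21)
print(row)
```

With (0, 1) coding, each product term equals 1 only when every factor in it equals 1, so the three-factor term is nonzero only for subjects coded 1 on SMK, NS, and AS.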