Introductory Biostatistics


As with the z test above, this multiple contribution procedure is very useful
for assessing the importance of potential explanatory variables. In particular, it
is often used to test whether a group of similar variables, such as demographic
characteristics, is important for the prediction of the response; these variables
have some trait in common. Another application would be a collection of
powers and/or product terms (referred to as interaction variables). It is often of
interest to assess the interaction effects collectively before trying to consider
individual interaction terms in a model as suggested previously. In fact, such
use reduces the total number of tests to be performed, and this, in turn, helps to
provide better control of overall type I error rates, which may be inflated due to
multiple testing.
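
The inflation of the overall type I error rate can be illustrated with a small calculation. The sketch below assumes that the tests are independent and that each is carried out at the same level alpha; these are simplifying assumptions for illustration only, and the function name is hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one type I error across n_tests tests.

    Assumes the tests are independent and each uses level alpha
    (an illustrative simplification, not a claim about any data set).
    """
    return 1.0 - (1.0 - alpha) ** n_tests


# One collective test versus seven separate tests, each at the 0.05 level
for m in (1, 7):
    print(m, round(familywise_error_rate(0.05, m), 4))
```

Under these assumptions, seven separate tests at the 0.05 level carry roughly a 30% chance of at least one false positive, whereas a single collective test stays at 5%.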


Example 10.11 Refer to the data set on skin cancer of Example 10.10 (Table
10.5) with all eight covariates; we consider collectively the seven dummy
variables representing age. The basic idea is to see if there are any
differences without drawing seven separate conclusions comparing each age group
versus the baseline.



  1. With all eight variables included, we obtained ln L = 7201.864.

  2. When the seven age variables were deleted, we obtained ln L = 5921.076.


Therefore,


\[
\chi^2_{LR} = 2\,[\ln L(\hat{\beta};\ \text{eight variables}) - \ln L(\hat{\beta};\ \text{only location variable})] = 2561.568
\]
with 7 df; p-value < 0.0001.

In other words, the difference between the age groups is highly significant; in
fact, it is more so than the difference between the cities.
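
The arithmetic of Example 10.11 can be checked with a short sketch. It uses only the two log-likelihood values quoted in the example; the helper `chi2_sf` is a hypothetical name, and it evaluates the chi-square upper-tail probability for integer degrees of freedom from the standard recurrence for the incomplete gamma function rather than from any particular statistics package.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) times a finite sum of y**i / i!
        term = 1.0
        total = 1.0
        for i in range(1, df // 2):
            term *= y / i
            total += term
        return math.exp(-y) * total
    # Odd df: start from erfc(sqrt(y)) (the df = 1 case) and add
    # one term of the recurrence for each extra 2 degrees of freedom
    total = 0.0
    a = 0.5
    while a < df / 2.0:
        total += y ** a / math.gamma(a + 1.0)
        a += 1.0
    return math.erfc(math.sqrt(y)) + math.exp(-y) * total


# Log-likelihoods quoted in Example 10.11
lnL_full = 7201.864      # all eight variables
lnL_reduced = 5921.076   # seven age variables deleted
df = 7                   # number of variables deleted

lr_stat = 2.0 * (lnL_full - lnL_reduced)
p_value = chi2_sf(lr_stat, df)
print(f"LR chi-square = {lr_stat:.3f}, df = {df}, p < 0.0001: {p_value < 0.0001}")
```

Computed from the rounded log-likelihoods, the statistic comes to about 2561.576, which differs from the printed 2561.568 only in the third decimal place, presumably because the printed log-likelihoods are themselves rounded. The p-value is far below 0.0001 (it underflows to zero in double precision).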


Main Effects  The z tests for single variables are sufficient for investigating the
effects of continuous and binary covariates. For categorical factors with several
categories, such as the age group in the skin cancer data of Example 10.10, this
process in PROC GENMOD would choose a baseline category and compare
each other category with the baseline category chosen. However, the
importance of the main effects is usually of interest (i.e., one statistical test for each
covariate, not each category of a covariate). This can be achieved using PROC
GENMOD in two different ways: (1) treating the several category-specific
effects as a group as seen in Example 10.11 (this would require two separate
computer runs), or (2) requesting the type 3 analysis option as shown in the
following example.


Example 10.12 Refer to the skin cancer data of Example 10.10 (Table 10.5).
Type 3 analysis yields the results shown in Table 10.7. The result for the age
group main effect is identical to that of Example 10.11.

