• Taking account quantitatively of the complex design of the survey; and
• Dealing with non-response and missing data.
To appreciate the issues concerning size of effects and the interplay
between variables, consider the following 2 × 2 table in which there is a
single outcome (happiness) and a single explanatory variable (age), each
being measured on a binary scale:
                      Explanatory variable
                      Young    Old      Total
Outcome   Unhappy     A        B        A + B
          Happy       C        D        C + D
          Total       A + C    B + D    N
The odds ratio, calculated as (A/B)/(C/D), gives the degree of association
between the variables: if it is 1 there is no relationship; greater than 1
means that younger people are more likely to be unhappy; less than 1
suggests that younger people are more likely to be happy. However, these
results should not be taken at face value and there needs to be model
elaboration (Davis, 1986) to see how the relation changes as account is
taken of other variables. This can be clearly seen by examining the
following set of 2 × 2 tables:
(a) Aggregated data
            Young    Old    Total
Unhappy     140      120    260
Happy       50       90     140
Total       190      210    400

(b) Relation for males
            Young    Old    Total
Unhappy     120      40     160
Happy       30       10     40
Total       150      50     200

(c) Relation for females
            Young    Old    Total
Unhappy     20       80     100
Happy       20       80     100
Total       40       160    200
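The odds ratios for the three tables above can be checked with a short Python sketch. The function name `odds_ratio` and the layout of each table as `[[A, B], [C, D]]` are assumptions made for illustration; only the cell counts and the formula (A/B)/(C/D) come from the text.

```python
def odds_ratio(table):
    """Odds ratio (A/B)/(C/D) for a 2 x 2 table given as [[A, B], [C, D]].

    Rows are the outcome (unhappy, happy); columns are age (young, old).
    """
    (a, b), (c, d) = table
    return (a / b) / (c / d)


# Cell counts copied from tables (a), (b) and (c) above.
aggregated = [[140, 120], [50, 90]]
males = [[120, 40], [30, 10]]
females = [[20, 80], [20, 80]]

for label, table in [("aggregated", aggregated),
                     ("males", males),
                     ("females", females)]:
    print(f"{label}: odds ratio = {odds_ratio(table):.2f}")
# aggregated: odds ratio = 2.10
# males: odds ratio = 1.00
# females: odds ratio = 1.00
```

The aggregate table is simply the cell-by-cell sum of the male and female tables, so the difference between the three ratios comes entirely from how age and gender are distributed across the cells.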
In (a) there is a clear relationship, in that the odds ratio of 2.1
suggests that the odds of being unhappy are about twice as high for
younger people as for the old. However, in (b) and (c), when the same data
are disaggregated to examine the relation for men and women separately,
each odds ratio is exactly 1, showing that there is no relationship
between age and happiness; the apparent relationship is an artefact of the
relationships between age and gender (i.e. they are confounded) and
between gender and happiness. The following set of 2 × 2 tables tells a
different story:
(a) Aggregated data
            Young    Old    Total
Unhappy     200      200    400
Happy       200      200    400
Total       400      400    800

(b) Relation for males
            Young    Old    Total
Unhappy     170      40     210
Happy       190      80     270
Total       360      120    480

(c) Relation for females
            Young    Old    Total
Unhappy     30       160    190
Happy       10       120    130
Total       40       280    320
In (a) the aggregate relation shows no effect, as the odds ratio is 1, but
in (b) and (c), when the analysis is done separately by gender, the odds
ratios are 1.79 for males and 2.25 for females. In this case, the relation
between age and happiness has been masked by not taking account of gender.
The analysis can of course be extended to more than three variables and to
include continuous variables (see categorical data analysis), but the
underlying logic of model elaboration remains the same.
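A minimal sketch of model elaboration for this second set of tables is to start from the gender-specific tables, collapse them over gender, and compare the odds ratios before and after. The helper `collapse` and the dictionary `by_gender` are hypothetical names introduced here; the cross-product form AD/(BC) is algebraically the same as (A/B)/(C/D).

```python
def odds_ratio(table):
    """Cross-product odds ratio AD/(BC) for a table [[A, B], [C, D]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


def collapse(strata):
    """Sum 2 x 2 tables cell by cell, ignoring the stratifying variable."""
    return [[sum(t[i][j] for t in strata) for j in range(2)]
            for i in range(2)]


# Rows: unhappy, happy. Columns: young, old. Counts from tables (b) and (c).
by_gender = {
    "males": [[170, 40], [190, 80]],
    "females": [[30, 160], [10, 120]],
}

aggregated = collapse(list(by_gender.values()))
print("aggregated table:", aggregated)
print(f"aggregated: odds ratio = {odds_ratio(aggregated):.2f}")
for gender, table in by_gender.items():
    print(f"{gender}: odds ratio = {odds_ratio(table):.2f}")
# aggregated table: [[200, 200], [200, 200]]
# aggregated: odds ratio = 1.00
# males: odds ratio = 1.79
# females: odds ratio = 2.25
```

Adding a further variable would mean splitting each stratum again and repeating the same comparison, which is the sense in which the logic carries over to more than three variables.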
Analysis also has to guard against the Type I error of finding a relation
when it does not exist, and the Type II error of not finding a genuine
relation (cf. sampling). For the former, confirmatory data analysis is
used to test for the significance of a relation and to examine confidence
intervals; the key here is the absolute sample size. Thus the estimated
odds ratio of 2.1 for the relation in the first aggregate analysis has a
95 per cent confidence interval that lies between 1.38 and 3.21, so there
is little chance that the aggregate relation is really 1, which would
indicate no effect. When no significant relationship is found, this may be
an outcome of too small a study. If this is the case, we can perform a
retrospective power analysis (Cohen, 1988) to see if the sample size was
large enough to detect an effect. It would have been better, however, to
do an initial power analysis before sampling.
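The interval quoted above can be reproduced with a short sketch, assuming the usual large-sample (Woolf) interval on the log odds ratio; the text does not say which method was used, so this is an assumption that happens to match the quoted figures. The second function is a rough normal-approximation power calculation for comparing two proportions, offered as an illustration of the idea rather than as Cohen's (1988) procedure. All function names are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def odds_ratio_ci(a, b, c, d, level=0.95):
    """Odds ratio with a Woolf (log-scale) confidence interval (assumed method)."""
    estimate = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log odds ratio
    z = STD_NORMAL.inv_cdf(0.5 + level / 2)
    return (estimate,
            exp(log(estimate) - z * se_log),
            exp(log(estimate) + z * se_log))


def approx_power(p1, n1, p2, n2, alpha=0.05):
    """Approximate power of a two-sided test that two proportions differ.

    Normal approximation; the negligible opposite tail is ignored.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se_null = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    se_alt = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf((abs(p1 - p2) - z_crit * se_null) / se_alt)


# First aggregated table: unhappy (140, 120), happy (50, 90).
estimate, lower, upper = odds_ratio_ci(140, 120, 50, 90)
print(f"odds ratio {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
# odds ratio 2.10, 95% CI 1.38 to 3.21

# Proportion unhappy among the young (140/190) and the old (120/210),
# at the observed group sizes and at roughly a quarter of them.
power_full = approx_power(140 / 190, 190, 120 / 210, 210)
power_small = approx_power(140 / 190, 48, 120 / 210, 52)
print(f"approximate power: n=400 {power_full:.2f}, n=100 {power_small:.2f}")
```

Because the interval width depends on the reciprocals of the cell counts, it narrows as the absolute sample size grows, and the same effect size yields lower power in the smaller hypothetical study.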
Gregory / The Dictionary of Human Geography 9781405132879_4_S Final Proof page 735 1.4.2009 3:23pm
SURVEY ANALYSIS