The Dictionary of Human Geography

other small projects, the absolute minimum is 100 and preferably 250 when subgroups (male and female; young and old) are being analysed. The aim should be a focused questionnaire to a lot of people, rather than a long questionnaire to few, or a recourse to secondary data.

• Stratified sampling groups the population into strata so as to maximize similarity within a stratum and maximize between-strata differences. This can considerably increase the sample’s efficiency if stratification is based on a variable strongly related to the estimate. If income is strongly related to region, then regions could be used for stratifying and reducing the standard error of the mean income estimate. We can also disproportionately sample from particular strata when there are important groups of the population that are numerically small and so would yield only small numbers if SRS were used within strata, such as ethnic groups (see ethnicity), with the non-indigenous groups over-sampled to get more precise estimates. Such a strategy requires detailed knowledge of the sample frame in terms of an ethnic classification, and the analysis should be weighted to get correct estimates.
• Multi-stage designs involve sampling in stages. For example, a sample of constituencies may be selected at random (the so-called primary sampling units), then wards within them, then households within wards and individuals within households. This design is often used for major scientific surveys, as it only requires a sampling frame at each stage; thus at stage one only a list of constituencies is required, while at stage two, only ward names are required for those constituencies already selected. Another advantage is the cost reduction resulting from basing a team of interviewers in the higher-level units. A variant is the cluster design, when at some stage all the lower-level units are sampled – everybody in a ward is selected, for example. A problem with these designs is that there is a tendency for people living in the same place to be somewhat similar, so that the resultant sample is more alike than a random sample and standard statistical theory gives overly precise results. Clustered data lead to inefficiency and it is not unknown for an SRS a third of the size to achieve the same standard error. It is clearly vital to measure this dependency (the intra-class correlation) and correct for it. The development of multi-level models allows this even when the sample is unbalanced with a different number of units in each higher-level unit. Consequently, multi-stage designs are recommended for studying variation simultaneously at a number of different scales, with the population itself seen as having a hierarchical structure, which is itself of substantive interest (Jones, 1997). Indeed highly clustered designs are needed if survey information is to be gathered on individuals as well as their peers. With such designs, it is necessary to specify the number of units at each level; Raudenbush and Xiaofeng (2000) provide the necessary background, which is put into practice by Stoker and Bowers (2002) in their geographically sensitive designs for surveying American voting behaviour.
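
The loss of precision from clustering can be illustrated with the widely used design effect, deff = 1 + (m - 1)ρ, where m is the number of sampled units per cluster and ρ is the intra-class correlation. The Python sketch below is illustrative only: it assumes equal-sized clusters, and the function names and the figures used (1,050 respondents in clusters of 21, with ρ = 0.1) are hypothetical rather than taken from any survey discussed here.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Approximate design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n: int, cluster_size: float, icc: float) -> float:
    """Size of the simple random sample that would give the same standard error."""
    return n / design_effect(cluster_size, icc)


# Hypothetical survey: 50 wards with 21 respondents each, ICC assumed to be 0.1.
n_respondents = 50 * 21
deff = design_effect(cluster_size=21, icc=0.1)
n_effective = effective_sample_size(n_respondents, cluster_size=21, icc=0.1)
print(f"design effect: {deff:.2f}, effective sample size: {n_effective:.0f}")
```

Under these assumed values the design effect is about 3, so the clustered sample of 1,050 carries roughly the information of an SRS of 350, which corresponds to the 'third of the size' case noted above.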

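The weighting required when strata are sampled disproportionately, as described under stratified sampling above, can also be sketched in Python. The example assumes that the population size of each stratum is known from the sampling frame; the function names and the income figures are invented for illustration.

```python
from statistics import fmean


def stratified_mean(strata):
    """Weighted estimate of the population mean.

    `strata` is a list of (population_size, sample_values) pairs; each
    stratum mean is weighted by its share of the total population.
    """
    total = sum(size for size, _ in strata)
    return sum((size / total) * fmean(values) for size, values in strata)


def pooled_mean(strata):
    """Unweighted mean of all sampled values, ignoring the design."""
    values = [v for _, sample in strata for v in sample]
    return fmean(values)


# Hypothetical incomes (in thousands): a large stratum sampled lightly and a
# small stratum deliberately over-sampled.
strata = [
    (9000, [20, 22, 24]),
    (1000, [40, 42, 44, 46]),
]
print(f"weighted estimate:   {stratified_mean(strata):.1f}")
print(f"unweighted estimate: {pooled_mean(strata):.1f}")
```

In this invented case the unweighted mean is pulled towards the over-sampled stratum, whereas the weighted estimate restores each stratum to its population share.
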
These three designs can be used in combination; the UK Millennium Cohort study, unlike previous birth cohorts, is spatially clustered specifically to study neighbourhood effects. Wards are disproportionately stratified to ensure adequate representation of all four UK countries, deprived areas and areas with high concentrations of particular ethnic groups, and then all babies aged 9 months in selected wards over a 12-month period are included. The resultant sample includes 19,000 infants who are being followed longitudinally. Other probabilistic designs may be used for different circumstances; they include capture–recapture methods to estimate population size with mobile populations, and response-based sampling (see extensive designs) when a numerically small but important outcome is over-sampled. In geographical studies, the standard procedures may be modified to ensure spatial coverage. Methods of random, systematic and stratified sampling of points on a map have been devised using coordinate systems, for example, as have methods of selecting transects (line samples) across an area (Berry and Baker, 1968). Increasingly, these designs are being used adaptively (Thompson and Seber, 2002), so that the degree of spatial autocorrelation is being assessed as the survey proceeds and there is increased sampling in areas where the outcome variable is most varied and least spatially dependent. When testing a hypothesis it is crucial to assess and control for two types of error in a probabilistic design. Type I errors, finding an effect when there really is none, are controlled

SAMPLING
