Statistical Analysis for Education and Psychology Researchers

options nodate nonumber;
/* Read five fixed-column variables from the raw data file */
data child1;
  infile 'acchild1.dat';
  input caseid 1-3 ageyrs 5-6 sex 8
        ses 10 raven 12;
/* Count valid and missing values and report the range of each variable */
proc summary print n nmiss min max;
  var caseid--raven;
  title 'Number of valid cases, missing, max & min for data=child1';
run;

Figure 3.8: Example SAS programme for a frequency count using PROC SUMMARY


Number of valid, missing, max & min for data=child1

Variable    N   Nmiss      Minimum      Maximum

CASEID     10       0    1.0000000   10.0000000
AGEYRS      9       1    7.0000000   11.0000000
SEX        10       0            0    1.0000000
SES         9       1    1.0000000    9.0000000
RAVEN      10       0    1.0000000    9.0000000

Figure 3.9: SAS output from PROC SUMMARY, data=child1
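
The output in Figure 3.9 reports one missing value each for AGEYRS and SES, but it does not say which cases are affected. A minimal sketch for listing them is given below. It assumes the data set child1 has been created as in Figure 3.8; the title text is illustrative only.

```sas
/* List only the cases where ageyrs or ses is missing */
proc print data=child1;
  where missing(ageyrs) or missing(ses);
  var caseid ageyrs sex ses raven;
  title 'Cases with a missing value for ageyrs or ses';
run;
```

The case identifiers listed can then be checked against the original data records to see whether a value is truly missing or was lost at data entry.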


Dealing with Missing Data

Large data sets, especially if collected by survey questionnaire methods, inevitably have
missing data values. However, this problem is not confined to survey research. In
experimental designs participants may become tired, bored or simply uncooperative. If
data is missing, the researcher must decide what to do. It is sensible to follow the
suggestion given by Chatfield (1993): first identify why a value is missing. The
seriousness of missing data depends upon why it is missing and how much is missing. An
initial distinction is whether missing responses are random or systematic.
Examples of systematic missing responses include censoring or truncation of data,
perhaps because a respondent refuses to answer personal questions or a subject
withdraws part way through an experiment.
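
One way to explore whether missing responses look systematic is to flag them and cross-tabulate the flags against another variable. The sketch below assumes the child1 data set from Figure 3.8. The names child1_flags, miss_age and miss_ses are hypothetical, and sex is used as the comparison variable purely for illustration.

```sas
/* Create 0/1 indicators: 1 = value missing, 0 = value present */
data child1_flags;
  set child1;
  miss_age = missing(ageyrs);
  miss_ses = missing(ses);
run;

/* Compare the proportion of missing values across the two sex groups */
proc freq data=child1_flags;
  tables sex*miss_age sex*miss_ses / nocol nopercent;
  title 'Missing ageyrs and ses by sex';
run;
```

If the row percentages of missing values differ markedly between groups, the missing responses may not be random. With only ten cases, as here, such a table can only be suggestive.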
How do you know whether missing data is random? Tabachnick and Fidell (1989)
suggest you should check for this. Essentially this involves scrutiny of the data to identify
any patterns in missing values. One approach other than simply looking for patterns in
the raw data is to draw a missing (denoted by ‘.’) valid (denoted by blank ‘+’) table for

