Statistical Methods for Psychology

Appendix: Computer Data Sets

Variable Name   Columns   Description
ID              1–3       Subject identification number
ADDSC           5–6       ADD score averaged over 3 years
GENDER          8         1 = male; 2 = female
REPEAT          10        1 = repeated a grade; 0 = did not repeat
IQ              12–14     IQ obtained from group-administered IQ test
ENGL            16        Level of English: 1 = college prep; 2 = general;
                          3 = remedial
ENGG            18        Grade in English: 4 = A, 3 = B, and so on
GPA             20–23     Grade point average in ninth grade
SOCPROB         25        Social problems: 0 = no; 1 = yes
DROPOUT         27        1 = dropped out of school before finishing;
                          0 = did not drop out

The first four lines of data are shown here:

  1 45 1 0 111 2 3 2.60 0 0
  2 50 1 0 102 2 3 2.75 0 0
  3 49 1 0 108 2 4 4.00 0 0
  4 55 1 0 109 2 2 2.25 0 0
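
The column positions in the codebook are enough to read a record by slicing. The following is a minimal Python sketch, not part of the original data description. It assumes each record is laid out exactly in the listed columns with blanks in the unlisted ones, and the names `CODEBOOK` and `parse_record` are illustrative.

```python
# Column positions (1-based, inclusive) copied from the codebook above.
CODEBOOK = {
    "ID": (1, 3),
    "ADDSC": (5, 6),
    "GENDER": (8, 8),
    "REPEAT": (10, 10),
    "IQ": (12, 14),
    "ENGL": (16, 16),
    "ENGG": (18, 18),
    "GPA": (20, 23),
    "SOCPROB": (25, 25),
    "DROPOUT": (27, 27),
}


def parse_record(line):
    """Slice one fixed-width line into a dict of numeric values."""
    record = {}
    for name, (start, end) in CODEBOOK.items():
        # Convert 1-based inclusive columns to a Python slice.
        field = line[start - 1:end].strip()
        # GPA is the only variable with a decimal point; the rest are integers.
        record[name] = float(field) if name == "GPA" else int(field)
    return record


# First sample record, with spacing assumed from the codebook columns.
sample = "  1 45 1 0 111 2 3 2.60 0 0"
first = parse_record(sample)
print(first)
```

Keeping the positions in one dictionary means the slicing logic never has to change if a column range is corrected.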

Badcancr.dat


For a description of both the study behind these data and the data set, see the following
section on Cancer.dat. The data in this file differ from those in Cancer.dat only by the inclusion
of deliberate errors.
These data have been deliberately changed for purposes of an assignment. Errors have
been added, and at least one variable has been distorted. The correct data are in
Cancer.dat, which should be used for all future analyses. Virtually any program is likely to
fail at first until errors are found and corrected, and even when it runs, impossible values
will remain. The quickest way to find many of the errors is to print out the file and scan
the columns.
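
Scanning the columns can also be done by program. The sketch below is only an illustration: it tallies which characters occur at each column position and lists any character that is not a digit, blank, or period. The records in `demo` are hypothetical and are not taken from Badcancr.dat, and the set of allowed characters is an assumption about what a clean numeric file would contain.

```python
from collections import Counter


def scan_columns(lines):
    """Return one Counter per column position with the characters seen there."""
    width = max(len(line) for line in lines)
    columns = [Counter() for _ in range(width)]
    for line in lines:
        for pos, char in enumerate(line):
            columns[pos][char] += 1
    return columns


def suspicious_characters(lines, allowed="0123456789 ."):
    """List (column, character, line number), all 1-based, for unexpected characters."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        for col, char in enumerate(line, start=1):
            if char not in allowed:
                hits.append((col, char, lineno))
    return hits


# Hypothetical records with two planted errors (a letter O and an x).
demo = ["101 26 25", "104 1O 66", "105 15 6x"]
hits = suspicious_characters(demo)
for col, char, lineno in hits:
    print(f"line {lineno}, column {col}: unexpected {char!r}")
```

A check like this finds only illegal characters. Values that are numeric but impossible, such as a score outside its legitimate range, still have to be checked against the codebook.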


Cancer.dat


The data in this file come from a study by Compas (1990, personal communication) on the
effects of stress in cancer patients and their families. Only a small portion of the data that
were collected are shown here, primarily data related to behavior problems in children and
psychological symptoms in the patient and her or his spouse. The file contains data on 89
families, and many of the data points are missing because of the time in the study at which
these data were selected. This example does, however, offer a good opportunity to see
preliminary data on important psychological variables.
The codebook (the listing of variables, descriptions, location, and legitimate values) for
the data in Cancer.dat is shown following the sample data.
Missing observations are represented with a period. The first four lines of data are
shown here as an example.


1012625052395214244414042.......
10415665554057253736867711111228585760
10515657676561241676366652 7 715474845
10624161645357160605967621 61015495248
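
Because missing observations are coded as periods, a field has to be checked before it is converted to a number. The helper below is a minimal sketch with an illustrative name (`to_number`). It assumes the field has already been cut out of the line using the codebook's column positions, and it treats an all-blank field as missing too, which is an assumption rather than something stated in the file description.

```python
def to_number(field):
    """Convert one fixed-width field to a float, or None if it is missing."""
    text = field.strip()
    # A field made up only of periods (or blanks) is treated as missing.
    if text == "" or set(text) == {"."}:
        return None
    return float(text)


# Illustrative fields only; the real widths come from the Cancer.dat codebook.
examples = ["52", " 7", ".", ".."]
values = [to_number(field) for field in examples]
print(values)
```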
