2.49 Create a boxplot for the data in Exercise 2.1.
2.50 Create a boxplot for the data in Exercise 2.4.
2.51 Create a boxplot for the variable ADDSC in Appendix Data Set.
2.52 Compute the coefficient of variation to compare the variability in usage of “and then . . .”
statements by children and adults in Exercises 2.1 and 2.4.
2.53 For the data in Appendix Data Set, the GPA has a mean of 2.456 and a standard deviation of
0.8614. Compute the coefficient of variation as defined in this chapter.
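The arithmetic can be sketched in a few lines of Python. This assumes the coefficient of variation is the standard deviation divided by the mean, optionally expressed as a percentage; the function name and the `as_percent` flag are illustrative choices, not something taken from the text.

```python
def coefficient_of_variation(sd, mean, as_percent=False):
    """Return sd / mean, optionally multiplied by 100 (assumed definition)."""
    if mean == 0:
        raise ValueError("the coefficient of variation is undefined when the mean is 0")
    cv = sd / mean
    return cv * 100 if as_percent else cv


# Values quoted in Exercise 2.53 for GPA.
gpa_cv = coefficient_of_variation(0.8614, 2.456)
gpa_cv_percent = coefficient_of_variation(0.8614, 2.456, as_percent=True)
print(round(gpa_cv, 4), round(gpa_cv_percent, 2))
```

Because the statistic is a ratio, it is only meaningful for variables with a true zero point and a positive mean.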
2.54 The data set named BadCancr.dat (at
http://www.uvm.edu/~dhowell/methods7/DataFiles/BadCancr.dat) has been deliberately
corrupted by entering errors into a perfectly good data set (named Cancer.dat). The purpose
of this corruption was to give you experience in detecting and correcting the kinds of errors
that appear almost every time we attempt to use a
newly entered data set. Every error in here is one that I and almost everyone I know have
come across countless times. Some of them are so extreme that most statistical packages
will not run until they are corrected. Others are logical errors that will allow the program to
run, producing meaningless results. (No college student is likely to be 10 years old or
receive a score of 15 on a 10-point quiz.) The variables in this set are described in the
Appendix: Computer Data Sets for the file Cancer.dat. That description tells where each variable
should be found and the range of its legitimate values. You can use any statistical package
available to read the data. Standard error messages will identify some of the problems,
visual inspection will identify others, and computing descriptive statistics or plotting the data
will help identify the rest. In some cases, the appropriate correction will be obvious. In other
cases, you will just have to delete the offending values. When you have cleaned the data,
use your program to compute a final set of descriptive statistics on each of the variables.
This problem will take a fair amount of time. I have found that it is best to have students
work in pairs.
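One way to automate part of the checking is a range check against the legitimate values of each variable. The sketch below is only an illustration: the variable names (`age`, `quiz`), their ranges, and the sample rows are hypothetical and are not taken from Cancer.dat; the real ranges come from the Appendix description.

```python
# Hypothetical legal ranges; the real ones come from the Appendix
# description of Cancer.dat.
LEGAL_RANGES = {"age": (17, 90), "quiz": (0, 10)}


def find_problems(rows, legal_ranges):
    """Return (row_number, variable, raw_value, reason) for each suspect entry."""
    problems = []
    for row_number, row in enumerate(rows, start=1):
        for name, (low, high) in legal_ranges.items():
            raw = row.get(name)
            try:
                value = float(raw)
            except (TypeError, ValueError):
                # Missing entries or typing errors such as the letter O for zero.
                problems.append((row_number, name, raw, "not numeric"))
                continue
            if not low <= value <= high:
                problems.append((row_number, name, raw, "out of range"))
    return problems


# Made-up rows standing in for lines read from a data file.
rows = [
    {"age": "19", "quiz": "8"},
    {"age": "10", "quiz": "15"},  # both values are outside the legal range
    {"age": "2O", "quiz": "7"},   # letter O typed in place of a zero
]
problems = find_problems(rows, LEGAL_RANGES)
for problem in problems:
    print(problem)
```

A check like this flags only values that are impossible; values that are legal but wrong still require descriptive statistics, plots, or visual inspection.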
2.55 Compute the 10% trimmed mean for the data in Table 2.6—Set 32.
2.56 Compute the 10% Winsorized standard deviation for the data in Table 2.6—Set 32.
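The two computations in Exercises 2.55 and 2.56 can be sketched as follows. The sketch assumes that a 10% trimmed mean drops 10% of the observations from each end, and that Winsorizing replaces those same observations with the most extreme values that remain before the usual sample standard deviation is taken. The scores are made up for illustration and are not Set 32 from Table 2.6.

```python
import statistics


def trim_count(n, proportion):
    """Number of observations to drop (or replace) in each tail."""
    return int(n * proportion + 1e-9)  # small tolerance guards against float error


def trimmed_mean(values, proportion=0.10):
    """Mean after dropping the given proportion of scores from each end."""
    xs = sorted(values)
    k = trim_count(len(xs), proportion)
    return statistics.mean(xs[k:len(xs) - k])


def winsorize(values, proportion=0.10):
    """Replace the extreme scores in each tail with the nearest remaining score."""
    xs = sorted(values)
    k = trim_count(len(xs), proportion)
    if k == 0:
        return xs
    low, high = xs[k], xs[-k - 1]
    return [low] * k + xs[k:len(xs) - k] + [high] * k


# Hypothetical scores, not the data from Table 2.6.
scores = [2, 4, 5, 6, 6, 7, 8, 9, 10, 33]
tmean = trimmed_mean(scores)
wscores = winsorize(scores)
wsd = statistics.stdev(wscores)  # sample standard deviation (n - 1)
print(tmean, wscores, round(wsd, 4))
```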
2.57 Draw boxplots to illustrate the difference between reaction times to positive and negative
instances for the data in Table 2.1. (These data can be found at
www.uvm.edu/~dhowell/methods7/DataFiles/Tab2-1.dat.)
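Before drawing anything, it can help to compute the numbers a boxplot displays. The sketch below assumes Tukey-style definitions: hinges located at depth (floor(median depth) + 1) / 2 from each end, inner fences 1.5 H-spreads beyond the hinges, and whiskers that extend to the most extreme values inside the fences. The values are made up and are not the reaction times from Table 2.1.

```python
import math
import statistics


def value_at_depth(sorted_values, depth):
    """Value at a 1-based depth; a depth ending in .5 averages two neighbors."""
    lower = sorted_values[math.floor(depth) - 1]
    upper = sorted_values[math.ceil(depth) - 1]
    return (lower + upper) / 2


def boxplot_summary(values):
    """Return the quantities a Tukey-style boxplot displays."""
    xs = sorted(values)
    median_depth = (len(xs) + 1) / 2
    hinge_depth = (math.floor(median_depth) + 1) / 2
    lower_hinge = value_at_depth(xs, hinge_depth)
    upper_hinge = value_at_depth(xs[::-1], hinge_depth)  # depth from the high end
    h_spread = upper_hinge - lower_hinge
    low_fence = lower_hinge - 1.5 * h_spread
    high_fence = upper_hinge + 1.5 * h_spread
    inside = [x for x in xs if low_fence <= x <= high_fence]
    return {
        "median": statistics.median(xs),
        "hinges": (lower_hinge, upper_hinge),
        "whiskers": (inside[0], inside[-1]),
        "outliers": [x for x in xs if x < low_fence or x > high_fence],
    }


# Hypothetical values standing in for one group of reaction times.
summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30])
print(summary)
```

Calling `boxplot_summary` once per condition gives two sets of values that can be drawn side by side on a common axis.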
2.58 Under what conditions will a transformation alter the shape of a distribution?
2.59 Do an Internet search using Google to find how to create a kernel density plot using SAS or
S-Plus.
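For readers without SAS or S-Plus, the idea behind a kernel density plot can be sketched in plain Python. This is an illustration only: it assumes a Gaussian kernel and the normal-reference bandwidth 1.06 × s × n^(-1/5), which is one common default rather than what any particular package uses, and the data are made up.

```python
import math
import statistics


def gaussian_kde(data, bandwidth=None):
    """Return a function that estimates the density at x with a Gaussian kernel."""
    n = len(data)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumed default).
        bandwidth = 1.06 * statistics.stdev(data) * n ** (-1 / 5)

    def density(x):
        total = sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)
        return total / (n * bandwidth * math.sqrt(2 * math.pi))

    return density


# Hypothetical scores used only to exercise the function.
sample = [36, 40, 41, 45, 47, 52, 55, 61, 67, 95]
density = gaussian_kde(sample)

# Evaluate the curve on a grid; plotting these pairs gives the density plot.
grid = [x / 10 for x in range(-1000, 3001)]
curve = [(x, density(x)) for x in grid]
area = sum(y for _, y in curve) * 0.1  # should be close to 1
print(round(area, 3))
```

A smaller bandwidth gives a bumpier curve and a larger one a smoother curve, so it is worth trying several values before settling on a plot.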
Discussion Question
2.60 In the exercises in Chapter 1, we considered the study by a fourth-grade girl who examined
the average allowance of her classmates. You may recall that 7 boys reported an average
allowance of $3.18, and 11 girls reported an average allowance of $2.63. These data raise
some interesting statistical issues. Without in any way diminishing the value of what the
fourth-grade student did, let’s look at the data more closely. The article in the paper reported
that the highest allowance for a boy was $10, whereas the highest for a girl was $9. It also
reported that the girls’ two lowest allowances were $0.50 and $0.51, but the lowest reported
allowance for a boy was $3.00.