Chapter 4 Describing Your Data 179
d. Does there appear to be any relationship
between these illnesses and the
level of cigarette use in the states?
Defend your answer with your charts,
statistics, and tables.
e. There is one state with a high level of
cigarette use but a relatively low level
of lung cancer. Identify this state.
f. Save your workbook and then write a
report summarizing your observations
and calculations.
- The Pollution workbook contains air
quality data collected by the Environmental
Protection Agency (EPA). The
data show the number of unhealthy
days (heavy levels of pollution) per year
for 14 major U.S. cities in the year 1980
and the average number of unhealthy
days per year from 2000 through 2006.
The workbook also contains the ratio of
the 2000–2006 average to the 1980 value
and the difference between them (the
2000–2006 average minus the 1980 value).
A ratio value less than 1 or a difference
value less than 0 indicates an improvement
in the air quality. Looking at the data as a whole,
is there evidence to believe that there
has been improvement in the air quality?
Open this workbook and examine
the data.
a. Open the Pollution workbook from
the Chapter04 folder and save it as
Pollution Boxplots.
b. Calculate the mean and median values
of the ratio and difference variables.
c. Create two boxplots. First create a
boxplot of the ratio variable and then
create another boxplot of the difference
variable. Describe the difference
between the shapes of the two
distributions. Is one more susceptible to
extreme values than the other? Why
would this be the case? (Hint: Think
about the number of unhealthy days
in 1980. Which cities are most likely
to show the greatest drop in absolute
numbers?)
d. There is an extreme outlier in the boxplot
of the difference values. Identify
the city corresponding to that extreme
outlier.
e. Copy the air quality data to a new
worksheet without the extreme outlier
you noted in part d. Redo the table of
statistics and boxplots with this new
set of data.
f. What are your conclusions? Have
your conclusions changed without the
presence of the outlier? What effect
did the outlier have on the mean and
median values of the ratio and difference
variables? Are you justified in
removing the outlier from your analysis?
Why or why not?
g. Save your workbook and then write a
report summarizing your calculations
and observations. Which variable
seems to better describe the change in
air quality: the difference or the ratio?
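The calculations in this exercise can also be checked outside Excel. The following is a minimal Python sketch, not the workbook's own method. The city names and day counts are invented for illustration and are not the EPA figures, and classify_outliers is a hypothetical helper. It assumes the common boxplot convention that a moderate outlier lies more than 1.5 interquartile ranges beyond the nearer quartile and an extreme outlier more than 3, and it assumes quartiles computed the way Excel's QUARTILE function does (the "inclusive" method).

```python
from statistics import mean, median, quantiles

# Hypothetical values for illustration only; NOT the EPA figures in the workbook.
# city -> (unhealthy days in 1980, average unhealthy days per year, 2000-2006)
days = {
    "City A": (200, 80),
    "City B": (40, 20),
    "City C": (30, 24),
    "City D": (20, 15),
    "City E": (10, 9),
    "City F": (12, 6),
}

# Ratio < 1 or difference < 0 indicates improvement.
ratio = {city: later / base for city, (base, later) in days.items()}
difference = {city: later - base for city, (base, later) in days.items()}


def classify_outliers(values):
    """Label values lying beyond the 1.5 x IQR (moderate) or 3 x IQR (extreme) fences."""
    q1, _, q3 = quantiles(list(values.values()), n=4, method="inclusive")
    iqr = q3 - q1
    labels = {}
    for name, x in values.items():
        distance = max(q1 - x, x - q3)  # how far x lies outside the box
        if distance > 3 * iqr:
            labels[name] = "extreme"
        elif distance > 1.5 * iqr:
            labels[name] = "moderate"
    return labels


for label, variable in (("ratio", ratio), ("difference", difference)):
    data = list(variable.values())
    print(label, "mean:", round(mean(data), 3), "median:", round(median(data), 3))
    print(label, "outliers:", classify_outliers(variable))
```

With these made-up numbers, the one city that started with a very large count of unhealthy days dominates the difference variable but not the ratio, which is the pattern the hint in part c asks you to think about.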
- The Reaction workbook contains reaction
times from the first-round heats of
the 100-meter race at the 1996 Summer
Olympic Games. Reaction time is the time
elapsed between the sound of the starter’s
gun and the moment the runner leaves
the starting block. The workbook also
contains the heat number, the order of
finish, and the finish group (1st through
3rd, 4th through 6th, and so forth).
a. Open the Reaction workbook from the
Chapter04 folder and save it as
Reaction Statistics.
b. Calculate univariate descriptive statistics
for the reaction times listed. What
are the average, median, minimum,
and maximum reaction times?
c. Create a boxplot of the reaction times.
Are there any moderate or extreme
outliers in the distribution? How
would you characterize the shape of
the distribution?
d. Create a stem and leaf plot of the
reaction times.
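As a cross-check on parts b and d, the descriptive statistics and a stem and leaf plot can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the reaction times below are invented and are not the values in the Reaction workbook, stem_and_leaf is a hypothetical helper, and the leaf is assumed to be the thousandths digit of a positive time in seconds.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical reaction times in seconds, for illustration only;
# they are NOT the values in the Reaction workbook.
times = [0.131, 0.137, 0.142, 0.145, 0.148, 0.153, 0.171]


def stem_and_leaf(values, leaf_unit=0.001):
    """Return one text row per stem; the leaf is the last digit in units of leaf_unit."""
    rows = defaultdict(list)
    for v in sorted(values):
        scaled = round(v / leaf_unit)    # e.g. 0.137 -> 137
        stem, leaf = divmod(scaled, 10)  # 137 -> stem 13, leaf 7
        rows[stem].append(str(leaf))
    # Include empty stems so gaps in the distribution stay visible.
    return [
        f"{stem:>3} | {''.join(rows.get(stem, []))}".rstrip()
        for stem in range(min(rows), max(rows) + 1)
    ]


print(f"mean={mean(times):.4f}  median={median(times):.3f}  "
      f"min={min(times):.3f}  max={max(times):.3f}")
for row in stem_and_leaf(times):
    print(row)
```

Keeping the empty stems in the output is a deliberate choice: a gap between the bulk of the values and an isolated high stem is how a possible outlier shows up in this kind of plot.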