Data Analysis with Microsoft Excel: Updated for Office 2007

Chapter 4 Describing Your Data

d. Does there appear to be any relation-
ship between these illnesses and the
level of cigarette use in the states?
Defend your answer with your charts,
statistics, and tables.
e. There is one state with a high level of
cigarette use but a relatively low level
of lung cancer. Identify this state.
f. Save your workbook and then write a
report summarizing your observations
and calculations.


  1. The Pollution workbook contains air
    quality data collected by the Environ-
    mental Protection Agency (EPA). The
    data show the number of unhealthful
    days (heavy levels of pollution) per year
    for 14 major U.S. cities in the year 1980
    and the average number of unhealthy
    days per year from 2000 through 2006.
    The workbook also contains the ratio of
    the 2000–2006 average to the 1980 value
    and the difference. A ratio value less
    than 1 or a difference value less than 0
    indicates an improvement in the air
    quality. Looking at the data as a whole,
    is there evidence to believe that there
    has been improvement in the air qual-
    ity? Open this workbook and examine
    the data.
    a. Open the Pollution workbook from
    the Chapter04 folder and save it as
    Pollution Boxplots.
    b. Calculate the mean and median values
    of the ratio and difference variables.
    c. Create two boxplots. First create a
    boxplot of the ratio variable and then
    create another boxplot of the differ-
    ence variable. Describe the difference
    between the shape of the two distri-
    butions. Is one more susceptible to
    extreme values than the other? Why
    would this be the case? (Hint: Think
    about the number of unhealthy days
    in 1980. Which cities are most likely
    to show the greatest drop in absolute
    numbers?)


d. There is an extreme outlier in the box-
plot of the difference values. Identify
the city corresponding to that extreme
outlier.
e. Copy the air quality data to a new
worksheet without the extreme outlier
you noted in part d. Redo the table of
statistics and boxplots with this new
set of data.
f. What are your conclusions? Have
your conclusions changed without the
presence of the outlier? What effect
did the outlier have on the mean and
median values of the ratio and difference
variables? Are you justified in
removing the outlier from your analysis?
Why or why not?
g. Save your workbook and then write a
report summarizing your calculations
and observations. Which variable
seems to better describe the change in
air quality: the difference or the ratio?
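
If you want to check your Excel results outside the workbook, the calculations in this exercise can be sketched in Python. This is a minimal sketch, not the book's method. The numbers are hypothetical and are not the EPA figures from the Pollution workbook. The function name classify_outliers is made up for this illustration. The sketch assumes that moderate outliers lie more than 1.5 interquartile ranges beyond the quartiles and extreme outliers more than 3. It computes quartiles with Python's "inclusive" interpolation, which may differ slightly from the quartiles your Excel boxplot reports.

```python
from statistics import mean, median, quantiles

# Hypothetical values for illustration only; these are NOT the EPA figures
# stored in the Pollution workbook.
days_1980 = [10, 20, 30, 40, 200]      # unhealthy days in 1980
avg_2000_2006 = [8, 15, 27, 30, 50]    # average unhealthy days, 2000-2006

# Ratio < 1 or difference < 0 indicates improvement, as in the exercise.
ratio = [new / old for new, old in zip(avg_2000_2006, days_1980)]
difference = [new - old for new, old in zip(avg_2000_2006, days_1980)]


def classify_outliers(values):
    """Label each value 'none', 'moderate', or 'extreme'.

    Assumption: moderate outliers lie more than 1.5 IQRs beyond the
    quartiles and extreme outliers more than 3 IQRs beyond them.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    labels = []
    for v in values:
        distance = max(q1 - v, v - q3, 0)  # how far v lies outside the box
        if distance > 3 * iqr:
            labels.append("extreme")
        elif distance > 1.5 * iqr:
            labels.append("moderate")
        else:
            labels.append("none")
    return labels


for name, values in (("ratio", ratio), ("difference", difference)):
    print(name, "mean:", mean(values), "median:", median(values))
    print(name, "outliers:", classify_outliers(values))
```

To repeat part e, drop the flagged observation from both lists and rerun the loop. Comparing the two sets of output shows how far the mean and median each move when the outlier is removed.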


  1. The Reaction workbook contains reaction
    times from the first-round heats of
    the 100-meter race at the 1996 Summer
    Olympic games. Reaction time is the time
    elapsed between the sound of the starter’s
    gun and the moment the runner leaves
    the starting block. The workbook also
    contains the heat number, the order of
    finish, and the finish group (1st through
    3rd, 4th through 6th, and so forth).
    a. Open the Reaction workbook from the
    Chapter04 folder and save it as Reac-
    tion Statistics.
    b. Calculate univariate descriptive statis-
    tics for the reaction times listed. What
    are the average, median, minimum,
    and maximum reaction times?
    c. Create a boxplot of the reaction times.
    Are there any moderate or extreme
    outliers in the distribution? How
    would you characterize the shape of
    the distribution?
    d. Create a stem and leaf plot of the
    reaction times.
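
The descriptive statistics and the stem and leaf plot asked for above can also be sketched in Python as a cross-check on your Excel output. This is a minimal sketch under stated assumptions. The reaction times are hypothetical and are not taken from the Reaction workbook. They are stored as whole milliseconds so that stems and leaves can be split with integer arithmetic. The function name stem_and_leaf is invented for this example.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical reaction times in milliseconds (0.131 s is stored as 131);
# these are NOT the values in the Reaction workbook.
reaction_ms = [131, 145, 148, 152, 156, 159, 163, 167, 171, 188]

print("average:", mean(reaction_ms))
print("median: ", median(reaction_ms))
print("minimum:", min(reaction_ms))
print("maximum:", max(reaction_ms))


def stem_and_leaf(values):
    """Return the lines of a stem and leaf plot.

    The last digit of each value is the leaf and the remaining
    leading digits form the stem.
    """
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        groups[stem].append(leaf)
    lines = []
    # Walk every stem in the range so that empty stems still get a row.
    for stem in range(min(groups), max(groups) + 1):
        leaves = "".join(str(d) for d in groups.get(stem, []))
        lines.append(f"{stem:>3} | {leaves}")
    return lines


for line in stem_and_leaf(reaction_ms):
    print(line)
```

Each row lists a stem followed by its sorted leaves, so the row for stem 15 collects the values 152, 156, and 159. Empty stems are kept so that gaps in the distribution remain visible.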
