Data Analysis with Microsoft Excel: Updated for Office 2007

(Tuis.) #1

388 Statistical Methods


d. Create a normal plot of the residu-
als. Does your plot support a conclu-
sion that the residuals are normally
distributed?
e. Save your changes to the workbook
and write a report summarizing your
observations.


  1. The regression performed in the previ-
    ous exercise assumed that prices would
    change linearly with miles and age. It
    could also be the case that the prices
    will change instead as a percentage so
    that instead of dropping $1000 per year,
    the price would drop 10% per year. You
    can check this assumption by perform-
    ing a logarithm of the used car sales
    price.
    a. Open the Toyota workbook from the
    Chapter09 folder and save it as Toyota
    Log Regression.
    b. Create a new variable named LogPrice
    equal to the log(price) value.
    c. Repeat the regression from the last ex-
    ercise using the log(price) rather than
    price.
    d. Does this improve the multiple cor-
    relation? Have the p values associated
    with the miles and age coeffi cients
    become more signifi cant?
    e. When log(price) is used as the
    dependent variable, the regression
    can be interpreted in terms of
    percentage drop in Price per year of
    age, instead of a fi xed drop per year
    of Age when Price is used as the de-
    pendent variable. Does it make more
    sense to have the price drop by 16.5%
    each year or to have the price drop by
    $721 per year? In particular, would an
    old car lose as much value per year as
    it did when it was young?
    f. Save your changes to the workbook
    and write a report summarizing your
    conclusions.
    10. The Cars workbook contains data based
    on reviews published in Consumer
    Reports®, 2003–2008. See Exercise 10
    of Chapter 2. The workbook includes
    observations from 275 car models on the
    variables Price, MPG (miles per
    gallon), Cyl (number of cylinders),
    Eng size (engine displacement in liters),
    Eng type (normal, hybrid, turbo, turbod-
    iesel), HP (horsepower), Weight (vehicle
    weight in pounds), Time0–60 (time to
    accelerate from 0 to 60 miles per hour in
    seconds), Date (month of publication),
    and Region (United States, Europe, or
    Asia). There is an additional variable
    Eng type01 that is 1 for hybrids and
    diesels and 0 otherwise.
    a. Open the Cars workbook from the
    Chapter09 folder and save it as Cars
    Multiple Regression.
    b. Create a correlation matrix (excluding
    Spearman’s rank correlation) and a
    scatter plot matrix of the seven quan-
    titative variables Price, MPG, Cyl, Eng
    size, HP, Weight, and Eng type01.
    c. Regress MPG on Cyl, Eng size, HP,
    Weight, Price, and Eng type01.
    d. Note that the regression coeffi cients
    for Cyl and Eng size are not signifi -
    cant at the .05 level. Compare this to
    the p values for these variables in the
    correlation matrix. What accounts for
    the lack of signifi cance? (Hint: Look at
    the correlations among Cyl, Eng size,
    HP, Price, and Weight.)
    e. Create a scatter plot of the regression
    residuals versus the predicted values.
    Judging by the scatter plot, do the as-
    sumptions of the regression appear to
    be violated? Why or why not?
    f. Create a new variable, GPM100, that
    displays 100 divided by the miles per
    gallon. This measures the fuel neces-
    sary to go 100 miles. Some statisti-
    cians and car magazines use this

Free download pdf