AP Statistics 2017


Residuals


When we developed the LSRL, we referred to y – ŷ (the actual value – the predicted value) as an error
in prediction. The formal name for y – ŷ is the residual. Note that the order is always “actual” –
“predicted” so that a positive residual means that the prediction was too small and a negative residual
means that the prediction was too large.


example: In the previous example, a criminal earning $1560/month paid restitution of
$800/month. The predicted restitution for this income would be ŷ = –56.22 + 0.46(1560) =
$661.38. Thus, the residual for this case is $800 – $661.38 = $138.62.
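
The arithmetic in this example can be checked with a short Python sketch. The coefficients –56.22 and 0.46 come from the example above; the function names predict and residual are made up here for illustration and are not part of any calculator or library.

```python
# Coefficients of the fitted line from the example: y-hat = -56.22 + 0.46x
INTERCEPT = -56.22
SLOPE = 0.46


def predict(income):
    """Predicted monthly restitution for a given monthly income."""
    return INTERCEPT + SLOPE * income


def residual(actual, income):
    """Residual = actual - predicted (always in that order)."""
    return actual - predict(income)


r = residual(800, 1560)
print(round(predict(1560), 2), round(r, 2))
# A positive residual means the line predicted too little restitution.
```

Because the residual is positive, the line underestimated what this person actually paid.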

Calculator Tip: The TI-83/84 will generate a complete set of residuals when you perform a LinReg.
They are stored in a list called RESID which can be found in the LIST menu. RESID stores only the
current set of residuals. That is, a new set of residuals is stored in RESID each time you perform a new
regression.

Residuals can be useful to us in determining the extent to which a linear model is appropriate for a
dataset. If a line is an appropriate model, we would expect to find the residuals more or less randomly
scattered about the average residual (which is, of course, 0). In fact, we expect to find them
approximately normally distributed about 0. A pattern of residuals that does not appear to be more or less
randomly distributed about 0 (that is, there is a systematic nature to the graph of the residuals) is evidence
that a line is not a good model for the data. If the residuals are small, the line may predict well even
though it isn’t a good theoretical model for the data. The usual method of determining if a line is a good
model is to visually examine a plot of the residuals against the explanatory variable.
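
To see what a systematic pattern looks like in numbers, here is a minimal Python sketch. The five data points are hypothetical (they follow y = x², chosen only so the arithmetic is exact) and are not taken from this book; the helper names least_squares and residuals are also made up for illustration.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def residuals(xs, ys):
    """Residuals (actual - predicted) from the least-squares line."""
    a, b = least_squares(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Hypothetical curved data: y = x^2, so a line is the wrong model.
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]

res = residuals(xs, ys)
print(res)       # [2.0, -1.0, -2.0, -1.0, 2.0]
print(sum(res))  # 0.0 (least-squares residuals always average to 0)
```

The residuals average to 0, as they must, but they run positive, negative, positive from left to right. That U shape is the kind of systematic pattern that signals a line is not a good model, even though no single residual is large.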


Calculator Tip: In order to draw a residual plot on the TI-83/84, and assuming that your x-data are in
L1 and your y-data are in L2, first do LinReg(a+bx) L1,L2. Next, you create a STAT PLOT
scatterplot, where Xlist is set to L1 and Ylist is set to RESID. RESID can be retrieved from the
LIST menu (remember that only the residuals for the most recent regression are stored in RESID).
ZOOM ZoomStat will then draw the residual plot for the current list of residuals. It’s a good idea to
turn off any equations you may have in the Y= list before doing a residual plot or you may get an
unwanted line on your plot.

example: The data given below show the height (in cm) at various ages (in months) for a group of
children.
(a) Does a line seem to be a good model for the data? Explain.
(b) What is the value of the residual for a child of 19 months?