Introduction to Probability and Statistics for Engineers and Scientists


376 Chapter 9: Regression


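Inferences About        Use the Distributional Result

$\alpha + \beta x_0$

$$\frac{A + Bx_0 - \alpha - \beta x_0}{\sqrt{\left(\dfrac{1}{n} + \dfrac{(x_0 - \bar{x})^2}{S_{xx}}\right)\left(\dfrac{SS_R}{n-2}\right)}} \sim t_{n-2}$$

$Y(x_0)$

$$\frac{Y(x_0) - A - Bx_0}{\sqrt{\left(1 + \dfrac{1}{n} + \dfrac{(x_0 - \bar{x})^2}{S_{xx}}\right)\left(\dfrac{SS_R}{n-2}\right)}} \sim t_{n-2}$$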
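As a rough illustration of how these two distributional results might be used, the following Python sketch inverts each of them to obtain an interval for the mean response $\alpha + \beta x_0$ and an interval for a future response $Y(x_0)$. The function names, the sample data, and the decision to pass the $t$ quantile in by hand (here 3.182, the tabulated value of $t_{0.025,3}$) are assumptions made for this example and do not come from the text.

```python
from math import sqrt


def fit_line(xs, ys):
    """Return the least squares estimators A, B together with x-bar, S_xx, SS_R."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    # SS_R: sum of squared residuals about the fitted line
    ss_r = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    return a, b, x_bar, s_xx, ss_r


def intervals_at(xs, ys, x0, t_quantile):
    """Intervals at input x0; t_quantile is t_{a/2, n-2}, supplied by the caller."""
    n = len(xs)
    a, b, x_bar, s_xx, ss_r = fit_line(xs, ys)
    center = a + b * x0
    spread = (x0 - x_bar) ** 2 / s_xx
    # Denominators of the two t statistics above
    se_mean = sqrt((1 / n + spread) * ss_r / (n - 2))
    se_future = sqrt((1 + 1 / n + spread) * ss_r / (n - 2))
    mean_interval = (center - t_quantile * se_mean, center + t_quantile * se_mean)
    future_interval = (center - t_quantile * se_future, center + t_quantile * se_future)
    return mean_interval, future_interval


# Hypothetical data with n = 5, so there are n - 2 = 3 degrees of freedom
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
mean_interval, future_interval = intervals_at(xs, ys, x0=3.5, t_quantile=3.182)
print("95% interval for alpha + beta*x0:", mean_interval)
print("95% interval for Y(x0):          ", future_interval)
```

Both intervals are centered at $A + Bx_0$. The one for $Y(x_0)$ is always wider, because of the extra 1 under the square root.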

9.5 THE COEFFICIENT OF DETERMINATION AND THE SAMPLE CORRELATION COEFFICIENT


Suppose we wanted to measure the amount of variation in the set of response values
$Y_1, \ldots, Y_n$ corresponding to the set of input values $x_1, \ldots, x_n$. A standard measure in
statistics of the amount of variation in a set of values $Y_1, \ldots, Y_n$ is given by the quantity


$$S_{YY} = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$$

For instance, if all the $Y_i$ are equal (and thus are all equal to $\bar{Y}$), then $S_{YY}$ would
equal 0.
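A minimal Python sketch of this quantity, using a hypothetical helper name `s_yy` and made-up values, might look as follows.

```python
def s_yy(ys):
    """Sum of squared deviations of the responses about their average."""
    y_bar = sum(ys) / len(ys)
    return sum((y - y_bar) ** 2 for y in ys)


# Identical responses show no variation at all
print(s_yy([3.0, 3.0, 3.0]))
# Spread-out responses give a positive value
print(s_yy([1.0, 2.0, 3.0]))
```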
The variation in the values of the $Y_i$ arises from two factors. First, because the input
values $x_i$ are different, the response variables $Y_i$ all have different mean values, which will
result in some variation in their values. Second, the variation also arises from the fact
that even when the differences in the input values are taken into account, each of the
response variables $Y_i$ has variance $\sigma^2$ and thus will not exactly equal the predicted value
at its input $x_i$.
Let us consider now the question as to how much of the variation in the values of the
response variables is due to the different input values, and how much is due to the inherent
variance of the responses even when the input values are taken into account. To answer
this question, note that the quantity


$$SS_R = \sum_{i=1}^{n} (Y_i - A - Bx_i)^2$$

measures the remaining amount of variation in the response values after the different input
values have been taken into account.
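To make the comparison concrete, the sketch below computes both $S_{YY}$ and $SS_R$ for the same data after fitting the least squares line. The function name `variation_summary` and the data are hypothetical and serve only as illustration.

```python
def variation_summary(xs, ys):
    """Return (S_YY, SS_R) for the least squares line fitted to the data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx          # least squares slope B
    a = y_bar - b * x_bar    # least squares intercept A
    # Total variation of the responses about their average
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    # Variation remaining once the inputs are taken into account
    ss_r = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    return s_yy, ss_r


# Hypothetical, nearly linear data
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
s_yy, ss_r = variation_summary(xs, ys)
print("S_YY        =", s_yy)
print("SS_R        =", ss_r)
print("S_YY - SS_R =", s_yy - ss_r)
```

For a least squares fit, $SS_R$ can never exceed $S_{YY}$. With nearly linear data such as these, $SS_R$ is only a small fraction of $S_{YY}$.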
Thus,

$$S_{YY} - SS_R$$