Introduction to Probability and Statistics for Engineers and Scientists


Chapter 2: Descriptive Statistics


of exercise and good nutrition; or it may be that it is not knowledge that
is making the difference but rather it is that people who have had more
education tend to end up in jobs that allow them more time for exercise
and money for good nutrition. The strong negative correlation between
years in school and resting pulse rate probably results from a combination
of these as well as other underlying factors.


We will now prove the first three properties of the sample correlation coefficient $r$. That
is, we will prove that $|r| \le 1$, with equality when the data lie on a straight line. To begin,
note that


$$\sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{s_x}-\frac{y_i-\bar{y}}{s_y}\right)^2 \ge 0 \tag{2.6.1}$$

or
$$\sum_{i=1}^{n}\frac{(x_i-\bar{x})^2}{s_x^2}+\sum_{i=1}^{n}\frac{(y_i-\bar{y})^2}{s_y^2}-2\sum_{i=1}^{n}\frac{(x_i-\bar{x})(y_i-\bar{y})}{s_x s_y} \ge 0$$

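or, since $\sum_{i=1}^{n}(x_i-\bar{x})^2 = (n-1)s_x^2$, $\sum_{i=1}^{n}(y_i-\bar{y})^2 = (n-1)s_y^2$, and $\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y}) = (n-1)\,r\,s_x s_y$,

$$(n-1) + (n-1) - 2(n-1)r \ge 0$$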

showing that


$$r \le 1$$
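The chain of inequalities above can be checked numerically. The following is a minimal Python sketch, assuming the sample standard deviations use the divisor $n-1$ (as the step producing $n-1$ implies). The data values and the names `standardize`, `xs`, `ys`, `lhs` are illustrative assumptions, not taken from the text.

```python
import math

def standardize(values):
    """Return (v - mean) / s for each value, with s computed using n - 1."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / s for v in values]

# Hypothetical paired data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]
n = len(xs)

zx = standardize(xs)
zy = standardize(ys)

# Sample correlation coefficient: sum of products of standardized values / (n - 1).
r = sum(a * b for a, b in zip(zx, zy)) / (n - 1)

# Left-hand side of (2.6.1): a sum of squares, hence nonnegative.
lhs = sum((a - b) ** 2 for a, b in zip(zx, zy))

print("r =", r)
print("sum of squared differences =", lhs)
print("2(n - 1)(1 - r) =", 2 * (n - 1) * (1 - r))
```

Up to floating-point rounding, the last two printed values should agree, which is the identity that forces $r \le 1$.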

Note also that $r = 1$ if and only if there is equality in Equation (2.6.1). That is, $r = 1$ if
and only if for all $i$,


$$\frac{y_i-\bar{y}}{s_y}=\frac{x_i-\bar{x}}{s_x}$$

or, equivalently,


$$y_i=\bar{y}-\frac{s_y}{s_x}\bar{x}+\frac{s_y}{s_x}x_i$$

That is, $r = 1$ if and only if the data values $(x_i, y_i)$ lie on a straight line having a positive
slope.
To show that $r \ge -1$, with equality if and only if the data values $(x_i, y_i)$ lie on a straight
line having a negative slope, start with
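As an illustration of the equality case, the sketch below builds points on a line with positive slope and checks that $r$ is 1 up to rounding, and that each point satisfies the line with intercept $\bar{y}-(s_y/s_x)\bar{x}$ and slope $s_y/s_x$. The particular line $y = 3 + 2x$, the data, and the helper name `mean_and_sd` are assumptions made for the example only.

```python
import math

def mean_and_sd(values):
    """Sample mean and sample standard deviation (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return mean, sd

# Hypothetical data lying exactly on the line y = 3 + 2x (positive slope).
xs = [0.0, 1.0, 2.0, 5.0]
ys = [3.0 + 2.0 * x for x in xs]
n = len(xs)

x_bar, s_x = mean_and_sd(xs)
y_bar, s_y = mean_and_sd(ys)

# Sample correlation coefficient.
r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)

# Each y_i reconstructed from the line y = y_bar - (s_y/s_x) x_bar + (s_y/s_x) x.
slope = s_y / s_x
fitted = [y_bar - slope * x_bar + slope * x for x in xs]

print("r =", r)
print("all points on the line:",
      all(math.isclose(y, f) for y, f in zip(ys, fitted)))
```

Floating-point arithmetic may return a value such as 0.9999999999999999 rather than exactly 1, so the comparison is best made with a tolerance.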


$$\sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{s_x}+\frac{y_i-\bar{y}}{s_y}\right)^2 \ge 0$$

and use an argument analogous to the one just given.
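For reference, one way the analogous argument might be written out is sketched below. It mirrors the steps used for $r \le 1$, with the sign of the cross term reversed; it is a reconstruction under the same definitions of $s_x$, $s_y$, and $r$, not a quotation from the text.

```latex
\begin{align*}
0 &\le \sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{s_x}+\frac{y_i-\bar{y}}{s_y}\right)^2 \\
  &= \sum_{i=1}^{n}\frac{(x_i-\bar{x})^2}{s_x^2}
   + \sum_{i=1}^{n}\frac{(y_i-\bar{y})^2}{s_y^2}
   + 2\sum_{i=1}^{n}\frac{(x_i-\bar{x})(y_i-\bar{y})}{s_x s_y} \\
  &= (n-1) + (n-1) + 2(n-1)r \\
  &= 2(n-1)(1+r),
\end{align*}
so that $r \ge -1$. Equality holds if and only if, for all $i$,
\[
\frac{y_i-\bar{y}}{s_y} = -\frac{x_i-\bar{x}}{s_x},
\qquad\text{that is,}\qquad
y_i = \bar{y} + \frac{s_y}{s_x}\,\bar{x} - \frac{s_y}{s_x}\,x_i ,
\]
a straight line with slope $-s_y/s_x < 0$.
```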
