$\sum (x - \bar{x})(y - \bar{y}) < 0$ for negative association. With proper standardization, we obtain

\[
r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\left[\sum (x - \bar{x})^2\right]\left[\sum (y - \bar{y})^2\right]}}
\]

so that

\[
-1 \le r \le 1
\]

This statistic, $r$, called the correlation coefficient, is a popular measure for the
strength of a statistical relationship; here is a shortcut formula:
\[
r = \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sqrt{\left[\sum x^2 - \frac{(\sum x)^2}{n}\right]\left[\sum y^2 - \frac{(\sum y)^2}{n}\right]}}
\]

Meaningful interpretation of the correlation coefficient $r$ is rather complicated
at this level. We will revisit the topic in Chapter 8 in the context of
regression analysis, a statistical method that is closely connected to correlation.
Generally:
Values near 1 indicate a strong positive association.
Values near $-1$ indicate a strong negative association.
Values around 0 indicate a weak association.

Interpretation of $r$ should be made cautiously, however. It is true that a
scatter plot of data that results in a correlation number of $+1$ or $-1$ has to lie
in a perfectly straight line. But a correlation of 0 doesn’t mean that there is no
association; it means that there is no linear association. You can have a correlation
near 0 and yet have a very strong association, such as the case when the
data fall neatly on a sharply bending curve.
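
The caution above can be checked numerically. The following is a minimal sketch, assuming made-up illustration data that do not come from the text: seven x values symmetric about 0, one y series lying exactly on a straight line and one lying exactly on a parabola. The function name `pearson_r` is a hypothetical helper that applies the definitional formula for $r$.

```python
import math


def pearson_r(xs, ys):
    """Correlation coefficient r from the definitional formula."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical illustration data (not from the text).
xs = [-3, -2, -1, 0, 1, 2, 3]
line = [2 * x + 1 for x in xs]   # points exactly on a straight line
curve = [x ** 2 for x in xs]     # points exactly on a bending curve

r_line = pearson_r(xs, line)
r_curve = pearson_r(xs, curve)
print(f"straight line: r = {r_line:.3f}")
print(f"parabola:      r = {r_curve:.3f}")
```

In this sketch the straight-line data give $r = 1$, while the parabola gives $r = 0$ even though y is completely determined by x. The positive and negative cross-products cancel because the x values are symmetric about their mean.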
Example 2.8 Consider again the birth-weight problem described earlier in this
section. We have the data given in Table 2.13. Using the five totals, we obtain
\[
r = \frac{94{,}322 - \frac{(1207)(975)}{12}}{\sqrt{\left[123{,}561 - \frac{(1207)^2}{12}\right]\left[86{,}487 - \frac{(975)^2}{12}\right]}} = -0.946
\]
indicating a very strong negative association.
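
The arithmetic in Example 2.8 can be reproduced from the five totals alone. The sketch below applies the shortcut formula; the variable names are illustrative, and the numbers are the totals quoted in the example, with n = 12 pairs.

```python
import math

# Totals quoted in Example 2.8 (n = 12 pairs).
n = 12
sum_x, sum_y = 1207, 975
sum_xy = 94_322
sum_x2, sum_y2 = 123_561, 86_487

# Shortcut formula: no need to compute the means or deviations first.
numerator = sum_xy - sum_x * sum_y / n
denominator = math.sqrt(
    (sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n)
)
r = numerator / denominator

print(round(r, 3))  # -0.946
```

The numerator is negative (about $-3746.75$), which is what gives $r$ its negative sign; the denominator is always positive.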