activities, his or her grades will improve as a result? Or is it more likely that
if this correlation is true, the type of people who are good students also tend
to be the type of people who join after-school groups? You should therefore
be careful never to confuse correlation with cause and effect, or causality.
Spearman’s Rank Correlation Coefficient s
Pearson’s correlation coefficient is not without problems. It can be susceptible
to the influence of outliers in the data set, and it assumes that a straight-line
relationship exists between the two variables. In the presence of outliers
or a curved relationship, Pearson’s r may not detect a significant correlation.
In those cases, you may be better off using a nonparametric measure of
correlation, Spearman’s rank correlation, which is usually denoted by the
symbol s. As with the nonparametric tests in Chapter 7, you replace observed
values with their ranks and calculate the value of s on the ranks. Spearman’s
rank correlation, like many other nonparametric statistics, is less susceptible
to the influence of outliers and is better than Pearson’s correlation for
nonlinear relationships. The downside to the Spearman correlation is that it
is not as powerful as the Pearson correlation in detecting significant
correlations in situations where the parametric assumptions are satisfied.
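To make the rank-replacement idea concrete, here is a minimal Python sketch, offered as an illustration rather than as the method any particular package uses. It assumes that tied values receive the average of their ranks, which is a common convention, and the function names (average_ranks, pearson_r, spearman_s) and the small data set are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson's correlation r for two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_s(x, y):
    """Spearman's s: Pearson's r calculated on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Made-up illustration: a curved but steadily increasing relationship.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # y = x**3

print(pearson_r(x, y))   # below 1 (about 0.943) because the points curve
print(spearman_s(x, y))  # 1.0, because the two rankings agree exactly
```

Because only the ordering of the values enters the calculation, making the largest y value even larger would leave s unchanged while moving r.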
Correlation Functions in Excel
To calculate correlation values in Excel, you can use some of the functions
shown in Table 8-4. Note that Excel does not include functions to calculate
Spearman’s rank correlation or the p values for the two types of correlation
measures.
Table 8-4 Calculating Correlation Values

Function           Description
CORREL(x, y)       Calculates Pearson’s correlation r for the values in x and y.
CORRELP(x, y)      Calculates the two-sided p value of Pearson’s correlation for the values in x and y. StatPlus required.
SPEARMAN(x, y)     Calculates Spearman’s rank correlation s for the values in x and y. StatPlus required.
SPEARMANP(x, y)    Calculates the two-sided p value of Spearman’s rank correlation for the values in x and y. StatPlus required.
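The text does not say how StatPlus obtains its p values. As a rough guide to what a two-sided p value for a correlation involves, the Python sketch below uses the standard t test for Pearson’s r, with t = r·sqrt((n − 2)/(1 − r²)) on n − 2 degrees of freedom. The same formula applied to s is a common large-sample approximation for Spearman’s correlation. The function names and the numerical integration are illustrative assumptions, not the add-in’s actual implementation.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    # Note: math.gamma overflows for very large df; fine for modest samples.
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p(r, n, steps=2000):
    """Two-sided p value for a correlation r based on n pairs (t test)."""
    df = n - 2
    if df < 1:
        raise ValueError("need at least three pairs of values")
    if abs(r) >= 1:
        return 0.0  # a perfect correlation gives an infinite t statistic
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule (steps must be even) for the area under the
    # density between 0 and t.
    h = t / steps
    area = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    # The two tails together hold whatever lies outside (-t, t).
    return max(0.0, 1 - 2 * area)

print(two_sided_p(0.5, 3))   # about 0.667: r = 0.5 from only 3 pairs is weak evidence
print(two_sided_p(0.0, 10))  # 1.0: no correlation at all
```

The first call shows why the p value matters alongside the correlation itself: the same r is far less convincing when it comes from only a handful of observations.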
Let’s use these functions to calculate the correlation between the mortality
index and the mean annual temperature for the breast cancer data.