http://www.ck12.org Chapter 15. Concepts of Statistics
A naïve conclusion would be to say that doctors cause cancer. One of the most misunderstood concepts in statistics
is that correlation does not imply causation. Just because there is a correlation between the number of doctors and
the cancer rate doesn’t mean that the number of doctors causes the cancer. There are dozens of reasons why more
doctors might correlate with higher cancer rates. In general, remember that correlation is not the same as causation.
Be careful before drawing any conclusions about a change in one variable causing a change in another variable.
Vocabulary
A scatterplot creates an (x, y) point from each data pair.
Bivariate data is two sets of data that are paired.
The correlation coefficient, r, is a number in the interval [-1, 1]. It indicates the strength of the correlation between
two variables.
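For reference, the text above does not state how r is computed. Assuming the usual (Pearson) correlation coefficient, which is what a calculator's linear regression normally reports, one common form of the formula is sketched below. The symbols n (the number of data pairs) and the means \bar{x} and \bar{y} are notation introduced here, not taken from the text.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

A value of r near 1 or -1 means the points lie close to a straight line, and a value near 0 means there is little linear relationship.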
Guided Practice
- The data below represents the average number of working words in an elementary student’s vocabulary as it
relates to their shoe size. Perform a linear regression that models the data.
TABLE 15.13:
Shoe Size    1.0   1.5   2.0   2.5   3.0   3.5   4.0   4.5
Vocabulary   1135  1983  2501  4113  5431  7891  9320  11041
- Use the equation from Guided Practice 1 to predict the vocabulary for someone who has a 1.0 shoe size. Does
this prediction seem reasonable given the data? Why or why not?
- Shaquille O’Neal has size 23 shoes. What, if anything, can you infer about his vocabulary? Does a larger shoe
size cause a larger vocabulary?
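The regression in the first problem can be done on a graphing calculator or with software. As one possible approach, here is a minimal Python sketch that fits a least-squares line to the data in Table 15.13 using only the standard library. The function name linear_regression and the variable names are illustrative choices, not part of the text.

```python
from math import sqrt


def linear_regression(xs, ys):
    """Return (intercept, slope, r) for the least-squares line
    y-hat = intercept + slope * x fitted to the paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return intercept, slope, r


# Data from Table 15.13.
shoe_size = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
vocabulary = [1135, 1983, 2501, 4113, 5431, 7891, 9320, 11041]

intercept, slope, r = linear_regression(shoe_size, vocabulary)
print(f"y-hat = {intercept:.4f} + {slope:.4f}x, r = {r:.4f}")
```

Evaluating intercept + slope * x for a given shoe size x gives the model's prediction. Whether that prediction is meaningful is a separate question, especially for shoe sizes far outside the range of the data.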
Answers:
- Let x represent shoe size and y represent vocabulary.
ŷ = −2660.4167 + 2940.8333x
r = 0.9865
The correlation coefficient is very close to positive one. This is a strong indication that the data can be modeled by
a linear relationship.
- ŷ = −2660.4167 + 2940.8333 · 1