2.3 Summarizing Data Sets
SOLUTION As the sample mean for data set A is $\bar{x} = (3 + 4 + 6 + 7 + 10)/5 = 6$, it follows that its sample variance is

$$s^2 = [(-3)^2 + (-2)^2 + 0^2 + 1^2 + 4^2]/4 = 7.5$$

The sample mean for data set B is also 6; its sample variance is

$$s^2 = [(-26)^2 + (-1)^2 + 9^2 + (18)^2]/3 \approx 360.67$$

Thus, although both data sets have the same sample mean, there is a much greater variability in the values of the B set than in the A set. ■
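The arithmetic in this solution can be checked with a short Python sketch. The function name `sample_variance` is illustrative, and the values of data set B are an assumption: they are recovered from the stated mean of 6 and the deviations −26, −1, 9, 18 that appear in the computation above.

```python
def sample_variance(values):
    """Sample variance: sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


data_a = [3, 4, 6, 7, 10]
# Assumed values: mean 6 plus the deviations -26, -1, 9, 18 used in the solution.
data_b = [-20, 5, 15, 24]

var_a = sample_variance(data_a)
var_b = sample_variance(data_b)

print(var_a)            # 7.5
print(round(var_b, 2))  # 360.67
```

Both sets have mean 6, yet the second variance is roughly 48 times the first, which is the point of the example.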
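The following algebraic identity is often useful for computing the sample variance:

An Algebraic Identity

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2$$

The identity is proven as follows:

$$
\begin{aligned}
\sum_{i=1}^{n} (x_i - \bar{x})^2 &= \sum_{i=1}^{n} \left( x_i^2 - 2 x_i \bar{x} + \bar{x}^2 \right) \\
&= \sum_{i=1}^{n} x_i^2 - 2\bar{x} \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} \bar{x}^2 \\
&= \sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 \\
&= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2
\end{aligned}
$$

The computation of the sample variance can also be eased by noting that if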
$$y_i = a + b x_i, \quad i = 1, \ldots, n$$

then $\bar{y} = a + b\bar{x}$, and so

$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = b^2 \sum_{i=1}^{n} (x_i - \bar{x})^2$$

That is, if $s_y^2$ and $s_x^2$ are the respective sample variances, then

$$s_y^2 = b^2 s_x^2$$
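As a numerical check of both results, the sketch below evaluates the two sides of the algebraic identity for data set A and then compares the sample variances before and after a linear transformation. The constants `a = 2` and `b = 3` are arbitrary illustrative choices, not values from the text; `statistics.variance` is the standard-library sample variance (divisor n − 1).

```python
from statistics import variance

xs = [3, 4, 6, 7, 10]  # data set A
n = len(xs)
x_bar = sum(xs) / n

# Left side of the identity: sum of squared deviations from the mean.
sum_sq_dev = sum((x - x_bar) ** 2 for x in xs)
# Right side of the identity: sum of squares minus n times the squared mean.
shortcut = sum(x * x for x in xs) - n * x_bar ** 2

print(sum_sq_dev, shortcut)  # 30.0 30.0

# Linear transformation y_i = a + b * x_i (a and b are illustrative assumptions).
a, b = 2, 3
ys = [a + b * x for x in xs]

var_x = variance(xs)
var_y = variance(ys)

print(var_x, var_y)  # 7.5 67.5
```

The additive constant `a` shifts every value and the mean by the same amount, so it drops out of the deviations; only the factor `b` survives, squared.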