2.3 Summarizing Data Sets
SOLUTION As the sample mean for data set A is $\bar{x} = (3 + 4 + 6 + 7 + 10)/5 = 6$, it follows
that its sample variance is
$$s^2 = [(-3)^2 + (-2)^2 + 0^2 + 1^2 + 4^2]/4 = 7.5$$
The sample mean for data set B is also 6; its sample variance is
$$s^2 = [(-26)^2 + (-1)^2 + 9^2 + (18)^2]/3 \approx 360.67$$
Thus, although both data sets have the same sample mean, there is much greater
variability in the values of the B set than in the A set. ■
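The arithmetic in this solution can be checked with a short Python sketch. The function name `sample_variance` is illustrative. The values of data set B are not listed in this passage, so they are inferred here from the deviations $-26, -1, 9, 18$ about the mean 6, which is an assumption.

```python
def sample_variance(xs):
    """Sample variance with the n - 1 divisor, as in the solution above."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)


data_a = [3, 4, 6, 7, 10]
# Assumption: B is reconstructed from the deviations -26, -1, 9, 18 about 6.
data_b = [-20, 5, 15, 24]

var_a = sample_variance(data_a)
var_b = sample_variance(data_b)

print(var_a)            # sample variance of A
print(round(var_b, 2))  # sample variance of B, rounded to two decimals
```

Both sets have mean 6, so any difference between the two results comes from spread alone.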
The following algebraic identity is often useful for computing the sample variance:
An Algebraic Identity
$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2$$
The identity is proven as follows:
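$$
\begin{aligned}
\sum_{i=1}^{n} (x_i - \bar{x})^2 &= \sum_{i=1}^{n} \left( x_i^2 - 2 x_i \bar{x} + \bar{x}^2 \right) \\
&= \sum_{i=1}^{n} x_i^2 - 2\bar{x} \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} \bar{x}^2 \\
&= \sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 \\
&= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2
\end{aligned}
$$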
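As a numerical check of the identity, the following Python sketch computes the sum of squared deviations both ways for the values 3, 4, 6, 7, 10. The function names are illustrative only.

```python
import math


def sum_sq_dev_direct(xs):
    """Left-hand side: sum of (x_i - x_bar)^2."""
    x_bar = sum(xs) / len(xs)
    return sum((x - x_bar) ** 2 for x in xs)


def sum_sq_dev_shortcut(xs):
    """Right-hand side: sum of x_i^2 minus n * x_bar^2."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum(x * x for x in xs) - n * x_bar ** 2


data = [3, 4, 6, 7, 10]
direct = sum_sq_dev_direct(data)
shortcut = sum_sq_dev_shortcut(data)

# The two sides agree; dividing either by n - 1 gives the sample variance.
print(math.isclose(direct, shortcut))
print(shortcut / (len(data) - 1))
```

The shortcut form is convenient by hand. In floating-point arithmetic it subtracts two potentially large, nearly equal numbers, so it can lose precision when the mean is large relative to the spread of the data.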
The computation of the sample variance can also be eased by noting that if
$$y_i = a + b x_i, \quad i = 1, \ldots, n$$
then $\bar{y} = a + b\bar{x}$, and so
$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = b^2 \sum_{i=1}^{n} (x_i - \bar{x})^2$$
That is, if $s_y^2$ and $s_x^2$ are the respective sample variances, then
$$s_y^2 = b^2 s_x^2$$
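The scaling rule can be illustrated with a small Python sketch. The constants `a = 2` and `b = 3` are arbitrary choices made for this example, and the data are the values 3, 4, 6, 7, 10. The standard library function `statistics.variance` uses the $n - 1$ divisor, matching the sample variance used here.

```python
import math
from statistics import mean, variance

xs = [3, 4, 6, 7, 10]
a, b = 2, 3  # arbitrary illustrative constants

# Apply the linear transformation y_i = a + b * x_i.
ys = [a + b * x for x in xs]

var_x = variance(xs)
var_y = variance(ys)

# The mean shifts and scales; the variance only scales, by b squared.
print(math.isclose(mean(ys), a + b * mean(xs)))
print(math.isclose(var_y, b ** 2 * var_x))
```

Taking $b = 1$ shows that adding a constant to every value leaves the sample variance unchanged, which is what makes it possible to subtract a convenient constant before computing by hand.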