Introduction to Probability and Statistics for Engineers and Scientists


2.3 Summarizing Data Sets


SOLUTION As the sample mean for data set A is $\bar{x} = (3 + 4 + 6 + 7 + 10)/5 = 6$, it follows that its sample variance is

$$s^2 = [(-3)^2 + (-2)^2 + 0^2 + 1^2 + 4^2]/4 = 7.5$$

The sample mean for data set B is also 6; its sample variance is

$$s^2 = [(-26)^2 + (-1)^2 + 9^2 + (18)^2]/3 \approx 360.67$$

Thus, although both data sets have the same sample mean, there is a much greater
variability in the values of the B set than in the A set. ■
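
The two variances above can be reproduced with a short Python sketch. Data set A is taken from the solution; the values of data set B are an assumption, reconstructed from the deviations shown above (−26, −1, 9, 18) and the stated mean of 6. The function name `sample_variance` is illustrative only.

```python
# Data set A as given in the solution.
data_a = [3, 4, 6, 7, 10]
# Data set B: reconstructed from the deviations (-26, -1, 9, 18) and mean 6.
data_b = [-20, 5, 15, 24]


def sample_variance(values):
    """Sample variance with the n - 1 divisor, as in the text."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


var_a = sample_variance(data_a)
var_b = sample_variance(data_b)
print(var_a, round(var_b, 2))  # prints: 7.5 360.67
```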


The following algebraic identity is often useful for computing the sample variance:

An Algebraic Identity
$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2$$

The identity is proven as follows:


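$$
\begin{aligned}
\sum_{i=1}^{n} (x_i - \bar{x})^2 &= \sum_{i=1}^{n} \left( x_i^2 - 2x_i\bar{x} + \bar{x}^2 \right) \\
&= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + \sum_{i=1}^{n} \bar{x}^2 \\
&= \sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 \\
&= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2
\end{aligned}
$$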
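
As a numerical check of the identity, the following Python sketch computes the sample variance of data set A both from the definition and from the sum of squares. For data set A the sum of squares is 9 + 16 + 36 + 49 + 100 = 210 and n times the squared mean is 5 · 36 = 180, so both routes give 30/4 = 7.5. The function names are hypothetical, chosen for illustration.

```python
def variance_by_definition(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def variance_by_identity(values):
    """Uses sum(x_i^2) - n * x_bar^2 in place of the squared deviations."""
    n = len(values)
    x_bar = sum(values) / n
    sum_of_squares = sum(x * x for x in values)
    return (sum_of_squares - n * x_bar ** 2) / (n - 1)


data_a = [3, 4, 6, 7, 10]
direct = variance_by_definition(data_a)
shortcut = variance_by_identity(data_a)
print(direct, shortcut)  # prints: 7.5 7.5
```

In floating-point arithmetic the second form subtracts two potentially large numbers, so it may lose precision when the mean is large relative to the spread; for small hand computations like the one above this is not a concern.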

The computation of the sample variance can also be eased by noting that if


$$y_i = a + bx_i, \quad i = 1, \ldots, n$$

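then $\bar{y} = a + b\bar{x}$, so that $y_i - \bar{y} = b(x_i - \bar{x})$ for each $i$, and so

$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = b^2 \sum_{i=1}^{n} (x_i - \bar{x})^2$$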

That is, if $s_y^2$ and $s_x^2$ are the respective sample variances, then

$$s_y^2 = b^2 s_x^2$$
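
A small Python sketch can illustrate this scaling rule. The constants a = 10 and b = 2 are arbitrary values assumed here for illustration, and the x values are those of data set A; the additive constant a drops out and the variance is multiplied by b squared.

```python
def sample_variance(values):
    """Sample variance with the n - 1 divisor."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


a, b = 10, 2                  # illustrative constants, not from the text
xs = [3, 4, 6, 7, 10]         # data set A
ys = [a + b * x for x in xs]  # y_i = a + b * x_i

var_x = sample_variance(xs)
var_y = sample_variance(ys)
print(var_x, var_y, b ** 2 * var_x)  # prints: 7.5 30.0 30.0
```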