The Chemistry Maths Book, Second Edition


Chapter 21 Probability and statistics


measurements of a physical quantity have random errors, with positive and negative


errors equally likely and with large errors less probable than small errors. The ‘S’ shape


of the cumulative frequency graph is typical of such symmetrical distributions.


Exercise 2


The pictorial and graphical representations of data provide qualitative information


about the centre of a distribution, its width, and its shape. These properties are readily


quantified in terms of a small number of computed distribution statistics such as the


mean, the standard deviation, and the skewness.


We consider an experiment in which the possible results (outcomes) are represented
by the discrete variable $x$ whose values form a set of $k$ values $\{x_1, x_2, \ldots, x_k\}$
(the population or sample space). If the possible results are any values in a continuous
range then the set represents $k$ class intervals. In a particular experiment, let the
results of $N$ measurements of $x$ (a sample of the population) consist of $n_1$ values
of $x_1$, $n_2$ of $x_2$, $\ldots$, $n_k$ of $x_k$. The total number of measurements (the
sample size) is then

$$N = \sum_{i=1}^{k} n_i \tag{21.1}$$
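As a small illustration of the frequencies $n_i$ and the sample size in equation (21.1), the following Python sketch tallies a sample and sums the counts. The data in `sample` are made up for the example and are not taken from the text.

```python
from collections import Counter

# Hypothetical sample (an assumption for illustration):
# N = 10 measurements of a discrete variable x.
sample = [2, 3, 3, 1, 2, 3, 4, 2, 3, 1]

# Frequencies n_i: the number of times each distinct value x_i occurs.
frequencies = Counter(sample)

# Sample size from equation (21.1): N = n_1 + n_2 + ... + n_k.
N = sum(frequencies.values())

for x_i in sorted(frequencies):
    print(f"x = {x_i}: n = {frequencies[x_i]}")
print("N =", N)
```

Summing the frequencies recovers the number of measurements, so `N` here equals `len(sample)`.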


In general, different samples of the same (parent) population have different frequency


distributions; that is, they have different sets of frequencies $\{n_1, n_2, \ldots, n_k\}$. Each


sample distribution is an approximation to the ‘true’ distribution of the parent. Small


samples differ more than large samples, and a fundamental principle underlying


statistics is that the differences between sample distributions are expected to become


small when the sample sizes become large enough, and that sample distributions tend


to the distribution of the parent population as $N \to \infty$. This is sometimes called the


law of large numbers.
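The tendency of sample distributions towards the parent distribution can be explored numerically. The sketch below assumes, purely for illustration, that the parent population is a fair six-sided die, so that every outcome has probability 1/6. The function names `relative_frequencies` and `max_deviation` are hypothetical helpers, not notation from the text.

```python
import random
from collections import Counter


def relative_frequencies(n_samples, rng):
    """Return the relative frequencies n_i / N for n_samples rolls of a fair die."""
    counts = Counter(rng.randint(1, 6) for _ in range(n_samples))
    return {face: counts[face] / n_samples for face in range(1, 7)}


def max_deviation(n_samples, rng):
    """Return the largest |n_i / N - 1/6| over the six faces."""
    freqs = relative_frequencies(n_samples, rng)
    return max(abs(f - 1 / 6) for f in freqs.values())


rng = random.Random(0)  # fixed seed so that a run is reproducible
for n in (10, 1_000, 100_000):
    # Larger samples typically lie closer to the parent distribution.
    print(n, round(max_deviation(n, rng), 4))
```

The printed deviations depend on the random draws, so they are not guaranteed to shrink at every step. The expected trend is a decrease roughly in proportion to $1/\sqrt{N}$.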


The problems associated with finite sample sizes will be touched upon in Section


21.11. We assume in the meantime that $N$ is large enough for these problems to be


unimportant.


Mean, mode, and median


The most generally useful measure of the average value of $x$ is the arithmetic mean,

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i \tag{21.2}$$

in which the sum is over all the $N$ data values, or

$$\bar{x} = \frac{1}{N}\left(n_1 x_1 + n_2 x_2 + \cdots + n_k x_k\right) = \frac{1}{N} \sum_{i=1}^{k} n_i x_i \tag{21.3}$$
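The two forms of the arithmetic mean, a sum over all $N$ data values and a frequency-weighted sum over the $k$ distinct values, can be checked against each other numerically. The following Python sketch uses a made-up sample (an assumption for illustration) and computes the mean both ways.

```python
from collections import Counter

# Hypothetical sample (an assumption for illustration).
sample = [2, 3, 3, 1, 2, 3, 4, 2, 3, 1]
N = len(sample)

# Equation (21.2): sum over all N individual data values.
mean_from_values = sum(sample) / N

# Equation (21.3): sum over the k distinct values x_i,
# each weighted by its frequency n_i.
frequencies = Counter(sample)
mean_from_frequencies = sum(n_i * x_i for x_i, n_i in frequencies.items()) / N

print(mean_from_values, mean_from_frequencies)  # prints "2.4 2.4"
```

Both routes give the same result because grouping equal data values only reorders the terms of the sum.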