The Chemistry Maths Book, Second Edition


21.2 Descriptive statistics


in which the sum is over the $k$ distinct values (or classes). Other measures of average, seldom used in the sciences, are:


(i) Mode: the value of the variable that has the greatest frequency (the most popular value). If this value is unique the distribution is called unimodal, but some distributions may have two or more maximum values (bimodal or multimodal distributions).


(ii) Median: the value of the variable that divides the distribution into two equal halves. The values are ordered and the median is the central value if $N$ is odd, and the mean of the two central values when $N$ is even. This quantity is used when order or rank is more important than numerical value.


The three measures of average are equal for a symmetrical unimodal distribution.
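
The definitions above can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the three small data sets are hypothetical, chosen only to illustrate a symmetrical unimodal distribution, an even number of values, and a bimodal distribution.

```python
from statistics import mean, median, multimode

# Hypothetical data sets, not taken from the book's tables.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]   # symmetrical and unimodal
even_count = [1, 3, 5, 7]                 # N even: two central values, 3 and 5
bimodal = [1, 1, 2, 3, 3]                 # two values share the greatest frequency

# Mean, median, and mode coincide for the symmetrical unimodal set.
print(mean(symmetric), median(symmetric), multimode(symmetric))

# For even N the median is the mean of the two central values.
print(median(even_count))

# multimode returns every value that has the greatest frequency.
print(multimode(bimodal))
```

For the symmetrical set all three measures equal 3; the even-sized set has median 4, and the bimodal set has the two modes 1 and 3.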


EXAMPLES 21.1 Mean, mode, and median


(i) The mean value of the data given in Table 21.2 is


\[
\frac{1}{50}\,(1 \times 0 + 0 \times 1 + 5 \times 2 + 5 \times 3 + 9 \times 4 + 10 \times 5 + 9 \times 6 + 6 \times 7 + 3 \times 8 + 2 \times 9 + 0 \times 10) = 4.98
\]


This is close to the ‘expected’ value 5 for the experiment. The same mean is obtained by adding all the values in Table 21.1 and dividing the sum by 50. The mode is 5, with frequency 10, and the median is also 5.
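
As a check on part (i), the calculation can be sketched in Python. The frequencies are read off the terms of the sum above (frequency × value for the values 0 to 10); the variable names are illustrative only.

```python
from statistics import median

# Values x_i = 0, 1, ..., 10 and their frequencies n_i, read off the
# terms of the sum in part (i).
values = list(range(11))
freqs = [1, 0, 5, 5, 9, 10, 9, 6, 3, 2, 0]

N = sum(freqs)                                   # total number of results
mean = sum(n * x for n, x in zip(freqs, values)) / N
mode = values[freqs.index(max(freqs))]           # value with greatest frequency

# Expand the frequency table back into the N individual results
# (already in ascending order) to find the median.
expanded = [x for x, n in zip(values, freqs) for _ in range(n)]
med = median(expanded)

print(N, mean, mode, med)
```

The frequencies sum to 50, and the sketch reproduces the mean 4.98, the mode 5, and the median 5 quoted in the text.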


(ii) The mean of the raw data in Table 21.3 is 43.34. The mean obtained from the histogram in Figure 21.2, using the values at the centres of the classes, is

\[
\frac{1}{50}\,(1 \times 33 + 0 \times 35 + 5 \times 37 + 5 \times 39 + 7 \times 41 + 10 \times 43 + 9 \times 45 + 7 \times 47 + 4 \times 49 + 51 + 53) = 43.28
\]


That the two values of mean are almost the same shows that our allocation of data to classes has not distorted this measure of the distribution. The modal class is 42–44. The median of the raw data is 43.1, and the median class is 42–44, as shown by the dashed lines in Figure 21.3.
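
The grouped-data results in part (ii) can be checked in the same way. This is a sketch under two assumptions: the class centres and frequencies are those appearing in the sum in part (ii), and each class extends one unit either side of its centre (inferred from the modal class 42–44). The helper `class_bounds` is hypothetical.

```python
# Class centres 33, 35, ..., 53 and class frequencies, read off the
# terms of the sum in part (ii).
centres = list(range(33, 55, 2))
freqs = [1, 0, 5, 5, 7, 10, 9, 7, 4, 1, 1]
half_width = 1   # assumption: classes are 2 units wide

N = sum(freqs)
grouped_mean = sum(n * c for n, c in zip(freqs, centres)) / N


def class_bounds(centre):
    """Return the (lower, upper) limits of the class with this centre."""
    return (centre - half_width, centre + half_width)


# Modal class: the class with the greatest frequency.
modal_class = class_bounds(centres[freqs.index(max(freqs))])

# Median class: the first class at which the cumulative frequency
# reaches half of the total.
cumulative = 0
for c, n in zip(centres, freqs):
    cumulative += n
    if cumulative >= N / 2:
        median_class = class_bounds(c)
        break

print(grouped_mean, modal_class, median_class)
```

With these inputs the grouped mean is 43.28, and both the modal class and the median class come out as 42–44, in agreement with the text.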


Exercises 3, 4


Variance and standard deviation


The mean $\bar{x}$ of a set of data gives the position of the centre of the distribution but no information about the spread or dispersion of the data about the mean; two different distributions can have the same mean but very different spreads.
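
A small numerical illustration of this point, using two hypothetical data sets. The spread is measured here with `statistics.pstdev` (the population standard deviation from Python's standard library); this is an assumption made for the sketch and may differ in normalization from the definition the text goes on to give.

```python
from statistics import mean, pstdev

# Two hypothetical data sets with the same mean but different spreads.
narrow = [4, 5, 5, 5, 6]
wide = [1, 3, 5, 7, 9]

print(mean(narrow), mean(wide))       # both means are 5
print(pstdev(narrow), pstdev(wide))   # the second spread is much larger
```

The mean alone cannot distinguish the two sets; a measure of dispersion is needed as well.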


