528 Chapter 9. Essential Statistics for Data Analysis
It is apparent that the mean is the measure of central tendency that is most
affected by bad data points. The mean should therefore be computed only after
the data have been properly filtered, so that the result is meaningful.
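As a rough illustration of this sensitivity, the following sketch compares how far the mean and the median move when one bad point is appended to a small data set. The readings are hypothetical values chosen for the example, and the median is used here simply as a second measure of central tendency for comparison.

```python
from statistics import mean, median

# Hypothetical readings; the appended 100.0 stands in for a bad data point.
clean = [10.1, 9.9, 10.0, 10.2, 9.8]
with_outlier = clean + [100.0]

# How far each measure of central tendency moves because of the bad point.
mean_shift = mean(with_outlier) - mean(clean)
median_shift = median(with_outlier) - median(clean)

print(f"mean:   {mean(clean):.2f} -> {mean(with_outlier):.2f}")
print(f"median: {median(clean):.2f} -> {median(with_outlier):.2f}")
```

In this example a single bad point shifts the mean by far more than it shifts the median, which is why filtering should come before averaging.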
9.2 Measure of Dispersion
The advantage of computing a measure of central tendency, such as the mean, is that
it tells us what to expect if another measurement is taken. Measures of dispersion
tell us how much fluctuation around the central value we should expect. The most
commonly used measure of dispersion is the standard deviation defined by
$$\sigma = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}, \qquad (9.2.1)$$
for a sample of N measurements having a mean of x̄.
We will learn more about this measure when we discuss the probability density
functions.
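A minimal sketch of how the sample standard deviation of Eq. (9.2.1) could be computed is given here, using the N − 1 denominator. The function name `sample_std` and the list `counts` are illustrative assumptions, not part of the text. The result is compared with `statistics.stdev` from the Python standard library, which uses the same N − 1 convention.

```python
import math
from statistics import stdev

def sample_std(values):
    """Standard deviation with the N - 1 denominator, as in Eq. (9.2.1)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements")
    x_bar = sum(values) / n  # sample mean
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))

counts = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical measurements
sigma = sample_std(counts)
print(sigma, stdev(counts))  # the two values should agree
```

The guard for fewer than two measurements is needed because the N − 1 denominator vanishes for a single measurement.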
9.3 Probability
Probability gives a quantitative way to define the chance of occurrence of a certain
event from a class of other possible events. Examples are the chance of getting a tail
when we toss a coin and the chance that a photon incident on a photocathode will
cause a photoelectron to be emitted. Numerically, the value of a probability lies between 0
and 1. A probability of 0 means there is absolutely no chance that the particular
event will occur, while a probability of 1 guarantees with absolute certainty that it
will occur. Our common sense might tell us that only these two extremes should have
any physical significance, since we find it very difficult to associate
an element of chance with whether an event occurs or not. However, in the microscopic
world, which is mainly governed by quantum mechanical phenomena, this is exactly
what happens. When an incident α-particle enters a gaseous detector, it may or
may not interact with the atoms of the gas. It is impossible, according to quantum
mechanics, to say with absolute certainty whether an interaction will take place
or not. Fortunately, however, we can associate statistical quantities with a
large number of incident particles and talk in probabilistic terms. We have seen one
such quantity, the interaction cross section, in earlier chapters. This approach to
predicting events at the microscopic level has been found to be extremely successful
and is therefore extensively used.
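The idea that individual interactions are unpredictable while large samples behave predictably can be sketched with a simple simulation. The per-particle interaction probability `p_interact = 0.3` is a hypothetical value chosen only for illustration, and each particle is assumed to interact independently of the others. It is not derived from any cross section.

```python
import random

def interacting_fraction(n_particles, p_interact, rng):
    """Fraction of n_particles that interact, each independently with probability p_interact."""
    hits = sum(1 for _ in range(n_particles) if rng.random() < p_interact)
    return hits / n_particles

rng = random.Random(12345)  # fixed seed so the run is reproducible
p_interact = 0.3            # hypothetical per-particle interaction probability

# The observed fraction fluctuates for small samples and settles near
# p_interact as the number of incident particles grows.
for n in (10, 1_000, 100_000):
    print(n, interacting_fraction(n, p_interact, rng))
```

Nothing can be said in advance about any single simulated particle, yet the fraction that interacts becomes increasingly stable as the sample grows.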
Mathematically speaking, probability can be defined by considering a sample set
S and its possible subsets A, B, and so on. The probability P is a real-valued
function defined by
1. For every subset A in S, P(A) ≥ 0,
2. For disjoint subsets A and B (that is, A ∩ B = ∅), P(A ∪ B) = P(A) + P(B),
3. P(S) = 1.
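The three conditions can be checked on a small concrete case. The sketch here assumes a sample set for one roll of a fair die with equally likely outcomes. This choice, and the events `A`, `B`, and `C`, are illustrative and not taken from the text. Exact fractions are used so that the equalities hold without rounding error.

```python
from fractions import Fraction

# Sample set S for one roll of a fair die (an illustrative choice).
S = frozenset(range(1, 7))

def P(event):
    """Probability of a subset of S when all six outcomes are equally likely."""
    event = frozenset(event)
    if not event <= S:
        raise ValueError("event must be a subset of S")
    return Fraction(len(event), len(S))

A = {1, 2}   # "roll a 1 or a 2"
B = {5}      # "roll a 5"; disjoint from A
C = {2, 3}   # overlaps A, so additivity is not expected to hold for A and C

print(P(A) >= 0)                 # condition 1
print(P(A | B) == P(A) + P(B))   # condition 2, A and B are disjoint
print(P(S) == 1)                 # condition 3
print(P(A | C) == P(A) + P(C))   # False: A and C share the outcome 2
```

The last line shows why the second condition is restricted to disjoint subsets: when two events overlap, adding their probabilities counts the shared outcomes twice.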