526 Chapter 9. Essential Statistics for Data Analysis
9.1 Measures of Centrality
Whenever we talk about data, we generally mean numbers obtained through
some experiment and corresponding to measurable quantities. An example would
be the nuclear scan of a patient obtained by a CCD camera. The output of such a
scan would consist of the ADC counts observed by each pixel of the camera at regular
intervals of time. After the data has been obtained, there must be some algorithm to
analyze it. Such algorithms are, of course, application dependent and are developed
according to the requirements. However, there are certain quantities that are almost
always sought in every analysis. One such quantity is a suitable measure
of centrality. For our example, this might be the average number of ADC counts
received during the scan period by each pixel, or the average total counts received
by each pixel.
Now, what is this average and why is it so important? This is not very hard
to understand if we keep in mind that one of the purposes of any experiment is to
determine how the system normally behaves and what value should be expected if
another measurement is taken. This normal or expected value is what is
referred to as the average or measure of central tendency.
There are different ways in which the measure of central tendency can be
obtained. The three most commonly used measures are
Mean,
Median, and
Mode.
The true meaning of these measures will become clear when we discuss the
probability density functions later in the chapter. However, at this point it is worthwhile
to see what we normally mean by these quantities. For this discussion we will
assume that we have taken several measurements of a quantity, such as the activity of
a radioactive sample, at regular intervals of time. Even if this quantity is not
expected to change with time, we will still see fluctuations in the measurements. These
fluctuations will mainly be due to two effects: the statistical nature of the process
(radioactivity, in this case) and the measurement uncertainty (of the detection system).
After we are done with the measurements, we can just add all the numbers and
divide the result by the number of data points. This is called the mean of the data.
Mathematically, we can write it as
\[
\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i, \tag{9.1.1}
\]
where $N$ represents the number of measurements and $x_i$ is the value of the
parameter measured at each point.
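As a minimal sketch of equation 9.1.1, the mean can be computed in a few lines of Python. The function name `mean` and the count values below are illustrative assumptions, not data from an actual measurement.

```python
def mean(values):
    """Arithmetic mean as in Eq. (9.1.1): sum of the values divided by N."""
    n = len(values)
    if n == 0:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / n


# Hypothetical counts recorded at regular intervals (made-up numbers).
counts = [102, 98, 105, 97, 101, 97]
x_bar = mean(counts)
print(x_bar)  # prints 100.0
```

The guard against an empty list reflects the fact that the expression is undefined for N = 0.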
Sometimes such a computation of the mean is not very meaningful. For example, if
we know that each of these data points has a different importance with respect to all
the other measurements, then we must weight each data point accordingly. In
this case the expression for the mean will be
\[
\bar{x} = \frac{\sum_{i=1}^{N} w_i x_i}{\sum_{i=1}^{N} w_i}
\]
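The weighted mean above can be sketched in Python as follows. The function name `weighted_mean`, the values, and the weights are all hypothetical and chosen only for illustration. How the weights should be assigned depends on the application and is not fixed by this sketch.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of w_i * x_i divided by the sum of w_i."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the sum of the weights must be nonzero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Hypothetical measurements; the second one is given twice the importance.
values = [10.0, 12.0, 11.0]
weights = [1.0, 2.0, 1.0]
print(weighted_mean(values, weights))  # (10 + 24 + 11) / 4 = 11.25
```

If all the weights are equal, the expression reduces to the simple mean of equation 9.1.1, which gives a quick consistency check on any implementation.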