Mathematical Methods for Physics and Engineering : A Comprehensive Guide


STATISTICS


therefore consider the x_i as a set of N random variables. In the most general
case, these random variables will be described by some N-dimensional joint
probability density function P(x_1, x_2, ..., x_N).§ In other words, an
experiment consisting of N measurements is considered as a single random sample
from the joint distribution (or population) P(x), where x denotes a point in
the N-dimensional data space having coordinates (x_1, x_2, ..., x_N).


The situation is simplified considerably if the sample values x_i are
independent. In this case, the N-dimensional joint distribution P(x) factorises
into the product of N one-dimensional distributions,


    P(x) = P(x_1) P(x_2) ··· P(x_N).    (31.1)

In the general case, each of the one-dimensional distributions P(x_i) may be
different. A typical example of this occurs when N independent measurements are
made of some quantity x but the accuracy of the measuring procedure varies
between measurements.
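
As a rough illustration of the factorisation (31.1) when the accuracy varies between measurements, the sketch below multiplies one-dimensional densities that differ from one measurement to the next. The choice of Gaussian densities, and the names gaussian_pdf, joint_pdf_independent, mu and sigmas, are assumptions made for this example only; the text does not specify the form of the P(x_i).

```python
import math


def gaussian_pdf(x, mu, sigma):
    """One-dimensional Gaussian density (assumed form, for illustration)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def joint_pdf_independent(xs, mu, sigmas):
    """Joint density of independent measurements, as the product in (31.1).

    Measurement xs[i] is assumed to have its own standard deviation
    sigmas[i], which models a measuring procedure of varying accuracy.
    """
    if len(xs) != len(sigmas):
        raise ValueError("xs and sigmas must have the same length")
    p = 1.0
    for x, sigma in zip(xs, sigmas):
        p *= gaussian_pdf(x, mu, sigma)  # one factor P(x_i) per measurement
    return p


# Three measurements of the same quantity with different accuracies.
density = joint_pdf_independent([9.8, 10.1, 10.4], mu=10.0, sigmas=[0.1, 0.2, 0.5])
```

Because the factors are simply multiplied, no N-dimensional function ever has to be written down; only the N one-dimensional densities are needed.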


It is often the case, however, that each sample value x_i is drawn
independently from the same population. In this case, P(x) is of the form
(31.1), but, in addition, P(x_i) has the same form for each value of i. The
measurements x_1, x_2, ..., x_N are then said to form a random sample of size N
from the one-dimensional population P(x). This is the most common situation met
in practice and, unless stated otherwise, we will assume from now on that this
is the case.
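
A minimal sketch of a random sample of size N drawn from a single one-dimensional population follows. It assumes, purely for illustration, that the population is Gaussian; the function name draw_random_sample and the parameters mu, sigma and seed are hypothetical and do not come from the text.

```python
import random


def draw_random_sample(n, mu=0.0, sigma=1.0, seed=None):
    """Draw n independent values from one (assumed Gaussian) population.

    Every value comes from the same distribution, so the result is a
    random sample of size n in the sense used in the text.
    """
    rng = random.Random(seed)  # private generator, reproducible if seeded
    return [rng.gauss(mu, sigma) for _ in range(n)]


# A sample of size N = 5; fixing the seed makes the draw repeatable.
sample = draw_random_sample(5, mu=10.0, sigma=0.2, seed=1)
```

Repeating the call with a different seed corresponds to repeating the whole experiment, which in general yields a different sample from the same population.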


31.2 Sample statistics

Suppose we have a set of N measurements x_1, x_2, ..., x_N. Any function of
these measurements (that contains no unknown parameters) is called a sample
statistic, or often simply a statistic. Sample statistics provide a means of
characterising the data. Although the resulting characterisation is inevitably
incomplete, it is useful to be able to describe a set of data in terms of a few
pertinent numbers. We now discuss the most commonly used sample statistics.
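
To make the definition concrete, the sketch below contrasts two functions that depend on the measurements alone, and are therefore statistics, with one that also requires a population parameter. The function names are illustrative choices, not notation from the text.

```python
def sample_mean(xs):
    """Arithmetic mean of the measurements: a function of the data only."""
    return sum(xs) / len(xs)


def sample_range(xs):
    """Largest minus smallest measurement: also a function of the data only."""
    return max(xs) - min(xs)


def mean_offset_from_population(xs, mu):
    """Sample mean minus the population mean mu.

    If mu is an unknown parameter, this quantity cannot be evaluated from
    the data alone, so it is not a statistic in the sense defined above.
    """
    return sample_mean(xs) - mu


data = [1.0, 4.0, 2.5]
mean_value = sample_mean(data)    # computable from the data
range_value = sample_range(data)  # computable from the data
```

The first two functions can be evaluated as soon as the measurements are recorded, whereas the third needs a value of mu supplied from outside the data.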


§ In this chapter, we will adopt the common convention that P(x) denotes the particular probability
density function that applies to its argument, x. This obviates the need to use a different letter
for the PDF of each new variable. For example, if X and Y are random variables with different
PDFs, then properly one should denote these distributions by f(x) and g(y), say. In our shorthand
notation, these PDFs are denoted by P(x) and P(y), where it is understood that the functional
form of the PDF may be different in each case.