Mathematical Methods for Physics and Engineering : A Comprehensive Guide


STATISTICS


31.4.6 Population covariance Cov[x, y] and correlation Corr[x, y]

So far we have assumed that each of our $N$ independent samples consists of a single number $x_i$. Let us now extend our discussion to a situation in which each sample consists of two numbers $x_i$, $y_i$, which we may consider as being drawn randomly from a two-dimensional population $P(x, y)$. In particular, we now consider estimators for the population covariance $\mathrm{Cov}[x, y]$ and for the correlation $\mathrm{Corr}[x, y]$.


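When $\mu_x$ and $\mu_y$ are known, an appropriate estimator of the population covariance is

\[
\widehat{\mathrm{Cov}}[x, y] = \overline{xy} - \mu_x \mu_y = \left( \frac{1}{N} \sum_{i=1}^{N} x_i y_i \right) - \mu_x \mu_y. \tag{31.59}
\]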

This estimator is unbiased since


\[
E\left[ \widehat{\mathrm{Cov}}[x, y] \right] = \frac{1}{N} E\left[ \sum_{i=1}^{N} x_i y_i \right] - \mu_x \mu_y = E[x_i y_i] - \mu_x \mu_y = \mathrm{Cov}[x, y].
\]
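The unbiasedness of (31.59) can be checked exactly on a small discrete population. The following is a minimal sketch, assuming a hypothetical toy population of four equally likely $(x, y)$ points; the names `POPULATION`, `cov_known_means` and `expected_value` are illustrative and not from the text. It enumerates every possible sample of size $N = 3$ (drawn with replacement) and averages the estimator over them using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy population: four equally likely (x, y) points.
POPULATION = [(0, 0), (1, 2), (2, 1), (3, 5)]


def population_moments(pop):
    """Return the exact mu_x, mu_y and Cov[x, y] of an equiprobable population."""
    n = len(pop)
    mu_x = Fraction(sum(x for x, _ in pop), n)
    mu_y = Fraction(sum(y for _, y in pop), n)
    cov = Fraction(sum(x * y for x, y in pop), n) - mu_x * mu_y
    return mu_x, mu_y, cov


def cov_known_means(sample, mu_x, mu_y):
    """Estimator (31.59): the mean of x_i * y_i minus mu_x * mu_y."""
    n = len(sample)
    return Fraction(sum(x * y for x, y in sample), n) - mu_x * mu_y


def expected_value(estimator, pop, n):
    """Exact expectation over all equally likely samples of size n."""
    samples = list(product(pop, repeat=n))
    return sum(estimator(s) for s in samples) / len(samples)


mu_x, mu_y, cov = population_moments(POPULATION)
expectation = expected_value(
    lambda s: cov_known_means(s, mu_x, mu_y), POPULATION, 3
)
print(cov)                 # 7/4
print(expectation == cov)  # True
```

Exact enumeration is only practical for very small populations and sample sizes; it is used here because it avoids the statistical noise of a random simulation.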

Alternatively, if $\mu_x$ and $\mu_y$ are unknown, it is natural to replace $\mu_x$ and $\mu_y$ in (31.59) by the sample means $\bar{x}$ and $\bar{y}$ respectively, in which case we recover the sample covariance $V_{xy} = \overline{xy} - \bar{x}\,\bar{y}$ discussed in subsection 31.2.4. This estimator is biased, but an unbiased estimator of the population covariance is obtained by forming


\[
\widehat{\mathrm{Cov}}[x, y] = \frac{N}{N-1} V_{xy}. \tag{31.60}
\]
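As an illustration of (31.60), the sketch below computes $V_{xy}$ and the corrected estimator for a small made-up data set, and compares the result with the equivalent deviation form $\frac{1}{N-1}\sum_i (x_i - \bar{x})(y_i - \bar{y})$. The function names and the data are assumptions chosen for the example, and integer data are assumed so that exact fractions can be used.

```python
from fractions import Fraction


def sample_covariance(xs, ys):
    """V_xy: mean of x*y minus the product of the sample means (biased)."""
    n = len(xs)
    mean_x = Fraction(sum(xs), n)
    mean_y = Fraction(sum(ys), n)
    mean_xy = Fraction(sum(x * y for x, y in zip(xs, ys)), n)
    return mean_xy - mean_x * mean_y


def unbiased_covariance(xs, ys):
    """Estimator (31.60): N / (N - 1) times the sample covariance."""
    n = len(xs)
    return Fraction(n, n - 1) * sample_covariance(xs, ys)


def deviation_form(xs, ys):
    """Sum of products of deviations from the sample means, over N - 1."""
    n = len(xs)
    mean_x = Fraction(sum(xs), n)
    mean_y = Fraction(sum(ys), n)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Made-up integer data, for illustration only.
xs = [1, 2, 3, 4]
ys = [2, 1, 4, 3]

print(sample_covariance(xs, ys))    # 3/4
print(unbiased_covariance(xs, ys))  # 1
print(deviation_form(xs, ys))       # 1
```

For $N = 4$ the correction factor is $4/3$, so the difference between the two estimators is substantial for small samples and shrinks as $N$ grows.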

Calculate the expectation value of the sample covariance $V_{xy}$ for a sample of size $N$.

The sample covariance is given by


\[
V_{xy} = \left( \frac{1}{N} \sum_i x_i y_i \right) - \left( \frac{1}{N} \sum_i x_i \right) \left( \frac{1}{N} \sum_j y_j \right).
\]


Thus its expectation value is given by


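\[
\begin{aligned}
E[V_{xy}] &= \frac{1}{N} E\left[ \sum_i x_i y_i \right] - \frac{1}{N^2} E\left[ \left( \sum_i x_i \right) \left( \sum_j y_j \right) \right] \\
&= E[x_i y_i] - \frac{1}{N^2} E\left[ \sum_i x_i y_i + \sum_{\substack{i,j \\ j \neq i}} x_i y_j \right]
\end{aligned}
\]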
