STATISTICS
31.4.6 Population covariance Cov[x, y] and correlation Corr[x, y]

So far we have assumed that each of our $N$ independent samples consists of
a single number $x_i$. Let us now extend our discussion to a situation in which
each sample consists of two numbers $x_i$, $y_i$, which we may consider as being
drawn randomly from a two-dimensional population $P(x, y)$. In particular, we
now consider estimators for the population covariance $\mathrm{Cov}[x, y]$ and for the
correlation $\mathrm{Corr}[x, y]$.
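When $\mu_x$ and $\mu_y$ are known, an appropriate estimator of the population
covariance is
\[
\widehat{\mathrm{Cov}}[x, y] = \overline{xy} - \mu_x\mu_y
= \left(\frac{1}{N}\sum_{i=1}^{N} x_i y_i\right) - \mu_x\mu_y. \qquad (31.59)
\]
This estimator is unbiased since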
\[
E\left[\widehat{\mathrm{Cov}}[x, y]\right]
= \frac{1}{N} E\left[\sum_{i=1}^{N} x_i y_i\right] - \mu_x\mu_y
= E[x_i y_i] - \mu_x\mu_y = \mathrm{Cov}[x, y].
\]
Alternatively, if $\mu_x$ and $\mu_y$ are unknown, it is natural to replace $\mu_x$ and $\mu_y$ in
(31.59) by the sample means $\bar{x}$ and $\bar{y}$ respectively, in which case we recover the
sample covariance $V_{xy} = \overline{xy} - \bar{x}\bar{y}$ discussed in subsection 31.2.4. This estimator
is biased, but an unbiased estimator of the population covariance is obtained by
forming
\[
\widehat{\mathrm{Cov}}[x, y] = \frac{N}{N-1} V_{xy}. \qquad (31.60)
\]

Calculate the expectation value of the sample covariance $V_{xy}$ for a sample of size $N$.

The sample covariance is given by
\[
V_{xy} = \left(\frac{1}{N}\sum_i x_i y_i\right)
- \left(\frac{1}{N}\sum_i x_i\right)\left(\frac{1}{N}\sum_j y_j\right).
\]
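As a concrete illustration, the three estimators discussed in this subsection can be sketched in a few lines of Python. This is a minimal sketch and not part of the original text: the function names `cov_known_means`, `sample_covariance` and `cov_unbiased` and the four data pairs are hypothetical choices made for the example.

```python
def cov_known_means(xs, ys, mu_x, mu_y):
    """Estimator (31.59): mean of x_i*y_i minus the known product mu_x*mu_y."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) / n - mu_x * mu_y


def sample_covariance(xs, ys):
    """Sample covariance V_xy: mean of x_i*y_i minus the product of sample means."""
    n = len(xs)
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    return mean_xy - (sum(xs) / n) * (sum(ys) / n)


def cov_unbiased(xs, ys):
    """Estimator (31.60): N/(N-1) times V_xy (requires at least two samples)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two samples")
    return n / (n - 1) * sample_covariance(xs, ys)


# Hypothetical data, chosen only to make the arithmetic easy to follow.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

v_xy = sample_covariance(xs, ys)      # 15 - 2.5 * 5 = 2.5
cov_hat = cov_unbiased(xs, ys)        # (4/3) * 2.5
print(v_xy, cov_hat)
```

For any sample with $N \geq 2$ the two values differ only by the factor $N/(N-1)$, which tends to unity as $N$ grows.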
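Thus its expectation value is given by
\[
\begin{aligned}
E[V_{xy}] &= \frac{1}{N} E\left[\sum_i x_i y_i\right]
- \frac{1}{N^2} E\left[\left(\sum_i x_i\right)\left(\sum_j y_j\right)\right] \\
&= E[x_i y_i] - \frac{1}{N^2}
E\left[\sum_i x_i y_i + \sum_{\substack{i,j \\ j \neq i}} x_i y_j\right].
\end{aligned}
\]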
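The last step separates the product of the two sums into its $N$ diagonal terms ($j = i$) and its $N(N-1)$ off-diagonal terms ($j \neq i$). The following sketch checks this bookkeeping on a small set of integers; the helper `split_double_sum` and the data are hypothetical and serve only as an illustration.

```python
def split_double_sum(xs, ys):
    """Split (sum_i x_i)(sum_j y_j) into diagonal and off-diagonal parts.

    Returns the diagonal sum, the off-diagonal sum and the number of
    off-diagonal terms.
    """
    n = len(xs)
    diagonal = sum(xs[i] * ys[i] for i in range(n))
    off_diagonal = sum(
        xs[i] * ys[j] for i in range(n) for j in range(n) if j != i
    )
    n_off = sum(1 for i in range(n) for j in range(n) if j != i)
    return diagonal, off_diagonal, n_off


# Hypothetical integer data, so that the comparison below is exact.
xs = [1, 2, 3, 4]
ys = [5, 7, 11, 13]

diagonal, off_diagonal, n_off = split_double_sum(xs, ys)
product_of_sums = sum(xs) * sum(ys)

# The two parts together reproduce the full product of sums.
print(diagonal + off_diagonal == product_of_sums)
# The off-diagonal part contains N(N-1) terms.
print(n_off == len(xs) * (len(xs) - 1))
```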