finite sum over these points
$$\mathbb{E}[f] \simeq \frac{1}{N} \sum_{n=1}^{N} f(x_n). \tag{1.35}$$
We shall make extensive use of this result when we discuss sampling methods in
Chapter 11. The approximation in (1.35) becomes exact in the limit $N \to \infty$.
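The finite-sum approximation (1.35) can be tried out numerically. The following is a minimal sketch, not a method prescribed by the text: the choice of $f(x) = x^2$, the uniform distribution on $(0, 1)$, the seed, and the helper name `monte_carlo_expectation` are all illustrative assumptions. For this choice the exact expectation is $1/3$, so the estimate can be compared against a known value.

```python
import random

def monte_carlo_expectation(f, sampler, n_samples):
    """Approximate E[f] by (1/N) * sum_n f(x_n), with each x_n drawn by sampler()."""
    total = 0.0
    for _ in range(n_samples):
        total += f(sampler())
    return total / n_samples

rng = random.Random(0)  # fixed seed (an arbitrary choice) for reproducibility

def f(x):
    return x * x

# For x ~ Uniform(0, 1) the exact value is E[x^2] = 1/3.
exact = 1.0 / 3.0
estimates = {n: monte_carlo_expectation(f, rng.random, n) for n in (10, 1000, 100000)}
for n, estimate in estimates.items():
    print(n, estimate, abs(estimate - exact))
```

The error of an individual run fluctuates, so a larger $N$ is not guaranteed to beat a smaller one on every draw, but the estimate concentrates around the exact value as $N$ grows.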
Sometimes we will be considering expectations of functions of several variables,
in which case we can use a subscript to indicate which variable is being averaged
over, so that for instance
$$\mathbb{E}_x[f(x, y)] \tag{1.36}$$
denotes the average of the function $f(x, y)$ with respect to the distribution of $x$. Note
that $\mathbb{E}_x[f(x, y)]$ will be a function of $y$.
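To make concrete why averaging over $x$ alone leaves a function of $y$, here is a small sketch for a discrete $x$. The distribution `p_x` and the function $f(x, y) = (x - y)^2$ are hypothetical choices made only for illustration.

```python
# Hypothetical discrete distribution p(x) over x in {0, 1, 2}.
p_x = {0: 0.2, 1: 0.5, 2: 0.3}

def f(x, y):
    return (x - y) ** 2

def expectation_over_x(y):
    """E_x[f(x, y)] = sum_x p(x) f(x, y): x is summed out, y remains free."""
    return sum(p * f(x, y) for x, p in p_x.items())

# The result changes with y, because only x has been averaged over.
for y in (0.0, 1.0, 2.0):
    print(y, expectation_over_x(y))
```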
We can also consider a conditional expectation with respect to a conditional
distribution, so that
$$\mathbb{E}_x[f \mid y] = \sum_x p(x \mid y) f(x) \tag{1.37}$$
with an analogous definition for continuous variables.
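For discrete variables, the conditional expectation can be computed directly from a joint table. The sketch below assumes a hypothetical joint distribution `joint` and obtains $p(x \mid y)$ as $p(x, y) / p(y)$; the table values and the function name are illustrative only.

```python
# Hypothetical joint distribution p(x, y) over x in {0, 1} and y in {"a", "b"}.
joint = {(0, "a"): 0.1, (1, "a"): 0.3, (0, "b"): 0.4, (1, "b"): 0.2}

def conditional_expectation(f, y):
    """E_x[f | y] = sum_x p(x | y) f(x), with p(x | y) = p(x, y) / p(y)."""
    p_y = sum(p for (_, y_value), p in joint.items() if y_value == y)
    return sum(p / p_y * f(x) for (x, y_value), p in joint.items() if y_value == y)

def identity(x):
    return x

# Conditioning on different values of y gives different averages of f(x) = x.
print(conditional_expectation(identity, "a"))
print(conditional_expectation(identity, "b"))
```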
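The variance of $f(x)$ is defined by
$$\operatorname{var}[f] = \mathbb{E}\left[\left(f(x) - \mathbb{E}[f(x)]\right)^2\right] \tag{1.38}$$
and provides a measure of how much variability there is in $f(x)$ around its mean
value $\mathbb{E}[f(x)]$. Expanding out the square, we see that the variance can also be written
in terms of the expectations of $f(x)$ and $f(x)^2$ (Exercise 1.5)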
$$\operatorname{var}[f] = \mathbb{E}[f(x)^2] - \mathbb{E}[f(x)]^2. \tag{1.39}$$
In particular, we can consider the variance of the variable $x$ itself, which is given by
$$\operatorname{var}[x] = \mathbb{E}[x^2] - \mathbb{E}[x]^2. \tag{1.40}$$
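The step from the definition of the variance to (1.39) uses only the linearity of expectation and the fact that $\mathbb{E}[f(x)]$ is a constant. A sketch of the expansion, starting from the variance as the expected squared deviation of $f(x)$ from its mean:

```latex
\begin{aligned}
\operatorname{var}[f]
  &= \mathbb{E}\left[\left(f(x) - \mathbb{E}[f(x)]\right)^2\right] \\
  &= \mathbb{E}\left[f(x)^2 - 2 f(x)\,\mathbb{E}[f(x)] + \mathbb{E}[f(x)]^2\right] \\
  &= \mathbb{E}[f(x)^2] - 2\,\mathbb{E}[f(x)]\,\mathbb{E}[f(x)] + \mathbb{E}[f(x)]^2 \\
  &= \mathbb{E}[f(x)^2] - \mathbb{E}[f(x)]^2 .
\end{aligned}
```

Setting $f(x) = x$ gives (1.40).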
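For two random variables $x$ and $y$, the covariance is defined by
$$\begin{aligned}
\operatorname{cov}[x, y] &= \mathbb{E}_{x,y}\left[\{x - \mathbb{E}[x]\}\{y - \mathbb{E}[y]\}\right] \\
&= \mathbb{E}_{x,y}[xy] - \mathbb{E}[x]\,\mathbb{E}[y]
\end{aligned} \tag{1.41}$$
which expresses the extent to which $x$ and $y$ vary together. If $x$ and $y$ are
independent, then their covariance vanishes (Exercise 1.6).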
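The claim that independence makes the covariance vanish can be checked on a small discrete example using the second form in (1.41). The marginals, the joint tables, and the function name `covariance` below are hypothetical and chosen only to illustrate the point; the independent joint is built as the product of its marginals.

```python
def covariance(joint):
    """cov[x, y] = E[xy] - E[x]E[y] for a discrete joint given as {(x, y): p}."""
    e_x = sum(p * x for (x, y), p in joint.items())
    e_y = sum(p * y for (x, y), p in joint.items())
    e_xy = sum(p * x * y for (x, y), p in joint.items())
    return e_xy - e_x * e_y

# Hypothetical marginals; the independent joint is their product p(x)p(y).
p_x = {0: 0.3, 1: 0.7}
p_y = {-1: 0.5, 2: 0.5}
independent = {(x, y): px * py for x, px in p_x.items() for y, py in p_y.items()}

# A dependent joint in which y always equals x.
dependent = {(0, 0): 0.3, (1, 1): 0.7}

cov_independent = covariance(independent)  # zero up to floating-point rounding
cov_dependent = covariance(dependent)      # equals var[x] here, since y = x
print(cov_independent, cov_dependent)
```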
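In the case of two vectors of random variables $\mathbf{x}$ and $\mathbf{y}$, the covariance is a matrix
$$\begin{aligned}
\operatorname{cov}[\mathbf{x}, \mathbf{y}] &= \mathbb{E}_{\mathbf{x},\mathbf{y}}\left[\{\mathbf{x} - \mathbb{E}[\mathbf{x}]\}\{\mathbf{y}^{\mathrm{T}} - \mathbb{E}[\mathbf{y}^{\mathrm{T}}]\}\right] \\
&= \mathbb{E}_{\mathbf{x},\mathbf{y}}[\mathbf{x}\mathbf{y}^{\mathrm{T}}] - \mathbb{E}[\mathbf{x}]\,\mathbb{E}[\mathbf{y}^{\mathrm{T}}].
\end{aligned} \tag{1.42}$$
If we consider the covariance of the components of a vector $\mathbf{x}$ with each other, then
we use a slightly simpler notation $\operatorname{cov}[\mathbf{x}] \equiv \operatorname{cov}[\mathbf{x}, \mathbf{x}]$.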
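As an illustration of the matrix form, the sketch below estimates $\operatorname{cov}[\mathbf{x}, \mathbf{y}]$ from paired samples by replacing each expectation in (1.42) with a sample average. The data and the helper names are hypothetical. The averages divide by $N$ rather than $N - 1$, to mirror the expectation-based definition rather than the unbiased estimator.

```python
def mean_vector(samples):
    """Component-wise sample mean of a list of equal-length vectors."""
    n = len(samples)
    return [sum(s[i] for s in samples) / n for i in range(len(samples[0]))]

def covariance_matrix(xs, ys):
    """Sample version of cov[x, y] = E[x y^T] - E[x] E[y^T].

    xs and ys are equal-length lists of paired vectors; entry (i, j) of the
    result is the average of x_i * y_j minus the product of the two means.
    """
    n = len(xs)
    mx, my = mean_vector(xs), mean_vector(ys)
    return [
        [sum(x[i] * y[j] for x, y in zip(xs, ys)) / n - mx[i] * my[j]
         for j in range(len(my))]
        for i in range(len(mx))
    ]

# Hypothetical data: four samples of a 2-dimensional vector x.
xs = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 4.0]]
cov_xx = covariance_matrix(xs, xs)  # cov[x] = cov[x, x]
for row in cov_xx:
    print(row)
```

In this sketch the diagonal of `cov_xx` holds the variances of the individual components, and the matrix is symmetric because $\operatorname{cov}[\mathbf{x}, \mathbf{x}]$ pairs each component with every other.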