528 CHAPTER 8 Discrete Probability
The scaling of a random variable. Given a random variable X and a real number k, the
random variable that sends ω to kX(ω) is denoted by (kX), and its expectation is denoted
by E(kX). When k ≠ 0, we can define a random variable (X/k) with expectation E(X/k)
in similar fashion.
The square of the deviation from a fixed value. Given a random variable X and a real
number k, the random variable that sends ω to (X(ω) - k)^2 is denoted by (X - k)^2. When
k = 0, we simply write X^2.
Constant random variables. Given a real number k, the random variable that sends ω
to k is denoted by k and is called a constant random variable.
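These constructions can be illustrated numerically. The following is a minimal sketch, assuming a fair six-sided die as the sample space; the names p, E, X, kX, dev_sq, and const below are ours, chosen for illustration, and do not come from the text:

```python
from fractions import Fraction as F

# Sample space: a fair die; p maps each outcome w to its probability.
p = {w: F(1, 6) for w in range(1, 7)}

def E(X):
    """Expectation of X: the probability-weighted sum over the sample space."""
    return sum(X(w) * p[w] for w in p)

X = lambda w: w                      # the identity random variable
kX = lambda w: 3 * X(w)              # the scaling (3X)
dev_sq = lambda w: (X(w) - 2) ** 2   # the squared deviation (X - 2)^2
const = lambda w: 5                  # the constant random variable 5

print(E(X))        # 7/2
print(E(kX))       # 21/2  (= 3 * E(X))
print(E(dev_sq))   # 31/6
print(E(const))    # 5
```

A random variable here is simply a function from outcomes to real numbers, which is exactly the textbook definition; exact rational arithmetic avoids floating-point noise in the checks.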
When it is clear which set is being summed over in a summation Σ, we will often omit
the subscript on the summation Σ. Similarly, we will often omit subscripts on products Π.
The next theorem relates the expectation of the sum and the scaling of random
variables to the expectations of the original random variables.
Theorem 3. (Linearity of Expectation) Let X1, X2, ..., Xn, and X be random vari-
ables defined on the same sample space Ω, and let k be a real number. Then:

(a) E(X1 + ··· + Xn) = Σ_{i=1}^{n} E(Xi).

(b) E(kX) = kE(X), and where k ≠ 0, E(X/k) = E(X)/k.

(c) E((X1 + ··· + Xn)/n) = (1/n) Σ_{i=1}^{n} E(Xi).
Proof. (a) We prove part (a) for n = 2 and leave the proof (by induction) for n > 2 as
an exercise. (All unsubscripted sums Σ will be taken over ω ∈ Ω.) From the formula for
expectation given in Theorem 2, we have

E(X1 + X2) = Σ ((X1 + X2)(ω)) · p(ω)
= Σ (X1(ω) + X2(ω)) · p(ω)
= Σ (X1(ω) · p(ω) + X2(ω) · p(ω))
= Σ X1(ω)p(ω) + Σ X2(ω)p(ω)
= E(X1) + E(X2)
(b) Since division by k ≠ 0 can be regarded as multiplication by 1/k, the second half of
part (b) follows from the first half, which can be established as follows:

E(kX) = Σ ((kX)(ω)) · p(ω)
= Σ k · X(ω) · p(ω)
= k Σ X(ω) · p(ω)
= kE(X)
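The identities in Theorem 3 can also be verified numerically on a finite sample space. A minimal sketch using two fair dice; the names p, E, X1, and X2 are ours, chosen for illustration:

```python
from fractions import Fraction as F

# Sample space: ordered pairs from two fair dice, each with probability 1/36.
p = {(a, b): F(1, 36) for a in range(1, 7) for b in range(1, 7)}

def E(X):
    """Expectation of X under p, as in Theorem 2."""
    return sum(X(w) * p[w] for w in p)

X1 = lambda w: w[0]   # value shown by the first die
X2 = lambda w: w[1]   # value shown by the second die
k = 4

# Part (a): E(X1 + X2) = E(X1) + E(X2)
assert E(lambda w: X1(w) + X2(w)) == E(X1) + E(X2)

# Part (b): E(kX) = kE(X), and E(X/k) = E(X)/k for k != 0
assert E(lambda w: k * X1(w)) == k * E(X1)
assert E(lambda w: F(X1(w), k)) == E(X1) / k

# Part (c): the expectation of the average equals the average of the expectations
assert E(lambda w: F(X1(w) + X2(w), 2)) == F(1, 2) * (E(X1) + E(X2))

print(E(X1) + E(X2))  # 7
```

Note that the checks hold exactly, not just approximately, because linearity of expectation is an identity of sums, which is precisely what the proof above manipulates.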