532 CHAPTER 8 Discrete Probability
Regarding the random variable (X − μ)^2 as the sum of the random variable X^2,
the random variable −2μX, and the constant random variable μ^2 allows us to apply
Theorem 3(a) (Linearity of Expectation) from Section 8.7.5:
E((X − μ)^2) = E(X^2 − 2μX + μ^2)
= E(X^2) + E(−2μX) + E(μ^2)
= E(X^2) − 2μE(X) + μ^2
= E(X^2) − μ^2
since E(X) = μ.
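The shortcut Var(X) = E(X^2) − μ^2 derived above can be checked numerically. A minimal sketch in Python, using an invented three-point distribution (the values and probabilities are illustrative, not from the text):

```python
from fractions import Fraction

# A hypothetical finite distribution: value of X -> probability.
dist = {0: Fraction(1, 4), 1: Fraction(1, 2), 3: Fraction(1, 4)}

mu = sum(x * p for x, p in dist.items())        # E(X)
e_x2 = sum(x**2 * p for x, p in dist.items())   # E(X^2)

# Direct definition: Var(X) = E((X - mu)^2).
var_direct = sum((x - mu)**2 * p for x, p in dist.items())

# Shortcut from part (a): Var(X) = E(X^2) - mu^2.
var_shortcut = e_x2 - mu**2

assert var_direct == var_shortcut
```

Exact rational arithmetic (via fractions) makes the equality check exact rather than subject to floating-point round-off.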
(b) From Theorem 3(b) in Section 8.7.5, we have E(kX) = kμ. Applying part (a) of the
theorem to the random variable kX gives
Var(kX) = E((kX)^2) − (kμ)^2 = E(k^2 X^2) − k^2 μ^2
By Theorem 3(b) of the previous section, E(k^2 X^2) = k^2 E(X^2), so
Var(kX) = k^2 E(X^2) − k^2 μ^2 = k^2 (E(X^2) − μ^2) = k^2 Var(X) ■
Now that we have shown how expectation and variance are computed under various
operations, we see how to use these results.
Example 2. Suppose we toss a fair coin, associating 1 with heads and −1 with tails. Thus,
we have a sample space Ω = {heads, tails}, with p(heads) = p(tails) = 1/2, and a random
variable X defined on Ω. The range of X is Ω_X = {−1, 1}, with p_X(−1) = p_X(1) = 1/2.
What is Var(X)?
Solution. First, we need a value for μ, so we compute
E(X) = (−1)(1/2) + (1)(1/2) = 0 = μ
Hence,
Var(X) = E((X − μ)^2) = E(X^2)
However, E(X^2) is not E(X)^2 = 0. In fact,
E(X^2) = Σ_{ω∈Ω} X(ω)^2 · p(ω) = (−1)^2 (1/2) + (1)^2 (1/2) = 1
Hence, Var(X) = 1. ■
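The computation in Example 2 can be replayed directly; the distribution below is exactly the fair coin of the example, with X = 1 for heads and X = −1 for tails:

```python
from fractions import Fraction

# Example 2's fair coin: X = 1 (heads), X = -1 (tails), each with probability 1/2.
coin = {1: Fraction(1, 2), -1: Fraction(1, 2)}

mu = sum(x * p for x, p in coin.items())              # E(X) = 0
var = sum((x - mu)**2 * p for x, p in coin.items())   # E((X - mu)^2)

assert mu == 0
assert var == 1  # matches Var(X) = 1 from the example
```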
The next theorem shows that when the variance of a random variable is small, this can
be interpreted as meaning that its values tend to cluster around its expected value (mean).
Theorem 2. (Bound on the Probability of Deviation from the Expected Value) Let
X be a random variable on sample space Ω, and let μ and σ^2 denote the mean and variance
of X, respectively. For ε > 0, let P(|X − μ| ≥ ε) denote the probability of the event that
X(ω) differs from μ by ε or more. Then, for all ε > 0,
P(|X − μ| ≥ ε) ≤ σ^2 / ε^2
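The bound can be checked on a small invented distribution (this merely verifies that the inequality holds for a few values of ε; it is not a proof):

```python
from fractions import Fraction

# Hypothetical distribution: value of X -> probability.
dist = {0: Fraction(1, 2), 2: Fraction(1, 4), 4: Fraction(1, 4)}

mu = sum(x * p for x, p in dist.items())                 # mean
sigma2 = sum((x - mu)**2 * p for x, p in dist.items())   # variance

for eps in (1, 2, 3):
    # P(|X - mu| >= eps): total probability of values at least eps from the mean.
    tail = sum(p for x, p in dist.items() if abs(x - mu) >= eps)
    assert tail <= sigma2 / eps**2   # Theorem 2's bound
```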