Proof. Let $y_i = x_i - \bar{x}$, $i = 1, \ldots, n$. For any $b > 0$, we have that
\[
\sum_{i=1}^{n} (y_i + b)^2 \ge \sum_{i : y_i \ge ks} (y_i + b)^2 \ge \sum_{i : y_i \ge ks} (ks + b)^2 = N(k)(ks + b)^2 \tag{2.4.1}
\]
where the first inequality follows because $(y_i + b)^2 \ge 0$, and the second because both $ks$ and $b$ are positive. However,
\[
\sum_{i=1}^{n} (y_i + b)^2 = \sum_{i=1}^{n} \bigl( y_i^2 + 2b y_i + b^2 \bigr) = \sum_{i=1}^{n} y_i^2 + 2b \sum_{i=1}^{n} y_i + n b^2 = (n-1)s^2 + n b^2
\]
where the final equality uses the fact that
\[
\sum_{i=1}^{n} y_i = \sum_{i=1}^{n} (x_i - \bar{x}) = \sum_{i=1}^{n} x_i - n\bar{x} = 0.
\]
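To see why $\sum_{i=1}^{n} y_i^2 = (n-1)s^2$, recall that $s^2$ denotes the sample variance:
\[
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n-1} \sum_{i=1}^{n} y_i^2 ,
\qquad\text{so}\qquad
\sum_{i=1}^{n} y_i^2 = (n-1)s^2 .
\]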
Therefore, we obtain from Equation (2.4.1) that
\[
N(k) \le \frac{(n-1)s^2 + n b^2}{(ks + b)^2}
\]
implying, since $(n-1)/n \le 1$, that
\[
\frac{N(k)}{n} \le \frac{s^2 + b^2}{(ks + b)^2}.
\]
Because the preceding bound is valid for all $b > 0$, we are free to choose the value of $b$ that minimizes its right-hand side, namely $b = s/k$.
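To verify that $b = s/k$ is indeed the minimizing choice, differentiate with respect to $b$:
\[
\frac{d}{db}\,\frac{s^2 + b^2}{(ks + b)^2}
= \frac{2b(ks + b)^2 - 2(ks + b)(s^2 + b^2)}{(ks + b)^4}
= \frac{2s(bk - s)}{(ks + b)^3},
\]
which is negative for $b < s/k$ and positive for $b > s/k$, so the right-hand side is smallest at $b = s/k$.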
Setting $b = s/k$ yields
\[
\frac{N(k)}{n} \le \frac{s^2 + s^2/k^2}{(ks + s/k)^2}.
\]
Multiplying the numerator and the denominator of the right side of the preceding by $k^2/s^2$ gives
\[
\frac{N(k)}{n} \le \frac{k^2 + 1}{(k^2 + 1)^2} = \frac{1}{k^2 + 1}
\]
and the result is proven. Thus, for instance, whereas the usual Chebyshev inequality shows that at most 25 percent of data values are at least 2 standard deviations greater than the sample mean, the one-sided Chebyshev inequality lowers the bound to “at most 20 percent.” ■
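As a quick numerical illustration of the inequality just proved, the following Python sketch compares the empirical fraction $N(k)/n$ with the one-sided bound $1/(k^2 + 1)$ for a few values of $k$; the sample data and the helper name one_sided_chebyshev_check are invented here purely for illustration.

import statistics

def one_sided_chebyshev_check(data, k):
    """Compare the fraction of data values at least k sample standard
    deviations above the sample mean with the bound 1 / (k**2 + 1)."""
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divisor n - 1)
    # N(k): number of data values x_i with x_i - xbar >= k * s
    n_k = sum(1 for x in data if x - xbar >= k * s)
    return n_k / n, 1 / (k ** 2 + 1)

# Arbitrary data chosen only to illustrate the inequality.
data = [3, 4, 4, 5, 5, 5, 6, 6, 7, 9, 12, 15]
for k in (1, 2, 3):
    frac, bound = one_sided_chebyshev_check(data, k)
    print(f"k = {k}: N(k)/n = {frac:.3f} <= {bound:.3f} = 1/(k^2 + 1)")

Running it shows, for every $k$, an empirical fraction no larger than $1/(k^2 + 1)$, as the inequality guarantees.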