where n is the total number of observations. With k being the number of the
different values, the following holds:
\[ n = \sum_{i=1}^{k} f_i \]
Empirical Cumulative Frequency Distribution
In addition to the frequency distribution, there is another quantity of
interest for comparing data that is closely related to the absolute or
relative frequency distribution.
Suppose that one is interested in the percentage of stocks in the DJIA
with closing prices of less than US$50 on a specific day. One can sort
the observed closing prices by their numerical values in ascending order
to obtain something like the array shown in Table A.1 for market prices
as of December 15, 2006. Note that since each value occurs only once,
we have to assign each value an absolute frequency of 1 or a relative
frequency of 1/30, respectively, since there are 30 component stocks in
the DJIA.
We start with the lowest entry ($20.77) and advance up to the largest
price still less than $50, which is $49 (Coca-Cola). Each time we
observe a price below $50, we add 1/30, the relative frequency of
each company, to obtain an accumulated frequency of 18/30, which
represents the total share of closing prices below $50. This accumulated
frequency is called the empirical cumulative frequency at the value $50. If
one computes this for all values, one obtains the empirical cumulative
frequency distribution. The word “empirical” is used because we only
consider values that are actually observed. The theoretical equivalent,
the cumulative distribution function, for which all theoretically possible
values are considered, will be introduced in the context of probability
theory in Appendix B.
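The accumulation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is made up for this example, and the prices are hypothetical placeholders rather than the Table A.1 values.

```python
from fractions import Fraction

def cumulative_relative_frequency(prices, threshold):
    """Share of observations strictly below `threshold`."""
    n = len(prices)
    total = Fraction(0)
    for price in sorted(prices):      # ascending order, as in the text
        if price >= threshold:        # stop at the first price not below the threshold
            break
        total += Fraction(1, n)       # each observation contributes 1/n
    return total

# Hypothetical closing prices (illustrative only, not the DJIA data)
sample_prices = [20.77, 35.10, 49.00, 50.00, 62.45]
share_below_50 = cumulative_relative_frequency(sample_prices, 50)
print(share_below_50)  # 3/5
```

Exact fractions are used so that the result reads as a share such as 3/5, mirroring the 18/30 in the text, rather than as a rounded decimal.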
Formally, the empirical cumulative frequency distribution $F_{emp}$ is
defined as
\[ F_{emp}(x) = \sum_{i=1}^{k} a_i \]
where k is the index of the largest value observed that is still less than x. In
our example, k is 18.
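As a rough sketch of this definition, the snippet below groups a sample into its distinct values with absolute frequencies a_i, determines k, and sums a_1 through a_k. The sample data and the function name are assumptions made for illustration; they do not come from Table A.1.

```python
from bisect import bisect_left
from collections import Counter

def empirical_cumulative_frequency(observations, x):
    """Return (k, F_emp(x)): F_emp(x) is the sum of a_1..a_k, where k indexes
    the largest distinct observed value that is still less than x."""
    counts = Counter(observations)          # absolute frequency a_i per distinct value
    values = sorted(counts)                 # distinct values in ascending order
    k = bisect_left(values, x)              # number of distinct values strictly below x
    f_emp = sum(counts[v] for v in values[:k])
    return k, f_emp

# Hypothetical sample with one repeated value (illustrative only)
observations = [20.77, 35.10, 35.10, 49.00, 50.00, 62.45]
k, f_abs = empirical_cumulative_frequency(observations, 50)
f_rel = f_abs / len(observations)           # relative version: divide by n
print(k, f_abs)  # 3 4
```

Because one value occurs twice in this sample, k (3 distinct values below 50) differs from the cumulative frequency (4 observations below 50). In the DJIA example every price occurs once, so the two coincide.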
When we use relative frequencies, we obtain the empirical relative
cumulative frequency distribution defined analogously to the empirical