Also define a cumulative frequency function $F(x)$ for the sample, sometimes referred to as a sample distribution function. The cumulative frequency function is defined by

$$F(x) = \sum_{t \le x} f(t) = \text{sum of the relative frequencies of all values less than or equal to } x$$
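The definition can be illustrated with a minimal Python sketch. The function names `relative_frequencies` and `cumulative_frequency` and the sample values are hypothetical, chosen only for illustration; exact fractions are used so that the sums are free of rounding error.

```python
from collections import Counter
from fractions import Fraction


def relative_frequencies(sample):
    """Map each distinct value x to its relative frequency f(x) = count / N."""
    n = len(sample)
    counts = Counter(sample)
    return {value: Fraction(count, n) for value, count in sorted(counts.items())}


def cumulative_frequency(sample, x):
    """F(x): sum of the relative frequencies f(t) over all values t <= x."""
    f = relative_frequencies(sample)
    return sum((p for t, p in f.items() if t <= x), Fraction(0))


# Hypothetical sample with N = 8 data points.
sample = [2, 3, 3, 5, 5, 5, 7, 8]

print(relative_frequencies(sample))
print(cumulative_frequency(sample, 5))  # 3/4, since six of the eight values are <= 5
```

Note that $F(x)$ is zero below the smallest data value and equals one at or above the largest.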

When the data take on too many distinct numerical values, one usually defines class intervals and class midpoints with class frequencies, as in table 11.3. This is called grouping of the data, and the corresponding frequency function and cumulative frequency function are then associated with the grouped data.
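As a rough sketch of grouping, the following Python fragment sorts a sample into equal-width classes and reports each class interval, class midpoint and class frequency. The function name `group_data`, the half-open interval convention $[a, b)$, the class width and the data are all assumptions made for illustration; they are not taken from table 11.3.

```python
def group_data(sample, low, width, num_classes):
    """Return (interval, midpoint, frequency) rows for equal-width classes.

    Class i is assumed to cover the half-open interval
    [low + i*width, low + (i+1)*width). Values outside all classes are ignored.
    """
    counts = [0] * num_classes
    for value in sample:
        index = int((value - low) // width)
        if 0 <= index < num_classes:
            counts[index] += 1

    rows = []
    for i, count in enumerate(counts):
        left = low + i * width
        right = left + width
        rows.append(((left, right), (left + right) / 2, count))
    return rows


# Hypothetical data set of 12 values grouped into 5 classes of width 10.
sample = [51, 55, 62, 64, 68, 71, 73, 77, 79, 84, 88, 95]
for interval, midpoint, frequency in group_data(sample, 50, 10, 5):
    print(interval, midpoint, frequency)
```

Dividing each class frequency by the sample size gives the relative frequency function of the grouped data.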

The relative frequency distribution $f(x)$ is also called a discrete probability distribution for the sample, and the cumulative relative frequency function $F(x)$, or distribution function, represents a probability. In particular,

$$\begin{aligned}
F(x) &= P(X \le x) = \text{probability that the population variable } X \text{ is less than or equal to } x \\
1 - F(x) &= P(X > x) = \text{probability that the population variable } X \text{ is greater than } x
\end{aligned} \tag{11.2}$$
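As a small worked illustration, take a hypothetical sample of $N = 8$ values, $2, 3, 3, 5, 5, 5, 7, 8$. Six of the eight values are less than or equal to 5, so the sample gives the estimates

$$F(5) = P(X \le 5) = \tfrac{6}{8} = 0.75, \qquad 1 - F(5) = P(X > 5) = \tfrac{2}{8} = 0.25,$$

where the second value corresponds to the two data points 7 and 8 that exceed 5.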

Arithmetic Mean or Sample Mean

Given a set of data points $X_1, X_2, \ldots, X_N$, define the arithmetic mean or sample mean of the data set by

$$\text{sample mean} = \bar{X} = \frac{X_1 + X_2 + \cdots + X_N}{N} = \frac{\sum_{j=1}^{N} X_j}{N} \tag{11.3}$$
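A minimal Python sketch of this definition follows. The function name `sample_mean` and the data values are hypothetical, used only to show the arithmetic.

```python
def sample_mean(data):
    """Arithmetic mean: the sum of the data points divided by their number N."""
    if not data:
        raise ValueError("sample mean is undefined for an empty data set")
    return sum(data) / len(data)


# Hypothetical data set with N = 6 points; the sum is 108.
data = [4, 8, 15, 16, 23, 42]
print(sample_mean(data))  # 18.0
```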

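If the frequencies of the data points are known, say $X_1, X_2, \ldots, X_k$ occur with frequencies $\tilde{f}_1, \tilde{f}_2, \ldots, \tilde{f}_k$, then the arithmetic mean is calculated as

$$\bar{X} = \frac{\tilde{f}_1 X_1 + \tilde{f}_2 X_2 + \cdots + \tilde{f}_k X_k}{\tilde{f}_1 + \tilde{f}_2 + \cdots + \tilde{f}_k} = \frac{\sum_{j=1}^{k} \tilde{f}_j X_j}{\sum_{j=1}^{k} \tilde{f}_j} = \frac{\sum_{j=1}^{k} \tilde{f}_j X_j}{N} \tag{11.4}$$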

Note that the finite data collected is used to calculate an estimate of the true population mean μ associated with the total population.
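The frequency form of the mean can be checked numerically with a short Python sketch. The function name `frequency_weighted_mean` and the values and frequencies below are hypothetical; the last line confirms that weighting by frequency agrees with averaging the fully listed data.

```python
def frequency_weighted_mean(values, frequencies):
    """Mean of distinct values X_j that occur with frequencies f_j.

    Computes sum(f_j * X_j) / sum(f_j), where sum(f_j) is the sample size N.
    """
    if len(values) != len(frequencies):
        raise ValueError("values and frequencies must have the same length")
    total = sum(frequencies)
    if total == 0:
        raise ValueError("the frequencies must sum to a positive number")
    return sum(f * x for x, f in zip(values, frequencies)) / total


# Hypothetical distinct values and their frequencies (N = 8).
values = [2, 3, 5, 7, 8]
frequencies = [1, 2, 3, 1, 1]
print(frequency_weighted_mean(values, frequencies))  # 4.75

# The same data written out point by point gives the same mean.
expanded = [2, 3, 3, 5, 5, 5, 7, 8]
print(sum(expanded) / len(expanded))  # 4.75
```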

Median, Mode and Percentiles

After arranging the data from low to high, the median of the data set is the middle value or, when the number of data points is even, the arithmetic mean of the two middle values. This value divides the data set into two parts containing equal numbers of data points. In a similar fashion, find those points which divide the data set, arranged in order of magnitude, into four equal parts. These values are usually denoted $Q_1, Q_2, Q_3$ and are called the first, second and third