Data Analysis with Microsoft Excel: Updated for Office 2007

(Tuis.) #1
Chapter 4 Describing Your Data 151

Distribution Statistics


You should always create a chart of the distribution when analyzing a data
set, but once you’ve done that, you’ll probably look for statistics that sum-
marize key elements of the distribution. These values are sometimes called
landmark summaries because they are used as landmarks, comparing indi-
vidual values to whole populations, or whole populations to each other.

Percentiles and Quartiles

One of these landmark summaries is the pth percentile, which is a value
such that roughly p% of the data are smaller than that value. You may have
seen percentiles used in growth statistics, where the progress of a newborn
child will place him or her in the 75th percentile or 90th percentile, meaning
that the child’s weight is equal to or above 75 or 90% of the population. In
the Albuquerque data, percentiles could be used as a benchmark to compare
one community of that era to another. If you knew the 10th and 90th percen-
tiles for home price, you would have a basis for comparison between the two
communities.
Perhaps the most important percentiles are the quartiles, which are the
values located at the 25th, 50th, and 75th percentiles (the quarters). These
are commonly referred to as the fi rst, second, and third quartiles. Statisti-
cians are also interested in the interquartile range, which is the difference
between the fi rst and third quartiles. Because the central 50% of the data
lie within the interquartile range, the size of this value gives statisticians an
idea of the width of the distribution.
One way of calculating the percentiles and quartiles of a given distribution
is to create a frequency table like the one shown earlier in Figure 4-2. From the
column of cumulative percents, you can determine which values correspond
to the 10th, 25th, 50th, 75th, and 90th, and so on, percentiles. However, if
your data set is large, this can be a cumbersome and time-consuming process.
To save time, Excel has several functions that will calculate these values for
you. A list of these functions is shown in Table 4-4.

Table 4-4 Excel functions to calculate percentiles and quartiles


Function Description
PERCENTILE(array, k) Returns the kth percentile of an array of values or
range reference, where k is a value between 0 and 1.
PERCENTRANK(array, x,
signifi cance)


Returns the percentile of a value taken from an array
of values or range reference. The number of digits is
determined by the signifi cance parameter.
(continued)
Free download pdf