Data Analysis with Microsoft Excel: Updated for Office 2007

Chapter 5 Probability Distributions

2.812, this translates into an expected batting average of 2.812 × 0.022529 +
0.275528 = 0.3388, a value slightly higher than the observed maximum
batting average of 0.333. On the other end of the scale, the lowest normal
score is -2.812, which corresponds to a batting average of 0.2122, greater than the
observed value of 0.19. So although the batting averages appear to generally
follow the normal distribution, the values at either end of the sample are
less than would be expected from normal data.
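
The arithmetic above can be checked with a short sketch. It assumes, as the calculation in the text implies, that 0.275528 and 0.022529 are the sample mean and standard deviation of the batting averages; the function and variable names are illustrative and not part of the workbook.

```python
# Values quoted in the text. Treating them as the sample mean and standard
# deviation of the batting averages is an assumption based on the formula used.
mean_avg = 0.275528
sd_avg = 0.022529
max_score = 2.812  # largest normal score; the smallest is its negative


def expected_value(normal_score, mean, sd):
    """Value a normal distribution with this mean and sd places at a given normal score."""
    return normal_score * sd + mean


expected_high = expected_value(max_score, mean_avg, sd_avg)
expected_low = expected_value(-max_score, mean_avg, sd_avg)

print(f"expected highest batting average: {expected_high:.4f}")
print(f"expected lowest batting average:  {expected_low:.4f}")
```

Both expected values lie above the corresponding observed values (0.333 and 0.19), which is the comparison the paragraph draws.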
One of the advantages of the normal probability plot is that if your data
are skewed in either the positive or the negative direction, this will be
clearly displayed in the plot. Positively skewed data fall below the straight
line on both ends of the plot, whereas negatively skewed data rise above
the straight line at both ends of the plot. Figure 5-18 shows a histogram and
normal probability plot of the salaries of the baseball players in the workbook.
The data are clearly not normal, as the distribution is heavily weighted
toward lower salaries. The salaries are below the line at both ends because
of positive skewness.
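
As a rough illustration of how the points of a normal probability plot are formed, the sketch below pairs sorted data values with approximate normal scores. Two things here are assumptions rather than the book's method: the scores use Blom's plotting positions, (i - 0.375) / (n + 0.25), which may differ from the formula Excel or the workbook add-in uses, and the salary figures are made-up, positively skewed numbers, not the players' actual salaries.

```python
from statistics import NormalDist


def normal_scores(n):
    """Approximate normal scores for a sample of size n.

    Uses Blom's plotting positions (i - 0.375) / (n + 0.25); this choice is
    an assumption and other conventions give slightly different scores.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


# Hypothetical salaries (in thousands of dollars), invented for illustration:
# most values are low, with a few very large ones (positive skew).
salaries = [210, 225, 240, 260, 300, 350, 480, 900, 2100, 5200]

# Each plotted point pairs a sorted data value with its normal score.
scores = normal_scores(len(salaries))
points = list(zip(sorted(salaries), scores))

for value, score in points:
    print(f"{value:>6}  {score:+.3f}")
```

Plotting these pairs would give the kind of curved pattern described above, because the data values bunch together at the low end and spread out at the high end while the normal scores stay symmetric about zero.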

Parameters and Estimators


When we investigated the properties of a probability density function, the
parameter values of the function were known. Most of the time, however, we
don’t know the values of these parameters, so we have to estimate them from
the data using statistics. For example, the normal distribution has two
parameters, μ and σ. We can estimate the value of μ by calculating
the sample average x̄ and the value of σ by calculating the sample standard
deviation s (see Chapter 4 for a description of these statistics).
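
A minimal sketch of this estimation step follows. It simulates a normal sample so the estimates can be compared with known parameters; the values μ = 10 and σ = 2, the sample size, and the random seed are all assumptions chosen for illustration.

```python
import random
from statistics import mean, stdev

# Assumed "true" parameters, chosen only for illustration. With real data
# these are unknown, which is why they have to be estimated.
TRUE_MU = 10.0
TRUE_SIGMA = 2.0

rng = random.Random(42)  # fixed seed so the run is repeatable
sample = [rng.gauss(TRUE_MU, TRUE_SIGMA) for _ in range(1000)]

x_bar = mean(sample)  # sample average, the estimate of mu
s = stdev(sample)     # sample standard deviation (n - 1 denominator), the estimate of sigma

print(f"estimate of mu:    {x_bar:.3f}  (true value {TRUE_MU})")
print(f"estimate of sigma: {s:.3f}  (true value {TRUE_SIGMA})")
```

In Excel, the corresponding calculations would typically use the AVERAGE and STDEV worksheet functions on the data range.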
The values x̄ and s have a special and important property: they are not
only estimators of μ and σ but also consistent estimators, which means
that as the size of the random sample increases, the values of x̄ and s come
closer and closer to the true parameter values. With a large enough sample

Figure 5-18: Distribution of the baseball player salary data (panels: histogram, normal probability plot)