Data Analysis with Microsoft Excel: Updated for Office 2007

Chapter 4 Describing Your Data 141

The histogram gives us the strong visual picture that most of the home
prices in this 1993 sample were at or below $130,000 and that most were in the $70,000–
$100,000 range. There does not seem to be any clustering of values beyond
$130,000; rather, the data values are clustered toward the lower end of the
price scale.

STATPLUS TIPS

You can also create separate histograms for the different levels
of a categorical variable (or for different variables) by using the
StatPlus > Multivariable Charts > Multiple Histograms command.
The Histogram command includes a Chart Titles button located
on the Chart Options dialog sheet. By clicking this button, you
can enter titles for the chart, x axis, and y axis. You can also control
some aspects of the appearance of the x axis and y axis.
The Left option button for the bin intervals in the Histogram command
is equivalent to counting observations that are ≥ bin value
and < next bin value. The Center option button counts observations
that are centered around the bin value (counting from the lower
mid-point). The Right option button counts observations that
are > bin value and ≤ next bin value.
You can add a table to the output of the Histogram command by
clicking the Table checkbox in the dialog box. This table contains
count values, similar to what you would see in the corresponding
frequency table.
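
The difference between the Left and Right bin options described in the tips above can be sketched in a few lines of Python. This is only an illustration of the counting rules, not StatPlus code: the function name count_bins, the bin edges, and the price values (in thousands of dollars) are all made up for the example, and the Center option is not covered.

```python
from bisect import bisect_left, bisect_right


def count_bins(values, edges, closed="left"):
    """Count how many values fall in each bin defined by sorted edges.

    closed="left":  bin i holds values with edges[i] <= v <  edges[i + 1]
    closed="right": bin i holds values with edges[i] <  v <= edges[i + 1]
    Values outside every bin are ignored.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        if closed == "left":
            i = bisect_right(edges, v) - 1  # last edge that is <= v
        else:
            i = bisect_left(edges, v) - 1   # last edge that is < v
        if 0 <= i < len(counts):
            counts[i] += 1
    return counts


# Hypothetical home prices in thousands of dollars (not the book's data).
prices = [70, 85, 100, 100, 115, 130]
edges = [70, 100, 130]

left_counts = count_bins(prices, edges, closed="left")
right_counts = count_bins(prices, edges, closed="right")
print(left_counts, right_counts)
```

Values that fall exactly on a bin boundary (here 70, 100, and 130) are the only ones that move between bins, or drop out of the first or last bin, when the option changes.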

Shapes of Distributions

The visual picture presented by the histogram is often referred to as the distri-
bution’s shape. Statisticians classify various distributions on the basis of their
shape. These classifications will become important later on as we look for an
appropriate statistic to summarize the distribution and its values. Some statis-
tics are appropriate for one distribution shape but not for another.
A distribution is skewed if most of the values are clustered toward either
the left or the right edge of the histogram. If the values are clustered to-
ward the left edge of the histogram, this shows positive skewness; clustering
toward the right edge of the histogram shows negative skewness. Skewed
distributions often occur where the variable is constrained to have positive
values. In those cases, values may cluster near zero, but because the vari-
able cannot have a negative value, the distribution is positively skewed.
A distribution is symmetric if the values are clustered in the middle with no
skewness toward either the positive or the negative side. See Figure 4-9 for
examples of these three types of shapes.
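
The sign convention for skewness described above can be checked numerically. The sketch below uses the moment coefficient of skewness (third central moment divided by the 1.5 power of the second central moment), which is one common definition and is an assumption here, since this section does not give a formula. Spreadsheet functions may use a sample-adjusted variant that gives slightly different numbers with the same sign. The data lists are invented for the example.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5.

    m2 and m3 are the second and third central moments, each divided by n.
    Raises ValueError when every value is identical (zero spread).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


# Made-up samples illustrating the three shapes.
clustered_left = [1, 1, 2, 2, 3, 10]           # long tail to the right
clustered_right = [-v for v in clustered_left]  # mirror image
symmetric = [1, 2, 3, 4, 5]

print(skewness(clustered_left))   # positive
print(skewness(clustered_right))  # negative
print(skewness(symmetric))        # zero
```

A sample clustered toward the left with a long right tail gives a positive value, its mirror image gives a negative value, and the symmetric sample gives zero, matching the descriptions in the text.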



