all possible combinations of first and second components. The task is to set
up the so-called joint frequency distribution. The absolute joint frequency
of a pair of values (v, w), where v is a value of the first component x and w
is a value of the second component y, is the number of times that pair occurs
in the data. The relative joint frequency distribution is obtained by dividing
each absolute joint frequency by the total number of observations.
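As a minimal sketch of this counting step, the following Python snippet tallies absolute joint frequencies and converts them to relative ones. The sample observations and the names joint_frequencies, absolute, and relative are illustrative assumptions and do not come from the text.

```python
from collections import Counter

def joint_frequencies(pairs):
    """Return (absolute, relative) joint frequencies of observed (v, w) pairs."""
    pairs = list(pairs)
    n = len(pairs)                  # total number of observations
    absolute = Counter(pairs)       # number of occurrences of each pair (v, w)
    # Divide each absolute joint frequency by the number of observations.
    relative = {pair: count / n for pair, count in absolute.items()}
    return absolute, relative

# Hypothetical bivariate sample: (first component, second component)
observations = [("A", 1), ("A", 2), ("B", 1), ("A", 1), ("B", 2), ("A", 1)]
absolute, relative = joint_frequencies(observations)
```

By construction, the relative joint frequencies add up to one over all observed pairs.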
While joint frequency distributions exist for all data levels, one
distinguishes between qualitative data, on the one hand, and rank and
quantitative data, on the other hand, when referring to the table displaying
the joint frequency distribution. For qualitative (nominal scale) data, the
corresponding table is called a contingency table, whereas the table for rank
(ordinal) scale and quantitative data is called a correlation table.
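To illustrate how such a table might be laid out, the sketch below arranges absolute joint frequencies with the values of the first component as rows and the values of the second component as columns. The function name joint_table and the sample data are hypothetical. Because the sample values are nominal, the result would be called a contingency table.

```python
from collections import Counter

def joint_table(pairs):
    """Arrange absolute joint frequencies as rows (first component) by columns (second)."""
    counts = Counter(pairs)
    row_values = sorted({v for v, _ in counts})   # distinct values of the first component
    col_values = sorted({w for _, w in counts})   # distinct values of the second component
    # A Counter returns 0 for pairs that never occur in the data.
    table = [[counts[(v, w)] for w in col_values] for v in row_values]
    return row_values, col_values, table

# Hypothetical nominal-scale observations
observations = [("bond", "US"), ("stock", "EU"), ("stock", "US"), ("bond", "US")]
rows, cols, table = joint_table(observations)
```

The same layout applies to ordinal and quantitative data, in which case the table would be called a correlation table.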
Marginal Distributions
When observing bivariate data, one might be interested in only one particular
component. In this case, the joint frequencies in the contingency or
correlation table can be aggregated to produce the univariate distribution of
the one variable of interest. In other words, the joint frequencies are
projected onto the frequency dimension of that particular component. The
distribution so obtained is called the marginal distribution. The marginal
distribution treats the data as if only the one component were observed; the
detailed joint distribution in connection with the other component is of no
interest.
The frequency of a given value of the component of interest is measured by
the marginal frequency. For example, to obtain the marginal frequency of the
first component, whose values v are represented by the rows of the contingency
or correlation table, we add up all joint frequencies in a particular row, say
row i. The row sum is the marginal frequency of the value v_i of this
component. That is, for each value v_i, we sum the joint frequencies over all
pairs (v_i, w_j) with v_i held fixed.
To obtain the marginal frequency of the second component, whose values w are
represented by the columns, for each value w_j we add up the joint frequencies
in column j to obtain the column sum. This time we sum over all pairs
(v_i, w_j) with w_j held fixed.
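The row and column sums described above can be sketched in Python as follows. The function name marginal_frequencies and the sample observations are assumptions made for illustration only.

```python
from collections import Counter

def marginal_frequencies(pairs):
    """Return the marginal frequencies of the first and second components."""
    joint = Counter(pairs)              # absolute joint frequencies of (v, w)
    first, second = Counter(), Counter()
    for (v, w), count in joint.items():
        first[v] += count               # row sum: v held fixed, summed over all w
        second[w] += count              # column sum: w held fixed, summed over all v
    return first, second

# Hypothetical bivariate sample: (first component, second component)
observations = [("A", 1), ("A", 2), ("B", 1), ("A", 1), ("B", 2), ("A", 1)]
first, second = marginal_frequencies(observations)
```

Each set of marginal frequencies sums to the total number of observations, since every pair contributes to exactly one row and one column.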
Graphical Representation
A common graphical tool used with bivariate data arrays is the so-called
scatter diagram or scatter plot. In this diagram, each observed pair is
displayed as a point. The values of the first component are usually displayed
along the horizontal axis and the values of the second component along the
vertical axis. The scatter plot is helpful in visualizing