The Mismeasure of Man by Stephen Jay Gould

THE REAL ERROR OF CYRIL BURT


radiating from a common point. If two measures are highly corre-
lated, their vectors lie close to each other. The cosine of the angle
between any two vectors records the correlation coefficient
between them. If two vectors overlap, their correlation is perfect,
or 1.0; the cosine of 0° is 1.0. If two vectors lie at right angles, they
are completely independent, with a correlation of zero; the cosine
of 90° is zero. If two vectors point in opposite directions, their
correlation is perfectly negative, or -1.0; the cosine of 180° is -1.0. A
matrix of high positive correlation coefficients will be represented
by a cluster of vectors, each separated from each other vector by a
small acute angle (Fig. 6.4). When we factor such a cluster into
fewer dimensions by computing principal components, we choose
as our first component the axis of maximal resolving power, a kind
of grand average among all vectors. We assess resolving power by
projecting each vector onto the axis. This is done by drawing a line
from the tip of the vector to the axis, perpendicular to the axis.
The ratio of projected length on the axis to the actual length of the
vector itself measures the percentage of a vector's information
resolved by the axis. (This is difficult to express verbally, but I think
that Figure 6.5 will dispel confusion.) If a vector lies near the axis,
it is highly resolved and the axis encompasses most of its
information. As a vector moves away from the axis toward a maximal
separation of 90°, the axis resolves less and less of it.
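
The geometry described above can be checked numerically. The sketch below is only an illustration, not anything taken from the text: the function names (pearson, angle_degrees, resolved_fraction) and the sample measures are hypothetical. It assumes each measure is first centered on its mean, so that the cosine of the angle between two centered vectors is their correlation coefficient, and it computes the ratio of projected length to actual length exactly as the paragraph describes it.

```python
import math

def center(xs):
    """Subtract the mean so the measure becomes a vector from the common origin."""
    m = sum(xs) / len(xs)
    return [x - m for x in xs]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    """Cosine of the angle between two vectors."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

def pearson(xs, ys):
    """Correlation coefficient, computed as the cosine between centered vectors."""
    return cosine(center(xs), center(ys))

def angle_degrees(r):
    """Angle whose cosine is r, clamped to guard against rounding error."""
    return math.degrees(math.acos(max(-1.0, min(1.0, r))))

def resolved_fraction(vec, axis):
    """Projected length of vec on axis, divided by the actual length of vec."""
    projected = abs(dot(vec, axis)) / math.sqrt(dot(axis, axis))
    return projected / math.sqrt(dot(vec, vec))

# Hypothetical measures, chosen only to hit the three cases in the text.
a = [1, 2, 3, 4]
same_direction = [2, 4, 6, 8]    # overlaps a: correlation 1.0, angle 0 degrees
independent = [1, -1, -1, 1]     # right angle to a: correlation 0
opposite = [4, 3, 2, 1]          # opposite to a: correlation -1.0, angle 180 degrees

for other in (same_direction, independent, opposite):
    r = pearson(a, other)
    print(round(r, 6), round(angle_degrees(r), 3))
```

The projection ratio is itself the cosine of the angle between vector and axis, so a vector lying 60° from an axis has half of its length projected onto that axis, and a vector at 90° has none.
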
We position the first principal component (or axis) so that it
resolves more information among all the vectors than any other
axis could. For our matrix of high positive correlation coefficients,
represented by a set of tightly clustered vectors, the first principal
component runs through the middle of the set (Fig. 6.4). The
second principal component lies at right angles to the first and
resolves a maximal amount of remaining information. But if the
first component has already resolved most of the information in all
the vectors, then the second and subsequent principal axes can only
deal with the small amount of information that remains (Fig. 6.4).
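
A small numerical sketch may make this concrete. It is an illustration under stated assumptions, not the procedure used in the text: the 3 x 3 correlation matrix is invented, the helper names are hypothetical, and power iteration is simply one convenient way to find the dominant axis of a small matrix. "Information" is treated here, as in the technical sense, as the share of total variance carried by an axis.

```python
import math

def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def first_component(corr, iterations=200):
    """Power iteration: returns (unit axis, variance resolved by that axis)."""
    # Arbitrary, slightly uneven starting direction (an assumption of this sketch).
    v = normalize([1.0 + 0.1 * i for i in range(len(corr))])
    for _ in range(iterations):
        v = normalize(mat_vec(corr, v))
    resolved = sum(a * b for a, b in zip(v, mat_vec(corr, v)))
    return v, resolved

# Hypothetical matrix of high positive correlations among three measures.
corr = [
    [1.0, 0.8, 0.8],
    [0.8, 1.0, 0.8],
    [0.8, 0.8, 1.0],
]

axis, resolved = first_component(corr)

# The total variance of standardized measures equals the number of measures.
share = resolved / len(corr)
remaining = 1.0 - share

print([round(x, 4) for x in axis])   # equal weights: the axis runs through the middle
print(round(share, 4))               # about 0.87 of the total
print(round(remaining, 4))           # about 0.13 left for all later axes
```

With every pairwise correlation set to 0.8, the first axis weights the three measures equally and resolves 2.6 of the 3 units of variance, leaving 0.4 to be divided among the second and third axes.
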

factor axes do. Technically, factor axes resolve variance in original measures. I will,
as is often done, speak of them as "explaining" or "resolving" information—as they
do in the vernacular (though not in the technical) sense of information. That is,
when the vector of an original variable projects strongly on a set of factor axes, little
of its variance lies unresolved in higher dimensions outside the system of factor
axes.