The Mismeasure of Man by Stephen Jay Gould

THE REAL ERROR OF CYRIL BURT

We can grasp the three-dimensional case, both mentally and
pictorially. But what about 20 dimensions, or 100? If we measured
100 parts of a growing body, our correlation matrix would contain
10,000 items. To plot this information, we would have to work in
a 100-dimensional space, with 100 mutually perpendicular axes
representing the original measures. Although these 100 axes present
no mathematical problem (they form, in technical terms, a
hyperspace), we cannot plot them in our three-dimensional Euclidian
world.
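
As a rough illustration of the count above, the following sketch builds a correlation matrix for 100 measures and tallies its entries. The data are invented for the purpose (a hypothetical shared "size" plus independent noise for each measure), and the names `pearson` and `correlation_matrix` are illustrative helpers, not anything from the text.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(measures):
    """Square matrix of correlations between every pair of measures."""
    k = len(measures)
    return [[pearson(measures[i], measures[j]) for j in range(k)]
            for i in range(k)]


rng = random.Random(0)
n_individuals = 50   # assumed sample size, chosen only for illustration
n_measures = 100     # the 100 parts of a growing body

# Hypothetical data: each measure tracks a shared "size" plus its own noise.
size = [rng.gauss(0, 1) for _ in range(n_individuals)]
measures = [[s + rng.gauss(0, 0.5) for s in size] for _ in range(n_measures)]

R = correlation_matrix(measures)

n_entries = sum(len(row) for row in R)                # 100 * 100
n_distinct_pairs = n_measures * (n_measures - 1) // 2  # pairs above the diagonal

print(n_entries, n_distinct_pairs)
```

The matrix is symmetric and its diagonal is all ones, so of the 10,000 entries only 4,950 are distinct correlations between different measures.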
These 100 measures of a growing body probably do not represent
100 different biological phenomena. Just as most of the information
in our three-dimensional example could be resolved into a
single dimension (the long axis of the football), so might our 100
measures be simplified into fewer dimensions. We will lose some
information in the process to be sure—as we did when we collapsed
the long and skinny football, still a three-dimensional structure,
into the single line representing its long axis. But we may be willing
to accept this loss in exchange for simplification and for the possibility
of interpreting the dimensions that we do retain in biological
terms.
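
The collapse of the football onto its long axis can be sketched numerically. The example below is a minimal illustration under stated assumptions: the cloud of points is simulated (a spread of 5 along an arbitrarily chosen long direction and 1 across it), and the long axis is found by power iteration on the covariance matrix, one standard way of locating the direction of greatest variation. The helper names `covariance_matrix` and `leading_axis` are hypothetical.

```python
import math
import random


def covariance_matrix(points):
    """Sample covariance matrix of a list of d-dimensional points."""
    n = len(points)
    d = len(points[0])
    means = [sum(p[k] for p in points) / n for k in range(d)]
    centered = [[p[k] - means[k] for k in range(d)] for p in points]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def leading_axis(cov, iterations=500):
    """Power iteration: unit vector of greatest variance and that variance."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                   for i in range(d))
    return v, variance


rng = random.Random(1)

# Assumed shape of the football: long along (1, 1, 1), skinny across it.
long_dir = [1.0 / math.sqrt(3)] * 3
points = []
for _ in range(2000):
    t = rng.gauss(0, 5.0)  # position along the long axis
    points.append([t * long_dir[k] + rng.gauss(0, 1.0) for k in range(3)])

cov = covariance_matrix(points)
axis, var_along = leading_axis(cov)
total_var = sum(cov[k][k] for k in range(3))

# Share of all variation that survives on the single long axis.
share = var_along / total_var
alignment = abs(sum(a * b for a, b in zip(axis, long_dir)))

print(round(share, 3), round(alignment, 3))
```

With these assumed spreads the single axis should retain roughly nine tenths of the total variation; the remainder is the information lost by the simplification.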

Factor analysis and its goals


With this example, we come to the heart of what factor analysis
attempts to do. Factor analysis is a mathematical technique for
reducing a complex system of correlations into fewer dimensions.
It works, literally, by factoring a matrix, usually a matrix of correlation
coefficients. (Remember the high-school algebra exercise
called "factoring," where you simplified horrendous expressions by
removing common multipliers of all terms?) Geometrically, the
process of factoring amounts to placing axes through a football of
points. In the 100-dimensional case, we are not likely to recover
enough information on a single line down the hyperfootball's long
axis, a line called the first principal component. We will need additional
axes. By convention, we represent the second dimension by
a line perpendicular to the first principal component. This second
axis, or second principal component, is defined as the line that resolves
more of the remaining variation than any other line that could be
drawn perpendicular to the first principal component. If, for
example, the hyperfootball were squashed flat like a flounder, the