subject of further research.
Box 14.1 Ordination Techniques, with the Math in Words
Ordinations apply a spatial analogy to data tables, which in community ecology are generally lists of
the species occurring at each of a suite of sampling stations. Those could be the macrobenthos
collected with box corers at locations on a stretch of ocean bottom like Rockall Trough in the
northeast Atlantic (Gage et al. 2000). The analogy is a “hyperspace” with as many axes, S, as species,
and with stations plotted at coordinates set by the abundances of the species represented by the axes:
(n_1, n_2, n_3, ..., n_S). Of course, this hyperspatial analog cannot be visualized. Ordinations “fit”
spaces with fewer dimensions, most commonly two or three, into the original hyperspace and project
the station positions onto them, making similarities of species composition among subsets of the
stations more obvious.
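
The spatial analogy can be made concrete with a small sketch. The station names, species names, and counts below are hypothetical, invented only for illustration; each station is treated as a point whose coordinates are its species abundances, and the straight-line distance between two such points is one simple measure of how different their species lists are.

```python
import math

# Hypothetical counts (not real data): each station is a point in S-space,
# with one coordinate per species (here S = 4).
species = ["sp_a", "sp_b", "sp_c", "sp_d"]
stations = {
    "st1": [12, 0, 3, 7],
    "st2": [10, 1, 4, 6],
    "st3": [0, 15, 9, 0],
}


def distance(p, q):
    """Straight-line (Euclidean) distance between two points in S-space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


d12 = distance(stations["st1"], stations["st2"])
d13 = distance(stations["st1"], stations["st3"])

# Stations with similar species lists lie close together in the hyperspace.
print(f"st1-st2: {d12:.2f}   st1-st3: {d13:.2f}")
```

With four species the points already cannot be drawn directly, which is the motivation for projecting them into two or three dimensions.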
In principal-component analysis (PCA) a line, plane, or space with X < S dimensions is placed
through the original S-space; we will call it a plane. This is the plane with the smallest possible total
of squared direct distances (in S dimensions) from the station points. The points are then projected
onto this plane and plotted as station numbers for visualization. Stations with more similar species
lists fall closer together.
The mathematics are, obviously(?), to write an equation for the sum of squares of the station
distances to the plane, take the derivatives with respect to each constant of the plane, set those all to
zero (optimization), and solve the resulting set of equations for those constants. The S-space coordinates for
each station are fed into that formula to calculate its coordinates on the PC plane. A measure can be
generated of how much of the original scatter of stations in S-space is “removed” by fitting principal
components. The first axis (a line, or one-space) removes the most, with progressively smaller
contributions from the second, third, and higher principal-component axes.
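
A minimal sketch of these computations follows, assuming NumPy is available. The abundance table is hypothetical. Rather than writing out and solving the derivative equations by hand, the sketch uses a singular value decomposition (SVD) of the mean-centered table, a standard matrix route to the same least-squares plane. The names `X`, `scores`, and `explained` are illustrative choices, not taken from the text.

```python
import numpy as np

# Hypothetical abundance table: 5 stations (rows) x 4 species (columns).
X = np.array([
    [12.0, 0.0, 3.0, 7.0],
    [10.0, 1.0, 4.0, 6.0],
    [0.0, 15.0, 9.0, 0.0],
    [1.0, 13.0, 8.0, 1.0],
    [6.0, 7.0, 5.0, 3.0],
])

# Center each species axis on its mean, so the fitted plane passes
# through the centroid of the station cloud.
Xc = X - X.mean(axis=0)

# SVD of the centered table. The rows of Vt are the principal axes,
# ordered from the one accounting for the most scatter to the least.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Fraction of the total squared scatter accounted for by each axis.
explained = s**2 / np.sum(s**2)

# Project the stations onto the first k axes (the "PC plane" for k = 2).
k = 2
scores = Xc @ Vt[:k].T

# What is left over: the squared distances from the stations to the plane.
residual = Xc - scores @ Vt[:k]

print("fraction of scatter per axis:", np.round(explained, 3))
print("station coordinates on the PC plane:")
print(np.round(scores, 2))
```

The rows of `scores` are the coordinates that would be plotted, one point per station, and the summed squares of `residual` are the quantity that the fitted plane minimizes.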
If the raw abundance data are used, the plane will be determined by the very few most abundant
species. The distances along their axes will be greatest, and those will be squared. Generally, that is
not what is desired, so a nearly universal first step is to standardize the data, usually by transforming
abundances of each species at each station to the distance from the species mean in standard deviation
units. Thus, all species have zero mean and similar scales of variation. It is also usual to remove from
the data set those species that are consistently rare, with zeros at many stations. A typical data table
can have a great many zeros, which “truncate” the distribution patterns abruptly and do not work out well
in the computations (the PCs will be “pulled” toward the origin).
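
The two preparatory steps can be sketched as follows, again assuming NumPy and a hypothetical table of counts. The cutoff `min_stations` is an arbitrary illustrative threshold, not a recommended value; in practice the choice depends on the data set.

```python
import numpy as np

# Hypothetical counts: 5 stations (rows) x 4 species (columns).
# The last species occurs at only one station.
X = np.array([
    [12.0, 0.0, 3.0, 0.0],
    [10.0, 1.0, 4.0, 0.0],
    [0.0, 15.0, 9.0, 0.0],
    [1.0, 13.0, 8.0, 1.0],
    [6.0, 7.0, 5.0, 0.0],
])

# Step 1: drop species present at fewer than min_stations stations.
# (min_stations = 2 is an assumption made for this example only.)
min_stations = 2
present = (X > 0).sum(axis=0)      # number of stations where each species occurs
keep = present >= min_stations
Xk = X[:, keep]

# Step 2: standardize each remaining species to zero mean and unit
# standard deviation, so no single abundant species dominates the fit.
Z = (Xk - Xk.mean(axis=0)) / Xk.std(axis=0, ddof=1)

print("species kept:", keep)
print(np.round(Z, 2))
```

The standardized table `Z`, rather than the raw counts, would then be passed to the PCA computation.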
PCA treats the abundances, even after transformation, as linearly related, which is not ecologically
realistic. That may or may not make much difference in revealing groups of related stations by their
proximity in the PCA plot. The method can also be used inversely, finding associated groups of
species in a space defined by the station axes. All of the PCA computations are fundamentally matrix
manipulations, which is how many current algorithms are programmed.
There are many relatives of the PCA approach: factor analysis, principal coordinates, canonical
correlation analysis, correspondence analysis, detrended correspondence analysis, canonical
correspondence analysis, redundancy analysis, and more. Each has uses in specialized situations and
its own drawbacks.
In non-metric multidimensional scaling (nMDS or MDS) the original hyperspace is the same. The
computations are usually based on some index of distances among the stations in that original S-space,
and the indices used generally simplify the data greatly. Sorensen’s distance [1 − Bray–Curtis
similarity] is popular, as is [1 − Jaccard’s index]. The Bray–Curtis index for two stations is simply
the sum, over all species, of the smaller of the two stations’ proportions of each species. If
percentages are used, this is also called a “percent similarity index”. If the species proportions are
identical, Bray–Curtis = 1 (or
100%). Beware when authors say they are using Jaccard’s index, because it has two definitions in the
literature that do not produce the same number. Here is one: let A be the number of species occurring