698 P.J. Martín-Álvarez
observations, which are generally agglomerative (initially starting with as many groups as observations), and non-hierarchical ones, which only indicate whether an observation belongs to one cluster or another. The following considerations must be taken into account to apply this technique: (1) a measure of the similarity among observations (or variables) must be chosen, depending on the type of data analysed (Krzanowski 1988); (2) an algorithm must be chosen to unite the clusters; (3) the number of clusters to be formed must be established in the case of non-hierarchical techniques; and (4) if the variables are different in nature, they must first be standardized.
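Considerations (1) and (4) can be sketched as follows: variables measured on different scales are standardized before a Euclidean distance matrix between observations is computed. The data matrix below is made up purely for illustration.

```python
import numpy as np

# Hypothetical data: 4 observations x 2 variables on very different scales
X = np.array([[1.0, 100.0],
              [2.0, 110.0],
              [8.0, 400.0],
              [9.0, 390.0]])

# (4) standardize each variable to zero mean and unit variance
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# (1) Euclidean distance as the measure of similarity between observations
D = np.sqrt(((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1))
```

Without standardization the second variable, with its larger scale, would dominate the distances.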
After deciding on the number k of clusters (C_i) that we want to form, the non-hierarchical techniques can be used to obtain a division of order k, {C_1, C_2, ..., C_k}, of the set of n observations W = {1, 2, 3, ..., n}, such that W = C_1 ∪ C_2 ∪ ... ∪ C_k and C_i ∩ C_j = ∅ for i ≠ j. Each cluster C_i will comprise n_i observations and will have a centroid (c_i) whose coordinates correspond to the mean values of the p variables in the n_i observations, in other words c_i = (x̄_1i, x̄_2i, ..., x̄_pi). For each cluster we can define its dispersion, given by the sum of the squared distances between the n_i observations and the centroid, in other words E_i = Σ_j d²(w_j, c_i), ∀ w_j ∈ C_i. We can thus define, for a given division {C_1, C_2, ..., C_k}, the total dispersion D_T(C_1, C_2, ..., C_k) = Σ_{i=1}^{k} E_i. The aim of these techniques is to find the division {C_1, C_2, ..., C_k} of order k of W = {1, 2, 3, ..., n} that minimizes this total dispersion D_T(C_1, C_2, ..., C_k).
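The dispersions E_i and the total dispersion D_T can be computed directly from their definitions; a short illustration follows, with hypothetical data and cluster labels (the function name is our own):

```python
import numpy as np

def total_dispersion(X, labels):
    """D_T: sum over clusters of the squared distances to each centroid."""
    dt = 0.0
    for ci in np.unique(labels):
        members = X[labels == ci]
        centroid = members.mean(axis=0)          # c_i: mean of the n_i observations
        dt += ((members - centroid) ** 2).sum()  # E_i for this cluster
    return dt

X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
good = np.array([0, 0, 1, 1])   # clusters match the two tight pairs
bad  = np.array([0, 1, 0, 1])   # clusters mix distant observations
print(total_dispersion(X, good))  # → 1.0
print(total_dispersion(X, bad))   # → 200.0
```

The division matching the two tight pairs yields a much smaller D_T, which is exactly what these techniques try to minimize.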
One of the most frequently used algorithms is MacQueen's k-means algorithm, which consists of (1) assigning the n observations randomly to the k groups, (2) calculating the centroid of each group, (3) assigning each observation to the group with the nearest centroid, and (4) repeating steps (2) and (3) until stability is achieved.
Although stability can be guaranteed after a finite number of steps, this number can be reduced if step (3) is modified so that the centroids are recalculated after each assignment of an observation. As a result of applying this technique, as well as a description of the k clusters, computer programs usually provide the mean values of the variables in each of them, and the comparison of these mean values.
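The four steps of the k-means algorithm can be sketched as follows. This is a simplified NumPy illustration, not a reproduction of any particular program; the rule of reseeding an empty group with a random observation is our own choice.

```python
import numpy as np

def k_means(X, k, seed=0, max_iter=100):
    rng = np.random.default_rng(seed)
    n = len(X)
    # (1) assign the n observations randomly to the k groups
    labels = rng.integers(0, k, size=n)
    for _ in range(max_iter):
        # (2) calculate the centroid of each group
        # (an empty group is reseeded with a random observation)
        centroids = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                              else X[rng.integers(n)] for j in range(k)])
        # (3) assign each observation to the group with the nearest centroid
        new_labels = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1).argmin(1)
        # (4) repeat (2) and (3) until the assignments are stable
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, centroids

X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
labels, centroids = k_means(X, k=2)
```

With this data the two tight pairs of observations end up in the same cluster, the division that minimizes the total dispersion D_T.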
With the application of agglomerative hierarchical techniques, valid for grouping together observations (or variables), the interrelation between observations (or variables) can be established by a two-dimensional graph called a dendrogram. The algorithms to apply these techniques, in the case of grouping together observations, have the following steps in common: (1) they start with as many clusters as observations (C_1 = {1}, C_2 = {2}, ..., C_n = {n}), and the matrix of the distances between them, D = (d_i,j), is calculated; (2) the two clusters C_p and C_q with the smallest distance (d(C_p, C_q) = min_i,j d(C_i, C_j)) are sought; (3) the clusters C_p and C_q are combined to form a new cluster and the new matrix of distances between the clusters is calculated; and (4) steps (2) and (3) are repeated until a single cluster containing all n observations {1, 2, 3, ..., n} is formed. In general, the distance matrix in the first step corresponds to the Euclidean distance. The different ways of defining the distance d(C_i, C_j) between two clusters C_i and C_j, in step (3), give rise to different linkage, or amalgamation, rules: