characteristics but differ perhaps on one or two,
or in the degree of intensity of characteristics.
In the terminology of the SOM, the grid
preserves the ‘topological structure’ of the data
or alternatively may help us uncover such
structure. The Concise Oxford Dictionary (1999)
defines topology as ‘the study of geometrical
properties and spatial relations left unchanged
by the continuous change of shape or size of the
figure’. The topological structure emerges as a
‘feature map’ in which the prototypes are
related and subject to potential overlaps. Topology
preservation implies that input vectors
close together in input space map to close
nodes in the grid. Thus, not only are the
prototypes intended to reflect ‘typical’ values of
the inputs in their respective neighbourhoods,
but their grid positions reflect the relative
positioning and density of the original data. No
such ordering exists for the clusters which
emerge from CA.
We have noted how, once ‘trained’, the
network classifies a data point by identifying the
nearest Kohonen node. As regards the training
process, a similar principle is adopted. As is
common in NN operation, data points are
presented randomly to the network, and at each
stage the nearest Kohonen node is identified.
This is referred to as the ‘winning’ node and the
learning mechanism itself as ‘competitive learning’
or ‘winner-takes-all learning’. The weights of the
winning node are adjusted so as to move towards
the current data point; in this way the training
process allows the weights of each node to come
to reflect or describe the data. The
topological structure of the data is preserved
because not only is the winning node updated,
but also its neighbouring nodes. The shape of
the neighbourhood may take various forms,
such as square or diamond. It is also possible to
model the proximity of nodes by a Gaussian
decay function.
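
To make the neighbourhood idea concrete, the short
Python sketch below (our own illustration, not drawn
from any particular SOM package) shows one common
way of implementing a Gaussian decay over a
rectangular Kohonen grid: the winning node receives
an update weight of one, and nodes further away on
the grid receive progressively smaller weights.

import numpy as np

def gaussian_neighbourhood(grid_rows, grid_cols, winner, sigma):
    # Update weight for every node in a grid_rows x grid_cols Kohonen grid,
    # decaying with squared grid distance from the winning node 'winner'.
    rows, cols = np.indices((grid_rows, grid_cols))
    d2 = (rows - winner[0]) ** 2 + (cols - winner[1]) ** 2
    # Gaussian decay: the winner gets weight 1, neighbours progressively less
    return np.exp(-d2 / (2.0 * sigma ** 2))

# Example: a 5 x 5 grid with the winning node at grid position (2, 3)
h = gaussian_neighbourhood(5, 5, winner=(2, 3), sigma=1.0)
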
More formally, we denote the input data
by an m × n matrix X, each row of which
contains a data point comprising observed
values of the n inputs. Each node k in the SOM
grid is characterized by a 1 × n vector W^{(k)} of
weights. The Euclidean distance between the
kth node and the jth input vector is then given
by:

D = \sqrt{\sum_i \left( W_i^{(k)} - X_{ji} \right)^2}

where the observed values of the attributes of
each data vector are indexed by i.
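
In code, identifying the nearest node amounts to
computing this distance for every row of a weight
array. The sketch below is a minimal illustration,
assuming the N node weight vectors are stacked
row-wise in an N × n NumPy array W (the names are
ours, not the authors'):

import numpy as np

def winning_node(W, x):
    # Euclidean distance from every node's weight vector (a row of W)
    # to the input vector x; the node with the smallest distance wins.
    d = np.sqrt(((W - x) ** 2).sum(axis=1))
    return int(np.argmin(d))
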
During training, the winning node is that
with the smallest distance from the current data
vector. The distance is in fact modified to allow
for the frequency with which nodes have
previously been ‘winners’, a so-called ‘conscience
mechanism’ through which an additional element
of equality is introduced. The adjusted distance
is given by:

D^{*} = D + \gamma \left( N F_k - 1 \right)

where N is the number of Kohonen nodes, F_k is
the relative frequency with which the kth of
these nodes has been the winning node, and γ
is a constant between zero and unity. For nodes
whose frequency is the average for all nodes,
i.e. 1/N, the adjustment is zero. Nodes winning
with higher or lower frequencies than average
have their distances adjusted upwards or
downwards respectively, so that over-used nodes
become less likely to win again and under-used
nodes more likely. The frequency values are
running estimates updated at each iteration.
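
A minimal sketch of this conscience adjustment is
given below, assuming the relative win frequencies
are maintained as a running estimate; the values of
gamma and beta are illustrative, not prescribed by
the text:

import numpy as np

def conscience_winner(W, x, freq, gamma=0.3):
    # Winner selection with frequency-adjusted distances: nodes that have
    # won more often than average (freq > 1/N) have their distance increased,
    # while under-used nodes have theirs reduced.
    N = W.shape[0]
    d = np.sqrt(((W - x) ** 2).sum(axis=1))
    d_adj = d + gamma * (N * freq - 1.0)
    return int(np.argmin(d_adj))

def update_frequencies(freq, winner, beta=0.01):
    # Running estimate of each node's relative win frequency.
    target = np.zeros_like(freq)
    target[winner] = 1.0
    return freq + beta * (target - freq)
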
The weight adjustment process involves
finding the node nearest to the data vector in
terms of adjusted distance D*, and this node, p
say, has its weights updated. The actual adjustment
used is such that the change in each weight of
the prototype is proportional to the difference
between the corresponding element of the current
input vector and the current weight. The constant
of proportionality, α, is referred to as the
learning constant. Hence we have:

W_i^{(p)*} = W_i^{(p)} + \alpha \left( X_{ji} - W_i^{(p)} \right)

where W_i^{(p)*} and W_i^{(p)} respectively denote the new
and old weight values.
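
Combining the pieces, one full training iteration
might be sketched as follows, reusing the
conscience_winner and update_frequencies helpers
from the earlier sketch. The learning constant alpha
and the neighbourhood width sigma are illustrative,
and the way the Gaussian neighbourhood factor scales
the update of non-winning nodes is one common choice
rather than a scheme prescribed by the text:

import numpy as np

def training_step(W, grid_xy, x, freq, alpha=0.1, sigma=1.0, gamma=0.3):
    # One SOM iteration: pick the conscience-adjusted winner, update the
    # win-frequency estimates, then move the winner and (more weakly) its
    # grid neighbours towards the current input vector x.
    p = conscience_winner(W, x, freq, gamma)
    freq = update_frequencies(freq, p)
    # Neighbourhood strength of every node, from its grid distance to the
    # winner; grid_xy holds the (row, column) grid position of each node.
    d2 = ((grid_xy - grid_xy[p]) ** 2).sum(axis=1)
    h = np.exp(-d2 / (2.0 * sigma ** 2))
    # Each weight moves towards the input in proportion to alpha and h.
    W = W + alpha * h[:, None] * (x - W)
    return W, freq
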

An interesting presentation of this learning rule
is given by Kohonen (1995) and Ritter et al.