13 Statistical Techniques for the Interpretation of Analytical Data 701
are Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA),
SIMCA (Soft Independent Modelling of Class Analogy) and kNN ($k$-Nearest Neighbour)
methods. Neural networks are another, more recent, method for classifying samples
into one of the $k$ known groups. In addition, Canonical Variate Analysis (CVA) can
be used to achieve maximum graphical differentiation of the $k$ groups, and
Multivariate Analysis of Variance (MANOVA) to test for differences between the
groups. Descriptive statistics of the variables within each group allow the groups
to be characterised.
13.3.3.1 Discriminant Analysis
This supervised classification method, which is the most widely used, assumes a
multivariate normal distribution for the variables in each population,
$(X_1, \ldots, X_p) \sim N(\vec{\mu}_i, \Sigma_i)$, and calculates the classification
functions by minimising the probability of misclassifying the observations of the
training group (Bayesian-type rule). If multivariate normality and equality of the
$k$ covariance matrices are assumed, $(X_1, \ldots, X_p) \sim N(\vec{\mu}_i, \Sigma)$,
Linear Discriminant Analysis (LDA) calculates $k$ linear classification functions,
one for each group,
$$\left\{ d_i = c_i + \sum_{j=1}^{p} a_{i,j} X_j \right\}_{i=1,\ldots,k},$$
that allow samples of the training group to be classified according to the
assignation rule: the sample is assigned to the group with the highest score, that is,
$(x_1, x_2, \ldots, x_p) \in W_i$ if
$d_i(x_1, x_2, \ldots, x_p) = \max_{j=1,\ldots,k} \{ d_j(x_1, x_2, \ldots, x_p) \}$.
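The classification functions and the assignation rule can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the text does not give explicit expressions for $c_i$ and $a_{i,j}$, so the sketch uses the standard equal-covariance result $a_i = S^{-1}\bar{x}_i$ and $c_i = \ln \pi_i - \tfrac{1}{2}\bar{x}_i^{T} S^{-1} \bar{x}_i$, with $S$ the pooled within-group covariance matrix and the priors $\pi_i$ taken as the group proportions in the training set. The function names and the simulated data are hypothetical.

```python
import numpy as np

def fit_lda(X, y):
    """Estimate the k linear classification functions d_i = c_i + sum_j a_ij x_j.

    Assumption (not from the text): a_i = S^-1 mean_i and
    c_i = ln(prior_i) - 0.5 * mean_i' S^-1 mean_i, with S the pooled covariance.
    """
    classes = np.unique(y)
    n, p = X.shape
    k = len(classes)
    means = np.array([X[y == g].mean(axis=0) for g in classes])
    # Pooled within-group covariance: the common Sigma assumed by LDA.
    S = np.zeros((p, p))
    for g, m in zip(classes, means):
        R = X[y == g] - m
        S += R.T @ R
    S /= (n - k)
    priors = np.array([(y == g).mean() for g in classes])
    A = np.linalg.solve(S, means.T).T            # row i holds a_i1 ... a_ip
    c = np.log(priors) - 0.5 * np.sum(A * means, axis=1)
    return classes, c, A

def lda_scores(X, c, A):
    """Scores d_i(x) for every sample (rows) and every group (columns)."""
    return c + X @ A.T

def assign(X, classes, c, A):
    """Assignation rule: each sample goes to the group with the highest score."""
    return classes[np.argmax(lda_scores(X, c, A), axis=1)]

# Hypothetical training set: three groups, two variables, common covariance.
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [4.0, 4.0], [0.0, 5.0]])
X = np.vstack([rng.normal(m, 1.0, size=(30, 2)) for m in centres])
y = np.repeat([0, 1, 2], 30)

classes, c, A = fit_lda(X, y)
predicted = assign(X, classes, c, A)
print("proportion of training samples correctly classified:",
      (predicted == y).mean())
```

Solving the linear system with `np.linalg.solve` rather than inverting $S$ explicitly is a numerical choice of this sketch, not something prescribed by the method.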
If the $p$ variables have high discriminant power, the percentage of correct
classification will be high, and the assignation rule can be applied to new samples.
The most important results are: the classification functions, the classification of
the $n$ samples, the posterior probabilities
$\{ e^{d_i} / \sum_j e^{d_j} \}_{i=1,2,\ldots,k}$, the classification matrix with the
percentage of correctly assigned samples for validation purposes, and the
classification of the samples in the test set. The leave-one-out cross-validation
procedure can also be used to validate the classification process. Stepwise Linear
Discriminant Analysis (SLDA) provides these same results but uses fewer variables,
selecting at each step the variable that most favours discrimination of the $k$
groups. If the covariance matrices are unequal,
$(X_1, \ldots, X_p) \sim N(\vec{\mu}_i, \Sigma_i)$, Quadratic Discriminant Analysis
(QDA) can be used to obtain quadratic functions to classify the samples.
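As an illustration of two of these results, the following sketch computes the posterior probabilities $e^{d_i} / \sum_j e^{d_j}$ from the LDA scores and builds a leave-one-out classification matrix. It is a minimal example, not a reference implementation: the expressions used for the coefficients of the classification functions (pooled covariance, priors equal to group proportions) are the standard ones and are an assumption here, the function names and simulated data are hypothetical, and every group is assumed to keep at least two samples when one observation is left out.

```python
import numpy as np

def fit_lda(X, y, classes):
    """Intercepts c (k,) and coefficients A (k, p) of the linear functions.

    Assumption: pooled covariance and priors equal to group proportions.
    """
    n, p = X.shape
    means = np.array([X[y == g].mean(axis=0) for g in classes])
    S = np.zeros((p, p))
    for g, m in zip(classes, means):
        R = X[y == g] - m
        S += R.T @ R
    S /= (n - len(classes))
    priors = np.array([(y == g).mean() for g in classes])
    A = np.linalg.solve(S, means.T).T
    c = np.log(priors) - 0.5 * np.sum(A * means, axis=1)
    return c, A

def posterior(scores):
    """Posterior probabilities exp(d_i) / sum_j exp(d_j), row by row."""
    z = scores - scores.max(axis=1, keepdims=True)   # avoids overflow in exp
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def loo_classification_matrix(X, y):
    """Rows: true group; columns: group assigned when the sample is left out."""
    classes = np.unique(y)
    M = np.zeros((len(classes), len(classes)), dtype=int)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        c, A = fit_lda(X[keep], y[keep], classes)
        d = c + X[i] @ A.T                            # scores of the left-out sample
        M[np.searchsorted(classes, y[i]), np.argmax(d)] += 1
    return classes, M

# Hypothetical data: three groups, two variables.
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [4.0, 4.0], [0.0, 5.0]])
X = np.vstack([rng.normal(m, 1.0, size=(20, 2)) for m in centres])
y = np.repeat([0, 1, 2], 20)

classes, M = loo_classification_matrix(X, y)
c, A = fit_lda(X, y, classes)
P = posterior(c + X @ A.T)

print("leave-one-out classification matrix:\n", M)
print("percentage correctly classified:", 100.0 * np.trace(M) / M.sum())
print("posterior probabilities of the first sample:", P[0])
```

The diagonal of the matrix counts the correctly assigned samples, so its trace divided by the total gives the cross-validated percentage of correct classification.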
13.3.3.2 SIMCA Method
The SIMCA method defines a factorial model with $a_i$ principal components for
each of the $k$ groups, starting with the corresponding matrix of standardised data,
$$\left\{ X^{*(i)}_{(n_i,p)} = F^{(i)}_{(n_i,a_i)} B^{(i)}_{(a_i,p)} + E^{(i)}_{(n_i,p)} \right\}_{i=1,\ldots,k},$$
and uses these $k$ models to assign the
samples to each of the groups. The observation $\vec{w}$ is assigned according to
its degree of fit to each model, by comparing its error of fit to each class with the
mean fit error of the observations of that class. The results include the
classification table of the observations and a graphical representation of the degree
of fit of the samples to each pair of classes, known as the Coomans plot.
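The idea of one principal component model per class can be sketched as follows. This is a simplified illustration under several assumptions that are not taken from the text: each class is standardised with its own means and standard deviations, the residual standard deviations use one common choice of degrees of freedom ($p - a_i$ for a single observation and $(n_i - a_i - 1)(p - a_i)$ for the class), and a sample is simply given to the class with the smallest ratio of its fit error to the class mean fit error. In practice the ratio is usually compared with a critical value, so that a sample may fit several classes or none. All names and the simulated data are hypothetical.

```python
import numpy as np

def fit_class_model(Xg, a):
    """Principal component model X* = F B + E for one class, with a components."""
    mean = Xg.mean(axis=0)
    std = Xg.std(axis=0, ddof=1)
    Z = (Xg - mean) / std                      # standardised data X*
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    B = Vt[:a]                                 # loadings, shape (a, p)
    E = Z - (Z @ B.T) @ B                      # residual matrix, shape (n_i, p)
    n, p = Z.shape
    # Mean fit error of the class (degrees of freedom are an assumed convention).
    s0 = np.sqrt((E ** 2).sum() / ((n - a - 1) * (p - a)))
    return {"mean": mean, "std": std, "B": B, "s0": s0, "a": a}

def fit_error(model, W):
    """Error of fit of each observation in W to the class model."""
    Z = (np.atleast_2d(W) - model["mean"]) / model["std"]
    E = Z - (Z @ model["B"].T) @ model["B"]
    p = Z.shape[1]
    return np.sqrt((E ** 2).sum(axis=1) / (p - model["a"]))

def relative_fit(models, W):
    """Column i: error of fit to class i divided by the mean fit error of class i."""
    return np.column_stack([fit_error(m, W) / m["s0"] for m in models])

# Hypothetical data: two classes, five variables, two latent factors each.
rng = np.random.default_rng(1)

def simulate(n, offset):
    loadings = rng.normal(size=(2, 5))
    noise = 0.1 * rng.normal(size=(n, 5))
    return rng.normal(size=(n, 2)) @ loadings + noise + offset

X_a, X_b = simulate(30, 0.0), simulate(30, 5.0)
models = [fit_class_model(X_a, a=2), fit_class_model(X_b, a=2)]

ratios = relative_fit(models, np.vstack([X_a, X_b]))
assigned = ratios.argmin(axis=1)               # simplified assignment (assumption)
print("samples assigned to each class:", np.bincount(assigned, minlength=2))
```

Plotting the two columns of `ratios` against each other, one point per sample, gives the coordinates of a Coomans-type plot for this pair of classes.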