
That is, if this linear function of $x$ and $y$ in the left-hand member of inequality
(8.5.3) is less than or equal to a constant, we classify $(x, y)$ as coming from the
bivariate normal distribution with means $\mu_1''$ and $\mu_2''$. Otherwise, we classify $(x, y)$
as arising from the bivariate normal distribution with means $\mu_1'$ and $\mu_2'$. Of course,
if the prior probabilities can be assigned as discussed in Remark 8.5.1, then $k$ and
thus $c$ can be found easily; see Exercise 8.5.3.
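The rule can be illustrated with a short Python sketch. This is only an illustration: the coefficients a and b and the cutoff c are assumed to have already been obtained from inequality (8.5.3), and the function name classify is hypothetical.

# Classify a point (x, y) between the two bivariate normal populations,
# given the coefficients a, b and the cutoff c from inequality (8.5.3).
def classify(x, y, a, b, c):
    if a * x + b * y <= c:
        return "double-prime"   # population with means mu1'', mu2''
    return "prime"              # population with means mu1', mu2'

# Example call with purely illustrative values of a, b, c:
# classify(2.0, 3.0, a=1.5, b=-0.5, c=1.0)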


Once the rule for classification is established, the statistician might be interested
in the two probabilities of misclassification under that rule. The first is the probability
of classifying $(x, y)$ as arising from the distribution indexed by $\theta''$ when, in fact, it
comes from the distribution indexed by $\theta'$. The second misclassification is similar,
but with $\theta'$ and $\theta''$ interchanged. In the preceding example, the probabilities
of these respective misclassifications are


$$P(aX + bY \leq c;\, \mu_1', \mu_2') \quad \text{and} \quad P(aX + bY > c;\, \mu_1'', \mu_2'').$$

The distribution of $Z = aX + bY$ is obtained from Theorem 3.5.2; it is

$$N(a\mu_1 + b\mu_2,\; a^2\sigma_1^2 + 2ab\rho\sigma_1\sigma_2 + b^2\sigma_2^2).$$

With this information, it is easy to compute the probabilities of misclassifications;
see Exercise 8.5.3.
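As a concrete illustration, the following Python sketch evaluates both probabilities from the normal cdf of $Z = aX + bY$; the parameter values, the coefficients a and b, and the cutoff c are assumed known, and the function names are illustrative rather than taken from the text.

from math import sqrt
from scipy.stats import norm

# Mean and standard deviation of Z = aX + bY when (X, Y) is bivariate normal
# with means mu1, mu2, variances sig1^2, sig2^2, and correlation rho.
def z_params(a, b, mu1, mu2, sig1, sig2, rho):
    mean = a * mu1 + b * mu2
    var = a**2 * sig1**2 + 2 * a * b * rho * sig1 * sig2 + b**2 * sig2**2
    return mean, sqrt(var)

# P(aX + bY <= c) under the primed means and P(aX + bY > c) under the
# double-primed means, i.e., the two misclassification probabilities.
def misclassification_probs(a, b, c, mu1p, mu2p, mu1pp, mu2pp, sig1, sig2, rho):
    m1, s1 = z_params(a, b, mu1p, mu2p, sig1, sig2, rho)
    m2, s2 = z_params(a, b, mu1pp, mu2pp, sig1, sig2, rho)
    return norm.cdf(c, loc=m1, scale=s1), norm.sf(c, loc=m2, scale=s2)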
One final remark must be made with respect to the use of the important classi-
fication rule established in Example 8.5.2. In most instances the parameter values
$\mu_1', \mu_2'$ and $\mu_1'', \mu_2''$, as well as $\sigma_1^2$, $\sigma_2^2$, and $\rho$, are unknown. In such cases the statis-
tician has usually observed a random sample (frequently called a training sample)
from each of the two distributions. Let us say the samples have sizes $n'$ and $n''$,
respectively, with sample characteristics


$$\bar{x}',\ \bar{y}',\ (s_x')^2,\ (s_y')^2,\ r' \quad \text{and} \quad \bar{x}'',\ \bar{y}'',\ (s_x'')^2,\ (s_y'')^2,\ r''.$$

The statistics $r'$ and $r''$ are the sample correlation coefficients, as defined in ex-
pression (9.7.1) of Section 9.7. The sample correlation coefficient is the mle for the
correlation parameter $\rho$ of a bivariate normal distribution; see Section 9.7. If in
inequality (8.5.3) the parameters $\mu_1', \mu_2', \mu_1'', \mu_2'', \sigma_1^2, \sigma_2^2$, and $\rho\sigma_1\sigma_2$ are replaced by
the unbiased estimates


$$\bar{x}',\ \bar{y}',\ \bar{x}'',\ \bar{y}'',\quad
\frac{(n'-1)(s_x')^2 + (n''-1)(s_x'')^2}{n' + n'' - 2},\quad
\frac{(n'-1)(s_y')^2 + (n''-1)(s_y'')^2}{n' + n'' - 2},\quad
\frac{(n'-1)r's_x's_y' + (n''-1)r''s_x''s_y''}{n' + n'' - 2},$$

the resulting expression in the left-hand member is frequently called Fisher's lin-
ear discriminant function. Since those parameters have been estimated, the
distribution theory associated with $aX + bY$ provides only an approximation.
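As a sketch of this plug-in step, the following Python code computes the pooled, unbiased estimates displayed above from two training samples stored as NumPy arrays; the function name pooled_estimates is hypothetical.

import numpy as np

# Pooled, unbiased estimates built from two training samples (x1, y1) of
# size n' and (x2, y2) of size n''; these replace sigma_1^2, sigma_2^2,
# and rho*sigma_1*sigma_2 in inequality (8.5.3).
def pooled_estimates(x1, y1, x2, y2):
    n1, n2 = len(x1), len(x2)
    means = (x1.mean(), y1.mean(), x2.mean(), y2.mean())  # sample means
    pooled_var_x = ((n1 - 1) * x1.var(ddof=1) + (n2 - 1) * x2.var(ddof=1)) / (n1 + n2 - 2)
    pooled_var_y = ((n1 - 1) * y1.var(ddof=1) + (n2 - 1) * y2.var(ddof=1)) / (n1 + n2 - 2)
    # r * s_x * s_y equals the sample covariance, so (n - 1) times it is the
    # sum of cross products of deviations.
    pooled_cov = ((n1 - 1) * np.cov(x1, y1)[0, 1]
                  + (n2 - 1) * np.cov(x2, y2)[0, 1]) / (n1 + n2 - 2)
    return means, pooled_var_x, pooled_var_y, pooled_cov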
Although we have considered only bivariate distributions in this section, the
results can easily be extended to multivariate normal distributions using the results
of Section 3.5; see also Chapter 6 of Seber (1984).
