Statistical Methods for Psychology

and is denoted as $MS_{\text{residual}}$ or $MS_{\text{error}}$ or $s^2_{0.12345}$. In Exhibit 15.1 it is given as the error
term in the analysis of variance summary table as 664.646.
The concept of residual error is important because it is exactly the thing we hope to
minimize in our study. We want our estimates of Y to be as accurate as possible. We will
return to this concept later in the chapter.
The square root of $MS_{\text{residual}}$ is called the standard error of estimate and has the same
meaning as the standard error of estimate in Chapter 9. It is the standard deviation of the
column of residual scores $(Y - \hat{Y})$. In Exhibit 15.1 it is given in the section labeled “Model
Summary” before the analysis of variance summary table and denoted “Std. Error of the
Estimate.” In this example that value is 25.781.
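
The relation between the residual mean square and the standard error of estimate can be checked with a few lines of Python. This is a minimal sketch, not the procedure used to produce Exhibit 15.1: the function name and its arguments are hypothetical, and it assumes the usual residual degrees of freedom, N - p - 1, for p predictors. Only the value 664.646 comes from the exhibit.

```python
import math

def standard_error_of_estimate(y, y_hat, n_predictors):
    """Square root of MS_residual = SS_residual / (N - p - 1).

    Hypothetical helper: y and y_hat are equal-length sequences of
    observed and predicted scores; n_predictors is p.
    """
    n = len(y)
    ss_residual = sum((yi - yhi) ** 2 for yi, yhi in zip(y, y_hat))
    ms_residual = ss_residual / (n - n_predictors - 1)  # assumed df: N - p - 1
    return math.sqrt(ms_residual)

# Check against the value reported in Exhibit 15.1.
ms_residual_exhibit = 664.646
see_exhibit = math.sqrt(ms_residual_exhibit)
print(round(see_exhibit, 3))  # 25.781
```

The printed value agrees with the “Std. Error of the Estimate” entry quoted above.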

15.5 Distribution Assumptions


So far we have made no assumptions about the nature of the distributions of our variables.
The statistics $b_i$, $\beta_i$, and R (the multiple correlation coefficient) are legitimate measures
independent of any distribution assumptions. Having said that, certain assumptions are
required if we are to use these measures in several important ways. (It may be helpful to
go back to Chapter 9 and quickly reread the brief
discussions in the introduction (p. 246) and in Sections 9.7 and 9.13 (pp. 258–264 and
pp. 280–281). Those sections explained the distinction between linear-regression models
and bivariate-normal models and discussed the assumptions involved.)
To provide tests on the statistics we have been discussing, we will need to make one of
two different kinds of assumptions, depending on the nature of our variables. If
$X_1, X_2, \ldots, X_p$ are thought of as random variables, as they are in this example because we
measure the predictors as we find them rather than fixing them in advance, we will make
the general assumption that the joint distribution of Y, $X_1, X_2, \ldots, X_p$ is multivariate
normal. (This is the extension to multiple variables of the bivariate-normal distribution
described in Section 9.12.) Although in theory this assumption is necessary for many of our
tests, rather substantial departures from a multivariate-normal distribution are likely to be
tolerable. (This is fortunate for us, because we can see from Figure 15.1 that our data do not
look like they are going to be multivariate normal.) First, our tests are reasonably robust.
Second, in actual practice we are concerned not so much about whether R is significantly
different from 0 as about whether R is large or small. In other words, with $X_i$ random, we
are not as interested in hypothesis testing with respect to R as we were in the analysis of
variance problems. Whether R = .10 is statistically significant or not may be largely
irrelevant when it comes to prediction, because it accounts for only 1% of the variation.
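
The point about R = .10 can be illustrated numerically. The sketch below uses the standard F statistic for testing a multiple correlation, F = (R²/p) / ((1 - R²)/(N - p - 1)). The function name, the choice of p = 5 predictors, and the sample sizes are illustrative assumptions, not values taken from the example in this chapter.

```python
def f_for_multiple_r(r, n, p):
    """F statistic on (p, n - p - 1) df for testing that the population R is 0."""
    r_squared = r ** 2
    return (r_squared / p) / ((1 - r_squared) / (n - p - 1))

r = 0.10
p = 5  # hypothetical number of predictors

# The proportion of variation accounted for does not depend on N.
print(round(r ** 2, 2))  # 0.01

# The F statistic does: it grows with N while R stays fixed.
for n in (50, 500, 5000):  # hypothetical sample sizes
    print(n, round(f_for_multiple_r(r, n, p), 3))
```

With a large enough sample the same R = .10 produces a large F, yet it still accounts for only 1% of the variation, which is why the size of R matters more than its significance when the goal is prediction.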
If the variables $X_1, X_2, \ldots, X_p$ are fixed variables, we will simply make the assumption
that the conditional distributions of Y (i.e., the distribution of Y for specific levels of $X_i$) are
normally and independently distributed. Here again moderate departures from normality are
tolerable. Whether we are dealing with fixed or random independent variables, we need to
go further than this. In Section 15.9 we will cover regression diagnostics, which will help us
evaluate how well or badly we meet the underlying assumptions.
The fixed model and the corresponding assumption of normality in Y will be considered
in Chapter 16. In this chapter we generally will be concerned with random variables.
The multivariate-normal assumption is more stringent than is necessary for much of what
follows, but it is sufficient. For example, calculation of the standard error of $b_j$ does not
require an assumption of multivariate normality. However, a person seldom wishes to find
the standard error of $b_j$ unless he or she wishes to test (or form confidence limits on) $b_j$, and
this test requires the normality assumption. We will therefore impose this assumption on
our data.
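
As an illustration of why the standard error of a coefficient is usually wanted, the sketch below forms the conventional t statistic (the coefficient divided by its standard error) and the corresponding confidence limits. It is a sketch under stated assumptions: the function name is hypothetical, all numeric inputs are made up for illustration, and the critical value of t (on N - p - 1 degrees of freedom) is assumed to be looked up separately. It is this reference to the t distribution that relies on the normality assumption.

```python
def t_and_confidence_limits(b_j, se_b_j, t_critical):
    """Return the t statistic for H0: coefficient = 0 and the confidence limits.

    t_critical is the two-tailed critical value of t on N - p - 1 df,
    supplied by the caller (for example, from a t table).
    """
    t_stat = b_j / se_b_j
    half_width = t_critical * se_b_j
    return t_stat, (b_j - half_width, b_j + half_width)

# Hypothetical values, chosen only to keep the arithmetic simple.
t_stat, limits = t_and_confidence_limits(b_j=2.0, se_b_j=0.5, t_critical=2.0)
print(t_stat)   # 4.0
print(limits)   # (1.0, 3.0)
```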

