width of the confidence interval and decreases the t value for that coefficient. This is what
is measured by the VIF. Moreover, when two predictors are highly correlated one has little
to add over and above the other and only serves to increase the instability of the regression
equation.
Tolerance is the reciprocal of the VIF and can be computed as 1 − R_j^2, where R_j is the
multiple correlation between variable j and all other predictor variables. So we want a low
value of VIF and a high value of Tolerance. Tolerance tells us two things. First, it tells us
the degree of overlap among the predictors, helping us to see which predictors have infor-
mation in common and which are relatively independent. (The higher the tolerance, the
lower the overlap.) Just because two variables substantially overlap in their information is
not reason enough to eliminate one of them, but it does alert us to the possibility that their
joint contribution might be less than we would like.
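To make the computation concrete, here is a minimal sketch (in Python with NumPy; the simulated predictors and the function name tolerance_and_vif are illustrative inventions, not part of this chapter's data) that obtains R_j^2 by regressing each predictor on the others and then reports tolerance and VIF:

```python
import numpy as np

def tolerance_and_vif(X):
    """Regress each column of X on the remaining columns; return
    tolerance (1 - R_j^2) and VIF (1 / tolerance) for every predictor."""
    n, p = X.shape
    tol = np.empty(p)
    for j in range(p):
        y = X[:, j]
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        tol[j] = (resid @ resid) / ((y - y.mean()) ** 2).sum()  # 1 - R_j^2
    return tol, 1.0 / tol

# Simulated example: x2 is nearly a copy of x1, so both should show low
# tolerance (high VIF), while x3 is independent of the others.
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.1, size=100)
x3 = rng.normal(size=100)
tol, vif = tolerance_and_vif(np.column_stack([x1, x2, x3]))
print(tol, vif)
```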
Second, the tolerance statistic alerts us to the potential problems of instability in our
model. With very low levels of tolerance, the stability of the model and sometimes even the
accuracy of the arithmetic can be in danger. In the extreme case where one predictor can be
perfectly predicted from the others, we will have what is called a singular covariance (or
correlation) matrix and most programs will stop without generating a model. If you see a
statement in your printout that says that the matrix is singular or “not positive-definite,” the
most likely explanation is that one predictor has a tolerance of 0.00 and is perfectly corre-
lated with others. In this case you will have to drop at least one predictor to break up that
relationship. Such a relationship most frequently occurs when one predictor is the simple
sum or average of the others, or where all p predictors sum to a constant.
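A small simulated demonstration (Python again; the variables are hypothetical) shows how such a sum produces a singular correlation matrix, which is why a regression program will balk:

```python
import numpy as np

# Make x3 the exact sum of x1 and x2, so it is perfectly predictable
# from the other two predictors (tolerance = 0).
rng = np.random.default_rng(1)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
x3 = x1 + x2

R = np.corrcoef(np.column_stack([x1, x2, x3]), rowvar=False)
print(np.linalg.det(R))             # essentially 0: R is singular
print(np.linalg.eigvalsh(R).min())  # smallest eigenvalue ~ 0, so R is
                                    # not positive-definite and cannot
                                    # be inverted reliably
```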
One common mistake is to treat the relative magnitudes of the b_i as an index of the rel-
ative importance of the individual predictors. By this (mistaken) logic, we might be
tempted to conclude that Expend is a less important predictor than is LogPctSAT, because
its coefficient (11.130) is appreciably smaller in magnitude than the coefficient for LogPctSAT
(−78.205). Although it might actually be the case that Expend is a less important predic-
tor, we cannot draw such a conclusion based on the regression coefficients. The relative
magnitudes of the coefficients are in part a function of the standard deviations of the corre-
sponding variables. Because the standard deviation of LogPctSAT is (slightly) smaller than
the standard deviation of Expend, its regression coefficient (b_2) will have a tendency to be
larger than that of Expend regardless of the importance of that variable.
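The point is easy to verify numerically. In the sketch below (Python; the data are simulated stand-ins, not the actual state SAT data), multiplying a predictor by 10 divides its raw coefficient by 10 while changing nothing about its real importance:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
expend = rng.normal(5, 1.5, size=n)              # stand-in for Expend
pct = rng.normal(35, 25, size=n)                 # stand-in for PctSAT
sat = 1000 + 12 * expend - 2.8 * pct + rng.normal(0, 30, size=n)

def slopes(x1, x2, y):
    A = np.column_stack([np.ones(len(y)), x1, x2])
    return np.linalg.lstsq(A, y, rcond=None)[0]

print(slopes(expend, pct, sat))       # raw coefficients
print(slopes(expend * 10, pct, sat))  # rescale Expend: its slope
                                      # shrinks by a factor of 10
```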
It may be easier for you to appreciate this last point if you look at the problem some-
what differently. (For this example we will act as if our predictor was PctSAT instead of
LogPctSAT just because that makes the example easier to see.) For one state to have an Ex-
pend rating one point higher than another state would be a noticeable accomplishment (the
range of expenditures is only about 6 points), whereas having a difference of one percent-
age point in PctSAT is a trivial matter (the range of PctSAT is 77 points). We hardly expect
on a priori grounds that these two one-point differences will lead to equal differences in
the predicted SAT, regardless of the relative importance of the two predictors.
Standardized Regression Coefficients
As we shall see later, the question of the relative importance of variables has several differ-
ent answers depending on what we mean by importance. One measure of importance
should be mentioned here, however, because it is a legitimate statistic in its own right.
Suppose that before we obtained our multiple regression equation, we had standardized
each of our variables. As you will recall, standardizing a variable sets its mean at 0 and its
standard deviation at 1. It also expresses the result in standard deviation units. (You should
recall that we standardize many of our effect size measures by dividing by the standard de-
viation.) Now all of our variables would have equal standard deviations (1), and a one-unit
difference between two states on one variable would be comparable to a one-unit difference
between them on any other variable.
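As a sketch of what that standardization buys us (Python once more, with simulated data; equivalently, each standardized slope is beta_j = b_j × s_xj / s_y), fitting the regression on z-scored variables yields coefficients in standard-deviation units that can be compared directly:

```python
import numpy as np

def standardize(v):
    return (v - v.mean()) / v.std(ddof=1)

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(5, 1.5, size=n)
x2 = rng.normal(35, 25, size=n)
y = 1000 + 12 * x1 - 2.8 * x2 + rng.normal(0, 30, size=n)

# With every variable standardized, the intercept is 0 and is omitted.
A = np.column_stack([standardize(x1), standardize(x2)])
beta = np.linalg.lstsq(A, standardize(y), rcond=None)[0]
print(beta)   # slopes per standard deviation of each predictor
```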