Statistical Analysis for Education and Psychology Researchers

regression line) and overlaying this with a plot of observed values of the response
variable against the independent variable. This overlay gives an indication of the
scatter of data points about the fitted regression line. Look for a non-random scatter,
that is, any discernible pattern, especially curvature. Such a pattern indicates that the model is
not well fitted. A more sophisticated regression plot can be output from the SAS
procedure PROC GPLOT (see Figure 8.7).
Once a suspect model has been identified using an overlay plot, an indication of
the lack of model fit and possible extent of departure from linearity is given by a
plot of residuals against each of the independent variables. Any discernible
pattern, other than a random scatter of points, indicates that the model is not well
fitted. Influential data points can easily be identified. You should consider
whether you have chosen the most parsimonious model. For example, can any
further variables be removed? Do other or additional variables need to be
added? How robust is the fitted model? Is the model fit dependent upon a few
influential data points?)
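
As a rough numeric companion to the residual plot described above, the following Python sketch fits a straight line to simulated data, computes the residuals, and correlates them with the squared, centred predictor. Python is an assumption here (the text itself works in SAS), and the simulated data, the helper names and the use of this correlation as a curvature signal are all illustrative choices rather than a procedure taken from the text. A value near zero is consistent with a random scatter; a large value suggests curvature that a plot of residuals against the predictor would also reveal.

```python
import random

def mean(v):
    return sum(v) / len(v)

def fit_line(x, y):
    """Least-squares fit with one predictor; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def residuals(x, y):
    """Observed minus fitted values for the straight-line model."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length lists."""
    mu, mv = mean(u), mean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

def curvature_signal(x, resid):
    """Hypothetical helper: correlation of residuals with squared centred x."""
    mx = mean(x)
    return pearson([(xi - mx) ** 2 for xi in x], resid)

# Simulated data (an assumption): one truly linear and one curved relationship.
random.seed(1)
x = [i / 10 for i in range(100)]
y_linear = [2 + 0.5 * xi + random.gauss(0, 1) for xi in x]
y_curved = [2 + 0.5 * xi + 0.3 * (xi - 5) ** 2 + random.gauss(0, 1) for xi in x]

signal_linear = curvature_signal(x, residuals(x, y_linear))
signal_curved = curvature_signal(x, residuals(x, y_curved))
print(f"curvature signal, linear data: {signal_linear:+.2f}")
print(f"curvature signal, curved data: {signal_curved:+.2f}")
```

The single number is only a stand-in for visual inspection; plotting the residuals against each independent variable remains the check the text recommends.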

The general idea when examining residuals is to look for any systematic trends or
patterns in the plots. These usually indicate departure from linearity and lack of model fit. All the
assumptions are important but some are more so than others. Experience enables the
researcher to judge how far assumptions can be relaxed before inferences are
invalidated; this is as much an art as a science. Lack of normality of the residuals, for
example, is not critical because the sampling distributions of the regression test statistics
are stable for minor departures from normality, so regression estimates are not seriously
affected. However, standard errors may be inflated. Similarly, lack of
constant variance of errors is unlikely to seriously distort the regression coefficients but
the associated p-values would need to be interpreted with caution. The most serious
violation is a significant departure from linearity. In this situation transformation of the
data or an alternative analytic approach should be considered. The literature is rather
sparse on what to do in these circumstances. However, an excellent book which is very
readable is Alternative Methods of Regression by Birkes and Dodge (1993). Another text
which has good advice about robust regression analysis is Tiku, Tan and Balakrishnan
(1986).
The important principle to bear in mind is to distinguish between minor departures
from model assumptions and major violations or combinations of departures such as non-
linearity and non-constant variance.
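
The claim about non-constant variance can be illustrated with a small simulation. The Python sketch below (Python, the sample size, the error structures and all function names are assumptions made for illustration) repeatedly generates data from a known straight line, once with constant error variance and once with error spread that grows with the predictor, and compares the fitted slopes. Under these assumptions the average slope stays close to the true value in both cases, while the spread of the slopes differs, which is why the usual standard errors and p-values need cautious interpretation.

```python
import random

def ols_slope(x, y):
    """Least-squares slope for a single predictor."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def simulate_slopes(error_sd, n_reps=500, true_slope=0.5, seed=42):
    """Fit the line n_reps times; error_sd(x) gives the error SD at each x."""
    rng = random.Random(seed)
    x = [0.2 * i for i in range(50)]
    slopes = []
    for _ in range(n_reps):
        y = [1.0 + true_slope * xi + rng.gauss(0, error_sd(xi)) for xi in x]
        slopes.append(ols_slope(x, y))
    return slopes

def mean_and_sd(values):
    """Sample mean and standard deviation."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / (len(values) - 1)
    return m, var ** 0.5

# Constant error variance versus error spread that increases with x.
m_const, sd_const = mean_and_sd(simulate_slopes(lambda xi: 1.0))
m_grow, sd_grow = mean_and_sd(simulate_slopes(lambda xi: 0.2 + 0.3 * xi))

print(f"constant variance: mean slope {m_const:.3f}, SD {sd_const:.3f}")
print(f"growing variance:  mean slope {m_grow:.3f}, SD {sd_grow:.3f}")
```

This sketch addresses only non-constant variance; it says nothing about non-linearity, which the text identifies as the more serious violation.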


Example from the Literature

In a study of factors that improve teacher competence, Raudenbush et al. (1993)
investigated a number of regression models, one of which looked at key predictors of
instructional quality. This response variable was measured as a 12-item student rating
scale of teachers' classroom behaviour (for example, frequency with which teachers explained
objectives of a new lesson, tested new knowledge, provided feedback on test
performance, etc.). The mean of the 12-item scale was used as the response score. The
explanatory variables included internal in-service-training (provided by the principal or
by another teacher in the school), measured as a count of sessions, and pre-service

