Statistical Analysis for Education and Psychology Researchers

—8.6

The variation of the response variable, Y, about the fitted regression line is measured by
the sum of squared residuals (SSe), which is crucial in interpreting model fit. The square
root of this value, after division by the appropriate degrees of freedom, is the standard
deviation of the residuals and indicates how well the regression line fits the sample data.
This statistic, which estimates σ, the population standard deviation of error about the
regression line, is called the root mean square error (RMSE) in SAS statistical output.
It is evaluated as:

$$\text{RMSE} = \sqrt{\frac{SS_e}{n-2}}$$

where

$$SS_e = SS_{YY} - b_1 SS_{XY} = 1030.4 - (3.701987 \times 223.6) = 202.636$$

(it is important not to round the value of b1 before multiplying, as this will lead to large
errors in the estimate).
The degrees of freedom for the error component are calculated as n, the number of data
points, less the number of estimated parameters in the model, that is, n−(the number of b
parameters). In a simple linear regression model with one explanatory variable there are
two b parameters, b0 (intercept) and b1 (slope). With ten cases, the appropriate degrees of
freedom in this example are 10−2=8. The value of RMSE is therefore (202.636/8)^0.5=5.0328.
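
The arithmetic above can be checked with a short script. This is only a sketch in Python
(the text itself refers to SAS output); the function name error_summary and its argument
names are illustrative assumptions, and the input values are the summary quantities quoted
in the text. The second call shows how much SSe shifts when the slope is rounded to 3.7
before multiplying.

```python
import math


def error_summary(ss_yy, ss_xy, b1, n, n_params=2):
    """Return (SSe, error degrees of freedom, RMSE) for a simple linear regression.

    Hypothetical helper: ss_yy is the sum of squares of Y, ss_xy the sum of
    cross products of X and Y, b1 the slope, n the number of cases, and
    n_params the number of estimated b parameters (intercept and slope).
    """
    ss_e = ss_yy - b1 * ss_xy          # SSe = SSYY - b1 * SSXY
    df_error = n - n_params            # n minus the number of b parameters
    rmse = math.sqrt(ss_e / df_error)  # standard deviation of the residuals
    return ss_e, df_error, rmse


# Values quoted in the text, using the unrounded slope
ss_e, df_error, rmse = error_summary(ss_yy=1030.4, ss_xy=223.6, b1=3.701987, n=10)
print(round(ss_e, 3), df_error, round(rmse, 4))

# Same calculation with the slope rounded to one decimal place
ss_e_rounded, _, rmse_rounded = error_summary(ss_yy=1030.4, ss_xy=223.6, b1=3.7, n=10)
print(round(ss_e_rounded, 3), round(rmse_rounded, 4))
```

With the unrounded slope the script reproduces SSe=202.636 and RMSE=5.0328 on 8 degrees of
freedom; with the rounded slope SSe comes out as 203.08.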


3 Regression Model

The least squares regression line can therefore be written as: Predicted value of
SMATHS=94.7+3.7(MATHS)


Interpretation

We should first note that the slope parameter is positive. This means that an increase in
teachers’ estimated maths ability (MATHS) is associated with an increase in the pupils’
maths attainment score on a standardized test (SMATHS). Since the slope represents the
change in standardized maths score per unit change in teachers’ estimate, we can say that
for every 1-mark increase in the teachers’ estimate of a pupil’s ability we estimate an
increase in standardized maths attainment score of 3.7. On the teachers’ rating scale there
was no value of zero (it was assumed that none of the pupils would have zero ability);
an X value of zero is therefore not meaningful, and interpretation of the Y intercept has
little practical value.
Once the least squares regression line has been determined (and provided the assumptions
are valid), it is possible to predict the standardized maths score of a pupil whom the
teacher estimates as having a maths ability rating score of 7, as follows: Predicted
standardized maths score=94.7+3.7(7)=120.6.
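
The prediction and the interpretation of the slope can be illustrated with a minimal Python
sketch. The function name predict_smaths is an illustrative assumption, and the default
coefficients are the rounded values 94.7 and 3.7 quoted in the fitted line above, so results
are accurate only to the precision of those rounded values.

```python
def predict_smaths(maths_rating, b0=94.7, b1=3.7):
    """Predicted standardized maths score (SMATHS) for a teacher rating (MATHS).

    Hypothetical helper using the rounded intercept and slope from the text.
    """
    return b0 + b1 * maths_rating


# Predicted score for a pupil with a teacher rating of 7
predicted = predict_smaths(7)
print(round(predicted, 1))

# A 1-mark increase in the rating changes the prediction by the slope
slope_effect = predict_smaths(8) - predict_smaths(7)
print(round(slope_effect, 1))
```

The first value printed is 120.6, matching the worked prediction, and the second is 3.7,
the estimated change in SMATHS per 1-mark change in MATHS.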

