The first line of code, PROC REG, begins the regression procedure. The next line
contains the MODEL statement, which is required to fit a regression line. Here the
MODEL statement tells SAS to fit a linear regression model with the response variable
SMATHS and one explanatory variable, MATHS. The options after the forward slash tell
SAS to: i) calculate predicted values (p) for the specified model (this option is
unnecessary if any of the options R, CLI or CLM is specified, and is entered in the
code here only to explain its function); ii) produce residuals (r) and the standard errors
of the predicted and residual values; iii) calculate and print the 95 per cent lower and
upper confidence limits for the mean value of the response variable for each
observation (CLM); iv) calculate and print the 95 per cent lower and upper confidence
limits for a predicted score (CLI). The OUTPUT statement produces a SAS
output data set called ‘outreg’ containing the predicted scores, residuals and a number of
statistics.
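As a rough sketch only, statements matching the description above might look like the following. The input data set name (scores) and the output variable names (pred, resid) are assumptions made for illustration and are not taken from the original code; only the predicted values and residuals are saved here, whereas the original OUTPUT statement also stores further statistics.

```sas
/* Sketch only: the data set name scores and the names pred and resid are assumed */
proc reg data=scores;
   /* Response SMATHS, one explanatory variable MATHS; options follow the slash */
   model smaths = maths / p r clm cli;
   /* Save predicted values and residuals to the data set outreg */
   output out=outreg p=pred r=resid;
run;
```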
The procedure PROC PLOT uses the data set created by the regression procedure to
produce two diagnostic plots (VPERCENT and HPERCENT simply reduce the size of
the plot to 75 per cent of the page length and width, respectively). The first plot is the
response against the explanatory variable and is indicated in the plot by the symbol #.
The second plot, which overlays the first (hence the different plotting symbol, *), shows
the predicted values against the explanatory variable. This is the fitted linear regression
line. SAS output for the overlay plot is shown in Figure 8.1.
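A minimal sketch of such an overlay plot is given below. It assumes the predicted values were saved in outreg under the hypothetical name pred (for example, with P=pred on the OUTPUT statement); that name is not taken from the original code.

```sas
/* Sketch only: pred is an assumed name for the predicted values in outreg */
proc plot data=outreg vpercent=75 hpercent=75;
   /* Observed scores plotted with #, fitted values with *, on the same axes */
   plot smaths*maths='#' pred*maths='*' / overlay;
run;
```

Because the fitted values lie exactly on the regression line, the * symbols trace the line through the scatter of # points.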